{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 3\n", "\n", "In this notebook we incrementally export frequently changing orders and order details data from MySQL into Neptune.\n", "\n", "For details on converting from a relational data model to a graph data model, see [Converting a Relational Data Model to a Graph Model](https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/converting-to-graph#converting-a-relational-data-model-to-a-graph-model)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Incremental Import from MySQL to Neptune\n", "\n", "Now that we have some static data in the graph (product and supplier) we can import data that changes frequently in the source system – that is, new orders and order details.\n", "\n", "We'll run an AWS Glue job, `export_from_mysql_to_neptune_incremental`, that imports a day's worth of sales data from MySQL into Neptune. Each run saves a _checkpoint_ (a timestamp) in the graph. The next time the job is run, it will import the data for the day following this checkpoint. \n", "\n", "The script for the job can be found [here](https://github.com/aws-samples/amazon-neptune-samples/tree/master/gremlin/glue-neptune/glue-jobs/mysql-neptune/export-from-mysql-to-neptune-incremental.py). The script uses the [neptune-python-utils](https://github.com/awslabs/amazon-neptune-tools/tree/master/neptune-python-utils) Python library." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%run './glue_utils.py'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import nest_asyncio\n", "nest_asyncio.apply()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## First incremental load" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll run the first [incremental export job](https://github.com/aws-samples/amazon-neptune-samples/tree/master/gremlin/glue-neptune/glue-jobs/mysql-neptune/export-from-mysql-to-neptune-incremental.py)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "job_name = glue_resource('export_from_mysql_to_neptune_incremental')\n", "\n", "run_job(job_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Count the number of orders and order details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from neptune_python_utils.gremlin_utils import GremlinUtils\n", "\n", "GremlinUtils.init_statics(globals())\n", "\n", "gremlin_utils = GremlinUtils()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn = gremlin_utils.remote_connection()\n", "g = gremlin_utils.traversal_source(connection=conn)\n", "\n", "print('# Orders : {}'.format(g.V().hasLabel('Order').count().next()))\n", "print('# OrderDetails: {}'.format(g.V().hasLabel('OrderDetail').count().next()))\n", "\n", "conn.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Second incremental load" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_job(job_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Count the number of orders and order details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn = gremlin_utils.remote_connection()\n", "g = gremlin_utils.traversal_source(connection=conn)\n", "\n", "print('# Orders : {}'.format(g.V().hasLabel('Order').count().next()))\n", "print('# OrderDetails: {}'.format(g.V().hasLabel('OrderDetail').count().next()))\n", "\n", "conn.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Third incremental load" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_job(job_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Count the number of orders and order details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn = gremlin_utils.remote_connection()\n", "g = gremlin_utils.traversal_source(connection=conn)\n", "\n", "print('# Orders : {}'.format(g.V().hasLabel('Order').count().next()))\n", "print('# OrderDetails: {}'.format(g.V().hasLabel('OrderDetail').count().next()))\n", "\n", "conn.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }