{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Use Case 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the people in more senior roles at the companies where X worked" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What questions would we have to ask of our data?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_\"Who were in senior roles at the companies where X worked?\"_\n", "\n", "Once again, we're looking at roles. But now we see there are some structural relations between roles – a role hierarchy. Structural relations are best modelled using edges. Therefore, we will now promote role to become a vertex:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-08.png\"/>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With role now a vertex, our overall data model looks like this:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-09.png\"/>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sample dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-03.png\"/>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating some revised sample data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load_ext ipython_unittest\n", "%run '../util/neptune.py'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "neptune.clear()\n", "g = neptune.graphTraversal()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import *\n", "\n", "(g.\n", " addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').\n", " addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').\n", " addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').\n", " addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').\n", " addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').\n", " addV('Role').property(id,'r-1').property('name','Analyst').\n", " addV('Role').property(id,'r-2').property('name','Senior Analyst').\n", " addV('Role').property(id,'r-3').property('name','Principal Analyst').\n", " addV('Role').property(id,'r-4').property('name','Associate Analyst').\n", " addV('Role').property(id,'r-5').property('name','Manager').\n", " addV('Company').property(id,'c-1').property('name','Example Corp').\n", " addV('Company').property(id,'c-2').property('name','AnyCompany').\n", " addV('Location').property(id,'l-1').property('name','HQ').property('address','100 Main St, Anytown').\n", " addV('Location').property(id,'l-2').property('name','Offices').property('address','Downtown, Anytown').\n", " addV('Location').property(id,'l-3').property('name','Exchange').property('address','50 High St, Anytown').\n", " addV('Job').property(id,'j-1').property('from',datetime(2010,10,20)).property('to',datetime(2017,11,1)).\n", " addV('Job').property(id,'j-2').property('from',datetime(2011,2,16)).property('to',datetime(2013,9,17)).\n", " addV('Job').property(id,'j-3').property('from',datetime(2013,11,21)).property('to',datetime(2016,3,23)).\n", " addV('Job').property(id,'j-4').property('from',datetime(2015,2,2)).property('to',datetime(2018,2,8)).\n", " addV('Job').property(id,'j-5').property('from',datetime(2011,7,15)).property('to',datetime(2017,10,14)).\n", " addV('Job').property(id,'j-6').property('from',datetime(2012,3,23)).property('to',datetime(2013,11,1)).\n", " V('r-1').addE('PARENT_ROLE').to(V('r-2')).\n", " V('r-2').addE('PARENT_ROLE').to(V('r-3')).\n", " V('r-4').addE('PARENT_ROLE').to(V('r-5')).\n", " V('c-1').addE('LOCATION').to(V('l-1')).\n", " V('c-1').addE('LOCATION').to(V('l-2')).\n", " V('c-2').addE('LOCATION').to(V('l-3')). \n", " V('p-1').addE('JOB').to(V('j-1')).\n", " V('j-1').addE('ROLE').to(V('r-3')).\n", " V('j-1').addE('COMPANY').to(V('c-1')).\n", " V('j-1').addE('LOCATION').to(V('l-1')). \n", " V('p-2').addE('JOB').to(V('j-2')).\n", " V('j-2').addE('ROLE').to(V('r-2')).\n", " V('j-2').addE('COMPANY').to(V('c-1')).\n", " V('j-2').addE('LOCATION').to(V('l-2')). \n", " V('p-3').addE('JOB').to(V('j-3')).\n", " V('j-3').addE('ROLE').to(V('r-1')).\n", " V('j-3').addE('COMPANY').to(V('c-1')).\n", " V('j-3').addE('LOCATION').to(V('l-1')).\n", " V('p-4').addE('JOB').to(V('j-4')).\n", " V('j-4').addE('ROLE').to(V('r-1')).\n", " V('j-4').addE('COMPANY').to(V('c-1')).\n", " V('j-4').addE('LOCATION').to(V('l-2')). \n", " V('p-5').addE('JOB').to(V('j-5')).\n", " V('j-5').addE('ROLE').to(V('r-5')).\n", " V('j-5').addE('COMPANY').to(V('c-2')).\n", " V('j-5').addE('LOCATION').to(V('l-3')).\n", " V('p-3').addE('JOB').to(V('j-6')).\n", " V('j-6').addE('ROLE').to(V('r-4')).\n", " V('j-6').addE('COMPANY').to(V('c-2')).\n", " V('j-6').addE('LOCATION').to(V('l-3')).\n", " toList())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Querying the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query 3 – Who were in senior roles at the companies where Li worked?" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "To answer this question, we'll have to perform the following steps:\n", "\n", " 1. Start at the Person's vertex\n", " 2. Follow JOB and ROLE edges to Roles\n", " 3. Traverse up Role hierarchy\n", " 4. For each parent Role: \n", " - Get associated Jobs\n", " - Filter Jobs by date\n", " - Get Role and Person details for each Job" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%unittest\n", "\n", "results = (g.V('p-3').out('JOB').as_('j1'). # traverse from Person to each Job \n", " out('ROLE'). # traverse to job's Role\n", " repeat(out('PARENT_ROLE')).until(outE().count().is_(0)). # climb the Role hierarchy...\n", " emit().in_('ROLE').as_('j2'). # for each Role encountered in the hierarchy, traverse to the Jobs associated with that role\n", " or_( \n", " (where('j1', between('j2', 'j2')).by('from').by('from').by('to')), # filter based on the dates of the original Job (j1)\n", " (where('j1', between('j2', 'j2')).by('to').by('from').by('to')),\n", " (where('j1', lte('j2').and_(gt('j2'))).by('from').by('from').by('to').by('from'))\n", " ).\n", " order().by(id). \n", " project('role', 'name'). # for each Job emitted from above\n", " by(out('ROLE').values('name')). # get Role\n", " by(in_('JOB').values('firstName', 'lastName').fold()). # get Person's name\n", " toList())\n", "\n", "assert results == [{'role': 'Principal Analyst', 'name': ['Martha', 'Rivera']}, \n", " {'role': 'Manager', 'name': ['Saanvi', 'Sarkar']}]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query 2 – Who worked for Example Corp, and at which locations, between 2015-2017?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%unittest\n", "\n", "results = (g.\n", " V('c-1').in_('COMPANY').\n", " or_( \n", " (has('from', between(datetime(2015,1,1), datetime(2018,1,1)))),\n", " (has('to', between(datetime(2015,1,1), datetime(2018,1,1))))\n", " ).\n", " order().by(id).\n", " project('name', 'location').\n", " by(in_('JOB').values('firstName', 'lastName').fold()).\n", " by(out('LOCATION').values('name', 'address').fold()).\n", " toList())\n", "\n", "assert results == [{'name': ['Martha', 'Rivera'], 'location': ['HQ', '100 Main St, Anytown']},\n", " {'name': ['Li', 'Juan'], 'location': ['HQ', '100 Main St, Anytown']},\n", " {'name': ['John', 'Stiles'], 'location': ['Offices', 'Downtown, Anytown']}]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Broken tests" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once again we've broken the test for Query 1. Not surprising given that role has been promoted to a vertex. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query 1 [BROKEN] – Which companies has Li worked for, and in what roles?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%unittest\n", "\n", "results = (g.V('p-3').\n", " out('JOB').\n", " project('company', 'role').\n", " by(out('COMPANY').values('name')).\n", " by('role').\n", " toList())\n", "\n", "assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n", " {'company': 'AnyCompany', 'role': 'Associate Analyst'}]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query 1 (revised) – Which companies has Li worked for, and in what roles?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%unittest\n", "\n", "results = (g.V('p-3').\n", " out('JOB').\n", " project('company', 'role').\n", " by(out('COMPANY').values('name')).\n", " by(out('ROLE').values('name')).\n", " toList())\n", "\n", "assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n", " {'company': 'AnyCompany', 'role': 'Associate Analyst'}]" ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }