{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use Case 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Find the people in more senior roles at the companies where X worked"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What questions would we have to ask of our data?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_\"Who were in senior roles at the companies where X worked?\"_\n",
"\n",
"Once again, we're looking at roles. But now we see there are some structural relations between roles – a role hierarchy. Structural relations are best modelled using edges. Therefore, we will now promote role to become a vertex:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With role now a vertex, our overall data model looks like this:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sample dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating some revised sample data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%load_ext ipython_unittest\n",
"%run '../util/neptune.py'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"neptune.clear()\n",
"g = neptune.graphTraversal()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import *\n",
"\n",
"(g.\n",
" addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').\n",
" addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').\n",
" addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').\n",
" addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').\n",
" addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').\n",
" addV('Role').property(id,'r-1').property('name','Analyst').\n",
" addV('Role').property(id,'r-2').property('name','Senior Analyst').\n",
" addV('Role').property(id,'r-3').property('name','Principal Analyst').\n",
" addV('Role').property(id,'r-4').property('name','Associate Analyst').\n",
" addV('Role').property(id,'r-5').property('name','Manager').\n",
" addV('Company').property(id,'c-1').property('name','Example Corp').\n",
" addV('Company').property(id,'c-2').property('name','AnyCompany').\n",
" addV('Location').property(id,'l-1').property('name','HQ').property('address','100 Main St, Anytown').\n",
" addV('Location').property(id,'l-2').property('name','Offices').property('address','Downtown, Anytown').\n",
" addV('Location').property(id,'l-3').property('name','Exchange').property('address','50 High St, Anytown').\n",
" addV('Job').property(id,'j-1').property('from',datetime(2010,10,20)).property('to',datetime(2017,11,1)).\n",
" addV('Job').property(id,'j-2').property('from',datetime(2011,2,16)).property('to',datetime(2013,9,17)).\n",
" addV('Job').property(id,'j-3').property('from',datetime(2013,11,21)).property('to',datetime(2016,3,23)).\n",
" addV('Job').property(id,'j-4').property('from',datetime(2015,2,2)).property('to',datetime(2018,2,8)).\n",
" addV('Job').property(id,'j-5').property('from',datetime(2011,7,15)).property('to',datetime(2017,10,14)).\n",
" addV('Job').property(id,'j-6').property('from',datetime(2012,3,23)).property('to',datetime(2013,11,1)).\n",
" V('r-1').addE('PARENT_ROLE').to(V('r-2')).\n",
" V('r-2').addE('PARENT_ROLE').to(V('r-3')).\n",
" V('r-4').addE('PARENT_ROLE').to(V('r-5')).\n",
" V('c-1').addE('LOCATION').to(V('l-1')).\n",
" V('c-1').addE('LOCATION').to(V('l-2')).\n",
" V('c-2').addE('LOCATION').to(V('l-3')). \n",
" V('p-1').addE('JOB').to(V('j-1')).\n",
" V('j-1').addE('ROLE').to(V('r-3')).\n",
" V('j-1').addE('COMPANY').to(V('c-1')).\n",
" V('j-1').addE('LOCATION').to(V('l-1')). \n",
" V('p-2').addE('JOB').to(V('j-2')).\n",
" V('j-2').addE('ROLE').to(V('r-2')).\n",
" V('j-2').addE('COMPANY').to(V('c-1')).\n",
" V('j-2').addE('LOCATION').to(V('l-2')). \n",
" V('p-3').addE('JOB').to(V('j-3')).\n",
" V('j-3').addE('ROLE').to(V('r-1')).\n",
" V('j-3').addE('COMPANY').to(V('c-1')).\n",
" V('j-3').addE('LOCATION').to(V('l-1')).\n",
" V('p-4').addE('JOB').to(V('j-4')).\n",
" V('j-4').addE('ROLE').to(V('r-1')).\n",
" V('j-4').addE('COMPANY').to(V('c-1')).\n",
" V('j-4').addE('LOCATION').to(V('l-2')). \n",
" V('p-5').addE('JOB').to(V('j-5')).\n",
" V('j-5').addE('ROLE').to(V('r-5')).\n",
" V('j-5').addE('COMPANY').to(V('c-2')).\n",
" V('j-5').addE('LOCATION').to(V('l-3')).\n",
" V('p-3').addE('JOB').to(V('j-6')).\n",
" V('j-6').addE('ROLE').to(V('r-4')).\n",
" V('j-6').addE('COMPANY').to(V('c-2')).\n",
" V('j-6').addE('LOCATION').to(V('l-3')).\n",
" toList())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Query 3 – Who were in senior roles at the companies where Li worked?"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To answer this question, we'll have to perform the following steps:\n",
"\n",
" 1. Start at the Person's vertex\n",
" 2. Follow JOB and ROLE edges to Roles\n",
" 3. Traverse up Role hierarchy\n",
" 4. For each parent Role: \n",
" - Get associated Jobs\n",
" - Filter Jobs by date\n",
" - Get Role and Person details for each Job"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%unittest\n",
"\n",
"results = (g.V('p-3').out('JOB').as_('j1'). # traverse from Person to each Job \n",
" out('ROLE'). # traverse to job's Role\n",
" repeat(out('PARENT_ROLE')).until(outE().count().is_(0)). # climb the Role hierarchy...\n",
" emit().in_('ROLE').as_('j2'). # for each Role encountered in the hierarchy, traverse to the Jobs associated with that role\n",
" or_( \n",
" (where('j1', between('j2', 'j2')).by('from').by('from').by('to')), # filter based on the dates of the original Job (j1)\n",
" (where('j1', between('j2', 'j2')).by('to').by('from').by('to')),\n",
" (where('j1', lte('j2').and_(gt('j2'))).by('from').by('from').by('to').by('from'))\n",
" ).\n",
" order().by(id). \n",
" project('role', 'name'). # for each Job emitted from above\n",
" by(out('ROLE').values('name')). # get Role\n",
" by(in_('JOB').values('firstName', 'lastName').fold()). # get Person's name\n",
" toList())\n",
"\n",
"assert results == [{'role': 'Principal Analyst', 'name': ['Martha', 'Rivera']}, \n",
" {'role': 'Manager', 'name': ['Saanvi', 'Sarkar']}]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Query 2 – Who worked for Example Corp, and at which locations, between 2015-2017?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%unittest\n",
"\n",
"results = (g.\n",
" V('c-1').in_('COMPANY').\n",
" or_( \n",
" (has('from', between(datetime(2015,1,1), datetime(2018,1,1)))),\n",
" (has('to', between(datetime(2015,1,1), datetime(2018,1,1))))\n",
" ).\n",
" order().by(id).\n",
" project('name', 'location').\n",
" by(in_('JOB').values('firstName', 'lastName').fold()).\n",
" by(out('LOCATION').values('name', 'address').fold()).\n",
" toList())\n",
"\n",
"assert results == [{'name': ['Martha', 'Rivera'], 'location': ['HQ', '100 Main St, Anytown']},\n",
" {'name': ['Li', 'Juan'], 'location': ['HQ', '100 Main St, Anytown']},\n",
" {'name': ['John', 'Stiles'], 'location': ['Offices', 'Downtown, Anytown']}]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Broken tests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once again we've broken the test for Query 1. Not surprising given that role has been promoted to a vertex. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Query 1 [BROKEN] – Which companies has Li worked for, and in what roles?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%unittest\n",
"\n",
"results = (g.V('p-3').\n",
" out('JOB').\n",
" project('company', 'role').\n",
" by(out('COMPANY').values('name')).\n",
" by('role').\n",
" toList())\n",
"\n",
"assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n",
" {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Query 1 (revised) – Which companies has Li worked for, and in what roles?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%unittest\n",
"\n",
"results = (g.V('p-3').\n",
" out('JOB').\n",
" project('company', 'role').\n",
" by(out('COMPANY').values('name')).\n",
" by(out('ROLE').values('name')).\n",
" toList())\n",
"\n",
"assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n",
" {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}