{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Use Case 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the people in more senior roles at the companies where X worked"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What questions would we have to ask of our data?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_\"Who were in senior roles at the companies where X worked?\"_\n",
    "\n",
    "Once again, we're looking at roles. But now we see there are some structural relations between roles – a role hierarchy. Structural relations are best modelled using edges. Therefore, we will now promote role to become a vertex:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-08.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With role now a vertex, our overall data model looks like this:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-09.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sample dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-03.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating some revised sample data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext ipython_unittest\n",
    "%run '../util/neptune.py'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "neptune.clear()\n",
    "g = neptune.graphTraversal()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import *\n",
    "\n",
    "(g.\n",
    "   addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').\n",
    "   addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').\n",
    "   addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').\n",
    "   addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').\n",
    "   addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').\n",
    "   addV('Role').property(id,'r-1').property('name','Analyst').\n",
    "   addV('Role').property(id,'r-2').property('name','Senior Analyst').\n",
    "   addV('Role').property(id,'r-3').property('name','Principal Analyst').\n",
    "   addV('Role').property(id,'r-4').property('name','Associate Analyst').\n",
    "   addV('Role').property(id,'r-5').property('name','Manager').\n",
    "   addV('Company').property(id,'c-1').property('name','Example Corp').\n",
    "   addV('Company').property(id,'c-2').property('name','AnyCompany').\n",
    "   addV('Location').property(id,'l-1').property('name','HQ').property('address','100 Main St, Anytown').\n",
    "   addV('Location').property(id,'l-2').property('name','Offices').property('address','Downtown, Anytown').\n",
    "   addV('Location').property(id,'l-3').property('name','Exchange').property('address','50 High St, Anytown').\n",
    "   addV('Job').property(id,'j-1').property('from',datetime(2010,10,20)).property('to',datetime(2017,11,1)).\n",
    "   addV('Job').property(id,'j-2').property('from',datetime(2011,2,16)).property('to',datetime(2013,9,17)).\n",
    "   addV('Job').property(id,'j-3').property('from',datetime(2013,11,21)).property('to',datetime(2016,3,23)).\n",
    "   addV('Job').property(id,'j-4').property('from',datetime(2015,2,2)).property('to',datetime(2018,2,8)).\n",
    "   addV('Job').property(id,'j-5').property('from',datetime(2011,7,15)).property('to',datetime(2017,10,14)).\n",
    "   addV('Job').property(id,'j-6').property('from',datetime(2012,3,23)).property('to',datetime(2013,11,1)).\n",
    "   V('r-1').addE('PARENT_ROLE').to(V('r-2')).\n",
    "   V('r-2').addE('PARENT_ROLE').to(V('r-3')).\n",
    "   V('r-4').addE('PARENT_ROLE').to(V('r-5')).\n",
    "   V('c-1').addE('LOCATION').to(V('l-1')).\n",
    "   V('c-1').addE('LOCATION').to(V('l-2')).\n",
    "   V('c-2').addE('LOCATION').to(V('l-3')). \n",
    "   V('p-1').addE('JOB').to(V('j-1')).\n",
    "   V('j-1').addE('ROLE').to(V('r-3')).\n",
    "   V('j-1').addE('COMPANY').to(V('c-1')).\n",
    "   V('j-1').addE('LOCATION').to(V('l-1')).                            \n",
    "   V('p-2').addE('JOB').to(V('j-2')).\n",
    "   V('j-2').addE('ROLE').to(V('r-2')).\n",
    "   V('j-2').addE('COMPANY').to(V('c-1')).\n",
    "   V('j-2').addE('LOCATION').to(V('l-2')).                            \n",
    "   V('p-3').addE('JOB').to(V('j-3')).\n",
    "   V('j-3').addE('ROLE').to(V('r-1')).\n",
    "   V('j-3').addE('COMPANY').to(V('c-1')).\n",
    "   V('j-3').addE('LOCATION').to(V('l-1')).\n",
    "   V('p-4').addE('JOB').to(V('j-4')).\n",
    "   V('j-4').addE('ROLE').to(V('r-1')).\n",
    "   V('j-4').addE('COMPANY').to(V('c-1')).\n",
    "   V('j-4').addE('LOCATION').to(V('l-2')).                              \n",
    "   V('p-5').addE('JOB').to(V('j-5')).\n",
    "   V('j-5').addE('ROLE').to(V('r-5')).\n",
    "   V('j-5').addE('COMPANY').to(V('c-2')).\n",
    "   V('j-5').addE('LOCATION').to(V('l-3')).\n",
    "   V('p-3').addE('JOB').to(V('j-6')).\n",
    "   V('j-6').addE('ROLE').to(V('r-4')).\n",
    "   V('j-6').addE('COMPANY').to(V('c-2')).\n",
    "   V('j-6').addE('LOCATION').to(V('l-3')).\n",
    "   toList())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Querying the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query 3 – Who were in senior roles at the companies where Li worked?"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To answer this question, we'll have to perform the following steps:\n",
    "\n",
    " 1. Start at the Person's vertex\n",
    " 2. Follow JOB and ROLE edges to Roles\n",
    " 3. Traverse up Role hierarchy\n",
    " 4. For each parent Role: \n",
    "   - Get associated Jobs\n",
    "   - Filter Jobs by date\n",
    "   - Get Role and Person details for each Job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = (g.V('p-3').out('JOB').as_('j1'). # traverse from Person to each Job \n",
    " out('ROLE'). # traverse to job's Role\n",
    " repeat(out('PARENT_ROLE')).until(outE().count().is_(0)). # climb the Role hierarchy...\n",
    " emit().in_('ROLE').as_('j2'). # for each Role encountered in the hierarchy, traverse to the Jobs associated with that role\n",
    "   or_(        \n",
    "     (where('j1', between('j2', 'j2')).by('from').by('from').by('to')), # filter based on the dates of the original Job (j1)\n",
    "     (where('j1', between('j2', 'j2')).by('to').by('from').by('to')),\n",
    "     (where('j1', lte('j2').and_(gt('j2'))).by('from').by('from').by('to').by('from'))\n",
    "   ).\n",
    " order().by(id). \n",
    " project('role', 'name'). # for each Job emitted from above\n",
    "   by(out('ROLE').values('name')). # get Role\n",
    "   by(in_('JOB').values('firstName', 'lastName').fold()). # get Person's name\n",
    " toList())\n",
    "\n",
    "assert results == [{'role': 'Principal Analyst', 'name': ['Martha', 'Rivera']}, \n",
    "                   {'role': 'Manager', 'name': ['Saanvi', 'Sarkar']}]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query 2 – Who worked for Example Corp, and at which locations, between 2015-2017?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = (g.\n",
    " V('c-1').in_('COMPANY').\n",
    "   or_(        \n",
    "       (has('from', between(datetime(2015,1,1), datetime(2018,1,1)))),\n",
    "       (has('to', between(datetime(2015,1,1), datetime(2018,1,1))))\n",
    "     ).\n",
    " order().by(id).\n",
    " project('name', 'location').\n",
    "   by(in_('JOB').values('firstName', 'lastName').fold()).\n",
    "   by(out('LOCATION').values('name', 'address').fold()).\n",
    " toList())\n",
    "\n",
    "assert results == [{'name': ['Martha', 'Rivera'], 'location': ['HQ', '100 Main St, Anytown']},\n",
    " {'name': ['Li', 'Juan'], 'location': ['HQ', '100 Main St, Anytown']},\n",
    " {'name': ['John', 'Stiles'], 'location': ['Offices', 'Downtown, Anytown']}]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Broken tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once again we've broken the test for Query 1. Not surprising given that role has been promoted to a vertex. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query 1 [BROKEN] – Which companies has Li worked for, and in what roles?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = (g.V('p-3').\n",
    "             out('JOB').\n",
    "             project('company', 'role').\n",
    "             by(out('COMPANY').values('name')).\n",
    "             by('role').\n",
    "             toList())\n",
    "\n",
    "assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n",
    "                   {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query 1 (revised) – Which companies has Li worked for, and in what roles?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = (g.V('p-3').\n",
    "             out('JOB').\n",
    "             project('company', 'role').\n",
    "             by(out('COMPANY').values('name')).\n",
    "             by(out('ROLE').values('name')).\n",
    "             toList())\n",
    "\n",
    "assert results == [{'company': 'Example Corp', 'role': 'Analyst'}, \n",
    "                   {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}