{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Use Case 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the companies where X has worked, and the roles X held at each of them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What questions would we have to ask of our data?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_\"Which companies has X worked for, and in what roles?\"_\n",
    "\n",
    "Reviewing this question, we can identify several entities, attributes and relationships. We have the concept of a _company_, a _person_ (X), and a _role_. Further, a person _worked for_ a company.\n",
    "\n",
    "_Company_ and _Person_ are both entities, which we'll model as vertices with appropriate labels. For now, we'll assume a direct relationship between a person and a company: a person _WORKED_FOR_ a company. We'll make _role_ an attribute of this relationship, stored as a property on the edge.\n",
    "\n",
    "Adding in a few properties – _firstName_ and _lastName_ for a person, _name_ for a company – we end up with the following data model:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-04.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Keep it simple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Over the course of this exercise we'll see _role_ move around the model several times. At this stage it's a simple attribute of a relationship. In later steps we'll promote it to a vertex in its own right.\n",
    "\n",
    "As far as our current use case is concerned, role appears to be a simple value type, much like colour, height or weight. If it were a complex value type with several fields – such as address – or if there were explicit structural relations between its values – as there are in a category hierarchy – we would consider making it a vertex from the outset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sample dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll now create a sample dataset in line with our model. We'll include enough data to ensure that our queries have to exclude some portions of the graph in order to return a correct result."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/data-modelling-01.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating our sample data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext ipython_unittest\n",
    "%run '../util/neptune.py'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "neptune.clear()\n",
    "g = neptune.graphTraversal()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(g.\n",
    "   addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').\n",
    "   addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').\n",
    "   addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').\n",
    "   addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').\n",
    "   addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').\n",
    "   addV('Company').property(id,'c-1').property('name','Example Corp').\n",
    "   addV('Company').property(id,'c-2').property('name','AnyCompany').\n",
    "   V('p-1').addE('WORKED_FOR').to(V('c-1')).property('role','Principal Analyst').\n",
    "   V('p-2').addE('WORKED_FOR').to(V('c-1')).property('role','Senior Analyst').\n",
    "   V('p-3').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').\n",
    "   V('p-4').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').\n",
    "   V('p-5').addE('WORKED_FOR').to(V('c-2')).property('role','Manager').\n",
    "   V('p-3').addE('WORKED_FOR').to(V('c-2')).property('role','Associate Analyst').\n",
    "   toList())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Querying the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query 1 – Which companies has Li worked for, and in what roles?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To answer this question, we'll have to perform the following steps:\n",
    "\n",
    " 1. Start at the Person vertex representing Li\n",
    " 2. Follow WORKED_FOR edges to find each Company for whom Li has worked\n",
    " 3. Select the Company details, and the role property of the relationship"
   ]
  },
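  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before turning to Gremlin, it can help to see the three steps above applied to plain Python data. The cell below is only a sketch: `COMPANIES`, `WORKED_FOR` and `roles_for` are hypothetical names introduced for illustration, assumed to mirror the sample dataset. Nothing in it talks to Neptune or changes the graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative only: a plain-Python stand-in for the sample graph (an assumption, not part of the model).\n",
    "COMPANIES = {'c-1': 'Example Corp', 'c-2': 'AnyCompany'}\n",
    "\n",
    "# Each tuple stands for one WORKED_FOR edge: (person id, company id, role).\n",
    "WORKED_FOR = [\n",
    "    ('p-1', 'c-1', 'Principal Analyst'),\n",
    "    ('p-2', 'c-1', 'Senior Analyst'),\n",
    "    ('p-3', 'c-1', 'Analyst'),\n",
    "    ('p-4', 'c-1', 'Analyst'),\n",
    "    ('p-5', 'c-2', 'Manager'),\n",
    "    ('p-3', 'c-2', 'Associate Analyst'),\n",
    "]\n",
    "\n",
    "def roles_for(person_id):\n",
    "    # Hypothetical helper that mirrors steps 1 to 3 for a single person.\n",
    "    return [{'company': COMPANIES[company_id], 'role': role}  # step 3: company name plus edge role\n",
    "            for (pid, company_id, role) in WORKED_FOR         # step 2: walk the WORKED_FOR edges\n",
    "            if pid == person_id]                              # step 1: keep only this person's edges\n",
    "\n",
    "roles_for('p-3')"
   ]
  },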
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Write a failing unit test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = None  # TODO: replace with the query; until then the assertion below fails\n",
    "\n",
    "assert results == [{'company': 'Example Corp', 'role': 'Analyst'},\n",
    "                   {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### And write the query to make it pass"
   ]
  },
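  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%unittest\n",
    "\n",
    "results = (g.V('p-3').                        # 1. start at Li's Person vertex\n",
    "             outE('WORKED_FOR').as_('e').     # 2. follow each WORKED_FOR edge, labelling it 'e'\n",
    "             otherV().                        #    move to the Company at the other end\n",
    "             project('company', 'role').      # 3. build one map per company\n",
    "             by('name').                      #    company: the Company's name\n",
    "             by(select('e').values('role')).  #    role: the role property of edge 'e'\n",
    "             toList())\n",
    "\n",
    "assert results == [{'company': 'Example Corp', 'role': 'Analyst'},\n",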
    "                   {'company': 'AnyCompany', 'role': 'Associate Analyst'}]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}