{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The Labelled Property Graph Model\n",
    "\n",
    "Neptune supports two different types of graph data: RDF graphs and labelled property graphs. In this tutorial we're going to concentrate on the labelled property graph model. \n",
    "\n",
    "A labelled property graph consists of _vertices, edges, properties_ and _labels_. Here's an example of a labelled property graph showing a couple of users, together with some of the jobs they've had and the companies for whom they've worked:\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-example-graph.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Vertices\n",
    "\n",
    "Use vertices to represent entity instances. The vertex below has been labelled _User_, indicating that it represents a user in our dataset:\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-vertex.png\"/>\n",
    "\n",
    "### Vertex Properties\n",
    "\n",
    "Attach properties – key-value pairs – to a vertex to represent an entity's attributes. Our user vertex has \"firstName\", \"lastName\" and \"email\" properties.\n",
    "\n",
    "### Property Cardinality\n",
    "\n",
    "In Neptune, property values can have either ```single``` or ```set``` cardinality. The default cardinality in Neptune is ```set```, which permits multiple values per key. Using set cardinality we could, for example, specify multiple email addresses for our user:\n",
    "\n",
    "```email: [bsmith@example.com, b_smith@example.co.uk]```\n",
    "\n",
    "### Vertex Ids\n",
    "\n",
    "Every vertex has an ```id```. You can specify a String ```id``` when you create a vertex. If you don't specify an ```id```, Neptune will create a UUID-based String ```id``` for you.\n",
    "\n",
    "You could if you wanted also create a property called \"id\", but in general you should avoid doing this because of the potential for confusion it introduces into your model.\n",
    "\n",
    "### Vertex Labels\n",
    "\n",
    "Our user node has been labelled _User_. Vertex labels allow us to indicate the role(s) each vertex plays within our domain.\n",
    "\n",
    "Each vertex must have at least one label. Unlike some other Tinkerpop-enabled graph databases, Neptune allows you to attach multiple labels to a vertex. The Vertex below has been labelled both _User_ and _Admin_:\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-multiple-labels.png\"/>\n",
    "\n",
    "Note also that the vertex representing Dr Sarah Jones includes a \"title\" property, something our other user vertex is lacking. This illustrates the schema-free nature of Neptune: no two vertices, even those sharing the same label, need have the exact same set of properties."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Edges\n",
    "\n",
    "The edge labelled _Role_ show below connects a job vertex to a role vertex:\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-edge.png\"/>\n",
    "\n",
    "Edges represent the connections between entities. By connecting pairs of vertices with edges, we introduce structure into our domain model. Every edge _must_ have a start vertex and an end vertex, and exactly one label. An edge's label and direction lend semantic clarity and context to the vertices attached to the edge. \n",
    "\n",
    "### Edge Properties\n",
    "\n",
    "Just like vertices, edges can also have properties associated with them. We typically use edge properties to represent the strength, weight or quality of a connection, or to attach some metadata, such as a timestamp, to the edge. \n",
    "\n",
    "The edge show below includes a \"likelihood\" property with a numeric value, indicating the likelihood that Bob knows Sarah.\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-edge-properties.png\"/>\n",
    "\n",
    "### Edge Ids\n",
    "\n",
    "Just as with vertices, every edge has an `id` whose vaue you can specify when you create the edge. If you don't supply a value, Neptune will automatically generate one.\n",
    "\n",
    "(We've omitted edge ids from our diagrams.)\n",
    "\n",
    "### Self Edges\n",
    "\n",
    "Every edge must have a start vertex and an end vertex, but these can be the same vertex, thereby indicating a self relationship:\n",
    "\n",
    "<img src=\"https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/02-self-edge.png\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Representing Complexity\n",
    "\n",
    "The labelled property graph model provides a powerful abstraction for modelling both variable structure and connectedness. \n",
    "\n",
    "Variable structure is provided for by virtue of connections being specified at the instance level rather than the class level. Edges join individual vertices, not classes of vertices – that is, each edge in the graph represents a specific connection between two particular things. In consequence, no two vertices need be connected in exactly the same way to their neighbours, and as a result, no two subgraphs need be structured exactly alike. It's this instance-level focus on things and the connections between things that makes graphs ideal for representing and navigating a variably structured domain.\n",
    "\n",
    "Edges not only specify that two things are connected; they also describe the nature and quality of that connection. To the extent that complexity is a function of the ways in which the semantic, structural and qualitative aspects of the connections in a domain can vary, our data models require a means of expressing and exploiting this connectedness. The labelled property graph model, wherein every edge can not only be specified independently of every other, but also annotated with properties that describe how and in what degree, and with what weight, strength or quality, entities are connected, provides a powerful means for representing the nature of this connectedness."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}