# GraphETL Demo

This notebook accompanies the blog post for the GraphETL sample. It demonstrates basic queries using the graph traversal language Gremlin.

### Prerequisites

This notebook assumes you have successfully deployed the [GraphETL](https://github.com/aws-samples/etl-into-amazon-neptune-graph) sample.

### Visualize the graph

For queries that end with the `path()` step, click the `Graph` tab in the cell output to see the result rendered as a graph.

In [None]:
%%gremlin

g.V().bothE().otherV().path().by(elementMap())

### Exploring the graph

We can use the following queries to see which entities have been extracted from our media files, by counting vertices and then edges per label.

In [None]:
%%gremlin

g.V().label().groupCount().unfold().order().by(values, desc)

In [None]:
%%gremlin

g.E().label().groupCount()

### Exploring connections

Which node in our graph is connected to the most nodes?

In [None]:
%%gremlin

g.V()
 .project('node','degree')
 .by(id())
 .by(both().count())
 .order().by('degree',desc)
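
To make explicit what `both().count()` measures, here is a minimal plain-Python sketch over a hypothetical edge list. The node names and edges are made up for illustration and are not taken from the GraphETL data. It counts every incident edge regardless of direction, which is what the `degree` column above reports. One difference: a node with no edges does not show up in this sketch, whereas the Gremlin query lists it with a degree of 0.

```python
from collections import Counter

# Hypothetical toy edges as (source, label, target); not real GraphETL output.
edges = [
    ("Animal", "APPEARS_IN", "video01.mp4"),
    ("Animal", "APPEARS_IN", "video02.mp4"),
    ("Dog", "APPEARS_IN", "video01.mp4"),
    ("Animal", "APPEARS_WITH", "Dog"),
]


def degrees(edge_list):
    """Count incident edges per node, ignoring direction (like both().count())."""
    counts = Counter()
    for source, _label, target in edge_list:
        counts[source] += 1
        counts[target] += 1
    # Highest degree first, mirroring order().by('degree', desc).
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = degrees(edges)
print(ranking[0])
```

Because each edge is counted once per endpoint, two parallel edges to the same neighbor count twice. If you want the number of distinct neighbors instead, adding `dedup()` before `count()` in the traversal is one way to get it.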

Let's take a closer look at the node that's connected to the most other nodes (Animal). Replace `NODE_ID` in the query below with the ID returned by the previous query. An example ID looks like `3f257e684a3beb0e303fe0572ab07e1de2950880f59821b6ff7449013ee3a063`.

In [None]:
%%gremlin -le 40

g.V('NODE_ID')
 .bothE().otherV().path().by(elementMap())

What entities were detected in the `video01.mp4` media file?

In [None]:
%%gremlin

g.V()
 .hasLabel('video01.mp4')
 .bothE('APPEARS_IN').otherV()
 .path().by(elementMap())


Does `video01.mp4` share any entities with the `video02.mp4` media file?

In [None]:
%%gremlin

g.V()
 .hasLabel('video01.mp4')
 .repeat(bothE('APPEARS_IN').otherV()).times(2)
 .hasLabel('video02.mp4')
 .path().by(elementMap())
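
The two-hop traversal above walks from `video01.mp4` to an entity and on to `video02.mp4`, so the middle vertex of each path is a shared entity. Conceptually this is a set intersection. The sketch below shows that idea in plain Python, assuming a hypothetical `appears_in` mapping with made-up entity names. It is not how Neptune evaluates the query.

```python
# Hypothetical mapping from media file to the entities detected in it.
appears_in = {
    "video01.mp4": {"Animal", "Dog", "Tree"},
    "video02.mp4": {"Animal", "Car", "Tree"},
    "video03.mp4": {"Animal", "Person"},
}


def shared_entities(mapping, first, second):
    """Return the entities that appear in both media files."""
    return mapping[first] & mapping[second]


shared = shared_entities(appears_in, "video01.mp4", "video02.mp4")
print(sorted(shared))
```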


Which entities are present in all three of our media files?

In [None]:
%%gremlin -le 40

g.V()
 .where(out('APPEARS_IN').hasLabel('video01.mp4'))
 .where(out('APPEARS_IN').hasLabel('video02.mp4'))
 .where(out('APPEARS_IN').hasLabel('video03.mp4'))
 .bothE('APPEARS_IN').otherV()
 .hasLabel(within('video01.mp4','video02.mp4','video03.mp4'))
 .path().by(elementMap())
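
The three `where()` filters keep only vertices that have an `APPEARS_IN` edge to each of the three files. The remaining steps just walk back to the files so the result can be drawn. Here is a minimal plain-Python sketch of the filtering part, assuming a hypothetical list of `(entity, media_file)` pairs with made-up values.

```python
from collections import defaultdict

# Hypothetical APPEARS_IN edges as (entity, media file); made-up values.
edges = [
    ("Animal", "video01.mp4"),
    ("Animal", "video02.mp4"),
    ("Animal", "video03.mp4"),
    ("Dog", "video01.mp4"),
    ("Dog", "video02.mp4"),
    ("Person", "video03.mp4"),
]

required = {"video01.mp4", "video02.mp4", "video03.mp4"}

# Group the media files by entity, then keep entities that cover every file.
files_by_entity = defaultdict(set)
for entity, media_file in edges:
    files_by_entity[entity].add(media_file)

common = sorted(
    entity for entity, files in files_by_entity.items() if required <= files
)
print(common)
```

The subset test `required <= files` plays the role of the three chained `where()` steps: an entity passes only if none of the required files is missing.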

Which objects with a confidence score greater than 80 appear together with other objects that also have a confidence score greater than 80?

In [None]:
%%gremlin -le 40

g.V().has('TYPE','OBJECT')
 .has('CONFIDENCE',gt(80))
 .bothE('APPEARS_WITH').otherV()
 .has('CONFIDENCE',gt(80))
 .path().by(elementMap())
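
The query applies the same confidence filter on both ends of each `APPEARS_WITH` edge. The sketch below shows that logic in plain Python, assuming hypothetical `confidence` scores and `appears_with` pairs with made-up values. For simplicity it treats every entry as an object and does not model the `TYPE` property.

```python
# Hypothetical detections: entity name -> confidence score (made-up values).
confidence = {"Animal": 97.5, "Dog": 91.0, "Tree": 62.3, "Car": 85.4}

# Hypothetical APPEARS_WITH edges between detected objects.
appears_with = [
    ("Animal", "Dog"),
    ("Animal", "Tree"),
    ("Dog", "Car"),
    ("Tree", "Car"),
]


def confident_pairs(scores, pairs, threshold=80):
    """Keep pairs where both endpoints score strictly above the threshold."""
    return [
        (a, b)
        for a, b in pairs
        if scores[a] > threshold and scores[b] > threshold
    ]


pairs = confident_pairs(confidence, appears_with)
print(pairs)
```

The sketch lists each edge once. Because the traversal starts from every qualifying vertex and follows edges in both directions, the same pair can show up twice in the Gremlin result, once from each endpoint.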