# Gremlin Tutorial

This example is a gremlin tutorial that shows how to explore the graph with sample queries. It also shows how to make recommendations using collaborative filtering. 

## Prerequisite

This tutorial assumes you already have your environment setup. To setup a new environment, create an Amazon Neptune Cluster. 

See the following links on how to create an Amazon Neptune Cluster for Gremlin and setup IAM authentication:  

* https://docs.aws.amazon.com/neptune/latest/userguide/get-started-CreateInstance-Console.html
* https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-IAM.html
* https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load.html
* https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-format-gremlin.html

You will also need to load the data [files](data/) into an S3 bucket. 


## Use Case

In this tutorial, we'll traverse console game preferences among a small set of gamers and games. We'll explore commonality, preferences and make potential game recommendations. These queries are for the purposes of learning gremlin and Amazon Neptune. 


![cloudformation](images/image1.jpg)


## Step 1 (Load Data Sample data)


**Game & Player Vertices** (~id,~label,GamerAlias:String,ReleaseDate:Date,GameGenre:String,ESRBRating:String,Developer:String,Platform:String,GameTitle:String)

```
curl -X POST \
    -H 'Content-Type: application/json' \
    http://your-neptune-endpoint:8182/loader -d '
    { 
      "source" : "s3://your-s3-bucket/vertex.txt", 
      "iamRoleArn" : "arn:aws:iam::account-id:role/role-name",
      "format" : "csv", 
      "region" : "us-east-1", 
      "failOnError" : "FALSE"
    }'
```

**Edges** (~id, ~from, ~to, ~label, weight:Double) 
```
curl -X POST \
    -H 'Content-Type: application/json' \
    http://your-neptune-endpoint:8182/loader -d '
    { 
      "source" : "s3://your-s3-bucket/recommendation/edges.txt", 
      "iamRoleArn" : "arn:aws:iam::account-id:role/role-name",
      "format" : "csv", 
      "region" : "us-east-1", 
      "failOnError" : "FALSE"
    }'
```

**Tip.** Alternatively, you could load all of the files by loading the entire directory
```
curl -X POST \
    -H 'Content-Type: application/json' \
    http://your-neptune-endpoint:8182/loader -d '
    { 
      "source" : "s3://your-s3-bucket/", 
      "iamRoleArn" : "arn:aws:iam::account-id:role/role-name",
      "format" : "csv", 
      "region" : "us-east-1", 
      "failOnError" : "FALSE"
    }'
```

**Tip.**
Upon executing each curl command, Neptune will return the loadId associated with each request. You can check the status of your load with the following command:
```
curl http://your-neptune-endpoint:8182/loader?loadId=[loadId value]
```

For more information about loading data into Amazon Neptune visit: https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load.html

## Sample Queries


**Query for a particular vertex (gamer)**

```
gremlin> g.V().hasId('Luke').valueMap()
==>{GamerAlias=[skywalker123]}

gremlin> g.V().has("GamerAlias","skywalker123").valueMap()
==>{GamerAlias=[skywalker123]}

gremlin> g.V().has('GamerAlias','skywalker123')
==>v[Luke]
```

**Sample some of the edges (limit 5)**
```
gremlin> g.E().limit(5)
==>e[e25][Luke-likes->SuperMarioOdyssey]
==>e[e26][Mike-likes->SuperMarioOdyssey]
==>e[e8][Mike-likes->CallOfDutyBO4]
==>e[e1][Luke-likes->HorizonZeroDawn]
==>e[e9][Mike-likes->GranTurismoSport]
```

**Sample some of the vertices (limit 4)**
```
gremlin> g.V().limit(4)
==>v[SuperMarioOdyssey]
==>v[Luke]
==>v[Emma]
==>v[MarioKart8]
```

**Count the in-degree centrality of incoming edges to each vertex**
```
gremlin> g.V().group().by().by(inE().count())
==>{v[HorizonZeroDawn]=2, v[Luke]=0, v[ARMS]=2, v[Ratchet&Clank]=3, v[SuperMarioOdyssey]=3, v[GravityRush]=2, v[CallOfDutyBO4]=1, v[MarioKart8]=3, v[Fifa18]=1, v[Nioh]=1, v[Mike]=0, v[Knack]=2, v[Lina]=0, v[TombRaider]=2, v[GranTurismoSport]=2, v[Emma]=0}
```

**Count the out-degree centrality of outgoing edges from each vertex**
```
gremlin> g.V().group().by().by(outE().count())
==>{v[HorizonZeroDawn]=0, v[Luke]=8, v[ARMS]=0, v[Ratchet&Clank]=0, v[SuperMarioOdyssey]=0, v[GravityRush]=0, v[CallOfDutyBO4]=0, v[MarioKart8]=0, v[Fifa18]=0, v[Nioh]=0, v[Mike]=8, v[Knack]=0, v[Lina]=6, v[TombRaider]=0, v[GranTurismoSport]=0, v[Emma]=2}

```
**Count the out-degree centrality of outgoing edges from each vertex by order of degree**
```
gremlin> g.V().project("v","degree").by().by(bothE().count()).order().by(select("degree"), decr)
==>{v=v[Luke], degree=8}
==>{v=v[Mike], degree=8}
==>{v=v[Lina], degree=6}
==>{v=v[SuperMarioOdyssey], degree=3}
==>{v=v[MarioKart8], degree=3}
==>{v=v[Ratchet&Clank], degree=3}
==>{v=v[Emma], degree=2}
==>{v=v[HorizonZeroDawn], degree=2}
==>{v=v[GranTurismoSport], degree=2}
==>{v=v[ARMS], degree=2}
==>{v=v[GravityRush], degree=2}
==>{v=v[TombRaider], degree=2}
==>{v=v[Knack], degree=2}
==>{v=v[Fifa18], degree=1}
==>{v=v[Nioh], degree=1}
==>{v=v[CallOfDutyBO4], degree=1}
```

**Return only the vertices that are games**
```
gremlin> g.V().hasLabel('game')
==>v[Mario+Rabbids]
==>v[ARMS]
==>v[HorizonZeroDawn]
==>v[GranTurismoSport]
==>v[Ratchet&Clank]
==>v[Fifa18]
==>v[GravityRush]
==>v[Nioh]
==>v[TombRaider]
==>v[CallOfDutyBO4]
==>v[Knack]
==>v[SuperMarioOdyssey]
==>v[MarioKart8]
```

**Return only the vertices that are gamers**
```
gremlin> g.V().hasLabel('person')
==>v[Luke]
==>v[Emma]
==>v[Lina]
==>v[Mike]
```

**Return counts of games grouped by game genre**
```
gremlin> g.V().hasLabel('game').groupCount().by("GameGenre")
==>{Shooter=2, Action=3, Adventure=5, Racing=2, Sports=1}
```
**Return counts of games grouped by developer**
```
gremlin> g.V().hasLabel('game').groupCount().by("Developer")
==>{Activision=1, Nintendo=3, Square Enix=1, Guerrilla Games=1, Sony Interactive Entertainment=2, Insomniac Games=1, Electronic Arts=1, Project Siren=1, Ubisoft=1, Team Ninja=1}
```

**Return counts of games grouped by Platform**
```
gremlin> g.V().hasLabel('game').groupCount().by("Platform")
==>{PS4=9, Switch=4}
```

**What is the average weighted rating of MarioKart8?**
```
gremlin> g.V().hasLabel('game').has('GameTitle','MarioKart8').inE('likes').values('weight').mean()
==>0.6333333333333334
```

**What games does skywalker123 like?**
```
gremlin> g.V().has('GamerAlias','skywalker123').as('gamer').out('likes')
==>v[ARMS]
==>v[HorizonZeroDawn]
==>v[GranTurismoSport]
==>v[Ratchet&Clank]
==>v[Fifa18]
==>v[GravityRush]
==>v[SuperMarioOdyssey]
==>v[MarioKart8]
```

**What games does skywalker123 like using weight (greater than)?**
```
gremlin> g.V().has('GamerAlias','skywalker123').outE("likes").has('weight', P.gt(0.7f))
==>e[e17][Luke-likes->Mario+Rabbids]
==>e[e3][Luke-likes->Ratchet&Clank]
```

**What games does skywalker123 like using weight (less than)?**
```
gremlin> g.V().has('GamerAlias','skywalker123').outE("likes").has('weight', P.lt(0.5f))
==>e[e1][Luke-likes->HorizonZeroDawn]
==>e[e2][Luke-likes->GranTurismoSport]
==>e[e4][Luke-likes->Fifa18]
==>e[e5][Luke-likes->GravityRush]
==>e[e21][Luke-likes->MarioKart8]
```

**Who else likes the same games?**
```
gremlin> g.V().has('GamerAlias','skywalker123').out('likes').in('likes').dedup().values('GamerAlias')
==>forchinet
==>skywalker123
==>bringit32
==>smiles007
```

**Who else likes these games (exclude yourself)?**
```
gremlin> g.V().has('GamerAlias','skywalker123').as('TargetGamer').out('likes').in('likes').where(neq('TargetGamer')).dedup().values('GamerAlias')
==>forchinet
==>bringit32
==>smiles007
```

**What are other game titles do other gamers like, who have commonality?**
```
gremlin> g.V().has('GamerAlias','skywalker123').as('TargetGamer').out('likes').in('likes').where(neq('TargetGamer')).out('likes').dedup().values('GameTitle')
==>ARMs
==>HorizonZeroDawn
==>GranTurismoSport
==>Nioh
==>TombRaider
==>CallOfDutyBO4
==>SuperMarioOdyssey
==>MarioKart8
==>Ratchet&Clank
==>GravityRush
==>Knack
```

**Which games might make sense to recommend to a specific gamer that they don't already like?**
```
gremlin> g.V().has('GamerAlias','skywalker123').as('TargetGamer').out('likes').aggregate('self').in('likes').where(neq('TargetGamer')).out('likes').where(without('self')).dedup().values('GameTitle')
==>Nioh
==>TombRaider
==>CallOfDutyBO4
==>Knack
```

For more recommendation example queries or other gremlin recipes you can also visit: http://tinkerpop.apache.org/docs/current/recipes/#recommendation


**Drop data**
```
gremlin> g.V().drop().iterate()

```
## License Summary

This sample code is made available under a modified MIT license. See the LICENSE file.