# How to use Vector Enrichment Jobs for Map Matching

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

---

This notebook demonstrates how to use Amazon SageMaker geospatial capabilities to perform a vector-based map matching operation and visualize the results.

Map matching allows you to snap GPS coordinates to road segments. With Amazon SageMaker geospatial capabilities, you can run a Vector Enrichment Job (VEJ) for map matching. This type of job takes a CSV file with route information (such as the longitude, latitude, and timestamp of each GPS measurement) as input and produces GeoJSON output that contains the predicted route.
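
To make the idea of snapping concrete, the following toy sketch projects a single point onto the nearest of a few straight road segments. It is an illustration only and not the algorithm the service uses. The function names and the segment coordinates are made up for this example, and simple planar geometry is assumed.

```python
def project_onto_segment(point, seg_start, seg_end):
    """Return the point on the segment seg_start-seg_end that is closest to `point`."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return seg_start  # degenerate segment: both ends coincide
    # Relative position of the projection along the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_nearest_segment(point, segments):
    """Return the closest projected point over all segments (planar distance)."""
    best = None
    best_dist_sq = float("inf")
    for seg_start, seg_end in segments:
        cx, cy = project_onto_segment(point, seg_start, seg_end)
        dist_sq = (cx - point[0]) ** 2 + (cy - point[1]) ** 2
        if dist_sq < best_dist_sq:
            best, best_dist_sq = (cx, cy), dist_sq
    return best


# Hypothetical road segments and a noisy measurement next to the first one
roads = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (10.0, 5.0))]
snapped = snap_to_nearest_segment((2.5, 1.0), roads)
print(snapped)
```

A real map matching job works on full traces rather than isolated points, which is why the input also carries a route identifier and timestamps.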

The workflow is as follows:

- Step 1: [Import SageMaker geospatial capabilities SDK](#Import-SageMaker-geospatial-capabilities-SDK)
- Step 2: [Inspect input data and upload to S3](#Inspect-input-data-and-upload-to-S3)
- Step 3: [Create a Vector Enrichment Job (VEJ) for map matching](#Create-an-Vector-Enrichtment-Job-for-match-making)
- Step 4: [Export VEJ output to S3](#Export-VEJ-output-to-S3)
- Step 5: [Visualize predicted routes in Amazon SageMaker geospatial Map SDK](#Visualize-predicted-routes-in-Amazon-SageMaker-geospatial-Map-SDK)

## Prerequisites

This notebook runs with Kernel Geospatial 1.0. Note that the following policies need to be attached to the execution role that you used to run this notebook:

- AmazonSageMakerFullAccess
- AmazonSageMakerGeospatialFullAccess

You can see the policies attached to the role in the IAM console under the 'Permissions' tab. If required, add the policies using the 'Add Permissions' button.

In addition to these policies, ensure that the execution role's trust policy allows the SageMaker geospatial service to assume the role. To do so, add the following trust policy on the 'Trust relationships' tab:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "sagemaker.amazonaws.com",
                    "sagemaker-geospatial.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```
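
As a rough sketch, a trust policy document can also be checked programmatically for the two service principals listed above. The helper name `missing_trusted_services` is hypothetical. The example parses a copy of the policy shown above; in practice the document could be obtained from IAM, for example from the `AssumeRolePolicyDocument` field returned by `get_role`.

```python
import json

REQUIRED_SERVICES = {"sagemaker.amazonaws.com", "sagemaker-geospatial.amazonaws.com"}


def missing_trusted_services(trust_policy, required=REQUIRED_SERVICES):
    """Return the required service principals that no Allow/sts:AssumeRole statement trusts."""
    trusted = set()
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. a wildcard principal, which names no service
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        trusted.update(services)
    return set(required) - trusted


# Copy of the trust policy shown above
example_policy = json.loads("""
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": ["sagemaker.amazonaws.com", "sagemaker-geospatial.amazonaws.com"]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
""")

print(missing_trusted_services(example_policy))
```

An empty result means both services are trusted; any names in the result still need to be added to the trust policy.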

## Import SageMaker geospatial capabilities SDK

In [None]:
import boto3
import sagemaker
import sagemaker_geospatial_map

session = boto3.Session()
execution_role = sagemaker.get_execution_role()
geospatial_client = session.client(service_name="sagemaker-geospatial")

## Inspect input data and upload to S3

The following cells upload the example input data (synthetic GPS traces in CSV format) to an S3 bucket. The CSV file must contain a header line. The header names are used in the `MapMatchingConfig` of the Vector Enrichment Job to map the CSV columns to the expected attributes.

In [None]:
import pandas as pd

input_df = pd.read_csv("./data/example_gps_traces.csv")
input_df
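
Because the job relies on the header names, it can help to confirm that they are present before uploading the file. The sketch below uses only the standard library. The column names are the ones this notebook passes to `MapMatchingConfig`; the helper name `missing_columns` and the sample row (including its timestamp format) are placeholders invented for this example.

```python
import csv
import io

# Header names that this notebook maps in MapMatchingConfig
REQUIRED_COLUMNS = ["route_id", "timestamp", "longitude", "latitude"]


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header line."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in required if name not in header]


# Placeholder content; the values are not taken from the example data set
sample_csv = "route_id,timestamp,longitude,latitude\n1,2023-01-01T00:00:00Z,-122.33,47.61\n"
print(missing_columns(sample_csv))
```

For the real file, the same check could be applied to the text read from `./data/example_gps_traces.csv`.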

In [None]:
import boto3

sagemaker_session = sagemaker.Session()
s3_bucket = sagemaker_session.default_bucket()  # Alternatively you can use your custom bucket here.
bucket_prefix = "vej_example_map_matching"
map_matching_input_object_key = f"{bucket_prefix}/input/example_gps_traces.csv"

s3_client = boto3.client("s3")
# upload_file returns None, so there is no response to capture
s3_client.upload_file("./data/example_gps_traces.csv", s3_bucket, map_matching_input_object_key)

<a id="Create-an-Vector-Enrichtment-Job-for-match-making"></a>
## Create a Vector Enrichment Job for map matching

The following cell will define and start a Vector Enrichment Job for a map matching operation. Selected headers of the CSV file are mapped to required attributes for the map matching algorithm.

In [None]:
job_config = {
    "MapMatchingConfig": {
        "IdAttributeName": "route_id",
        "TimestampAttributeName": "timestamp",
        "XAttributeName": "longitude",
        "YAttributeName": "latitude",
    },
}

input_config = {
    "DataSourceConfig": {"S3Data": {"S3Uri": f"s3://{s3_bucket}/{map_matching_input_object_key}"}},
    "DocumentType": "CSV",
}

response = geospatial_client.start_vector_enrichment_job(
    Name="vej_example_map_matching",
    ExecutionRoleArn=execution_role,
    InputConfig=input_config,
    JobConfig=job_config,
)

vej_arn = response["Arn"]
vej_arn

In [None]:
import time
import datetime

# Check the status of the Vector Enrichment Job and wait until it reaches a terminal state
job_completed = False
while not job_completed:
    response = geospatial_client.get_vector_enrichment_job(Arn=vej_arn)
    print(
        "Job status: {} (Last update: {})".format(response["Status"], datetime.datetime.now()),
        end="\r",
    )
    if response["Status"] in ("FAILED", "STOPPED"):
        # Stop polling instead of looping forever when the job cannot complete
        raise RuntimeError("Vector Enrichment Job ended with status {}".format(response["Status"]))
    job_completed = response["Status"] == "COMPLETED"
    if not job_completed:
        time.sleep(30)
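
The polling pattern used here can be factored into a small reusable helper with an upper time limit. This is a minimal sketch rather than part of the SageMaker SDK: `wait_for_status` and its parameters are hypothetical names, and the status strings other than `COMPLETED` are assumptions. The demo feeds it a fake status source so that it runs without an AWS account.

```python
import time


def wait_for_status(fetch_status, success, failure, poll_seconds=30, timeout_seconds=3600):
    """Poll fetch_status() until it returns a success or failure status, or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in success:
            return status
        if status in failure:
            raise RuntimeError(f"Job ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in status {status} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Demo with a fake status source instead of a real job
fake_statuses = iter(["INITIALIZING", "IN_PROGRESS", "COMPLETED"])
final_status = wait_for_status(
    lambda: next(fake_statuses),
    success={"COMPLETED"},
    failure={"FAILED", "STOPPED"},
    poll_seconds=0,
)
print(final_status)
```

In this notebook, `fetch_status` could be `lambda: geospatial_client.get_vector_enrichment_job(Arn=vej_arn)["Status"]`.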

## Export VEJ output to S3

An export of a map matching VEJ produces the following output artifacts:
- `links.geojson` is a GeoJSON file containing links of the predicted route
- `waypoints.geojson` is a GeoJSON file containing the snap points provided in the input
- `mapmatch_output.json` is a regular JSON file containing links and snap points in a combined fashion
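
Since the GeoJSON artifacts are plain JSON, they can be inspected with the standard library alone. The sketch below groups the features of a feature collection by drive. The `driveId` property is the one the visualization cells of this notebook filter on; the sample document and the helper name `group_features_by_drive` are invented for this example, and the real files may carry additional properties.

```python
import json
from collections import defaultdict

# Hypothetical, heavily shortened stand-in for an exported GeoJSON file
sample_geojson = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"driveId": "1"},
     "geometry": {"type": "LineString", "coordinates": [[-122.33, 47.61], [-122.32, 47.61]]}},
    {"type": "Feature", "properties": {"driveId": "1"},
     "geometry": {"type": "LineString", "coordinates": [[-122.32, 47.61], [-122.31, 47.62]]}},
    {"type": "Feature", "properties": {"driveId": "2"},
     "geometry": {"type": "LineString", "coordinates": [[-122.40, 47.60], [-122.39, 47.60]]}}
  ]
}
"""


def group_features_by_drive(feature_collection):
    """Group GeoJSON features by the value of their `driveId` property."""
    groups = defaultdict(list)
    for feature in feature_collection.get("features", []):
        drive_id = feature.get("properties", {}).get("driveId")
        groups[drive_id].append(feature)
    return dict(groups)


groups = group_features_by_drive(json.loads(sample_geojson))
print({drive_id: len(features) for drive_id, features in groups.items()})
```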

The following cell exports the output of the VEJ to an S3 bucket.

In [None]:
bucket_output_prefix = f"{bucket_prefix}/output/"

response = geospatial_client.export_vector_enrichment_job(
    Arn=vej_arn,
    ExecutionRoleArn=execution_role,
    OutputConfig={"S3Data": {"S3Uri": f"s3://{s3_bucket}/{bucket_output_prefix}"}},
)

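# Wait until the VEJ output has been exported to S3
while response["ExportStatus"] != "SUCCEEDED":
    response = geospatial_client.get_vector_enrichment_job(Arn=vej_arn)
    print(
        "Export status: {} (Last update: {})".format(
            response["ExportStatus"], datetime.datetime.now()
        ),
        end="\r",
    )
    if response["ExportStatus"] == "FAILED":
        # Stop polling instead of looping forever when the export cannot succeed
        raise RuntimeError("Export of the Vector Enrichment Job failed")
    if response["ExportStatus"] != "SUCCEEDED":
        time.sleep(15)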

## Visualize predicted routes in Amazon SageMaker geospatial Map SDK

The following cells create an interactive map with the Amazon SageMaker geospatial Map SDK. The VEJ output (the predicted routes) is loaded from S3 into GeoPandas dataframes and then visualized in the embedded map.

In [None]:
Map = sagemaker_geospatial_map.create_map({"is_raster": True})
Map.set_sagemaker_geospatial_client(geospatial_client)

In [None]:
Map.render()

### Load predicted route data into geopandas dataframe

In [None]:
import boto3
import geopandas

s3_client = boto3.client("s3")


def get_file_content_from_s3(bucket_name, object_key):
    response = s3_client.get_object(Bucket=bucket_name, Key=object_key)
    return response.get("Body")


s3_bucket_resource = session.resource("s3").Bucket(s3_bucket)
s3_link_data_output_key = ""
s3_waypoint_data_output_key = ""

for s3_object in s3_bucket_resource.objects.filter(Prefix=bucket_output_prefix).all():
    if s3_object.key.endswith("links.geojson"):
        s3_link_data_output_key = s3_object.key
    if s3_object.key.endswith("waypoints.geojson"):
        s3_waypoint_data_output_key = s3_object.key

link_df = geopandas.read_file(get_file_content_from_s3(s3_bucket, s3_link_data_output_key))
waypoint_df = geopandas.read_file(get_file_content_from_s3(s3_bucket, s3_waypoint_data_output_key))
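
If the export has not finished or the prefix is wrong, the key variables above stay empty and the read fails with a less obvious error. One possible guard, sketched here, is a pure function that looks up the expected file names in a list of object keys and reports what is missing. The name `find_output_keys` and the `abc123` path component in the example keys are hypothetical.

```python
def find_output_keys(object_keys, suffixes=("links.geojson", "waypoints.geojson")):
    """Map each expected file name to the object key that ends with it, or raise if absent."""
    found = {}
    for key in object_keys:
        for suffix in suffixes:
            if key.endswith(suffix):
                found[suffix] = key
    missing = [suffix for suffix in suffixes if suffix not in found]
    if missing:
        raise FileNotFoundError("No exported object found for: " + ", ".join(missing))
    return found


# Hypothetical keys as they might be listed under the output prefix
example_keys = [
    "vej_example_map_matching/output/abc123/links.geojson",
    "vej_example_map_matching/output/abc123/waypoints.geojson",
    "vej_example_map_matching/output/abc123/mapmatch_output.json",
]
output_keys = find_output_keys(example_keys)
print(output_keys["links.geojson"])
```

In this notebook, the keys could come from `s3_bucket_resource.objects.filter(Prefix=bucket_output_prefix)` by passing each object's `key` attribute.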

### Add data of first drive (route_id 1) to map visualization 

In [None]:
dataset_links_drive_01 = Map.add_dataset(
    {"data": link_df.loc[link_df["driveId"] == "1"], "label": "drive_01_links"},
    auto_create_layers=True,
)

dataset_waypoints_drive_01 = Map.add_dataset(
    {"data": waypoint_df.loc[waypoint_df["driveId"] == "1"], "label": "drive_01_waypoints"},
    auto_create_layers=True,
)

### Add data of second drive (route_id 2) to map visualization 

In [None]:
dataset_links_drive_02 = Map.add_dataset(
    {"data": link_df.loc[link_df["driveId"] == "2"], "label": "drive_02_links"},
    auto_create_layers=True,
)

dataset_waypoints_drive_02 = Map.add_dataset(
    {"data": waypoint_df.loc[waypoint_df["driveId"] == "2"], "label": "drive_02_waypoints"},
    auto_create_layers=True,
)

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-geospatial|vector-enrichment-map-matching|vector-enrichment-map-matching.ipynb)
