# Welcome to the Amazon Neptune and AWS CDK for Amundsen project!
## Overview
This project has an [associated blog](https://aws.amazon.com/blogs/database/category/database/amazon-neptune/) that describes Amundsen and this solution in greater detail.
### What is Amundsen?
[Amundsen](https://github.com/amundsen-io/amundsen) is a data discovery and metadata engine for improving the productivity of data analysts, data scientists, and engineers when interacting with data. It does that today by indexing data resources (tables, dashboards, streams, etc.) and powering a page-rank style search based on usage patterns (e.g. highly queried tables show up earlier than less queried tables). Think of it as Google search for data. The project is named after Norwegian explorer Roald Amundsen, the first person to reach the South Pole.
### Why this Project?
The goal of this project is to simplify the provisioning and configuration of an environment in which you can take advantage of Amundsen. This project uses [Amazon Neptune](https://aws.amazon.com/neptune/) and [Amazon Elasticsearch Service](https://aws.amazon.com/elasticsearch-service/) for the Amundsen Metadata and Search Services, and uses [AWS Cloud Development Kit (AWS CDK)](https://aws.amazon.com/cdk/) to synthesize the CloudFormation templates needed to provision the infrastructure as code. In addition, this project provisions [Amazon RDS for PostgreSQL](https://aws.amazon.com/rds/postgresql/), [Amazon Redshift](https://aws.amazon.com/redshift/), and other resources to streamline the loading and indexing of a sample data dump from an existing Amazon Neptune sample project - [Knowledge Graph Chatbot Full Stack Application](https://github.com/aws-samples/amazon-neptune-samples/blob/master/gremlin/chatbot-full-stack-application).
## Solution Overview
### Architecture
### Customization
The `cdk.json` file stores variables used by the AWS CDK Toolkit. The default values should be acceptable for most environments. If you change any of them, rerun the [Build and Deploy](#build-and-deploy) steps below.
Customize the AWS CDK Toolkit with the following custom variables in `cdk.json`:

- `vpc-cidr`
- `rds-engine`
- `rds-port`
- `rds-database`
- `sample-data-s3-bucket`
- `sample-data-rds-dump-filename`
- `sample-data-redshift-query-s3-bucket`
- `sample-data-redshift-query-filename`
- `application`
- `environment`
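
As an alternative to editing `cdk.json`, the AWS CDK CLI accepts `--context key=value` options, and values passed this way should take precedence over those in `cdk.json`. The sketch below assumes the variables listed above are read as CDK context values, which this README does not confirm. The function name `synth_with_overrides` and the example values are hypothetical.

```shell
# Sketch: run `cdk synth` with one or more KEY=VALUE context overrides.
# Assumption: the project reads the variables above from CDK context.
synth_with_overrides() {
  n=$#
  # Rewrite each KEY=VALUE argument as "--context KEY=VALUE".
  while [ "$n" -gt 0 ]; do
    set -- "$@" --context "$1"
    shift
    n=$((n - 1))
  done
  cdk synth "$@"
}
```

For example, `synth_with_overrides vpc-cidr=10.1.0.0/16 environment=dev` would run `cdk synth --context vpc-cidr=10.1.0.0/16 --context environment=dev`. Both values are illustrative only.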
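
Amazon Elasticsearch Service relies on a service-linked role. If your account does not already have one, create it by running `aws iam create-service-linked-role --aws-service-name es.amazonaws.com`.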
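
The command above fails if the service-linked role already exists, so it can help to check first. The following is a minimal sketch, assuming the role carries the standard name `AWSServiceRoleForAmazonElasticsearchService`. The function name `ensure_es_service_linked_role` is hypothetical.

```shell
# Sketch: create the Amazon Elasticsearch Service service-linked role only if missing.
# Assumption: the role uses the standard name AWSServiceRoleForAmazonElasticsearchService.
ensure_es_service_linked_role() {
  if aws iam get-role --role-name AWSServiceRoleForAmazonElasticsearchService >/dev/null 2>&1; then
    echo "Service-linked role already exists; nothing to do."
  else
    # get-role can also fail for other reasons (for example, missing permissions);
    # in that case this call surfaces the underlying error.
    aws iam create-service-linked-role --aws-service-name es.amazonaws.com
  fi
}
```

Call `ensure_es_service_linked_role` once per AWS account before deploying.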
### Build and Deploy
From a terminal window in Cloud9, clone the GitHub repo, install packages, build, and synthesize the CloudFormation templates using the commands below. By default, the AWS CDK prompts you to confirm before deploying changes. To skip confirmations, add the `--require-approval never` option to the `cdk deploy` commands. In addition, you can deploy all stacks at once by issuing `cdk deploy --all` rather than issuing a separate `cdk deploy` command for each stack.

```shell
git clone https://github.com/aws-samples/amazon-neptune-samples
cd amazon-neptune-samples/amazon-neptune-and-aws-cdk-for-amundsen

# Update to latest npm
npm install -g npm@latest

# Install packages
npm install

# Build
npm run build

# Bootstrap AWS Cloud Development Kit (AWS CDK)
cdk bootstrap

# Synthesize CloudFormation
cdk synth

# Deploy each stack
cdk deploy Amundsen-Blog-VPC-Stack
cdk deploy Amundsen-Blog-RDS-Stack
cdk deploy Amundsen-Blog-Redshift-Stack
cdk deploy Amundsen-Blog-Bastion-Stack
cdk deploy Amundsen-Blog-Amundsen-Stack
cdk deploy Amundsen-Blog-Databuilder-Stack
```

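
If you prefer an unattended run that still deploys the stacks one at a time, the individual `cdk deploy` commands can be wrapped in a loop. This is a sketch rather than part of the project: the function name `deploy_all_stacks` is hypothetical, and it assumes the order listed above is the intended deployment order.

```shell
# Sketch: deploy the six stacks in the listed order, stopping at the first failure.
deploy_all_stacks() {
  for stack in \
    Amundsen-Blog-VPC-Stack \
    Amundsen-Blog-RDS-Stack \
    Amundsen-Blog-Redshift-Stack \
    Amundsen-Blog-Bastion-Stack \
    Amundsen-Blog-Amundsen-Stack \
    Amundsen-Blog-Databuilder-Stack
  do
    echo "Deploying ${stack}..."
    # --require-approval never skips the interactive confirmation prompt.
    cdk deploy "${stack}" --require-approval never || return 1
  done
}
```

Run `deploy_all_stacks` from the project directory after `cdk bootstrap` has completed. Stopping at the first failure avoids deploying later stacks on top of an incomplete environment.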
The Amundsen Frontend hostname will be output to multiple places. First, the AWS CDK console output will include the following:
`Amundsen-Blog-Amundsen-Stack.amundsenfrontendhostname =