# Prepare inputs for Amazon QuickSight visualization

Amazon QuickSight is a cloud-scale business intelligence (BI) service that you can use to deliver easy-to-understand insights to the people who you work with, wherever they are. Amazon QuickSight connects to your data in the cloud and combines data from many different sources. In a single data dashboard, QuickSight can include AWS data, third-party data, big data, spreadsheet data, SaaS data, B2B data, and more. As a fully managed cloud-based service, Amazon QuickSight provides enterprise-grade security, global availability, and built-in redundancy. It also provides the user-management tools that you need to scale from 10 users to 10,000, all with no infrastructure to deploy or manage.

In this notebook, we will prepare the manifest files that Amazon QuickSight needs to visualize the insights we generated from our customer call transcripts.

### Initialize libraries and import variables

In [None]:
# import libraries
import pandas as pd
import boto3
import json
import csv
import os

# initialize variables we need
infile = 'quicksight_raw_manifest.json'
outfile = 'quicksight_formatted_manifest_type.json'

inprefix = 'quicksight/data'
manifestprefix = 'quicksight/manifest'

bucket = '' # Enter your bucket name here

s3 = boto3.client('s3')

# Check that the bucket exists and that we can access it
try:
    s3.head_bucket(Bucket=bucket)
except Exception:
    print("The S3 bucket name {} you entered seems to be incorrect, please try again".format(bucket))

### Review transcripts with insights for QuickSight
When we ran the previous notebooks, we created CSV files containing:

- the speaker and time segmentation of the transcripts,
- the inference results that classified the transcripts as CTA/No CTA using Amazon Comprehend custom classification,
- the custom entities detected using the Amazon Comprehend custom entity recognizer, and
- the sentiment of the call transcripts, detected using the Amazon Comprehend sentiment analysis feature.

These files should now be in the S3 bucket under the `quicksight/data` prefix (the `inprefix` variable). Let us review them.
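To review these files from Python rather than the AWS CLI, something like the following could be used. This is a minimal sketch, assuming an S3 client created with `boto3.client('s3')`; the helper name `list_csv_keys` is hypothetical and not part of the workshop code. It uses the `list_objects_v2` paginator so that prefixes holding more than 1,000 objects are listed completely.

```python
def list_csv_keys(s3_client, bucket, prefix):
    """Return the keys of all CSV objects stored under s3://bucket/prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent from a page when nothing matches the prefix
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(".csv"):
                keys.append(obj["Key"])
    return keys
```

With the variables from the first cell, `list_csv_keys(s3, bucket, inprefix)` would return the CSV keys for all dataset types in one list.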

In [None]:
# Let's review what CSV files we have for QuickSight
!aws s3 ls s3://{bucket}/{inprefix} --recursive

### Update QuickSight Manifest
We will replace the placeholder S3 bucket and prefix in the raw manifest file with the values you entered in STEP 0 - CELL 1 above. We will then create one formatted manifest file per dataset type (transcripts, entity, CTA, and sentiment). Amazon QuickSight uses these manifests to create datasets from the insights we generated from the call transcripts.
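To make the substitution concrete, here is a minimal sketch that runs without AWS access. The sample manifest is an assumption about what `quicksight_raw_manifest.json` contains (a single `URIPrefixes` entry using the literal placeholders `bucket` and `prefix`, plus typical CSV upload settings); the real file may differ. The bucket name `my-example-bucket` and the helper `format_manifest` are hypothetical.

```python
import copy
import json

# Assumed shape of the raw manifest; the real file may differ.
RAW_MANIFEST = {
    "fileLocations": [{"URIPrefixes": ["s3://bucket/prefix/"]}],
    "globalUploadSettings": {"format": "CSV", "containsHeader": "true"},
}


def format_manifest(raw, bucket, prefix):
    """Return a copy of `raw` with the placeholder bucket and prefix filled in."""
    formatted = copy.deepcopy(raw)  # leave the raw template untouched
    template = json.dumps(raw["fileLocations"][0]["URIPrefixes"])
    filled = template.replace("bucket", bucket).replace("prefix", prefix)
    formatted["fileLocations"][0]["URIPrefixes"] = json.loads(filled)
    return formatted


# Hypothetical bucket name, used only for illustration
example = format_manifest(RAW_MANIFEST, "my-example-bucket", "quicksight/data/sentiment")
print(example["fileLocations"][0]["URIPrefixes"])
```

Because the substitution is a plain string replace, a bucket name that itself contains the word `prefix` would be altered by the second replace, so it is worth checking the printed URI before using the manifest.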

In [None]:
# S3 boto3 client handle
s3 = boto3.client('s3')

# Create formatted manifests for each type of dataset we need from the raw manifest JSON
types = ['transcripts', 'entity', 'cta', 'sentiment']

# Load the raw manifest and keep its URIPrefixes template as a JSON string
with open(infile, 'r') as manifest:
    ln = json.load(manifest)
t = json.dumps(ln['fileLocations'][0]['URIPrefixes'])

for dataset_type in types:
    # Fill in the bucket and the per-dataset prefix
    t1 = t.replace('bucket', bucket).replace('prefix', inprefix + '/' + dataset_type)
    ln['fileLocations'][0]['URIPrefixes'] = json.loads(t1)
    outfile_rep = outfile.replace('type', dataset_type)
    with open(outfile_rep, 'w', encoding='utf-8') as out:
        json.dump(ln, out, ensure_ascii=False, indent=4)
    # Upload the manifest to S3
    s3.upload_file(outfile_rep, bucket, manifestprefix + '/' + outfile_rep)
    print("Manifest file uploaded to: s3://{}/{}".format(bucket, manifestprefix + '/' + outfile_rep))

#### Please copy the manifest S3 URIs above. We need them when we build the datasets for the QuickSight dashboard.

### We are done here. Please go back to workshop instructions.