# Detect sentiment in customer calls using Amazon Comprehend

Now we will detect the customer sentiment in the call conversations using Amazon Comprehend. 

### Import libraries and initialize variables

In [None]:
import boto3
import pandas as pd

inprefix = 'comprehend/input'
outprefix = 'quicksight/temp/insights'
# Amazon Comprehend client
comprehend = boto3.client('comprehend')
# Amazon S3 clients
s3 = boto3.client('s3')
s3_resource = boto3.resource('s3')

bucket = '' # Enter your bucket name here

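try:
    s3.head_bucket(Bucket=bucket)
except Exception as e:
    # Catch Exception rather than using a bare except, and show the underlying error
    print("The S3 bucket name {} you entered seems to be incorrect, please try again ({})".format(bucket, e))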

### Detect sentiment of transcripts
For our workshop, we will determine the sentiment of each entire call transcript to use with our visuals, but you can also capture sentiment trends within a conversation. We will demonstrate this during the workshop using the new **Transcribe Call Analytics** solution. If you would like to see how this looks, run the optional code block at the end of this notebook.

In [None]:
# Prepare to page through our transcripts in S3
paginator = s3.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=bucket, Prefix=inprefix)
job_name_list = []
t_prefix = 'quicksight/data/sentiment'

# We will define a DataFrame to store the results of the sentiment analysis
cols = ['transcript_name', 'sentiment']
df_sent = pd.DataFrame(columns=cols)

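# Now let's page through the transcripts
for page in pages:
    # 'Contents' is absent when a page has no objects, so default to an empty list
    for obj in page.get('Contents', []):
        # Get the transcript file name
        transcript_file_name = obj['Key'].split('/')[2]
        # Skip "folder" placeholder keys, which have no file name
        if not transcript_file_name:
            continue
        # Now let's get the transcript file contents
        temp = s3_resource.Object(bucket, obj['Key'])
        transcript_contents = temp.get()['Body'].read().decode('utf-8')
        # Call Comprehend to detect sentiment
        response = comprehend.detect_sentiment(Text=transcript_contents, LanguageCode='en')
        # Remove the 'en-' prefix and '.txt' suffix. str.strip() would remove any
        # of those characters from both ends rather than the exact substrings.
        transcript_name = transcript_file_name
        if transcript_name.startswith('en-'):
            transcript_name = transcript_name[len('en-'):]
        if transcript_name.endswith('.txt'):
            transcript_name = transcript_name[:-len('.txt')]
        # Record the transcript name and its predicted sentiment;
        # the DataFrame is saved as a CSV file after the loop
        df_sent.loc[len(df_sent.index)] = [transcript_name, response['Sentiment']]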
 
df_sent.to_csv('s3://' + bucket + '/' + t_prefix + '/' + 'sentiment.csv', index=False)
df_sent
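
Amazon Comprehend's `detect_sentiment` places a size limit on the `Text` parameter (5 KB of UTF-8 text according to the API documentation at the time of writing; please check the current limits), so a long call transcript may be rejected. The sketch below is one possible way to split a transcript into smaller pieces first. The helper name `chunk_text` and the `max_bytes=5000` default are assumptions for illustration and are not part of the workshop code.

```python
def chunk_text(text, max_bytes=5000):
    """Split text on whitespace into chunks of at most max_bytes UTF-8 bytes.

    Hypothetical helper: a single word longer than max_bytes is still
    emitted as its own (oversized) chunk.
    """
    chunks, current, current_size = [], [], 0
    for word in text.split():
        word_size = len(word.encode('utf-8'))
        # +1 accounts for the joining space when the chunk already has words
        extra = word_size + (1 if current else 0)
        if current and current_size + extra > max_bytes:
            chunks.append(' '.join(current))
            current, current_size = [word], word_size
        else:
            current.append(word)
            current_size += extra
    if current:
        chunks.append(' '.join(current))
    return chunks
```

Each chunk could then be passed to `comprehend.detect_sentiment` separately. How the per-chunk results are combined into one label for the call (for example, taking the most frequent label) is a design choice that this notebook does not prescribe.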

### OPTIONAL - Detect sentiment trend
We will now take one of the transcripts and show how to detect the sentiment trend within a conversation. This can be a powerful insight for understanding what triggers a shift in customer perspective and how to remedy it.

In [None]:
# Select one of the transcripts we created in 1-Transcribe-Translate
import os
rootdir = '/home/ec2-user/SageMaker/aim317-uncover-insights-customer-conversations/notebooks/1-Transcribe-Translate-Calls'
csvfile = ''
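for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filepath = os.path.join(subdir, file)
        if filepath.endswith(".csv"):
            csvfile = filepath
            break
    # The inner break only exits the file loop for the current directory,
    # so stop walking as soon as a CSV file has been found
    if csvfile:
        break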
 
df_t = pd.read_csv(csvfile)
df_t.head()

Separate the sentences spoken by each speaker into their own dictionaries, along with the timestamp at which each of their turns ended.

In [None]:
spk_0 = {}
spk_1 = {}
a = ''
b = ''
j = 0
k = 0
for i, row in df_t.iterrows():
    if row['speaker_label'] == 'spk_0':
        # Speaker 0 has started talking, so close out speaker 1's pending turn
        if len(b) > 0:
            j += 1
            spk_1['end_time'+str(j)] = row['start_time']
            spk_1['transcript'+str(j)] = b
            b = ''
        a += row['content'] + ' '
    if row['speaker_label'] == 'spk_1':
        # Speaker 1 has started talking, so close out speaker 0's pending turn
        if len(a) > 0:
            k += 1
            spk_0['end_time'+str(k)] = row['start_time']
            spk_0['transcript'+str(k)] = a
            a = ''
        b += row['content'] + ' '
# Flush whichever turn is still open at the end of the call.
# k counts speaker 0's turns and j counts speaker 1's turns.
if len(a) > 0:
    spk_0['transcript'+str(k+1)] = a
    spk_0['end_time'+str(k+1)] = row['end_time']
if len(b) > 0:
    spk_1['transcript'+str(j+1)] = b
    spk_1['end_time'+str(j+1)] = row['end_time']
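
The same idea, merging consecutive rows from one speaker into a single turn, can also be sketched with `itertools.groupby`. This is an alternative illustration, not the notebook's method: the helper name `group_turns` and the sample rows are hypothetical, and unlike the cell above it records each turn's own last `end_time` rather than the next speaker's `start_time`. It assumes rows are dictionaries with the `speaker_label`, `end_time`, and `content` columns used in `df_t`.

```python
from itertools import groupby

def group_turns(rows):
    """Merge consecutive rows with the same speaker_label into one turn each."""
    turns = []
    for speaker, group in groupby(rows, key=lambda r: r['speaker_label']):
        group = list(group)
        turns.append({
            'speaker': speaker,
            # End of the turn is the end_time of its last row
            'end_time': group[-1]['end_time'],
            'transcript': ' '.join(r['content'] for r in group),
        })
    return turns

# Hypothetical sample rows using the same column names as df_t
sample_rows = [
    {'speaker_label': 'spk_0', 'end_time': 0.5, 'content': 'Hello'},
    {'speaker_label': 'spk_0', 'end_time': 1.2, 'content': 'there'},
    {'speaker_label': 'spk_1', 'end_time': 2.0, 'content': 'Hi'},
    {'speaker_label': 'spk_0', 'end_time': 3.1, 'content': 'Thanks'},
]
turns = group_turns(sample_rows)
```

With the real data, the rows could be obtained with `df_t.to_dict('records')`. Keeping each turn as one record avoids tracking separate counters per speaker.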

#### Check the results

In [None]:
spk_0

Now get the **sentiment for each line using Amazon Comprehend** and replace the transcript text with its sentiment label.

In [None]:
# Replace each of speaker 0's transcript entries with its sentiment label
for line in spk_0:
    if 'transcript' in line:
        res0 = comprehend.detect_sentiment(Text=spk_0[line], LanguageCode='en')['Sentiment']
        spk_0[line] = res0

# Replace each of speaker 1's transcript entries with its sentiment label
for line in spk_1:
    if 'transcript' in line:
        res1 = comprehend.detect_sentiment(Text=spk_1[line], LanguageCode='en')['Sentiment']
        spk_1[line] = res1
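
Calling `detect_sentiment` once per turn means one API request per line. As a sketch of an alternative, Amazon Comprehend also offers `batch_detect_sentiment`, which accepts a list of texts per request (up to 25 documents at the time of writing; please check the current limits). The helper name `batch_sentiments` and its `batch_size` default are assumptions for illustration, and the helper simply leaves `None` for any text reported in the response's `ErrorList`.

```python
def batch_sentiments(client, texts, language_code='en', batch_size=25):
    """Return one sentiment label per text, or None where no result came back.

    Hypothetical helper: `client` is expected to be a boto3 Comprehend client.
    """
    labels = [None] * len(texts)
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        response = client.batch_detect_sentiment(
            TextList=batch, LanguageCode=language_code
        )
        # 'Index' is the zero-based position of the text within this batch
        for result in response['ResultList']:
            labels[start + result['Index']] = result['Sentiment']
    return labels
```

For example, the transcript texts of one speaker could be collected into a list before they are overwritten and passed as `batch_sentiments(comprehend, texts)`.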

In [None]:
spk_1

#### Let us now graph it

In [None]:
!pip install matplotlib

In [None]:
import matplotlib.pyplot as plt

spk_0_end_time = []
spk_0_sentiment = []
spk_1_end_time = []
spk_1_sentiment = []


# Collect the end times and sentiment labels for speaker 0
for x in spk_0:
    if 'end_time' in x:
        spk_0_end_time.append(spk_0[x])
    if 'transcript' in x:
        spk_0_sentiment.append(spk_0[x])

# Collect the end times and sentiment labels for speaker 1
for x in spk_1:
    if 'end_time' in x:
        spk_1_end_time.append(spk_1[x])
    if 'transcript' in x:
        spk_1_sentiment.append(spk_1[x])
 
plt.plot(spk_0_end_time, spk_0_sentiment, color = 'g', label = 'Speaker 0 Sentiment Trend')
plt.plot(spk_1_end_time, spk_1_sentiment, color = 'b', label = 'Speaker 1 Sentiment Trend')
plt.xlabel('Call time in seconds')
plt.ylabel('Sentiment')
plt.legend()
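
Because the sentiment values are strings, matplotlib treats them as categories, so their vertical order on the plot may not follow a negative-to-positive scale. One possible refinement, sketched below, is to map each label to a number before plotting. The mapping `SENTIMENT_SCORE`, the helper `sentiment_to_scores`, and the choice to place `MIXED` at 0 are assumptions for illustration only.

```python
# Assumed ordinal scale; MIXED is placed at 0 alongside NEUTRAL for simplicity
SENTIMENT_SCORE = {'NEGATIVE': -1, 'NEUTRAL': 0, 'MIXED': 0, 'POSITIVE': 1}

def sentiment_to_scores(labels):
    """Convert Comprehend sentiment labels to numbers on the assumed scale."""
    return [SENTIMENT_SCORE[label] for label in labels]
```

The converted values could then be plotted in place of the labels, for example `plt.plot(spk_0_end_time, sentiment_to_scores(spk_0_sentiment))`, so that rising lines indicate improving sentiment.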

As you can see above, the sky's the limit on what you can do with the Amazon Transcribe output in tandem with Amazon Comprehend. Please go back now to watch your team members create some **AWSome visuals using Amazon QuickSight!!**

## End of notebook. Please go back to the workshop instructions to review the next steps.