{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PART 1 - Video metadata extraction with Amazon Rekognition\n",
"In the following section of the workshop, we're going to extract metadata and labels from the video as well as identify the different scenes from the video, all of that via computer vision methods and more precisely via Amazon Rekognition.
\n",
"
\n",
"Amazon Rekognition is an AWS services that makes it easy to add image and video analysis to your applications using proven, highly scalable, deep learning technology that requires no machine learning expertise to use. With Amazon Rekognition, you can identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content. Amazon Rekognition also provides highly accurate facial analysis and facial search capabilities that you can use to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.\n",
"\n",
"https://docs.aws.amazon.com/rekognition/latest/dg/what-is.html"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#load stored variable from previous notebooks\n",
"%store -r"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install boto3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.exceptions import ClientError\n",
"import os\n",
"import time\n",
"\n",
"import logging\n",
"import sys\n",
"logging.basicConfig(format='%(asctime)s | %(levelname)s : %(message)s',\n",
" level=logging.INFO, stream=sys.stdout)\n",
"log = logging.getLogger('knowledge-graph-logger')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scene detection job\n",
"\n",
"Let's start by detecting the different \"scenes\" or \"shots\" from the video. This is notably useful to detect fix/black screens."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The below function starts a segment detection job with predefined settings for confidence and coverage notably. We're choosing to identify both \"SHOT\" (=scenes) and \"TECHNICAL_CUE\" (e.g. black/fix screen)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rek = boto3.client('rekognition')\n",
"\n",
"#Method calling the Rekognition segment detection api\n",
"def startSegmentDetection(s3_bucket, name):\n",
" min_Technical_Cue_Confidence = 80.0\n",
" min_Shot_Confidence = 80.0\n",
" max_pixel_threshold = 0.1\n",
" min_coverage_percentage = 60\n",
"\n",
" response = rek.start_segment_detection(\n",
" Video={\"S3Object\": {\"Bucket\": s3_bucket, \"Name\": name}},\n",
" NotificationChannel={\n",
" \"RoleArn\": role_arn,\n",
" \"SNSTopicArn\": sns_topic_arn,\n",
" },\n",
" SegmentTypes=[\"TECHNICAL_CUE\", \"SHOT\"],\n",
" Filters={\n",
" \"TechnicalCueFilter\": {\n",
" \"MinSegmentConfidence\": min_Technical_Cue_Confidence,\n",
" },\n",
" \"ShotFilter\": {\"MinSegmentConfidence\": min_Shot_Confidence},\n",
" }\n",
" )\n",
" return response\n",
" \n",
"segment_detection_job = startSegmentDetection(bucket, os.path.join(s3_video_input_path, video_file))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"segment_detection_job_id = segment_detection_job['JobId']\n",
"print(f\"Job Id: {segment_detection_job_id}\")\n",
"\n",
"#Grab the segment detection response\n",
"SegmentDetectionOutput = rek.get_segment_detection(\n",
" JobId=segment_detection_job_id\n",
")\n",
"\n",
"#Determine the state. If the job is still processing we will wait a bit and check again\n",
"while(SegmentDetectionOutput['JobStatus'] == 'IN_PROGRESS'):\n",
" time.sleep(5)\n",
" print('.', end='')\n",
" \n",
" SegmentDetectionOutput = rek.get_segment_detection(\n",
" JobId=segment_detection_job_id)\n",
" \n",
"#Once the job is no longer in progress we will proceed\n",
"print(SegmentDetectionOutput['JobStatus'])"
]
},
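  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the job didn't succeed, it's better to fail fast with the reason than to carry on with empty results. Below is a minimal sketch; `StatusMessage` is only populated by Rekognition when a job fails."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Fail fast if the job did not succeed, surfacing the reason reported by Rekognition\n",
    "if SegmentDetectionOutput['JobStatus'] != 'SUCCEEDED':\n",
    "    raise RuntimeError(SegmentDetectionOutput.get('StatusMessage', 'Segment detection job failed'))"
   ]
  },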
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's have a look at the output from the job"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SegmentDetectionOutput['VideoMetadata']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SegmentDetectionOutput['AudioMetadata']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#looking at some of the segments/scenes. note that they will either be of type TECHNICAL_CUE or SHOT \n",
"SegmentDetectionOutput['Segments'][:5]"
]
},
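  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the raw segments easier to read, here is a minimal sketch that prints each segment's SMPTE timecodes together with its type and confidence, using only fields from the `GetSegmentDetection` response above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Print a compact, human-readable summary of each detected segment\n",
    "for seg in SegmentDetectionOutput['Segments']:\n",
    "    if seg['Type'] == 'SHOT':\n",
    "        detail = f\"shot #{seg['ShotSegment']['Index']} ({seg['ShotSegment']['Confidence']:.1f}%)\"\n",
    "    else:\n",
    "        detail = f\"{seg['TechnicalCueSegment']['Type']} cue ({seg['TechnicalCueSegment']['Confidence']:.1f}%)\"\n",
    "    print(f\"{seg['StartTimecodeSMPTE']} -> {seg['EndTimecodeSMPTE']} | {detail}\")"
   ]
  },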
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's park segment detection for the moment. we'll use those metadata in the Neptune related notebook when storing our data into the graph."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Label detection in the video\n",
"We're now using a different functionality of Amazon Rekognition: Label detection.\n",
"By running this job on the video, Rekognition will automatically labels objects, concepts, scenes, and actions in your images, and provides an associated confidence score."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#method starting a individual rekognition labeling job\n",
"def start_label_detection(s3_bucket, name, roleArn, sns_topic_arn):\n",
" response = rek.start_label_detection(\n",
" Video={\"S3Object\": {\"Bucket\": s3_bucket, \"Name\": name}},\n",
" NotificationChannel={\n",
" \"RoleArn\": roleArn,\n",
" \"SNSTopicArn\": sns_topic_arn,\n",
" }\n",
" )\n",
" return response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Launching the label detection job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#launching it on our test video.\n",
"label_detection_response = start_label_detection(bucket, os.path.join(s3_video_input_path, video_file), role_arn, sns_topic_arn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Monitoring when the job is finished"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"label_detection_job_id = label_detection_response['JobId']\n",
"print(f\"Job Id: {label_detection_job_id}\")\n",
"\n",
"#Grab the segment detection response\n",
"LabelDetectionOutput = rek.get_label_detection(\n",
" JobId=label_detection_job_id\n",
")\n",
"\n",
"#Determine the state. If the job is still processing we will wait a bit and check again\n",
"while(LabelDetectionOutput['JobStatus'] == 'IN_PROGRESS'):\n",
" time.sleep(5)\n",
" print('.', end='')\n",
" \n",
" LabelDetectionOutput = rek.get_label_detection(\n",
" JobId=label_detection_job_id)\n",
" \n",
"#Once the job is no longer in progress we will proceed\n",
"print(LabelDetectionOutput['JobStatus'])"
]
},
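  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One thing to be aware of: `GetLabelDetection` paginates its results, so a single call may not return every detection for a longer video. The minimal sketch below follows `NextToken` to collect all pages (on a short clip it usually makes no extra calls)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Collect all label detections across pages; the API paginates long result sets\n",
    "all_label_detections = list(LabelDetectionOutput['Labels'])\n",
    "next_token = LabelDetectionOutput.get('NextToken')\n",
    "while next_token:\n",
    "    page = rek.get_label_detection(JobId=label_detection_job_id, NextToken=next_token)\n",
    "    all_label_detections.extend(page['Labels'])\n",
    "    next_token = page.get('NextToken')\n",
    "print(f\"Total label detections: {len(all_label_detections)}\")"
   ]
  },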
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's have a look at the output from the label detection job"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"LabelDetectionOutput['Labels'][:5]"
]
},
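  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each entry pairs a timestamp (in milliseconds) with a label and its confidence, so the same label typically appears many times over the video. As a quick overview, the sketch below counts how often each label name occurs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "\n",
    "#Count how many detections each distinct label received across the video\n",
    "label_counts = Counter(d['Label']['Name'] for d in all_label_detections)\n",
    "label_counts.most_common(10)"
   ]
  },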
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, let's park the label detection for the moment and export/store the outputs of those 2 jobs so that we can use it in part 3 for when we create our Graph."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%store SegmentDetectionOutput\n",
"%store LabelDetectionOutput"
]
  }
],
"metadata": {
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (Base Python)",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:ap-southeast-2:452832661640:image/python-3.6"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}