{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Object and scene detection using Amazon Rekognition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "This notebook provides a walkthrough of [object detection API](https://docs.aws.amazon.com/rekognition/latest/dg/labels.html) in Amazon Rekognition to identify objects.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Initialize stuff" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialise Notebook\n", "import boto3\n", "from IPython.display import HTML, display, Image as IImage\n", "from PIL import Image, ImageDraw, ImageFont\n", "import time\n", "import os" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Curent AWS Region. Use this to choose corresponding S3 bucket with sample content\n", "\n", "mySession = boto3.session.Session()\n", "awsRegion = mySession.region_name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Init clients\n", "rekognition = boto3.client('rekognition')\n", "s3 = boto3.client('s3')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# S3 bucket that contains sample images and videos\n", "\n", "# We are providing sample images and videos in this bucket so\n", "# you do not have to manually download/upload test images and videos.\n", "\n", "bucketName = \"aws-workshops-\" + awsRegion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create temporary directory\n", "# This directory is not needed to call Rekognition APIs.\n", "# We will only use this directory to download images from S3 bucket and draw bounding boxes\n", "\n", "!mkdir m1tmp\n", "tempFolder = 'm1tmp/'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Detect objects in image\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "imageName = \"content-moderation/media/cars.png\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(IImage(url=s3.generate_presigned_url('get_object', Params={'Bucket': bucketName, 'Key': imageName})))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Call Rekognition to detect objects in the image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Call Amazon Rekognition to detect objects in the image\n", "# https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectLabels.html\n", "\n", "detectLabelsResponse = rekognition.detect_labels(\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucketName,\n", " 'Name': imageName,\n", " }\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Review the raw JSON reponse from Rekognition" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Show JSON response returned by Rekognition Labels API (Object Detection)\n", "# In the JSON response below, you will see Label, detected instances, confidence score and additional information.\n", "\n", "display(detectLabelsResponse)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display list of detected unsafe objects}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flaggedObjects = [\"Car\"]\n", "\n", "for label in detectLabelsResponse[\"Labels\"]:\n", " if(label[\"Name\"] in flaggedObjects):\n", " print(\"Detected unsafe object:\")\n", " print(\"- {} (Confidence: {})\".format(label[\"Name\"], label[\"Confidence\"]))\n", " print(\" - Parents: {}\".format(label[\"Parents\"]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Recognize objects in video\n", " Object recognition in video is an async operation. \n", "https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html. \n", "\n", "- First we start a label detection job which returns a Job Id.\n", "- We can then call `get_label_detection` to get the job status and after job is complete, we can get object metadata.\n", "- In production use cases, you would usually use StepFunction or SNS topic to get notified when job is complete.\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "videoName = \"content-moderation/media/GrandTour720.mp4\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Call Rekognition to start a job for object detection" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Start video label recognition job\n", "startLabelDetection = rekognition.start_label_detection(\n", " Video={\n", " 'S3Object': {\n", " 'Bucket': bucketName,\n", " 'Name': videoName,\n", " }\n", " },\n", ")\n", "\n", "labelsJobId = startLabelDetection['JobId']\n", "display(\"Job Id: {0}\".format(labelsJobId))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Additional (Optional) Request Attributes\n", "\n", "ClientRequestTokenL\n", "https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html#rekognition-StartLabelDetection-request-ClientRequestToken\n", "\n", "JobTag:\n", "https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html#rekognition-StartLabelDetection-request-JobTag\n", "\n", "MinConfidence:\n", "https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html#rekognition-StartLabelDetection-request-MinConfidence\n", "\n", "NotificationChannel:\n", "https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html#rekognition-StartLabelDetection-request-NotificationChannel\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Wait for object detection job to complete" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Wait for object detection job to complete\n", "# In production use cases, you would usually use StepFunction or SNS topic to get notified when job is complete.\n", "getObjectDetection = rekognition.get_label_detection(\n", " JobId=labelsJobId,\n", " SortBy='TIMESTAMP'\n", ")\n", "\n", "while(getObjectDetection['JobStatus'] == 'IN_PROGRESS'):\n", " time.sleep(5)\n", " print('.', end='')\n", " \n", " getObjectDetection = rekognition.get_label_detection(\n", " JobId=labelsJobId,\n", " SortBy='TIMESTAMP')\n", " \n", "display(getObjectDetection['JobStatus'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Review raw JSON reponse from Rekognition" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Show JSON response returned by Rekognition Object Detection API\n", "# In the JSON response below, you will see list of detected objects and activities.\n", "# For each detected object, you will see information like Timestamp\n", "\n", "display(getObjectDetection)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display names of recognized objects in the video" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flaggedObjectsInVideo = [\"Car\"]\n", "\n", "theObjects = {}\n", "\n", "# Display timestamps and objects detected at that time\n", "strDetail = \"Objects detected in video
=======================================
\"\n", "strOverall = \"Objects in the overall video:
=======================================
\"\n", "\n", "# Objects detected in each frame\n", "for obj in getObjectDetection['Labels']:\n", " ts = obj [\"Timestamp\"]\n", " cconfidence = obj['Label'][\"Confidence\"]\n", " oname = obj['Label'][\"Name\"]\n", " \n", " if(oname in flaggedObjectsInVideo):\n", " print(\"Found flagged object at {} ms: {} (Confidence: {})\".format(ts, oname, round(cconfidence,2)))\n", " \n", " strDetail = strDetail + \"At {} ms: {} (Confidence: {})
\".format(ts, oname, round(cconfidence,2))\n", " if oname in theObjects:\n", " cojb = theObjects[oname]\n", " theObjects[oname] = {\"Name\" : oname, \"Count\": 1+cojb[\"Count\"]}\n", " else:\n", " theObjects[oname] = {\"Name\" : oname, \"Count\": 1}\n", "\n", "# Unique objects detected in video\n", "for theObject in theObjects:\n", " strOverall = strOverall + \"Name: {}, Count: {}
\".format(theObject, theObjects[theObject][\"Count\"])\n", "\n", "# Display results\n", "display(HTML(strOverall))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Show video in the player" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Show video in a player\n", "\n", "s3VideoUrl = s3.generate_presigned_url('get_object', Params={'Bucket': bucketName, 'Key': videoName})\n", "\n", "videoTag = \"\".format(s3VideoUrl)\n", "\n", "videoui = \"
{}
\".format(videoTag)\n", "\n", "display(HTML(videoui))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "listui = \"
{}
\".format(strDetail)\n", "display(HTML(listui))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Worker Safety with Amazon Rekognition\n", "***\n", "You can use Amazon Rekognition to detect if certain objects are not present in the image or video. For example you can perform worker safety audit by revieweing images/video of a construction site and detecting if there are any workers without safety hat." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "imageName = \"content-moderation/media/hat-detection.png\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(IImage(url=s3.generate_presigned_url('get_object', Params={'Bucket': bucketName, 'Key': imageName})))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Call Amazon Rekognition to detect objects in the image\n", "\n", "detectLabelsResponse = rekognition.detect_labels(\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucketName,\n", " 'Name': imageName,\n", " }\n", " }\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Display Rekognition response\n", "display(detectLabelsResponse)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define a function that will display image with bounded boxes around recognized objects\n", "# We will call this function in next step\n", " \n", "def drawBoundingBoxes (sourceImage, boxes):\n", " # blue, green, red, grey\n", " colors = ((255,255,255),(255,255,255),(76,182,252),(52,194,123))\n", " \n", " # Download image locally\n", " imageLocation = tempFolder+os.path.basename(sourceImage)\n", " s3.download_file(bucketName, sourceImage, imageLocation)\n", "\n", " # Draws BB on Image\n", " bbImage = Image.open(imageLocation)\n", " draw = ImageDraw.Draw(bbImage)\n", " width, height = bbImage.size\n", " col = 0\n", " maxcol = len(colors)\n", " line= 3\n", " for box in boxes:\n", " x1 = int(box[1]['Left'] * width)\n", " y1 = int(box[1]['Top'] * height)\n", " x2 = int(box[1]['Left'] * width + box[1]['Width'] * width)\n", " y2 = int(box[1]['Top'] * height + box[1]['Height'] * height)\n", " \n", " draw.text((x1,y1),box[0],colors[col])\n", " for l in range(line):\n", " draw.rectangle((x1-l,y1-l,x2+l,y2+l),outline=colors[col])\n", " col = (col+1)%maxcol\n", " \n", " imageFormat = \"PNG\"\n", " ext = sourceImage.lower()\n", " if(ext.endswith('jpg') or ext.endswith('jpeg')):\n", " imageFormat = 'JPEG'\n", "\n", " bbImage.save(imageLocation,format=imageFormat)\n", "\n", " display(bbImage)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Show image and bounded boxes around detected objects\n", "boxes = []\n", "objects = detectLabelsResponse['Labels']\n", "for obj in objects:\n", " for einstance in obj[\"Instances\"]:\n", " boxes.append ((obj['Name'], einstance['BoundingBox']))\n", " \n", "drawBoundingBoxes(imageName, boxes)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def matchPersonsAndHats(personsList, hardhatsList):\n", "\n", " persons = []\n", " hardhats = []\n", " personsWithHats = []\n", "\n", " for person in personsList:\n", " persons.append(person)\n", " for hardhat in hardhatsList:\n", " hardhats.append(hardhat)\n", "\n", " h = 0\n", " matched = 0\n", " totalHats = len(hardhats)\n", " while(h < totalHats):\n", " hardhat = hardhats[h-matched]\n", " totalPersons = len(persons)\n", " p = 0\n", " while(p < totalPersons):\n", " person = persons[p]\n", " if(not (hardhat['BoundingBoxCoordinates']['x2'] < person['BoundingBoxCoordinates']['x1']\n", " or hardhat['BoundingBoxCoordinates']['x1'] > person['BoundingBoxCoordinates']['x2']\n", " or hardhat['BoundingBoxCoordinates']['y4'] < person['BoundingBoxCoordinates']['y1']\n", " or hardhat['BoundingBoxCoordinates']['y1'] > person['BoundingBoxCoordinates']['y4']\n", " )):\n", "\n", " personsWithHats.append({'Person' : person, 'Hardhat' : hardhat})\n", "\n", " del persons[p]\n", " del hardhats[h - matched]\n", "\n", " matched = matched + 1\n", "\n", " break\n", " p = p + 1\n", " h = h + 1\n", "\n", " return (personsWithHats, persons, hardhats)\n", "\n", "def getBoundingBoxCoordinates(boundingBox, imageWidth, imageHeight):\n", " x1 = 0\n", " y1 = 0\n", " x2 = 0\n", " y2 = 0\n", " x3 = 0\n", " y3 = 0\n", " x4 = 0\n", " y4 = 0\n", "\n", " boxWidth = boundingBox['Width']*imageWidth\n", " boxHeight = boundingBox['Height']*imageHeight\n", "\n", " x1 = boundingBox['Left']*imageWidth\n", " y1 = boundingBox['Top']*imageWidth\n", "\n", " x2 = x1 + boxWidth\n", " y2 = y1\n", "\n", " x3 = x2\n", " y3 = y1 + boxHeight\n", "\n", " x4 = x1\n", " y4 = y3\n", "\n", " return({'x1': x1, 'y1' : y1, 'x2' : x2, 'y2' : y2, 'x3' : x3, 'y3' : y3, 'x4' : x4, 'y4' : y4})\n", "\n", "def getPersonsAndHardhats(labelsResponse, imageWidth, imageHeight):\n", "\n", " persons = []\n", " hardhats = []\n", "\n", " for label in labelsResponse['Labels']:\n", " if label['Name'] == 'Person' and 'Instances' in label:\n", " for person in label['Instances']:\n", " persons.append({'BoundingBox' : person['BoundingBox'], 'BoundingBoxCoordinates' : getBoundingBoxCoordinates(person['BoundingBox'], imageWidth, imageHeight), 'Confidence' : person['Confidence']})\n", " elif ((label['Name'] == 'Hardhat' or label['Name'] == 'Helmet') and 'Instances' in label):\n", " for hardhat in label['Instances']:\n", " hardhats.append({'BoundingBox' : hardhat['BoundingBox'], 'BoundingBoxCoordinates' : getBoundingBoxCoordinates(hardhat['BoundingBox'], imageWidth, imageHeight), 'Confidence' : hardhat['Confidence']})\n", "\n", " return (persons, hardhats)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3Resource = boto3.resource('s3')\n", "bucket = s3Resource.Bucket(bucketName)\n", "iojb = bucket.Object(imageName)\n", "response = iojb.get()\n", "file_stream = response['Body']\n", "im = Image.open(file_stream)\n", "imageWidth, imageHeight = im.size" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "persons, hardhats = getPersonsAndHardhats(detectLabelsResponse, imageWidth, imageHeight)\n", "\n", "personsWithHats, personsWithoutHats, hatsWihoutPerson = matchPersonsAndHats(persons, hardhats)\n", "\n", "personsWithHatsCount = len(personsWithHats)\n", "personsWithoutHatsCount = len(personsWithoutHats)\n", "hatsWihoutPersonCount = len(hatsWihoutPerson)\n", "\n", "outputMessage = \"Person(s): {}\".format(personsWithHatsCount+personsWithoutHatsCount)\n", "outputMessage = outputMessage + \"\\nPerson(s) With Safety Hat: {}\\nPerson(s) Without Safety Hat: {}\".format(personsWithHatsCount, personsWithoutHatsCount)\n", "print(outputMessage)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Learn more about building an end to end worker safety app\n", "To learn on how to build an end to end worker safety app, learn more at https://github.com/aws-samples/aws-deeplens-worker-safety-project" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "### References\n", "- https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectLabels.html\n", "- https://docs.aws.amazon.com/rekognition/latest/dg/API_StartLabelDetection.html\n", "- https://docs.aws.amazon.com/rekognition/latest/dg/API_GetLabelDetection.html\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You have successfully used Amazon Rekognition to identify specific objects in images and videos. In the next module,nyou will learn how to recognize explicit, suggestive or violent content in the images and videos." ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }