{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "08010a53-1695-4da2-aa66-5e9e91c710d0",
   "metadata": {},
   "source": [
    "# Video Object Detection -- Rekognition Custom Labels\n",
    "This sample notebook shows demonstrates how to detect custom labels in a video with Amazon Rekognition Custom Labels and draw their corresponding bounding boxes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1ad8d05-8a62-4cfe-b737-e4b53bceafdb",
   "metadata": {},
   "source": [
    "### Import and install necessary packages\n",
    "Let's begin by importing and installing all the necessary packages we need to make the notebook run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ffcf8054-d084-4283-bd00-7325a40b1fd3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install the first time you execute the notebook\n",
    "!apt-get -qq update\n",
    "!apt-get -qq install ffmpeg -y \n",
    "!pip install --quiet opencv-python-headless"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9203c3ac-6dcd-4554-ad8b-50ec0e5df743",
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2\n",
    "import boto3\n",
    "import os\n",
    "import glob\n",
    "import time\n",
    "import queue\n",
    "import shutil\n",
    "from IPython.display import Video\n",
    "from multiprocessing import Lock, Process, Queue, current_process\n",
    "rekognition = boto3.client('rekognition')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4862501-429e-440f-873b-d4e352960909",
   "metadata": {},
   "source": [
    "To be able to use your Amazon Rekognition Custom Labels running model, insert the arn of the project below, which you will locate in the Custom Labels console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3af6fc2-ad74-469d-b050-497b84173b12",
   "metadata": {},
   "outputs": [],
   "source": [
    "projectVersionArn = \"***\" ## INSERT THE ARN OF YOUR CUSTOM LABELS PROJECT"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "26d6a21a-c157-4f5e-8315-10fe60c12a04",
   "metadata": {},
   "source": [
    "### Helper Functions\n",
    "Here are a couple of helper functions we are going to use to process our video frames and detect and draw the bounding boxes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53c96d21-b311-4caa-8bde-ad9e35484336",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.mkdir(\"input_video\")\n",
    "os.mkdir(\"output_video\")\n",
    "\n",
    "def chunks(lst, n):\n",
    "    for i in range(0, len(lst), n):\n",
    "        yield lst[i:i + n]\n",
    "        \n",
    "def transform_bounding(frame,box):\n",
    "    imgWidth, imgHeight = frame\n",
    "    left = int(imgWidth * box['Left'])\n",
    "    top = int(imgHeight * box['Top'])\n",
    "    right = left + int(imgWidth * box['Width'])\n",
    "    bottom = top + int(imgHeight * box['Height'])\n",
    "    return left,top,right,bottom\n",
    "\n",
    "def process_frames(frames_list):\n",
    "    for f in frames_list:\n",
    "        frame = cv2.imread(f)\n",
    "        image_bytes = cv2.imencode('.png', frame)[1].tobytes()\n",
    "        response = rekognition.detect_custom_labels(\n",
    "                        Image={'Bytes': image_bytes},\n",
    "                        ProjectVersionArn = projectVersionArn\n",
    "                    )\n",
    "        if (len(response[\"CustomLabels\"]) > 0):\n",
    "            for elabel in response[\"CustomLabels\"]:\n",
    "                if int(elabel[\"Confidence\"]) > 50:\n",
    "                    left,top,right,bottom = transform_bounding(size ,elabel[\"Geometry\"][\"BoundingBox\"])\n",
    "                    label = elabel[\"Name\"]\n",
    "                    conf = elabel[\"Confidence\"]\n",
    "\n",
    "                    imgWidth, imgHeight= size\n",
    "                    thick = int((imgHeight + imgWidth) // 500)\n",
    "\n",
    "                    color = (0,255,0)\n",
    "                    cv2.rectangle(frame,(left, top), (right, bottom), color, thick)\n",
    "                    cv2.putText(frame, label+\":\"+str(conf)[0:4], (left, top - 12), 0, 1e-3 * imgHeight, color, thick//1)    \n",
    "                    cv2.imwrite(f,frame)  \n",
    "                else:\n",
    "                    cv2.imwrite(f,frame)\n",
    "        else:\n",
    "            cv2.imwrite(f,frame)\n",
    "            \n",
    "def detect_labels(frames_queue):\n",
    "    while True:\n",
    "        try:\n",
    "            task = frames_queue.get_nowait()\n",
    "        except queue.Empty:\n",
    "            print(\"Queue Empty\")\n",
    "            break\n",
    "        else:\n",
    "            process_frames(task)\n",
    "    return True\n",
    "\n",
    "def get_video_info(video):\n",
    "    cap = cv2.VideoCapture(original_video)\n",
    "    fps = cap.get(cv2.CAP_PROP_FPS)\n",
    "    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),\n",
    "            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))\n",
    "    cap.release()\n",
    "    return fps, size"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a729cbe6-87fd-43dd-8392-394d23755b4c",
   "metadata": {},
   "source": [
    "### Input/Output Configuration\n",
    "Upload a video into the \"input_video\" folder to be processed. Next, specify the name of the file below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9691f93-0e6d-40ff-bfa6-e8d78d04e7f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "#---------------------------------------------------------------------------\n",
    "# INSERT THE NAME OF THE INPUT VIDEO LOCATED IN THE INPUT_VIDEO FOLDER\n",
    "original_video_name = \"***.mp4\" # eg. \"original.mp4\"\n",
    "#---------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2cf98985-8830-4bda-b4e5-c229113bcf3b",
   "metadata": {},
   "outputs": [],
   "source": [
    "original_video = \"input_video/{}\".format(original_video_name)\n",
    "frames_folder = \"frames-{}\".format(original_video_name.split('.')[0])\n",
    "output_video = \"output_video/{}-labeled.mp4\".format(original_video_name.split('.')[0])\n",
    "fps, size = get_video_info(original_video)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68878324-d3bb-40e9-b070-e2322bcd57e7",
   "metadata": {},
   "source": [
    "Review the video you have chosen to process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b4551872-699c-41cc-b253-ff4bd7d8aa59",
   "metadata": {},
   "outputs": [],
   "source": [
    "Video(original_video)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2bea1c1-fc43-4464-888f-ef46a6dff747",
   "metadata": {},
   "source": [
    "### Split the video into frames "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4b219ed-6b6b-480f-8a13-df9c0ed1046c",
   "metadata": {},
   "source": [
    "Now we are going to split our input video into frames, we'll use FFMPEG for this task and save the frames into a frames folder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74b6ea60-e284-4c43-885b-7e7ccd41961a",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Splitting frames...\")\n",
    "os.mkdir(frames_folder)\n",
    "!ffmpeg -hide_banner -loglevel error -i {original_video} {frames_folder}/frame-%03d.png\n",
    "print(\"Splitting frames... -- Complete!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f28ba6e8-c365-4d51-9a68-1ec31d464288",
   "metadata": {},
   "source": [
    "### Multiprocessing Configuration"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa4a6a3e-f9ea-4b99-837a-40d0dbf6ea0e",
   "metadata": {},
   "source": [
    "Now we have our video split into frames let's move them into a queue divided in X chunks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca1d9707-1070-43da-a8ae-bf8669b4bdf6",
   "metadata": {},
   "outputs": [],
   "source": [
    "files_list = glob.glob(\"{}/*\".format(frames_folder))\n",
    "n = 20 #Frames divided into chunks of n\n",
    "number_of_chunks = int(len(files_list)/n)+1\n",
    "split_list = list(chunks(files_list, number_of_chunks))\n",
    "\n",
    "frames_queue = Queue()\n",
    "for chunk in split_list:\n",
    "    frames_queue.put(chunk)\n",
    "\n",
    "number_of_processors = 5 #Number of subprocesses\n",
    "processes = []\n",
    "\n",
    "print(\"Size of queue:\",frames_queue.qsize())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ecb125c-f177-438b-be1e-dab30d666457",
   "metadata": {},
   "source": [
    "### Get bounding boxes for frames and overwrite the image file"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18f148ba-b937-4804-a70a-a47258fcf4ac",
   "metadata": {},
   "source": [
    "Let's iterate over the queue of chunks (using multiprocessing) to call Amazon Rekognition Custom Labels to detect objects in our frames."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cf266622-aa64-49e9-a676-01946da87e9e",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Detecting Labels...\")\n",
    "for w in range(number_of_processors):\n",
    "    p = Process(target=detect_labels, args=(frames_queue,))\n",
    "    processes.append(p)\n",
    "    p.start()\n",
    "for p in processes:\n",
    "        p.join()\n",
    "        p.kill()\n",
    "print(\"Detecting Labels... -- Complete!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b831b58d-90e8-42c5-8779-c7fdd9e2538c",
   "metadata": {},
   "source": [
    "### Create the labeled video from frames"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e411b6e3-11bb-4ae1-97d4-063995823d52",
   "metadata": {},
   "source": [
    "Once we have finished detecting objects and drawing the bounding boxes over the frames, we can stich the video back together."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce562cc2-6a26-4ce3-8421-61b678c5a98a",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Creating output video...\")\n",
    "!ffmpeg -hide_banner -loglevel error -f image2 -r {fps} -i {frames_folder}/frame-%03d.png -vcodec libx264 -crf 18  -pix_fmt yuv420p {output_video} -y\n",
    "print(\"Creating output video... -- Complete!\")\n",
    "shutil.rmtree(frames_folder)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0650c183-3648-40f3-b755-b8b3bcdf2ca0",
   "metadata": {},
   "source": [
    "Review your labeled video"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5f4f2712-c409-4462-9ced-9b9460508d0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "Video(output_video)"
   ]
  }
 ],
 "metadata": {
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (Data Science)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:eu-west-1:470317259841:image/datascience-1.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}