{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Key Identity Verification APIs\n", "\n", "-------\n", "\n", "This lab provides hands-on experience with the key Amazon Rekognition APIs for identity verification. \n", "\n", "## Introduction \n", "\n", "-------\n", "\n", "In-person user identity verification is slow to scale, costly, and high friction for users. Machine learning powered facial biometrics can enable online user identity verification. Amazon Rekognition offers pre-trained facial recognition and analysis capabilities that you can quickly add to your user onboarding and authentication workflows to verify opted-in users' identity online. \n", "\n", "In this notebook, we'll use Amazon Rekognition's key APIs for identity verification. After running this notebook, you should be able to use the following APIs:\n", "\n", "- DetectFaces: Detects the 100 largest faces in the image. \n", "- CompareFaces: Compares a face in the source input image with each of the 100 largest faces detected in the target input image.\n", "- CreateCollection: Creates a searchable index of faces. \n", "- IndexFaces: Detects faces in the input image, adds them to the specified collection, and returns a face ID that can be used in subsequent searches. \n", "- SearchFacesByImage: For a given input image, first detects the largest face in the image, and then searches the specified collection for matching faces.\n", "- SearchFaces: For a given input face ID, searches for matching faces in the collection that the face belongs to.\n", "- DeleteFaces: Deletes faces from a collection. You specify a collection ID and an array of face IDs to remove from the collection.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import io\n", "import boto3\n", "import json\n", "from IPython.display import Image as IImage\n", "import pandas as pd\n", "\n", "%store -r bucket_name\n", "mySession = boto3.session.Session()\n", "aws_region = mySession.region_name\n", "print(\"AWS Region: {}\".format(aws_region))\n", "print(\"AWS Bucket: {}\".format(bucket_name))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup Clients \n", "-----\n", "Here we are going to use both the S3 and Rekognition APIs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3_client = boto3.client('s3')\n", "rek_client = boto3.client('rekognition')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display a Face" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Image of a Face\n", "face_image = \"face_6.jpeg\"\n", "print(face_image)\n", "display(IImage(url=s3_client.generate_presigned_url('get_object', \n", " Params={'Bucket': bucket_name, \n", " 'Key' : face_image})))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DetectFaces API\n", "----\n", "\n", "[DetectFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DetectFaces.html\") detects the 100 largest faces in the image. For each face detected, the operation returns face details. These details include a bounding box of the face, a confidence value (that the bounding box contains a face), and a fixed set of attributes such as facial landmarks (for example, coordinates of eye and mouth), presence of beard, sunglasses, and so on.\n", "\n", "\n", "
**FaceDetails**\n", "\n", "Details of each face found in the image. A [FaceDetail](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_FaceDetail.html\") object contains either the default facial attributes or all facial attributes. The default attributes are BoundingBox, Confidence, Landmarks, Pose, and Quality.\n", "\n", "**Take a moment and look at the response from the DetectFaces API.**\n", "\n", "
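The response is verbose, so it can help to pull out just the fields you care about. The helper below is a minimal sketch (the function name and field selection are our own, not part of the lab) that summarizes the FaceDetails array; run it against the `response` variable produced by the next cell.\n",
"\n",
"```python\n",
"def summarize_face_details(detect_faces_response):\n",
"    # Pull a few commonly used fields out of a DetectFaces response\n",
"    rows = []\n",
"    for fd in detect_faces_response.get('FaceDetails', []):\n",
"        rows.append({\n",
"            'Confidence': round(fd['Confidence'], 2),\n",
"            'BoundingBox': fd['BoundingBox'],\n",
"            'Sharpness': round(fd['Quality']['Sharpness'], 2),\n",
"            'Brightness': round(fd['Quality']['Brightness'], 2),\n",
"        })\n",
"    return rows\n",
"\n",
"# usage (after running the next cell): summarize_face_details(response)\n",
"```\n",
"\n", "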
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "response = rek_client.detect_faces(Image={'S3Object':{\n", " 'Bucket': bucket_name,\n", " 'Name' : face_image}},\n", " Attributes=['ALL'])\n", "\n", "response\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare Faces \n", "------\n", "\n", "[CompareFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CompareFaces.html\")\n", "Compares a face in the source input image with each of the 100 largest faces detected in the target input image. CompareFaces also returns an array of faces that don't match the source image. For each face, it returns a bounding box, confidence value, landmarks, pose details, and quality. The response also returns information about the face in the source image, including the bounding box of the face and confidence value.\n", "\n", "
**FaceMatches**\n", "\n", "[FaceMatches](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CompareFaces.html#API_CompareFaces_ResponseSyntax\") returns an array of faces in the target image that match the source image face. Each CompareFacesMatch object provides the bounding box, the confidence level that the bounding box contains a face, and the similarity score for the face in the bounding box and the face in the source image.\n", "\n", "**Key Parameters**\n", "\n", "- **QualityFilter:** A filter that specifies a quality bar for how much filtering is done to identify faces. Filtered faces aren't compared. If you specify AUTO, Amazon Rekognition chooses the quality bar. If you specify LOW, MEDIUM, or HIGH, filtering removes all faces that do not meet the chosen quality bar.\n", "\n", "- **SimilarityThreshold:** The minimum level of confidence that a face match must meet to be included in the FaceMatches array. Minimum value of 0, maximum value of 100.\n", "\n", "**NOTE: take a look at the Quality, Similarity, and the various landmarks in the response.**\n", "\n", "
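In an identity verification flow you usually reduce the CompareFaces response to a yes/no decision plus a score. The helper below is a small sketch of one way to do that (the function name and the 95% cut-off are our own choices, not part of the API); it works on the response returned by the next cell.\n",
"\n",
"```python\n",
"def is_same_person(compare_response, min_similarity=95.0):\n",
"    # Return (match?, best similarity) based on the FaceMatches array\n",
"    matches = compare_response.get('FaceMatches', [])\n",
"    if not matches:\n",
"        return False, 0.0\n",
"    best = max(m['Similarity'] for m in matches)\n",
"    return best >= min_similarity, best\n",
"\n",
"# usage (after running the next cell): is_same_person(response)\n",
"```\n",
"\n", "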
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "source_image = \"face_6.jpeg\"\n", "target_image = \"face_6_3.jpeg\"\n", "\n", "response = rek_client.compare_faces(SourceImage={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': source_image,\n", " }\n", " },\n", " TargetImage={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': target_image,\n", " }\n", " },\n", " SimilarityThreshold = 90,\n", " QualityFilter='HIGH'\n", ")\n", " \n", "response \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a Collection\n", "-----\n", "\n", "A collection is a searchable index of faces. You create a named collection in an AWS Region and then add faces to the collection using the IndexFaces operation. once the collection is created and faces have been indexed you can search the collection to determine if a face has been indexed in the collection either by image or by face identifier. \n", "\n", "\n", "For example, you might create collections, one for each of your application users. A user can then index faces using the IndexFaces operation and persist results in a specific collection. Then, a user can search the collection for faces in the user-specific container.\n", "\n", "\n", "
**Creating, Indexing and Searching Collections**\n", "\n", "The following steps are used to create a collection, index faces and finally search the collection for a face. \n", "\n", "1. [CreateCollection](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateCollection.html\") Creates a collection in an AWS Region.\n", "2. [IndexFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_IndexFaces.html\") Detects faces in the input image and adds them to the specified collection.\n", "3. [SearchFacesByImage](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_SearchFacesByImage.html\") For a given input image, first detects the largest face in the image, and then searches the specified collection for matching faces. The operation compares the features of the input face with faces in the specified collection.\n", "4. [SearchFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_SearchFaces.html\") For a given input face ID, searches for matching faces in the collection the face belongs to. You get a face ID when you add a face to the collection using the IndexFaces operation. The operation compares the features of the input face with faces in the specified collection.\n", "5. [DeleteFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DeleteFaces.html\") Deletes faces from a collection. You specify a collection ID and an array of face IDs to remove from the collection.\n", "\n", "\n", "#### Other Collection APIs to try\n", "------\n", "- [ListCollections](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_ListCollections.html\") Returns a list of collection IDs in your account. \n", "- [ListFaces](\"https://docs.aws.amazon.com/rekognition/latest/APIReference/API_ListFaces.html\") Returns metadata for faces in the specified collection. This metadata includes information such as the bounding box coordinates, the confidence (that the bounding box contains a face), and the face ID.\n", "\n", "### Name your Collection \n", "\n", "------\n", "\n", "Here you can update the name of your collection.\n", "\n", "
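The next cell wraps CreateCollection in a try/except so the notebook can be re-run. If you prefer an explicit check, a sketch along these lines also works (the helper name is our own; it relies on the CreateCollection ResourceAlreadyExistsException and the DescribeCollection API):\n",
"\n",
"```python\n",
"def get_or_create_collection(client, collection_id):\n",
"    # Create the collection if it is missing, then return its description\n",
"    try:\n",
"        client.create_collection(CollectionId=collection_id)\n",
"    except client.exceptions.ResourceAlreadyExistsException:\n",
"        pass  # already exists -- safe to reuse\n",
"    return client.describe_collection(CollectionId=collection_id)\n",
"\n",
"# usage: get_or_create_collection(rek_client, collection_name)\n",
"```\n",
"\n", "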
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Name your Collection \n", "collection_name = \" \" # replace with your own collection name\n", "\n", "try:\n", " # try and create a colleciton, \n", " response = rek_client.create_collection(\n", " CollectionId=collection_name)\n", "except Exception as err:\n", " print(\"ERROR: {}\".format(err))\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing Faces \n", "-----\n", "\n", "Here we'll read our collection image mapping file into a panda's dataframe, the image map contains a reference name (used as an external identifier) a reference image (a synthtetically generated face) and four synthetically aged selfie images. \n", "\n", "
**Next Steps:**\n", "\n", "- IndexFaces: index a single image into our collection, then index multiple faces into our collection \n", "- SearchFacesByImage: search the collection for a specific image \n", "- SearchFaces: search the collection for a specific FaceID \n", "
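Before indexing, it can be worth confirming that the images referenced by the mapping file actually exist in the bucket, so that failures in the indexing loop are easier to interpret. This is an optional sketch (the helper name is our own) built on S3 HeadObject; you can apply it to the image map once it is loaded in the next cell.\n",
"\n",
"```python\n",
"from botocore.exceptions import ClientError\n",
"\n",
"def s3_object_exists(s3, bucket, key):\n",
"    # True if the object can be HEADed (it exists and is readable)\n",
"    try:\n",
"        s3.head_object(Bucket=bucket, Key=key)\n",
"        return True\n",
"    except ClientError:\n",
"        return False\n",
"\n",
"# usage (after the image map is loaded below):\n",
"# missing = [k for k in image_map['reference_image'] if not s3_object_exists(s3_client, bucket_name, k)]\n",
"```\n",
"\n", "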
\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# -- read the image map into a pandas dataframe --\n", "obj = s3_client.get_object(Bucket=bucket_name, Key='IDVImageMapping.xlsx')\n", "\n", "image_map = pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl')\n", "image_map.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### IndexFaces \n", "-----\n", "\n", "Here we are going to index a single face of Aaron Peirsol - an American former competition swimmer and backstroke specialist who is a former world champion and world record-holder.\n", "\n", "
**Make note of the FaceID**\n", "\n", "Make note of the FaceID extracted from the response JSON. We'll use it later to search for a specific face in the collection. \n", "
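IndexFaces does not always add a face: the quality filter can reject an image, in which case FaceRecords is empty and the reasons appear under UnindexedFaces. The small sketch below (the helper name is our own) makes that case explicit instead of assuming FaceRecords[0] exists, which is what the next cell does.\n",
"\n",
"```python\n",
"def extract_face_id(index_response):\n",
"    # Return the FaceId if a face was indexed, otherwise None\n",
"    records = index_response.get('FaceRecords', [])\n",
"    if not records:\n",
"        for uf in index_response.get('UnindexedFaces', []):\n",
"            print('Not indexed, reasons:', uf.get('Reasons'))\n",
"        return None\n",
"    return records[0]['Face']['FaceId']\n",
"\n",
"# usage (after running the next cell): extract_face_id(response)\n",
"```\n",
"\n", "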
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_name = \"face_7.jpeg\"\n", "image_external_id = \"Entity_X_Gentry\"\n", "\n", "# Index a single face \n", "response = rek_client.index_faces(\n", " CollectionId=collection_name,\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': image_name,\n", " }\n", " },\n", " ExternalImageId=image_external_id,\n", " DetectionAttributes=[\n", " 'DEFAULT',\n", " ],\n", " MaxFaces=1, # how many faces to detect \n", " QualityFilter='AUTO' # apply quality filter \n", ")\n", "\n", "face_id = response['FaceRecords'][0]['Face']['FaceId']\n", "print(\"-- FaceID : {} --\\n\".format(face_id))\n", "print(json.dumps(response, indent=3)) \n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Index Multiple Faces \n", "-----\n", "\n", "The easy way to index multiple faces is to simply make a list of faces and loop over the list, in this case we are going to use the \"reference name\" as our external image id and the reference image will be the image we add to our collection. Of course you can use a variety of parallel processing methods to speed up indexing operations, however in this case here we'll keep it simple and just use a for loop. Note we are using about 100 images so this may take a few seconds. \n", "\n", "
**Make note of the ExternalImageID**\n", "\n", "The ExternalImageID is the ID that you typically use to correlate a face match to your external systems; it can also be used to check identities. In this case our external image ID is simply the name of the person in the image. A parallel variant of the indexing loop is also sketched below.\n", "\n", "
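The loop in the next cell indexes images one at a time. As noted above, you can parallelize this; the sketch below uses a thread pool (our own choice, not part of the lab) and expects the `dict_of_faces` list built in the next cell. Keep the worker count modest to stay well within the Rekognition API rate limits for your account.\n",
"\n",
"```python\n",
"from concurrent.futures import ThreadPoolExecutor\n",
"\n",
"def index_one(rec):\n",
"    # Index a single reference image; returns (image, face_id or error message)\n",
"    try:\n",
"        resp = rek_client.index_faces(\n",
"            CollectionId=collection_name,\n",
"            Image={'S3Object': {'Bucket': bucket_name, 'Name': rec['reference_image']}},\n",
"            ExternalImageId=rec['reference_name'],\n",
"            MaxFaces=1,\n",
"            QualityFilter='AUTO')\n",
"        return rec['reference_image'], resp['FaceRecords'][0]['Face']['FaceId']\n",
"    except Exception as err:\n",
"        return rec['reference_image'], 'FAILED: {}'.format(err)\n",
"\n",
"# usage (after dict_of_faces is defined in the next cell):\n",
"# with ThreadPoolExecutor(max_workers=8) as pool:\n",
"#     for image, result in pool.map(index_one, dict_of_faces):\n",
"#         print(image, result)\n",
"```\n",
"\n", "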
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "## Index several faces \n", "dict_of_faces = image_map[[\"reference_name\",\"reference_image\"]].to_dict('records')\n", "\n", "for rec in dict_of_faces:\n", " try:\n", " response = rek_client.index_faces(\n", " CollectionId= collection_name,\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': rec[\"reference_image\"],\n", " }\n", " },\n", " ExternalImageId=rec['reference_name'],\n", " DetectionAttributes=[\n", " 'DEFAULT',\n", " ],\n", " MaxFaces=1, # maximum faces detected \n", " QualityFilter='AUTO' # apply the quality filter. \n", " )\n", " face_id = response['FaceRecords'][0]['Face']['FaceId']\n", " print(\"ImageName: {}, FaceID: {}\".format(rec[\"reference_image\"], face_id))\n", " except:\n", " print(\"Failed: ImageName: {}, FaceID: {}\".format(rec[\"reference_image\"], face_id))\n", " \n", "\n", "print(\"indexing complete\")\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SearchFacesByImage\n", "-----\n", "\n", "One of the primary use cases of identity verification is to simply search the collection of faces to see if a particular face already exists in the collection and assess the similarity to the indexed image. \n", "\n", "
**Things to Note**\n", "\n", "- MaxFaces: returns the maximum number of faces with the highest confidence in the match. Minimum value of 1. Maximum value of 4096.\n", "- FaceMatchThreshold: specifies the minimum confidence in the face match to return. For example, don't return any matches where confidence in matches is less than 70%. The default value is 80%. \n", "- QualityFilter: allows you to filter out detected faces that do not meet a required quality bar. The quality bar is based on a variety of common use cases. Use QualityFilter to set the quality bar for filtering by specifying LOW, MEDIUM, or HIGH.\n", "\n", "
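When the query face is not in the collection, SearchFacesByImage simply returns an empty FaceMatches array, so indexing into FaceMatches[0] (as the next cell does) would fail. The sketch below (the helper name is our own) handles that case; note also that the call itself can raise an error if no face can be detected in the query image at all.\n",
"\n",
"```python\n",
"def best_match(search_response):\n",
"    # Return (external_image_id, similarity) for the top match, or (None, 0.0) if nothing matched\n",
"    matches = search_response.get('FaceMatches', [])\n",
"    if not matches:\n",
"        return None, 0.0\n",
"    top = matches[0]\n",
"    return top['Face'].get('ExternalImageId'), top['Similarity']\n",
"\n",
"# usage (after running the next cell): best_match(response)\n",
"```\n",
"\n", "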
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "search_image_name = \"face_5_3.jpeg\"\n", "response = rek_client.search_faces_by_image(\n", " CollectionId=collection_name,\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': search_image_name,\n", " }\n", " },\n", " MaxFaces=1,\n", " FaceMatchThreshold=90,\n", " QualityFilter='AUTO'\n", ")\n", "response\n", "\n", "external_image_id = response[\"FaceMatches\"][0][\"Face\"][\"ExternalImageId\"]\n", "similarity_score = response[\"FaceMatches\"][0][\"Similarity\"]\n", "face_id = response[\"FaceMatches\"][0][\"Face\"][\"FaceId\"]\n", "\n", "print(\"-- FaceID : {} \".format(face_id))\n", "print(\"-- Similarity : {} \".format(similarity_score))\n", "print(\"-- ExternalImageID : {} \\n\".format(external_image_id))\n", "\n", "print(json.dumps(response, indent=3)) \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SearchFaces \n", "-----\n", "For a given input **face ID**, search the collection for matching faces. You get a face ID when you add a face to the collection using the IndexFaces operation(above) we are simply going to index additional faces below, and snag a couple of face ID. The operation compares the features of the input face with faces in the specified collection. Essentially matching faces within a collection. \n", "\n", "\n", "\n", "
**Setup**\n", "\n", "- Index additional faces; in this case we'll use our image_map DataFrame and take the first 10 selfie images \n", "- Make note of a face ID returned from the index \n", "- SearchFaces for a given *FaceID* and assess the match. \n", "\n", "
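The SearchFaces response has the same FaceMatches shape as SearchFacesByImage, so it is easy to flatten into a small table for inspection. The sketch below (the helper name is our own) does that, and pairs naturally with the pandas import from the top of the notebook.\n",
"\n",
"```python\n",
"def summarize_matches(search_response):\n",
"    # Flatten FaceMatches into simple dicts for quick inspection\n",
"    return [\n",
"        {\n",
"            'FaceId': m['Face']['FaceId'],\n",
"            'ExternalImageId': m['Face'].get('ExternalImageId'),\n",
"            'Similarity': round(m['Similarity'], 2),\n",
"        }\n",
"        for m in search_response.get('FaceMatches', [])\n",
"    ]\n",
"\n",
"# usage (after the search_faces call further down): pd.DataFrame(summarize_matches(response))\n",
"```\n",
"\n", "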
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Index several faces \n", "dict_of_faces = image_map[[\"reference_name\",\"selfie_0\"]].head(10).to_dict('records')\n", "\n", "for rec in dict_of_faces:\n", " \n", " response = rek_client.index_faces(\n", " CollectionId= collection_name,\n", " Image={\n", " 'S3Object': {\n", " 'Bucket': bucket_name,\n", " 'Name': rec[\"selfie_0\"],\n", " }\n", " },\n", " ExternalImageId=rec['reference_name'],\n", " DetectionAttributes=[\n", " 'DEFAULT',\n", " ],\n", " MaxFaces=1, # maximum faces detected \n", " QualityFilter='AUTO' # apply the quality filter. \n", " )\n", " face_id = response['FaceRecords'][0]['Face']['FaceId']\n", " print(\"ImageName: {}, FaceID: {}\".format(rec[\"selfie_0\"], face_id))\n", " \n", "\n", "print(\"indexing complete\")\n", " \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Snag one of the FaceIDs above \n", "----\n", "Pass the FaceID above to search_faces, this will identify potential matches within the collection. \n", "\n", "**Select a face id from the results above and enter it below** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "face_id = \"\" # replace with your own face ID\n", "try:\n", " response = rek_client.search_faces(\n", " CollectionId=collection_name,\n", " FaceId=face_id,\n", " MaxFaces=10\n", " )\n", " print(json.dumps(response, indent=2)) \n", "except Exception as err:\n", " print(\"ERROR: {}\".format(err))\n", "\n", "\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other Collection APIs\n", "-----\n", "\n", "There are several other APIs which we can take advantage of to manage and maintain our collection. \n", "\n", "
**Additional APIs**\n", "\n", "- ListCollections: Returns a list of collection IDs in your account.\n", "- DeleteFaces: Deletes faces from a collection. You specify a collection ID and an array of face IDs to remove from the collection.\n", "- ListFaces: Returns metadata for faces in the specified collection. This metadata includes information such as the bounding box coordinates, the confidence (that the bounding box contains a face), and the face ID.\n", "- DeleteCollection: Deletes the specified collection. Note that this operation removes all faces in the collection.\n", "\n", "
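These operations combine naturally into small maintenance helpers. For example, the sketch below (our own helper, not part of the lab) empties a collection by repeatedly listing faces and deleting them in batches, without deleting the collection itself.\n",
"\n",
"```python\n",
"def delete_all_faces(client, collection_id, batch_size=100):\n",
"    # Remove every face from a collection, returning how many were deleted\n",
"    deleted = 0\n",
"    while True:\n",
"        faces = client.list_faces(CollectionId=collection_id, MaxResults=batch_size)['Faces']\n",
"        if not faces:\n",
"            return deleted\n",
"        client.delete_faces(CollectionId=collection_id, FaceIds=[f['FaceId'] for f in faces])\n",
"        deleted += len(faces)\n",
"\n",
"# usage: delete_all_faces(rek_client, collection_name)\n",
"```\n",
"\n", "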
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ListCollections \n", "-----\n", "ListCollections returns an array of Collection Ids and corresponding FaceModelVersions. This is helpful information to note when upgrading to newer versions of Rekognition's faces model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## -- list collections \n", "response = rek_client.list_collections(\n", " MaxResults=123\n", ")\n", "\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### DeleteFaces \n", "\n", "------\n", "Deletes faces from a collection. You specify a collection ID and an array of face IDs to remove from the collection. To do this, snag two or more FaceIDs from the results above and pass them to the DeleteFaces API to remove them from the collection.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## -- Delete Faces\n", "## Snag a few faceIDs from above. \n", "response = rek_client.delete_faces(\n", " CollectionId=collection_name,\n", " FaceIds=[ \"\", \"\"] # enter faceIDs here\n", ")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### What happens when you pass a FaceID that doesn't exist? \n", "----\n", "Here we are simply going to search for a face ID that we deleted. You'll note we get an error that the FaceID does not exist." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Search for a face that doesn't exist \n", "\n", "face_id = \"48c04911-4573-439d-9389-bebc0994013a\" # replace with your own face ID\n", "try:\n", " response = rek_client.search_faces(\n", " CollectionId=collection_name,\n", " FaceId=face_id,\n", " MaxFaces=10\n", " )\n", "except Exception as err:\n", " print(\"ERROR: {}\".format(err))\n", " print(\"FaceID: \\'{}\\' NOT FOUND IN Collection: \\'{}\\' \".format(face_id,collection_name ))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ListFaces \n", "\n", "------\n", "To list the faces in a collection, use the ListFaces operation. ListFaces returns metadata for faces in the specified collection. This metadata includes information such as the bounding box coordinates, the confidence (that the bounding box contains a face), and the face ID. \n", "\n", "\n", "\n", "
**ListFaces API**\n", "\n", "- CollectionId: the collection identifier\n", "- MaxResults: maximum number of faces to return. Valid Range: Minimum value of 0. Maximum value of 4096.\n", "\n", "Note: the \"NextToken\" value in the response allows you to fetch the next set of faces. \n", "\n", "
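The next cell pages through the collection by following NextToken manually. If your boto3 version ships a paginator for ListFaces (most recent versions do), the same loop can be written more compactly; this is a sketch of that alternative.\n",
"\n",
"```python\n",
"paginator = rek_client.get_paginator('list_faces')\n",
"\n",
"all_faces = []\n",
"for page in paginator.paginate(CollectionId=collection_name,\n",
"                               PaginationConfig={'PageSize': 100}):\n",
"    all_faces.extend(page['Faces'])\n",
"\n",
"print('Total faces in collection:', len(all_faces))\n",
"```\n",
"\n", "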
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "maxResults = 100\n", "response=rek_client.list_faces(CollectionId=collection_name,\n", " MaxResults=maxResults)\n", "\n", "print('Faces in collection ' + collection_name)\n", "\n", "tokens=True\n", "while tokens:\n", " faces=response['Faces']\n", "\n", " for face in faces:\n", " print (face)\n", " if 'NextToken' in response:\n", " nextToken=response['NextToken']\n", " response=rek_client.list_faces(CollectionId=collection_name,\n", " NextToken=nextToken,MaxResults=maxResults)\n", " else:\n", " tokens=False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### DeleteCollection \n", "----- \n", "\n", "As part of our cleanup, we can a delete our collection. This will delete the collection and all the face vectors contained within " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " rek_client.delete_collection(\n", " CollectionId=collection_name\n", " )\n", "except Exception as err:\n", " print(\"ERROR: {}\".format(err))" ] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "vscode": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } } }, "nbformat": 4, "nbformat_minor": 4 }