# Improving face match accuracy with Amazon Rekognition

In this lab we will see how indexing multiple face profiles of a person into a collection leads to better face match accuracy than indexing a single face profile. In real-life scenarios, selfies may suffer from poor lighting or camera quality, which can reduce face match accuracy. For this reason, we encourage customers to take multiple selfies, which improves accuracy.

## Steps

We will complete the following steps:
- **Step 0 - Load libraries**
- **Step 1 - List existing collections**
- **Step 2 - Create new collections**
- **Step 3 - Populate collections**
- **Step 4 - Search faces against both collections**
- **Step 5 - Review results**
- **Step 6 - Clean up resources**

## Step 0 - Load libraries

In [None]:
!pip install -qU opencv-python-headless
import boto3, os, io, glob, cv2
import matplotlib.pyplot as plt
%matplotlib inline 
client=boto3.client('rekognition')

## Step 1 - List existing collections 

Before we create new collections, let's check whether any collections already exist in our account.

In [None]:
def list_collections():
    # Print every collection ID in the account and return how many there are.
    max_results = 10

    print('Displaying collections...')
    response = client.list_collections(MaxResults=max_results)
    collection_count = 0
    done = False

    while not done:
        collections = response['CollectionIds']

        for collection in collections:
            print(collection)
            collection_count += 1

        # Keep requesting pages until the service stops returning a NextToken.
        if 'NextToken' in response:
            nextToken = response['NextToken']
            response = client.list_collections(NextToken=nextToken, MaxResults=max_results)
        else:
            done = True

    return collection_count

collection_count=list_collections()

print("There are: {} collections in your account ".format(collection_count))
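
The manual `NextToken` loop above can also be written with a boto3 paginator. The following is a minimal sketch, not part of the original lab: the helper name `iter_collection_ids` and its `page_size` parameter are hypothetical, and it expects the Rekognition `client` from Step 0 to be passed in.

```python
def iter_collection_ids(rekognition_client, page_size=10):
    """Yield every collection ID, letting the paginator follow NextToken."""
    paginator = rekognition_client.get_paginator('list_collections')
    pages = paginator.paginate(PaginationConfig={'PageSize': page_size})
    for page in pages:
        # Each page has the same shape as one list_collections response.
        for collection_id in page.get('CollectionIds', []):
            yield collection_id
```

For example, `len(list(iter_collection_ids(client)))` should report the same count as `list_collections()`.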

## Step 2 - Create new collections

In this section we will create two collections so that we can compare the results when a collection holds a single face profile versus multiple face profiles.

| ⚠️ WARNING: Assign a unique name to each collection inside the quotes in the next cell |
| -- |

In [None]:
collection_A='' # PROVIDE A UNIQUE NAME FOR COLLECTION A
collection_B='' # PROVIDE A UNIQUE NAME FOR COLLECTION B
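
If you would rather generate unique names than type them, a sketch along these lines may help. It assumes that collection IDs may contain only letters, digits, `_`, `.`, and `-`, with at most 255 characters; the helper name `make_collection_id` and the example prefixes are hypothetical.

```python
import re
import uuid

# Assumed constraint on collection IDs: letters, digits, '_', '.', '-'.
_COLLECTION_ID_PATTERN = re.compile(r'[a-zA-Z0-9_.\-]+')
_MAX_COLLECTION_ID_LENGTH = 255

def make_collection_id(prefix):
    """Return the prefix plus a short random suffix, or raise ValueError."""
    candidate = '{}-{}'.format(prefix, uuid.uuid4().hex[:8])
    if (len(candidate) > _MAX_COLLECTION_ID_LENGTH
            or not _COLLECTION_ID_PATTERN.fullmatch(candidate)):
        raise ValueError('Invalid collection ID: ' + candidate)
    return candidate
```

You could then set `collection_A = make_collection_id('lab-single-profile')` and `collection_B = make_collection_id('lab-multiple-profiles')`.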

In [None]:
def create_collection(collection_id):
    # Create a collection and report its ARN and the returned status code.
    print('Creating collection: ' + collection_id)
    try:
        response = client.create_collection(CollectionId=collection_id)
        print('Collection ARN: ' + response['CollectionArn'])
        print('Status code: ' + str(response['StatusCode']))
        print('Done.')
    except Exception as e:
        # For example, the name is empty or a collection with this name already exists.
        print(e)

create_collection(collection_A)
create_collection(collection_B)

### Step 2.a - Confirm that your collections were created

In [None]:
collection_count=list_collections()
print("There are: {} collections in your account ".format(collection_count))

## Step 3 - Populate collections 

### Step 3.a - Populate collection A

First we will index a single face into collection A. 

In [None]:
def populate_collection(collection, directory):
    for filename in os.listdir(directory):
        f = os.path.join(directory, filename)
        # Skip anything that is not a regular file (for example, subdirectories).
        if not os.path.isfile(f):
            continue
        print(f)
        # Open for [r]eading as [b]inary; the with block closes the file for us.
        with open(f, "rb") as file:
            data = file.read()
        # Use the file name as the external ID so matches can be traced back to an image.
        response = client.index_faces(CollectionId=collection,
                                      Image={'Bytes': data},
                                      ExternalImageId=filename,
                                      MaxFaces=1,
                                      QualityFilter="AUTO",
                                      DetectionAttributes=['ALL'])
        print('Results for ' + filename)
        print('Faces indexed:')
        for faceRecord in response['FaceRecords']:
            print('  Face ID : {}'.format(faceRecord['Face']['FaceId']))
            print('  Location: {}'.format(faceRecord['Face']['BoundingBox']))

        if len(response['UnindexedFaces']) > 0:
            print('Faces not indexed:')
            for unindexed_face in response['UnindexedFaces']:
                print('  Location: {}'.format(unindexed_face['FaceDetail']['BoundingBox']))
                print('  Reasons :')
                for reason in unindexed_face['Reasons']:
                    print('    ' + reason)

In [None]:
directory_A = 'media/single-profile'

In [None]:
populate_collection(collection_A,directory_A) 

In [None]:
img = cv2.imread("media/single-profile/dani1.jpg")[:,:,::-1]
fig = plt.figure(figsize=(10, 7))
fig.add_subplot(1, 4, 1)
plt.imshow(img)
plt.axis('off')
plt.title("dani1")

### Step 3.b - Populate collection B

Now we will populate collection B with multiple face profiles. Having multiple images of the same person should improve the face match results.

In [None]:
directory_B = 'media/multiple-profiles'

In [None]:
populate_collection(collection_B,directory_B) 
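
To confirm that both collections were populated as intended, you can ask Rekognition how many faces each one holds. This is a minimal sketch that is not part of the original lab; the helper name `face_count` is hypothetical, and it relies on the `FaceCount` field of the `describe_collection` response.

```python
def face_count(rekognition_client, collection_id):
    """Return the number of faces currently indexed in a collection."""
    response = rekognition_client.describe_collection(CollectionId=collection_id)
    return response['FaceCount']
```

If every image was indexed, `face_count(client, collection_A)` should return 1, and `face_count(client, collection_B)` should equal the number of images in `media/multiple-profiles`.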

In [None]:
fig = plt.figure(figsize=(10, 7))
images = glob.glob("media/multiple-profiles/*.jpg")
for idx, image in enumerate(images):
    img = cv2.imread(image)[:,:,::-1]
    fig.add_subplot(1, len(images), idx+1)
    plt.imshow(img)
    plt.axis('off')
    plt.title(image.split("/")[-1])

## Step 4 - Search faces against both collections

Now we will search the same photo against both collections to compare the similarity confidence.

In [None]:
with open("media/test-images/test1.jpg", "rb") as file:  # opening for [r]eading as [b]inary
    data = file.read()
img = cv2.imread("media/test-images/test1.jpg")[:,:,::-1]
fig = plt.figure(figsize=(10, 7))
fig.add_subplot(1, 4, 1)
plt.imshow(img)
plt.axis('off')
plt.title("test")

In [None]:
def search_face(data, collection):
    # Return faces in the collection that match the image with at least 50% similarity.
    searchresults = client.search_faces_by_image(CollectionId=collection,
                                                 Image={'Bytes': data},
                                                 FaceMatchThreshold=50)
    return searchresults

In [None]:
searchA = search_face(data,collection_A)
searchB = search_face(data,collection_B)

## Step 5 - Review results

Let's look at the results of the search against collection A (one profile).

In [None]:
print(searchA)

Now let's see the results of the search against collection B (multiple profiles).

In [None]:
print(searchB)
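
The raw responses are verbose, so it can help to reduce each one to the fields that matter for this comparison. The following sketch is not part of the original lab; the helper name `match_table` is hypothetical, and it falls back to the face ID when a match carries no `ExternalImageId`.

```python
def match_table(search_response):
    """Return (image label, similarity) pairs, highest similarity first."""
    rows = []
    for match in search_response.get('FaceMatches', []):
        face = match['Face']
        # ExternalImageId is the file name assigned when the face was indexed.
        label = face.get('ExternalImageId', face['FaceId'])
        rows.append((label, match['Similarity']))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Calling `match_table(searchB)` lists which of the indexed profile images matched the test photo and how strongly.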

Let's compare the best similarity scores. When the collection holds multiple profile pictures, the best face similarity score should improve.

In [None]:
print("Using a single face profile in a collection, the similarity score is {}".format(searchA["FaceMatches"][0]["Similarity"]))
print("Using multiple face profiles in a collection, the best similarity score is {}".format(searchB["FaceMatches"][0]["Similarity"]))
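
Indexing `["FaceMatches"][0]` raises an `IndexError` when no face clears the threshold. The following is a hedged sketch of a comparison that tolerates an empty result; the helper names `best_similarity` and `describe_best` are hypothetical.

```python
def best_similarity(search_response):
    """Return the highest Similarity in a response, or None if nothing matched."""
    similarities = [match['Similarity']
                    for match in search_response.get('FaceMatches', [])]
    return max(similarities) if similarities else None

def describe_best(label, search_response):
    """Build a one-line summary of the best match for a collection."""
    best = best_similarity(search_response)
    if best is None:
        return '{}: no match above the threshold'.format(label)
    return '{}: best similarity {:.2f}'.format(label, best)
```

For example, `print(describe_best('Collection A', searchA))` and `print(describe_best('Collection B', searchB))` give a side-by-side view of the two searches.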

## Step 6 - Clean up resources

Let's delete the collections we created in our account.

In [None]:
from botocore.exceptions import ClientError  # needed for the except clause below

def delete_collection(collection_id):

    print('Attempting to delete collection ' + collection_id)
    status_code = 0
    try:
        response = client.delete_collection(CollectionId=collection_id)
        status_code = response['StatusCode']

    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFoundException':
            print('The collection ' + collection_id + ' was not found')
        else:
            print('Error other than Not Found occurred: ' + e.response['Error']['Message'])
        status_code = e.response['ResponseMetadata']['HTTPStatusCode']
    print('Status code: ' + str(status_code))


delete_collection(collection_A)
delete_collection(collection_B)
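
To double-check the cleanup, you can list the account's collections again and look for the names you just deleted. This is a minimal sketch, not part of the original lab; the helper name `still_present` is hypothetical.

```python
def still_present(rekognition_client, collection_ids):
    """Return the subset of collection_ids that still exist in the account."""
    existing = set()
    kwargs = {}
    while True:
        response = rekognition_client.list_collections(**kwargs)
        existing.update(response.get('CollectionIds', []))
        # Stop once the service no longer returns a NextToken.
        if 'NextToken' not in response:
            break
        kwargs['NextToken'] = response['NextToken']
    return [cid for cid in collection_ids if cid in existing]
```

After a successful cleanup, `still_present(client, [collection_A, collection_B])` should return an empty list.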
