# Clean Up Resources
This notebook cleans up all the resources created in a previous notebook by loading the variable names that notebook saved.

(!) If you created resources using multiple notebooks and have multiple dataset groups, run the cell that begins with `# store for cleanup` at the end of each of those notebooks and then, immediately after, run this notebook. Repeat this for each dataset group (if you created two dataset groups, you will have to run this notebook twice). This code only deletes the resources specified in `# store for cleanup`, so if you used multiple notebooks, the saved values correspond to the last notebook you ran.

This notebook uses the functions defined below to iterate through the resources inside a dataset group.

You can use this notebook to delete Amazon Personalize resources. The resource ARNs are defined in the previous notebook, or you can enter them manually below.

In [None]:
%store -r

In [None]:
# If you cannot or do not want to use "%store -r" to load the resources to delete,
# you can uncomment the code below and enter them manually

# dataset_group_arn='XXXXX'
# role_name='XXXXX'
# region='XXXXX'

Print the resources to be deleted. Please check that these correspond to the resources you want to delete. 
This operation cannot be undone.

In [None]:
print ('dataset_group_arn:', dataset_group_arn)
print ('role_name:', role_name)
print ('region:', region)
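
Because the deletion cannot be undone, it can help to check the ARN programmatically before going further. The following is a minimal sketch, not part of the original notebook: `check_dataset_group_arn` is a hypothetical helper, the ARN layout in the comment is an assumption about how dataset group ARNs are formed, and the example ARN is made up.

```python
import re

# Assumed layout: arn:<partition>:personalize:<region>:<account-id>:dataset-group/<name>
_DATASET_GROUP_ARN = re.compile(
    r'^arn:(?P<partition>aws[a-z-]*):personalize:(?P<region>[a-z0-9-]+):'
    r'(?P<account>\d{12}):dataset-group/(?P<name>.+)$'
)

def check_dataset_group_arn(arn, expected_region=None):
    """Hypothetical helper: return the parts of the ARN, or raise ValueError."""
    match = _DATASET_GROUP_ARN.match(arn)
    if match is None:
        raise ValueError('Not a dataset group ARN: {!r}'.format(arn))
    parts = match.groupdict()
    # Only compare regions when the caller supplied one
    if expected_region is not None and parts['region'] != expected_region:
        raise ValueError('ARN region {} does not match {}'.format(
            parts['region'], expected_region))
    return parts

# Example with a made-up ARN (the account ID and name are placeholders)
parts = check_dataset_group_arn(
    'arn:aws:personalize:us-east-1:123456789012:dataset-group/example',
    'us-east-1')
print(parts['name'])
```

In this notebook you would call it as `check_dataset_group_arn(dataset_group_arn, region)`; a `ValueError` at this point is cheaper than deleting the wrong dataset group.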

In [None]:
schema_arns = []

In [None]:
import sys
import logging
import time

import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger()
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
logger.setLevel(logging.INFO)
logger.addHandler(handler)

personalize = None

In [None]:
def _delete_event_trackers(dataset_group_arn):
    event_tracker_arns = []

    event_trackers_paginator = personalize.get_paginator('list_event_trackers')
    for event_tracker_page in event_trackers_paginator.paginate(datasetGroupArn = dataset_group_arn):
        for event_tracker in event_tracker_page['eventTrackers']:
            if event_tracker['status'] in [ 'ACTIVE', 'CREATE FAILED' ]:
                logger.info('Deleting event tracker {}'.format(event_tracker['eventTrackerArn']))
                personalize.delete_event_tracker(eventTrackerArn = event_tracker['eventTrackerArn'])
            elif event_tracker['status'].startswith('DELETE'):
                logger.warning('Event tracker {} is already being deleted so will wait for delete to complete'.format(event_tracker['eventTrackerArn']))
            else:
                raise Exception('Event tracker {} has a status of {} so cannot be deleted'.format(event_tracker['eventTrackerArn'], event_tracker['status']))

            event_tracker_arns.append(event_tracker['eventTrackerArn'])

    max_time = time.time() + 30*60 # 30 mins
    while time.time() < max_time:
        # Iterate over a copy so ARNs can be removed from the list safely
        for event_tracker_arn in list(event_tracker_arns):
            try:
                describe_response = personalize.describe_event_tracker(eventTrackerArn = event_tracker_arn)
                logger.debug('Event tracker {} status is {}'.format(event_tracker_arn, describe_response['eventTracker']['status']))
            except ClientError as e:
                error_code = e.response['Error']['Code']
                if error_code == 'ResourceNotFoundException':
                    event_tracker_arns.remove(event_tracker_arn)

        if len(event_tracker_arns) == 0:
            logger.info('All event trackers have been deleted or none exist for dataset group')
            break
        else:
            logger.info('Waiting for {} event tracker(s) to be deleted'.format(len(event_tracker_arns)))
            time.sleep(20)

    if len(event_tracker_arns) > 0:
        raise Exception('Timed out waiting for all event trackers to be deleted')

def _delete_filters(dataset_group_arn):
    filter_arns = []

    filters_response = personalize.list_filters(datasetGroupArn = dataset_group_arn, maxResults = 100)
    for filter in filters_response['Filters']:
        logger.info('Deleting filter ' + filter['filterArn'])
        personalize.delete_filter(filterArn = filter['filterArn'])
        filter_arns.append(filter['filterArn'])

    max_time = time.time() + 30*60 # 30 mins
    while time.time() < max_time:
        # Iterate over a copy so ARNs can be removed from the list safely
        for filter_arn in list(filter_arns):
            try:
                describe_response = personalize.describe_filter(filterArn = filter_arn)
                logger.debug('Filter {} status is {}'.format(filter_arn, describe_response['filter']['status']))
            except ClientError as e:
                error_code = e.response['Error']['Code']
                if error_code == 'ResourceNotFoundException':
                    filter_arns.remove(filter_arn)

        if len(filter_arns) == 0:
            logger.info('All filters have been deleted or none exist for dataset group')
            break
        else:
            logger.info('Waiting for {} filter(s) to be deleted'.format(len(filter_arns)))
            time.sleep(20)

    if len(filter_arns) > 0:
        raise Exception('Timed out waiting for all filter(s) to be deleted')
 
def _delete_recommenders(dataset_group_arn):
    recommender_arns = []
    recommenders_response = personalize.list_recommenders(datasetGroupArn = dataset_group_arn, maxResults = 100)
    for recommender in recommenders_response['recommenders']:
        recommender_status = personalize.describe_recommender(recommenderArn = recommender['recommenderArn'])['recommender']['status']
        # Skip the delete call if a delete is already pending or in progress
        if not recommender_status.startswith('DELETE'):
            logger.info('Deleting recommender ' + recommender['recommenderArn'])
            personalize.delete_recommender(recommenderArn = recommender['recommenderArn'])
        recommender_arns.append(recommender['recommenderArn'])

    max_time = time.time() + 30*60 # 30 mins
    while time.time() < max_time:
        # Iterate over a copy so ARNs can be removed from the list safely
        for recommender_arn in list(recommender_arns):
            try:
                describe_response = personalize.describe_recommender(recommenderArn = recommender_arn)
                logger.debug('Recommender {} status is {}'.format(recommender_arn, describe_response['recommender']['status']))
            except ClientError as e:
                error_code = e.response['Error']['Code']
                if error_code == 'ResourceNotFoundException':
                    recommender_arns.remove(recommender_arn)

        if len(recommender_arns) == 0:
            logger.info('All recommenders have been deleted or none exist for dataset group')
            break
        else:
            logger.info('Waiting for {} recommender(s) to be deleted'.format(len(recommender_arns)))
            time.sleep(20)

    if len(recommender_arns) > 0:
        raise Exception('Timed out waiting for all recommender(s) to be deleted')
 

def _delete_datasets_and_schemas(dataset_group_arn, schema_arns):
    dataset_arns = []

    dataset_paginator = personalize.get_paginator('list_datasets')
    for dataset_page in dataset_paginator.paginate(datasetGroupArn = dataset_group_arn):
        for dataset in dataset_page['datasets']:
            # Record the schema before the dataset is deleted so it can be cleaned up afterwards
            describe_response = personalize.describe_dataset(datasetArn = dataset['datasetArn'])
            schema_arns.append(describe_response['dataset']['schemaArn'])

            if dataset['status'] in ['ACTIVE', 'CREATE FAILED']:
                logger.info('Deleting dataset ' + dataset['datasetArn'])
                personalize.delete_dataset(datasetArn = dataset['datasetArn'])
            elif dataset['status'].startswith('DELETE'):
                logger.warning('Dataset {} is already being deleted so will wait for delete to complete'.format(dataset['datasetArn']))
            else:
                raise Exception('Dataset {} has a status of {} so cannot be deleted'.format(dataset['datasetArn'], dataset['status']))

            dataset_arns.append(dataset['datasetArn'])

    max_time = time.time() + 30*60 # 30 mins
    while time.time() < max_time:
        # Iterate over a copy so ARNs can be removed from the list safely
        for dataset_arn in list(dataset_arns):
            try:
                describe_response = personalize.describe_dataset(datasetArn = dataset_arn)
                logger.debug('Dataset {} status is {}'.format(dataset_arn, describe_response['dataset']['status']))
            except ClientError as e:
                error_code = e.response['Error']['Code']
                if error_code == 'ResourceNotFoundException':
                    dataset_arns.remove(dataset_arn)

        if len(dataset_arns) == 0:
            logger.info('All datasets have been deleted or none exist for dataset group')
            break
        else:
            logger.info('Waiting for {} dataset(s) to be deleted'.format(len(dataset_arns)))
            time.sleep(20)

    if len(dataset_arns) > 0:
        raise Exception('Timed out waiting for all datasets to be deleted')

    for schema_arn in schema_arns:
        try:
            logger.info('Deleting schema ' + schema_arn)
            personalize.delete_schema(schemaArn = schema_arn)
        except ClientError as e:
            error_code = e.response['Error']['Code']
            if error_code == 'ResourceInUseException':
                logger.info('Schema {} is still in-use by another dataset (likely in another dataset group)'.format(schema_arn))
            else:
                raise e

    logger.info('All schemas used exclusively by datasets have been deleted or none exist for dataset group')

def _delete_dataset_group(dataset_group_arn):
    logger.info('Deleting dataset group ' + dataset_group_arn)
    personalize.delete_dataset_group(datasetGroupArn = dataset_group_arn)

    max_time = time.time() + 30*60 # 30 mins
    while time.time() < max_time:
        try:
            describe_response = personalize.describe_dataset_group(datasetGroupArn = dataset_group_arn)
            logger.debug('Dataset group {} status is {}'.format(dataset_group_arn, describe_response['datasetGroup']['status']))
        except ClientError as e:
            error_code = e.response['Error']['Code']
            if error_code == 'ResourceNotFoundException':
                # The dataset group no longer exists, so the delete has completed
                logger.info('Dataset group {} has been fully deleted'.format(dataset_group_arn))
                return
            else:
                raise e

        logger.info('Waiting for dataset group to be deleted')
        time.sleep(20)

    raise Exception('Timed out waiting for dataset group to be deleted')

def _delete_metric_attributions(dataset_group_arn):
    # Delete the metric attributions in this dataset group
    metric_attribution_list = personalize.list_metric_attributions(datasetGroupArn = dataset_group_arn)['metricAttributions']

    for metric in metric_attribution_list:
        logger.info('Deleting metric attribution ' + metric['metricAttributionArn'])
        personalize.delete_metric_attribution(
            metricAttributionArn = metric['metricAttributionArn']
        )

    if metric_attribution_list:
        # Wait a fixed 2 minutes for the metric attributions to finish deleting
        time.sleep(120)

    logger.info('All metric attributions have been deleted or none exist for dataset group')

def delete_dataset_groups(dataset_group_arns, schema_arns, region = None):
    global personalize
    personalize = boto3.client(service_name = 'personalize', region_name = region)

    for dataset_group_arn in dataset_group_arns:
        logger.info('Dataset Group ARN: ' + dataset_group_arn)

        # 1. Delete recommenders
        _delete_recommenders(dataset_group_arn)

        # 2. Delete event trackers
        _delete_event_trackers(dataset_group_arn)

        # 3. Delete filters
        _delete_filters(dataset_group_arn)

        # 4. Delete metric attributions
        _delete_metric_attributions(dataset_group_arn)

        # 5. Delete datasets and their schemas
        _delete_datasets_and_schemas(dataset_group_arn, schema_arns)

        # 6. Delete dataset group
        _delete_dataset_group(dataset_group_arn)

        logger.info(f'Dataset group {dataset_group_arn} fully deleted')


In [None]:
delete_dataset_groups([dataset_group_arn], schema_arns, region)

## Clean up the IAM role
Next, delete the IAM role. Start by creating an IAM client.

In [None]:
iam = boto3.Session().client(
 service_name='iam', region_name=region)

Identify the name of the role you want to delete.

You cannot delete an IAM role that still has policies attached to it. After you have identified the relevant role, list its attached policies and detach them.

In [None]:
attached_policies = iam.list_attached_role_policies(RoleName = role_name)
print(attached_policies)

# Detach the managed policies from the role
for attached_policy in attached_policies['AttachedPolicies']:
    iam.detach_role_policy(
        RoleName = role_name,
        PolicyArn = attached_policy['PolicyArn']
    )

In [None]:
# Remove the inline policies from the role
in_line_policies = iam.list_role_policies(
    RoleName = role_name
)
print(in_line_policies)

for in_line_policy in in_line_policies['PolicyNames']:
    iam.delete_role_policy(
        RoleName = role_name,
        PolicyName = in_line_policy
    )


Finally, you should be able to delete the IAM role.

In [None]:
iam.delete_role(
 RoleName = role_name
)
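
The cells above read only the first page of each policy listing, which is normally enough for a role with a handful of policies. For a role with many policies, the same three steps could be sketched with paginators, as below. `delete_role_and_policies` is a hypothetical helper, not part of this notebook, and `iam` is assumed to be a boto3 IAM client like the one created earlier.

```python
def delete_role_and_policies(iam, role_name):
    """Hypothetical helper: detach managed policies, delete inline ones, delete the role."""
    # Collect everything first so the listings are not modified while paging
    policy_arns = []
    attached = iam.get_paginator('list_attached_role_policies')
    for page in attached.paginate(RoleName=role_name):
        policy_arns.extend(p['PolicyArn'] for p in page['AttachedPolicies'])

    policy_names = []
    inline = iam.get_paginator('list_role_policies')
    for page in inline.paginate(RoleName=role_name):
        policy_names.extend(page['PolicyNames'])

    for policy_arn in policy_arns:
        iam.detach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    for policy_name in policy_names:
        iam.delete_role_policy(RoleName=role_name, PolicyName=policy_name)

    # The role can only be deleted once no policies remain on it
    iam.delete_role(RoleName=role_name)

# Possible usage with the client and role name from this notebook:
# delete_role_and_policies(iam, role_name)
```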

## Deleting the Amazon S3 bucket
An Amazon S3 bucket must be empty before it can be deleted. The easiest way to delete a bucket is to navigate to Amazon S3 in the AWS console, delete the objects in the bucket, and then delete the bucket itself.
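
If you prefer to do this from code, the same two steps (empty the bucket, then delete it) can be sketched with the boto3 resource API. This is an illustration under assumptions rather than part of the original notebook: `empty_and_delete_bucket` is a hypothetical helper, `bucket` is assumed to be a boto3 `s3.Bucket` resource, and `bucket_name` in the usage comment is a placeholder because this notebook does not load a bucket name.

```python
def empty_and_delete_bucket(bucket):
    """Hypothetical helper: empty a bucket, then delete it."""
    # Remove every current object
    bucket.objects.all().delete()
    # Remove any remaining object versions and delete markers (versioned buckets)
    bucket.object_versions.all().delete()
    # The bucket is now empty and can be deleted
    bucket.delete()

# Possible usage (bucket_name is a placeholder you would set yourself):
# import boto3
# empty_and_delete_bucket(boto3.resource('s3').Bucket(bucket_name))
```

As with the other cleanup steps, this cannot be undone, so check the bucket name before running it.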