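# Module 3. Creating Personalize Campaigns

This notebook builds on the solutions created in Module 2 and performs the following tasks.
* Create campaigns
* Use the campaigns to get a list of recommended movies for a specific user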


## Importing Libraries

Python ships with a broad collection of libraries. For this hands-on lab, we import boto3 (the AWS SDK for Python) along with core data science libraries such as Pandas and NumPy.

In [1]:
# Imports
import boto3
import json
import numpy as np
import pandas as pd
import time
from datetime import datetime

Next, you need to confirm that your environment can communicate successfully with Amazon Personalize.

In [2]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

The code cell below loads the shared variables that were stored in the previous notebook.

In [3]:
%store -r

Define a suffix so that a random number is appended to the name of each object created.

In [4]:
suffix = str(np.random.uniform())[4:9]
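
Slicing the string form of a random float usually yields five digits, but it can yield fewer when the float has a short representation (for example, `str(0.5)[4:9]` is an empty string). As an alternative sketch, the hypothetical helper `make_suffix` below always returns a fixed number of digits using only the standard library.

```python
import random


def make_suffix(n_digits=5, rng=random):
    """Return a string of exactly n_digits random decimal digits (hypothetical helper)."""
    return "".join(str(rng.randint(0, 9)) for _ in range(n_digits))


example_suffix = make_suffix()
print("DEMO-hrnn-campaign-" + example_suffix)
```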

In [5]:
create_campaign_response = personalize.create_campaign(
    name = "DEMO-hrnn-campaign-" + suffix,
    solutionVersionArn = hrnn_solution_version_arn,
    minProvisionedTPS = 1
)

hrnn_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:ap-northeast-2:870180618679:campaign/DEMO-hrnn-campaign-83882",
  "ResponseMetadata": {
    "RequestId": "1267eddb-4f25-4738-9fe5-3c452794abd5",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 14 Jul 2020 03:50:52 GMT",
      "x-amzn-requestid": "1267eddb-4f25-4738-9fe5-3c452794abd5",
      "content-length": "99",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [6]:
create_campaign_response = personalize.create_campaign(
    name = "DEMO-hrnn-coldstart-campaign-" + suffix,
    solutionVersionArn = hrnn_coldstart_solution_version_arn,
    minProvisionedTPS = 1
)

hrnn_coldstart_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:ap-northeast-2:870180618679:campaign/DEMO-hrnn-coldstart-campaign-83882",
  "ResponseMetadata": {
    "RequestId": "757ef066-b24e-44b8-af7b-1ae5110972ae",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 14 Jul 2020 03:50:53 GMT",
      "x-amzn-requestid": "757ef066-b24e-44b8-af7b-1ae5110972ae",
      "content-length": "109",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [7]:
create_campaign_response = personalize.create_campaign(
    name = "DEMO-sims-campaign-" + suffix,
    solutionVersionArn = sims_solution_version_arn,
    minProvisionedTPS = 1
)

sims_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:ap-northeast-2:870180618679:campaign/DEMO-sims-campaign-83882",
  "ResponseMetadata": {
    "RequestId": "14f06158-6ef1-4d0c-bf32-4ddbf5fbf875",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 14 Jul 2020 03:50:53 GMT",
      "x-amzn-requestid": "14f06158-6ef1-4d0c-bf32-4ddbf5fbf875",
      "content-length": "99",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


In [8]:
create_campaign_response = personalize.create_campaign(
    name = "DEMO-ranking-campaign-" + suffix,
    solutionVersionArn = ranking_solution_version_arn,
    minProvisionedTPS = 1
)

ranking_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:ap-northeast-2:870180618679:campaign/DEMO-ranking-campaign-83882",
  "ResponseMetadata": {
    "RequestId": "1c64e04a-aa6e-4fbc-bbf2-bdf3cce71f6b",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 14 Jul 2020 03:50:55 GMT",
      "x-amzn-requestid": "1c64e04a-aa6e-4fbc-bbf2-bdf3cce71f6b",
      "content-length": "102",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


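### Creating Campaigns and Waiting

Now that you have working solution versions, you need to create campaigns to use with your application. A campaign is simply a hosted copy of your model. Provisioning the infrastructure takes some time, so the cell below polls every 60 seconds until each campaign reaches `ACTIVE` or `CREATE FAILED`.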

In [9]:
%%time

max_time = time.time() + 3*60*60 # 3 hours

# Campaigns to monitor, keyed by the label used in the log output
campaign_arns = {
    "HRNN_Campaign": hrnn_campaign_arn,
    "HRNN_Coldstart_Campaign": hrnn_coldstart_campaign_arn,
    "Sims_Campaign": sims_campaign_arn,
    "Ranking_Campaign": ranking_campaign_arn,
}
final_states = ("ACTIVE", "CREATE FAILED")

while time.time() < max_time:
    statuses = []
    for label, campaign_arn in campaign_arns.items():
        describe_campaign_response = personalize.describe_campaign(
            campaignArn = campaign_arn
        )
        status = describe_campaign_response["campaign"]["status"]
        statuses.append(status)
        print("{}: {}".format(label, status))

    # Stop once every campaign is either ACTIVE or CREATE FAILED
    if all(status in final_states for status in statuses):
        break
    print("-------------------------------------->")
    time.sleep(60)

print("All Campaign creation completed")

HRNN_Campaign: CREATE PENDING
HRNN_Coldstart_Campaign: CREATE PENDING
Sims_Campaign: CREATE PENDING
Ranking_Campaign: CREATE PENDING
-------------------------------------->
HRNN_Campaign: CREATE IN_PROGRESS
HRNN_Coldstart_Campaign: CREATE IN_PROGRESS
Sims_Campaign: CREATE IN_PROGRESS
Ranking_Campaign: CREATE IN_PROGRESS
-------------------------------------->
HRNN_Campaign: CREATE IN_PROGRESS
HRNN_Coldstart_Campaign: CREATE IN_PROGRESS
Sims_Campaign: CREATE IN_PROGRESS
Ranking_Campaign: CREATE IN_PROGRESS
-------------------------------------->
HRNN_Campaign: CREATE IN_PROGRESS
HRNN_Coldstart_Campaign: CREATE IN_PROGRESS
Sims_Campaign: CREATE IN_PROGRESS
Ranking_Campaign: CREATE IN_PROGRESS
-------------------------------------->
HRNN_Campaign: CREATE IN_PROGRESS
HRNN_Coldstart_Campaign: CREATE IN_PROGRESS
Sims_Campaign: CREATE IN_PROGRESS
Ranking_Campaign: CREATE IN_PROGRESS
-------------------------------------->
HRNN_Campaign: CREATE IN_PROGRESS
HRNN_Coldstart_Campaign: CREATE IN_PR
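
The same wait-and-poll pattern appears again later for the batch job, so it can be factored into one function. The sketch below is an illustration under assumptions: `wait_until_done` and its parameters are hypothetical names, and the status source is passed in as a callable so that the loop can be exercised with fake data instead of a real campaign.

```python
import time


def wait_until_done(get_status, label, poll_seconds=60,
                    timeout_seconds=3 * 60 * 60, sleep=time.sleep):
    """Poll get_status() until ACTIVE or CREATE FAILED; return the last status seen."""
    deadline = time.time() + timeout_seconds
    status = None
    while time.time() < deadline:
        status = get_status()
        print("{}: {}".format(label, status))
        if status in ("ACTIVE", "CREATE FAILED"):
            break
        sleep(poll_seconds)
    return status


# Demonstration with a fake status source instead of a real campaign
fake_statuses = iter(["CREATE PENDING", "CREATE IN_PROGRESS", "ACTIVE"])
final = wait_until_done(lambda: next(fake_statuses), "Demo_Campaign",
                        poll_seconds=0, sleep=lambda seconds: None)
```

In the notebook, `get_status` could be a lambda that calls `personalize.describe_campaign(campaignArn=hrnn_campaign_arn)` and returns `["campaign"]["status"]`. Returning the last status lets the caller tell a successful campaign from a failed or timed-out one.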

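## Getting Sample Recommendations

Once the campaigns are active, you can get recommendations. First, you need to select a random user from the collection. Then, create a few helper functions that display movie information for the recommendations instead of raw IDs.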

In [12]:
# A multi-character separator such as '::' needs the Python parser engine
items_all = pd.read_csv('./ml-1m/movies.dat', sep='::', engine='python', encoding='latin1', names=['ITEM_ID', 'TITLE', 'GENRE'])
items=items_all.copy()

items['to_keep'] = items['ITEM_ID'].apply(lambda x:x in unique_items)
items=items[items['to_keep']]
del items['to_keep']
items.tail()

#len(unique_items)



Unnamed: 0,ITEM_ID,TITLE,GENRE
3878,3948,Meet the Parents (2000),Comedy
3879,3949,Requiem for a Dream (2000),Drama
3880,3950,Tigerland (2000),Drama
3881,3951,Two Family House (2000),Drama
3882,3952,"Contender, The (2000)",Drama|Thriller


In [13]:
def get_movie_title(movie_id):
    """
    Takes in an item ID (int or str), returns the matching title(s) as a list
    """
    movie_id = int(movie_id)
    movie_title=items[items['ITEM_ID']==movie_id]['TITLE']
    return (movie_title.tolist())
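
Filtering the DataFrame on every call works, but it scans all rows each time and returns a list rather than a single string. A possible alternative, sketched here with hypothetical names (`title_by_id`, `lookup_title`) and a tiny stand-in for the `items` table, is to build an ID-to-title dictionary once and look titles up from it.

```python
# Tiny stand-in for the `items` table (rows taken from its tail shown above)
demo_rows = [
    (3948, "Meet the Parents (2000)"),
    (3949, "Requiem for a Dream (2000)"),
    (3952, "Contender, The (2000)"),
]

# Build the ID -> title mapping once.
# In the notebook this could be: dict(zip(items['ITEM_ID'], items['TITLE']))
title_by_id = dict(demo_rows)


def lookup_title(movie_id, default="(unknown title)"):
    """Return a single title string for an ID given as int or str."""
    return title_by_id.get(int(movie_id), default)


print(lookup_title("3949"))
print(lookup_title(1))
```

Returning a default for unknown IDs avoids empty results when a recommended item was filtered out of the table.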


#### Calling GetRecommendations with HRNN

Running the code cell below picks a random user and returns the list of movies recommended for that user.

In [30]:
df=pd.read_csv(interaction_filename)

# Getting a random user:
user_id, item_id, _,_,_ = df.sample().values[0]
print("USER: {}".format(user_id))

get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = hrnn_campaign_arn,
    userId = str(user_id),
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for user {}".format(user_id))


item_list = get_recommendations_response['itemList']
# Added: keep the score returned with each item
#print(item_list)
recommendation_title_list = []
recommendation_id_list=[]
for item in item_list:
    title = get_movie_title(item['itemId'])
    score=item['score']
    recommendation_title_list.append([title,score])
    recommendation_id_list.append(item['itemId'])
recommendations_df = pd.DataFrame(recommendation_title_list ,columns = ['OriginalRecs','score'])
recommendations_df

USER: 929
Recommendations for user 929


Unnamed: 0,OriginalRecs,score
0,[Austin Powers: The Spy Who Shagged Me (1999)],0.109625
1,[Beavis and Butt-head Do America (1996)],0.030269
2,[Ace Ventura: Pet Detective (1994)],0.026847
3,[Liar Liar (1997)],0.022481
4,[Sleepless in Seattle (1993)],0.021483
5,[Honeymoon in Vegas (1992)],0.021161
6,[Pretty Woman (1990)],0.020945
7,[Meatballs (1979)],0.019769
8,[Dumb & Dumber (1994)],0.019733
9,[So I Married an Axe Murderer (1993)],0.019265


#### Calling GetRecommendations with Sims
Running the code cell below returns a list of movies similar to a specific item.

In [18]:
# Getting a random item (the user on the sampled row is not used here):
user_id, item_id, _,_,_ = df.sample().values[0]
print("ITEM ID: {}".format(item_id))


get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = sims_campaign_arn,
    itemId = str(item_id),
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for item_id: ", item_id)

item_list = get_recommendations_response['itemList']
recommendation_title_list = []
recommendation_id_list=[]
for item in item_list:
    title = get_movie_title(item['itemId'])
    recommendation_title_list.append(title)
    recommendation_id_list.append(item['itemId'])
recommendations_df = pd.DataFrame(recommendation_title_list, columns = ['OriginalRecs'])
recommendations_df

ITEM ID: 2028
Recommendations for item_id:  2028


Unnamed: 0,OriginalRecs
0,"Hunt for Red October, The (1990)"
1,"Boat, The (Das Boot) (1981)"
2,Patriot Games (1992)
3,"Shawshank Redemption, The (1994)"
4,Heat (1995)
5,In the Line of Fire (1993)
6,GoodFellas (1990)
7,Enemy of the State (1998)
8,Full Metal Jacket (1987)
9,Good Will Hunting (1997)


In [19]:
# Getting another random item (the user on the sampled row is not used here):
user_id, item_id, _,_,_ = df.sample().values[0]
print("ITEM ID: {}".format(item_id))


get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = sims_campaign_arn,
    itemId = str(item_id),
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for item_id: ", item_id)

item_list = get_recommendations_response['itemList']
recommendation_title_list = []
recommendation_id_list=[]
for item in item_list:
    title = get_movie_title(item['itemId'])
    recommendation_title_list.append(title)
    recommendation_id_list.append(item['itemId'])
recommendations_df = pd.DataFrame(recommendation_title_list, columns = ['OriginalRecs'])
recommendations_df

ITEM ID: 1299
Recommendations for item_id:  1299


Unnamed: 0,OriginalRecs
0,Gandhi (1982)
1,Raging Bull (1980)
2,My Left Foot (1989)
3,"Bridge on the River Kwai, The (1957)"
4,Sophie's Choice (1982)
5,Breaker Morant (1980)
6,Midnight Cowboy (1969)
7,Ordinary People (1980)
8,Five Easy Pieces (1970)
9,Amadeus (1984)
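
The HRNN and Sims cells above repeat the same steps: read `itemList`, look up each title, and collect rows. A small helper can hold that shared logic. The sketch below uses hypothetical names (`recommendations_to_rows`, `demo_titles`) and stand-in data; the IDs, titles, and score are copied from the ranking section of this notebook, and the second entry deliberately omits a score.

```python
def recommendations_to_rows(item_list, title_lookup):
    """Turn an itemList-style response into rows of itemId, title, and score."""
    rows = []
    for item in item_list:
        rows.append({
            "itemId": item["itemId"],
            "title": title_lookup(item["itemId"]),
            # Not every response entry is assumed to carry a score
            "score": item.get("score"),
        })
    return rows


# Stand-in data instead of a live get_recommendations response
demo_titles = {"2641": "Superman II (1980)", "780": "Independence Day (ID4) (1996)"}
demo_item_list = [
    {"itemId": "2641", "score": 0.2206743},
    {"itemId": "780"},  # entry without a score, to show the fallback
]
rows = recommendations_to_rows(demo_item_list, lambda i: demo_titles.get(i, "(unknown)"))
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame(rows)` to get the same kind of table as the cells above.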



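## Personalized Ranking

The core use case of Personalized Ranking is to take a list of items and present it to a user in order of priority or likely interest. To explore this feature, this part tests with a single user and a list of 25 items. The cells below first fetch HRNN recommendations for a random user and reuse those item IDs as the list to re-rank; the commented-out lines show how to sample 25 random items instead.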

In [33]:
df=pd.read_csv(interaction_filename)

# Getting a random user:
user_id, item_id, _,_,_ = df.sample().values[0]
print("USER: {}".format(user_id))

get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = hrnn_campaign_arn,
    userId = str(user_id),
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for user {}".format(user_id))


item_list = get_recommendations_response['itemList']
# Added: keep the score returned with each item
#print(item_list)
recommendation_title_list = []
recommendation_id_list=[]
for item in item_list:
    title = get_movie_title(item['itemId'])
    score=item['score']
    recommendation_title_list.append([title,score])
    recommendation_id_list.append(item['itemId'])
recommendations_df = pd.DataFrame(recommendation_title_list ,columns = ['OriginalRecs','score'])
recommendations_df

USER: 4655
Recommendations for user 4655


Unnamed: 0,OriginalRecs,score
0,[Superman II (1980)],0.14846
1,[Star Trek: Insurrection (1998)],0.118454
2,[Independence Day (ID4) (1996)],0.07734
3,[Powder (1995)],0.074635
4,[Star Trek: The Motion Picture (1979)],0.053107
5,[Moonraker (1979)],0.049543
6,[Star Trek: Generations (1994)],0.037871
7,[Demolition Man (1993)],0.035129
8,[Stargate (1994)],0.034491
9,[Logan's Run (1976)],0.032017


In [37]:
#Get the user list
df=pd.read_csv(interaction_filename)
df_users = df['USER_ID'].unique()
df_users=pd.DataFrame(df_users,columns=['USER_ID'])
df_items=df['ITEM_ID'].unique()
df_items=pd.DataFrame(df_items,columns=['ITEM_ID'])

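# Alternative: pick one random user and 25 random items
#rerank_user = df_users['USER_ID'].sample(1).tolist()[0]
#rerank_items = df_items['ITEM_ID'].sample(25).tolist()
# Here: reuse the user and the item IDs from the HRNN recommendations above
rerank_user=user_id
rerank_items=recommendation_id_list 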
rerank_items

['2641',
 '2393',
 '780',
 '24',
 '1371',
 '3638',
 '329',
 '442',
 '316',
 '2528',
 '674',
 '1690',
 '3701',
 '2533',
 '198',
 '1375',
 '788',
 '1876',
 '1676',
 '1037',
 '2105',
 '3033',
 '2046',
 '2311',
 '748']

In [38]:
rerank_list = []
for item in rerank_items:
    title = get_movie_title(item)
    rerank_list.append(title)
rerank_df = pd.DataFrame(rerank_list, columns = [rerank_user])
rerank_df


Unnamed: 0,4655
0,Superman II (1980)
1,Star Trek: Insurrection (1998)
2,Independence Day (ID4) (1996)
3,Powder (1995)
4,Star Trek: The Motion Picture (1979)
5,Moonraker (1979)
6,Star Trek: Generations (1994)
7,Demolition Man (1993)
8,Stargate (1994)
9,Logan's Run (1976)


In [39]:
# Convert user to string:
user_id = str(rerank_user)

rerank_item_list = []
for item in rerank_items:
    rerank_item_list.append(str(item))
    
# Get recommended reranking
get_recommendations_response_rerank = personalize_runtime.get_personalized_ranking(
        campaignArn = ranking_campaign_arn,
        userId = user_id,
        inputList = rerank_item_list
)

get_recommendations_response_rerank

{'ResponseMetadata': {'RequestId': '325a5776-162b-4e0e-82f7-a514b1423059',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/json',
   'date': 'Tue, 14 Jul 2020 04:52:03 GMT',
   'x-amzn-requestid': '325a5776-162b-4e0e-82f7-a514b1423059',
   'content-length': '1413',
   'connection': 'keep-alive'},
  'RetryAttempts': 0},
 'personalizedRanking': [{'itemId': '2641', 'score': 0.2206743},
  {'itemId': '780', 'score': 0.1017688},
  {'itemId': '329', 'score': 0.0734999},
  {'itemId': '442', 'score': 0.0689559},
  {'itemId': '198', 'score': 0.0556528},
  {'itemId': '2393', 'score': 0.0526214},
  {'itemId': '3033', 'score': 0.0466782},
  {'itemId': '1371', 'score': 0.0437162},
  {'itemId': '1037', 'score': 0.0423804},
  {'itemId': '1375', 'score': 0.0387793},
  {'itemId': '1676', 'score': 0.0320306},
  {'itemId': '316', 'score': 0.0285161},
  {'itemId': '748', 'score': 0.0275406},
  {'itemId': '2311', 'score': 0.026852},
  {'itemId': '3638', 'score': 0.0245874},
  {'itemI

In [40]:
ranked_list = []
item_list = get_recommendations_response_rerank['personalizedRanking']
for item in item_list:
    title = get_movie_title(item['itemId'])
    ranked_list.append(title)
ranked_df = pd.DataFrame(ranked_list, columns = ['Re-Ranked'])
rerank_df = pd.concat([rerank_df, ranked_df], axis=1)
rerank_df

Unnamed: 0,4655,Re-Ranked
0,Superman II (1980),Superman II (1980)
1,Star Trek: Insurrection (1998),Independence Day (ID4) (1996)
2,Independence Day (ID4) (1996),Star Trek: Generations (1994)
3,Powder (1995),Demolition Man (1993)
4,Star Trek: The Motion Picture (1979),Strange Days (1995)
5,Moonraker (1979),Star Trek: Insurrection (1998)
6,Star Trek: Generations (1994),Spaceballs (1987)
7,Demolition Man (1993),Star Trek: The Motion Picture (1979)
8,Stargate (1994),"Lawnmower Man, The (1992)"
9,Logan's Run (1976),Star Trek III: The Search for Spock (1984)
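
To see how far the ranking campaign moved each item, the two orderings can be compared position by position. The following is a minimal sketch with a hypothetical helper `rank_changes`; the original order is the 25-item input list shown above, and the re-ranked order is the first eight entries of the `personalizedRanking` output.

```python
def rank_changes(original_order, reranked_order):
    """Return {item_id: places moved up} for items present in both lists."""
    original_pos = {item: i for i, item in enumerate(original_order)}
    changes = {}
    for new_pos, item in enumerate(reranked_order):
        if item in original_pos:
            changes[item] = original_pos[item] - new_pos  # positive = moved up
    return changes


original_order = ['2641', '2393', '780', '24', '1371', '3638', '329', '442', '316',
                  '2528', '674', '1690', '3701', '2533', '198', '1375', '788',
                  '1876', '1676', '1037', '2105', '3033', '2046', '2311', '748']
reranked_order = ['2641', '780', '329', '442', '198', '2393', '3033', '1371']

changes = rank_changes(original_order, reranked_order)
for item in reranked_order:
    print(item, changes[item])
```

A positive number means the item moved toward the top after re-ranking, a negative number means it moved down, and zero means it kept its place.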


## Amazon Personalize Batch Export Job


To use the Amazon Personalize batch feature, save the user IDs or item IDs for which you want recommendations as a JSON file in S3. The output is also written in JSON format to the S3 bucket path that you specify.

Batch input example for the HRNN solution (one JSON object per line, with no commas between lines):

```JSON
{"userId": "4638"}
{"userId": "663"}
{"userId": "3384"}
```


Batch output example:
```JSON
{"input":{"userId":"4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"]}}
{"input":{"userId":"663"}, "output": {"recommendedItems": ["1393", "3793", "2701", "3826"]}}
{"input":{"userId":"3384"}, "output": {"recommendedItems": ["8368", "5989", "40815", "48780"]}}
```
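
Both files hold one independent JSON object per line, so they can be produced and consumed with the standard `json` module one line at a time. A minimal sketch, using the user IDs and the first output line from the examples above (the variable names are illustrative only):

```python
import json

user_ids = ["4638", "663", "3384"]

# One JSON object per line, no commas between lines and no enclosing list
input_lines = [json.dumps({"userId": uid}) for uid in user_ids]
batch_input_text = "\n".join(input_lines) + "\n"
print(batch_input_text)

# Parsing one line of the output example
output_line = '{"input":{"userId":"4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"]}}'
record = json.loads(output_line)
print(record["input"]["userId"], record["output"]["recommendedItems"])
```

Using `json.dumps` instead of string concatenation keeps the quoting correct even if an ID contains characters that need escaping.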


In [21]:
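# Get the user list
#batch_users = df_users.sample(3).index.tolist()
!mkdir -p dataset
# Note: this takes the DataFrame index (0, 1, 2, ...), not the USER_ID values;
# df_users['USER_ID'].tolist() would send the actual user IDs instead.
batch_users=df_users.index.tolist()
data_dir="dataset/"
# Write one JSON object per line to disk
json_input_filename = "json_input.json"
with open(data_dir+json_input_filename, 'w') as json_input:
    # A separate loop variable keeps the earlier `user_id` intact for %store
    for batch_user_id in batch_users:
        json_input.write(json.dumps({"userId": str(batch_user_id)}) + "\n")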

In [22]:
# Showcase the input file:
!cat $data_dir$json_input_filename

{"userId": "0"}
{"userId": "1"}
{"userId": "2"}
{"userId": "3"}
{"userId": "4"}
{"userId": "5"}
{"userId": "6"}
{"userId": "7"}
{"userId": "8"}
{"userId": "9"}
{"userId": "10"}
{"userId": "11"}
{"userId": "12"}
{"userId": "13"}
{"userId": "14"}
{"userId": "15"}
{"userId": "16"}
{"userId": "17"}
{"userId": "18"}
{"userId": "19"}
{"userId": "20"}
{"userId": "21"}
{"userId": "22"}
{"userId": "23"}
{"userId": "24"}
{"userId": "25"}
{"userId": "26"}
{"userId": "27"}
{"userId": "28"}
{"userId": "29"}
{"userId": "30"}
{"userId": "31"}
{"userId": "32"}
{"userId": "33"}
{"userId": "34"}
{"userId": "35"}
{"userId": "36"}
{"userId": "37"}
{"userId": "38"}
{"userId": "39"}
{"userId": "40"}
{"userId": "41"}
{"userId": "42"}
{"userId": "43"}
{"userId": "44"}
{"userId": "45"}
{"userId": "46"}
{"userId": "47"}
{"userId": "48"}
{"userId": "49"}
{"userId": "50"}
{"userId": "51"}
{"userId": "52"}
{"userId": "53"}
{"userId": "54"}
{"userId": "55"}
{"

In [23]:
# Upload files to S3
boto3.Session().resource('s3').Bucket(bucket).Object(data_dir+json_input_filename).upload_file(data_dir+json_input_filename)
s3_input_path = "s3://" + bucket + "/" + data_dir+json_input_filename
print(s3_input_path)

s3://sagemaker-ap-northeast-2-870180618679/dataset/json_input.json


In [24]:
# Define the output path
s3_output_path = "s3://" + bucket + "/"+data_dir
print(s3_output_path)

s3://sagemaker-ap-northeast-2-870180618679/dataset/


In [25]:
print(role_arn)

arn:aws:iam::870180618679:role/PersonalizeRoleDemo57675


In [26]:
create_batch_inference_job_response = personalize.create_batch_inference_job(
    solutionVersionArn = hrnn_solution_version_arn,
    jobName = "POC-Batch-Inference-Job-HRNN-"+suffix,
    roleArn = role_arn,
    jobInput = {"s3DataSource": {"path": s3_input_path}},
    jobOutput = {"s3DataDestination": {"path": s3_output_path}}
)
batchInferenceJobArn = create_batch_inference_job_response['batchInferenceJobArn']

In [27]:
current_time = datetime.now()
print("Import Started on: ", current_time.strftime("%I:%M:%S %p"))

max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_dataset_inference_job_response = personalize.describe_batch_inference_job(
        batchInferenceJobArn = batchInferenceJobArn
    )
    status = describe_dataset_inference_job_response["batchInferenceJob"]['status']
    print("DatasetInferenceJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)
    
current_time = datetime.now()
print("Import Completed on: ", current_time.strftime("%I:%M:%S %p"))

Import Started on:  06:53:52 AM
DatasetInferenceJob: CREATE PENDING
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInfer

In [28]:
s3 = boto3.client('s3')
export_name = json_input_filename + ".out"
s3.download_file(bucket,data_dir+export_name,data_dir+export_name)

# Update DF rendering
pd.set_option('display.max_rows', 30)

# Start from scratch on every run so that re-running the cell does not re-join old columns
bulk_recommendations_df = None
with open(data_dir + export_name) as json_file:
    # Each line of the output file is one JSON object
    for raw_line in json_file:
        if not raw_line.strip():
            continue
        line = json.loads(raw_line)
        # Extract the user ID
        col_header = "User: " + line['input']['userId']
        # Collect the movie titles recommended for this user
        recommendation_list = []
        for item in line['output']['recommendedItems']:
            title = get_movie_title(item)
            recommendation_list.append(title)
        new_rec_df = pd.DataFrame(recommendation_list, columns = [col_header])
        if bulk_recommendations_df is None:
            bulk_recommendations_df = new_rec_df
        else:
            bulk_recommendations_df = bulk_recommendations_df.join(new_rec_df)
bulk_recommendations_df

Unnamed: 0,User: 1024,User: 1029,User: 1064,User: 1080,User: 110,User: 1102,User: 1134,User: 1173,User: 1207,User: 1232,...,User: 5856,User: 5966,User: 5971,User: 676,User: 697,User: 803,User: 806,User: 810,User: 879,User: 905
0,"Mrs. Brown (Her Majesty, Mrs. Brown) (1997)",Peggy Sue Got Married (1986),"Rock, The (1996)","Hunt for Red October, The (1990)",Any Given Sunday (1999),Dead Poets Society (1989),True Grit (1969),Double Jeopardy (1999),"Fog, The (1980)",Four Weddings and a Funeral (1994),...,Goldfinger (1964),Like Water for Chocolate (Como agua para choco...,Enemy of the State (1998),Varsity Blues (1999),Pleasantville (1998),Mission: Impossible (1996),Mutiny on the Bounty (1935),Vertigo (1958),Manhattan (1979),Elizabeth (1998)
1,Four Weddings and a Funeral (1994),"Man with Two Brains, The (1983)",Erin Brockovich (2000),"Godfather, The (1972)",Rules of Engagement (2000),Seven Years in Tibet (1997),Quest for Fire (1981),"Thomas Crown Affair, The (1999)","American Werewolf in London, An (1981)",Much Ado About Nothing (1993),...,Batman (1989),Apollo 13 (1995),Air Force One (1997),You've Got Mail (1998),Waking Ned Devine (1998),True Lies (1994),Fitzcarraldo (1982),"Manchurian Candidate, The (1962)","Philadelphia Story, The (1940)",Apollo 13 (1995)
2,"English Patient, The (1996)","American Tail, An (1986)","Fugitive, The (1993)",Blade Runner (1982),From Dusk Till Dawn (1996),Leaving Las Vegas (1995),Runaway Train (1985),U.S. Marshalls (1998),"Toxic Avenger, The (1985)",Pleasantville (1998),...,"Matrix, The (1999)",Leaving Las Vegas (1995),Indiana Jones and the Temple of Doom (1984),"Naked Gun 2 1/2: The Smell of Fear, The (1991)",Aladdin (1992),GoldenEye (1995),Dances with Wolves (1990),Psycho (1960),Annie Hall (1977),Leaving Las Vegas (1995)
3,Like Water for Chocolate (Como agua para choco...,"Money Pit, The (1986)",Jurassic Park (1993),Die Hard (1988),She's the One (1996),Titanic (1997),Swiss Family Robinson (1960),"Perfect World, A (1993)",Ghostbusters II (1989),Bullets Over Broadway (1994),...,Jurassic Park (1993),Elizabeth (1998),Rocky (1976),White Men Can't Jump (1992),Four Weddings and a Funeral (1994),Escape from New York (1981),Sanjuro (1962),Star Wars: Episode IV - A New Hope (1977),American Graffiti (1973),Like Water for Chocolate (Como agua para choco...
4,Chasing Amy (1997),Back to School (1986),Gladiator (2000),2001: A Space Odyssey (1968),American Psycho (2000),Dead Man Walking (1995),Barbarella (1968),"River Wild, The (1994)",Firestarter (1984),There's Something About Mary (1998),...,Rocky (1976),Thelma & Louise (1991),Batman (1989),"Truth About Cats & Dogs, The (1996)",Dave (1993),Jurassic Park (1993),"Mark of Zorro, The (1940)","Sting, The (1973)",Network (1976),Trainspotting (1996)
5,Shine (1996),Bachelor Party (1984),"Hunt for Red October, The (1990)",Jaws (1975),Boiler Room (2000),"Joy Luck Club, The (1993)","Mask of Zorro, The (1998)","Siege, The (1998)","Texas Chainsaw Massacre 2, The (1986)",Ferris Bueller's Day Off (1986),...,Top Gun (1986),Dead Man Walking (1995),Speed (1994),"Birdcage, The (1996)",Babe: Pig in the City (1998),"Perfect Storm, The (2000)",Star Wars: Episode VI - Return of the Jedi (1983),"Grifters, The (1990)",Crimes and Misdemeanors (1989),Dead Man Walking (1995)
6,Leaving Las Vegas (1995),Night Shift (1982),"Matrix, The (1999)","Matrix, The (1999)",Where the Heart Is (2000),"English Patient, The (1996)",Conan the Barbarian (1982),Falling Down (1993),Toxic Avenger Part III: The Last Temptation of...,Beetlejuice (1988),...,Robocop (1987),Fight Club (1999),"Negotiator, The (1998)",Big Daddy (1999),My Cousin Vinny (1992),Robocop (1987),"Adventures of Robin Hood, The (1938)",2001: A Space Odyssey (1968),This Is Spinal Tap (1984),Thelma & Louise (1991)
7,Splash (1984),Tough Guys (1986),Outbreak (1995),Braveheart (1995),Chicken Run (2000),American Beauty (1999),Song of the South (1946),Eraser (1996),Friday the 13th (1980),Say Anything... (1989),...,"Blues Brothers, The (1980)",Babe (1995),Tomorrow Never Dies (1997),Everything You Always Wanted to Know About Sex...,Bullets Over Broadway (1994),Star Trek: First Contact (1996),Adventures of Buckaroo Bonzai Across the 8th D...,"Sixth Sense, The (1999)",Little Big Man (1970),Hamlet (1996)
8,"Purple Rose of Cairo, The (1985)",Running Scared (1986),Top Gun (1986),Jurassic Park (1993),"Perfect Storm, The (2000)",Elizabeth (1998),GoldenEye (1995),"Mummy, The (1999)",Cat's Eye (1985),Clueless (1995),...,Star Trek IV: The Voyage Home (1986),October Sky (1999),"Thomas Crown Affair, The (1999)",Home Alone (1990),"Hurricane, The (1999)",Star Trek IV: The Voyage Home (1986),"Journey of Natty Gann, The (1985)","Big Sleep, The (1946)","Great Dictator, The (1940)","Ice Storm, The (1997)"
9,Jerry Maguire (1996),Spaceballs (1987),Mission: Impossible (1996),"Sixth Sense, The (1999)",Scream 3 (2000),Star Wars: Episode IV - A New Hope (1977),Con Air (1997),Natural Born Killers (1994),"Nightmare on Elm Street, A (1984)",L.A. Story (1991),...,Thelma & Louise (1991),"Fugitive, The (1993)",Top Gun (1986),Up in Smoke (1978),There's Something About Mary (1998),"Rock, The (1996)","Treasure of the Sierra Madre, The (1948)",Network (1976),Diner (1982),Fight Club (1999)
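
Before building a wide table, it can be convenient to load the batch output into a plain dictionary keyed by user ID. The sketch below is illustrative only: `read_batch_output` is a hypothetical helper, the first sample line comes from the output example earlier in this section, and the line with a `null` output is an invented case that shows how records without recommendations are skipped.

```python
import io
import json


def read_batch_output(lines):
    """Map each userId to its list of recommended item IDs; skip unusable lines."""
    results = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        record = json.loads(raw)
        output = record.get("output") or {}
        items = output.get("recommendedItems")
        if items is None:
            continue  # assumption: a record without recommendations is skipped
        results[record["input"]["userId"]] = items
    return results


sample = io.StringIO(
    '{"input":{"userId":"4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"]}}\n'
    '\n'
    '{"input":{"userId":"663"}, "output": null}\n'
)
parsed = read_batch_output(sample)
print(parsed)
```

In the notebook, the same function could be given the opened `.out` file, since iterating a file object also yields one line at a time.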


## Review

You created campaigns and obtained actual movie recommendations for specific users.
You are now ready to move on to the next notebook. (`4.View_Campaign_And_Interactions.ipynb`)


## Notes for the Next Notebook

The next lab needs a few values from this one. Run the cell below to store them so that they can be used as-is in the next Jupyter notebook.

In [None]:
%store hrnn_campaign_arn
%store hrnn_coldstart_campaign_arn
%store sims_campaign_arn
%store ranking_campaign_arn
%store recommendations_df
%store user_id