{ "cells": [ { "cell_type": "markdown", "id": "6f667b85", "metadata": {}, "source": [ "# Studio Notebooks と Training Jobs のコンテナを統一する\n", "機械学習のコード開発において、開発環境と実行環境に差異があると、動かない可能性がある。 \n", "このノートブックでは開発環境として Studio Notebooks, 実行環境として Traning Jobs を使用することとし、同一環境(コンテナイメージ)で実行するための手順を紹介。\n", "\n", "## 前提条件\n", "* SageMaker Studio のドメインが出来上がっている\n", " * 使用する SageMaker Studio の DomainId をメモしてください。後ほど使用します。\n", "* このノートブックを SageMaker Notebooks (≠Studio)で実行する\n", " * docker コマンドを使用できて、SageMakerFullAccess 相当の権限があれば他の環境でも実行できますが、その場合は role などを適切に設定してください\n", " * docker コマンドが正常に機能しないため、SageMaker Studio Notebooks では動かせません。このノートブックはあくまで環境を作成(コンテナイメージ作成〜SageMaker STudio への登録)するものであり、Studio で動かすノートブックは、このノートブックを動かしたあとに [2_studio.ipynb](./2_studio.ipynb) を実行します。" ] }, { "cell_type": "markdown", "id": "7344de10", "metadata": {}, "source": [ "## 使用するモジュールや定数の設定\n", "[constant_config.yml](./constant_config.yml) に定数が設定されているので必要に応じて変更する。" ] }, { "cell_type": "code", "execution_count": null, "id": "6b83d358", "metadata": {}, "outputs": [], "source": [ "import yaml\n", "import boto3\n", "import sagemaker\n", "from time import sleep\n", "region = boto3.session.Session().region_name\n", "account = boto3.client('sts').get_caller_identity().get('Account')\n", "ecr = boto3.client('ecr')\n", "sm_client = boto3.client('sagemaker')\n", "\n", "\n", "# 定数ファイル読み込み\n", "with open('constant_config.yml','rt') as f:\n", " constant_config = yaml.safe_load(f)\n", "\n", "REPOSITORY_NAME = constant_config['REPOSITORY_NAME']\n", "IMAGE_NAME = constant_config['IMAGE_NAME']\n", "KERNEL_NAME = constant_config['KERNEL_NAME']\n", "DISPLAY_NAME = constant_config['DISPLAY_NAME']\n", "role = sagemaker.get_execution_role()\n", "image_uri = f'{account}.dkr.ecr.{region}.amazonaws.com/{REPOSITORY_NAME}:{IMAGE_NAME}'\n", "print(f'region: {region}')\n", "print(f'account : {account}')\n", "print(f'role : {role}')\n", "print(f'REPOSITORY_NAME : {REPOSITORY_NAME}')\n", "print(f'IMAGE_NAME : {IMAGE_NAME}')\n", "print(f'KERNEL_NAME : {KERNEL_NAME}')\n", "print(f'DISPLAY_NAME : {DISPLAY_NAME}')\n", "print(f'image_uri : {image_uri}')" ] }, { "cell_type": "markdown", "id": "bcfc61d5", "metadata": {}, "source": [ "## コンテナイメージの準備\n", "### コンテナイメージの作成と登録\n", "Studio Notebooks も Tranining Jobs もコンテナで動くことから、まずはコンテナイメージを build し、ECR に push する。\n", "#### ECR のリポジトリ作成" ] }, { "cell_type": "code", "execution_count": null, "id": "3c24b928", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!aws ecr create-repository --repository-name {REPOSITORY_NAME}" ] }, { "cell_type": "markdown", "id": "1857ca17", "metadata": {}, "source": [ "#### コンテナイメージにインストールするモジュールの列挙\n", "使用するものを requirements.txt に書き込む。 \n", "機械学習で使用するモジュール(今回は `numpy` ) とは別に、Studio Notebooks で使用するために、`ipykernel` を、Training Jobs で使用するために `sagemaker-training` をインストールする。" ] }, { "cell_type": "code", "execution_count": null, "id": "7b6ae007", "metadata": {}, "outputs": [], "source": [ "%%writefile requirements.txt\n", "ipykernel==6.20.1\n", "numpy==1.22.3\n", "sagemaker==2.144.0\n", "sagemaker-training==4.4.8" ] }, { "cell_type": "markdown", "id": "2f400f41", "metadata": {}, "source": [ "#### Dockerfile の作成\n", "ポイントは、Studio(Jupyter) で使うために、`ipykernel install` しているところ。 \n", "また、使用者がどんな風にコンテナを作ったかがわかるように(追加モジュールのリクエストやバージョンの確認のため)、 `requirements.txt` と `Dockerfile` も同包している。" ] }, { "cell_type": "code", "execution_count": null, "id": "321ecd49", "metadata": {}, "outputs": [], "source": [ "%%writefile Dockerfile\n", "FROM python:3.10.10-buster\n", "\n", "COPY requirements.txt /root/requirements.txt\n", "COPY Dockerfile /root/Dockerfile\n", "\n", "RUN pip install -r /root/requirements.txt && \\\n", " python -m ipykernel install --sys-prefix" ] }, { "cell_type": "markdown", "id": "28acd332", "metadata": {}, "source": [ "#### コンテナイメージのビルド" ] }, { "cell_type": "code", "execution_count": null, "id": "66c06e4a", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!docker build . -t {IMAGE_NAME} -t {image_uri}" ] }, { "cell_type": "markdown", "id": "cfeea212", "metadata": {}, "source": [ "#### コンテナイメージのプッシュ" ] }, { "cell_type": "code", "execution_count": null, "id": "ead26831", "metadata": { "scrolled": false }, "outputs": [], "source": [ "!$(aws ecr get-login --region {region} --registry-ids {account} --no-include-email)\n", "!docker push {image_uri}" ] }, { "cell_type": "markdown", "id": "031653c2", "metadata": {}, "source": [ "### コンテナイメージを SageMaker に登録する\n", "手順としては、 \n", "1. SageMaker 側のコンテナイメージの箱であるイメージを作成(`create_image`)する\n", "2. そこに ECR のイメージを登録(`create_image_version`)する。\n", "3. イメージの運用設定(uid, gid, mountpoint, Studio での表示名など)を登録(`create_app_image_config`)する\n", "4. SageMaker Studio のドメインに紐づけ(update_domain)て使用可能にする\n", "\n", "#### create_image" ] }, { "cell_type": "code", "execution_count": null, "id": "118a2db4", "metadata": { "scrolled": true }, "outputs": [], "source": [ "sm_client.create_image(\n", " ImageName=IMAGE_NAME,\n", " RoleArn=role,\n", ")" ] }, { "cell_type": "markdown", "id": "b28fee3f", "metadata": {}, "source": [ "#### create_image_version" ] }, { "cell_type": "code", "execution_count": null, "id": "9172842c", "metadata": {}, "outputs": [], "source": [ "sleep(1) # create_image は非同期 API のため、wait を入れる\n", "sm_client.create_image_version(\n", " BaseImage=image_uri,\n", " ImageName=IMAGE_NAME,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "13a7d370", "metadata": {}, "outputs": [], "source": [ "# バージョンが登録されたかチェック(ImageVersionStatus)\n", "sm_client.describe_image_version(\n", " ImageName=IMAGE_NAME\n", ")" ] }, { "cell_type": "markdown", "id": "0b4114fe", "metadata": {}, "source": [ "#### create_app_image_config\n", "事前にコンテナのデフォルトの uid, gid を確認する" ] }, { "cell_type": "code", "execution_count": null, "id": "76514e8b", "metadata": {}, "outputs": [], "source": [ "output = !docker run {image_uri} id -u\n", "default_uid = int(output[0])\n", "output = !docker run {image_uri} id -g\n", "default_gid = int(output[0])\n", "print(default_uid, default_gid)" ] }, { "cell_type": "code", "execution_count": null, "id": "f909cd0c", "metadata": {}, "outputs": [], "source": [ "sm_client.create_app_image_config(\n", " AppImageConfigName=IMAGE_NAME,\n", " KernelGatewayImageConfig={\n", " 'KernelSpecs': [\n", " {\n", " 'Name': KERNEL_NAME,\n", " 'DisplayName': DISPLAY_NAME\n", " },\n", " ],\n", " 'FileSystemConfig': {\n", " 'MountPath': '/root/data',\n", " 'DefaultUid': default_uid,\n", " 'DefaultGid': default_gid\n", " }\n", " }\n", ")" ] }, { "cell_type": "markdown", "id": "342540ca", "metadata": {}, "source": [ "#### update_domain\n", "ここでは list_domains()で取得する先頭の domain に紐付ける。\n", "本来は以下の domain_id に紐付けたい domain を設定する。" ] }, { "cell_type": "code", "execution_count": null, "id": "25307bf8", "metadata": {}, "outputs": [], "source": [ "# 例えば\n", "domain_id = sm_client.list_domains()['Domains'][0]['DomainId']\n", "# 本来は\n", "# domain_id = 'type your domain id'" ] }, { "cell_type": "code", "execution_count": null, "id": "4f85dd4c", "metadata": {}, "outputs": [], "source": [ "sm_client.update_domain(\n", " DomainId = domain_id,\n", " DefaultUserSettings = {\n", " 'KernelGatewayAppSettings' : {\n", " 'CustomImages' : [\n", " {\n", " 'ImageName' : IMAGE_NAME,\n", " 'AppImageConfigName' : IMAGE_NAME\n", " }\n", " ]\n", " }\n", " }\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "4a01c27a", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# check\n", "sm_client.describe_domain(\n", " DomainId = domain_id\n", ")" ] }, { "cell_type": "markdown", "id": "c5417420", "metadata": {}, "source": [ "以降は SageMaker Studio で、改めて \n", "`git clone https://github.com/kazuhitogo/sagemaker-studio-byoc` \n", "したあと、[2_studio.ipynb](2_studio.ipynb) を実行してください。 \n", "また、SageMaker Studio を立ち上げる際は、UserProfile から、Default カーネルを事前に削除してから Studio を開いてください。\n", "![delete_default](./img/delete_default.png)\n", "2_studio.ipynb が完了したら以降のお片付けを実行してください。" ] }, { "cell_type": "markdown", "id": "cdd9727f", "metadata": {}, "source": [ "## お片付け" ] }, { "cell_type": "code", "execution_count": null, "id": "39469687", "metadata": {}, "outputs": [], "source": [ "sm_client.update_domain(\n", " DomainId = domain_id,\n", " DefaultUserSettings = {\n", " 'KernelGatewayAppSettings' : {\n", " 'CustomImages' : []\n", " }\n", " }\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "2330a553", "metadata": {}, "outputs": [], "source": [ "sm_client.delete_app_image_config(\n", " AppImageConfigName = IMAGE_NAME\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "02acfe08", "metadata": {}, "outputs": [], "source": [ "sm_client.delete_image(\n", " ImageName=IMAGE_NAME,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "8d858944", "metadata": {}, "outputs": [], "source": [ "!aws ecr delete-repository --repository-name {REPOSITORY_NAME} --force" ] }, { "cell_type": "code", "execution_count": null, "id": "f43f6e43", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 5 }