{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Train a model & track your experiments\n", "\n", "In this notebook you will prepare features needed for training from a raw dataset and train an XGBoost model. The metrics and parameters associated with each training run will be tracked in a SageMaker Experiment." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "!pip install \"sagemaker>=2.123.0\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "import numpy as np \n", "import pandas as pd \n", "import os \n", "from sagemaker import get_execution_role\n", "from datetime import datetime\n", "\n", "# Get default bucket\n", "bucket = sagemaker.Session().default_bucket()\n", "prefix = 'sagemaker/mlops-workshop'\n", "\n", "# Get SageMaker Execution Role\n", "role = get_execution_role()\n", "region = boto3.Session().region_name\n", "\n", "# SageMaker Session\n", "sagemaker_session = sagemaker.session.Session()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieve variables from previous module" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%store -r" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(train_uri)\n", "print(test_uri)\n", "print(val_uri)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training\n", "\n", "To train a model in SageMaker, you create a training job. The training job includes the following information:\n", "\n", "* The Amazon Elastic Container Registry path where the training code is stored.\n", "* The URL of the Amazon Simple Storage Service (Amazon S3) bucket where you've stored the training data.\n", "* The compute resources that you want SageMaker to use for model training. 
Compute resources are ML compute instances that are managed by SageMaker.\n", "* The URL of the S3 bucket where you want to store the output of the job.\n", "\n", "`xgboost` is an extremely popular, open-source package for gradient boosted trees. It is computationally powerful, fully featured, and has been successfully used in many machine learning competitions. Let's start with a simple `xgboost` model, trained using Amazon SageMaker's XGBoost image." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "image_uri = sagemaker.image_uris.retrieve(region=region, framework='xgboost', version='latest')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Track the training job in an Experiment\n", "\n", "To keep track of the parameters and metrics that correspond to the training job, we train the model as part of a Run that is associated with an Experiment.\n", "\n", "Experiments are organized as follows:\n", "```\n", "Experiment\n", " Run 1\n", " Run 2\n", " ...\n", "``` \n", "In this notebook, each time we run the training job, it corresponds to a new Run in the Experiment. We can compare metrics across multiple runs to find the best set of parameters to train a model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An estimator is a high-level interface for SageMaker training. We will create an estimator object by supplying the required parameters, such as the IAM role, the compute instance count and type, and the S3 output path. \n", "\n", "We also supply hyperparameters for the algorithm and then call its `fit()` method to start training the model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Track your training job in a Run\n", "\n", "When using 1P algorithms, the metrics are tracked as part of the jobs listed in the Run. This can be found here. 
\n", "\n", "The run with the name `xgboost-` will have the job metrics.\n", "\n", "![](imgs/show-jobs.png)\n", "\n", "![](imgs/view-job.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.estimator import Estimator\n", "from sagemaker.inputs import TrainingInput\n", "from sagemaker.experiments import Run\n", "\n", "# S3 location for the model artifacts, and the Experiment that groups the Runs\n", "model_path = f\"s3://{bucket}/{prefix}/xgb_model\"\n", "experiment_name = 'mlops-workshop-experiment'\n", "\n", "# Configure the training job: container image, compute resources, and output path\n", "xgb_train = Estimator(\n", "    image_uri=image_uri,\n", "    instance_type=\"ml.m5.xlarge\",\n", "    instance_count=1,\n", "    output_path=model_path,\n", "    role=role,\n", "    sagemaker_session=sagemaker_session,\n", "    keep_alive_period_in_seconds=3600\n", ")\n", "\n", "# Each execution of this block creates a new Run in the Experiment\n", "with Run(experiment_name=experiment_name, sagemaker_session=sagemaker_session) as run:\n", "    xgb_train.set_hyperparameters(\n", "        objective=\"binary:logistic\",\n", "        num_round=50,\n", "        max_depth=5,\n", "        eta=0.2,\n", "        gamma=4,\n", "        min_child_weight=6,\n", "        subsample=0.7,\n", "        silent=0\n", "    )\n", "\n", "    # Start the training job with the train and validation channels\n", "    xgb_train.fit(\n", "        inputs={\n", "            \"train\": TrainingInput(\n", "                s3_data=train_uri,\n", "                content_type=\"text/csv\"\n", "            ),\n", "            \"validation\": TrainingInput(\n", "                s3_data=val_uri,\n", "                content_type=\"text/csv\"\n", "            )\n", "        }\n", "    )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "trained_model_uri = xgb_train.model_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "training_image = xgb_train.image_uri" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%store trained_model_uri\n", "%store training_image" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### You can now move to the next section of the module `Track all models in a model registry`\n", "\n", "The notebook used in that section is 
`sagemaker-register.ipynb`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-2:429704687514:image/datascience-1.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }