Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: CC-BY-SA-4.0

Step 3: Create a Jupyter Notebook

Create a Jupyter notebook in the notebook instance that you created in Step 2: Create an Amazon SageMaker Notebook Instance. Then create a cell that gets the IAM role that your notebook needs to run Amazon SageMaker APIs and specifies the name of the Amazon S3 bucket where you store your training datasets and the model artifacts that an Amazon SageMaker training job outputs.

To create a Jupyter notebook

  1. Open the notebook instance.

    1. Sign in to the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/.

    2. Open the notebook instance by choosing either Open Jupyter for the classic Jupyter view or Open JupyterLab for the JupyterLab view next to the name of the notebook instance. The Jupyter notebook server page appears.

  2. Create a notebook.

    1. If you opened the notebook in Jupyter classic view, on the Files tab, choose New, and then choose conda_python3. This preinstalled environment includes the default Anaconda installation and Python 3.

    2. If you opened the notebook in JupyterLab view, on the File menu, choose New, and then choose Notebook. For Select Kernel, choose conda_python3. This preinstalled environment includes the default Anaconda installation and Python 3.

  3. In the Jupyter notebook, choose File, choose Save as, and then name the notebook.

  4. Copy the following Python code and paste it into the first cell in your notebook. Add the name of the S3 bucket that you created in Set Up Amazon SageMaker, and run the code. The get_execution_role function retrieves the IAM role you created when you created your notebook instance.

    import os
    import boto3
    import re
    import copy
    import time
    from time import gmtime, strftime
    from sagemaker import get_execution_role
    
    role = get_execution_role()
    
    region = boto3.Session().region_name
    
    bucket = 'bucket-name' # Replace with the name of your S3 bucket
    prefix = 'sagemaker/xgboost-mnist' # Used as part of the path in the bucket where you store data
    bucket_path = 'https://s3-{}.amazonaws.com/{}'.format(region,bucket) # The URL to access the bucket
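
The bucket and prefix values set in this cell are later combined into S3 locations for the training data and the model artifacts. The following is a minimal sketch of that composition. It is not part of the original notebook: the s3_uri helper and the train and output subfolder names are hypothetical and used only for illustration.

```python
def s3_uri(bucket, prefix, *parts):
    """Join a bucket, a key prefix, and optional path parts into an s3:// URI.

    Hypothetical helper for illustration; it is not a SageMaker or boto3 API.
    """
    # Drop leading and trailing slashes so the joined key has no empty segments.
    segments = [prefix.strip('/')] + [p.strip('/') for p in parts]
    key = '/'.join(s for s in segments if s)
    return 's3://{}/{}'.format(bucket, key)


bucket = 'bucket-name'  # Placeholder; replace with the name of your S3 bucket
prefix = 'sagemaker/xgboost-mnist'

# 'train' and 'output' are assumed subfolder names, not values from this guide.
train_uri = s3_uri(bucket, prefix, 'train')
output_uri = s3_uri(bucket, prefix, 'output')

print(train_uri)   # s3://bucket-name/sagemaker/xgboost-mnist/train
print(output_uri)  # s3://bucket-name/sagemaker/xgboost-mnist/output
```

Keeping the prefix in one variable means that every object the example writes lands under the same path in the bucket, which makes it easier to find or delete later.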

Next Step
Step 4: Download, Explore, and Transform the Training Data