# AutoGluon Environment Validation

[![Open in SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/aws/studio-lab-examples/blob/main/custom-environments/AutoGluon/env_validation.ipynb)

AutoML aims to automate the process of applying machine learning to real-world problems, including building machine learning models directly from data. [AutoGluon](https://github.com/awslabs/autogluon) provides easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning text, image, and tabular data.

In this notebook, we will set up AutoGluon in [Amazon SageMaker Studio Lab](https://studiolab.sagemaker.aws/), a free machine learning development environment, and train machine learning models using AutoGluon on a sample dataset.

Run the following cell to generate the YAML file that defines the conda environment.

In [None]:
%%writefile autogluon_cpu.yml
# run in terminal: conda env create --file autogluon_cpu.yml
# note: This command creates an environment that is larger than 4GB.
name: autogluon_cpu
dependencies:
 - python=3.9
 - conda
 - pip
 - ipykernel
 - nodejs
 - pip:
     - ipywidgets
     - setuptools
     - wheel
     - autogluon
     - jupyter_bokeh

After `autogluon_cpu.yml` is generated, run `conda env create -f autogluon_cpu.yml` in a terminal to create a conda environment with AutoGluon installed.
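To confirm that the environment was created, one option is to ask conda for its list of environments and look for `autogluon_cpu`. The following is a minimal sketch that assumes `conda` is on the `PATH` and that `conda env list --json` returns an object with an `envs` list of environment paths. The helper names `env_names_from_json` and `conda_env_exists` are illustrative and not part of any library.

```python
import json
import shutil
import subprocess
from pathlib import Path


def env_names_from_json(text):
    """Extract environment directory names from `conda env list --json` output."""
    env_paths = json.loads(text).get("envs", [])
    # Each entry is a filesystem path; the last component is the environment name
    # (the base environment shows up under its install directory name).
    return [Path(p).name for p in env_paths]


def conda_env_exists(name):
    """Return True or False, or None when conda is unavailable or the call fails."""
    conda = shutil.which("conda")
    if conda is None:
        return None
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return name in env_names_from_json(result.stdout)


print("autogluon_cpu present:", conda_env_exists("autogluon_cpu"))
```

A result of `None` means the check itself could not run, which is different from the environment being absent.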

If the installation is successful, you should be able to select `autogluon_cpu:Python` in the popup that appears when you click the kernel name in the upper right corner of the screen. Switch to the `autogluon_cpu:Python` kernel. You should now be able to use AutoGluon.
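Once the new kernel is selected, a quick sanity check is to query the installed versions of the packages listed in the YAML file. This is a minimal sketch using only the standard library. The helper names `installed_version` and `report` are hypothetical, and the distribution names are taken as written in `autogluon_cpu.yml`.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


print("Python:", sys.version.split()[0])
for name, version in report(["autogluon", "ipywidgets", "jupyter_bokeh"]).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If `autogluon` is reported as not installed, the notebook is most likely still running on the default kernel rather than `autogluon_cpu:Python`.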

## Import Packages

In [None]:
from autogluon.tabular import TabularDataset, TabularPredictor
from IPython.display import HTML

## Train on Sample Dataset

In [None]:
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
predictor = TabularPredictor(label='class', path='ag-default').fit(train_data, time_limit=120) # Fit models for 120s

## Check Models Summary

In [None]:
leaderboard = predictor.leaderboard(test_data)

In [None]:
results = predictor.fit_summary()

In [None]:
HTML(filename='ag-default/SummaryOfModels.html')
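Whether `SummaryOfModels.html` is written may depend on the installed AutoGluon version and on the plotting dependencies being available, so the cell above can fail if the file is missing. A guarded variant could look like the following sketch. The helper name `find_summary_html` is hypothetical, and the default directory `ag-default` is assumed to match the `path` passed to `TabularPredictor`.

```python
from pathlib import Path


def find_summary_html(predictor_dir="ag-default", filename="SummaryOfModels.html"):
    """Return the path to the summary file if it exists, otherwise None."""
    candidate = Path(predictor_dir) / filename
    return candidate if candidate.is_file() else None


summary_path = find_summary_html()
if summary_path is None:
    print("No summary HTML found; see the printed fit_summary() output instead.")
else:
    # Import lazily so the check itself works outside a notebook.
    from IPython.display import HTML, display

    display(HTML(filename=str(summary_path)))
```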