{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Machine Learning Accelerator - Computer Vision - Lecture 2\n",
"\n",
"\n",
"## Customized Image Classification with AutoGluon\n",
"\n",
"In this tutorial, we load images and their labels into [AutoGluon](https://autogluon.mxnet.io/index.html) and use this data to train a neural network that can classify new images. In traditional machine learning, we define the neural network manually and then choose the training hyperparameters ourselves. With AutoGluon, a single call to the `fit` function can train models with different hyperparameter configurations and return the one that achieves the highest validation accuracy.\n",
"\n",
"Note: Please use a **GPU** for training. Training on a CPU can take impractically long."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"! pip install -q -r ../../requirements.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's import `ImagePredictor` and `ImageDataset`."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from autogluon.vision import ImagePredictor, ImageDataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To train a computer vision model with AutoGluon, we organize the data in the following folder structure:\n",
"\n",
"    data/\n",
"    ├── train/\n",
"    │   ├── class1/\n",
"    │   ├── class2/\n",
"    │   ├── class3/\n",
"    │   └── ...\n",
"    └── test/\n",
"        ├── class1/\n",
"        ├── class2/\n",
"        ├── class3/\n",
"        └── ...\n",
"\n",
"Each subfolder contains all images of one category, e.g., `class1` contains all images belonging to the first class. We generally recommend at least 100 training images per class for reasonable classification performance, but the number needed depends on the type of images in your specific use case."
]
},
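{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before loading a dataset of your own, it can help to confirm that every class folder holds enough images. The following is a minimal sketch using only the Python standard library. `count_images_per_class` and `IMAGE_SUFFIXES` are hypothetical names introduced for illustration and are not part of AutoGluon, and the set of suffixes is an assumption about which file types count as images.\n",
"\n",
"```python\n",
"from pathlib import Path\n",
"\n",
"# Assumed set of image file extensions; adjust it to your data.\n",
"IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}\n",
"\n",
"\n",
"def count_images_per_class(split_dir):\n",
"    # Map each class subfolder name to the number of image files it contains.\n",
"    counts = {}\n",
"    for class_dir in sorted(Path(split_dir).iterdir()):\n",
"        if not class_dir.is_dir():\n",
"            continue  # skip stray files next to the class folders\n",
"        counts[class_dir.name] = sum(\n",
"            1\n",
"            for f in class_dir.iterdir()\n",
"            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES\n",
"        )\n",
"    return counts\n",
"```\n",
"\n",
"For example, `count_images_per_class('data/train')` returns a dictionary such as `{'class1': 120, ...}` (illustrative numbers), which makes classes with fewer than about 100 images easy to spot."
]
},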
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Download the dataset\n",
"\n",
"For demonstration purposes, we use a subset of the [Shopee-IET](https://www.kaggle.com/c/shopee-iet-machine-learning-competition/data) dataset from Kaggle. Each image in this dataset depicts a clothing item, and the corresponding label specifies its clothing category. Our subset contains four possible labels: `BabyPants`, `BabyShirt`, `womencasualshoes`, and `womenchiffontop`."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading /home/ec2-user/.gluoncv/archive/shopee-iet.zip from https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 40895/40895 [00:01<00:00, 31765.29KB/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"data/\n",
"├── test/\n",
"└── train/\n"
]
}
],
"source": [
"train_dataset, _, test_dataset = ImageDataset.from_folders('https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use your own dataset, point `ImageDataset.from_folder` at your training or test folder. Example: `train_dataset = ImageDataset.from_folder('mydataset/train')`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's print the training dataset. Each row holds the path to an image and its integer class label."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" image label\n",
"0 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 0\n",
"1 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 0\n",
"2 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 0\n",
"3 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 0\n",
"4 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 0\n",
".. ... ...\n",
"795 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 3\n",
"796 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 3\n",
"797 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 3\n",
"798 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 3\n",
"799 /home/ec2-user/.gluoncv/datasets/shopee-iet/da... 3\n",
"\n",
"[800 rows x 2 columns]\n"
]
}
],
"source": [
"print(train_dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Use AutoGluon to Fit Models\n",
"\n",
"Now, let's fit a __classifier__ using AutoGluon [predictor.fit()](https://auto.gluon.ai/stable/tutorials/image_prediction/beginner.html). Within fit, the dataset is __automatically__ split into training and validation sets. The model with the best hyperparameter configuration is selected based on its performance on the __validation set__."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"AutoGluon ImagePredictor will be deprecated in v0.7. Please use AutoGluon MultiModalPredictor instead for more functionalities and better support. Visit https://auto.gluon.ai/stable/tutorials/multimodal/index.html for more details! \n",
"ImagePredictor sets accuracy as default eval_metric for classification problems.\n",
"Reset labels to [0, 1, 2, 3]\n",
"Randomly split train_data into train[720]/validation[80] splits.\n",
"The number of requested GPUs is greater than the number of available GPUs.Reduce the number to 1\n",
"Starting fit without HPO\n",
"INFO:TorchImageClassificationEstimator:modified configs( != ): {\n",
"INFO:TorchImageClassificationEstimator:root.img_cls.model resnet101 != resnet50\n",
"INFO:TorchImageClassificationEstimator:root.misc.seed 42 != 471\n",
"INFO:TorchImageClassificationEstimator:root.train.early_stop_max_value 1.0 != inf\n",
"INFO:TorchImageClassificationEstimator:root.train.early_stop_patience -1 != 10\n",
"INFO:TorchImageClassificationEstimator:root.train.batch_size 32 != 16\n",
"INFO:TorchImageClassificationEstimator:root.train.epochs 200 != 15\n",
"INFO:TorchImageClassificationEstimator:root.train.early_stop_baseline 0.0 != -inf\n",
"INFO:TorchImageClassificationEstimator:}\n",
"INFO:TorchImageClassificationEstimator:Saved config to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/config.yaml\n",
"Downloading: \"https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-rsb-weights/resnet50_a1_0-14fe96d1.pth\" to /home/ec2-user/.cache/torch/hub/checkpoints/resnet50_a1_0-14fe96d1.pth\n",
"INFO:TorchImageClassificationEstimator:Model resnet50 created, param count: 23516228\n",
"INFO:TorchImageClassificationEstimator:AMP not enabled. Training in float32.\n",
"INFO:TorchImageClassificationEstimator:Disable EMA as it is not supported for now.\n",
"INFO:TorchImageClassificationEstimator:Start training from [Epoch 0]\n",
"INFO:TorchImageClassificationEstimator:[Epoch 0] training: accuracy=0.286111\n",
"INFO:TorchImageClassificationEstimator:[Epoch 0] speed: 74 samples/sec\ttime cost: 9.431086\n",
"INFO:TorchImageClassificationEstimator:[Epoch 0] validation: top1=0.350000 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 0] Current best top-1: 0.350000 vs previous -inf, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 1] training: accuracy=0.409722\n",
"INFO:TorchImageClassificationEstimator:[Epoch 1] speed: 97 samples/sec\ttime cost: 7.251337\n",
"INFO:TorchImageClassificationEstimator:[Epoch 1] validation: top1=0.562500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 1] Current best top-1: 0.562500 vs previous 0.350000, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 2] training: accuracy=0.572222\n",
"INFO:TorchImageClassificationEstimator:[Epoch 2] speed: 96 samples/sec\ttime cost: 7.290407\n",
"INFO:TorchImageClassificationEstimator:[Epoch 2] validation: top1=0.737500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 2] Current best top-1: 0.737500 vs previous 0.562500, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 3] training: accuracy=0.595833\n",
"INFO:TorchImageClassificationEstimator:[Epoch 3] speed: 96 samples/sec\ttime cost: 7.289871\n",
"INFO:TorchImageClassificationEstimator:[Epoch 3] validation: top1=0.825000 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 3] Current best top-1: 0.825000 vs previous 0.737500, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 4] training: accuracy=0.670833\n",
"INFO:TorchImageClassificationEstimator:[Epoch 4] speed: 95 samples/sec\ttime cost: 7.335812\n",
"INFO:TorchImageClassificationEstimator:[Epoch 4] validation: top1=0.737500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 5] training: accuracy=0.659722\n",
"INFO:TorchImageClassificationEstimator:[Epoch 5] speed: 94 samples/sec\ttime cost: 7.416250\n",
"INFO:TorchImageClassificationEstimator:[Epoch 5] validation: top1=0.712500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 6] training: accuracy=0.656944\n",
"INFO:TorchImageClassificationEstimator:[Epoch 6] speed: 95 samples/sec\ttime cost: 7.395159\n",
"INFO:TorchImageClassificationEstimator:[Epoch 6] validation: top1=0.787500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 7] training: accuracy=0.684722\n",
"INFO:TorchImageClassificationEstimator:[Epoch 7] speed: 94 samples/sec\ttime cost: 7.444487\n",
"INFO:TorchImageClassificationEstimator:[Epoch 7] validation: top1=0.775000 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 8] training: accuracy=0.677778\n",
"INFO:TorchImageClassificationEstimator:[Epoch 8] speed: 94 samples/sec\ttime cost: 7.449108\n",
"INFO:TorchImageClassificationEstimator:[Epoch 8] validation: top1=0.837500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 8] Current best top-1: 0.837500 vs previous 0.825000, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 9] training: accuracy=0.691667\n",
"INFO:TorchImageClassificationEstimator:[Epoch 9] speed: 94 samples/sec\ttime cost: 7.447343\n",
"INFO:TorchImageClassificationEstimator:[Epoch 9] validation: top1=0.812500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 10] training: accuracy=0.708333\n",
"INFO:TorchImageClassificationEstimator:[Epoch 10] speed: 93 samples/sec\ttime cost: 7.493931\n",
"INFO:TorchImageClassificationEstimator:[Epoch 10] validation: top1=0.887500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 10] Current best top-1: 0.887500 vs previous 0.837500, saved to /home/ec2-user/SageMaker/aws-machine-learning-university-accelerated-cv/notebooks/MLA-CV-DAY2-AutoGluon-CV/c3c99c72/.trial_0/best_checkpoint.pkl\n",
"INFO:TorchImageClassificationEstimator:[Epoch 11] training: accuracy=0.713889\n",
"INFO:TorchImageClassificationEstimator:[Epoch 11] speed: 93 samples/sec\ttime cost: 7.511936\n",
"INFO:TorchImageClassificationEstimator:[Epoch 11] validation: top1=0.825000 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 12] training: accuracy=0.709722\n",
"INFO:TorchImageClassificationEstimator:[Epoch 12] speed: 93 samples/sec\ttime cost: 7.545191\n",
"INFO:TorchImageClassificationEstimator:[Epoch 12] validation: top1=0.837500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 13] training: accuracy=0.743056\n",
"INFO:TorchImageClassificationEstimator:[Epoch 13] speed: 94 samples/sec\ttime cost: 7.453663\n",
"INFO:TorchImageClassificationEstimator:[Epoch 13] validation: top1=0.887500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:[Epoch 14] training: accuracy=0.743056\n",
"INFO:TorchImageClassificationEstimator:[Epoch 14] speed: 94 samples/sec\ttime cost: 7.444030\n",
"INFO:TorchImageClassificationEstimator:[Epoch 14] validation: top1=0.887500 top5=1.000000\n",
"INFO:TorchImageClassificationEstimator:Applying the state from the best checkpoint...\n",
"Finished, total runtime is 125.82 s\n",
"{ 'best_config': { 'batch_size': 16,\n",
" 'dist_ip_addrs': None,\n",
" 'early_stop_baseline': -inf,\n",
" 'early_stop_max_value': inf,\n",
" 'early_stop_patience': 10,\n",
" 'epochs': 15,\n",
" 'final_fit': False,\n",
" 'gpus': [0],\n",
" 'lr': 0.01,\n",
" 'model': 'resnet50',\n",
" 'ngpus_per_trial': 8,\n",
" 'nthreads_per_trial': 128,\n",
" 'num_workers': 4,\n",
" 'searcher': 'random',\n",
" 'seed': 471,\n",
" 'time_limits': 600},\n",
" 'total_time': 118.206538438797,\n",
" 'train_acc': 0.7430555555555556,\n",
" 'valid_acc': 0.8875}\n"
]
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictor = ImagePredictor()\n",
"\n",
"time_limit = 10 * 60  # how long fit() should run (in seconds)\n",
"predictor.fit(train_dataset, time_limit=time_limit)"
]
},
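{
"cell_type": "markdown",
"metadata": {},
"source": [
"The log above reports a random split of the 800 training images into 720 for training and 80 for validation. As a rough illustration of what such a holdout split does (a sketch, not AutoGluon's internal code), the hypothetical helper `holdout_split` below shuffles the rows and sets a fraction of them aside. The function name, the `holdout_frac` parameter, and the fixed seed are assumptions made for this example.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"\n",
"def holdout_split(items, holdout_frac=0.1, seed=0):\n",
"    # Shuffle a copy so the caller's data keeps its original order.\n",
"    shuffled = list(items)\n",
"    random.Random(seed).shuffle(shuffled)\n",
"    # The first n_val shuffled items become the validation set.\n",
"    n_val = int(round(len(shuffled) * holdout_frac))\n",
"    return shuffled[n_val:], shuffled[:n_val]\n",
"\n",
"\n",
"# 800 row indices stand in for the 800 training images.\n",
"train_rows, val_rows = holdout_split(range(800), holdout_frac=0.1)\n",
"print(len(train_rows), len(val_rows))  # prints: 720 80\n",
"```\n",
"\n",
"Because the validation images are never used for weight updates, the validation accuracy gives a fairer estimate of how the model handles unseen images than the training accuracy does."
]
},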
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Model Results\n",
"\n",
"AutoGluon also provides the training results, which can be accessed by calling `predictor.fit_summary()`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"fit_result = predictor.fit_summary()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'train_acc': 0.7430555555555556,\n",
" 'valid_acc': 0.8875,\n",
" 'total_time': 118.206538438797,\n",
" 'best_config': {'model': 'resnet50',\n",
" 'lr': 0.01,\n",
" 'epochs': 15,\n",
" 'batch_size': 16,\n",
" 'nthreads_per_trial': 128,\n",
" 'ngpus_per_trial': 8,\n",
" 'time_limits': 600,\n",
" 'dist_ip_addrs': None,\n",
" 'searcher': 'random',\n",
" 'early_stop_patience': 10,\n",
" 'early_stop_baseline': -inf,\n",
" 'early_stop_max_value': inf,\n",
" 'num_workers': 4,\n",
" 'gpus': [0],\n",
" 'seed': 471,\n",
" 'final_fit': False},\n",
" 'fit_history': {'train_acc': 0.7430555555555556,\n",
" 'valid_acc': 0.8875,\n",
" 'total_time': 118.206538438797,\n",
" 'best_config': {'model': 'resnet50',\n",
" 'lr': 0.01,\n",
" 'epochs': 15,\n",
" 'batch_size': 16,\n",
" 'nthreads_per_trial': 128,\n",
" 'ngpus_per_trial': 8,\n",
" 'time_limits': 600,\n",
" 'dist_ip_addrs': None,\n",
" 'searcher': 'random',\n",
" 'early_stop_patience': 10,\n",
" 'early_stop_baseline': -inf,\n",
" 'early_stop_max_value': inf,\n",
" 'num_workers': 4,\n",
" 'gpus': [0],\n",
" 'seed': 471,\n",
" 'final_fit': False}}}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fit_result"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can read individual results from this summary, for example the training and validation accuracies:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train acc: 0.743, val acc: 0.887\n"
]
}
],
"source": [
"print('Train acc: %.3f, val acc: %.3f' %(fit_result['train_acc'], fit_result['valid_acc']))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The best model and its hyperparameters (learning rate, batch size, and number of epochs) can be displayed as follows:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'model': 'resnet50',\n",
" 'lr': 0.01,\n",
" 'epochs': 15,\n",
" 'batch_size': 16,\n",
" 'nthreads_per_trial': 128,\n",
" 'ngpus_per_trial': 8,\n",
" 'time_limits': 600,\n",
" 'dist_ip_addrs': None,\n",
" 'searcher': 'random',\n",
" 'early_stop_patience': 10,\n",
" 'early_stop_baseline': -inf,\n",
" 'early_stop_max_value': inf,\n",
" 'num_workers': 4,\n",
" 'gpus': [0],\n",
" 'seed': 471,\n",
" 'final_fit': False}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fit_result['fit_history']['best_config']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Making Predictions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can call `predict` on the path of a single image; it returns the predicted integer label."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 0\n",
"Name: label, dtype: int64"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"image_path = test_dataset.iloc[0]['image']\n",
"predictor.predict(image_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get predictions on the test set."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 0\n",
"1 1\n",
"2 1\n",
"3 0\n",
"4 1\n",
" ..\n",
"75 3\n",
"76 3\n",
"77 3\n",
"78 3\n",
"79 3\n",
"Name: label, Length: 80, dtype: int64\n"
]
}
],
"source": [
"pred = predictor.predict(test_dataset)\n",
"print(pred)"
]
}
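,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Predictions become more informative when compared with the true labels. The sketch below computes plain accuracy from two label sequences. `accuracy` is a hypothetical helper written for this example (it is not an AutoGluon function), and the toy labels are made up rather than taken from this notebook's results.\n",
"\n",
"```python\n",
"def accuracy(true_labels, predicted_labels):\n",
"    # Fraction of positions where the predicted label equals the true label.\n",
"    true_labels = list(true_labels)\n",
"    predicted_labels = list(predicted_labels)\n",
"    if len(true_labels) != len(predicted_labels):\n",
"        raise ValueError('label sequences must have the same length')\n",
"    if not true_labels:\n",
"        raise ValueError('cannot compute accuracy of an empty sequence')\n",
"    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))\n",
"    return correct / len(true_labels)\n",
"\n",
"\n",
"# Toy example with made-up labels: three of four predictions match.\n",
"print(accuracy([0, 1, 1, 3], [0, 1, 2, 3]))  # prints: 0.75\n",
"```\n",
"\n",
"Assuming `test_dataset` carries the same `label` column as `train_dataset`, calling `accuracy(test_dataset['label'], pred)` would give the share of correctly classified test images."
]
}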
],
"metadata": {
"kernelspec": {
"display_name": "conda_pytorch_p39",
"language": "python",
"name": "conda_pytorch_p39"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}