{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "84b9e41b-be32-40be-ba36-e7f679802949",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Vegetation Analysis Pre/Post Lava Fire Using SentinelHub and Sentinel-2 Imagery on AWS\n",
    "\n",
    "This notebook includes examples included in the blog post [Getting Started with Geospatial Analysis on SageMaker Studio Lab](https://towardsdatascience.com/getting-started-with-geospatial-analysis-b2116c50308b). For this specific use case we focus on working with the [Sentinel-2](https://registry.opendata.aws/sentinel-2/) dataset using the [SentinelHub API]((https://www.sentinel-hub.com/) to query and download pre/post scenes over Mt. Shasta in California during the [Lava Fire](https://en.wikipedia.org/wiki/Lava_Fire_(2021)) from June 2021. We then compare two images calculating an normalized difference vegetation index (NDVI) to identify vegetation and classify the result using a simple land cover classification scheme. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "566fa133-94a6-4c8a-93f5-3e36d4fd24b6",
   "metadata": {},
   "source": [
    "## Install Packages (Optional)\n",
    "Creating a environment in Studio Lab is easy, just select the environment.yml file (by cloning this repository or upload it directly), right click the YAML file and select create environment. Once the environment is created, you should open this notbook with the newly created kernel. Optionally you can also uncomment the package installation section of the notebook to install these packages manually. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d5d85c3a-5066-492b-b54a-fad7552e706a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# %pip install numpy\n",
    "# %pip install matplotlib\n",
    "# %pip install plotly_express\n",
    "# %pip install sentinelhub\n",
    "# %pip install rasterio\n",
    "# %pip install earthpy"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "feac1762-2ce3-4a25-aad3-a64df4b969c5",
   "metadata": {},
   "source": [
    "## Import Packages\n",
    "After the environment is created and selected or the packages installed manually, we can import them directly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d1deaf5d-2b3a-49c3-a623-7e0120a54bbb",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib\n",
    "import matplotlib.pyplot as plt \n",
    "import folium\n",
    "import plotly_express as px\n",
    "import os\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e069d6b",
   "metadata": {},
   "source": [
    "## Working With Geospatial Images\n",
    "For Geospatial data, we will use Sentinel-2. The [Sentinel-2 mission](https://sentinel.esa.int/web/sentinel/missions/sentinel-2) is a land monitoring constellation of two satellites that provide high resolution optical imagery and continuity for the current SPOT and Landsat missions. The Sentinel-2 dataset is available publicly at the [AWS open data registry](https://registry.opendata.aws/sentinel-2/)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42d74192-9b19-4afc-a4df-bb33e9d8fd30",
   "metadata": {},
   "source": [
    "We will use the `sentinelhub` python package, that makes it easy to search and download data specific to our focus area directly from AWS. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ba81f6c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sentinelhub import SHConfig\n",
    "config = SHConfig()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c23cd514-f951-4409-b581-b3d41c4170eb",
   "metadata": {},
   "source": [
    "#### Sentinel Hub Setup\n",
    "This section shows how to configure your credentials for sentinelhub. We are using a optional json file to store and retrieve credentials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b258fbbf-a5a8-4e10-aa22-f3e7aee41d70",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "with open(\"config.json\") as json_data_file:\n",
    "    cfg = json.load(json_data_file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4aab6a70",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# instance_id - Instance ID from from your Sentinel Hub account \n",
    "# aws_access_key_id - Access key ID from your AWS account\n",
    "# aws_secret_access_key - Secrect access key from your AWS account\n",
    "\n",
    "config.instance_id = cfg[\"sentinelhub\"][\"instance_id\"]\n",
    "config.aws_access_key_id = cfg[\"aws\"][\"access_key_id\"]\n",
    "config.aws_secret_access_key = cfg[\"aws\"][\"secret_access_key\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1442d11e",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Save the configuration\n",
    "config.save()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7f244cb7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Verify credentials\n",
    "\n",
    "from sentinelhub import WebFeatureService, BBox, CRS, DataCollection, SHConfig\n",
    "if config.instance_id == '':\n",
    "    print(\"Warning! To use WFS functionality, please configure the `instance_id`.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "caa451c7-f828-4b34-a4a3-73cfe2aa3762",
   "metadata": {
    "tags": []
   },
   "source": [
    "#### Data Search\n",
    "Before we download, we need to specify our search coordinates that we want to study and the time window. In our case we are focusing on the Mt. Shasta region, which we specify as a bounding box and a time period corresponding to the Lava Fire (June 2021). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "523c6b31",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Specify bounding box and time interval for search. Here we are using a bounding box over the Mt. Shasta. We'll use the time that corresponds to pre/post fire event.\n",
    "\n",
    "search_bbox = BBox(bbox=[-122.36984252929688,41.5327401,-122.20436096,41.4195597], crs=CRS.WGS84)\n",
    "\n",
    "search_time_interval = ('2021-05-12T00:00:00', '2021-05-13T23:59:59')  #to include the post image expand to '2021-09-14T23:59:59'\n",
    "\n",
    "\n",
    "wfs_iterator = WebFeatureService(\n",
    "    search_bbox,\n",
    "    search_time_interval,\n",
    "    data_collection=DataCollection.SENTINEL2_L1C,\n",
    "    maxcc=1.0,\n",
    "    config=config\n",
    ")\n",
    "\n",
    "for tile_info in wfs_iterator:\n",
    "    print(tile_info)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa8a2b48",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# List available tiles\n",
    "wfs_iterator.get_tiles()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "514c6ce3-46dd-4425-850b-f7bcd8579b8c",
   "metadata": {
    "tags": []
   },
   "source": [
    "#### Picking Tiles\n",
    "For best results, we pick a tile with least cloud coverage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "962b3048-ad5d-4d31-8e7d-b8aa3835b2f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sentinelhub import AwsTile\n",
    "\n",
    "tile_id = 'S2A_OPER_MSI_L1C_TL_VGS2_20210512T223857_A030755_T10TEL_N03.00'\n",
    "tile_name, time, aws_index = AwsTile.tile_id_to_tile(tile_id)\n",
    "tile_name, time, aws_index\n",
    "\n",
    "tile_id2 = 'S2B_OPER_MSI_L1C_TL_VGS4_20210914T222623_A023634_T10TEL_N03.01'\n",
    "tile_name2, time2, aws_index2 = AwsTile.tile_id_to_tile(tile_id2)\n",
    "tile_name2, time2, aws_index2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30e57092-150a-4b57-9b85-f1129c2cd0ae",
   "metadata": {},
   "source": [
    "#### Sentinel Data Download\n",
    "The Sentinel-2 satellites each carry a single multi-spectral instrument (MSI) with 13 spectral channels in the visible/near infrared (VNIR) and short wave infrared spectral range (SWIR). You can read more about these bands [here](https://en.wikipedia.org/wiki/Sentinel-2#Spectral_bands). For our example will download eight specific bands that will aid our analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df7af114",
   "metadata": {},
   "outputs": [],
   "source": [
    "warnings.simplefilter(\"ignore\", UserWarning)\n",
    "from sentinelhub import AwsTileRequest\n",
    "\n",
    "bands = ['B04','B08']  # Sentinel-2 imagery is comprised of several bands but we are only focusing on these for this analysis\n",
    "metafiles = ['tileInfo', 'preview', 'qi/MSK_CLOUDS_B00']\n",
    "data_folder = './AwsData'\n",
    "\n",
    "request = AwsTileRequest(   # This corresponds to the first image request\n",
    "    tile=tile_name,\n",
    "    time=time,\n",
    "    aws_index=aws_index,\n",
    "    bands=bands,\n",
    "    metafiles=metafiles,\n",
    "    data_folder=data_folder,\n",
    "    data_collection=DataCollection.SENTINEL2_L1C\n",
    ")\n",
    "\n",
    "request.save_data()  # This is where the download is triggered\n",
    "\n",
    "request2 = AwsTileRequest(   # This corresponds to the second image request\n",
    "    tile=tile_name2,\n",
    "    time=time2,\n",
    "    aws_index=aws_index2,\n",
    "    bands=bands,\n",
    "    metafiles=metafiles,\n",
    "    data_folder=data_folder,\n",
    "    data_collection=DataCollection.SENTINEL2_L1C\n",
    ")\n",
    "\n",
    "request2.save_data()   "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5ad27da1-a9f2-42dc-ac2d-1670ab776105",
   "metadata": {},
   "source": [
    "## Working with Raster Data\n",
    "Geospatial data is essentially comprised of raster data or vector data. Sentinel-2 uses GeoTIFF, a gridded raster datasets for satellite imagery and terrain models. Rasterio is a Python library that allows to read, inspect, visualize and write geospatial raster data. Here we use `rasterio` to read thee raster arrays and then use this data to create a true color image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "00185f67",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import rasterio\n",
    "from rasterio import plot\n",
    "import earthpy.spatial as es\n",
    "import earthpy.plot as ep"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14bf3768-3cb7-4429-a763-3215267de9f9",
   "metadata": {
    "tags": []
   },
   "source": [
    "The `earthpy` package allows easy plotting of visualization of bands, we use it here to visualize the NDVI"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "448f00c4-80a3-4cb9-8b9f-6152592734e2",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Calculating Spectral Indices\n",
    "Spectral indices are combinations of the pixel values from two or more spectral bands in a multispectral image. Spectral indices highlight pixels showing the relative abundance or lack of a land-cover type of interest in an image. Let's looks at a couple "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b9d9f47-189f-4f6b-90ce-e0ebd81be39f",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Normalized Difference Vegetation Index - NVDI\n",
    "The normalized difference vegetation index is a simple graphical indicator that can be used to analyze whether or not the target being observed contains live green vegetation. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1b6a949-c024-499e-bfb7-d9442b0e94ba",
   "metadata": {},
   "source": [
    "It calculated as `NDVI = (NIR – Red) / (NIR + Red)`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "186cf9cf-10b2-40cd-ac24-52f77e44d6fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "band4_image1 = rasterio.open('./AwsData/10TEL,2021-05-12,0/B04.jp2', driver='JP2OpenJPEG') #red\n",
    "band8_image1 = rasterio.open('./AwsData/10TEL,2021-05-12,0/B08.jp2', driver='JP2OpenJPEG') #nir\n",
    "\n",
    "band4_image2 = rasterio.open('./AwsData/10TEL,2021-09-14,0/B04.jp2', driver='JP2OpenJPEG') #red\n",
    "band8_image2 = rasterio.open('./AwsData/10TEL,2021-09-14,0/B08.jp2', driver='JP2OpenJPEG') #nir\n",
    "\n",
    "# read Red(b4) and NIR(b8) as arrays\n",
    "red_image1 = band4_image1.read(1)\n",
    "nir_image1 = band8_image1.read(1)\n",
    "ndvi_image1 = (nir_image1.astype(float)-red_image1.astype(float))/(nir_image1.astype(float)+red_image1.astype(float))\n",
    "\n",
    "red_image2 = band4_image2.read(1)\n",
    "nir_image2 = band8_image2.read(1)\n",
    "ndvi_image2 = (nir_image2.astype(float)-red_image2.astype(float))/(nir_image2.astype(float)+red_image2.astype(float))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8f6140e-e301-4eea-a8b2-9aefea96008d",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Creating a Simple Image Classification from NDVI\n",
    "Remote Sensing data can be classified into thematic layers like land cover maps which are useful for representing spectral values as vegetation. This next section will create a 5-class map from the input NDVI images to compare differences in land cover change before and after the Lava Fire event occurred."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9e149cd9-7d3e-475e-9acc-1d4edf3de0ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create classes and apply to NDVI results\n",
    "ndvi_class_bins = [-np.inf, 0, 0.1, 0.25, 0.4, np.inf]\n",
    "ndvi_sentinel_class = np.digitize(ndvi_image1, ndvi_class_bins)\n",
    "\n",
    "# Apply the nodata mask to the newly classified NDVI data\n",
    "ndvi_sentinel_class = np.ma.masked_where(\n",
    "    np.ma.getmask(ndvi_image1), ndvi_sentinel_class\n",
    ")\n",
    "#np.unique(ndvi_sentinel_class)\n",
    "\n",
    "from matplotlib.colors import ListedColormap\n",
    "\n",
    "# Define color map\n",
    "nbr_colors = [\"khaki\", \"y\", \"yellowgreen\", \"g\", \"darkgreen\"]\n",
    "nbr_cmap = ListedColormap(nbr_colors)\n",
    "\n",
    "# Define class names\n",
    "ndvi_cat_names = [\n",
    "    \"Dead forest\",\n",
    "    \"Scrub\",\n",
    "    \"Open Forest\",\n",
    "    \"Moderately Dense Forest\",\n",
    "    \"Very Dense Forest\",\n",
    "]\n",
    "\n",
    "# Get list of classes\n",
    "classes = np.unique(ndvi_sentinel_class)\n",
    "classes = classes.tolist()\n",
    "\n",
    "# The mask returns a value of none in the classes. remove that\n",
    "classes = classes[0:5]\n",
    "\n",
    "# Create classes and apply to NDVI results\n",
    "ndvi_class_bins_2 = [-np.inf, 0, 0.1, 0.25, 0.4, np.inf]\n",
    "ndvi_sentinel_class_2 = np.digitize(ndvi_image2, ndvi_class_bins)\n",
    "\n",
    "# Apply the nodata mask to the newly classified NDVI data\n",
    "ndvi_sentinel_class_2 = np.ma.masked_where(\n",
    "    np.ma.getmask(ndvi_image2), ndvi_sentinel_class_2\n",
    ")\n",
    "#np.unique(ndvi_sentinel_class_2)\n",
    "\n",
    "nbr_colors_2 = [\"khaki\", \"y\", \"yellowgreen\", \"g\", \"darkgreen\"]\n",
    "nbr_cmap_2 = ListedColormap(nbr_colors)\n",
    "\n",
    "# Define class names\n",
    "ndvi_cat_names_2 = [\n",
    "    \"Dead forest\",\n",
    "    \"Scrub\",\n",
    "    \"Open Forest\",\n",
    "    \"Moderately Dense Forest\",\n",
    "    \"Very Dense Forest\",\n",
    "]\n",
    "\n",
    "# Get list of classes\n",
    "classes_2 = np.unique(ndvi_sentinel_class_2)\n",
    "classes_2 = classes_2.tolist()\n",
    "# The mask returns a value of none in the classes. remove that\n",
    "classes_2 = classes_2[0:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac01ad35-f6e4-4e9f-80d7-8ccf83d89166",
   "metadata": {},
   "source": [
    "## Plotting the Images"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "63f831f5-0bcd-43e2-b3c6-16e252f7c074",
   "metadata": {},
   "source": [
    "### NDVI Image Comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c97fb49d-2fe1-46a0-88f1-acb1df6e07c0",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,6)) # 2 axes on a 1x2 grid\n",
    "\n",
    "ndvi_image1 = ax1.imshow(ndvi_image1, cmap=\"RdYlGn\")\n",
    "ax1.set_title(\"NDVI (Pre) - 2021-05-12\")\n",
    "ndvi_image1.set_clim(vmin=-1, vmax=1)\n",
    "fig.colorbar(ndvi_image1, ax=ax1)\n",
    "\n",
    "# Now red band in the second subplot\n",
    "ndvi_image2 = ax2.imshow(ndvi_image2, cmap=\"RdYlGn\")\n",
    "ax2.set_title(\"NDVI (Post) - 2021-09-14 \")\n",
    "ndvi_image2.set_clim(vmin=-1, vmax=1)\n",
    "fig.colorbar(ndvi_image2, ax=ax2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78cca11d-11db-4365-a2ae-78652a5ab436",
   "metadata": {},
   "source": [
    "You can see areas with vegetation in green, areas with dense vegetation as darker shades of green, water bodes generally have low to no vegetation and as such in a contrasting shade of orange. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c76594d-430c-4df1-a1c1-bf9a0b6f3199",
   "metadata": {},
   "source": [
    "### Vegetation Classification Maps"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef05b24e-3396-4f37-8cca-7aa8d260859a",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,6)) # 2 axes on a 1x2 grid\n",
    "\n",
    "im1 = ax1.imshow(np.squeeze(ndvi_sentinel_class), cmap=nbr_cmap)\n",
    "ep.draw_legend(im_ax=im1, classes=classes, titles=ndvi_cat_names)\n",
    "ax1.set_title(\n",
    "    \"Pre-fire (2021-05-12)\",\n",
    "    fontsize=14)\n",
    "ax1.set_axis_off()\n",
    "\n",
    "# Auto adjust subplot to fit figure size\n",
    "#plt.tight_layout()\n",
    "\n",
    "\n",
    "im2 = ax2.imshow(np.squeeze(ndvi_sentinel_class_2), cmap=nbr_cmap_2)\n",
    "\n",
    "#ep.draw_legend(im_ax=im2, classes=classes_2, titles=ndvi_cat_names_2)\n",
    "ax2.set_title(\n",
    "    \"Post-fire (2021-09-14)\",\n",
    "    fontsize=14,\n",
    ")\n",
    "ax2.set_axis_off()\n",
    "\n",
    "# Auto adjust subplot to fit figure size\n",
    "#plt.tight_layout()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f75b9742-6337-4aa9-83b2-f73e89709692",
   "metadata": {},
   "source": [
    "Here we see a comparison of pre/post fire classified NDVI images using some common vegetation classes representing different NDVI value ranges. Visually we see large areas of Very Dense / Moderately Dense Forest transformed into Dead Forest or Scrub following events of the Lava Fire."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15ed1983-d35f-4073-838d-755fdc34c64e",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "## Clean Up (Optional)\n",
    "Though we did not create any AWS billable resources as part of this exercise, the geographic and GIS data that we downloaded and the images generated may take up significant storage. Make sure to check any storage utilization and delete the files as needed."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "geo-data:Python",
   "language": "python",
   "name": "conda-env-geo-data-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}