{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "09323c51-6cb3-4a7d-9c7a-ef3b41e63248",
"metadata": {},
"source": [
"# Monitoring Lake Drought with SageMaker Geospatial Capabilities\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "09323c51-6cb3-4a7d-9c7a-ef3b41e63248",
"metadata": {},
"source": [
"\n",
"This notebook runs with Kernel Geospatial 1.0. Note that the following policies need to be attached to the execution role that you used to run this notebook:\n",
"\n",
"- AmazonSageMakerFullAccess\n",
"- AmazonSageMakerGeospatialFullAccess\n",
"\n",
"You can see the policies attached to the role in the IAM console under the permissions tab. If required, add the roles using the 'Add Permissions' button. \n",
"\n",
"In addition to these policies, ensure that the execution role's trust policy allows the SageMaker-GeoSpatial service to assume the role. This can be done by adding the following trust policy using the 'Trust relationships' tab:\n",
"```\n",
"{\n",
" \"Version\": \"2012-10-17\",\n",
" \"Statement\": [\n",
" {\n",
" \"Effect\": \"Allow\",\n",
" \"Principal\": {\n",
" \"Service\": [\n",
" \"sagemaker.amazonaws.com\",\n",
" \"sagemaker-geospatial.amazonaws.com\"\n",
" ]\n",
" },\n",
" \"Action\": \"sts:AssumeRole\"\n",
" }\n",
" ]\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "04a08fa7-a784-40e8-a36d-cfa84830b66b",
"metadata": {
"tags": []
},
"source": [
"**Contents**\n",
"* [Setup SageMaker geospatial capabilities](#1)\n",
"* [Query and access Data](#2)\n",
"* [Start an Earth Observation Job (EOJ) to identify the land cover types in the area of Lake Mead](#3)\n",
"* [Visualize EOJ inputs and outputs in FourSquare Studio](#4)\n",
"* [Extract the land cover segmentation results](#5)\n",
"* [Measure Lake Mead surface area](#6)\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "62e59d15-3a3e-4eae-8100-50b646425a12",
"metadata": {},
"source": [
"\n",
"\n",
"## Setup SageMaker geospatial capabilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fb593738-f41e-4bbe-bab3-12e4e5beab26",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"import sagemaker\n",
"import sagemaker_geospatial_map\n",
"\n",
"session = boto3.Session()\n",
"execution_role = sagemaker.get_execution_role()\n",
"sg_client = session.client(service_name=\"sagemaker-geospatial\")"
]
},
{
"cell_type": "markdown",
"id": "767b519b-4b00-4702-9e99-34d91ba0c720",
"metadata": {
"tags": []
},
"source": [
"\n",
"\n",
"## Query and access data"
]
},
{
"cell_type": "markdown",
"id": "665d5b78-06fb-4c36-a5eb-5ec080f8e794",
"metadata": {
"tags": []
},
"source": [
"Query the geospatial data with AOI, time range and cloud cover filter."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bde07cea-8d83-4027-b20e-921e7f36f298",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"search_rdc_args = {\n",
" \"Arn\": \"arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8\", # sentinel-2 L2A COG\n",
" \"RasterDataCollectionQuery\": {\n",
" \"AreaOfInterest\": {\n",
" \"AreaOfInterestGeometry\": {\n",
" \"PolygonGeometry\": {\n",
" \"Coordinates\": [\n",
" [\n",
" [-114.529, 36.142],\n",
" [-114.373, 36.142],\n",
" [-114.373, 36.411],\n",
" [-114.529, 36.411],\n",
" [-114.529, 36.142],\n",
" ]\n",
" ]\n",
" }\n",
" }\n",
" },\n",
" \"TimeRangeFilter\": {\n",
" \"StartTime\": \"2021-01-01T00:00:00Z\",\n",
" \"EndTime\": \"2022-07-10T23:59:59Z\",\n",
" },\n",
" \"PropertyFilters\": {\n",
" \"Properties\": [{\"Property\": {\"EoCloudCover\": {\"LowerBound\": 0, \"UpperBound\": 1}}}],\n",
" \"LogicalOperator\": \"AND\",\n",
" },\n",
" \"BandFilter\": [\"visual\"],\n",
" },\n",
"}\n",
"\n",
"tci_urls = []\n",
"data_manifests = []\n",
"while search_rdc_args.get(\"NextToken\", True):\n",
" search_result = sg_client.search_raster_data_collection(**search_rdc_args)\n",
" if search_result.get(\"NextToken\"):\n",
" data_manifests.append(search_result)\n",
" for item in search_result[\"Items\"]:\n",
" tci_url = item[\"Assets\"][\"visual\"][\"Href\"]\n",
" print(tci_url)\n",
" tci_urls.append(tci_url)\n",
"\n",
" search_rdc_args[\"NextToken\"] = search_result.get(\"NextToken\")"
]
},
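{
"cell_type": "markdown",
"id": "sketch-aoi-from-bbox",
"metadata": {},
"source": [
"The AOI in the query above is a rectangle written as a closed ring of `[longitude, latitude]` pairs, with the first vertex repeated at the end. As a minimal sketch, the ring could be generated from a bounding box with a small helper. The name `bbox_to_polygon` and its parameters are hypothetical and are not part of the SageMaker geospatial API.\n",
"\n",
"```python\n",
"def bbox_to_polygon(min_lon, min_lat, max_lon, max_lat):\n",
"    \"\"\"Return a closed, counter-clockwise ring for a lon/lat bounding box.\"\"\"\n",
"    ring = [\n",
"        [min_lon, min_lat],\n",
"        [max_lon, min_lat],\n",
"        [max_lon, max_lat],\n",
"        [min_lon, max_lat],\n",
"        [min_lon, min_lat],  # repeat the first vertex to close the ring\n",
"    ]\n",
"    # The query nests the ring inside an outer list, so do the same here.\n",
"    return [ring]\n",
"\n",
"\n",
"lake_mead_coordinates = bbox_to_polygon(-114.529, 36.142, -114.373, 36.411)\n",
"print(lake_mead_coordinates)\n",
"```\n",
"\n",
"The result equals the `Coordinates` value used in `search_rdc_args`, so it could be substituted there if the AOI needs to change."
]
},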
{
"cell_type": "code",
"execution_count": null,
"id": "733e7ae7-4b22-47d8-967a-97b2c1a5b12c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Check one example.\n",
"import os\n",
"from urllib import request\n",
"import tifffile\n",
"import matplotlib.pyplot as plt\n",
"\n",
"image_dir = \"./images/lake_mead\"\n",
"os.makedirs(image_dir, exist_ok=True)\n",
"\n",
"tci_url = tci_urls[0]\n",
"img_id = tci_url.split(\"/\")[-2]\n",
"tci_path = image_dir + \"/\" + img_id + \"_TCI.tif\"\n",
"response = request.urlretrieve(tci_url, tci_path)\n",
"print(\"Downloaded image: \" + img_id)\n",
"\n",
"tci = tifffile.imread(tci_path)\n",
"plt.figure(figsize=(6, 6))\n",
"plt.imshow(tci)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "23788ee9-bf5f-48b8-80f9-e0432048f679",
"metadata": {},
"source": [
"\n",
"## Start an Earth Observation Job (EOJ) to identify the land cover types in the area of Lake Mead"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a7f45ab-06fc-41cc-b7c2-9c169ea5b787",
"metadata": {},
"outputs": [],
"source": [
"# Perform land cover segmentation on images returned from the sentinel dataset.\n",
"eoj_input_config = {\n",
" \"RasterDataCollectionQuery\": {\n",
" \"RasterDataCollectionArn\": \"arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8\",\n",
" \"AreaOfInterest\": {\n",
" \"AreaOfInterestGeometry\": {\n",
" \"PolygonGeometry\": {\n",
" \"Coordinates\": [\n",
" [\n",
" [-114.529, 36.142],\n",
" [-114.373, 36.142],\n",
" [-114.373, 36.411],\n",
" [-114.529, 36.411],\n",
" [-114.529, 36.142],\n",
" ]\n",
" ]\n",
" }\n",
" }\n",
" },\n",
" \"TimeRangeFilter\": {\n",
" \"StartTime\": \"2021-01-01T00:00:00Z\",\n",
" \"EndTime\": \"2022-07-10T23:59:59Z\",\n",
" },\n",
" \"PropertyFilters\": {\n",
" \"Properties\": [{\"Property\": {\"EoCloudCover\": {\"LowerBound\": 0, \"UpperBound\": 1}}}],\n",
" \"LogicalOperator\": \"AND\",\n",
" },\n",
" }\n",
"}\n",
"eoj_config = {\"LandCoverSegmentationConfig\": {}}\n",
"\n",
"response = sg_client.start_earth_observation_job(\n",
" Name=\"lake-mead-landcover\",\n",
" InputConfig=eoj_input_config,\n",
" JobConfig=eoj_config,\n",
" ExecutionRoleArn=execution_role,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "47e00573-a20b-4264-8fb7-f80a1c92f4e3",
"metadata": {},
"source": [
"### Monitor the EOJ status."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "309b74ca-883e-4b64-b597-0f0e154a23dc",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"eoj_arn = response[\"Arn\"]\n",
"job_details = sg_client.get_earth_observation_job(Arn=eoj_arn)\n",
"{k: v for k, v in job_details.items() if k in [\"Arn\", \"Status\", \"DurationInSeconds\"]}"
]
},
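{
"cell_type": "markdown",
"id": "sketch-wait-for-eoj",
"metadata": {},
"source": [
"The cell above reports the job status at a single point in time. A minimal polling sketch follows, assuming that `COMPLETED`, `FAILED`, and `STOPPED` are the statuses at which the job stops changing. The helper name `wait_for_eoj` and its parameters are hypothetical, and the dry run uses a stand-in function instead of the real client.\n",
"\n",
"```python\n",
"import time\n",
"\n",
"\n",
"def wait_for_eoj(get_job, arn, poll_seconds=30, max_polls=120, sleep=time.sleep):\n",
"    \"\"\"Poll get_job(Arn=arn) until the job reaches a terminal status.\"\"\"\n",
"    terminal_statuses = {\"COMPLETED\", \"FAILED\", \"STOPPED\"}  # assumed terminal states\n",
"    for _ in range(max_polls):\n",
"        status = get_job(Arn=arn)[\"Status\"]\n",
"        if status in terminal_statuses:\n",
"            return status\n",
"        sleep(poll_seconds)\n",
"    raise TimeoutError(f\"Job {arn} did not finish after {max_polls} polls\")\n",
"\n",
"\n",
"# Dry run with a stand-in for the client call, so nothing is sent to AWS.\n",
"fake_statuses = iter([\"INITIALIZING\", \"IN_PROGRESS\", \"COMPLETED\"])\n",
"final_status = wait_for_eoj(\n",
"    lambda Arn: {\"Status\": next(fake_statuses)}, \"example-arn\", sleep=lambda s: None\n",
")\n",
"print(final_status)\n",
"```\n",
"\n",
"In this notebook the equivalent call would be `wait_for_eoj(sg_client.get_earth_observation_job, eoj_arn)`."
]
},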
{
"cell_type": "code",
"execution_count": null,
"id": "4ede1a54-1450-4fec-8f42-f91318d4eedd",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# List all jobs in the account\n",
"sg_client.list_earth_observation_jobs()[\"EarthObservationJobSummaries\"]"
]
},
{
"cell_type": "markdown",
"id": "6ec7662d-7528-4cb8-99e4-dedc4025edca",
"metadata": {},
"source": [
"\n",
"\n",
"## Visualize EOJ inputs and outputs in FourSquare Studio"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2dbbfd3b-2726-42ae-9c00-487b1244d590",
"metadata": {},
"outputs": [],
"source": [
"# Creates an instance of the map to add EOJ input/ouput layer\n",
"map = sagemaker_geospatial_map.create_map({\"is_raster\": True})\n",
"map.set_sagemaker_geospatial_client(sg_client)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "770cea38-8376-4d25-89ad-1784ee83726c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Render the map\n",
"map.render()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7eca43da-4bdc-4133-bea2-45bbf4fcc269",
"metadata": {},
"outputs": [],
"source": [
"# Visualize AOI\n",
"config = {\"label\": \"Lake Mead AOI\"}\n",
"aoi_layer = map.visualize_eoj_aoi(Arn=eoj_arn, config=config)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f82107b8-eefe-424e-8389-d11d31acc772",
"metadata": {},
"outputs": [],
"source": [
"# Visualize input.\n",
"time_range_filter = {\n",
" \"start_date\": \"2022-07-01T00:00:00Z\",\n",
" \"end_date\": \"2022-07-10T23:59:59Z\",\n",
"}\n",
"config = {\"label\": \"Input\"}\n",
"\n",
"input_layer = map.visualize_eoj_input(\n",
" Arn=eoj_arn, config=config, time_range_filter=time_range_filter\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ebf1b5d7-6e15-4c7f-9dcd-e98e02d42b34",
"metadata": {},
"outputs": [],
"source": [
"# Visualize output, EOJ needs to be in completed status.\n",
"time_range_filter = {\n",
" \"start_date\": \"2022-07-01T00:00:00Z\",\n",
" \"end_date\": \"2022-07-10T23:59:59Z\",\n",
"}\n",
"config = {\"preset\": \"singleBand\", \"band_name\": \"mask\"}\n",
"output_layer = map.visualize_eoj_output(\n",
" Arn=eoj_arn, config=config, time_range_filter=time_range_filter\n",
")"
]
},
{
"cell_type": "markdown",
"id": "321c260b-49fd-4973-9dd6-0f00f629fab6",
"metadata": {},
"source": [
"\n",
"## Extract the land cover segmentation results"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d68d8237-c580-4164-b69c-6098eb76fd00",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"sagemaker_session = sagemaker.Session()\n",
"s3_bucket_name = sagemaker_session.default_bucket() # Replace with your own bucket if needed\n",
"s3_bucket = session.resource(\"s3\").Bucket(s3_bucket_name)\n",
"prefix = \"eoj_lakemead\" # Replace with the S3 prefix desired\n",
"export_bucket_and_key = f\"s3://{s3_bucket_name}/{prefix}/\"\n",
"\n",
"eoj_output_config = {\"S3Data\": {\"S3Uri\": export_bucket_and_key}}\n",
"export_response = sg_client.export_earth_observation_job(\n",
" Arn=eoj_arn,\n",
" ExecutionRoleArn=execution_role,\n",
" OutputConfig=eoj_output_config,\n",
" ExportSourceImages=False,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "36eea163-528c-4471-9a4e-2cc71b55af70",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Monitor the export job status\n",
"export_job_details = sg_client.get_earth_observation_job(Arn=export_response[\"Arn\"])\n",
"{k: v for k, v in export_job_details.items() if k in [\"Arn\", \"Status\", \"DurationInSeconds\"]}"
]
},
{
"cell_type": "markdown",
"id": "a667db5f-ac1c-4218-9e28-48208d35d54a",
"metadata": {
"tags": []
},
"source": [
"\n",
"## Measure Lake Mead surface area"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6ef9344c-9f0b-42a8-a1e9-7c77c57487c0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"from glob import glob\n",
"import cv2\n",
"import numpy as np\n",
"import tifffile\n",
"import matplotlib.pyplot as plt\n",
"from urllib.parse import urlparse\n",
"from botocore import UNSIGNED\n",
"from botocore.config import Config\n",
"\n",
"# Download land cover masks\n",
"mask_dir = \"./masks/lake_mead\"\n",
"os.makedirs(mask_dir, exist_ok=True)\n",
"image_paths = []\n",
"for s3_object in s3_bucket.objects.filter(Prefix=prefix).all():\n",
" path, filename = os.path.split(s3_object.key)\n",
" if \"output\" in path:\n",
" mask_name = mask_dir + \"/\" + filename\n",
" s3_bucket.download_file(s3_object.key, mask_name)\n",
" print(\"Downloaded mask: \" + mask_name)\n",
"\n",
"# Download source images for visualization\n",
"for tci_url in tci_urls:\n",
" url_parts = urlparse(tci_url)\n",
" img_id = url_parts.path.split(\"/\")[-2]\n",
" tci_download_path = image_dir + \"/\" + img_id + \"_TCI.tif\"\n",
" cogs_bucket = session.resource(\n",
" \"s3\", config=Config(signature_version=UNSIGNED, region_name=\"us-west-2\")\n",
" ).Bucket(url_parts.hostname.split(\".\")[0])\n",
" cogs_bucket.download_file(url_parts.path[1:], tci_download_path)\n",
" print(\"Downloaded image: \" + tci_download_path)\n",
"\n",
"print(\"Downloads complete.\")"
]
},
{
"cell_type": "markdown",
"id": "cd1e76bc-6204-44b7-9ee4-4d618cfb2943",
"metadata": {},
"source": [
"### Extract the water mask and measure the water surface area"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23111adf-e639-45eb-baa1-6f9c20ae47c7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"image_files = glob(\"images/lake_mead/*.tif\")\n",
"mask_files = glob(\"masks/lake_mead/*.tif\")\n",
"image_files.sort(key=lambda x: x.split(\"SQA_\")[1])\n",
"mask_files.sort(key=lambda x: x.split(\"SQA_\")[1])"
]
},
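{
"cell_type": "markdown",
"id": "sketch-scene-date",
"metadata": {},
"source": [
"The sort key above relies on the acquisition date following the tile ID in each file name. As a sketch, assuming scene IDs shaped like `S2A_11SQA_20220705_0_L2A` (an illustrative ID, not one taken from the query results), the date can be parsed explicitly and used as the key. `scene_date` is a hypothetical helper.\n",
"\n",
"```python\n",
"import os\n",
"from datetime import datetime\n",
"\n",
"\n",
"def scene_date(path):\n",
"    \"\"\"Extract the acquisition date from a name such as S2A_11SQA_20220705_0_L2A.tif.\"\"\"\n",
"    scene_id = os.path.basename(path)\n",
"    date_token = scene_id.split(\"_\")[2]  # third underscore-separated field, YYYYMMDD\n",
"    return datetime.strptime(date_token, \"%Y%m%d\").date()\n",
"\n",
"\n",
"# Illustrative paths only; real names come from the downloaded files.\n",
"example_paths = [\n",
"    \"masks/lake_mead/S2B_11SQA_20220710_0_L2A.tif\",\n",
"    \"masks/lake_mead/S2A_11SQA_20210205_0_L2A.tif\",\n",
"]\n",
"sorted_paths = sorted(example_paths, key=scene_date)\n",
"print(sorted_paths)\n",
"```\n",
"\n",
"Sorting on a parsed date orders the files chronologically even when the satellite prefix (`S2A` or `S2B`) differs between scenes."
]
},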
{
"cell_type": "code",
"execution_count": null,
"id": "9bd68e7f-4dee-4f94-b699-f8ce91c1c76e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"overlay_dir = \"./masks/lake_mead_overlay\"\n",
"os.makedirs(overlay_dir, exist_ok=True)\n",
"\n",
"lake_areas = []\n",
"mask_dates = []\n",
"\n",
"for image_file, mask_file in zip(image_files, mask_files):\n",
" image_id = image_file.split(\"/\")[-1].split(\"_TCI\")[0]\n",
" mask_id = mask_file.split(\"/\")[-1].split(\".tif\")[0]\n",
" mask_date = mask_id.split(\"_\")[2]\n",
" mask_dates.append(mask_date)\n",
" assert image_id == mask_id\n",
" image = tifffile.imread(image_file)\n",
" image_ds = cv2.resize(image, (1830, 1830), interpolation=cv2.INTER_LINEAR)\n",
" mask = tifffile.imread(mask_file)\n",
" water_mask = np.isin(mask, [6]).astype(np.uint8) # water has a class index 6\n",
" lake_mask = water_mask[1000:, :1100]\n",
" lake_area = lake_mask.sum() * 60 * 60 / (1000 * 1000) # calculate the surface area\n",
" lake_areas.append(lake_area)\n",
" contour, _ = cv2.findContours(water_mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)\n",
" combined = cv2.drawContours(image_ds, contour, -1, (255, 0, 0), 4)\n",
" lake_crop = combined[1000:, :1100]\n",
" cv2.putText(\n",
" lake_crop,\n",
" f\"{mask_date}\",\n",
" (10, 50),\n",
" cv2.FONT_HERSHEY_SIMPLEX,\n",
" 1.5,\n",
" (0, 0, 0),\n",
" 3,\n",
" cv2.LINE_AA,\n",
" )\n",
" cv2.putText(\n",
" lake_crop,\n",
" f\"{lake_area} [sq km]\",\n",
" (10, 100),\n",
" cv2.FONT_HERSHEY_SIMPLEX,\n",
" 1.5,\n",
" (0, 0, 0),\n",
" 3,\n",
" cv2.LINE_AA,\n",
" )\n",
" overlay_file = overlay_dir + \"/\" + mask_date + \".png\"\n",
" cv2.imwrite(overlay_file, cv2.cvtColor(lake_crop, cv2.COLOR_RGB2BGR))"
]
},
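{
"cell_type": "markdown",
"id": "sketch-water-area",
"metadata": {},
"source": [
"The surface area computed above is a pixel count multiplied by the ground area of one pixel. The sketch below isolates that step on a toy array, assuming (as the loop does) that water is class 6 and that each mask pixel covers 60 m x 60 m. `water_area_sq_km` is a hypothetical helper, and the toy mask is made up for illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def water_area_sq_km(mask, water_class=6, pixel_size_m=60):\n",
"    \"\"\"Count pixels of the water class and convert the total to square kilometres.\"\"\"\n",
"    water_pixels = np.count_nonzero(mask == water_class)\n",
"    return water_pixels * pixel_size_m**2 / 1_000_000\n",
"\n",
"\n",
"# Toy 3 x 3 mask with four water pixels (class 6).\n",
"toy_mask = np.array([[6, 6, 0], [6, 4, 4], [0, 0, 6]])\n",
"toy_area = water_area_sq_km(toy_mask)\n",
"print(toy_area)  # 4 water pixels * 3600 sq m each, expressed in sq km\n",
"```\n",
"\n",
"If the mask resolution differs from 60 m, only `pixel_size_m` needs to change."
]
},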
{
"cell_type": "code",
"execution_count": null,
"id": "18fdc809-02bf-49fc-90b2-5d4bb1269983",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"plt.figure(figsize=(20, 10))\n",
"plt.title(\"Lake Mead surface area for the 2021.02 - 2022.07 period.\", fontsize=20)\n",
"plt.xticks(rotation=45)\n",
"plt.ylabel(\"Water surface area [sq km]\", fontsize=14)\n",
"plt.plot(mask_dates, lake_areas, marker=\"o\")\n",
"plt.grid(\"on\")\n",
"plt.ylim(240, 320)\n",
"for i, v in enumerate(lake_areas):\n",
" plt.text(i, v + 2, \"%d\" % v, ha=\"center\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "aef01b08-4ef2-46d9-bc93-fe7966cf72fc",
"metadata": {},
"source": [
"### Overlay water boundaries with source images"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "320a3125-7622-4ef8-a674-ee220a32598c",
"metadata": {},
"outputs": [],
"source": [
"import imageio.v2 as imageio\n",
"\n",
"frames = []\n",
"filenames = glob(\"./masks/lake_mead_overlay/*.png\")\n",
"filenames.sort()\n",
"\n",
"for filename in filenames:\n",
" frames.append(imageio.imread(filename))\n",
"imageio.mimsave(\"lake_mead.gif\", frames, duration=1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07542687-5288-4b6f-ae4e-483af2f130a7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"\n",
"HTML('
')"
]
}
],
"metadata": {
"instance_type": "ml.m5.4xlarge",
"kernelspec": {
"display_name": "Python 3 (Geospatial 1.0)",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:081189585635:image/sagemaker-geospatial-v1-0"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
},
"toc-autonumbering": true,
"toc-showtags": false
},
"nbformat": 4,
"nbformat_minor": 5
}