{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![AWS SDK for pandas](_static/logo.png \"AWS SDK for pandas\")](https://github.com/aws/aws-sdk-pandas)\n", "\n", "# 24 - Athena Query Metadata\n", "\n", "For `wr.athena.read_sql_query()` and `wr.athena.read_sql_table()` the resulting DataFrame (or every DataFrame in the returned Iterator for chunked queries) have a `query_metadata` attribute, which brings the query result metadata returned by Boto3/Athena.\n", "\n", "The expected `query_metadata` format is the same returned by:\n", "\n", "https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html#Athena.Client.get_query_execution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Environment Variables" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: WR_DATABASE=default\n" ] } ], "source": [ "%env WR_DATABASE=default" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import awswrangler as wr" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
foo
01
\n", "
" ], "text/plain": [ " foo\n", "0 1" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = wr.athena.read_sql_query(\"SELECT 1 AS foo\")\n", "\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting statistics from query metadata" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DataScannedInBytes: 0\n", "TotalExecutionTimeInMillis: 2311\n", "QueryQueueTimeInMillis: 121\n", "QueryPlanningTimeInMillis: 250\n", "ServiceProcessingTimeInMillis: 37\n" ] } ], "source": [ "print(f'DataScannedInBytes: {df.query_metadata[\"Statistics\"][\"DataScannedInBytes\"]}')\n", "print(f'TotalExecutionTimeInMillis: {df.query_metadata[\"Statistics\"][\"TotalExecutionTimeInMillis\"]}')\n", "print(f'QueryQueueTimeInMillis: {df.query_metadata[\"Statistics\"][\"QueryQueueTimeInMillis\"]}')\n", "print(f'QueryPlanningTimeInMillis: {df.query_metadata[\"Statistics\"][\"QueryPlanningTimeInMillis\"]}')\n", "print(f'ServiceProcessingTimeInMillis: {df.query_metadata[\"Statistics\"][\"ServiceProcessingTimeInMillis\"]}')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.14", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.14" } }, "nbformat": 4, "nbformat_minor": 4 }