{ "cells": [ { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "[![AWS SDK for pandas](_static/logo.png \"AWS SDK for pandas\")](https://github.com/aws/aws-sdk-pandas)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "# 39 - Athena Iceberg" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Athena supports read, time travel, write, and DDL queries for Apache Iceberg tables that use the Apache Parquet format for data and the AWS Glue catalog for their metastore. More in [User Guide](https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg.html)." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Create Iceberg table" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import getpass\n", "bucket_name = getpass.getpass()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import awswrangler as wr\n", "\n", "glue_database = \"aws_sdk_pandas\"\n", "glue_table = \"iceberg_test\"\n", "path = f\"s3://{bucket_name}/iceberg_test/\"\n", "temp_path = f\"s3://{bucket_name}/iceberg_test_temp/\"\n", "\n", "# Cleanup table before create\n", "wr.catalog.delete_table_if_exists(database=glue_database, table=glue_table)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Create table & insert data" ] }, { "cell_type": "markdown", "source": [ "It is possible to insert Pandas data frame into Iceberg table using `wr.athena.to_iceberg`. If the table does not exist, it will be created:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "import pandas as pd\n", "\n", "df = pd.DataFrame({\"id\": [1, 2, 3], \"name\": [\"John\", \"Lily\", \"Richard\"]})\n", "\n", "wr.athena.to_iceberg(\n", " df=df,\n", " database=glue_database,\n", " table=glue_table,\n", " table_location=path,\n", " temp_path=temp_path,\n", ")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "Alternatively, it is also possible to insert by directly running `INSERT INTO ... VALUES`:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'QueryExecutionId': 'e339fcd2-9db1-43ac-bb9e-9730e6395b51',\n", " 'Query': \"INSERT INTO iceberg_test VALUES (1,'John'), (2, 'Lily'), (3, 'Richard')\",\n", " 'StatementType': 'DML',\n", " 'ResultConfiguration': {'OutputLocation': 's3://aws-athena-query-results-...-us-east-1/e339fcd2-9db1-43ac-bb9e-9730e6395b51'},\n", " 'ResultReuseConfiguration': {'ResultReuseByAgeConfiguration': {'Enabled': False}},\n", " 'QueryExecutionContext': {'Database': 'aws_sdk_pandas'},\n", " 'Status': {'State': 'SUCCEEDED',\n", " 'SubmissionDateTime': datetime.datetime(2023, 3, 16, 10, 40, 8, 612000, tzinfo=tzlocal()),\n", " 'CompletionDateTime': datetime.datetime(2023, 3, 16, 10, 40, 11, 143000, tzinfo=tzlocal())},\n", " 'Statistics': {'EngineExecutionTimeInMillis': 2242,\n", " 'DataScannedInBytes': 0,\n", " 'DataManifestLocation': 's3://aws-athena-query-results-...-us-east-1/e339fcd2-9db1-43ac-bb9e-9730e6395b51-manifest.csv',\n", " 'TotalExecutionTimeInMillis': 2531,\n", " 'QueryQueueTimeInMillis': 241,\n", " 'QueryPlanningTimeInMillis': 179,\n", " 'ServiceProcessingTimeInMillis': 48,\n", " 'ResultReuseInformation': {'ReusedPreviousResult': False}},\n", " 'WorkGroup': 'primary',\n", " 'EngineVersion': {'SelectedEngineVersion': 'Athena engine version 3',\n", " 'EffectiveEngineVersion': 'Athena engine version 3'}}" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.start_query_execution(\n", " sql=f\"INSERT INTO {glue_table} VALUES (1,'John'), (2, 'Lily'), (3, 'Richard')\",\n", " database=glue_database,\n", " wait=True,\n", ")" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'QueryExecutionId': '922c8f02-4c00-4050-b4a7-7016809efa2b',\n", " 'Query': \"INSERT INTO iceberg_test VALUES (4,'Anne'), (5, 'Jacob'), (6, 'Leon')\",\n", " 'StatementType': 'DML',\n", " 'ResultConfiguration': {'OutputLocation': 's3://aws-athena-query-results-...-us-east-1/922c8f02-4c00-4050-b4a7-7016809efa2b'},\n", " 'ResultReuseConfiguration': {'ResultReuseByAgeConfiguration': {'Enabled': False}},\n", " 'QueryExecutionContext': {'Database': 'aws_sdk_pandas'},\n", " 'Status': {'State': 'SUCCEEDED',\n", " 'SubmissionDateTime': datetime.datetime(2023, 3, 16, 10, 40, 24, 582000, tzinfo=tzlocal()),\n", " 'CompletionDateTime': datetime.datetime(2023, 3, 16, 10, 40, 27, 352000, tzinfo=tzlocal())},\n", " 'Statistics': {'EngineExecutionTimeInMillis': 2414,\n", " 'DataScannedInBytes': 0,\n", " 'DataManifestLocation': 's3://aws-athena-query-results-...-us-east-1/922c8f02-4c00-4050-b4a7-7016809efa2b-manifest.csv',\n", " 'TotalExecutionTimeInMillis': 2770,\n", " 'QueryQueueTimeInMillis': 329,\n", " 'QueryPlanningTimeInMillis': 189,\n", " 'ServiceProcessingTimeInMillis': 27,\n", " 'ResultReuseInformation': {'ReusedPreviousResult': False}},\n", " 'WorkGroup': 'primary',\n", " 'EngineVersion': {'SelectedEngineVersion': 'Athena engine version 3',\n", " 'EffectiveEngineVersion': 'Athena engine version 3'}}" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.start_query_execution(\n", " sql=f\"INSERT INTO {glue_table} VALUES (4,'Anne'), (5, 'Jacob'), (6, 'Leon')\",\n", " database=glue_database,\n", " wait=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Query\n" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idname
01John
14Anne
22Lily
33Richard
45Jacob
56Leon
\n", "
" ], "text/plain": [ " id name\n", "0 1 John\n", "1 4 Anne\n", "2 2 Lily\n", "3 3 Richard\n", "4 5 Jacob\n", "5 6 Leon" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f'SELECT * FROM \"{glue_table}\"',\n", " database=glue_database,\n", " ctas_approach=False,\n", " unload_approach=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Read query metadata" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In a SELECT query, you can use the following properties after `table_name` to query Iceberg table metadata:" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "- `$files` Shows a table's current data files" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "- `$manifests` Shows a table's current file manifests" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "- `$history` Shows a table's history" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "- `$partitions` Shows a table's current partitions" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
contentfile_pathfile_formatrecord_countfile_size_in_bytescolumn_sizesvalue_countsnull_value_countsnan_value_countslower_boundsupper_boundskey_metadatasplit_offsetsequality_ids
00s3://.../iceberg_test/data/089a...PARQUET3360{1=48, 2=63}{1=3, 2=3}{1=0, 2=0}{}{1=1, 2=John}{1=3, 2=Richard}<NA>NaNNaN
10s3://.../iceberg_test/data/5736...PARQUET3355{1=48, 2=61}{1=3, 2=3}{1=0, 2=0}{}{1=4, 2=Anne}{1=6, 2=Leon}<NA>NaNNaN
\n", "
" ], "text/plain": [ " content file_path file_format \\\n", "0 0 s3://.../iceberg_test/data/089a... PARQUET \n", "1 0 s3://.../iceberg_test/data/5736... PARQUET \n", "\n", " record_count file_size_in_bytes column_sizes value_counts \\\n", "0 3 360 {1=48, 2=63} {1=3, 2=3} \n", "1 3 355 {1=48, 2=61} {1=3, 2=3} \n", "\n", " null_value_counts nan_value_counts lower_bounds upper_bounds \\\n", "0 {1=0, 2=0} {} {1=1, 2=John} {1=3, 2=Richard} \n", "1 {1=0, 2=0} {} {1=4, 2=Anne} {1=6, 2=Leon} \n", "\n", " key_metadata split_offsets equality_ids \n", "0 NaN NaN \n", "1 NaN NaN " ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f'SELECT * FROM \"{glue_table}$files\"',\n", " database=glue_database,\n", " ctas_approach=False,\n", " unload_approach=False,\n", ")" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pathlengthpartition_spec_idadded_snapshot_idadded_data_files_countadded_rows_countexisting_data_files_countexisting_rows_countdeleted_data_files_countdeleted_rows_countpartitions
0s3://.../iceberg_test/metadata/...653804379263637983206651130000[]
1s3://.../iceberg_test/metadata/...654802934717851675145063130000[]
\n", "
" ], "text/plain": [ " path length \\\n", "0 s3://.../iceberg_test/metadata/... 6538 \n", "1 s3://.../iceberg_test/metadata/... 6548 \n", "\n", " partition_spec_id added_snapshot_id added_data_files_count \\\n", "0 0 4379263637983206651 1 \n", "1 0 2934717851675145063 1 \n", "\n", " added_rows_count existing_data_files_count existing_rows_count \\\n", "0 3 0 0 \n", "1 3 0 0 \n", "\n", " deleted_data_files_count deleted_rows_count partitions \n", "0 0 0 [] \n", "1 0 0 [] " ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f'SELECT * FROM \"{glue_table}$manifests\"',\n", " database=glue_database,\n", " ctas_approach=False,\n", " unload_approach=False,\n", ")" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
made_current_atsnapshot_idparent_idis_current_ancestor
02023-03-16 09:40:10.438000+00:002934717851675145063<NA>True
12023-03-16 09:40:26.754000+00:0043792636379832066512934717851675144704True
\n", "
" ], "text/plain": [ " made_current_at snapshot_id parent_id \\\n", "0 2023-03-16 09:40:10.438000+00:00 2934717851675145063 \n", "1 2023-03-16 09:40:26.754000+00:00 4379263637983206651 2934717851675144704 \n", "\n", " is_current_ancestor \n", "0 True \n", "1 True " ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = wr.athena.read_sql_query(\n", " sql=f'SELECT * FROM \"{glue_table}$history\"',\n", " database=glue_database,\n", " ctas_approach=False,\n", " unload_approach=False,\n", ")\n", "\n", "# Save snapshot id\n", "snapshot_id = df.snapshot_id[0]\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
record_countfile_counttotal_sizedata
062715{id={min=1, max=6, null_count=0, nan_count=nul...
\n", "
" ], "text/plain": [ " record_count file_count total_size \\\n", "0 6 2 715 \n", "\n", " data \n", "0 {id={min=1, max=6, null_count=0, nan_count=nul... " ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f'SELECT * FROM \"{glue_table}$partitions\"',\n", " database=glue_database,\n", " ctas_approach=False,\n", " unload_approach=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Time travel" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idname
01John
14Anne
22Lily
33Richard
45Jacob
56Leon
\n", "
" ], "text/plain": [ " id name\n", "0 1 John\n", "1 4 Anne\n", "2 2 Lily\n", "3 3 Richard\n", "4 5 Jacob\n", "5 6 Leon" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f\"SELECT * FROM {glue_table} FOR TIMESTAMP AS OF (current_timestamp - interval '5' second)\",\n", " database=glue_database,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Version travel\n" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idname
01John
12Lily
23Richard
\n", "
" ], "text/plain": [ " id name\n", "0 1 John\n", "1 2 Lily\n", "2 3 Richard" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.read_sql_query(\n", " sql=f\"SELECT * FROM {glue_table} FOR VERSION AS OF {snapshot_id}\",\n", " database=glue_database,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Optimize" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The `OPTIMIZE table REWRITE DATA` compaction action rewrites data files into a more optimized layout based on their size and number of associated delete files. For syntax and table property details, see [OPTIMIZE](https://docs.aws.amazon.com/athena/latest/ug/optimize-statement.html).\n", "\n" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'QueryExecutionId': '94666790-03ae-42d7-850a-fae99fa79a68',\n", " 'Query': 'OPTIMIZE iceberg_test REWRITE DATA USING BIN_PACK',\n", " 'StatementType': 'DDL',\n", " 'ResultConfiguration': {'OutputLocation': 's3://aws-athena-query-results-...-us-east-1/tables/94666790-03ae-42d7-850a-fae99fa79a68'},\n", " 'ResultReuseConfiguration': {'ResultReuseByAgeConfiguration': {'Enabled': False}},\n", " 'QueryExecutionContext': {'Database': 'aws_sdk_pandas'},\n", " 'Status': {'State': 'SUCCEEDED',\n", " 'SubmissionDateTime': datetime.datetime(2023, 3, 16, 10, 49, 42, 857000, tzinfo=tzlocal()),\n", " 'CompletionDateTime': datetime.datetime(2023, 3, 16, 10, 49, 45, 655000, tzinfo=tzlocal())},\n", " 'Statistics': {'EngineExecutionTimeInMillis': 2622,\n", " 'DataScannedInBytes': 220,\n", " 'DataManifestLocation': 's3://aws-athena-query-results-...-us-east-1/tables/94666790-03ae-42d7-850a-fae99fa79a68-manifest.csv',\n", " 'TotalExecutionTimeInMillis': 2798,\n", " 'QueryQueueTimeInMillis': 124,\n", " 'QueryPlanningTimeInMillis': 252,\n", " 'ServiceProcessingTimeInMillis': 52,\n", " 'ResultReuseInformation': {'ReusedPreviousResult': False}},\n", " 'WorkGroup': 'primary',\n", " 'EngineVersion': {'SelectedEngineVersion': 'Athena engine version 3',\n", " 'EffectiveEngineVersion': 'Athena engine version 3'}}" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.start_query_execution(\n", " sql=f\"OPTIMIZE {glue_table} REWRITE DATA USING BIN_PACK\",\n", " database=glue_database,\n", " wait=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Vacuum\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "`VACUUM` performs [snapshot expiration](https://iceberg.apache.org/docs/latest/spark-procedures/#expire_snapshots) and [orphan file removal](https://iceberg.apache.org/docs/latest/spark-procedures/#remove_orphan_files). These actions reduce metadata size and remove files not in the current table state that are also older than the retention period specified for the table. For syntax details, see [VACUUM](https://docs.aws.amazon.com/athena/latest/ug/vacuum-statement.html)." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "{'QueryExecutionId': '717a7de6-b873-49c7-b744-1b0b402f24c9',\n", " 'Query': 'VACUUM iceberg_test',\n", " 'StatementType': 'DML',\n", " 'ResultConfiguration': {'OutputLocation': 's3://aws-athena-query-results-...-us-east-1/717a7de6-b873-49c7-b744-1b0b402f24c9.csv'},\n", " 'ResultReuseConfiguration': {'ResultReuseByAgeConfiguration': {'Enabled': False}},\n", " 'QueryExecutionContext': {'Database': 'aws_sdk_pandas'},\n", " 'Status': {'State': 'SUCCEEDED',\n", " 'SubmissionDateTime': datetime.datetime(2023, 3, 16, 10, 50, 41, 14000, tzinfo=tzlocal()),\n", " 'CompletionDateTime': datetime.datetime(2023, 3, 16, 10, 50, 43, 441000, tzinfo=tzlocal())},\n", " 'Statistics': {'EngineExecutionTimeInMillis': 2229,\n", " 'DataScannedInBytes': 0,\n", " 'TotalExecutionTimeInMillis': 2427,\n", " 'QueryQueueTimeInMillis': 153,\n", " 'QueryPlanningTimeInMillis': 30,\n", " 'ServiceProcessingTimeInMillis': 45,\n", " 'ResultReuseInformation': {'ReusedPreviousResult': False}},\n", " 'WorkGroup': 'primary',\n", " 'EngineVersion': {'SelectedEngineVersion': 'Athena engine version 3',\n", " 'EffectiveEngineVersion': 'Athena engine version 3'}}" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wr.athena.start_query_execution(\n", " sql=f\"VACUUM {glue_table}\",\n", " database=glue_database,\n", " wait=True,\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.14", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.14" } }, "nbformat": 4, "nbformat_minor": 4 }