{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://github.com/aws/aws-sdk-pandas)\n",
"\n",
"# 21 - Global Configurations\n",
"\n",
"[awswrangler](https://github.com/aws/aws-sdk-pandas) has two ways to set global configurations that will override the regular default arguments configured in functions signatures.\n",
"\n",
"- **Environment variables**\n",
"- **wr.config**\n",
"\n",
"*P.S. Check the [function API doc](https://aws-sdk-pandas.readthedocs.io/en/3.2.1/api.html) to see if your function has some argument that can be configured through Global configurations.*\n",
"\n",
"*P.P.S. One exception to the above mentioned rules is the `botocore_config` property. It cannot be set through environment variables\n",
"but only via `wr.config`. It will be used as the `botocore.config.Config` for all underlying `boto3` calls.\n",
"The default config is `botocore.config.Config(retries={\"max_attempts\": 5}, connect_timeout=10, max_pool_connections=10)`.\n",
"If you only want to change the retry behavior, you can use the environment variables `AWS_MAX_ATTEMPTS` and `AWS_RETRY_MODE`.\n",
"(see [Boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables))*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment Variables"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"env: WR_DATABASE=default\n",
"env: WR_CTAS_APPROACH=False\n",
"env: WR_MAX_CACHE_SECONDS=900\n",
"env: WR_MAX_CACHE_QUERY_INSPECTIONS=500\n",
"env: WR_MAX_REMOTE_CACHE_ENTRIES=50\n",
"env: WR_MAX_LOCAL_CACHE_ENTRIES=100\n"
]
}
],
"source": [
"%env WR_DATABASE=default\n",
"%env WR_CTAS_APPROACH=False\n",
"%env WR_MAX_CACHE_SECONDS=900\n",
"%env WR_MAX_CACHE_QUERY_INSPECTIONS=500\n",
"%env WR_MAX_REMOTE_CACHE_ENTRIES=50\n",
"%env WR_MAX_LOCAL_CACHE_ENTRIES=100"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import awswrangler as wr\n",
"import botocore"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" foo | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" foo\n",
"0 1"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wr.athena.read_sql_query(\"SELECT 1 AS FOO\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Resetting"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Specific\n",
"wr.config.reset(\"database\")\n",
"# All\n",
"wr.config.reset()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## wr.config"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"wr.config.database = \"default\"\n",
"wr.config.ctas_approach = False\n",
"wr.config.max_cache_seconds = 900\n",
"wr.config.max_cache_query_inspections = 500\n",
"wr.config.max_remote_cache_entries = 50\n",
"wr.config.max_local_cache_entries = 100\n",
"# Set botocore.config.Config that will be used for all boto3 calls\n",
"wr.config.botocore_config = botocore.config.Config(\n",
" retries={\"max_attempts\": 10},\n",
" connect_timeout=20,\n",
" max_pool_connections=20\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" foo | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" foo\n",
"0 1"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wr.athena.read_sql_query(\"SELECT 1 AS FOO\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualizing"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" | \n",
" name | \n",
" Env. Variable | \n",
" type | \n",
" nullable | \n",
" enforced | \n",
" configured | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" catalog_id | \n",
" WR_CATALOG_ID | \n",
" <class 'str'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 1 | \n",
" concurrent_partitioning | \n",
" WR_CONCURRENT_PARTITIONING | \n",
" <class 'bool'> | \n",
" False | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 2 | \n",
" ctas_approach | \n",
" WR_CTAS_APPROACH | \n",
" <class 'bool'> | \n",
" False | \n",
" False | \n",
" True | \n",
" False | \n",
"
\n",
" \n",
" 3 | \n",
" database | \n",
" WR_DATABASE | \n",
" <class 'str'> | \n",
" True | \n",
" False | \n",
" True | \n",
" default | \n",
"
\n",
" \n",
" 4 | \n",
" max_cache_query_inspections | \n",
" WR_MAX_CACHE_QUERY_INSPECTIONS | \n",
" <class 'int'> | \n",
" False | \n",
" False | \n",
" True | \n",
" 500 | \n",
"
\n",
" \n",
" 5 | \n",
" max_cache_seconds | \n",
" WR_MAX_CACHE_SECONDS | \n",
" <class 'int'> | \n",
" False | \n",
" False | \n",
" True | \n",
" 900 | \n",
"
\n",
" \n",
" 6 | \n",
" max_remote_cache_entries | \n",
" WR_MAX_REMOTE_CACHE_ENTRIES | \n",
" <class 'int'> | \n",
" False | \n",
" False | \n",
" True | \n",
" 50 | \n",
"
\n",
" \n",
" 7 | \n",
" max_local_cache_entries | \n",
" WR_MAX_LOCAL_CACHE_ENTRIES | \n",
" <class 'int'> | \n",
" False | \n",
" False | \n",
" True | \n",
" 100 | \n",
"
\n",
" \n",
" 8 | \n",
" s3_block_size | \n",
" WR_S3_BLOCK_SIZE | \n",
" <class 'int'> | \n",
" False | \n",
" True | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 9 | \n",
" workgroup | \n",
" WR_WORKGROUP | \n",
" <class 'str'> | \n",
" False | \n",
" True | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 10 | \n",
" chunksize | \n",
" WR_CHUNKSIZE | \n",
" <class 'int'> | \n",
" False | \n",
" True | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 11 | \n",
" s3_endpoint_url | \n",
" WR_S3_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 12 | \n",
" athena_endpoint_url | \n",
" WR_ATHENA_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 13 | \n",
" sts_endpoint_url | \n",
" WR_STS_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 14 | \n",
" glue_endpoint_url | \n",
" WR_GLUE_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 15 | \n",
" redshift_endpoint_url | \n",
" WR_REDSHIFT_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 16 | \n",
" kms_endpoint_url | \n",
" WR_KMS_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 17 | \n",
" emr_endpoint_url | \n",
" WR_EMR_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 18 | \n",
" lakeformation_endpoint_url | \n",
" WR_LAKEFORMATION_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 19 | \n",
" dynamodb_endpoint_url | \n",
" WR_DYNAMODB_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 20 | \n",
" secretsmanager_endpoint_url | \n",
" WR_SECRETSMANAGER_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 21 | \n",
" timestream_endpoint_url | \n",
" WR_TIMESTREAM_ENDPOINT_URL | \n",
" <class 'str'> | \n",
" True | \n",
" True | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 22 | \n",
" botocore_config | \n",
" WR_BOTOCORE_CONFIG | \n",
" <class 'botocore.config.Config'> | \n",
" True | \n",
" False | \n",
" True | \n",
" <botocore.config.Config object at 0x14f313e50> | \n",
"
\n",
" \n",
" 23 | \n",
" verify | \n",
" WR_VERIFY | \n",
" <class 'str'> | \n",
" True | \n",
" False | \n",
" True | \n",
" None | \n",
"
\n",
" \n",
" 24 | \n",
" address | \n",
" WR_ADDRESS | \n",
" <class 'str'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 25 | \n",
" redis_password | \n",
" WR_REDIS_PASSWORD | \n",
" <class 'str'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 26 | \n",
" ignore_reinit_error | \n",
" WR_IGNORE_REINIT_ERROR | \n",
" <class 'bool'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 27 | \n",
" include_dashboard | \n",
" WR_INCLUDE_DASHBOARD | \n",
" <class 'bool'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 28 | \n",
" log_to_driver | \n",
" WR_LOG_TO_DRIVER | \n",
" <class 'bool'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 29 | \n",
" object_store_memory | \n",
" WR_OBJECT_STORE_MEMORY | \n",
" <class 'int'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 30 | \n",
" cpu_count | \n",
" WR_CPU_COUNT | \n",
" <class 'int'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
" 31 | \n",
" gpu_count | \n",
" WR_GPU_COUNT | \n",
" <class 'int'> | \n",
" True | \n",
" False | \n",
" False | \n",
" None | \n",
"
\n",
" \n",
"
"
],
"text/plain": [
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wr.config"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"vscode": {
"interpreter": {
"hash": "bd595004b250e5f4145a0d632609b0d8f97d1ccd278d58fafd6840c0467021f9"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}