AWS_S3 module
- AWS_S3.s3_object_exist(object_name, s3_bucket_name)
Determine if an object exists in a specific s3 bucket.
- Parameters:
object_name (str) – Can include s3 key
s3_bucket_name (str) – Name of the s3 bucket.
- Returns:
True if the object exists, False otherwise.
- Return type:
bool
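A minimal usage sketch, assuming the module is importable as AWS_S3; the object and bucket names are placeholders:

```python
import AWS_S3

# Check for the object before acting on it (names are placeholders).
if AWS_S3.s3_object_exist("reports/2023/summary.csv", "my-example-bucket"):
    print("Object exists")
```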
- AWS_S3.s3_all_contents(folder_name, s3_bucket_name)
Return a list of all objects in an s3 bucket.
- Parameters:
folder_name (str) – Folder or s3 key name; can be ‘’ when listing objects in the bucket’s root directory.
s3_bucket_name (str) – Name of the s3 bucket.
- Returns:
A list of all objects under s3_bucket_name/folder_name.
- Return type:
list[str]
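A short usage sketch with placeholder names:

```python
import AWS_S3

# List everything under the 'reports' key; pass '' to list the bucket root.
for obj in AWS_S3.s3_all_contents("reports", "my-example-bucket"):
    print(obj)
```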
- AWS_S3.get_data_s3(local_data_path, file_to_get, bucket_name)
Download an s3 object from ‘bucket_name’.
First checks whether the object exists; if it does not, a warning is issued and the function returns None.
- Parameters:
local_data_path (str) – Location on the local file system to copy the object into.
file_to_get (str) – The s3 object name to download; can include the s3 key if the bucket contains multiple folders.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
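A usage sketch; the local path, object name, and bucket below are placeholders:

```python
import AWS_S3

# Copy 'reports/summary.csv' into a local directory.
# If the object is missing, a warning is issued and nothing is downloaded.
AWS_S3.get_data_s3("/tmp/data", "reports/summary.csv", "my-example-bucket")
```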
- AWS_S3.send_data_s3(local_data_path, remote_filename, bucket_name)
A simple pass-through to boto3, included for completeness.
- Parameters:
local_data_path (str) – Local file, including its path, to be uploaded to the s3 bucket.
remote_filename (str) – The object name, including the s3 key, to upload to the s3 bucket.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
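A usage sketch with placeholder paths and names:

```python
import AWS_S3

# Upload a local file as 'reports/summary.csv' in the bucket.
AWS_S3.send_data_s3("/tmp/data/summary.csv", "reports/summary.csv", "my-example-bucket")
```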
- AWS_S3.stream_send_s3_csv(df, remote_filename, bucket_name)
Directly upload a pandas dataframe to a csv file in an s3 bucket.
A simple pass-through to awswrangler; this function will very likely be deprecated since it currently adds little value.
See awswrangler for many more options: https://aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.s3.to_csv.html#awswrangler.s3.to_csv
- Parameters:
df (pandas.DataFrame) – The dataframe to upload.
remote_filename (str) – Can include s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
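A usage sketch with a small placeholder dataframe:

```python
import pandas as pd

import AWS_S3

df = pd.DataFrame({"id": [1, 2], "value": [10.5, 20.3]})
# Write the dataframe straight to a csv object; no local file is created.
AWS_S3.stream_send_s3_csv(df, "reports/values.csv", "my-example-bucket")
```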
- AWS_S3.stream_get_s3_csv(remote_filename, bucket_name)
Directly download a csv file into a pandas dataframe.
A simple pass-through to awswrangler; this function will very likely be deprecated since it currently adds little value.
This function first checks whether the s3 object exists and returns an empty dataframe if it does not.
See awswrangler for many more options: https://aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.s3.read_csv.html#awswrangler.s3.read_csv
- Parameters:
remote_filename (str) – Remote file name, including the s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
pandas.DataFrame
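A usage sketch with placeholder names:

```python
import AWS_S3

# Returns an empty dataframe if the object does not exist.
df = AWS_S3.stream_get_s3_csv("reports/values.csv", "my-example-bucket")
print(df.head())
```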
- AWS_S3.delete_data_s3(remote_filename, bucket_name)
Delete an object from an s3 bucket.
- Parameters:
remote_filename (str) – Object to delete; can include the s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
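A usage sketch that pairs the delete with an existence check; names are placeholders:

```python
import AWS_S3

# Delete the object only if it is present.
if AWS_S3.s3_object_exist("reports/values.csv", "my-example-bucket"):
    AWS_S3.delete_data_s3("reports/values.csv", "my-example-bucket")
```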
- AWS_S3.create_s3_buckets(s3_bucket_name, region)
Create an s3 bucket using the SDK. All public access to the bucket will be shut off. Note that this function is expected to be replaced with CDK in the future.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket to create.
region (str) – AWS region where the bucket will be created
- Return type:
None
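A usage sketch with a placeholder bucket name and region:

```python
import AWS_S3

# Bucket names must be globally unique; region is a standard AWS region code.
AWS_S3.create_s3_buckets("my-example-bucket", "us-east-1")
```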
- AWS_S3.setup_s3_notifications(s3_bucket_name, queue_attributes, notification_name, prefix)
Set up event notifications for an s3 bucket. This uses the AWS SDK and is expected to be replaced with CDK in the future.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket.
queue_attributes (str) – Notifications will be sent to the SQS queue at this ARN.
notification_name (str) – Name for the notification configuration.
prefix (str) – Only watch for changes to objects with this prefix (i.e. objects in this s3 folder).
- Return type:
None
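A usage sketch; the bucket name, queue ARN, notification name, and prefix are all placeholders:

```python
import AWS_S3

# Route events for objects under 'incoming/' to an SQS queue.
AWS_S3.setup_s3_notifications(
    "my-example-bucket",
    "arn:aws:sqs:us-east-1:123456789012:my-example-queue",
    "incoming-file-notification",
    "incoming/",
)
```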
- AWS_S3.delete_s3_bucket(s3_bucket_name, account_number)
Delete the s3 bucket. Note that this uses the AWS SDK and is expected to be replaced with CDK in the future. Users are assumed to have proper IAM access; without it, this function will fail for security reasons.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket to delete.
account_number (str) – The account where the s3 bucket resides.
- Return type:
None
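A usage sketch with a placeholder bucket name and account number:

```python
import AWS_S3

# The caller's IAM identity must permit bucket deletion in this account.
AWS_S3.delete_s3_bucket("my-example-bucket", "123456789012")
```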