AWS_S3 module
- AWS_S3.s3_object_exist(object_name, s3_bucket_name)
Determine if an object exists in a specific s3 bucket.
- Parameters:
object_name (str) – Can include s3 key
s3_bucket_name (str) – Name of the s3 bucket.
- Returns:
True if the object exists, False otherwise.
- Return type:
bool
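A minimal usage sketch, assuming the module is importable as AWS_S3; the object and bucket names are placeholders:

```python
import AWS_S3

# Check for the object before acting on it (names are placeholders).
if AWS_S3.s3_object_exist("reports/2023/summary.csv", "my-example-bucket"):
    print("Object exists")
```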
- AWS_S3.s3_all_contents(folder_name, s3_bucket_name)
Return a list of all objects in an s3 bucket.
- Parameters:
folder_name (str) – Folder or s3 key name; can be ‘’ when listing objects in the bucket’s root directory.
s3_bucket_name (str) – Name of the s3 bucket.
- Returns:
A list of all objects under s3_bucket_name/folder_name.
- Return type:
list[str]
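A short usage sketch with placeholder names:

```python
import AWS_S3

# List everything under the 'reports' key; pass '' to list the bucket root.
for obj in AWS_S3.s3_all_contents("reports", "my-example-bucket"):
    print(obj)
```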
- AWS_S3.get_data_s3(local_data_path, file_to_get, bucket_name)
Download an s3 object from ‘bucket_name’.
First checks whether the object exists; if it does not, a warning is issued and the function returns None.
- Parameters:
local_data_path (str) – Location on the local file system to copy the object into.
file_to_get (str) – The s3 object name to download; can include the s3 key if the bucket contains multiple folders.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
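A usage sketch; the local path, object name, and bucket below are placeholders:

```python
import AWS_S3

# Copy 'reports/summary.csv' into a local directory.
# If the object is missing, a warning is issued and nothing is downloaded.
AWS_S3.get_data_s3("/tmp/data", "reports/summary.csv", "my-example-bucket")
```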
- AWS_S3.send_data_s3(local_data_path, remote_filename, bucket_name)
A simple pass-through to boto3, included for completeness.
- Parameters:
local_data_path (str) – Local file, including its path, to be uploaded to the s3 bucket.
remote_filename (str) – The object name, including the s3 key, to upload to the s3 bucket.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
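A usage sketch with placeholder paths and names:

```python
import AWS_S3

# Upload a local file as 'reports/summary.csv' in the bucket.
AWS_S3.send_data_s3("/tmp/data/summary.csv", "reports/summary.csv", "my-example-bucket")
```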
- AWS_S3.stream_send_s3_csv(df, remote_filename, bucket_name)
Directly upload a pandas dataframe to a csv file in an s3 bucket.
A simple pass-through to awswrangler; this function will very likely be deprecated since it currently adds little value.
See awswrangler for many more options: https://aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.s3.to_csv.html#awswrangler.s3.to_csv
- Parameters:
df (pandas.DataFrame) – The dataframe to upload.
remote_filename (str) – Can include s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
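A usage sketch with a small placeholder dataframe:

```python
import pandas as pd

import AWS_S3

df = pd.DataFrame({"id": [1, 2], "value": [10.5, 20.3]})
# Write the dataframe straight to a csv object; no local file is created.
AWS_S3.stream_send_s3_csv(df, "reports/values.csv", "my-example-bucket")
```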
- AWS_S3.stream_get_s3_csv(remote_filename, bucket_name)
Directly download a csv file into a pandas dataframe.
A simple pass-through to awswrangler; this function will very likely be deprecated since it currently adds little value.
This function first checks whether the s3 object exists and returns an empty dataframe if it does not.
See awswrangler for many more options: https://aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.s3.read_csv.html#awswrangler.s3.read_csv
- Parameters:
remote_filename (str) – Remote file name, including the s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
pandas.DataFrame
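A usage sketch with placeholder names:

```python
import AWS_S3

# Returns an empty dataframe if the object does not exist.
df = AWS_S3.stream_get_s3_csv("reports/values.csv", "my-example-bucket")
print(df.head())
```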
- AWS_S3.delete_data_s3(remote_filename, bucket_name)
Delete an object from an s3 bucket.
- Parameters:
remote_filename (str) – Object to delete; can include the s3 key.
bucket_name (str) – Name of the s3 bucket.
- Return type:
None
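A usage sketch that pairs the delete with an existence check; names are placeholders:

```python
import AWS_S3

# Delete the object only if it is present.
if AWS_S3.s3_object_exist("reports/values.csv", "my-example-bucket"):
    AWS_S3.delete_data_s3("reports/values.csv", "my-example-bucket")
```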
- AWS_S3.create_s3_buckets(s3_bucket_name, region)
Create an s3 bucket using the SDK. All public access to the bucket will be shut off. Note that this function is expected to be replaced with CDK in the future.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket to create.
region (str) – AWS region where the bucket will be created
- Return type:
None
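A usage sketch with a placeholder bucket name and region:

```python
import AWS_S3

# Bucket names must be globally unique; region is a standard AWS region code.
AWS_S3.create_s3_buckets("my-example-bucket", "us-east-1")
```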
- AWS_S3.setup_s3_notifications(s3_bucket_name, queue_attributes, notification_name, prefix)
Set up event notifications for an s3 bucket. This uses the AWS SDK and is expected to be replaced with CDK in the future.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket.
queue_attributes (str) – Notifications will be sent to the SQS queue at this ARN.
notification_name (str) – Name for the notification configuration.
prefix (str) – Only watch for changes to objects with this prefix (i.e. objects in this s3 folder).
- Return type:
None
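A usage sketch; the bucket name, queue ARN, notification name, and prefix are all placeholders:

```python
import AWS_S3

# Route events for objects under 'incoming/' to an SQS queue.
AWS_S3.setup_s3_notifications(
    "my-example-bucket",
    "arn:aws:sqs:us-east-1:123456789012:my-example-queue",
    "incoming-file-notification",
    "incoming/",
)
```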
- AWS_S3.delete_s3_bucket(s3_bucket_name, account_number)
Delete the s3 bucket. Note that this uses the AWS SDK and is expected to be replaced with CDK in the future. Users are assumed to have proper IAM access; without it, this function will fail for security reasons.
- Parameters:
s3_bucket_name (str) – Name of the s3 bucket to delete.
account_number (str) – The account where the s3 bucket resides.
- Return type:
None
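A usage sketch with a placeholder bucket name and account number:

```python
import AWS_S3

# The caller's IAM identity must permit bucket deletion in this account.
AWS_S3.delete_s3_bucket("my-example-bucket", "123456789012")
```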