{ "cells": [ { "cell_type": "markdown", "source": [ "# Pure S3 Path Manipulation" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "raw_mimetype": "text/restructuredtext", "source": [ ".. article-info::\n", " :avatar-outline: muted\n", " :author: Sanhe\n", " :date: Apr 20, 2023\n", " :read-time: 15 min read\n", " :class-container: sd-p-2 sd-outline-muted sd-rounded-1" ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "markdown", "source": [ "## What is Pure S3 Path\n", "\n", "A Pure S3 Path is a Python object that represents an AWS S3 bucket, object, or folder. However, it's important to note that a Pure S3 Path object does not make any calls to the AWS API, nor does it imply the existence of the corresponding S3 object. Rather, it's a lightweight abstraction that allows you to work with S3 paths in a Pythonic, object-oriented manner without incurring any network overhead." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "S3Path('s3://bucket/folder/file.txt')\n" ] } ], "source": [ "from s3pathlib import S3Path\n", "\n", "s3path = S3Path(\"s3://bucket/folder/file.txt\")\n", "print(s3path)" ] }, { "cell_type": "markdown", "source": [ "## Construct an S3 Path object in Python" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "source": [ "With s3pathlib, there are :class:`numerous ways to create ` an :class:`~s3pathlib.core.s3path.S3Path` object." 
], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "markdown", "source": [ "### From bucket, and key parts\n", "\n", "In a file system, you typically use a file path like ``C:\\\\Users\\username\\file.txt`` on Windows or ``/Users/username/file.txt`` on a POSIX system. It's similarly intuitive to construct an S3 Path from a string." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 2, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# construct from bucket, key parts\n", "s3path = S3Path(\"bucket\", \"folder\", \"file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 3, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# construct from full path also works\n", "s3path = S3Path(\"bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "S3 uses ``/`` as a [delimiter to organize and browse your keys hierarchically](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html). With ``s3pathlib``, the delimiter is handled intelligently." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 4, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"bucket\", \"/folder/\", \"/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### From S3 URI\n", "\n", "An [S3 URI](https://repost.aws/questions/QUFXlwQxxJQQyg9PMn2b6nTg/questions/QUFXlwQxxJQQyg9PMn2b6nTg/what-is-s3-uri-in-simple-storage-service?) is the unique resource identifier within the context of the S3 protocol. S3 URIs follow this naming convention: ``s3://bucket-name/key-name``. You can create an S3 Path from an S3 URI." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 5, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"s3://bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "You can also use the :meth:`~s3pathlib.core.uri.UriAPIMixin.from_s3_uri` factory method to create an :class:`~s3pathlib.core.s3path.S3Path` object from a URI." 
], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 6, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path.from_s3_uri(\"s3://bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### From S3 ARN\n", "\n", "An [S3 ARN](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-arn-format.html) is the Amazon Resource Name of an S3 resource. S3 ARNs follow this naming convention: ``arn:aws:s3:::bucket_name/key_name``. You can create an S3 Path from an S3 ARN." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 7, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"arn:aws:s3:::bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "You can also use the :meth:`~s3pathlib.core.uri.UriAPIMixin.from_s3_arn` factory method to create an :class:`~s3pathlib.core.s3path.S3Path` object from an ARN." 
], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 8, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path.from_s3_arn(\"arn:aws:s3:::bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## S3 Path Types\n", "\n", "S3 Path is a logical concept that can represent different types of AWS S3 concepts. Here is the list of S3 Path types:\n", "\n", "1. 📜 **Classic S3 object**: represents an S3 object, such as ``s3://bucket/folder/file.txt``.\n", "2. 📁 **Logical S3 directory**: represents an S3 directory, such as ``s3://bucket/folder/``.\n", "3. 🪣 **S3 bucket**: represents an S3 bucket, such as ``s3://bucket/``\n", "4. **Void Path**: denotes the absence of any bucket or key, essentially representing a blank slate, no bucket, no key, no nothing.\n", "5. **Relative Path**: represents a path relative to another S3 Path. For example, the relative path from ``s3://bucket/folder/file.txt`` to ``s3://bucket/`` is simply ``folder/file.txt``. A relative path can be joined with another S3 Path to create a new S3 Path. Importantly, any concrete path joined with a void path will result in the original concrete path.\n", "6. **Concrete Path**: represents an S3 Path that refers to a concrete object in the S3 storage system. This includes classic S3 object paths, logical S3 directory paths, and S3 bucket paths. Any concrete path joined with a relative path will result in another concrete path.\n", "\n", "### Classic S3 object\n", "\n", "Similar to a file on your local laptop, an [S3 object](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingObjects.html) stores your data. 
The S3 Path itself is just a pointer, so the object doesn't have to exist in S3." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 9, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"s3://bucket/folder/file.txt\")\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "There are some \"is XYZ test\" methods that can tell you whether the S3 Path object is a \"file\", \"directory\", \"bucket\", \"void path\", or \"relative path\".\n", "\n", "- :meth:`~s3pathlib.core.is_test.IsTestAPIMixin.is_dir`:\n", "- :meth:`~s3pathlib.core.is_test.IsTestAPIMixin.is_file`:\n", "- :meth:`~s3pathlib.core.is_test.IsTestAPIMixin.is_bucket`:\n", "- :meth:`~s3pathlib.core.is_test.IsTestAPIMixin.is_void`:\n", "- :meth:`~s3pathlib.core.relative.RelativePathAPIMixin.is_relpath`:" ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 10, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_file()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 11, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_dir()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 12, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_bucket()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { 
"cell_type": "code", "execution_count": 13, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_void()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 14, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_relpath()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Logical S3 Directory\n", "\n", "Since [AWS S3 is an object storage system](https://aws.amazon.com/s3/), not a file system, directories are only a logical concept in AWS S3. AWS uses ``/`` as the path delimiter in S3 keys. There are two types of directories in AWS S3:\n", "\n", "- Hard directory: When you create a folder in the S3 console, it creates a special object without any content (an empty string) with the ``/`` character at the end of the key. You can see the folder as an object in the [list_objects](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/list_objects_v2.html) API response.\n", "- Soft directory: This type of directory does not actually exist; it is [a virtual concept used to help organize your objects in a folder](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html). For example, if you have an S3 object like ``s3://bucket/folder/file.txt``, then the ``s3://bucket/folder/`` path is a soft folder. Although you can see it in the S3 console, it does not actually exist.\n", "\n", "You can create an S3 directory from a string, URI, or ARN." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 15, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir = S3Path(\"bucket\", \"folder/\")\n", "s3dir" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 16, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir = S3Path(\"s3://bucket/folder/\")\n", "s3dir" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 17, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir = S3Path(\"arn:aws:s3:::bucket/folder/\")\n", "s3dir" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "If you are not sure whether a path already ends with a trailing slash ``/``, use the :meth:`~s3pathlib.core.mutate.MutateAPIMixin.to_dir` method to ensure that it refers to a directory." ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 18, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir = S3Path(\"bucket\", \"folder\").to_dir()\n", "s3dir" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "The \"is XYZ test\" methods work on an S3 directory too." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 19, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir.is_dir()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 20, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir.is_file()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 21, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir.is_bucket()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 22, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir.is_void()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 23, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3dir.is_relpath()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### S3 Bucket\n", "\n", "An S3 bucket is a special type of directory that can be thought of as a \"root\" directory without a key. In other words, it represents the top-level directory of the bucket, and it is both a bucket and a directory in its own right." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 24, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/')" }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3bkt = S3Path(\"bucket\")\n", "s3bkt" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 25, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3bkt.is_bucket()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 26, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3bkt.is_dir()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 27, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3bkt.is_file()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "You can use the :meth:`~s3pathlib.core.attribute.AttributeAPIMixin.root` property to get the S3 bucket of any S3 object or directory." 
], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 28, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/')" }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3bkt = S3Path(\"bucket/folder/file.txt\").root\n", "s3bkt" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Void Path\n", "\n", "While Void path should not be used in your application, it can serve as an indicator that something is wrong if you accidentally attempt to use a Void path to perform an S3 API operation." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 29, "outputs": [ { "data": { "text/plain": "S3VoidPath()" }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path()\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 30, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_void()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 31, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_file()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 32, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_dir()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 33, "outputs": [ { "data": { "text/plain": "False" 
}, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_bucket()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 34, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.is_relpath()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Relative Path\n", "\n", "Relative paths are very useful for S3 Path calculations. For example, if you want to move all objects in folder ``A`` to another folder ``B``, you can use the relative path from each object ``C`` to ``A`` to calculate the target location in ``B``. Specifically, the target location for each object is the folder path ``B`` joined with the relative path from ``C`` to ``A``. In other words, the formula for the target path is: ``Target = B + (C - A)``." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "source": [ "Although you can construct a relative path manually, I don't recommend it. Use the path calculation method :meth:`~s3pathlib.core.relative.RelativePathAPIMixin.relative_to` to create it instead." 
], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 35, "outputs": [ { "data": { "text/plain": "S3RelPath('file.txt')" }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The correct way\n", "s3relpath = S3Path(\"s3://bucket/folder/file.txt\").relative_to(S3Path(\"s3://bucket/folder\"))\n", "s3relpath" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 36, "outputs": [ { "data": { "text/plain": "S3RelPath('file.txt')" }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The manual way (NOT RECOMMENDED)\n", "s3relpath = S3Path.make_relpath(\"file.txt\")\n", "s3relpath" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 37, "outputs": [ { "data": { "text/plain": "S3Path('s3://another-bucket/another-folder/file.txt')" }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"s3://another-bucket/another-folder\").to_dir().joinpath(s3relpath)\n", "s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### S3 Path Variable Naming Convention\n", "\n", "I recommend the following variable naming convention for different types of S3 Path. 
So when you read the code, you can easily tell what to expect.\n", "\n", "- ``s3path_xyz``: Classic S3 object\n", "- ``s3dir_xyz``: Logical S3 directory\n", "- ``s3bkt_xyz``: S3 bucket\n", "- ``s3void_xyz``: Void Path\n", "- ``s3relpath_xyz``: Relative Path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## S3 Path Attributes" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "source": [ "An S3 Path object has many useful attributes (technically, they are `property method `_).\n", "\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.bucket`: Return the bucket name as a string.\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.key`: Return the S3 key as a string.\n", "- :attr:`~s3pathlib.core.base.BaseS3Path.parts`: Provides sequence-like access to the components in the filesystem path.\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.uri`: Return the AWS S3 URI.\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.arn`: Return an AWS S3 Resource ARN.\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.console_url`: Return a URL for inspecting the object or directory details in the AWS Console.\n", "- :attr:`~s3pathlib.core.uri.UriAPIMixin.us_gov_cloud_console_url`: Return a Gov Cloud URL for inspecting the object or directory details in the AWS Console.\n" ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 38, "outputs": [], "source": [ "# create an instance\n", "s3path = S3Path(\"bucket\", \"folder\", \"file.txt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 39, "outputs": [ { "data": { "text/plain": "'bucket'" }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.bucket" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": 
"code", "execution_count": 40, "outputs": [ { "data": { "text/plain": "'folder/file.txt'" }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.key" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 41, "outputs": [ { "data": { "text/plain": "['folder', 'file.txt']" }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.parts" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "The :class:`~s3pathlib.core.s3path.S3Path` class is both immutable and hashable. These attributes don't require any AWS boto3 API calls and are generally available. Because S3Path objects are immutable, you cannot change the value of these attributes once they have been created." ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 42, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "can't set attribute S3Path.bucket\n" ] } ], "source": [ "try:\n", " s3path.bucket = \"new-bucket\"\n", "except Exception as e:\n", " print(e)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 43, "outputs": [ { "data": { "text/plain": "'s3://bucket/folder/file.txt'" }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.uri" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 44, "outputs": [ { "data": { "text/plain": "'arn:aws:s3:::bucket/folder/file.txt'" }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.arn" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 45, "outputs": [ { "data": { 
"text/plain": "'https://console.aws.amazon.com/s3/object/bucket?prefix=folder/file.txt'" }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.console_url" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 46, "outputs": [ { "data": { "text/plain": "'https://console.amazonaws-us-gov.com/s3/object/bucket?prefix=folder/file.txt'" }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.us_gov_cloud_console_url" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "raw", "source": [ "Logically, a :class:`~s3pathlib.core.s3path.S3Path` is also a file system like object. So it should have those **file system concepts** too:\n", "\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.basename`: the file name with extension.\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.fname`: file name without file extension.\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.ext`: file extension, if available\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.dirname`: the basename of the parent directory\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.abspath`: the absolute path is the full path from the root drive. You can think of S3 bucket as the root drive.\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.parent`: the parent directory S3 Path\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.dirpath`: the absolute path of the parent directory. 
It is equal to ``s3path.parent.abspath``" ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 47, "outputs": [ { "data": { "text/plain": "'file.txt'" }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.basename" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 48, "outputs": [ { "data": { "text/plain": "'file'" }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.fname" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 49, "outputs": [ { "data": { "text/plain": "'.txt'" }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.ext" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 50, "outputs": [ { "data": { "text/plain": "'folder'" }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.dirname" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 51, "outputs": [ { "data": { "text/plain": "'/folder/file.txt'" }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.abspath" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 52, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.parent" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 53, "outputs": [ { "data": { "text/plain": "'/folder/'" }, "execution_count": 53, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "s3path.dirpath" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## S3 Path Methods\n", "\n", "### Comparison\n", "\n", "Because every ``S3Path`` object corresponds to an S3 URI (except for relative paths), it's often useful to compare these URIs. Therefore, the comparison operator is implemented for ``S3Path``, allowing you to compare one ``S3Path`` to another." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 54, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket/file.txt\") == S3Path(\"bucket/file.txt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 55, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket\") == S3Path(\"bucket\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 56, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket1\") == S3Path(\"bucket2\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 57, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket1\") < S3Path(\"bucket2\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 58, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket1\") <= S3Path(\"bucket2\")" ], 
"metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 59, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# right one is a prefix of the left one\n", "S3Path(\"bucket/a/1.txt\") > S3Path(\"bucket/a/\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 60, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket/a/1.txt\") < S3Path(\"bucket/a/2.txt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Hash" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "``S3Path`` is hashable. You can use [set](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset) data structure to deduplicate them." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 61, "outputs": [], "source": [ "p1 = S3Path(\"bucket\", \"1.txt\")\n", "p2 = S3Path(\"bucket\", \"2.txt\")\n", "p3 = S3Path(\"bucket\", \"3.txt\")\n", "set1 = {p1, p2}\n", "set2 = {p2, p3}" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 62, "outputs": [ { "data": { "text/plain": "{S3Path('s3://bucket/1.txt'),\n S3Path('s3://bucket/2.txt'),\n S3Path('s3://bucket/3.txt')}" }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# union\n", "set1.union(set2)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 63, "outputs": [ { "data": { "text/plain": "{S3Path('s3://bucket/2.txt')}" }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# intersection\n", "set1.intersection(set2)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 64, "outputs": [ { "data": { "text/plain": "{S3Path('s3://bucket/1.txt')}" }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# difference\n", "set1.difference(set2)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Mutate the immutable S3Path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "source": [ "It's common to modify existing S3Path objects. However, since S3Path is immutable by design, it cannot be directly edited. 
Nonetheless, there are numerous utility methods available that enable you to manipulate S3Path objects in various ways.\n", "\n", "- :meth:`~s3pathlib.core.mutate.MutateAPIMixin.copy`: Create a copy of an ``S3Path`` object that is logically equal to the original but is a distinct object in memory. The cached data is also cleared.\n", "- :meth:`~s3pathlib.core.mutate.MutateAPIMixin.change`: Create a new ``S3Path`` by replacing part of the attributes.\n", "- :meth:`~s3pathlib.core.joinpath.JoinPathAPIMixin.joinpath`: Join with other path parts or relative paths to form another ``S3Path``.\n", "- :attr:`~s3pathlib.core.attribute.AttributeAPIMixin.parent`: Return the parent directory as an ``S3Path``." ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "markdown", "source": [ "**Copy**" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 65, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path1 = S3Path(\"bucket\", \"folder\", \"file.txt\")\n", "s3path2 = s3path1.copy()\n", "s3path2" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 66, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path1 == s3path2" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 67, "outputs": [ { "data": { "text/plain": "False" }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path1 is s3path2" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "**Change**" ], "metadata": { "collapsed": false, 
"pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 68, "outputs": [], "source": [ "s3path = S3Path(\"bkt\", \"a\", \"b\", \"c.jpg\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 69, "outputs": [ { "data": { "text/plain": "S3Path('s3://new-bkt/a/b/c.jpg')" }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the bucket\n", "s3path.change(new_bucket=\"new-bkt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 70, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/x/y/z.png')" }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the absolute path\n", "s3path.change(new_abspath=\"x/y/z.png\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 71, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/b/c.png')" }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the file extention\n", "s3path.change(new_ext=\".png\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 72, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/b/ddd.jpg')" }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the file name\n", "s3path.change(new_fname=\"ddd\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 73, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/b/ddd.png')" }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the base name (file name + file extension)\n", "s3path_new = s3path.change(new_basename=\"ddd.png\")\n", "s3path_new" ], 
"metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 74, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path_new.is_file()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 75, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/b/ddd/')" }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the base name, but this time it becomes a folder\n", "s3path_new = s3path.change(new_basename=\"ddd/\")\n", "s3path_new" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 76, "outputs": [ { "data": { "text/plain": "True" }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path_new.is_dir()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 77, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/ddd/c.jpg')" }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the dir name\n", "s3path.change(new_dirname=\"ddd/\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 78, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/a/ddd/c.jpg')" }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# only change the dir name\n", "s3path.change(new_dirname=\"ddd\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 79, "outputs": [ { "data": { "text/plain": "S3Path('s3://bkt/xxx/yyy/c.jpg')" }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path.change(new_dirpath=\"xxx/yyy/\")" 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "**Join**\n", "\n", "``S3Path.joinpath`` accepts one or more path parts, given as strings or relative paths, and returns a new ``S3Path``." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 80, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/file.txt')" }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path1 = S3Path(\"bucket\", \"folder\", \"subfolder\", \"file.txt\")\n", "s3path1" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 81, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/')" }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path2 = s3path1.parent\n", "s3path2" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 82, "outputs": [ { "data": { "text/plain": "S3RelPath('file.txt')" }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "relpath1 = s3path1.relative_to(s3path2)\n", "relpath1" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 83, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/file.txt')" }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# join concrete path with a relative path\n", "s3path2.joinpath(relpath1)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 84, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/')" }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path3 = s3path2.parent\n", "s3path3" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { 
"cell_type": "code", "execution_count": 85, "outputs": [ { "data": { "text/plain": "S3RelPath('subfolder/')" }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "relpath2 = s3path2.relative_to(s3path3)\n", "relpath2" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 86, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/file.txt')" }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path3.joinpath(relpath2, relpath1)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 87, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/file.txt')" }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path3.joinpath(\"subfolder\", \"file.txt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 88, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/folder/subfolder/file.txt')" }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# it's OK if you mess up with the \"/\"\n", "s3path3.joinpath(\"/subfolder/\", \"/file.txt\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "The ``/`` operator provide a syntax sugar for ``joinpath`` method" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 89, "outputs": [ { "data": { "text/plain": "S3Path('s3://bucket/file.txt')" }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path = S3Path(\"bucket\")\n", "s3path / \"file.txt\"" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 90, "outputs": [ { "data": 
{ "text/plain": "S3Path('s3://bucket/folder/file.txt')" }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3path / \"folder\" / \"file.txt\"" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Calculate Relative Path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "raw", "source": [ "The :meth:`~s3pathlib.core.relative.RelativePathAPIMixin.relative_to` method calculates the relative path between two paths. The syntax is ``s3path_from.relative_to(s3path_to)``. Note that ``s3path_from`` must start with ``s3path_to``, that is, ``s3path_to`` must be the same path as ``s3path_from`` or one of its parents. Otherwise the method raises an error." ], "metadata": { "collapsed": false, "raw_mimetype": "text/restructuredtext", "pycharm": { "name": "#%% raw\n" } } }, { "cell_type": "code", "execution_count": 91, "outputs": [ { "data": { "text/plain": "['b', 'c']" }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket\", \"a/b/c\").relative_to(S3Path(\"bucket\", \"a\")).parts" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 92, "outputs": [ { "data": { "text/plain": "[]" }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S3Path(\"bucket\", \"a\").relative_to(S3Path(\"bucket\", \"a\")).parts" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 93, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s3://bucket/a does not start with s3://bucket/a/b/c\n" ] } ], "source": [ "# this won't work\n", "try:\n", " S3Path(\"bucket\", \"a\").relative_to(S3Path(\"bucket\", \"a/b/c\")).parts\n", "except Exception as e:\n", " print(e)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": 
"markdown", "source": [ "The ``-`` operator override provide a syntax sugar for ``relative_to`` method." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 94, "outputs": [ { "data": { "text/plain": "['b', 'c']" }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(S3Path(\"bucket\", \"a/b/c\") - S3Path(\"bucket\", \"a\")).parts" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## What's Next\n", "\n", "Now that we have established the basics of working with ``s3pathlib``, let's explore how to use it to interact with the AWS S3 API." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 94, "outputs": [], "source": [], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.11" } }, "nbformat": 4, "nbformat_minor": 4 }