{ "cells": [ { "cell_type": "markdown", "source": [ "# Example App - Reinvent Cloud Drive\n", "\n", "There are many cloud drive products available in the market, such as Google Drive, Microsoft One Drive, Dropbox, and more. By using AWS S3 and ``s3pathlib``, it is possible to create your own cloud drive product.\n", "\n", "The primary technologies used are ``S3 Bucket versioning`` and ``Object Lifecycle Management``. [S3 Bucket versioning](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html) can preserve all historical versions of your file and prevent accidental deletions. [Object Lifecycle Management](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html) can also automatically expire older versions.\n", "\n", "First, let's assume that you have a customer ``John`` using your cloud drive product. He is trying to sync the ``/Users/John/cloud-drive/`` on the laptop to the ``s3://s3pathlib-versioning-enabled/cloud-drive/John/`` on S3." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 62, "outputs": [], "source": [ "from pathlib_mate import Path\n", "from s3pathlib import S3Path\n", "\n", "dir_root = Path(\"cloud-drive\")\n", "dir_root.mkdir_if_not_exists()\n", "s3dir_root = S3Path(\"s3://s3pathlib-versioning-enabled/cloud-drive/John/\")\n", "s3dir_root.mkdir(exist_ok=True)\n", "\n", "_ = dir_root.remove_if_exists()\n", "_ = s3dir_root.delete(is_hard_delete=True)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## Local File Change Handler\n", "\n", "User wants to sync their change on the local file system to the cloud drive. Now we defined a function to handle the local file change event. It will upload the file to S3 and create a new version, and return the S3 path." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 63, "outputs": [], "source": [ "def handle_local_file_change_event(path: Path):\n", " s3path = s3dir_root.joinpath(str(path.relative_to(dir_root)))\n", " s3path.write_bytes(path.read_bytes())\n", " return s3path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 64, "outputs": [], "source": [ "# user create the version of doc.txt\n", "path_v1 = dir_root.joinpath(\"my-documents/doc.txt\")\n", "path_v1.parent.mkdir_if_not_exists()\n", "path_v1.write_text(\"v1\")\n", "\n", "# invoke the handle\n", "s3path_v1 = handle_local_file_change_event(path_v1)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 65, "outputs": [ { "data": { "text/plain": "'v1'" }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# you can validate the content of file on S3\n", "s3path_v1.read_text()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 66, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "version_id = xMqg9Jp5l_PhcU4..gcYeGgJbVW6A4wB, content = 'v1'\n" ] } ], "source": [ "for p in s3path_v1.list_object_versions().all():\n", " print(f\"version_id = {p.version_id}, content = {p.read_text(version_id=p.version_id)!r}\")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## S3 File Change Handler\n", "\n", "User wants to sync their change on the cloud drive to the local file system too.\n" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 67, "outputs": [], "source": [ "def handle_s3_file_change_event(s3path: S3Path):\n", " path = dir_root.joinpath(s3path.relative_to(s3dir_root).key)\n", " path.parent.mkdir_if_not_exists()\n", " path.write_bytes(s3path.read_bytes(version_id=s3path.version_id))\n", " return path" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 68, "outputs": [], "source": [ "# put a new version of doc.txt on S3\n", "s3path_v2 = s3path_v1.write_text(\"v2\")\n", "\n", "# invoke the handle\n", "path_v2 = handle_s3_file_change_event(s3path_v2)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 69, "outputs": [ { "data": { "text/plain": "'v2'" }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# you can validate the content of file on your laptop\n", "path_v2.read_text()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## Delete and Recover\n", "\n", "With S3 bucket versioning, a deletion is actually a delete-marker on the latest version. If user accidentally delete a file on their local. The S3 still have the copy and marked it as \"deleted\". If user accidentally delete a file on their cloud drive, the S3 only put a delete-marker. The user can always use the \"Recent deleted\" feature to recover the file. On S3 side, it is just a ``list_object_versions`` API call to retrieve a historical version." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }