.. _resources_collections:

=======================
Resources & Collections
=======================

Overview
========

``Resource`` objects are the second level of ``boto``, forming a higher-level
interface than the :ref:`Connection <connections>` objects themselves.
``Resource`` objects (& by extension, ``Collection`` objects) form around
**conceptual** parts of the API, adding an object-oriented layer on top of the
typically "flat" HTTP APIs presented by `AWS`_.

The canonical examples of this are an `EC2`_ instance or an `S3`_ bucket.
Operations on these map nicely to conceptual things (``Instance.run()`` or
``Bucket.set_cors``), but the API exposes them as flat calls like
``run_instances()`` or ``put_bucket_cors``. The ``Resource`` layer brings this
conceptual model forward, letting you interact with the server-side resources
as though they were local Python objects.

Unlike the ``Connection`` objects, most ``Resource`` objects aren't able to
talk to the entire HTTP API. Instead, they are limited to the operations that
make sense (again, conceptually).

Where a ``Resource`` instance maps to a single resource, the ``Collection``
objects act over an entire set of a given ``Resource``. For example, you might
have a ``Bucket`` instance (that represents a single S3 bucket), but if you
wanted a list of all buckets under your account, you'd use
``BucketCollection`` to fetch them all, to create new resources or to get
specific ones. It may help to think of ``Collection`` objects as the things
you'd expose as ``@classmethod`` interfaces in traditional OO.

.. note::

    Much of the rest of this documentation will talk about ``Resource``
    objects. Virtually all the same things apply to ``Collection`` objects.
    The primary differences between the two are that ``Collection`` objects
    usually don't track instance data & that they typically return
    ``Resource`` objects rather than plain data, as ``Resource`` objects do.

.. _`AWS`: http://aws.amazon.com/
.. _`EC2`: http://aws.amazon.com/ec2/
.. _`S3`: http://aws.amazon.com/s3/


Where Are The Resources?
========================

Similar to ``Connection`` objects, ``boto`` generates *most* of the
``Resource`` & ``Collection`` classes at run-time. Some differences from
``Connection`` objects are:

* ``Resource/Collection`` classes aren't built by introspecting ``botocore``
  objects
* ``Resource/Collection`` objects have a ``Connection`` that they use to talk
  to the service
* there are concrete classes found within ``boto``, usually within
  ``boto..resources``, though they may not be what you'd expect

As with ``Connection`` objects, the ``Resource/Collection`` classes are
generated by **factories**. However, as previously alluded to, these factories
consume special **JSON** (imaginatively called ``ResourceJSON`` :P). See the
:ref:`resource_construction` section for more detail.

.. warning::

    What follows about importing ``Resource`` & ``Collection`` objects is
    subject to change before the release of v3.0.0. It is a work in progress.

    We intend for ``from boto.s3.resources import BucketCollection`` to be
    relatively final syntax, but the definition & the way the
    ``boto/s3/resources.py`` file itself works is subject to change.

The concrete classes that you'll import & use also might not be defined the
way you'd expect. Instead of typical Python class definitions like::

    # boto/s3/resources.py
    class BucketCollection(Collection):
        # ...

    class Bucket(Resource):
        # ...

We instead define them like::

    # boto/s3/resources.py
    import boto3

    BucketCollection = boto3.session.get_collection('s3', 'BucketCollection')
    Bucket = boto3.session.get_resource('s3', 'Bucket')

This generates those classes at import-time (when you execute
``from boto.s3.resources import Bucket``). The goal is to build each class as
late as possible & to allow for user customizations.

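
The import-time construction described above can be approximated in plain
Python with ``type()``. The sketch below is not ``boto``'s implementation:
``build_class``, ``SPEC``, the stand-in ``Resource`` base class & the layout of
the spec are all hypothetical, used only to show how a class can be produced
from data at run-time while still accepting a user-supplied base class.

```python
# Hypothetical stand-in for the data a factory would load.
# This is NOT the real ResourceJSON format.
SPEC = {
    "Bucket": {"identifiers": ["bucket"], "operations": ["get", "delete"]},
}


class Resource(object):
    """Minimal hypothetical base class; stores identifiers as instance data."""

    def __init__(self, **identifiers):
        self._data = dict(identifiers)


def _make_operation(op_name):
    # Each generated method only reports what it would send: the operation
    # name plus the instance data merged with any call-time parameters.
    def operation(self, **params):
        merged = dict(self._data)
        merged.update(params)
        return (op_name, merged)

    operation.__name__ = op_name
    return operation


def build_class(name, spec=SPEC, base_class=Resource):
    """Build a new class called ``name`` from ``spec`` at run-time."""
    details = spec[name]
    attrs = {op: _make_operation(op) for op in details["operations"]}
    attrs["identifiers"] = tuple(details["identifiers"])
    # ``type(name, bases, attrs)`` creates the class object dynamically.
    return type(name, (base_class,), attrs)


# Built late, when this module is imported.
Bucket = build_class("Bucket")
```

Because the base class is an argument, a caller can substitute their own
subclass before the class is built, which is the kind of customization that
late construction is meant to allow.
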
Usage
=====

``Resource`` & ``Collection`` objects try to standardize the methods that are
common across services, while the remaining methods vary as widely as each
service requires. Methods that have been standardized typically are:

* ``Collection``

  * ``each`` - Lists all ``Resources`` present
  * ``create`` - Creates a new ``Resource``

* ``Resource``

  * ``get`` - Fetches an individual ``Resource`` by identifier
  * ``update`` - Where present within the service, updates a given
    ``Resource``
  * ``delete`` - Deletes a given ``Resource``

Beyond those, the methods mapped onto each class vary service by service.
Where possible, the methods have been given shorter/friendlier/more idiomatic
names. Please refer to the tutorials/references for each service for examples.

.. warning::

    TBD


.. _resource_construction:

Construction
============

.. note::

    For the bulk of this, the construction of a ``Collection`` is practically
    identical. As a result, we'll only point out the places where it differs
    from the construction of a ``Resource``.

Construction of a ``Resource`` class takes place within a
``boto3.core.resources.ResourceFactory`` instance. Any instance (whether built
into ``boto`` or instantiated by a user) can successfully create a
``Resource`` subclass.

The only information needed is the ``service_name`` (see :ref:`service_names`
for valid names) & the desired ``Resource`` object name.

Usage is trivial::

    >>> import boto3
    >>> from boto3.core.resources import ResourceFactory

    # We'll use the default session, though you can just as easily provide
    # your own custom ``Session`` instance.
    >>> rf = ResourceFactory(session=boto3.session)

    # Now build the resource.
    >>> Bucket = rf.construct_for('s3', 'Bucket')

Building a ``Collection`` class looks nearly identical::

    >>> import boto3
    >>> from boto3.core.collections import CollectionFactory

    # We'll use the default session, though you can just as easily provide
    # your own custom ``Session`` instance.
    >>> cf = CollectionFactory(session=boto3.session)

    # Now build the collection.
    >>> BucketCollection = cf.construct_for('s3', 'BucketCollection')

However, to make things even easier, this functionality is also exposed
through the :ref:`Session ` object itself. The ``Session`` has its own
``ResourceFactory`` & ``CollectionFactory`` instances & can handle the details
for you. So typically, you'll actually do::

    >>> import boto3

    # Again, we're just using the default session, but you should feel free
    # to use your own ``Session`` instances.
    >>> Bucket = boto3.session.get_resource('s3', 'Bucket')
    >>> BucketCollection = boto3.session.get_collection('s3', 'BucketCollection')

However, for some services, additional modifications to behavior (or added
convenience functionality) are added by further subclasses within the
``boto..resources`` modules. What you get out of the factories should
therefore be considered a starting point & may not match the
``Resources/Collections`` you can import from the service modules.


Under The Hood
--------------

``ResourceFactory/CollectionFactory`` have many similarities to what is
described in the "Under The Hood" section of :ref:`connections`. The objects
used are roughly analogous:

* ``ConnectionFactory`` -> ``ResourceFactory``
* ``ConnectionDetails`` -> ``ResourceDetails``
* ``Connection`` -> ``Resource``

The build process is almost identical as well. The biggest difference is how
the service data & resource data are collected.

In ``ConnectionDetails``, the service data is introspected off ``botocore``
objects. For ``ResourceDetails/CollectionDetails``, however, the data about
how to structure the object is loaded from what's called ``ResourceJSON``.
This JSON ships with ``boto`` itself & can be found at
``boto/data/aws/resources/*.json``.

An intermediate (non-introspection) format was needed because of the
difficulty in establishing what conceptually fits into what objects.
This has to be explicitly mapped due to a lack of standardization or metadata
elsewhere.

JSON was chosen because it is a simple, standardized format that is broadly
understood & easy to share between tooling written in different languages. It
has the added benefit that the user can easily override the data with their
own version & change the behavior of the ``Resources/Collections`` without
having to subclass.

.. warning::

    An explicit ``ResourceJSON`` specification is still pending. For now,
    please refer to the existing files (& the source that loads them) as
    guidance on the format.

    This is very much sub-optimal, but should be resolved before the official
    release of v3.0.0.


Overriding/Extending
====================

``Resource`` & ``Collection`` objects are more customizable than the
equivalent :ref:`connections` objects. These customizations can be categorized
into:

* construction time alterations
* per-instance alterations


Construction Alterations
------------------------

You can specify your own base ``Resource`` class. This allows you to alter the
behavior of *every* resource that comes out of the factory.

This class can either be passed as a one-off value to
``ResourceFactory.construct_for`` or as an initialization parameter to the
``ResourceFactory``. For a one-off call (the typical use-case)::

    import json

    import boto3
    from boto3.core.resources import Resource


    class JSONS3Object(Resource):
        # A class that automatically handles getting/setting the data as
        # JSON.
        def get_content(self):
            raw_data = super(JSONS3Object, self).get_content()
            return json.loads(raw_data)

        def set_content(self, content):
            raw_data = json.dumps(content)
            return super(JSONS3Object, self).set_content(raw_data)


    # Construct the class.
    S3Object = boto3.session.get_resource(
        's3',
        'Bucket',
        base_class=JSONS3Object
    )
    assert issubclass(S3Object, JSONS3Object)

    # Using the new class.
    obj = S3Object(bucket='some-bucket', key='my-auto-json.json')

    # Give it a plain Python object.
    obj.set_content({
        'hello': 'world!',
    })
    # JSON is sent to S3.
    obj.update()

    # Now fetch it.
    got_it = S3Object(bucket='some-bucket', key='my-auto-json.json').get()
    # The JSON is automatically loaded.
    data = got_it.get_content()
    print(data['hello'])

If you want the factory to **always** use a certain subclass (for instance,
when the changes should be applied to every class that comes out), you should
pass it as an initialization parameter. For example::

    from boto3.core.resources import Resource, ResourceFactory


    class AlwaysForceMyInstanceDataToWinResource(Resource):
        def update_params(self, conn_method_name, params):
            # Overwrite whatever was passed into the method with the
            # instance data.
            params.update(self._data)
            return params


    rf = ResourceFactory(
        base_resource_class=AlwaysForceMyInstanceDataToWinResource
    )
    Bucket = rf.construct_for('s3', 'Bucket')

    # Now anytime a request is sent, the instance data will be used in-place
    # of parameters explicitly passed in.
    bukkit = Bucket(bucket='they-be-stealin-mah')
    # This parameter will be overwritten!
    bukkit.update(bucket='nope-nope-nope')
    # The bucket's name is still 'they-be-stealin-mah'

You can do a similar thing for the ``ResourceDetails`` class to be used. It
is also specified as part of the initialization of a ``ResourceFactory``. For
example::

    from boto3.core.resources import ResourceDetails, ResourceFactory


    class CustomIdentifierDetails(ResourceDetails):
        @property
        def identifier_var_name(self):
            # Because we like SHOUTCAPS.
            return 'ID'


    rf = ResourceFactory(details_class=CustomIdentifierDetails)
    Bucket = rf.construct_for('s3', 'Bucket')
    assert Bucket().get_identifiers() == [
        {
            'var_name': 'id',
            'api_name': 'ID',
        }
    ]


Per-Instance Alterations
------------------------

.. warning::

    TBD. Talk about ``full_update_params``, ``update_params``,
    ``update_params_FOO``, ``full_post_process``, ``post_process`` &
    ``post_process_FOO`` as extension mechanisms.

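
Until this section is written, the hook names above suggest a
generic-then-specific dispatch pattern, which can be sketched in plain Python.
Everything about the sketch is an assumption rather than ``boto``'s code: the
``HookedResource`` class, its ``call`` method, the ``_send`` stub, the hook
signatures & the order in which the hooks run are hypothetical. Only the hook
names ``update_params``, ``update_params_FOO``, ``post_process`` &
``post_process_FOO`` come from the text above.

```python
class HookedResource(object):
    """Hypothetical resource showing name-based hook dispatch."""

    def __init__(self, **data):
        self._data = dict(data)

    def update_params(self, conn_method_name, params):
        # Generic hook: runs for every method. Default is a no-op.
        return params

    def post_process(self, conn_method_name, result):
        # Generic hook: runs on every result. Default is a no-op.
        return result

    def _send(self, conn_method_name, params):
        # Stub standing in for a real service call; it echoes its input.
        return {"method": conn_method_name, "params": params}

    def call(self, conn_method_name, **params):
        # Generic hook first, then the method-specific hook if one exists.
        params = self.update_params(conn_method_name, params)
        specific = getattr(self, "update_params_" + conn_method_name, None)
        if specific is not None:
            params = specific(params)

        result = self._send(conn_method_name, params)

        # Same ordering for results: generic hook, then specific hook.
        result = self.post_process(conn_method_name, result)
        specific = getattr(self, "post_process_" + conn_method_name, None)
        if specific is not None:
            result = specific(result)
        return result


class Bucket(HookedResource):
    def update_params_delete(self, params):
        # Only affects ``delete``: fall back to the stored bucket name.
        params.setdefault("bucket", self._data["bucket"])
        return params

    def post_process_delete(self, result):
        # Only affects ``delete``: flag the result.
        result["deleted"] = True
        return result
```

With this layout, a subclass changes the behavior of a single operation by
defining one suffixed method, while overriding the unsuffixed hook affects
every operation on the instance.
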