Module: fsspec/implementations/cached.py

Imports: time, pickle, logging, os, hashlib, tempfile, inspect;
AbstractFileSystem and filesystem from fsspec; AbstractBufferedFile from
fsspec.spec; MMapCache and BaseCache from fsspec.core.
Logger: logging.getLogger("fsspec")


class CachingFileSystem

    Locally caching filesystem, layer over any other FS

    This class implements chunk-wise local storage of remote files, for quick
    access after the initial download. The files are stored in a given
    directory with random hashes for the filenames. If no directory is given,
    a temporary one is used, which should be cleaned up by the OS after the
    process ends. The files themselves are sparse (as implemented in
    MMapCache), so only the data which is accessed takes up space.

    Restrictions:

    - the block-size must be the same for each access of a given file, unless
      all blocks of the file have already been read
    - caching can only be applied to file-systems which produce files
      derived from fsspec.spec.AbstractBufferedFile; LocalFileSystem is also
      allowed, for testing

    protocol = ("blockcache", "cached")

    __init__(self, target_protocol, cache_storage, cache_check, check_files,
             expiry_time, target_options, **kwargs)

    Parameters
    ----------
    target_protocol: str
        Target filesystem protocol
    cache_storage: str or list(str)
        Location to store files. If "TMP", this is a temporary directory,
        and will be cleaned up by the OS when this process ends (or later).
        If a list, each location will be tried in the order given, but
        only the last will be considered writable.
    cache_check: int
        Number of seconds between reload of cache metadata
    check_files: bool
        Whether to explicitly see if the UID of the remote file matches
        the stored one before using. Warning: some file systems such as
        HTTP cannot reliably give a unique hash of the contents of some
        path, so be sure to set this option to False.
    expiry_time: int
        The time in seconds after which a local copy is considered useless.
        Set to falsy to prevent expiry. The default is equivalent to one
        week.
    target_options: dict or None
        Passed to the instantiation of the FS, if fs is None.

    __reduce_ex__(self, *_)

    load_cache(self)
        Read set of stored blocks from file

    save_cache(self)
        Save set of stored blocks to file

    _check_cache(self)
        Reload caches if time elapsed or any disappeared

    _check_file(self, path)
        Is path in cache and still valid

    _open(self, path, mode="rb", block_size=None, autocommit=True,
          cache_options=None, **kwargs)
        Wrap the target _open

        If the whole file exists in the cache, just open it locally and
        return that.

        Otherwise, open the file on the target FS, and make it have a mmap
        cache pointing to the location which we determine, in our cache.
        The ``blocks`` instance is shared, so as the mmap cache instance
        updates, so does the entry in our ``cached_files`` attribute.
        We monkey-patch this file, so that when it closes, we call
        ``close_and_update`` to save the state of the blocks.

        Debug log messages:
        "Opening local copy of %s"
        "Opening partially cached copy of %s"
        "Creating local sparse file for %s"

        ValueError message:
        "Cached file must be reopened with same blocksize as original
        (old: %i, new %i)"

    close_and_update(self, f, close)
        Called when a file is closing, so store the set of blocks

    __getattribute__(self, item)


class WholeFileCacheFileSystem

    Caches whole remote files on first access

    This class is intended as a layer over any other file system, and
    will make a local copy of each file accessed, so that all subsequent
    reads are local. This is similar to ``CachingFileSystem``, but without
    the block-wise functionality and so can work even when sparse files
    are not allowed. See its docstring for definition of the init
    arguments.

    The class still needs access to the remote store for listing files,
    and may refresh cached files.
    protocol = "filecache"

    _open(self, path, mode="rb", **kwargs)

        Debug log messages:
        "Opening local copy of %s"
        "Copying %s to local cache"

        Error message:
        "Attempt to open partially cached file %s as a wholly cached file"
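
The whole-file caching behaviour described in the docstrings above can be illustrated with a minimal standard-library sketch. It is not the fsspec implementation. The helper names `cache_filename`, `is_valid` and `fetch_whole_file`, the `read_remote` callback, and the exact layout of the metadata dictionary are assumptions made for illustration. Only the ideas come from the docstrings: hashed local file names, a per-path metadata entry, and an expiry time where a falsy value disables expiry.

```python
import hashlib
import os
import tempfile
import time

ONE_WEEK = 7 * 24 * 3600  # the documented default expiry, in seconds


def cache_filename(storage, path):
    """Return the local file name used for ``path`` inside ``storage``."""
    # Assumption: the name is the SHA-256 hex digest of the remote path.
    return os.path.join(storage, hashlib.sha256(path.encode()).hexdigest())


def is_valid(detail, expiry_time, now=None):
    """Return True if a metadata entry has not yet expired."""
    now = time.time() if now is None else now
    # A falsy expiry_time means entries never expire.
    if expiry_time and detail["time"] + expiry_time < now:
        return False
    return True


def fetch_whole_file(storage, path, read_remote, cached_files,
                     expiry_time=ONE_WEEK):
    """Return the bytes of ``path``, reading the remote at most once.

    ``read_remote`` is a hypothetical callable standing in for the
    target filesystem; ``cached_files`` maps path -> metadata dict.
    """
    detail = cached_files.get(path)
    if (detail is not None and is_valid(detail, expiry_time)
            and os.path.exists(detail["fn"])):
        # Cache hit: serve the local copy.
        with open(detail["fn"], "rb") as f:
            return f.read()
    # Cache miss or expired entry: copy the remote file locally.
    data = read_remote(path)
    fn = cache_filename(storage, path)
    with open(fn, "wb") as f:
        f.write(data)
    # blocks=True marks the file as wholly cached.
    cached_files[path] = {"fn": fn, "blocks": True, "time": time.time()}
    return data
```

Calling `fetch_whole_file(tempfile.mkdtemp(), "memory://a", reader, {})` twice with the same metadata dictionary should invoke `reader` only on the first call. The block-wise variant needs sparse files and a record of which blocks are present, which this sketch deliberately leaves out.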