U C^4@sddlmZmZmZddlZddlZddlZddlmZddl m Z m Z m Z m Z ddlmZddlmZmZmZmZmZmZedZGd d d eZdd dZdddZddZddZddZddZ dddZ!ddZ"dS) )print_functiondivisionabsolute_importN)compr)infer_compressionbuild_name_functionupdate_storage_optionsstringify_path)get_filesystem_class) BaseCache MMapCacheReadAheadCache BytesCache BlockCachecachesZfsspecc@sZeZdZdZdddZddZdd Zd d Zd d ZddZ ddZ ddZ ddZ dS)OpenFilea File-like object to be used in a context Can layer (buffered) text-mode and compression over any file-system, which are typically binary-only. These instances are safe to serialize, as the low-level file object is not created until invoked using `with`. Parameters ---------- fs: FileSystem The file system to use for opening the file. Should match the interface of ``dask.bytes.local.LocalFileSystem``. path: str Location to open mode: str like 'rb', optional Mode of the opened file compression: str or None, optional Compression to apply encoding: str or None, optional The encoding to use if opened in text mode. errors: str or None, optional How to handle encoding errors if opened in text mode. newline: None or str Passed to TextIOWrapper in text mode, how to handle line endings. rbNcCs:||_||_||_t|||_||_||_||_g|_dSN) fspathmodeget_compression compressionencodingerrorsnewlinefobjects)selfrrrrrrrr//tmp/pip-install-6_kvzl1k/fsspec/fsspec/core.py__init__9s  zOpenFile.__init__cCs t|j|j|j|j|j|jffSr)rrrrrrrrrrr __reduce__LszOpenFile.__reduce__cCs d|jS)Nz)formatrr"rrr __repr__YszOpenFile.__repr__cCs|jSr)rr"rrr __fspath__\szOpenFile.__fspath__cCs|jddddd}|jj|j|d}|g|_|jdk rdt|j}|||dd}|j|d|jkrt j ||j |j |j d}|j||jdS)Ntb)rr)rrr)rreplaceropenrrrrappendio TextIOWrapperrrr)rrfcompressrrr __enter___s      zOpenFile.__enter__cGs |dSrclose)rargsrrr __exit__tszOpenFile.__exit__cCs |dSrr3r"rrr __del__wszOpenFile.__del__cCs|S)zMaterialise this as a real open file without context The file should be explicitly closed to avoid enclosed open file instances persisting )r2r"rrr r,zsz OpenFile.opencCs:t|jD]$}d|jkr&|js&||q g|_dS)z#Close all encapsulated file objectsrN)reversedrrclosedflushr4)rr0rrr r4s  zOpenFile.close)rNNNN) __name__ __module__ __qualname____doc__r!r#r%r&r2r6r7r,r4rrrr rs   rrutf8c  s6t|||| |d\} } fdd| DS)a5 Given a path or paths, return a list of ``OpenFile`` objects. For writing, a str path must contain the "*" character, which will be filled in by increasing numbers, e.g., "part*" -> "part1", "part2" if num=2. For either reading or writing, can instead provide explicit list of paths. Parameters ---------- urlpath: string or list Absolute or relative filepath(s). Prefix with a protocol like ``s3://`` to read from alternative filesystems. To read from multiple files you can pass a globstring or a list of paths, with the caveat that they must all have the same protocol. mode: 'rb', 'wt', etc. compression: string Compression to use. See ``dask.bytes.compression.files`` for options. encoding: str For text mode only errors: None or str Passed to TextIOWrapper in text mode name_function: function or None if opening a set of files for writing, those files do not yet exist, so we need to generate their names by formatting the urlpath for each sequence number num: int [1] if writing mode, number of files we expect to create (passed to name+function) protocol: str or None If given, overrides the protocol found in the URL. newline: bytes or None Used for line terminator in text mode. If None, uses system default; if blank, uses no translation. **kwargs: dict Extra options that make sense to a particular storage connection, e.g. host, port, username, password, etc. Examples -------- >>> files = open_files('2015-*-*.csv') # doctest: +SKIP >>> files = open_files( ... 's3://bucket/2015-*-*.csv.gz', compression='gzip' ... ) # doctest: +SKIP Returns ------- List of ``OpenFile`` objects. )num name_functionstorage_optionsprotocolc s"g|]}t|dqS))rrrrr)r).0rrrrrrrrr s zopen_files..)get_fs_token_paths) urlpathrrrrrBrArDrkwargsZfs_tokenpathsrrFr open_filess<  rLcKs$t|g|||||fd|i|dS)a Given a path or paths, return one ``OpenFile`` object. Parameters ---------- urlpath: string or list Absolute or relative filepath. Prefix with a protocol like ``s3://`` to read from alternative filesystems. Should not include glob character(s). mode: 'rb', 'wt', etc. compression: string Compression to use. See ``dask.bytes.compression.files`` for options. encoding: str For text mode only errors: None or str Passed to TextIOWrapper in text mode protocol: str or None If given, overrides the protocol found in the URL. newline: bytes or None Used for line terminator in text mode. If None, uses system default; if blank, uses no translation. **kwargs: dict Extra options that make sense to a particular storage connection, e.g. host, port, username, password, etc. Examples -------- >>> openfile = open('2015-01-01.csv') # doctest: +SKIP >>> openfile = open( ... 's3://bucket/2015-01-01.csv.gz', ... compression='gzip' ... ) # doctest: +SKIP >>> with openfile as f: ... df = pd.read_csv(f) # doctest: +SKIP Returns ------- ``OpenFile`` object. rr)rL)rIrrrrrDrrJrrr r,s0 r,cCs0|dkrt|}|dk r,|tkr,td||S)NZinferz!Compression type %s not supported)rr ValueError)rIrrrr rs  rcCs<t|}d|kr4|dd\}}t|dkr4||fSd|fS)zReturn protocol, path pairz://rN)r splitlen)rIrDrrrr split_protocol!s  rPcCst|\}}t|}||S)zCReturn only path part of full URL, according to appropriate backend)rPr _strip_protocol)rIrD_clsrrr strip_protocol,s rTcCsg}t|}d|kr4tdd|Ddkr4tdnd|krJt|t|}|D]D}d|krd|krv|t|||q|||qN||qNd|krt||kr|d|}|S)aExpand paths if they have a ``*`` in them. :param paths: list of paths mode: str Mode in which to open files. num: int If opening in writing mode, number of files we expect to create. fs: filesystem object name_function: callable If opening in writing mode, this callable is used to generate path names. Names are generated for each partition by ``urlpath.replace('*', name_function(partition_index))``. :return: list of paths wcSsg|]}d|krdqS)*rrrEprrr rGDsz*expand_paths_if_needed..rz;When writing data, only one filename mask can be specified.rVN) listsumrMmaxrOextend _expand_pathsglobr-)rKrrArrBZexpanded_pathsZ curr_pathrrr expand_paths_if_needed3s   r_c sxt|ttfr|stdttt|\}}p6|dtfdd|DsVtdtttj |}fdd|D}|dtfdd|Dstd t |f} t |||| |}nt|t st |d r`t|\}} p|t ||} t |f} d |kr>t| ||}n d | krXt| | }n| g}n td || | j|fS)a?Filesystem, deterministic token, and paths from a urlpath and options. Parameters ---------- urlpath: string or iterable Absolute or relative filepath, URL (may include protocols like ``s3://``), or globstring pointing to data. mode: str, optional Mode in which to open files. num: int, optional If opening in writing mode, number of files we expect to create. name_function: callable, optional If opening in writing mode, this callable is used to generate path names. Names are generated for each partition by ``urlpath.replace('*', name_function(partition_index))``. storage_options: dict, optional Additional keywords to pass to the filesystem class. protocol: str or None To override the protocol specifier in the URL zempty urlpath sequencerc3s|]}|kVqdSrrrW)rDrr tsz%get_fs_token_paths..zGWhen specifying a list of paths, all paths must share the same protocolcsg|]}|qSr)rQ)rEu)rSrr rG{sz&get_fs_token_paths..c3s|]}|kVqdSrr)rEo)optionsrr r`}szRWhen specifying a list of paths, all paths must share the same file-system optionsnamerUrVzurl type not understood: %s) isinstancerYtuplerMzipmaprPallr Z_get_kwargs_from_urlsr r_strhasattrrQr]sortedr^ TypeErrorZ _fs_token) rIrrArBrCrDZ protocolsrKZoptionssrrr)rSrcrDr rHXsF           rHcsttr|ddkr"tdndkr8tjddkrLt|dfddt|D}|t |krt dn0tt t frt|kstt }ntd|S) NrVrz.Output path spec must contain exactly one '*'.z*.partcsg|]}d|qS)rV)r+)rEirBrrr rGsz!_expand_paths..zqIn order to preserve order between partitions paths created with ``name_function`` should sort to partition orderzPath should be either 1. A list of paths: ['foo.json', 'bar.json', ...] 2. A directory: 'foo/ 3. A path with a '*' in it: 'foo.*.json')rerjcountrMosrjoinrrangerlloggerwarningrfrYrOAssertionError)rrBrArKrror r]s&     r])rNr@NNrNN)rNr@NNN)rrNNN)# __future__rrrr.rqloggingrrutilsrrr r registryr Zcachingr r rrrr getLoggerrtobjectrrLr,rrPrTr_rHr]rrrr sH   q T < & E