B @`m @sdZddlZddlZddlmZddlZddlZddlmZm Z m Z m Z m Z ddl Z ddlZddlmZmZmZmZmZmZmZmZmZmZddlmZmZmZmZm Z ddl!Z!ddl"Z"ddl#m$Z$m%Z%m&Z&m'Z'm(Z(m)Z)ddl*m+Z+m,Z,dd l-m.Z.dd l/m0Z0e,Z1e2e eeZ3e34d ej5Gd d d Z6ej5GdddZ7e8dddZ9e'ee'edddZ:ddddZ;dDe(ee8e'edddZdEe(e?e&e?e)e6d#d$d%Z@e?e?d&d'd(ZAd)d*d+d,d-ZBe&eee?e%fd.d/d0ZCe(ee?ee?d1d2d3ZDdFe(e?ee?e&e8e8ee?e)e7d5 d6d7ZEGd8d9d9e"jFe ZGGd:d;d;ejHZIe'e8e?e?ee?ee'e8ee$fd<d=d>ZJe(e8dd?d@ZKe(e?e8dAdBdCZLdS)GzCommon IO api utilitiesN)abc)BufferedIOBaseBytesIO RawIOBaseStringIO TextIOWrapper) IOAnyAnyStrDictListMappingOptionalTupleUnioncast)urljoinurlparse uses_netloc uses_params uses_relative)BufferCompressionDictCompressionOptions FileOrBufferFilePathOrBufferStorageOptions) get_lzma_file import_lzma)import_optional_dependency) is_file_likec@s>eZdZUdZeed<eed<eed<eed<dZe ed<dS) IOArgsa) Return value of io/common.py:_get_filepath_or_buffer. Note (copy&past from io/parsers): filepath_or_buffer can be Union[FilePathOrBuffer, s3fs.S3File, gcsfs.GCSFile] though mypy handling of conditional imports is difficult. See https://github.com/python/mypy/issues/1297 filepath_or_bufferencodingmode compressionF should_closeN) __name__ __module__ __qualname____doc__r__annotations__strrr'boolr/r/4/tmp/pip-unpacked-wheel-q9tj5l6a/pandas/io/common.pyr"*s r"c@s~eZdZUdZeed<eed<eje dZ e eed<dZ e ed<dZe ed<d d d d Zdd d dZed dddZd S) IOHandlesau Return value of io/common.py:get_handle Can be used as a context manager. This is used to easily close created buffers and to handle corner cases when TextIOWrapper is inserted. handle: The file handle to be used. created_handles: All file handles that are created by get_handle is_wrapped: Whether a TextIOWrapper needs to be detached. handler&)default_factorycreated_handlesF is_wrappedis_mmapN)returnc Cs~|jr8t|jtst|j|j|j|jyx|jD] }| qBWWnt t fk rlYnXg|_d|_dS)z Close all created buffers. Note: If a TextIOWrapper was inserted, it is flushed and detached to avoid closing the potentially user-created buffer. FN) r5 isinstancer2rAssertionErrorflushdetachr4removecloseOSError ValueError)selfr2r/r/r0r=Qs   zIOHandles.closecCs|S)Nr/)r@r/r/r0 __enter__eszIOHandles.__enter__)argsr7cGs |dS)N)r=)r@rBr/r/r0__exit__hszIOHandles.__exit__)r(r)r*r+rr,r dataclassesfieldlistr4r r5r.r6r=rAr rCr/r/r/r0r1<s   r1)r7cCst|tsdSt|jtkS)z Check to see if a URL has a valid protocol. Parameters ---------- url : str or unicode Returns ------- isurl : bool If `url` has a valid protocol return True otherwise False. F)r8r- parse_urlscheme _VALID_URLS)urlr/r/r0is_urlls rK)r#r7cCst|trtj|S|S)a] Return the argument with an initial component of ~ or ~user replaced by that user's home directory. Parameters ---------- filepath_or_buffer : object to be converted if possible Returns ------- expanded_filepath_or_buffer : an expanded filepath or the input if not expandable )r8r-ospath expanduser)r#r/r/r0 _expand_user~s  rOcCst|trtddS)NzPassing a bool to header is invalid. Use header=None for no header or header=int or list-like of ints to specify the row(s) making up the column names)r8r. TypeError)headerr/r/r0validate_header_args rRF)r#convert_file_liker7cCs6|st|rttt|St|tjr.|}t|S)a Attempt to convert a path-like object to a string. Parameters ---------- filepath_or_buffer : object to be converted Returns ------- str_filepath_or_buffer : maybe a string version of the object Notes ----- Objects supporting the fspath protocol (python 3.6+) are coerced according to its __fspath__ method. Any other object is passed through unchanged, which includes bytes, strings, buffers, or anything else that's not even path-like. ) r rrr r8rLPathLike __fspath__rO)r#rSr/r/r0stringify_paths   rVcOsddl}|jj||S)z` Lazy-import wrapper for stdlib urlopen, as that imports a big chunk of the stdlib. rN)urllib.requestrequesturlopen)rBkwargsurllibr/r/r0rYsrY)rJr7cCst|tod|ko|d S)zR Returns true if the given URL looks like something fsspec can handle z://)zhttp://zhttps://)r8r- startswith)rJr/r/r0 is_fsspec_urls r]utf-8r)r#r$r&r%storage_optionsr7c Cst|}t|\}}t||}|rHt|drHd|krHtjdtddd}t||d}|dk rl|dd  }d |kr|d kr|d krt|d |t |}d|krd|kr|d7}t |t r$t |r$|rtdt|}|jdd}|dkrddi}t|} |t| ||d|dSt|rNt |t s>t|drV|dd}|drn|dd}td} g} y&tdddlm} m} | | tg} Wntk rYnXy$| j|fd|i|pi}Wn^t | k r:|dkrddi}nt|}d|d<| j|fd|i|p,i}YnXt|||d|dS|r\tdt |t t!t"j"frtt#|||d|dSt$|sdt%|}t|t|||d|dS) a If the filepath_or_buffer is a url, translate and return the buffer. Otherwise passthrough. Parameters ---------- filepath_or_buffer : a url, filepath (str, py.path.local or pathlib.Path), or buffer compression : {{'gzip', 'bz2', 'zip', 'xz', None}}, optional encoding : the encoding to use to decode bytes, default is 'utf-8' mode : str, optional storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., if using a URL that will be parsed by ``fsspec``, e.g., starting "s3://", "gcs://". An error will be raised if providing this argument with a local path or a file-like buffer. See the fsspec and backend storage implementation docs for the set of allowed keys and values .. versionadded:: 1.2.0 ..versionchange:: 1.2.0 Returns the dataclass IOArgs. writebzDcompression has no effect when passing a non-binary object as input.) stacklevelN)method_-w)bz2xz)zutf-16zutf-32z( will not write the byte order mark for tz?storage_options passed with file object or non-fsspec file pathzContent-EncodinggzipreT)r#r$r&r'r%zs3a://zs3://zs3n://fsspecZbotocorer) ClientErrorNoCredentialsErrorr%ZanonFz)Invalid file path or buffer object type: )&rVget_compression_methodinfer_compressionhasattrwarningswarnRuntimeWarningdictreplacelowerUnicodeWarningr8r-rKr?rYheadersgetrreadr=r"r]r9r\rZbotocore.exceptionsrnroPermissionError ImportErroropentuplebytesmmaprOr type)r#r$r&r%r`compression_methodZ fsspec_modereqcontent_encodingreaderrmZerr_types_to_retry_with_anonrnroZfile_objmsgr/r/r0_get_filepath_or_buffers!              " r)rMr7cCsddlm}td||S)z converts an absolute native path to a FILE URL. Parameters ---------- path : a path in native format Returns ------- a valid FILE URL r) pathname2urlzfile:)rWrr)rMrr/r/r0file_path_to_url~s rz.gzz.bz2z.zipz.xz)rlriziprj)r&r7c Cs`t|trPt|}y|d}WqXtk rL}ztd|Wdd}~XYqXXni}|}||fS)a Simplifies a compression argument to a compression method string and a mapping containing additional arguments. Parameters ---------- compression : str or mapping If string, specifies the compression method. If mapping, value at key 'method' specifies compression method. Returns ------- tuple of ({compression method}, Optional[str] {compression arguments}, Dict[str, Any]) Raises ------ ValueError on mapping missing 'method' key rez.If mapping, compression must have key 'method'N)r8r rvpopKeyErrorr?)r&compression_argsrerrr/r/r0rps rp)r#r&r7cCs|dkr dS|dkrZt|dd}t|ts.dSx&tD]\}}||r8|Sq8WdS|tkrf|Sd|}ddgtt}|d|7}t|dS)a Get the compression method for filepath_or_buffer. If compression='infer', the inferred compression method is returned. Otherwise, the input compression method is returned unchanged, unless it's invalid, in which case an error is raised. Parameters ---------- filepath_or_buffer : str or file handle File path or object. compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None} If 'infer' and `filepath_or_buffer` is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no compression). Returns ------- string or None Raises ------ ValueError on invalid compression specified. NZinferT)rSzUnrecognized compression type: z Valid compression types are ) rVr8r-_compression_to_extensionitemsrxendswithsortedr?)r#r& extensionrZvalidr/r/r0rqs    rqT) path_or_bufr%r$r& memory_mapis_texterrorsr`r7cCs||pd}}t||r(d|kr(|d7}t|||||d} | j} t| || j| j|\} }} t| t} t| j } | d}|r| j dd| _|dkr| rt| tst t jf| | jd| } nt jf| | jd | } n|d krtj| fd | ji| } n|d krt| | jf| } | jd kr| | | }t|dkrV| | } n,t|dkrttd|ntd|n.|dkrtt| | j} nd|}t|t| trt | | nft| tr4| jrd| jkr|dkr|dkrd}t| | j| j|dd} n t| | j} | | d}|r|sRt| | jrt| | j|dd} | | t| jtp| j }| | jrt| jtrt | | jt| trt t| | ||| j dS)a Get file handle for given path/buffer and mode. Parameters ---------- path_or_buf : str or file handle File path or object. mode : str Mode to open path_or_buf with. encoding : str or None Encoding to use. compression : str or dict, default None If string, specifies compression mode. If dict, value at key 'method' specifies compression mode. Compression mode must be one of {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and `filepath_or_buffer` is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no compression). If dict and compression mode is one of {'zip', 'gzip', 'bz2'}, or inferred as one of the above, other entries passed as additional compression options. .. versionchanged:: 1.0.0 May now be a dict with key 'method' as compression mode and other keys as compression options if compression mode is 'zip'. .. versionchanged:: 1.1.0 Passing compression options as keys in dict is now supported for compression modes 'gzip' and 'bz2' as well as 'zip'. memory_map : boolean, default False See parsers._parser_params for more information. is_text : boolean, default True Whether the type of the content passed to the file/buffer is string or bytes. This is not the same as `"b" not in mode`. If a string content is passed to a binary file/buffer, a wrapper is inserted. errors : str, default 'strict' Specifies how encoding and decoding errors are to be handled. See the errors argument for :func:`open` for a full list of options. storage_options: StorageOptions = None Passed to _get_filepath_or_buffer .. versionchanged:: 1.2.0 Returns the dataclass IOHandles zutf-8rb)r$r&r%r`rerkr!rl)filenamer%)fileobjr%rir%rr_rzZero files found in ZIP file z9Multiple files found in ZIP file. Only one file per ZIP: rjzUnrecognized compression type: Nrw)r$rnewlineF)r2r4r5r6r&)_is_binary_moderr#_maybe_memory_mapr$r%r8r-rvr&rrwr9rlGzipFileriBZ2File _BytesZipFileappendnamelistlenrr?rlzmarr'reverser1)rr%r$r&rrrr`Zencoding_passedZioargsr2handlesZis_pathrZ zip_namesrr5r/r/r0 get_handles<                   rcsbeZdZdZdeeeedfdd ZddZddfd d Z fd d Z e d dZ Z S)ra  Wrapper for standard library class ZipFile and allow the returned file-like handle to accept byte strings via `write` method. BytesIO provides attributes of file-like object and ZipFile.writestr writes bytes strings into a member of the archive. N)filer% archive_namec sB|dd}||_d|_dtji}||tj||f|dS)Nrbr!r&)rwrmultiple_write_bufferzipfile ZIP_DEFLATEDupdatesuper__init__)r@rr%rrZZ kwargs_zip) __class__r/r0rs    z_BytesZipFile.__init__cCs2|jdkr"t|trtnt|_|j|dS)N)rr8rrrra)r@datar/r/r0ras z_BytesZipFile.write)r7c sP|jdks|jjrdS|jp$|jp$d}|jt||jWdQRXdS)Nr)rclosedrrrwritestrgetvalue)r@r)rr/r0r:s z_BytesZipFile.flushcs|tdS)N)r:rr=)r@)rr/r0r=sz_BytesZipFile.closecCs |jdkS)N)fp)r@r/r/r0rsz_BytesZipFile.closed)N)r(r)r*r+rr-rrrar:r=propertyr __classcell__r/r/)rr0rs   rc@sHeZdZdZedddZedddZddd d Zedd d Z d S) _MMapWrappera Wrapper for the Python's mmap class so that it can be properly read in by Python's csv.reader class. Parameters ---------- f : file object File object to be mapped onto memory. Must support the 'fileno' method or have an equivalent attribute )fcCsNi|_x*dD]"}t||sq t|||j|<q Wtj|dtjd|_dS)N)seekablereadableZ writeabler)access) attributesrrgetattrrfilenoZ ACCESS_READ)r@r attributer/r/r0rs   z_MMapWrapper.__init__)namecs$jkrfddStjS)Ncs jS)N)rr/)rr@r/r0z*_MMapWrapper.__getattr__..)rrr)r@rr/)rr@r0 __getattr__s z_MMapWrapper.__getattr__)r7cCs|S)Nr/)r@r/r/r0__iter__sz_MMapWrapper.__iter__cCs$|j}|d}|dkr t|S)Nzutf-8r!)rreadlinedecode StopIteration)r@Znewbytesrr/r/r0__next__ s   z_MMapWrapper.__next__N) r(r)r*r+rrr-rrrr/r/r/r0rs  r)r2rr$r%rr7cCsg}|t|dpt|tM}|s*|||fSt|trh|rTd|krTt||||dd}n t||}||y4ttjt|}|| ||||}Wnt k rd}YnX|||fS)zTry to memory map file/buffer.rrbr!)r$rrF) rrr8r-rrrrrr=r< Exception)r2rr$r%rrwrappedr/r/r0rs$        rc CsHd}t|}t|ts|Sytj|}Wnttfk rBYnX|S)zTest whether file exists.F)rVr8r-rLrMexistsrPr?)r#rr/r/r0 file_exists@s r)r2r%r7cCs8tjf}t||rdSttf}t||p6dt|d|kS)z+Whether the handle is opened in binary modeFrbr%)codecsStreamReaderWriterr8rrr)r2r%Z text_classesZbinary_classesr/r/r0rNs  r)F)r^Nr_N)NNFTNN)Mr+rir collectionsrrDrliorrrrrrrLtypingrr r r r r rrrr urllib.parserrrGrrrrsrZpandas._typingrrrrrrZ pandas.compatrrZpandas.compat._optionalrZpandas.core.dtypes.commonr rsetrIdiscardZ dataclassr"r1r.rKrOrRrVrYr]r-rrrrprqrZipFilerIteratorrrrrr/r/r/r0st 0    0   '"6E5-!