U Dx`*n @sdZddlZddlZddlmZddlZddlZddlmZm Z m Z m Z m Z ddl Z ddlZddlmZmZmZmZmZmZmZmZmZmZddlmZmZmZmZm Z ddl!Z!ddl"Z"ddl#m$Z$m%Z%m&Z&m'Z'm(Z(m)Z)ddl*m+Z+m,Z,dd l-m.Z.dd l/m0Z0e,Z1e2e eeZ3e34d ej5Gd d d Z6ej5GdddZ7e8dddZ9e'ee'edddZ:ddddZ;dDe(ee8e'edddZdEe(e?e&e?e)e6d#d$d%Z@e?e?d&d'd(ZAd)d*d+d,d-ZBe&eee?e%fd.d/d0ZCe(ee?ee?d1d2d3ZDdFe(e?ee?e&e8e8ee?e)e7d5 d6d7ZEGd8d9d9e"jFe ZGGd:d;d;ejHZIe'e8e?e?ee?ee'e8ee$fd<d=d>ZJe(e8dd?d@ZKe(e?e8dAdBdCZLdS)GzCommon IO api utilitiesN)abc)BufferedIOBaseBytesIO RawIOBaseStringIO TextIOWrapper) IOAnyAnyStrDictListMappingOptionalTupleUnioncast)urljoinurlparse uses_netloc uses_params uses_relative)BufferCompressionDictCompressionOptions FileOrBufferFilePathOrBufferStorageOptions) get_lzma_file import_lzma)import_optional_dependency) is_file_likec@s>eZdZUdZeed<eed<eed<eed<dZe ed<dS) IOArgsa) Return value of io/common.py:_get_filepath_or_buffer. Note (copy&past from io/parsers): filepath_or_buffer can be Union[FilePathOrBuffer, s3fs.S3File, gcsfs.GCSFile] though mypy handling of conditional imports is difficult. See https://github.com/python/mypy/issues/1297 filepath_or_bufferencodingmode compressionF should_closeN) __name__ __module__ __qualname____doc__r__annotations__strrr'boolr/r/7/tmp/pip-target-zr53vnty/lib/python/pandas/io/common.pyr"*s  r"c@s~eZdZUdZeed<eed<eje dZ e eed<dZ e ed<dZe ed<d d d d Zdd d dZed dddZd S) IOHandlesau Return value of io/common.py:get_handle Can be used as a context manager. This is used to easily close created buffers and to handle corner cases when TextIOWrapper is inserted. handle: The file handle to be used. created_handles: All file handles that are created by get_handle is_wrapped: Whether a TextIOWrapper needs to be detached. handler&)default_factorycreated_handlesF is_wrappedis_mmapNreturnc Csz|jr8t|jtst|j|j|j|jz|jD] }| q@Wnt t fk rhYnXg|_d|_dS)z Close all created buffers. Note: If a TextIOWrapper was inserted, it is flushed and detached to avoid closing the potentially user-created buffer. FN) r5 isinstancer2rAssertionErrorflushdetachr4removecloseOSError ValueError)selfr2r/r/r0r>Qs   zIOHandles.closecCs|SNr/rAr/r/r0 __enter__eszIOHandles.__enter__)argsr8cGs |dSrB)r>)rArEr/r/r0__exit__hszIOHandles.__exit__)r(r)r*r+rr,r dataclassesfieldlistr4r r5r.r6r>rDr rFr/r/r/r0r1<s    r1r7cCst|tsdSt|jtkS)z Check to see if a URL has a valid protocol. Parameters ---------- url : str or unicode Returns ------- isurl : bool If `url` has a valid protocol return True otherwise False. F)r9r- parse_urlscheme _VALID_URLSurlr/r/r0is_urlls rO)r#r8cCst|trtj|S|S)a] Return the argument with an initial component of ~ or ~user replaced by that user's home directory. Parameters ---------- filepath_or_buffer : object to be converted if possible Returns ------- expanded_filepath_or_buffer : an expanded filepath or the input if not expandable )r9r-ospath expanduser)r#r/r/r0 _expand_user~s  rScCst|trtddS)NzPassing a bool to header is invalid. Use header=None for no header or header=int or list-like of ints to specify the row(s) making up the column names)r9r. TypeError)headerr/r/r0validate_header_args rVF)r#convert_file_liker8cCs6|st|rttt|St|tjr.|}t|S)a Attempt to convert a path-like object to a string. Parameters ---------- filepath_or_buffer : object to be converted Returns ------- str_filepath_or_buffer : maybe a string version of the object Notes ----- Objects supporting the fspath protocol (python 3.6+) are coerced according to its __fspath__ method. Any other object is passed through unchanged, which includes bytes, strings, buffers, or anything else that's not even path-like. ) r rrr r9rPPathLike __fspath__rS)r#rWr/r/r0stringify_paths   rZcOsddl}|jj||S)z` Lazy-import wrapper for stdlib urlopen, as that imports a big chunk of the stdlib. rN)urllib.requestrequesturlopen)rEkwargsurllibr/r/r0r]sr])rNr8cCst|tod|ko|d S)zR Returns true if the given URL looks like something fsspec can handle z://)zhttp://zhttps://)r9r- startswithrMr/r/r0 is_fsspec_urls   rautf-8r)r#r$r&r%storage_optionsr8c Cst|}t|\}}t||}|rHt|drHd|krHtjdtddd}t||d}|dk rl|dd  }d |kr|d kr|d krt|d |t |}d|krd|kr|d7}t |t r$t |r$|rtdt|}|jdd}|dkrddi}t|} |t| ||d|dSt|rNt |t s>t|drV|dd}|drn|dd}td} g} z&tdddlm} m} | | tg} Wntk rYnXz$| j|fd|i|pi}Wn^t | k r:|dkrddi}nt|}d|d<| j|fd|i|p,i}YnXt|||d|dS|r\tdt |t t!t"j"frtt#|||d|dSt$|sdt%|}t|t|||d|dS) a If the filepath_or_buffer is a url, translate and return the buffer. Otherwise passthrough. Parameters ---------- filepath_or_buffer : a url, filepath (str, py.path.local or pathlib.Path), or buffer compression : {{'gzip', 'bz2', 'zip', 'xz', None}}, optional encoding : the encoding to use to decode bytes, default is 'utf-8' mode : str, optional storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., if using a URL that will be parsed by ``fsspec``, e.g., starting "s3://", "gcs://". An error will be raised if providing this argument with a local path or a file-like buffer. See the fsspec and backend storage implementation docs for the set of allowed keys and values .. versionadded:: 1.2.0 ..versionchange:: 1.2.0 Returns the dataclass IOArgs. writebzDcompression has no effect when passing a non-binary object as input.) stacklevelN)method_-w)bz2xz)zutf-16zutf-32z( will not write the byte order mark for tz?storage_options passed with file object or non-fsspec file pathzContent-EncodinggzipriT)r#r$r&r'r%zs3a://zs3://zs3n://fsspecZbotocorer) ClientErrorNoCredentialsErrorr%ZanonFz)Invalid file path or buffer object type: )&rZget_compression_methodinfer_compressionhasattrwarningswarnRuntimeWarningdictreplacelowerUnicodeWarningr9r-rOr@r]headersgetrreadr>r"rar:r`rZbotocore.exceptionsrrrsPermissionError ImportErroropentuplebytesmmaprSr type)r#r$r&r%rdcompression_methodZ fsspec_modereqcontent_encodingreaderrqZerr_types_to_retry_with_anonrrrsZfile_objmsgr/r/r0_get_filepath_or_buffers!              r)rQr8cCsddlm}td||S)z converts an absolute native path to a FILE URL. Parameters ---------- path : a path in native format Returns ------- a valid FILE URL r) pathname2urlzfile:)r[rr)rQrr/r/r0file_path_to_url~s rz.gzz.bz2z.zipz.xz)rprmziprn)r&r8c Cs`t|trPt|}z|d}WqXtk rL}ztd|W5d}~XYqXXni}|}||fS)a Simplifies a compression argument to a compression method string and a mapping containing additional arguments. Parameters ---------- compression : str or mapping If string, specifies the compression method. If mapping, value at key 'method' specifies compression method. Returns ------- tuple of ({compression method}, Optional[str] {compression arguments}, Dict[str, Any]) Raises ------ ValueError on mapping missing 'method' key riz.If mapping, compression must have key 'method'N)r9r rzpopKeyErrorr@)r&compression_argsrerrr/r/r0rts rt)r#r&r8cCs|dkr dS|dkrZt|dd}t|ts.dStD]\}}||r6|Sq6dS|tkrf|Sd|}ddgtt}|d|7}t|dS)a Get the compression method for filepath_or_buffer. If compression='infer', the inferred compression method is returned. Otherwise, the input compression method is returned unchanged, unless it's invalid, in which case an error is raised. Parameters ---------- filepath_or_buffer : str or file handle File path or object. compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None} If 'infer' and `filepath_or_buffer` is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no compression). Returns ------- string or None Raises ------ ValueError on invalid compression specified. NZinferT)rWUnrecognized compression type: z Valid compression types are ) rZr9r-_compression_to_extensionitemsr|endswithsortedr@)r#r& extensionrvalidr/r/r0rus     ruT) path_or_bufr%r$r& memory_mapis_texterrorsrdr8cCs||pd}}t||r(d|kr(|d7}t|||||d} | j} t| || j| j|\} }} t| t} t| j } | d}|r| j dd| _|dkr| rt| tst t jf| | jd| } nt jf| | jd | } n|d krtj| fd | ji| } n|d krt| | jf| } | jd kr| | | }t|dkrV| | } n,t|dkrttd|ntd|n.|dkrtt| | j} nd|}t|t| trt | | nft| tr4| jrd| jkr|dkr|dkrd}t| | j| j|dd} n t| | j} | | d}|r|sRt| | jrt| | j|dd} | | t| jtp| j }| | jrt| jtrt | | jt| trt t| | ||| j dS)a Get file handle for given path/buffer and mode. Parameters ---------- path_or_buf : str or file handle File path or object. mode : str Mode to open path_or_buf with. encoding : str or None Encoding to use. compression : str or dict, default None If string, specifies compression mode. If dict, value at key 'method' specifies compression mode. Compression mode must be one of {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and `filepath_or_buffer` is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no compression). If dict and compression mode is one of {'zip', 'gzip', 'bz2'}, or inferred as one of the above, other entries passed as additional compression options. .. versionchanged:: 1.0.0 May now be a dict with key 'method' as compression mode and other keys as compression options if compression mode is 'zip'. .. versionchanged:: 1.1.0 Passing compression options as keys in dict is now supported for compression modes 'gzip' and 'bz2' as well as 'zip'. memory_map : boolean, default False See parsers._parser_params for more information. is_text : boolean, default True Whether the type of the content passed to the file/buffer is string or bytes. This is not the same as `"b" not in mode`. If a string content is passed to a binary file/buffer, a wrapper is inserted. errors : str, default 'strict' Specifies how encoding and decoding errors are to be handled. See the errors argument for :func:`open` for a full list of options. storage_options: StorageOptions = None Passed to _get_filepath_or_buffer .. versionchanged:: 1.2.0 Returns the dataclass IOHandles rbrf)r$r&r%rdriror!rp)filenamer%)fileobjr%rmr%rrcrzZero files found in ZIP file z9Multiple files found in ZIP file. Only one file per ZIP: rnrNr{r$rnewlineF)r2r4r5r6r&)_is_binary_moderr#_maybe_memory_mapr$r%r9r-rzr&rr{r:rpGzipFilermBZ2File _BytesZipFileappendnamelistlenrr@rlzmarr'reverser1)rr%r$r&rrrrdZencoding_passedZioargsr2handlesZis_pathrZ zip_namesrr5r/r/r0 get_handles<                rcsbeZdZdZdeeeedfdd ZddZddfd d Z fd d Z e d dZ Z S)ra  Wrapper for standard library class ZipFile and allow the returned file-like handle to accept byte strings via `write` method. BytesIO provides attributes of file-like object and ZipFile.writestr writes bytes strings into a member of the archive. N)filer% archive_namec sB|dd}||_d|_dtji}||tj||f|dS)Nrfr!r&)r{rmultiple_write_bufferzipfile ZIP_DEFLATEDupdatesuper__init__)rArr%rr^Z kwargs_zip __class__r/r0rs    z_BytesZipFile.__init__cCs2|jdkr"t|trtnt|_|j|dSrB)rr9rrrre)rAdatar/r/r0res z_BytesZipFile.writer7c sP|jdks|jjrdS|jp$|jp$d}|jt||jW5QRXdS)Nr)rclosedrrrwritestrgetvalue)rArrr/r0r;s z_BytesZipFile.flushcs|tdSrB)r;rr>rCrr/r0r>sz_BytesZipFile.closecCs |jdkSrB)fprCr/r/r0rsz_BytesZipFile.closed)N)r(r)r*r+rr-rrrer;r>propertyr __classcell__r/r/rr0rs  rc@sHeZdZdZedddZedddZddd d Zedd d Z d S) _MMapWrappera Wrapper for the Python's mmap class so that it can be properly read in by Python's csv.reader class. Parameters ---------- f : file object File object to be mapped onto memory. Must support the 'fileno' method or have an equivalent attribute )fcCsJi|_dD]"}t||sq t|||j|<q tj|dtjd|_dS)N)seekablereadableZ writeabler)access) attributesrvgetattrrfilenoZ ACCESS_READ)rAr attributer/r/r0rs  z_MMapWrapper.__init__)namecs$jkrfddStjS)Ncs jSrB)rr/rrAr/r0z*_MMapWrapper.__getattr__..)rrr)rArr/rr0 __getattr__s z_MMapWrapper.__getattr__r7cCs|SrBr/rCr/r/r0__iter__sz_MMapWrapper.__iter__cCs$|j}|d}|dkr t|S)Nrbr!)rreadlinedecode StopIteration)rAZnewbytesrr/r/r0__next__ s   z_MMapWrapper.__next__N) r(r)r*r+rrr-rrrr/r/r/r0rs  r)r2rr$r%rr8cCsg}|t|dpt|tM}|s*|||fSt|trh|rTd|krTt||||dd}n t||}||z4ttjt|}|| ||||}Wnt k rd}YnX|||fS)zTry to memory map file/buffer.rrfr!rF) rvr9r-rrrrrr>r= Exception)r2rr$r%rrwrappedr/r/r0rs$        rc CsHd}t|}t|ts|Sztj|}Wnttfk rBYnX|S)zTest whether file exists.F)rZr9r-rPrQexistsrTr@)r#rr/r/r0 file_exists@s r)r2r%r8cCs\d|ksd|krd|kStjtjtjf}tt||r:dSttf}t||pZdt |d|kS)z+Whether the handle is opened in binary moderorfFr%) codecs StreamWriter StreamReaderStreamReaderWriter issubclassrrrr9r)r2r%Z text_classesZbinary_classesr/r/r0rNsr)F)rbNrcN)NNFTNN)Mr+rmr collectionsrrGrpiorrrrrrrPtypingrr r r r r rrrr urllib.parserrrJrrrrwrZpandas._typingrrrrrrZ pandas.compatrrZpandas.compat._optionalrZpandas.core.dtypes.commonr rsetrLdiscardZ dataclassr"r1r.rOrSrVrZr]rar-rrrrtrurZipFilerIteratorrrrrr/r/r/r0s 0    /  "  - $ 8 N5- '