B @`,@sddlmZmZddlZddlZddlZddlmZddlZ ddlm Z m Z m Z m Z mZmZmZddlmZddlmmZddZddd d ifde ed fd d d fddifddifddifddifddifddifddddifddddifddddifddddifddifddifddifd d!ifd"d#ifd$d%ifd&difd'difd(difd)d*ifd+difd,d-ddifd,d-ddifd.d/ifd0d1ifd2difd3d#ifd4difd5d-ddifd5d-ddifd6d7ifd8d9ifd:d-ddifd:d-ddifd;difd;dddifd;dddifdififd?d@ifdAdifg*eedBddCdDdEdFdGdHdIdJdKdLdMdNdOd,d5dPdQd6d8d:dRdSdTdUdVgdgd>igd>Zee\ZZZdWdXeejDeeZ e rt!ej"eedYdZd[Z#d\d]ej$d^gfd_d`ej$dagfdbej$ej$ej$gfdbgfdcd]ej$ddgfgZ%ee%\ZZej"e%edYdedfZ&GdgdhdhZ'didjZ(ej)*dkddddgfdddgfd"ddgfd3dddgfgdldmZ+ej)*dkdGdddgfdEdddgfdDdddgfdGdddgfgdndoZ,dpdqZ-ej)*dre.eej/e j e j gdsdtZ0dudvZ1dwdxZ2dS)y)datetime timedeltaN)lib) DataFrameIndex MultiIndexSeriesconcatisnanotnacCs(t|trt||n t||dS)N) isinstancertmassert_series_equalassert_index_equal)leftrightr=/tmp/pip-unpacked-wheel-q9tj5l6a/pandas/tests/test_strings.pyassert_series_or_index_equals rcatrsep,Zzyxr)rjoincenter) contains)acountdecode)zUTF-8encodeendswithnaTFextract)z([a-z]*)expand extractallfindfindallget)rindex)r)rljustmatch fullmatch normalize)NFCpad partition) repeat)replace)rzrfindrindexrjust rpartitionslice)r slice_replace)rr;r5split startswith translateadwrap)zfill capitalize get_dummiesisalnumisalpha isdecimalisdigitislower isnumericisspaceistitleisupperlenlowerlstriprsplitrstripstripswapcasetitleuppercasefoldcCsh|]}|ds|qS)_)r>).0frrr lsr])paramsidscCs|jS)a Fixture for all public methods of `StringMethods` This fixture returns a tuple of the method name and sample arguments necessary to call the method. Returns ------- method_name : str The name of the method in `StringMethods` args : tuple Sample values for the positional arguments kwargs : dict Sample values for the keyword arguments Examples -------- >>> def test_something(any_string_method): ... s = Series(['a', 'b', np.nan, 'd']) ... ... method_name, args, kwargs = any_string_method ... method = getattr(s.str, method_name) ... # will not raise ... method(*args, **kwargs) )param)requestrrrany_string_methodrsrbstringrcbytesacemptyz mixed-integerrCcCs |j\}}tj|td}||fS)a> Fixture for all (inferred) dtypes allowed in StringMethods.__init__ The covered (inferred) types are: * 'string' * 'empty' * 'bytes' * 'mixed' * 'mixed-integer' Returns ------- inferred_dtype : str The string for the inferred dtype from _libs.lib.infer_dtype values : np.ndarray An array of object dtype that will be inferred to have `inferred_dtype` Examples -------- >>> import pandas._libs.lib as lib >>> >>> def test_something(any_allowed_skipna_inferred_dtype): ... inferred_dtype, values = any_allowed_skipna_inferred_dtype ... # will pass ... assert lib.infer_dtype(values, skipna=True) == inferred_dtype ... ... # constructor for .str-accessor will also pass ... Series(values).str )dtype)r`nparrayobject)rainferred_dtypevaluesrrr!any_allowed_skipna_inferred_dtypes roc@seZdZddZddZejdedgddZ ejdedgd d Z d d Z d dZ ddZ ddZddZejddeegddZddZddZejdddgejdd dgejd!d dgd"d#Zejjd$d%d&d'gd(d)d*gd%d&d+ggd,d-d.gd/ejjd0eeed1d2gd3d4d5d6gd/d7d8Zd9d:Zejd;dd?gd@dAZejd;dd?gdBdCZdDdEZdFdGZdHdIZdJdKZdLdMZdNdOZ ejdddgejdPde!j"e#j$gejdQdRdSgdTdUZ%ejdddgejdPde!j"e#j$gejdQdRdSgdVdWZ&dXdYZ'dZd[Z(d\d]Z)d^d_Z*d`daZ+dbdcZ,dddeZ-dfdgZ.dhdiZ/djdkZ0dldmZ1dndoZ2dpdqZ3drdsZ4dtduZ5dvdwZ6dxdyZ7dzd{Z8d|d}Z9d~dZ:ddZ;ddZddZ?ddZ@ddZAddZBddZCddZDddZEddZFddZGddZHddZIddZJddZKddZLddZMddZNddZOddZPddZQejddddddgddZRddZSddZTddÄZUddńZVddDŽZWejdedgejdddgdd̈́ZXddτZYddфZZddӄZ[ddՄZ\ddׄZ]ddلZ^ddۄZ_dd݄Z`dd߄ZaddZbddZcddZdddZeddZfddZgddZhddZiejdd&ddedde!j"dgfdd'dedde!j"dgfdddedde!j"dgfd'dd&edde!j"dgfd'ddedde!j"dgfgddZjddZkddZlddZmdd Znd d Zod d ZpddZqejderee!jsgddZtddZuddZvddZwddZxddZyddZzdd Z{d!d"Z|d#d$Z}d%d&Z~d'd(Zd)d*Zd+d,ZdS(-TestStringMethodscCs,tjtjkstttdgjtjs(tdS)Nr))rstrstrings StringMethodsAssertionErrorr )selfrrrtest_apiszTestStringMethods.test_apic CsHtdddgg}d}tjt|d |jWdQRXt|drDtdS)Nrbrdz5Can only use .str accessor with Index, not MultiIndex)r+rq)rZ from_arrayspytestraisesAttributeErrorrqhasattrrt)rumimsgrrrtest_api_mi_raisess z$TestStringMethods.test_api_mi_raisesricategoryc Csz|}|\}}|||d}ddddddg}||krDt|jtjsvtn2d} tjt| d  |jWdQRXt|d rvtdS) N)rircunicoderhremixedz mixed-integerz/Can only use .str accessor with string values.*)r+rq) r rqrrrsrtrxryrzr{) ruindex_or_seriesriZany_skipna_inferred_dtypeboxrmrntZtypes_passing_constructorr}rrrtest_api_per_dtypes z$TestStringMethods.test_api_per_dtypec CsP|}|\}}|\} } } d} |tkrn|jdkrn| dkrF| ddrFd} q| dkr`| ddr`d} q| dkrd } n$|tkr|d kr|tkr| dkrd } | dk rtjj| d } |j| |||d }t |j | }| d k}| dk}ddd gdg|ddg|}||kr|| | n:d| dt |d}tj t |d|| | WdQRXdS)Nr)r0r9r#Tz#Method cannot deal with empty Indexr=z,Split fails on empty Series when expand=TruerFz(Need to fortify get_dummies corner casesrh)reason)ri)rr'rPr:)rrcrrerz mixed-integerzCannot use .str.z with values of inferred dtype .)r+)rsizer'rlrxmarkZxfailnodeZ add_markergetattrrqreprry TypeError)rurrirorbrarrmrn method_nameargskwargsrrrmethodZ bytes_allowedZ mixed_allowedZ allowed_typesr}rrrtest_api_per_methods8         z%TestStringMethods.test_api_per_methodc Csttd}|d|}|d}t|jtjs4t|\}}}t|j|||}t|j|||}t|t rzt ||n$t|trt ||n ||kstdS)NZaabbr1r) rlistastyper rqrrrsrtrrr assert_frame_equalr) rurbsrdrrrresultexpectedrrrtest_api_for_categorical-s      z*TestStringMethods.test_api_for_categoricalc Csd}t|}ttXxP|jD]F}t|ts2tt|j|jx"|D]}t|tsHt |sHtqHWq WWdQRX| j dkstdS)N)googleZ wikimediaZ wikipediaZ wikitravell) rr assert_produces_warning FutureWarningrqr rtrr(r dropnarnitem)rustrsdsrelrrr test_iterAs   (zTestStringMethods.test_iterc Cs^tgtd}d\}}ttxt|jD]\}}q,WWdQRX|dksNt|dksZtdS)N)ri)rAr;rAr;)rrlr rr enumeraterqrt)rurirrrrtest_iter_emptyXs   z!TestStringMethods.test_iter_emptyc CsPtdg}ttxt|jD]\}}q"WWdQRX|r@tt||dS)Nr)rr rrrrqrtr)rurrrrrrtest_iter_single_elementfs   z*TestStringMethods.test_iter_single_elementc CshtddtdD}d\}}ttxt|jD]\}}q6WWdQRX|dksXt|dksdtdS)Nc Ss*g|]"}tdtjdtjddqS)Nr)r:rjrandomrandint)r[rZrrr sszATestStringMethods.test_iter_object_try_string..)rAhrAr)rranger rrrrqrt)rurrrrrrtest_iter_object_try_stringps  z-TestStringMethods.test_iter_object_try_stringotherNcCsH|}ddg}|r||}n|}||ddjj|dd}|jdksDtdS)Nrrwname)rr)r)rqrrrt)rurrrrnrrrrtest_str_cat_names z#TestStringMethods.test_str_cat_namec CsD|}|dddddtjg}|j}d}||ks4t|jjdd}d}||ksRt|jjdd d }d }||ksrttjdtjdd d tjgtd}|ddddddg}|jj|dd}t|||jjt|dd}t||d}t dddg}t j t |d|j|j WdQRXt j t |d|jt|WdQRXdS)NrrwrdZaabbc-)na_repzaabbc-rZNA)rrZ a_a_b_b_c_NAdfoo)riaaza-bbZbdZcfooz--zzTestStringMethods.rrrznp.arrayc CsJtdddg}||}d}tjt|d|jj|dddWdQRXdS) Nrrwrdz;Concatenation requires list-likes containing only strings.*)r+outerr)rr)rrxryrrqr)rurrrrr}rrrtest_str_cat_wrong_dtype_raisess z1TestStringMethods.test_str_cat_wrong_dtype_raisesc Cs|}tddddg}|tkr |n t||d}tdddd g|jd}t|t||dgd d }td d ddg}|tkrv|nt|j|jd}|j|}t|||j|j}t|||j||g}t|||j||jg}t||ddddg|_|ddddg}|tkr|nt|j|jd}|j||g}t|||j||jg}t||ddddg|_|ddddg}|tkr|nt|j|jd}|j|}t||d}tdddg} t| | gd d } tj t |d|j| jWdQRXtj t |d|j| j|jgWdQRXtj t |d|j| j|gWdQRXd}tdt j ddg} tj t |d|j| dgWdQRXtj t |d|j| |gWdQRXtj t |d|j| |jgWdQRXtj t |d|j| | |ggWdQRXtj t |d|jt| WdQRXtj t |d|j| t| gWdQRXtj t |d|jd WdQRXtj t |d |jt|jt|gWdQRXdS)Nrrwrdr)r(ABCDr;)axisZaAaZbBbZcCcZdDdZaDaZbAbZcBcZdCdZaDdZbAaZcBbZdCczrjrr rrkrrr) rurirr!rnrrrrrrrrtest_startswithXs     $z!TestStringMethods.test_startswithc Cstd|ddd|dg|d}|jd}tdtjdddtjdg}t|||jjd|d }td|ddd|dg}t||tjd tjd dt ddd d g t d}t|jd}tdtjdtjtjdtjtjtjg } t|| dS)Nrrrrr)riFT)r!rrwr;g@r\) rrqr rjrr rrkrrrl) rurirr!rnrrrrrrrr test_endswithus    $zTestStringMethods.test_endswithc Cstddtjddg}|j}tddtjddg}t||tdtjdd td dd d g }|j}tdtjdtjtjdtjtjtjg }t ||dS) NFOOBARBlahblurgrBarBlurgbarTblahr;g@) rrjrrqrWr rrrassert_almost_equal)rurnrrrrrr test_titles   $zTestStringMethods.test_titlec Cstdtjddg}|j}tdtjddg}t|||j}t||tdtjddt ddd d g }|j}t|j}tdtjdtjtjdtjtjtjg }t |tst t||dS) NrrZOMZNOMrrwTrr;g@) rrjrrqrXr rrQrrr rt)rurnrrrrrrrrtest_lower_uppers      &z"TestStringMethods.test_lower_upperc Cstddtjddg}|j}tddtjddg}t||tdtjdd td dd d g }|j}tdtjdtjtjdtjtjtjg }t ||dS) NrrrrrrrrTr r;g@) rrjrrqrEr rrrr )rurnrrrrrrtest_capitalizes   $z!TestStringMethods.test_capitalizec Cstddtjddg}|j}tddtjddg}t||tdtjdd tddd d g }|j}tdtjdtjtjdtjtjtjg }t ||dS) NrrrrrrZbLAHZBLURGTr;g@) rrjrrqrVr rrrr )rurnrrrrrr test_swapcases   $zTestStringMethods.test_swapcasecCsdddddg}t|}|jdd|Dks6t|jdd|DksVt|jd d|Dksvt|jd d|Dkst|jd d|DkstdS) NrrZCCCZDdddZeEEEcSsg|] }|qSr)rQ)r[vrrrrsz6TestStringMethods.test_casemethods..cSsg|] }|qSr)rX)r[rrrrrscSsg|] }|qSr)rW)r[rrrrrscSsg|] }|qSr)rE)r[rrrrrscSsg|] }|qSr)rV)r[rrrrrs) rrqrQtolistrtrXrWrErV)rurnrrrrtest_casemethodss    z"TestStringMethods.test_casemethodsc Cstdtjg}|jjdddd}tdtjg}t|||jjddddd}td tjg}t||td tjd dtd ddd g }t|jjdddd}tdtjdtjtjdtjtjtjg }t |tst t ||td dg}td dg}|jjddt jdd}t||d}x~ttfD]r}xjddddifD]X} xPdddgddddgfD]6} || }tjt|d|jd| WdQRXq^WqBWq.WdS)NfooBAD__barBADzBAD[_]*r)T)rfoobarr;)nr foobarBADaBADbBADfooBADg@rrwrsabcd,àzutf-8sabcd, àz(?<=\w),(?=\w)z, )flagsrz!repl must be a string or callabler3rdad)r+)rrjrrqr4r rrrr rtr rreUNICODErrxryr) rurnrrrrrr}klassreplrrrr test_replaces2  &  zTestStringMethods.test_replacec Cs$tdtjg}dd}|jjd|ddd}tdtjg}t||d }d d}tjt |d |jd |WdQRXd d}tjt |d |jd |WdQRXddd}tjt |d |jd |WdQRXtdtjg}d}dd}|jj||dd}tdtjg}t||dS)NrcSs|dS)Nr)grouprV)mrrrr rz9TestStringMethods.test_replace_callable..z [a-z][A-Z]{2}rCT)rrfoObaD__baRbaDzO((takes)|(missing)) (?(2)from \d+ to )?\d+ (?(3)required )positional arguments?cSsdS)Nrrrrrrr)r+rcSsdS)Nr)r!rrrrrrcSsdS)Nr)r!ryrrrrrz Foo Bar Bazz,(?P\w+) (?P\w+) (?P\w+)cSs|dS)NZmiddle)r rV)r!rrrr#r)rZbAR)N) rrjrrqr4r rrxryr)rurnrrrZp_errrrrrtest_replace_callables*  z'TestStringMethods.test_replace_callablec Cs tdtjg}td}|jj|ddd}tdtjg}t|||jj|dddd}td tjg}t||td tjd dt d ddd g }t|jj|ddd}tdtjdtjtjdtjtjtjg }t |tst t ||tddg}tddg}tjdtjd}|j|d}t||tdtjg}td}tjtdd|jj|dtjd}WdQRXtjtdd|jj|ddd}WdQRXtjtdd|jj|ddd}WdQRXtdtjg}dd}td}|jj||dd }td!tjg}t||dS)"NrzBAD[_]*r)T)rrr;)rrrrrrg@rrwrsabcd,àzutf-8sabcd, àz(?<=\w),(?=\w))rz, ZfooBAD__barBAD__badzcase and flags cannot be)r+F)rcSs|dS)Nr)r rV)r!rrrrWrz?TestStringMethods.test_replace_compiled_regex..z [a-z][A-Z]{2}rC)rr")rrjrrcompilerqr4r rrrr rtr rrrxryr IGNORECASE) rurnrrrrrrrrrrtest_replace_compiled_regex(sB   &    z-TestStringMethods.test_replace_compiled_regexc Cstddtjg}tddtjg}|jjdddd}t||tddtjg}|jjdddd}t||d d }td }d }t j t |d |jjd|ddWdQRXd}t j t |d |jj|dddWdQRXdS)Nzf.orZbaozf.baT)rFcSs|dS)Nr)r rV)r!rrrrjrz8TestStringMethods.test_replace_literal..z [a-z][A-Z]{2}z2Cannot use a callable replacement when regex=False)r+abczCCannot use a compiled regex as replacement pattern with regex=Falser)) rrjrrqr4r rrr%rxryr)rurnrrZ callable_replZ compiled_patr}rrrtest_replace_literal]s   z&TestStringMethods.test_replace_literalc Cstddtjdtjdg}|jd}tddtjdtjd g}t|||jd d dd d dg}tddtjdtjdg}t||tdtjddtddd dg }t|jd}tdtjdtjtjdtjtjtjg }t |tst t||dS)Nrrwrdrr3rrcccrr;rCrrccccZddddddTrg@Z foofoofoo) rrjrrqr2r rrrr rt)rurnrrrrrrrr test_repeatus    $zTestStringMethods.test_repeatcCs|tddgdd}|jddg}tddgdd}t||tddgdd}|jddg}tddgdd}t||dS)Nrrc)rir3rrrw)rrqr2r r)rurnrrrrrtest_repeat_with_nulls z'TestStringMethods.test_repeat_with_nullc Cstdtjdg}|jd}tdtjdg}t||tddtjdg}|jd}tddtjdg}t||tdtjd dtddd d g }t|jd}tdtjdtjtjdtjtjtjg }t |tst t||td d tjgjjd dd}tdddg}t||td d tjgjd }tdtjtjg}t||tddddg}|jjddd}tddddg}t||dS)Nrrz.*(BAD[_]+).*(BAD)TFBAD_BADleroybrownz.*BAD[_]+.*BADaBAD_BAD BAD_b_BADr;g@rr)r!rABr)ABC)r) rrjrrqr+r rrrr rt) rurnrrrrrresrrrr test_matchs@    &   zTestStringMethods.test_matchcCstddtjdg}|jd}tddtjdg}t||tddtjdgdd}|jd}tddtjdgd d}t||td d d d g}|jjd dd}tddddg}t||dS)Nrr1rz.*BAD[_]+.*BADTFrc)ribooleanrr4r)r5)r)rrjrrqr,r r)rurnrrZ string_valuesZ string_exprrrrtest_fullmatchs    z TestStringMethods.test_fullmatchc Cs>tdtjdg}tjtdd|jjdddWdQRXdS)Nrrzexpand must be True or False)r+z.*(BAD[_]+).*(BAD))r#)rrjrrxryrrqr")rurnrrrtest_extract_expand_Nonesz*TestStringMethods.test_extract_expand_NonecCsJtdtjdg}|jd}t|ts*t|jjddd}t ||dS)Nrrz .*(BAD[_]+).*T)r#) rrjrrqr"r rrtr r)rurnZresult_unspecifiedZ result_truerrrtest_extract_expand_unspecifieds  z1TestStringMethods.test_extract_expand_unspecifiedc Cs:tdtjdg}tjtjg}|jjddd}tddg||g}t||tdtjd d t ddd d g }t|jjddd}td dg|d dg||||||g }t||tdtjdg}|jjddd}tddg||g}t||t dddddg}t j t dd|jjdddWdQRXxtt gD]}|dddg} d} t j t | d| jjdddWdQRXt j t | d| jjdddWdQRX|ddg} | jjddd}|jdkst|ddgdd}|tkrt||n t||q0Wtdddg} | jjddd}ttjtjtjgtd }t||| jjd!dd}ttjtjgtjtjgtjtjggtd }t||| jjd"dd}tdd#tjg}t||| jjddd}tdd$gd#d%gtjtjgg}t||| jjd&dd}tdd#tjgd'd}t||| jjd(dd}tdd$gd#d%gtjtjggd'd)gd*}t||| jjd+dd}tdd$gd#d%gtjtjggd,d)gd*}t||| jjd-dd}tdd#tjg}t||td.d/d0gjjd1dd}tdd$gd#d%gtjtjgg}t||tddd2gjjd3dd}tdd$gd#d%gtjd2ggd'd)gd*}t||tddd4gjjd5dd}tdd$gd#d%gd4tjggd'd)gd*}t||d6d7} tjtjtjtjtjtjg} x| D]}| |qWtd8d9d:gd;d} | jjdd?gd@d}t|||j|jks6tdS)ANrrz.*(BAD[_]+).*(BAD)F)r#BAD__BADr2r3Tr;g@BAD_A1A2A3ZA4ZB5 supported)r+z ([AB])([123])B2C3z"pattern contains no capture groupsz [ABC][123]z (?:[AB]).*z (?PA)\dunor)rz(_))riz(_)(_)z ([AB])[123]rrrz(?P[AB])letterz!(?P[AB])(?P[123])number)columnsz([AB])(?P[123])rz([AB])(?:[123])A11B22C33z([AB])([123])(?:[123])rz"(?P[AB])?(?P[123])rz#(?P[ABC])(?P[123])?cSsdddg}|dt|}t||d}|jjddd}tdd tjg|d}t||t||djjd dd}d dgd d gdtjgg}t|d dg|d}t ||dS)Nr?rCr)r(z(\d)F)r#rrz(?P\D)(?P\d)?rrrFrG)rHr() rPrrqr"rjrr rrr)r(rrrre_listrrr check_indexns    z@TestStringMethods.test_extract_expand_False..check_indexa3b3c2Zbobz(?P[a-z])rrwrdZsue)rrjrrqr"rr rrrrrxryrrrtrrrlmakeStringIndexmakeUnicodeIndex makeIntIndex makeDateIndexmakePeriodIndexmakeRangeIndex)rurnerrrrridxrs_or_idxr}rrMi_funsr(rrrrrtest_extract_expand_Falses  "     (    $ $     "  "   z+TestStringMethods.test_extract_expand_Falsec Csttdtjdg}tjtjg}|jjddd}tddg||g}t||tdtjd dt ddd d g }t|jjddd}td dg|d dg||||||g }t||xtt gD]}|d ddg}d} t j t | d|jjdddWdQRXt j t | d|jjdddWdQRX|d dg}|jjddd} t| tsLt| d} t| tddgddqWdS)Nrrz.*(BAD[_]+).*(BAD)T)r#r<r=r2r3r;g@r>r?rCrDz"pattern contains no capture groups)r+z [ABC][123]z (?:[AB]).*r@z (?PA)\drEr)r)rrjrrqr"rr rrrrrxryrr rtr) rurnrWrrrrrrYr}Z result_dfZ result_seriesrrrtest_extract_expand_Trues<  "  z*TestStringMethods.test_extract_expand_TruecCsxdD]}tdddg|d}|jjddd}ttjtjtjgtd }t|||jjd dd}ttjtjgtjtjgtjtjggtd }t|||jjd dd}td d tjg}t|||jjddd}td dgd dgtjtjgg}t|||jjddd}tdd d tjgi}t|||jjddd}d dgd dgtjtjgg}t|ddgd}t|||jjddd}t|ddgd}t|||jjddd}td d tjg}t||qWdS)N)N series_namer?rCrD)rz(_)T)r#)riz(_)(_)z ([AB])[123]rrz ([AB])([123])rrz(?P[AB])rFz!(?P[AB])(?P[123])rG)rHz([AB])(?P[123])rz([AB])(?:[123])) rrqr"rrjrrlr r)rur^rrrrLrrrtest_extract_seriess8 (      z%TestStringMethods.test_extract_seriescCs"tdddgjjddd}tddgd d gtjtjgg}t||td d d gjjddd}ddgd d gtjd gg}t|ddgd}t||td d dgjjddd}ddgd d gdtjgg}t|ddgd}t||dd}tjtj tj tj tj tj g}x|D]}||q WdS)NrIrJrKz([AB])([123])(?:[123])T)r#rrrrr?rCrz"(?P[AB])?(?P[123])rFrG)rHrz#(?P[ABC])(?P[123])?cSsdddg}|dt|}t||djjddd}tdd tjg|d}t||t||djjd dd}d dgd d gdtjgg}t|d dg|d}t||dS)Nr?rCr)r(z(\d)T)r#rrz(?P\D)(?P\d)?rrrFrG)rHr() rPrrqr"rrjrr r)r(rrrrLrrrrM s   zCTestStringMethods.test_extract_optional_groups..check_index)rrqr"rrjrr rrQrRrSrTrUrV)rurrrLrMrZr(rrrtest_extract_optional_groupss.       z.TestStringMethods.test_extract_optional_groupscCsDtdddgdd}|jjddd}td d d d gi}t||dS) NrNrOrPr^)rz(?P[a-z])T)r#rFrrwrd)rrqr"rr r)rurr[rrrr'test_extract_single_group_returns_frame"sz9TestStringMethods.test_extract_single_group_returns_framec Csdddddtjdg}ddd d d d d dg}d}dddg}t|}tjddddddddgdd}t|||}|j|tj }t ||tdddd d!d"d#g} t|| } tjd$d%d&d'd(d)d*d+gd,d}t|||}| j|tj }t ||t|| } d-| j _ d.|_ t|||}| j|tj }t ||dd/d0g}d1}t|j|}tjddd2gdd}td3tjd4ftjd5fg|d6d7gd8}t ||d9} t|j| }td3tjd4ftjd5fg|d:d7gd8}t ||dS);Nzdave@google.comztdhock5@gmail.comzmaudelaperriere@gmail.comz'rob@gmail.com some text steve@gmail.comz%a@b.com some text c@d.com and e@f.comr))davercom)Ztdhock5gmailrc)Zmaudelaperriererdrc)Zrobrdrc)Zsteverdrc)rrwrc)rdrrc)rr\rczq (?P[a-z0-9]+) @ (?P[a-z]+) \. (?P[a-z]{2,4}) userdomaintld)rr)r;r)rCr)r3r)r3r;)rr)rr;)rrC)Nr+)r)singleDave)rhToby)rhMaude)multiple robAndSteve)rlabcdef)nonemissing)rorh)rhrir)rhrjr)rhrkr)rlrmr)rlrmr;)rlrnr)rlrnr;)rlrnrC)NNr+)matches description)rqrrr+r?Z32z"(?P[AB])?(?P[123]))rCr;)rrrrrFrG)rHz([AB])?(?P[123])r)rjrrr from_tuplesrrqr$rVERBOSEr rr(r) ruZ subject_listZexpected_tuplesZ named_patternZexpected_columnsSZexpected_indexZ expected_dfZ computed_dfZ series_indexZSiZSnpatternrrrtest_extractall+s             z!TestStringMethods.test_extractallcCstdddgdd}|jd}tjddd d gd d }td ddddgi|}t|||jd}tddddg|}t||dS)NrNrOd4c2r^)rz(?P[a-z]))rr)r;r)rCr)rCr;)Nr+)rrFrrwrrdz([a-z]))rrqr$rrsrr r)rurr[rrrrrtest_extractall_single_groups   z.TestStringMethods.test_extractall_single_groupcCsVtdddgdd}|jd}tjddd d gd d }td dddg|}t||dS)NZab3Zabc3Zd4cd2r^)rz([a-z]+))rr)r;r)rCr)rCr;)Nr+)rrr)rZcd)rrqr$rrsrr r)rurr[rrrrr,test_extractall_single_group_with_quantifiers  z>TestStringMethods.test_extractall_single_group_with_quantifierz data, names)N)i1)Ni2)r{r|rNrOrxc s8t|t|dkr*tt|dd}n$fddtD}tj||d}t|d|dd }tjg|d d}|jd }tdg|d }t |||jd }tddg|d }t |||jd}tdg|d }t |||jd}tddg|d }t |||jd}tddg|d }t ||dS)Nr;r)rc3s |]}t|gdVqdS)r;N)tuple)r[r)rrr sz?TestStringMethods.test_extractall_no_matches..)rr^rl)rr(ri)r+z(z))rHr(z(z)(z)z (?Pz)firstz(?Pz)(?Pz)secondz(z)(?Pz)) rPrrrrsrrqr$rr r) rurrrrreir[rr)rrtest_extractall_no_matchess,          z,TestStringMethods.test_extractall_no_matchescCstdddgdd}|jd}tjddd gdd gd }td d dd gi|d}t||x@tdddgtdddgddgD]}|jd}t||qzWtdddgdtdddgddd}|jd}tjdddgdd gd }td d dd gi|d}t||dS)NZa1a2Zb1Zc1xxx)rz[ab](?P\d))rr)rr;)r;rr+)rdigitrr)r(Zs_nameXXyyzzZidx_name)rr()rr)rr;)rr) rrqr$rrsrr rr)rurr6Zexp_idxrrXrrrtest_extractall_stringindexs(     z-TestStringMethods.test_extractall_stringindexc Cs<tdddgdd}tjtdd|jdWdQRXdS) NrNrOrxr^)rzno capture groups)r+z[a-z])rrxryrrqr$)rurrrrtest_extractall_errorssz(TestStringMethods.test_extractall_errorscCstdddgdddgdd}|jjjd d d }td d dg}t|||jjjdd d }dddg}t|ddgd}t||dS)NrNrOrxrAZB3ZD4r^)r(rz([A-Z])T)r#rrrz!(?P[A-Z])(?P[0-9]))rr)rr)r4rFr)rH)rr(rqr"rr r)rurr[rrLrrr!test_extract_index_one_two_groupss  z3TestStringMethods.test_extract_index_one_two_groupsc Cstdddgdd}d}|jj|dd}|j|}|jd d d }t||d }|jj|dd}|j|}|jd d d }t||d }|jj|dd} |j|}|jd d d }t| |d} |jj| dd} |j| }|jd d d }t| |dS)NrNrOrPr^)rz([a-z])([0-9])T)r#rr+)levelz!(?P[a-z])(?P[0-9])z(?P[a-z])z([a-z]))rrqr"r$xsr r) rurpattern_two_nonameextract_two_nonameZhas_multi_indexZno_multi_indexpattern_two_namedextract_two_namedpattern_one_namedextract_one_namedpattern_one_nonameextract_one_nonamerrrtest_extractall_same_as_extracts*       z1TestStringMethods.test_extractall_same_as_extractc Cstjdddgdd}tdddg|d d }d }|jj|d d }|j|}|jddd}t||d}|jj|d d }|j|}|jddd}t||d} |jj| d d } |j| }|jddd}t| |d} |jj| d d } |j| }|jddd}t| |dS)N)rr)rr)rthird)ZcapitalZordinal)rrNrOrPr^)rz([a-z])([0-9])T)r#rr+)rz!(?P[a-z])(?P[0-9])z(?P[a-z])z([a-z])) rrsrrqr"r$rr r) rurrrrZhas_match_indexZno_match_indexrrrrrrrrr-test_extractall_same_as_extract_subject_index.s0       z?TestStringMethods.test_extractall_same_as_extract_subject_indexcCsBttd}}tdd}ttd}ttd}t||j|d|jksRtt||jt||j dt||j dt||j dt||j dt||j t||jt||jddt||jdt||jdttdgtd |jjd d d ttdd gtd |jjdd d t||jjd dd ttdd gtd |jjddd tttd|jt||jdt||jt||jdt||jdt||jdt||jdt||jdt||jdt||jdt||jjddd t||jj ddd t||jj!d dt||jj!d dt||j"t||j#t||j$t||j%dt||j&dt||j'dt||j(dt||j)t||j*t||j+t||j,t||j-t||j.t||j/t||j0t||j1t||j2t||j3t||j4dt5dd}t||j6|dS)N)riint64r)rrwr3z^ar)rHriz()T)r#r;z()()F*)stop)stepasciir.)7rrlboolr rrqrrtrWrrr>r rQrXr4r2r+rrr"rFrrPr&r%r6r/rr=rSr0r9r:rUrRrTrBr'rrrGrHrJrMrKrOrNrLrIrErVr- maketransr?)ruZ empty_strrhZ empty_intZ empty_boolZ empty_bytestablerrrtest_empty_str_methodsNsv     z(TestStringMethods.test_empty_str_methodscCs<ttd}t}t||jdt||jddS)N)rir)rrqrr rr0r9)rurhZempty_dfrrrtest_empty_str_methods_to_frames z1TestStringMethods.test_empty_str_methods_to_framec Cslddddddddd d g }t|}d d d d d d d d d d g }d d d d d d d d d d g }d d d d d d d d d d g }d d d d d d d d d d g }d d d d d d d d d d g }d d d d d d d d d d g }d d d d d d d d d d g } d d d d d d d d d d g } t|jt|t|jt|t|jt|t|jt|t|jt|t|j t| t|j t| |j d d|Dkst |j dd|Dkst |j dd|Dkst |j dd|Dkst |j dd|Dks$t |j dd|DksFt |j dd|Dksht dS)NrrwZXyrZ3Ar)ZTTZ55rz TFcSsg|] }|qSr)rG)r[rrrrrsz4TestStringMethods.test_ismethods..cSsg|] }|qSr)rH)r[rrrrrscSsg|] }|qSr)rJ)r[rrrrrscSsg|] }|qSr)rM)r[rrrrrscSsg|] }|qSr)rK)r[rrrrrscSsg|] }|qSr)rO)r[rrrrrscSsg|] }|qSr)rN)r[rrrrrs) rr rrqrGrHrJrMrKrOrNrrt) rurnZstr_sZalnum_eZalpha_eZdigit_eZnum_eZspace_eZlower_eZupper_eZtitle_errrtest_ismethodssB""""""z TestStringMethods.test_ismethodscCs6dddddddg}t|}dd d dd d dg}dd dddd dg}t|jt|t|jt|dddddddg}|jd d |Dkst|jd d |Dkstdtj ddtj ddg}t|}dtj d dtj d dg}dtj ddtj d dg}t|jt|t|jt|dS) Nrr¼u★u፸u3ZfourFTcSsg|] }|qSr)rL)r[rrrrrsz4TestStringMethods.test_isnumeric..cSsg|] }|qSr)rI)r[rrrrrs) rr rrqrLrIrrtrjr)rurnrZ numeric_eZ decimal_eZunicodesrrrtest_isnumerics  z TestStringMethods.test_isnumericcCstddtjg}|jd}tdddgdddgdddggtdd}t||tdd d g}|jd }tdddgdddgdddggtd d}t||t ddd g}|jd}t j dddgdd}t ||dS)Nza|bza|crr;rr))rHza;br;Z7abzb|c)r;r;r)r;rr;)rr;r;)rrwrd)r) rrjrrqrFrrr rrrrsr)rurrrrXrrrtest_get_dummiess (  (  z"TestStringMethods.test_get_dummiescCstdddg}|jd}tdddgdddgdddggdddgd}t||td d d g}|jd }tjd ddgdd}t ||dS)Nrzb,namerwrr;rr)rHza|bzname|czb|namer)r;r;rr)rrr;r;)rr;rr;)rrwrdr)r) rrqrFrr rrrrsr)rurrrrXrrr test_get_dummies_with_name_dummys (  z2TestStringMethods.test_get_dummies_with_name_dummyc Cstddtjdg}|jdjd}t||tdtjddt ddd d g }t|jdjd}tdtjdtjtjdtjtjtjg }t |tst t ||dS) Na_b_cc_d_ef_g_hrZa_b asdf_cas_asdfTrr;g@) rrjrrqr=rr rrrr rtr )rurnrrrrrrr test_joins4  zTestStringMethods.test_joinc Cstdddtjdg}|j}|dd}t||tdtjdd t ddd d g }t|j}td tjd tjtjd tjtjtjg }t |tst t ||dS)NrZfoooZfoooooZfooooooocSst|rt|StjS)N)r rPrjr)rrrrr+rz,TestStringMethods.test_len..rrTr;g@r3 ) rrjrrqrPmapr rrrr rtr )rurnrrrrrrrrtest_len's$  &zTestStringMethods.test_lenc Cstdtjddg}|jd}tddgtjgdgg}t||tdtjddtddddg }t|jd}tddgtjgtjtjdgtjtjtjg }t |tst t||dS) Nrrr=zBAD[_]*r<Tr;g@) rrjrrqr&r r rrr rt)rurnrrrrrrrr test_findallCs6   zTestStringMethods.test_findallc CsNtdddddg}|jd}t|tddd d d gtjd d |jDtjd}t |j||j d}t|tddddd gtjdd |jDtjd}t |j||jdd}t|tddddd gtjdd |jDtjd}t |j||j dd}t|tddddd gtjdd |jDtjd}t |j||jddd}t|tddd dd gtjdd |jDtjd}t |j||j ddd}t|tddd dd gtjdd |jDtjd}t |j|t j t dd|jd }WdQRXt j t dd|j d }WdQRXdS)NABCDEFGBCDEFEF DEFGHIJEFEFGHEFXXXXEFrr3r;rcSsg|]}|dqS)r)r%)r[rrrrrosz/TestStringMethods.test_find..)rir,rcSsg|]}|dqS)r)r6)r[rrrrrtscSsg|]}|ddqS)rr3)r%)r[rrrrryscSsg|]}|ddqS)rr3)r6)r[rrrrr~sr-cSsg|]}|dddqS)rr3r-)r%)r[rrrrrscSsg|]}|dddqS)rr3r-)r6)r[rrrrrsz!expected a string object, not int)r+)rrqr%r rrjrkrnrassert_numpy_array_equalr6rxryr)rurnrrrrr test_findks<  zTestStringMethods.test_findc Cs"tdtjdtjdg}|jd}t|tdtjdtjdg|jd}t|tdtjdtjdg|jdd }t|tdtjdtjdg|jdd }t|tdtjdtjdg|jdd d }t|tdtjdtjdg|jdd d }t|tdtjdtjdgdS) Nrrrrrr;rrr3r-)rrjrrqr%r rr6)rurnrrrr test_find_nans  zTestStringMethods.test_find_nanc Csdd}xlttgD]^}|ddddg}|jd}|||dd d d gtjd d |jDtjd}t |j||j d}|||ddddgtjdd |jDtjd}t |j||jdd }|||dd ddgtjdd |jDtjd}t |j||j dd }|||ddddgtjdd |jDtjd}t |j||jddd}|||ddddgtjdd |jDtjd}t |j||j dd d}|||dd d dgtjdd |jDtjd}t |j|t j t dd|jd}WdQRXd}t j t|d|jd }WdQRXt j t|d|j d }WdQRXqWtdddtjg}|jd}t|td d d tjg|j d}t|td d d tjgdS)!NcSs(t|trt||n t||dS)N)r rr rr)rrrrr_checks z,TestStringMethods.test_index.._checkrrrrrrr3r;rcSsg|]}|dqS)r)r()r[rrrrrsz0TestStringMethods.test_index..)rir,rcSsg|]}|dqS)r)r7)r[rrrrrscSsg|]}|ddqS)rr3)r()r[rrrrrscSsg|]}|ddqS)rr3)r7)r[rrrrrsrcSsg|]}|dddqS)rrr)r()r[rrrrrscSsg|]}|dddqS)rrr,)r7)r[rrrrrszsubstring not found)r+ZDEz!expected a string object, not intZabcbrZbcberwrC)rrrqr(rjrkrnrr rr7rxryrrrr)rurrrrrr}rrr test_indexsN    zTestStringMethods.test_indexc Cstddtjdtjdg}|jjddd}tdd tjd tjdg}t|||jjdd d}td d tjdtjdg}t|||jjddd}tddtjdtjdg}t||tdtjddtddddg }t|jjddd}tdtjd tjtjdtjtjtjg }t |tst t||tdtjddtddddg }t|jjdd d}td tjd tjtjdtjtjtjg }t |tst t||tdtjddtddddg }t|jjddd}tdtjdtjtjdtjtjtjg }t |ts t t||dS)Nrrwrdeeeeeer,r)sidez az bz crza zb zc bothz a z b z c Teer;g@z eezee z ee ) rrjrrqr/r r rrr rt)rurnrrrrrrrrtest_pads8    $  $  $zTestStringMethods.test_padc Cstddtjdtjdg}|jjdddd}td d tjd tjdg}t|||jjdd dd}td dtjdtjdg}t|||jjdddd}tddtjdtjdg}t||d}tjt |d|jjddd}WdQRXd}tjt |d|jjddd}WdQRXdS)Nrrwrdrr,rX)rfillcharXXXXaZXXXXbZXXXXcraXXXXZbXXXXZcXXXXrXXaXXZXXbXXZXXcXXz%fillchar must be a character, not str)r+XY)rz%fillchar must be a character, not int) rrjrrqr/r r rxryr)rurnrrr}rrrtest_pad_fillchar s    z#TestStringMethods.test_pad_fillcharr\rr*r8rDr/c CsBtddddg}d}tjt|dt|j|dWdQRXdS)Nr22rrz#width must be of integer type, not*)r+r\)rrxryrrrq)rur\rr}rrrtest_pad_width& sz TestStringMethods.test_pad_widthcCsdd}xRttgD]F}|ddddg}tdd}|j|}|d d d d g}|||qWtd dddg}tdddtjg}|j|}t||dS)NcSs(t|trt||n t||dS)N)r rr rr)rrrrrr0 s z0TestStringMethods.test_translate.._checkZabcdefgZabccZcdddfgZcdefgggr)cdeZcdedefgZcdeeZedddfgZedefgggrrwrdg333333?rr) rrrqrr?rjrr r)rurrrrrrrrrtest_translate/ s   z TestStringMethods.test_translatec Cstddtjdtjdg}|jd}tddtjdtjdg}t|||jd}td d tjd tjdg}t|||jd}td d tjdtjdg}t||tdtjddt dddddg }t|jd}tdtjdtjtjddtjtjtjg }t |tst t||t|jd}td tjd tjtjd dtjtjtjg }t |tsft t||t|jd}td tjd tjtjddtjtjtjg }t |tst t||dS)Nrrwrdrr,z a z b z c za zb zc z az bz cTZeeer;g@z eee zeee z eee) rrjrrqrr r r*r8rrr rt)rurnrrrrrrrrtest_center_ljust_rjustC sl            z)TestStringMethods.test_center_ljust_rjustc CsNtdddddg}|jjddd}td d d ddg}t||tjd d |jDtjd}t |j||jj ddd}tdddddg}t||tjdd |jDtjd}t |j||jj ddd}tddd ddg}t||tjdd |jDtjd}t |j|d}t j t|jddd|jjdddWdQRXt j t|jddd|jj dddWdQRXt j t|jddd|jj dddWdQRXt j t|jddd|jjdddWdQRXt j t|jddd|jj dddWdQRXt j t|jddd|jj dddWdQRXdS)Nrrr.Zdddddrr,r)rrZXXbbXZXcccccSsg|]}|ddqS)r,r)r)r[rrrrr szFTestStringMethods.test_center_ljust_rjust_fillchar..)rirZbbXXXZccccXcSsg|]}|ddqS)r,r)r*)r[rrrrr srZXXXbbcSsg|]}|ddqS)r,r)r8)r[rrrrr sz)fillchar must be a character, not {dtype}rq)r+rintr;)rrqrr rrjrkrnrrr*r8rxryrformat)rurnrrtemplaterrr test_center_ljust_rjust_fillchar s:   z2TestStringMethods.test_center_ljust_rjust_fillcharcCstdddddg}|jd}tddd d dg}t||tjd d |jDtjd }t |j||jd}tdddddg}t||tjdd |jDtjd }t |j|tdtj dtj dg}|jd}tdtj d tj dg}t||dS)NrrrZ333Z45678r,Z00001Z00022Z00aaaZ00333cSsg|]}|dqS)r,)rD)r[rrrrr sz0TestStringMethods.test_zfill..)rir3Z001Z022cSsg|]}|dqS)r3)rD)r[rrrrr s) rrqrDr rrjrkrnrrr)rurnrrrrr test_zfill s     zTestStringMethods.test_zfillc Cstddtjdg}|jd}tdddgddd gtjd d d gg}t||td dtjdg}|jd}t|||jjddd}t||tdtjddtdddg}|jd}tdddgtjdd d gtjtjtjtjtjg}t |tst t |||jjddd}t |ts(t t ||tddtjdg}|jd}tdddgddd gtjd d d gg}t||dS)NrrrrZrrwrdrrr\gra__b__cc__d__ef__g__h__F)r#d_e_fTr;g@za,b_czc_d,ezf,g,hz[,_]) rrjrrqr=r rrrr rtr )rurnrrrrrr test_split s: $         $zTestStringMethods.test_splitrcrr=rScCsptdtjdg|d}tddgtjddgg}t|j|ddd}t||t|j|dd d}t||dS) Nza bzb c)rirrwrdr1)rr)rpdrrrqr r)rurirrrrrrr test_split_n s  zTestStringMethods.test_split_nc Cstddtjdg}|jd}tdddgddd gtjd d d gg}t||td dtjdg}|jd}t|||jjddd}t||tdtjddtdddg}|jd}tdddgtjdd d gtjtjtjtjtjg}t |tst t |||jjddd}t |ts(t t ||tddtjdg}|jd}tdgdgtjdgg}t||tddtjdg}|jjddd}tddgdd gtjdd gg}t||dS)NrrrrZrrwrdrrr\rrrrrrF)r#rTr;g@za,b_czc_d,ezf,g,hz[,_])rrc_df_g) rrjrrqrSr rrrr rtr )rurnrrrrrr test_rsplit sB $          zTestStringMethods.test_rsplitcCstdgdd}|jjdd}tgg}t||tddddgdd}|jjdd}td d d gd d tjgtjtjtjgtjtjtjgg}t||dS) Nr)test)rT)r#za b cza br1rrwrd)rrqr=rr rrjr)rurnrrrrrtest_split_blank_string4 s   z)TestStringMethods.test_split_blank_stringcCsLtddg}|j}ddg}|d|ks.t|j}|d|ksHtdS)Nz Wes McKinneyzTravis OliphantZTravisZOliphantr;)rrqr=rtrS)rurrrrrrtest_split_noargsG s    z#TestStringMethods.test_split_noargscCstddg}|jjdd}|j}t|||jjdd}t|||jd}|jjddd}t|||jjddd}t||dS)Nz bd asdf jfgzkjasdflqw asdfnfkr)rrZasdf)rrqr=r r)rurrrrrrtest_split_maxsplitP s      z%TestStringMethods.test_split_maxsplitcCsDtddg}|jjdd}tddgddgd}tj||d d dS) Nz split oncezsplit once too!r;)rr=oncez once too!)rr;F)Zcheck_index_type)rrqr=r r)rurrrrrr test_split_no_pat_with_nonzero_nb s z2TestStringMethods.test_split_no_pat_with_nonzero_nc CsXtddg}|jjddd}tdtddgi}t||tddg}|jjddd}td d gd d gd dgd}t||tddg}|jjddd}td dgddgd dgtjdgtjdgtjdgd}t||tddgddgd}|jjddd}td d gd dgd ddgd}t||tj t d!d"|jjdd#dWdQRXdS)$Nnosplit alsonosplitrZT)r#rsome_equal_splits with_no_nanssomewithequalnosplitsnans)rr;rCsome_unequal_splitsone_of_these_things_is_notoneunequalofthesethingsisnot)rr;rCr3rr, some_splits with_indexpreserveme)r(r()rr;zexpand must be)r+ not_a_boolean) rrqr=rr rrjrrxryr)rurrrrrrtest_split_to_dataframeh s4       z)TestStringMethods.test_split_to_dataframec CsFtddtjg}|jjddd}|}t|||jdks>ttddtjdg}|jjddd}t d d tjtjtjgdddgg}t|||jd ksttd d tjdg}|jjddd}t dddtjtjtjfdtjtjtjtjtjtjfdg}t|||jdkstt j t dd|jjdddWdQRXdS)NrrrZT)r#r;rr)rrr)rrrr3rrrrr)rrrrrr)NNNNNNr-zexpand must be)r+r)rrjrrqr=r rnlevelsrtrrsrxryr)rurXrrrrrtest_split_to_multiindex_expand s2   z1TestStringMethods.test_split_to_multiindex_expandcCs@tddg}|jjddd}tdtddgi}t||tddg}|jjddd}td d gd d gd dgd}t|||jjdddd}td d gd d gd dgd}t|||jjdddd}tddgd dgd}t||tddgddgd}|jjddd}td d gd dgdddgd}t||dS)NrrrZT)r#rrrrrrrrr)rr;rCrC)r#rr; some_equalwith_no)rr;rrrr)r(r()rrqrSrr r)rurrrrrrtest_rsplit_to_dataframe_expand s*      z1TestStringMethods.test_rsplit_to_dataframe_expandcCstddg}|jjddd}|}t|||jdks:ttddg}|jjddd}td d g}t|||jd ks~ttddg}|jjdddd }td dg}t|||jdkstdS)NrrrZT)r#r;rr)rrr)rrrr3)r#r)rr)rrrC) rrqrSr rrrtrrs)rurXrrrrr test_rsplit_to_multiindex_expand s        z2TestStringMethods.test_rsplit_to_multiindex_expandcCshtdtjg}|jjddd}tdddgtjtjtjgg}t||tdd |j d Dsdt dS) Nz foo,bar,bazrT)r#rrbazcss|]}t|VqdS)N)rjisnan)r[rrrrr~ sz:TestStringMethods.test_split_nan_expand..r;) rrjrrqr=rr rallilocrt)rurrrrrrtest_split_nan_expand s  z'TestStringMethods.test_split_nan_expandcCstddgdd}|jd}tddgdd ggdd}t|||jjdd d }tddgdd gg}t||tddgdd}|jd}tddgdd ggdd}|jd kst t |||jjdd d }t d dg}|jdkst t ||dS)Nza,bzc,dr)rrrrwrdrT)r#r;)rrw)rdrrC) rrqr=r rrrrrrtrrrs)rurr6rrXrrrtest_split_with_name s      z&TestStringMethods.test_split_with_namecCstddtjddg}|jjddd}tddtjd dg}t|||jjddd}td d tjd dg}t||td dtjddg}|jjddd}tddtjddg}t|||jjddd}tddtjddg}t||tddtjddg}|jjdd}tddtjddg}t|||jjdd}tddtjddg}t||td d!tjd"dg}|jjddd}td#d$tjd%dg}t|||jjddd}td&d'tjd(dg}t||tddtjdg}|jjddd}tddtjd g}t|||jjddd}td d tjd g}t||td)d*d+d,g}|jjddd}|d-d.|Dksvt |jjddd}|d/d.|Dkst dS)0NrrrrZF)r#)rrZb_c)rdrZd_e)r\rZg_h)rrZrd)rrZr)rrZrrrrr)rrZb__c)rdrZd__e)r\rZg__h)Za__brrd)Zc__drr)Zf__grrza b czc d ezf g h)rr1zb c)rdr1zd e)r\r1zg h)za br1rd)zc dr1r)zf gr1rr)rfgh)r)r)r))rr)r))rr)r))r)r)r))r)r)r)r)r)rZA_B_CZB_C_DZE_F_GrcSsg|]}|dqS)rZ)r0)r[rrrrr[ sz;TestStringMethods.test_partition_series..cSsg|]}|dqS)rZ)r9)r[rrrrr] s) rrjrrqr0r rr9rrt)rurnrrrrrtest_partition_series sl          z'TestStringMethods.test_partition_seriescCsPtdddtjdg}|jjddd}ttjddd tjdgtd }t|||j d ks\t |jj ddd}ttjd d dtjdgtd }t|||j d kst |jd}tddd tjtjtjfdg}t||t |t st |j dkst |j d}td d dtjtjtjfdg}t||t |t s  z#TestStringMethods.test_replace_moarc Cstdddddtjdddg }|jd }|jd }t|||jdd }|jjd d }t|||jd dd }|jjd d d}t||dS)Nr2rrr4r5r3r/r6rr3)rrCr)rr)rrjrrqr'r rr:)rurrrrrrtest_string_slice_get_syntax- s&    z.TestStringMethods.test_string_slice_get_syntaxcCsltdddg}|jd}tdtjdg}t||tddd g}|jd}td tjd g}t||dS) N)r;rC)r;)r3rr,r;rCrrrwr(rr)rrqrjrr r)rurrrrrrtest_string_slice_out_of_boundsH s   z1TestStringMethods.test_string_slice_out_of_boundsc Cs dddtjd}t|}d}|jj|tjdd}|jdd d d gksLt |jj |tjd }|dsjt |jj |tjd }|dst |jj |tjd }|ddd kst |jj |tjd }|ddkst tt|jj|tjd }WdQRX|dst dS)Nzdave@google.comzsteve@gmail.comz rob@gmail.com)riZSteveZRobZWesz,([A-Z0-9._%+-]+)@([A-Z0-9.-]+)\.([A-Z]{2,4})T)rr#rrbrrc)r)rbrrcr;)rjrrrqr"rr&r rrtr+r,r&rr r UserWarningr)rurrrrrrtest_match_findall_flagsU s&    z*TestStringMethods.test_match_findall_flagscCsHtdddg}|jd}dd}|jd}||}t||dS)Nrrwuaäzutf-8cSs |dS)Nzutf-8)r)rrrrrw rz6TestStringMethods.test_encode_decode..)rrqrrrr r)rubaseZseriesr\rrrrrtest_encode_decodes s    z$TestStringMethods.test_encode_decodec Cstdddg}d}tjt|d|jdWdQRXdd}|jdd }||}t||td d d g}d }tjt |d|j dWdQRXdd}|j dd }||}t||dS)Nrrwuaz['charmap' codec can't encode character '\\x9d' in position 1: character maps to )r+cp1252cSs |ddS)Nr>ignore)r)rrrrr rz=TestStringMethods.test_encode_decode_errors..r?rfbsazS'charmap' codec can't decode byte 0x9d in position 1: character maps to cSs |ddS)Nr>r?)r)rrrrr r) rrxryUnicodeEncodeErrorrqrrr rUnicodeDecodeErrorr)ruZ encodeBaser}r\rrZ decodeBaserrrtest_encode_decode_errors} s    z+TestStringMethods.test_encode_decode_errorsc Csdddtjdg}t|ddddd gd }ddd tjd g}t|ddddd gd }|jd }t||tdddtjdgddddd gd }|jd}t||tjt dd|jdWdQRXt dddg}t dd d g}|jd }t ||dS)Nr5u ABCu 123u アイエrrwrdrr)r(Z123u アイエNFKCr.zinvalid normalization form)r+r) rjrrrqr-r rrxryrrr)rurnrnormedrrrrrtest_normalize s       z TestStringMethods.test_normalizec Csddlm}ddgdfdddgdfdddgd fddddgdfd td ddgd fg}xJ|D]B\}}t|}tt|j|s|tt|j|st|j|ksXtqXWxJ|D]B\}}t|}tt|j|stt|j|st|j|kstqWdt j gd ftd ddgd ft dgdfg}xr|D]j\}}t|}d}t j t|dt|jWdQRXt j t|d |jWdQRX|j|kstqWtddg}|jd kstd}t j t|d |jWdQRXdS)Nr)rsrrwrcr;z mixed-integerg?rriZfloatingZ datetime64Z timedelta64z-Can only use .str accessor with string values)r+)rrwz5Can only use .str accessor with Index, not MultiIndex)pandas.core.stringsrsrrr rrqrtZ inferred_typerjrrrxryrzrrs)rursZcasesrntprXrrrr"test_index_str_accessor_visibility s@     z4TestStringMethods.test_index_str_accessor_visibilityc Cs2ttd}tjtddd|j_WdQRXdS)NZaabbcdez You cannot add any new attribute)r+r)rrrxryrzrqZxlabel)rurrrr#test_str_accessor_no_new_attributes s z5TestStringMethods.test_str_accessor_no_new_attributesc Cs^tttddt}tttddt}tjtdd|j |WdQRXdS)Nr)ZS1defz$Cannot use .str.cat with values of.*)r+) rrjrkrrrlrxryrrqr)rulhsrhsrrrtest_method_on_bytes sz&TestStringMethods.test_method_on_bytescCs>tdtjddg}tdtjddg}|j}t||dS)NssrZssdßußd)rrjrrqrYr r)rurrrrrr test_casefold s zTestStringMethods.test_casefold)__name__ __module__ __qualname__rvr~rxr parametrizerlrrrrrrrrrrrrrrrrrrrrrrrrrjrrrrrr r r rrrr$r'r*r/r0r7r9r:r;r\r]r_r`rarwryrzrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr r rrrrrrrrrr r!r%r(r)r}rkr+r0r1r7r8r9r;r=rCrFrIrJrNrQrrrrrps B  ( %  e-Q &$5 ,-/04 f    ' F*&((9. J,+ 0 %& U5- * #    ,  $ "  ,     +  rpc Cs|\}}}|dkrtdddtjdg}t|td}t|dd}t|j|||}t|j|||}t|tr4|j dkrt | j r|j dkst|t}n|j dkrt j|j d d r|j d kst|t}nX|j d kr|j d kst|d }n4|j d krx|rx|j dks(t|d }nDt|trx|jddj} t|| jdksft|| t|| <t||dS)Nrzdecode requires bytes.rrr+)rircrlT)Zskipnar8rfloatInt64)include)rxskiprjrrrlrrqr rirZis_string_arrayrrnrtrZ is_bool_arrayr anyrZ select_dtypesrHr dtypesr assert_equal) rbrrrrrrwrrrHrrrtest_string_array s8             r]zmethod,expectedcCs<tddgdd}t|j|d}t|dd}t||dS)Nrrc)rirrW)rrrqr r)rrrrrrr'test_string_array_numeric_integer_arrays  r^cCs<tdddgdd}t|j|}t|dd}t||dS)Nrrrc)rir8)rrrqr r)rrrrrrrtest_string_array_boolean_array,s  r_cCsttdddgdd}tdddgdd}d}|jj|dd }|jj|dd }t|jdksZt|t}t ||dS) NZa1Zb2rrc)rirlz(\w)(\d)F)r#) rrqr"r r[rtrrlr r\)rrwrrrrrrtest_string_array_extract<s r`rcCsBtdddg}|j|dddg}tddd g}t||dS) Nrrwrdrr#r5ZaxZbycz)rrqrr r)rrrrrrrtest_cat_different_classesKsrbcCsPttddtjdg}|jd}tttjtjtjdg}t||dS)Nrrr)rCrd)rrrkrrqr'r r)rrrrrr&test_str_get_stringarray_multiple_nansTs rccCs>ttdd}tdddg}|jdddd }t||dS) Nr)rKzA/DzB/EzC/FcSsd|jS)N/)rrqrX)r\rrrr_rz1test_str_accessor_in_apply_func..r;)r)rziprapplyr r)rrrrrrtest_str_accessor_in_apply_func[srg)3rrrZnumpyrjrxZ pandas._libsrZpandasrrrrrr r r Zpandas._testingZ_testingr rGcorerrrrreZ_any_string_methodr_rZdirrsrZmissing_methodsrtZfixturerbrZ"_any_allowed_skipna_inferred_dtyperorpr]rrUr^r_r`r}rkrbrcrgrrrrs $                    'L(        &