\e[c@`sddlmZmZmZddlmZddlmZddl m Z ddl m Z ddl m Z m Z ddl mZmZmZdd l mZmZdd l mZdd lmZdd lmZee Zd efdYZdS(i(tabsolute_importtdivisiontunicode_literals(tunichr(tdequei(tspaceCharacters(tentities(t asciiLetterstasciiUpper2Lower(tdigitst hexDigitstEOF(t tokenTypest tagTokenTypes(treplacementCharacters(tHTMLInputStream(tTriet HTMLTokenizercB`seZdZdJdZdZdZdJedZdZ dZ dZ dZ d Z d Zd Zd Zd ZdZdZdZdZdZdZdZdZdZdZdZdZdZdZdZ dZ!dZ"dZ#d Z$d!Z%d"Z&d#Z'd$Z(d%Z)d&Z*d'Z+d(Z,d)Z-d*Z.d+Z/d,Z0d-Z1d.Z2d/Z3d0Z4d1Z5d2Z6d3Z7d4Z8d5Z9d6Z:d7Z;d8Z<d9Z=d:Z>d;Z?d<Z@d=ZAd>ZBd?ZCd@ZDdAZEdBZFdCZGdDZHdEZIdFZJdGZKdHZLdIZMRS(Ku  This class takes care of tokenizing HTML. * self.currentToken Holds the token that is currently being processed. * self.state Holds a reference to the method to be invoked... XXX * self.stream Points to HTMLInputStream object. cK`sbt|||_||_t|_g|_|j|_t|_d|_ t t |j dS(N(RtstreamtparsertFalset escapeFlagt lastFourCharst dataStatetstatetescapetNonet currentTokentsuperRt__init__(tselfRRtkwargs((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR"s      cc`s}tg|_xg|jrxx6|jjrVitdd6|jjjdd6Vq!Wx|jrt|jjVqZWqWdS(u This is where the magic happens. We do our usually processing through the states and when we have a token to return we yield the token which pauses processing until the next token is requested. u ParseErrorutypeiudataN(Rt tokenQueueRRterrorsR tpoptpopleft(R((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyt__iter__1s * c %C`st}d}|r!t}d}ng}|jj}x8||krp|tk rp|j||jj}q9Wtdj||}|tkrt|}|j jit dd6dd6i|d6d 6nd |kod kns|d kr3d }|j jit dd6dd6i|d6d 6nrd|koJdknsd|kofdknsd|kodknsd|kodkns|t ddddddddddd d!d"d#d$d%d&d'd(d)d*d+d,d-d.d/d0d1d2d3d4d5d6d7d g#krQ|j jit dd6dd6i|d6d 6nyt |}WnAt k r|d8}t d |d?Bt d9|d:@B}nX|d;kr|j jit dd6d<d6|jj|n|S(=uThis function returns either U+FFFD or the character based on the decimal or hexadecimal representation. It also discards ";" if present. If not present self.tokenQueue.append({"type": tokenTypes["ParseError"]}) is invoked. i iuu ParseErrorutypeu$illegal-codepoint-for-numeric-entityudatau charAsIntudatavarsiiiu�iiiiiiiii iiiiiiiiiiiiiiiiiii i i i i i i i i i iiiiiiiiu;u numeric-entity-without-semicolon(R R RtcharR tappendtinttjoinRR R t frozensettchrt ValueErrortunget( RtisHextallowedtradixt charStacktct charAsIntR%tv((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytconsumeNumberEntityAs`              *  c C`sd}|jjg}|dtks]|dtddfks]|dk rt||dkrt|jj|dn|ddkrpt}|j|jj|ddkrt}|j|jjn|r|dt ks| r"|dt kr"|jj|d|j |}q7|j jit dd 6d d 6|jj|jdd j|}nxF|dtk rtjd j|sPn|j|jjqsWy,tjd j|d }t|}Wntk rd}nX|dk r|dd kr@|j jit dd 6dd 6n|dd kr|r||tks||t ks||dkr|jj|jdd j|}q7t|}|jj|j|d j||7}nK|j jit dd 6dd 6|jj|jdd j|}|r[|jd ddc|7/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyt consumeEntitysf)        cC`s|jd|dtdS(uIThis method replaces the need for "entityInAttributeValueState". R;R<N(RBR5(RR;((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytprocessEntityInAttributescC`s|j}|dtkr|djt|d<|dtdkr|drs|jjitdd6dd6n|dr|jjitdd6dd6qqn|jj||j|_d S( uThis method is a generic handler for emitting the tags. It also sets the state to "data" because that's what's needed after a token has been emitted. utypeunameuEndTagudatau ParseErroruattributes-in-end-tagu selfClosinguself-closing-flag-on-end-tagN( RR t translateRR R R&RR(Rttoken((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytemitCurrentTokens   cC`s(|jj}|dkr*|j|_n|dkrE|j|_n|dkr|jjitdd6dd6|jjitdd6dd6n|tkrt S|t kr|jjitd d6||jj t t d6n8|jj d }|jjitdd6||d6t S( Nu&u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs&      !cC`s|j|j|_tS(N(RBRRR5(R((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRGs  cC`s(|jj}|dkr*|j|_n|dkrE|j|_n|tkrUtS|dkr|jjit dd6dd6|jjit dd6d d6n||t kr|jjit d d6||jj t t d6n8|jj d }|jjit dd6||d6t S( Nu&u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyt rcdataStates&      !cC`s|j|j|_tS(N(RBRNRR5(R((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRL1s  cC`s|jj}|dkr*|j|_n|dkr}|jjitdd6dd6|jjitdd6dd6nH|tkrtS|jj d }|jjitdd6||d6t S( Nu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyt rawtextState6s    cC`s|jj}|dkr*|j|_n|dkr}|jjitdd6dd6|jjitdd6dd6nH|tkrtS|jj d }|jjitdd6||d6t S( Nu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytscriptDataStateHs    cC`s|jj}|tkrtS|dkrr|jjitdd6dd6|jjitdd6dd6n2|jjitdd6||jjdd6tS(Nuu ParseErrorutypeuinvalid-codepointudatau Charactersu�( RR%R RR R&R RIR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytplaintextStateZs   cC`s|jj}|dkr*|j|_nr|dkrE|j|_nW|tkritdd6|d6gd6td6td6|_|j |_n |d kr|j j itd d6d d6|j j itd d6d d6|j |_n|dkr<|j j itd d6dd6|jj ||j|_n`|j j itd d6dd6|j j itd d6dd6|jj ||j |_tS(Nu!u/uStartTagutypeunameudatau selfClosinguselfClosingAcknowledgedu>u ParseErroru'expected-tag-name-but-got-right-bracketu Charactersu<>u?u'expected-tag-name-but-got-question-markuexpected-tag-nameu<(RR%tmarkupDeclarationOpenStateRtcloseTagOpenStateRR RRt tagNameStateR R&RR,tbogusCommentStateR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRHis6      "   " cC`s?|jj}|tkrSitdd6|d6gd6td6|_|j|_n|dkr|jj itdd6dd6|j |_n|t kr|jj itdd6d d6|jj itd d6d d6|j |_nL|jj itdd6d d6i|d6d 6|jj ||j |_tS(NuEndTagutypeunameudatau selfClosingu>u ParseErroru*expected-closing-tag-but-got-right-bracketu expected-closing-tag-but-got-eofu Charactersu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRUs(     " cC`s|jj}|tkr*|j|_n|dkrC|jn|tkr|jjit dd6dd6|j |_nr|dkr|j |_nW|dkr|jjit dd6dd6|j d cd 7u ParseErrorutypeueof-in-tag-nameudatau/uuinvalid-codepointunameu�(RR%RtbeforeAttributeNameStateRRFR R R&R RtselfClosingStartTagStateRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRVs"        cC`su|jj}|dkr3d|_|j|_n>|jjitdd6dd6|jj||j |_t S(Nu/uu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRMs  " cC`s{|jj}|tkr9|j|7_|j|_n>|jjitdd6dd6|jj ||j |_t S(Nu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR[s " cC`s|jo(|jdj|jjk}|jj}|tkr|ritdd6|jd6gd6td6|_|j|_ n|dkr|ritdd6|jd6gd6td6|_|j |_ n|dkr+|r+itdd6|jd6gd6td6|_|j |j |_ nc|t krI|j|7_nE|jjitdd6d |jd6|jj||j|_ tS( NunameuEndTagutypeudatau selfClosingu/u>u Charactersu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR\s2+      cC`su|jj}|dkr3d|_|j|_n>|jjitdd6dd6|jj||j |_t S(Nu/uu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyROs  " cC`s{|jj}|tkr9|j|7_|j|_n>|jjitdd6dd6|jj ||j |_t S(Nu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR_s " cC`s|jo(|jdj|jjk}|jj}|tkr|ritdd6|jd6gd6td6|_|j|_ n|dkr|ritdd6|jd6gd6td6|_|j |_ n|dkr+|r+itdd6|jd6gd6td6|_|j |j |_ nc|t krI|j|7_nE|jjitdd6d |jd6|jj||j|_ tS( NunameuEndTagutypeudatau selfClosingu/u>u Charactersu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR`s2+      cC`s|jj}|dkr3d|_|j|_n{|dkrp|jjitdd6dd6|j|_n>|jjitdd6dd6|jj ||j |_t S( Nu/uu!u Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRQs   "" cC`s{|jj}|tkr9|j|7_|j|_n>|jjitdd6dd6|jj ||j |_t S(Nu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRa,s " cC`s|jo(|jdj|jjk}|jj}|tkr|ritdd6|jd6gd6td6|_|j|_ n|dkr|ritdd6|jd6gd6td6|_|j |_ n|dkr+|r+itdd6|jd6gd6td6|_|j |j |_ nc|t krI|j|7_nE|jjitdd6d |jd6|jj||j|_ tS( NunameuEndTagutypeudatau selfClosingu/u>u Charactersu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRc7s2+      cC`sl|jj}|dkrL|jjitdd6dd6|j|_n|jj||j|_t S(Nu-u Charactersutypeudata( RR%R R&R tscriptDataEscapeStartDashStateRR,RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRbSs " cC`sl|jj}|dkrL|jjitdd6dd6|j|_n|jj||j|_t S(Nu-u Charactersutypeudata( RR%R R&R tscriptDataEscapedDashDashStateRR,RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRd]s " cC`s|jj}|dkrL|jjitdd6dd6|j|_n|dkrg|j|_n|dkr|jjitdd6dd6|jjitdd6d d6nS|tkr|j |_n8|jj d }|jjitdd6||d6t S( Nu-u Charactersutypeudatau/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pytscriptDataEscapedStategs" "    cC`s|jj}|dkrL|jjitdd6dd6|j|_n|dkrg|j|_n|dkr|jjitdd6dd6|jjitdd6d d6|j|_nI|t kr|j |_n.|jjitdd6|d6|j|_t S( Nu-u Charactersutypeudatau/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRf{s" "     " cC`sD|jj}|dkr@|jjitdd6dd6n|dkr[|j|_n|dkr|jjitdd6dd6|j|_n|dkr|jjitdd6d d6|jjitdd6d d6|j|_nI|t kr|j |_n.|jjitdd6|d6|j|_t S( Nu-u Charactersutypeudatauuu ParseErroruinvalid-codepointu�( RR%R R&R RgRRRRhR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRes& %  "    " cC`s|jj}|dkr3d|_|j|_n|tkr}|jjitdd6d|d6||_|j |_n>|jjitdd6dd6|jj ||j |_t S(Nu/uu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRgs   & " cC`su|jj}|tkr3||_|j|_n>|jjitdd6dd6|jj ||j |_t S(Nu Charactersutypeu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRis  " cC`s|jo(|jdj|jjk}|jj}|tkr|ritdd6|jd6gd6td6|_|j|_ n|dkr|ritdd6|jd6gd6td6|_|j |_ n|dkr+|r+itdd6|jd6gd6td6|_|j |j |_ nc|t krI|j|7_nE|jjitdd6d |jd6|jj||j|_ tS( NunameuEndTagutypeudatau selfClosingu/u>u Charactersu/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRks2+      cC`s|jj}|ttdBkrz|jjitdd6|d6|jjdkrk|j |_ q|j |_ n\|t kr|jjitdd6|d6|j|7_n|jj ||j |_ tS(Nu/u>u Charactersutypeudatauscript(u/u>(RR%RR)R R&R RZR]tscriptDataDoubleEscapedStateRRhRR,R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRjs" " cC`s?|jj}|dkrL|jjitdd6dd6|j|_n|dkr|jjitdd6dd6|j|_n|dkr|jjitdd6dd6|jjitdd6d d6n_|tkr|jjitdd6d d6|j |_n"|jjitdd6|d6t S( Nu-u Charactersutypeudatau/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRls$ " "    "cC`sW|jj}|dkrL|jjitdd6dd6|j|_n|dkr|jjitdd6dd6|j|_n|dkr|jjitdd6dd6|jjitdd6d d6|j|_nk|t kr%|jjitdd6d d6|j |_n.|jjitdd6|d6|j|_t S( Nu-u Charactersutypeudatau/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRms( " "     " cC`s|jj}|dkr@|jjitdd6dd6nD|dkr}|jjitdd6dd6|j|_n|dkr|jjitdd6dd6|j|_n|dkr|jjitdd6d d6|jjitdd6d d6|j|_nk|t krV|jjitdd6d d6|j |_n.|jjitdd6|d6|j|_t S( Nu-u Charactersutypeudatauuu ParseErroruinvalid-codepointu�ueof-in-script-in-script( RR%R R&R RnRRRRlR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRos, % " "     " cC`su|jj}|dkrU|jjitdd6dd6d|_|j|_n|jj||j |_t S(Nu/u Charactersutypeudatau( RR%R R&R RZtscriptDataDoubleEscapeEndStateRR,RlR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRn0s "  cC`s|jj}|ttdBkrz|jjitdd6|d6|jjdkrk|j |_ q|j |_ n\|t kr|jjitdd6|d6|j|7_n|jj ||j |_ tS(Nu/u>u Charactersutypeudatauscript(u/u>(RR%RR)R R&R RZR]RhRRlRR,R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRp;s" " cC`s|jj}|tkr1|jjttnz|tkrf|jdj|dg|j|_ nE|dkr|j n,|dkr|j |_ n|dkr|j jit d d 6d d6|jdj|dg|j|_ n|d krH|j jit d d 6d d6|jdjddg|j|_ nc|tkr|j jit d d 6dd6|j|_ n&|jdj|dg|j|_ tS(Nudatauu>u/u'u"u=u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRXKs6            cC`sv|jj}t}t}|dkr6|j|_n|tkry|jdddc||jjtt7u/uu ParseErrorutypeuinvalid-codepointu�u'u"u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRqisR               - cC`s|jj}|tkr1|jjttn|dkrL|j|_nz|dkre|jna|tkr|j dj |dg|j |_n,|dkr|j |_n|dkr |j j itdd6d d6|j dj d dg|j |_n|dkrc|j j itdd6dd6|j dj |dg|j |_nc|tkr|j j itdd6dd6|j|_n&|j dj |dg|j |_tS(Nu=u>udatauu/uu ParseErrorutypeuinvalid-codepointu�u'u"u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRss:             cC`s|jj}|tkr1|jjttn|dkrL|j|_n|dkrw|j|_|jj|nj|dkr|j |_nO|dkr|j j it dd6dd6|j n|d kr%|j j it dd6d d6|jdd d cd 7<|j|_n|dkr}|j j it dd6dd6|jdd d c|7<|j|_nd|tkr|j j it dd6dd6|j|_n'|jdd d c|7<|j|_tS(Nu"u&u'u>u ParseErrorutypeu.expected-attribute-value-but-got-right-bracketudatauuinvalid-codepointiiu�u=u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRrs>               cC`s|jj}|dkr*|j|_n|dkrF|jdn|dkr|jjitdd6dd6|jddd cd 7/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRxs       cC`s|jj}|dkr*|j|_n|dkrF|jdn|dkr|jjitdd6dd6|jddd cd 7/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRzs       cC`sm|jj}|tkr*|j|_n?|dkrF|jdn#|dkr_|jn |dkr|jjit dd 6d d 6|j d d d c|7u"u'u=uu"u'u=u/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRys,          !cC`s|jj}|tkr*|j|_n|dkrC|jn|dkr^|j|_n|tkr|jj it dd6dd6|jj ||j |_n>|jj it dd6dd6|jj ||j|_t S(Nu>u/u ParseErrorutypeu$unexpected-EOF-after-attribute-valueudatau*unexpected-character-after-attribute-value(RR%RRXRRFRYR R R&R R,RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR{ s"        cC`s|jj}|dkr5t|jd<|jn|tkr|jjitdd6dd6|jj ||j |_ n>|jjitdd6dd6|jj ||j |_ tS(Nu>u selfClosingu ParseErrorutypeu#unexpected-EOF-after-solidus-in-tagudatau)unexpected-character-after-solidus-in-tag( RR%R5RRFR R R&R R,RRRX(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRY4s       cC`sc|jjd}|jdd}|jjitdd6|d6|jj|j|_t S(Nu>uu�uCommentutypeudata( RRItreplaceR R&R R%RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRWFs   cC`sB|jjg}|ddkrv|j|jj|ddkritdd6dd6|_|j|_tSnw|ddkr(t}xPdd d!d"d#d$fD]6}|j|jj|d|krt}PqqW|ritdd6dd6dd6dd6td6|_|j |_tSn|ddkr|j dk r|j j j r|j j j dj|j j jkrt}xPd dddddgD]6}|j|jj|d|krt}PqqW|r|j|_tSn|jjitdd6dd6x |r1|jj|jqW|j|_tS(%Niu-uCommentutypeuudatauduDuouOucuCutuTuyuYupuPueuEuDoctypeunameupublicIdusystemIducorrectu[uAu ParseErroruexpected-dashes-or-doctype(uduD(uouO(ucuC(utuT(uyuY(upuP(ueuE(RR%R&R RtcommentStartStateRR5RRt doctypeStateRttreet openElementst namespacetdefaultNamespacetcdataSectionStateR R,R"RW(RR0tmatchedtexpected((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRTUsR    %    cC`s1|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uincorrect-commentueof-in-comment( RR%tcommentStartDashStateRR R&R RRR t commentStateR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR}s(        cC`s5|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uincorrect-commentueof-in-comment( RR%tcommentEndStateRR R&R RRR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs(        cC`s|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs     cC`s|jj}|dkr*|j|_n|dkrz|jjitdd6dd6|jdcd7<|j|_ns|t kr|jjitdd6dd6|jj|j|j |_n#|jdcd|7<|j|_t S( Nu-uu ParseErrorutypeuinvalid-codepointudatau-�ueof-in-comment-end-dash( RR%RRR R&R RRR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs       cC`s|jj}|dkr=|jj|j|j|_nf|dkr|jjitdd6dd6|jdcd7<|j|_n|dkr|jjitdd6d d6|j |_n|d kr|jjitdd6d d6|jdc|7uu ParseErrorutypeuinvalid-codepointudatau--�u!u,unexpected-bang-after-double-dash-in-commentu-u,unexpected-dash-after-double-dash-in-commentueof-in-comment-double-dashuunexpected-char-in-commentu--( RR%R R&RRRR RtcommentEndBangStateR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs6           cC`s2|jj}|dkr=|jj|j|j|_n|dkrk|jdcd7<|j|_n|dkr|jjitdd6dd6|jdcd 7<|j |_ns|t kr |jjitdd6d d6|jj|j|j|_n#|jdcd|7<|j |_t S( Nu>u-udatau--!uu ParseErrorutypeuinvalid-codepointu--!�ueof-in-comment-end-bang-state( RR%R R&RRRRR RR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs(       cC`s|jj}|tkr*|j|_n|tkr|jjitdd6dd6t |j d<|jj|j |j |_n>|jjitdd6dd6|jj ||j|_t S(Nu ParseErrorutypeu!expected-doctype-name-but-got-eofudataucorrectuneed-space-after-doctype(RR%RtbeforeDoctypeNameStateRR R R&R RRRR,R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR~ s      cC`s?|jj}|tkrn|dkr{|jjitdd6dd6t|jd<|jj|j|j|_ n|dkr|jjitdd6dd6d |jd <|j |_ nv|t kr"|jjitdd6d d6t|jd<|jj|j|j|_ n||jd <|j |_ t S( Nu>u ParseErrorutypeu+expected-doctype-name-but-got-right-bracketudataucorrectuuinvalid-codepointu�unameu!expected-doctype-name-but-got-eof( RR%RR R&R RRRRtdoctypeNameStateR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs.            cC`ss|jj}|tkrG|jdjt|jd<|j|_n(|dkr|jdjt|jd<|jj |j|j |_n|dkr|jj it dd6dd6|jdcd7<|j |_n|t kr\|jj it dd6d d6t|jd <|jdjt|jd<|jj |j|j |_n|jdc|7uu ParseErrorutypeuinvalid-codepointudatau�ueof-in-doctype-nameucorrect(RR%RRRDRtafterDoctypeNameStateRR R&RR RR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR6s,       cC`s|jj}|tkrn|dkrL|jj|j|j|_n|tkrt |jd<|jj ||jjit dd6dd6|jj|j|j|_n9|dkr)t }xBd d!d"d#d$fD]+}|jj}||krt }PqqW|r|j |_t Snp|d%krt }xBd&d'd(d)d*fD]+}|jj}||krQt }PqQqQW|r|j|_t Sn|jj ||jjit dd6dd6i|d6d6t |jd<|j|_t S(+Nu>ucorrectu ParseErrorutypeueof-in-doctypeudataupuPuuuUubuBuluLuiuIucuCusuSuyuYutuTueuEumuMu*expected-space-or-right-bracket-in-doctypeudatavars(upuP(uuuU(ubuB(uluL(uiuI(ucuC(usuS(uyuY(usuS(utuT(ueuE(umuM(RR%RR R&RRRR RR,R R5tafterDoctypePublicKeywordStatetafterDoctypeSystemKeywordStatetbogusDoctypeState(RRJRR((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyROsT               cC`s|jj}|tkr*|j|_n|d krw|jjitdd6dd6|jj||j|_ny|t kr|jjitdd6dd6t |j d<|jj|j |j |_n|jj||j|_t S( Nu'u"u ParseErrorutypeuunexpected-char-in-doctypeudataueof-in-doctypeucorrect(u'u"(RR%Rt"beforeDoctypePublicIdentifierStateRR R&R R,R RRRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs"       cC`sg|jj}|tkrnE|dkrFd|jd<|j|_n|dkrnd|jd<|j|_n|dkr|jjit dd6dd 6t |jd <|jj|j|j |_n|t kr(|jjit dd6d d 6t |jd <|jj|j|j |_n;|jjit dd6d d 6t |jd <|j |_tS( Nu"uupublicIdu'u>u ParseErrorutypeuunexpected-end-of-doctypeudataucorrectueof-in-doctypeuunexpected-char-in-doctype(RR%RRt(doctypePublicIdentifierDoubleQuotedStateRt(doctypePublicIdentifierSingleQuotedStateR R&R RRR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs4              cC`s?|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uunexpected-end-of-doctypeucorrectueof-in-doctype( RR%t!afterDoctypePublicIdentifierStateRR R&R RRRR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs*         cC`s?|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uunexpected-end-of-doctypeucorrectueof-in-doctype( RR%RRR R&R RRRR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs*         cC`s|jj}|tkr*|j|_nZ|dkrX|jj|j|j|_n,|dkr|jjit dd6dd6d|jd<|j |_n|d kr|jjit dd6dd6d|jd<|j |_n|t krI|jjit dd6d d6t |jd <|jj|j|j|_n;|jjit dd6dd6t |jd <|j|_tS( Nu>u"u ParseErrorutypeuunexpected-char-in-doctypeudatauusystemIdu'ueof-in-doctypeucorrect(RR%Rt-betweenDoctypePublicAndSystemIdentifiersStateRR R&RRR t(doctypeSystemIdentifierDoubleQuotedStatet(doctypeSystemIdentifierSingleQuotedStateR RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs6              cC`s8|jj}|tkrn|dkrL|jj|j|j|_n|dkrtd|jd<|j|_n|dkrd|jd<|j |_n|t kr|jjit dd6dd 6t |jd <|jj|j|j|_n;|jjit dd6d d 6t |jd <|j |_tS( Nu>u"uusystemIdu'u ParseErrorutypeueof-in-doctypeudataucorrectuunexpected-char-in-doctype(RR%RR R&RRRRRR R RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs.            cC`s|jj}|tkr*|j|_n|d krw|jjitdd6dd6|jj||j|_ny|t kr|jjitdd6dd6t |j d<|jj|j |j |_n|jj||j|_t S( Nu'u"u ParseErrorutypeuunexpected-char-in-doctypeudataueof-in-doctypeucorrect(u'u"(RR%Rt"beforeDoctypeSystemIdentifierStateRR R&R R,R RRRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs"       cC`sg|jj}|tkrnE|dkrFd|jd<|j|_n|dkrnd|jd<|j|_n|dkr|jjit dd6dd 6t |jd <|jj|j|j |_n|t kr(|jjit dd6d d 6t |jd <|jj|j|j |_n;|jjit dd6dd 6t |jd <|j |_tS( Nu"uusystemIdu'u>u ParseErrorutypeuunexpected-char-in-doctypeudataucorrectueof-in-doctype(RR%RRRRRR R&R RRR RR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR/s4              cC`s?|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uunexpected-end-of-doctypeucorrectueof-in-doctype( RR%t!afterDoctypeSystemIdentifierStateRR R&R RRRR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRLs*         cC`s?|jj}|dkr*|j|_n|dkrn|jjitdd6dd6|jdcd7uunexpected-end-of-doctypeucorrectueof-in-doctype( RR%RRR R&R RRRR R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRds*         cC`s|jj}|tkrn|dkrL|jj|j|j|_n|tkr|jjit dd6dd6t |jd<|jj|j|j|_n.|jjit dd6dd6|j |_t S(Nu>u ParseErrorutypeueof-in-doctypeudataucorrectuunexpected-char-in-doctype( RR%RR R&RRRR R RRR5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyR|s        cC`s|jj}|dkr=|jj|j|j|_n>|tkr{|jj||jj|j|j|_nt S(Nu>( RR%R R&RRRR R,R5(RRJ((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs  cC`s`g}xtr|j|jjd|j|jjd|jj}|tkr`Pq |dksrt|dddkr|dd |diiu]]uuiu ParseErrorutypeuinvalid-codepointudatau�u Characters(R5R&RRIR%R tAssertionErrorR(tcounttrangeR R R|RR(RRJR%t nullCountRw((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs0    N(Nt__name__t __module__t__doc__RRR$R4RRBRCRFRRGRNRLRPRRRSRHRURVRMR[R\ROR_R`RQRaRcRbRdRhRfReRgRiRkRjRlRmRoRnRpRXRqRsRrRxRzRyR{RYRWRTR}RRRRRR~RRRRRRRRRRRRRRRR(((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyRs    HP          #                  6 "       -          3            N(t __future__RRRtpip._vendor.sixRR*t collectionsRt constantsRRRRR R R R R Rt _inputstreamRt_trieRR6tobjectR(((s>/tmp/pip-install-0xiv62/pip/pip/_vendor/html5lib/_tokenizer.pyts