U o^;@s@dZddlZGdddeZGdddeZGdddeZdS) z This module contains a tokenizer for Excel formulae. The tokenizer is based on the Javascript tokenizer found at http://ewbi.blogs.com/develops/2004/12/excel_formula_p.html written by Eric Bachtal Nc@seZdZdZdS)TokenizerErrorz$Base class for all Tokenizer errors.N)__name__ __module__ __qualname____doc__rrt/private/var/folders/sd/whlwsn6x1_qgglc0mjv25_695qk2gl/T/pip-install-4zq3fp6i/openpyxl/openpyxl/formula/tokenizer.pyr src@seZdZdZedZedZededdZdZ dZ d d Z d d Z d dZ ddZddZddZddZddZddZddZddZd'd d!Zd"d#Zd$d%Zd&S)( Tokenizera^ A tokenizer for Excel worksheet formulae. Converts a str string representing an Excel formula (in A1 notation) into a sequence of `Token` objects. `formula`: The str string to tokenize Tokenizer defines a method `._parse()` to parse the formula into tokens, which can then be accessed through the `.items` attribute. z^[1-9](\.[0-9]+)?[Ee]$z[ \n]+z"(?:[^"]*"")*[^"]*"(?!")z'(?:[^']*'')*[^']*'(?!')"')z#NULL!z#DIV/0!z#VALUE!z#REF!z#NAME?z#NUM!z#N/Az #GETTING_DATAz,;}) +-*/^&=><%cCs*||_g|_g|_d|_g|_|dS)Nr)formulaitems token_stackoffsettoken_parse)selfr rrr__init__.s zTokenizer.__init__c Cs>|jr dS|jsdS|jddkr2|jd7_n|jt|jtjdSd|jfd|jfd|jfd|j fd |j fd |j fd |j fd |j fd |j ff }i}|D]\}}|t||q|jt|jkr2|rq|j|j}||jkr|||kr|j||7_q|j||jd7_q|dS)z5Populate self.items with the tokens from the formula.Nr=z"'[#  z +-*/^&=><%z{()}z;,)rr rappendTokenLITERAL _parse_string_parse_brackets _parse_error_parse_whitespace_parse_operator _parse_opener _parse_closer_parse_separatorupdatedictfromkeyslencheck_scientific_notation TOKEN_ENDERS save_tokenr)rZ consumers dispatchercharsZconsumer curr_charrrrr7s@      zTokenizer._parsecCs|jdd|j|j}|dks$t|j|}||j|jd}|dkrr|dkrXdnd}td|d |j|d }|dkr|j t |n |j |t |S) a Parse a "-delimited string or '-delimited link. The offset must be pointing to either a single quote ("'") or double quote ('"') character. The strings are parsed according to Excel rules where to escape the delimiter you just double it up. E.g., "abc""def" in Excel is parsed as 'abc"def' in Python. Returns the number of characters matched. (Does not update self.offset) : can_followr Nr stringlinkz%Reached end of formula while parsing z in r)assert_empty_tokenr rAssertionErrorSTRING_REGEXESmatchrgrouprrr make_operandrr*)rdelimregexr9subtyperrrr_s      zTokenizer._parse_stringcCs|j|jdkstddtd|j|jdD}ddtd|j|jdD}d}t||D]F\}}||7}|dkrh|d }|j|j|j|j||Sqhtd |jdS) z Consume all the text between square brackets []. Returns the number of characters matched. (Does not update self.offset) rcSsg|]}|dfqS)rstart.0trrr sz-Tokenizer._parse_brackets..z\[NcSsg|]}|dfqS)r?rArrrrDsz\]rrzEncountered unmatched '[' in ) r rr7refinditersortedrrr)rZleftsZrightsZ open_countidxZ open_closeZ outer_rightrrrr {s" zTokenizer._parse_bracketscCs|jdd|j|jdks t|j|jd}|jD]D}||r6|jt d |j ||j dd=t |Sq6t d|jd|jddS) z Consume the text following a '#' as an error. Looks for a match in self.ERROR_CODES and returns the number of characters matched. (Does not update self.offset) !r2rNzInvalid error code at position  in 'r )r6r rr7 ERROR_CODES startswithrrrr;joinrr*r)rZ subformulaerrrrrr!s    zTokenizer._parse_errorcCsL|j|jdkst|jt|j|jtj|j|j|jd S)z Consume a string of consecutive spaces. Returns the number of spaces found. (Does not update self.offset). )rrN) r rr7rrrWSPACE WSPACE_REr9endrrrrr"szTokenizer._parse_whitespacecCs |j|j|jddkrD|jt|j|j|jdtjdS|j|j}|dks\t|dkrrtdtj}n|dkrt|tj}nt|jst|tj}n`t ddt |jDd}|o|j tj kp|j tjkp|j tjk}|rt|tj}n t|tj}|j|d S) z Consume the characters constituting an operator. Returns the number of characters consumed. (Does not update self.offset) )z>=z<=z<>z %*/^&=><+-%z*/^&=>s z,Tokenizer._parse_operator..Nr)r rrrrOP_INr7OP_POSTOP_PREnextreversedr>CLOSErXOPERAND)rr0rprevZis_infixrrrr#s8       zTokenizer._parse_operatorcCs|j|jdkst|j|jdkr8|td}n8|jrfd|jd}|jdd=t|}n td}|j ||j |dS)z Consumes a ( or { character. Returns the number of characters consumed. (Does not update self.offset) )({rdrKrcNr) r rr7r6r make_subexprrOrrr)rrZ token_valuerrrr$s      zTokenizer._parse_openercCsR|j|jdkst|j}|j|j|jkrBtd|j|j |dS)z Consumes a } or ) character. Returns the number of characters consumed. (Does not update self.offset) ))}zMismatched ( and { pair in '%s'r) r rr7rpop get_closervaluerrr)rrrrrr%s zTokenizer._parse_closercCs|j|j}|dkst|dkr,td}nTz|jdj}Wn tk r\tdtj}Yn$X|tj krvtdtj}n td}|j |dS)z Consumes a ; or , character. Returns the number of characters consumed. (Does not update self.offset) );,rkrErlr) r rr7rmake_separatorrrX IndexErrorr[PARENrr)rr0rZtop_typerrrr&s      zTokenizer._parse_separatorcCsX|j|j}|dkrTt|jdkrT|jd|jrT|j||jd7_dSdS)z Consumes a + or - character if part of a number in sci. notation. Returns True if the character was consumed and self.offset was updated, False otherwise. z+-rrKTF)r rr*rSN_REr9rOr)rr0rrrr+s   z#Tokenizer.check_scientific_notationrcCs2|jr.|jd|kr.td|jd|jddS)a: Ensure that there's no token currently being parsed. Or if there is a token being parsed, it must end with a character in can_follow. If there are unconsumed token contents, it means we hit an unexpected token transition. In this case, we raise a TokenizerError rEz!Unexpected character at position rLr N)rrrr )rr3rrrr6's zTokenizer.assert_empty_tokencCs0|jr,|jtd|j|jdd=dS)z9If there's a token being parsed, add it to the item list.rKN)rrrrr;rOrTrrrr-5szTokenizer.save_tokencCsB|js dS|jdjtjkr(|jdjSdddd|jDS)z+Convert the parsed tokens back to a string.rKrrcss|] }|jVqdSrW)rj)rBrrrrrZAsz#Tokenizer.render..)rrXrrrjrOrTrrrrender;s  zTokenizer.renderN)r)rrrrrFcompilerprRr8rMr,rrrr r!r"r#r$r%r&r+r6r-rqrrrrr s,   ( & r c@seZdZdZdddgZdZdZdZdZd Z d Z d Z d Z d Z dZd'ddZdZdZdZdZdZddZeddZdZdZed(ddZd d!Zd"Zd#Zed$d%Zd&S))ra) A token in an Excel formula. Tokens have three attributes: * `value`: The string value parsed that led to this token * `type`: A string identifying the type of token * `subtype`: A string identifying subtype of the token (optional, and defaults to "") rjrXr>rraFUNCARRAYroSEPzOPERATOR-PREFIXzOPERATOR-INFIXzOPERATOR-POSTFIXz WHITE-SPACErKcCs||_||_||_dSrW)rjrXr>)rrjtype_r>rrrr_szToken.__init__TEXTNUMBERLOGICALERRORRANGEcCsd|j|j|jS)Nz {0} {1} {2}:)formatrXr>rjrTrrr__repr__qszToken.__repr__cCsp|dr|j}nP|dr$|j}n>|dkr4|j}n.zt||j}Wntk r`|j}YnX|||j|S)zCreate an operand token.r r)TRUEFALSE) rNrwrzryfloatrx ValueErrorr{raclsrjr>rrrr;ts    zToken.make_operandOPENr`FcCsr|ddkst|r,td|s$ttj}n&|dkrrrrres zToken.make_subexpcCsT|j|j|j|jfkst|j|jks*t|j|jkr:dnd}|j||j|jkdS)z6Return a closing token that matches this token's type.rgrf)r)rXrsrtror7r>rre)rrjrrrriszToken.get_closerARGROWcCs.|dks t|dkr|jn|j}|||j|S)zCreate a separator token)rlrkrl)r7rrrurrrrrms zToken.make_separatorN)rK)F)rrrr __slots__rrarsrtrorur]r[r\rQrrwrxryrzr{r} classmethodr;rr`rerirrrmrrrrrDs<    r)rrF Exceptionrobjectr rrrrrs 6