ó <¿CVc@sddlmZddlmZddlmZddlZddlZddlZddlZddl m Z ddl m Z m Z mZddlmZddlmZdd lmZd „Zd „Zd „Zd efd„ƒYZedkrddlZejƒndS(iÿÿÿÿ(tprint_function(tunicode_literals(t text_typeN(tZipFilePathPointer(tfind_dirt find_filetfind_jars_within_path(tParserI(tDependencyGraph(ttaggedsents_to_conllcCswddlm}|d1d2d3d4d5d6d7d8d9d:d;d<d=d>d?d@dAdBdCdDdEdFdGdHdIdJdKdLgƒ}|jS(MNiÿÿÿÿ(t RegexpTaggeru\.$u.u\,$u,u\?$u?u\($u(u\)$u)u\[$u[u\]$u]u^-?[0-9]+(.[0-9]+)?$uCDu(The|the|A|a|An|an)$uDTu&(He|he|She|she|It|it|I|me|Me|You|you)$uPRPu(His|his|Her|her|Its|its)$uPRP$u(my|Your|your|Yours|yours)$u (on|On|in|In|at|At|since|Since)$uINu (for|For|ago|Ago|before|Before)$u(till|Till|until|Until)$u(by|By|beside|Beside)$u(under|Under|below|Below)$u(over|Over|above|Above)$u (across|Across|through|Through)$u(into|Into|towards|Towards)$u(onto|Onto|from|From)$u.*able$uJJu.*ness$uNNu.*ly$uRBu.*s$uNNSu.*ing$uVBGu.*ed$uVBDu.*(u\.$u.(u\,$u,(u\?$u?(u\($u((u\)$u)(u\[$u[(u\]$u](u^-?[0-9]+(.[0-9]+)?$uCD(u(The|the|A|a|An|an)$uDT(u&(He|he|She|she|It|it|I|me|Me|You|you)$uPRP(u(His|his|Her|her|Its|its)$uPRP$(u(my|Your|your|Yours|yours)$uPRP$(u (on|On|in|In|at|At|since|Since)$uIN(u (for|For|ago|Ago|before|Before)$uIN(u(till|Till|until|Until)$uIN(u(by|By|beside|Beside)$uIN(u(under|Under|below|Below)$uIN(u(over|Over|above|Above)$uIN(u (across|Across|through|Through)$uIN(u(into|Into|towards|Towards)$uIN(u(onto|Onto|from|From)$uIN(u.*able$uJJ(u.*ness$uNN(u.*ly$uRB(u.*s$uNNS(u.*ing$uVBG(u.*ed$uVBD(u.*uNN(tnltk.tagR ttag(R t_tagger((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pytmalt_regex_taggers6  cCs¶tjj|ƒr|}nt|dd ƒ}dddg}tt|ƒƒ}td„|Dƒƒ}tdddgƒ}|j|ƒsŽt‚tt d„|ƒƒs¬t‚t |ƒS( uE A module to find MaltParser .jar file and its dependencies. tenv_varsu MALT_PARSERucss"|]}|jdƒdVqdS(u/iN(t rpartition(t.0tjar((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pys Esu log4j.jaru libsvm.jaruliblinear-1.8.jarcSs|jdƒo|jdƒS(Nu maltparser-u.jar(t startswithtendswith(ti((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pytIs(u MALT_PARSER( tostpathtexistsRtsetRtissubsettAssertionErrortanytfiltertlist(tparser_dirnamet _malt_dirtmalt_dependenciest _malt_jarst_jars((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pytfind_maltparser:s cCs@|dkrdStjj|ƒr&|St|dddtƒSdS(u8 A module to find pre-trained MaltParser model. u malt_temp.mcoRu MALT_MODELtverboseN(u MALT_MODEL(tNoneRRRRtFalse(tmodel_filename((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pytfind_malt_modelMs  t MaltParsercBsweZdZd d d d„Zedd„Zedd„Zd d d„Ze ed„ƒZ ed„Z ed„Z RS( uØ A class for dependency parsing with MaltParser. The input is the paths to: - a maltparser directory - (optionally) the path to a pre-trained MaltParser .mco model file - (optionally) the tagger to use for POS tagging before parsing - (optionally) additional Java arguments Example: >>> from nltk.parse import malt >>> # With MALT_PARSER and MALT_MODEL environment set. >>> mp = malt.MaltParser('maltparser-1.7.2', 'engmalt.linear-1.7.mco') # doctest: +SKIP >>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() # doctest: +SKIP (shot I (elephant an) (in (pajamas my)) .) >>> # Without MALT_PARSER and MALT_MODEL environment. >>> mp = malt.MaltParser('/home/user/maltparser-1.7.2/', '/home/user/engmalt.linear-1.7.mco') # doctest: +SKIP >>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() # doctest: +SKIP (shot I (elephant an) (in (pajamas my)) .) cCs|t|ƒ|_|dk r!|ng|_t|ƒ|_|jdk|_tjƒ|_ |dk rl|nt ƒ|_ dS(u¿ An interface for parsing with the Malt Parser. :param parser_dirname: The path to the maltparser directory that contains the maltparser-1.x.jar :type parser_dirname: str :param model_filename: The name of the pre-trained model with .mco file extension. If provided, training will not be required. (see http://www.maltparser.org/mco/mco.html and see http://www.patful.com/chalk/node/185) :type model_filename: str :param tagger: The tagger used to POS tag the raw string before formatting to CONLL format. It should behave like `nltk.pos_tag` :type tagger: function :param additional_java_args: This is the additional Java arguments that one can use when calling Maltparser, usually this is the heapsize limits, e.g. `additional_java_args=['-Xmx1024m']` (see http://goo.gl/mpDBvQ) :type additional_java_args: list u malt_temp.mcoN( R%t malt_jarsR'tadditional_java_argsR*tmodelt_trainedttempfilet gettempdirt working_dirRttagger(tselfR R)R3R-((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyt__init__ls unullc csÇ|jstdƒ‚ntjddd|jdddtƒb}tjddd|jdddtƒ2}x't|ƒD]}|jt|ƒƒqyW|j ƒ|j |j |j dd ƒ}t j ƒ}y$t jt jj|jƒd ƒWnnX|j||ƒ} t j|ƒ| d k rEtd d j|ƒ| fƒ‚nt|j ƒA} x7| jƒjd ƒD] } tt| d|ƒgƒVqmWWdQXWdQXWdQXt j|j ƒt j|j ƒdS(u· Use MaltParser to parse multiple POS tagged sentences. Takes multiple sentences where each sentence is a list of (word, tag) tuples. The sentences must have already been tokenized and tagged. :param sentences: Input sentences to parse :type sentence: list(list(tuple(str, str))) :return: iter(iter(``DependencyGraph``)) the dependency graph representation of each sentence u0Parser has not been trained. Call train() first.tprefixumalt_input.conll.tdirtmodeuwtdeleteumalt_output.conll.uparseiu0MaltParser parsing (%s) failed with exit code %du u ttop_relation_labelN(R/t ExceptionR0tNamedTemporaryFileR2R(R twriteRtclosetgenerate_malt_commandtnameRtgetcwdtchdirRtsplitR.t_executetjointopentreadtiterRtremove( R4t sentencesR&R:t input_filet output_filetlinetcmdt _current_pathtrettinfilettree_str((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pytparse_tagged_sentss4    $  0cs,‡fd†|Dƒ}ˆj||d|ƒS(un Use MaltParser to parse multiple sentences. Takes a list of sentences, where each sentence is a list of words. Each sentence will be automatically tagged with this MaltParser instance's tagger. :param sentences: Input sentences to parse :type sentence: list(list(str)) :return: iter(DependencyGraph) c3s|]}ˆj|ƒVqdS(N(R3(Rtsentence(R4(sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pys ÌsR:(RS(R4RJR&R:ttagged_sentences((R4sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyt parse_sentsÁs cCsÐdg}||j7}|ddj|jƒg7}|dg7}tjj|jƒrz|dtjj|jƒdg7}n|d|jg7}|d|g7}|dkr¼|d |g7}n|d |g7}|S( u This function generates the maltparser command use at the terminal. :param inputfilename: path to the input file :type inputfilename: str :param outputfilename: path to the output file :type outputfilename: str ujavau-cpu:uorg.maltparser.Maltu-ciÿÿÿÿu-iuparseu-ou-m(R-RER,RRRR.RC(R4t inputfilenametoutputfilenameR8RN((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyR?Ïs   & cCs:|r dntj}tj|d|d|ƒ}|jƒS(Ntstdouttstderr(R't subprocesstPIPEtPopentwait(RNR&toutputtp((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyRDêsc Cs†tjddd|jdddtƒ3}djd„|Dƒƒ}|jt|ƒƒWd QX|j|jd |ƒt j |jƒd S( uÍ Train MaltParser from a list of ``DependencyGraph`` objects :param depgraphs: list of ``DependencyGraph`` objects for training input data :type depgraphs: DependencyGraph R6umalt_train.conll.R7R8uwR9u css|]}|jdƒVqdS(i N(tto_conll(Rtdg((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pys ûsNR&( R0R<R2R(RER=Rttrain_from_fileR@RRI(R4t depgraphsR&RKt input_str((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyttrainðs c Csît|tƒrŒtjddd|jdddtƒQ}|jƒ&}|jƒ}|jt |ƒƒWdQX|j |j d|ƒSWdQXn|j |dd ƒ}|j ||ƒ}|d krátd d j|ƒ|fƒ‚nt|_dS( u— Train MaltParser from a file :param conll_file: str for the filename of the training input data :type conll_file: str R6umalt_train.conll.R7R8uwR9NR&ulearniu1MaltParser training (%s) failed with exit code %du (t isinstanceRR0R<R2R(RFRGR=RRcR@R?RDR;REtTrueR/(R4t conll_fileR&RKtconll_input_filet conll_strRNRP((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyRcs   N( t__name__t __module__t__doc__R'R5R(RSRVR?t staticmethodRDRfRc(((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyR+Ys#2 u__main__(t __future__RRtnltk.sixRRR0R[tinspectt nltk.dataRtnltk.internalsRRRtnltk.parse.apiRtnltk.parse.dependencygraphRtnltk.parse.utilR RR%R*R+Rltdoctestttestmod(((sa/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/parse/malt.pyt s$       à C