ó <¿CVc@s¯dZddlmZddlZddlmZddlmZyddlZWne k rkdZnXda dd„Z e e d„Ze d„Zd „Zd „ZdS( sO A set of functions used to interface with the external megam_ maxent optimization package. Before megam can be used, you should tell NLTK where it can find the megam binary, using the ``config_megam()`` function. Typical usage: >>> from nltk.classify import megam >>> megam.config_megam() # pass path to megam if not found in PATH # doctest: +SKIP [Found megam: ...] Use with MaxentClassifier. Example below, see MaxentClassifier documentation for details. nltk.classify.MaxentClassifier.train(corpus, 'megam') .. _megam: http://www.umiacs.umd.edu/~hal/megam/index.html iÿÿÿÿ(tprint_functionN(tcompat(t find_binaryc Cs4td|ddgdddddgdd ƒad S( sA Configure NLTK's interface to the ``megam`` maxent optimization package. :param bin: The full path to the ``megam`` binary. If not specified, then nltk will search the system for a ``megam`` binary; and if one is not found, it will raise a ``LookupError`` exception. :type bin: str tmegamtenv_varstMEGAMt binary_namess megam.optt megam_686smegam_i686.optturls/http://www.umiacs.umd.edu/~hal/megam/index.htmlN(Rt _megam_bin(tbin((se/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/classify/megam.pyt config_megam)s   csˆjƒ}td„t|ƒDƒƒ}xÓ|D]Ë\‰‰tˆdƒry|jdj‡‡‡fd†|Dƒƒƒn|jd|ˆƒ|s³tˆjˆˆƒ||ƒn:x7|D]/}|jdƒtˆjˆ|ƒ||ƒqºW|jdƒq/WdS( sò Generate an input file for ``megam`` based on the given corpus of classified tokens. :type train_toks: list(tuple(dict, str)) :param train_toks: Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label. :type encoding: MaxentFeatureEncodingI :param encoding: A feature encoding, used to convert featuresets into feature vectors. May optionally implement a cost() method in order to assign different costs to different class predictions. :type stream: stream :param stream: The stream to which the megam input file should be written. :param bernoulli: If true, then use the 'bernoulli' format. I.e., all joint features have binary values, and are listed iff they are true. Otherwise, list feature values explicitly. If ``bernoulli=False``, then you must call ``megam`` with the ``-fvals`` option. :param explicit: If true, then use the 'explicit' format. I.e., list the features that would fire for any of the possible labels, for each token. If ``explicit=True``, then you must call ``megam`` with the ``-explicit`` option. css!|]\}}||fVqdS(N((t.0titlabel((se/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/classify/megam.pys _stcostt:c3s*|] }tˆjˆˆ|ƒƒVqdS(N(tstrR(R tl(tencodingt featuresetR(se/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/classify/megam.pys ess%ds #s N(tlabelstdictt enumeratethasattrtwritetjoint_write_megam_featurestencode(t train_toksRtstreamt bernoullitexplicitRtlabelnumR((RRRse/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/classify/megam.pytwrite_megam_file>s    cCs tdkrtdƒ‚n|s-tdƒ‚|jƒjdƒ}tj|dƒ}xE|D]=}|jƒr[|jƒ\}}t|ƒ|t|ƒs    <