σ <ΏCVc@@s dZddlmZddlmZmZmZddlmZddl Z dZ de fd„ƒYZ d e fd „ƒYZ d e fd „ƒYZd e fd„ƒYZde fd„ƒYZedefdefdefgƒZdd„Zd„Zd„ZdS(s7File formats for training and testing data. Includes a registry of valid file formats. New file formats can be added to the registry like so: :: from textblob import formats class PipeDelimitedFormat(formats.DelimitedFormat): delimiter = '|' formats.register('psv', PipeDelimitedFormat) Once a format has been registered, classifiers will be able to read data files with that format. :: from textblob.classifiers import NaiveBayesAnalyzer with open('training_data.psv', 'r') as fp: cl = NaiveBayesAnalyzer(fp, format='psv') i(tabsolute_import(tPY2tcsvt OrderedDict(t is_filelikeNsutf-8t BaseFormatcB@s/eZdZd„Zd„Zed„ƒZRS(sInterface for format classes. Individual formats can decide on the composition and meaning of ``**kwargs``. :param File fp: A file-like object. .. versionchanged:: 0.9.0 Constructor receives a file pointer rather than a file path. cK@sdS(N((tselftfptkwargs((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyt__init__'scC@stdƒ‚dS(s(Return an iterable object from the data.s&Must implement a "to_iterable" method.N(tNotImplementedError(R((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyt to_iterable*scC@stdƒ‚dS(sΕDetect the file format given a filename. Return True if a stream is this file format. .. versionchanged:: 0.9.0 Changed from a static method to a class method. s'Must implement a "detect" class method.N(R (tclststream((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pytdetect.s(t__name__t __module__t__doc__R R t classmethodR(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyRs  tDelimitedFormatcB@s5eZdZdZd„Zd„Zed„ƒZRS(s%A general character-delimited format.t,cK@srtj|||tr:tj|d|jdtƒ}ntj|d|jƒ}g|D] }|^qY|_dS(Nt delimitertencoding(RR RRtreaderRtDEFAULT_ENCODINGtdata(RRRRtrow((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR =s  cC@s|jS(s(Return an iterable object from the data.(R(R((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR FscC@sFy$tjƒj|d|jƒtSWntjtfk rAtSXdS(sReturn True if stream is valid.t delimitersN(RtSniffertsniffRtTruetErrort TypeErrortFalse(R R ((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyRJs (RRRRR R RR(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR8s  tCSVcB@seZdZdZRS(s…CSV format. Assumes each row is of the form ``text,label``. :: Today is a good day,pos I hate this car.,pos R(RRRR(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR"TstTSVcB@seZdZdZRS(s@TSV format. Assumes each row is of the form ``text label``. s (RRRR(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR#^stJSONcB@s/eZdZd„Zd„Zed„ƒZRS(s JSON format. Assumes that JSON is formatted as an array of objects with ``text`` and ``label`` properties. :: [ {"text": "Today is a good day.", "label": "pos"}, {"text": "I hate this car.", "label": "neg"} ] cK@s)tj|||tj|ƒ|_dS(N(RR tjsontloadtdict(RRR((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR pscC@s(g|jD]}|d|df^q S(s-Return an iterable object from the JSON data.ttexttlabel(R'(Rtd((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR tscC@s.ytj|ƒtSWntk r)tSXdS(s$Return True if stream is valid JSON.N(R%tloadsRt ValueErrorR!(R R ((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyRxs   (RRRR R RR(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR$ds   RR%ttsvicC@sat|ƒsdSxJtjƒD]<}|j|j|ƒƒrL|jdƒ|S|jdƒqWdS(s«Attempt to detect a file's format, trying each of the supported formats. Return the format class that was detected. If no format is detected, return ``None``. iN(RtNonet _registrytvaluesRtreadtseek(Rtmax_readtFormat((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyR‡s  cC@stS(s*Return a dictionary of registered formats.(R/(((sf/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/textblob/textblob/formats.pyt get_registry•scC@s|tƒ|s"