ó <¿CVc@sidZddlZddlmZddlmZddlTddlTdefd„ƒYZ d„Z dS( s  The Carnegie Mellon Pronouncing Dictionary [cmudict.0.6] ftp://ftp.cs.cmu.edu/project/speech/dict/ Copyright 1998 Carnegie Mellon University File Format: Each line consists of an uppercased word, a counter (for alternative pronunciations), and a transcription. Vowels are marked for stress (1=primary, 2=secondary, 0=no stress). E.g.: NATURAL 1 N AE1 CH ER0 AH0 L The dictionary contains 127069 entries. Of these, 119400 words are assigned a unique pronunciation, 6830 words have two pronunciations, and 839 words have three or more pronunciations. Many of these are fast-speech variants. Phonemes: There are 39 phonemes, as shown below: Phoneme Example Translation Phoneme Example Translation ------- ------- ----------- ------- ------- ----------- AA odd AA D AE at AE T AH hut HH AH T AO ought AO T AW cow K AW AY hide HH AY D B be B IY CH cheese CH IY Z D dee D IY DH thee DH IY EH Ed EH D ER hurt HH ER T EY ate EY T F fee F IY G green G R IY N HH he HH IY IH it IH T IY eat IY T JH gee JH IY K key K IY L lee L IY M me M IY N knee N IY NG ping P IH NG OW oat OW T OY toy T OY P pee P IY R read R IY D S sea S IY SH she SH IY T tea T IY TH theta TH EY T AH UH hood HH UH D UW two T UW V vee V IY W we W IY Y yield Y IY L D Z zee Z IY ZH seizure S IY ZH ER iÿÿÿÿN(tcompat(tIndex(t*tCMUDictCorpusReadercBs,eZd„Zd„Zd„Zd„ZRS(cCs>tg|jdtƒD]!\}}t|td|ƒ^qƒS(su :return: the cmudict lexicon as a list of entries containing (word, transcriptions) tuples. tencodingN(tconcattabspathstNonetTruetStreamBackedCorpusViewtread_cmudict_block(tselftfileidtenc((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pytentries9scCsS|j}t|tjƒr'|g}ntg|D]}|j|ƒjƒ^q1ƒS(s? :return: the cmudict lexicon as a raw string. (t_fileidst isinstanceRt string_typesRtopentread(R tfileidstf((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pytrawBs  cCs)g|jƒD]\}}|jƒ^q S(sN :return: a list of all words defined in the cmudict lexicon. (Rtlower(R twordt_((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pytwordsKscCstt|jƒƒƒS(s” :return: the cmudict lexicon as a dictionary, whose keys are lowercase words and whose values are lists of pronunciations. (tdictRR(R ((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pyRQs(t__name__t __module__RRRR(((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pyR8s cCslg}x_t|ƒdkrg|jƒ}|dkr7|S|jƒ}|j|djƒ|dfƒq W|S(Nidtii(tlentreadlinetsplittappendR(tstreamRtlinetpieces((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pyR Xs   %( t__doc__tcodecstnltkRt nltk.utilRtnltk.corpus.reader.utiltnltk.corpus.reader.apit CorpusReaderRR (((sl/private/var/folders/cc/xm4nqn811x9b50x1q_zpkmvdjlphkp/T/pip-build-FUwmDn/nltk/nltk/corpus/reader/cmudict.pyt.s