U ¯Dx`Kã@s dZdZddlZddlmZddlmZddlZddlmZm Z ddl m Z ddl Z ddl Z ddlZddlZddlZddlZddlZddlZdd „Zd#d d „ZGd d„deƒZdd„ZdZdZd$dd„Zd%dd„Zd&dd„Zd'dd„Zd(d d!„Zed"kreej  ¡ƒdS))z=Diagnostic functions, mainly for use when doing tech support.ÚMITéN)ÚStringIO)Ú HTMLParser)Ú BeautifulSoupÚ __version__)Úbuilder_registryc CsLtdtƒtdtjƒdddg}|D]4}tjD]}||jkr2q(q2| |¡td|ƒq(d|krÆ| d¡z*dd l m }td d   t t |jƒ¡ƒWn*tk rÄ}z td ƒW5d }~XYnXd|krzdd l}td|jƒWn,tk r}z tdƒW5d }~XYnXt|dƒr.| ¡}nŠ| d¡sF| d¡r^td|ƒtdƒd Sz:tj |¡r–td|ƒt|ƒ}| ¡}W5QRXWntk r®YnXtdƒ|D]Š}td|ƒd} zt||d} d} Wn8tk r}ztd|ƒt ¡W5d }~XYnX| rr@rBrCrDrErFrGrHr-r-r-r.r8msr8cCstƒ}| |¡dS)zÂPrint out the HTMLParser events that occur during parsing. This lets you see how HTMLParser parses a document when no Beautiful Soup code is running. :param data: Some markup. N)r8Úfeed)r%r*r-r-r.Úhtmlparser_trace“srNZaeiouZbcdfghjklmnpqrstvwxyzécCs:d}t|ƒD](}|ddkr"t}nt}|t |¡7}q |S)z#Generate a random word-like string.rér)ÚrangeÚ _consonantsÚ_vowelsÚrandomÚchoice)Úlengthr:ÚiÚtr-r-r.Úrword¡s  rYécCsd dd„t|ƒDƒ¡S)z'Generate a random sentence-like string.ú css|]}tt dd¡ƒVqdS)rZé N)rYrTÚrandint)Ú.0rWr-r-r.Ú ®szrsentence..)rrQ)rVr-r-r.Ú rsentence¬sr`éècCs¤dddddddg}g}t|ƒD]r}t dd ¡}|dkrPt |¡}| d |¡q|d krp| tt d d ¡ƒ¡q|d krt |¡}| d|¡qdd |¡dS)z+Randomly generate an invalid HTML document.ÚpÚdivÚspanrWÚbÚscriptÚtableréz<%s>érZrPzzÚ z)rQrTr]rUrr`r)Ú num_elementsZ tag_namesÚelementsrWrUZtag_namer-r-r.Úrdoc°s    rmé †c Cs$tdtƒt|ƒ}tdt|ƒƒdddgddfD]z}d}z"t ¡}t||ƒ}t ¡}d}Wn6tk r”}ztd |ƒt ¡W5d }~XYnX|r4td |||fƒq4d d l m }t ¡}|  |¡t ¡}td||ƒd d l } |   ¡}t ¡}| |¡t ¡}td||ƒd S)z.Very basic head-to-head performance benchmark.z1Comparative parser benchmark on Beautiful Soup %sz3Generated a large invalid HTML document (%d bytes).r r0r rFTrNz"BS4+%s parsed the markup in %.2fs.rr z$Raw lxml parsed the markup in %.2fs.z(Raw html5lib parsed the markup in %.2fs.)rrrmÚlenÚtimerr"r#r$r r ZHTMLr rÚparse) rkr%r*r+Úar,rer(r r r-r-r.Úbenchmark_parsersÂs4      rsr cCsXt ¡}|j}t|ƒ}tt||d}t d|||¡t  |¡}|  d¡|  dd¡dS)z7Use Python's profiler on a randomly generated document.)Úbs4r%r*zbs4.BeautifulSoup(data, parser)Z cumulativez _html5lib|bs4é2N) ÚtempfileÚNamedTemporaryFiler&rmÚdictrtÚcProfileZrunctxÚpstatsÚStatsZ sort_statsZ print_stats)rkr*Z filehandleÚfilenamer%ÚvarsÚstatsr-r-r.Úprofileâs  rÚ__main__)T)rO)rZ)ra)rn)rnr )!rLÚ __license__ryÚiorÚ html.parserrrtrrZ bs4.builderrrrzrTrvrpr#rr/r7r8rNrSrRrYr`rmrsrrIÚstdinrr-r-r-r.Ús8   G &