U Dx`m@sdZddlmZddlZddlZddlZzddlmZddlm Z Wn$e k rhddl mZm Z YnXddl m Z ddlmZdd lmZmZdd lmZmZzeWnek reZYnXzeWnek reZYnXzeWnek r eefZYnXd d d ddddgZedejejBjZ edejjZ!ejdd:d;Z?dS)=zcA cleanup tool for HTML. Removes unwanted tags and content. See the `Cleaner` class for details. )absolute_importN)urlsplit) unquote_plus)rr)etree)defs) fromstringXHTML_NAMESPACE) xhtml_to_html_transform_result clean_htmlcleanCleanerautolink autolink_html word_breakword_break_htmlzexpression\s*\(.*?\)z @\s*importzdescendant-or-self::*[@style]zdescendant-or-self::a [normalize-space(@href) and substring(normalize-space(@href),1,1) != '#'] |descendant-or-self::x:a[normalize-space(@href) and substring(normalize-space(@href),1,1) != '#']x) namespacesc @seZdZdZdZdZdZdZdZdZ dZ dZ dZ dZ dZdZdZdZdZdZdZdZejZdZdZddhZdd Zed d d d gd d d d dZddZddZddZ ddZ!ddZ"d"ddZ#ddZ$e%&de%j'j(Z)ddZ*d d!Z+dS)#r a Instances cleans the document of each of the possible offending elements. The cleaning is controlled by attributes; you can override attributes in a subclass, or set them in the constructor. ``scripts``: Removes any ``