path: root/libraries/html5lib/README
Diffstat (limited to 'libraries/html5lib/README')
-rw-r--r-- libraries/html5lib/README 15
1 files changed, 4 insertions, 11 deletions
diff --git a/libraries/html5lib/README b/libraries/html5lib/README
index e97c619b21..7e57438059 100644
--- a/libraries/html5lib/README
+++ b/libraries/html5lib/README
@@ -1,12 +1,5 @@
-html5lib (HTML parser based on the HTML5 specification)
+html5lib is a pure-python library for parsing HTML. It is designed to
+conform to the WHATWG HTML specification, as is implemented by all
+major web browsers.
-HTML parser designed to follow the HTML5 specification. The parser is
-designed to handle all flavours of HTML and parses invalid documents
-using well-defined error handling rules compatible with the behaviour of
-major desktop web browsers.
-
-Output is to a tree structure; the current release supports output
-to DOM, ElementTree and lxml tree formats as well as a simple
-custom format.
-
-Optional: datrie, python-chardet, lxml and genshi
+Optional dependencies: genshi and lxml