Diffstat (limited to 'python/BeautifulSoup4/README')
-rw-r--r-- | python/BeautifulSoup4/README | 32 |
1 files changed, 3 insertions, 29 deletions
diff --git a/python/BeautifulSoup4/README b/python/BeautifulSoup4/README
index e0e102b270..9e5e23a850 100644
--- a/python/BeautifulSoup4/README
+++ b/python/BeautifulSoup4/README
@@ -1,32 +1,6 @@
 Beautiful Soup is a Python HTML/XML parser designed for quick
-turnaround projects like screen-scraping. Three features make it
-powerful:
+turnaround projects like screen-scraping. It commonly saves
+programmers hours or days of work.
 
-1. Beautiful Soup won't choke if you give it bad markup. It yields a
-parse tree that makes approximately as much sense as your original
-document. This is usually good enough to collect the data you need
-and run away.
-
-2. Beautiful Soup provides a few simple methods and Pythonic idioms for
-navigating, searching, and modifying a parse tree: a toolkit for
-dissecting a document and extracting what you need. You don't have to
-create a custom parser for each application.
-
-3. Beautiful Soup automatically converts incoming documents to Unicode
-and outgoing documents to UTF-8. You don't have to think about
-encodings, unless the document doesn't specify an encoding and
-Beautiful Soup can't autodetect one. Then you just have to specify
-the original encoding.
-
-Beautiful Soup parses anything you give it, and does the tree traversal
-stuff for you. You can tell it "Find all the links", or "Find all the
-links of class externalLink", or "Find all the links whose urls match
-"foo.com", or "Find the table heading that's got bold text, then give
-me that text."
-
-Valuable data that was once locked up in poorly-designed websites is
-now within your reach. Projects that would have taken hours take only
-minutes with Beautiful Soup.
-
-If python3-soupsieve is installed, then this will also build for
+If python3-soupsieve is installed, then this will also build for
 Python 3.