
Python and the World-Wide Web

There is a growing number of modules available for writing WWW tools. The previous release already sported the modules gopherlib, ftplib, httplib and urllib (which unifies the other three) for accessing data through the most common WWW protocols. This release also provides cgi, to ease the writing of server-side scripts that use the Common Gateway Interface protocol, which most WWW servers support. The module urlparse provides precise parsing of a URL string into its six components (addressing scheme, network location, path, parameters, query, and fragment identifier).
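To make the six components concrete, here is a minimal sketch of splitting one URL. It assumes a current Python 3 interpreter, where the function described above is found in urllib.parse rather than in a top-level urlparse module; the URL itself is a made-up example chosen so that no component is empty.

```python
from urllib.parse import urlparse

# Hypothetical URL, picked so that all six components are non-empty.
url = "http://www.example.org/docs/index.html;type=a?lang=en#intro"

# urlparse() returns a 6-item named tuple:
# (scheme, netloc, path, params, query, fragment)
parts = urlparse(url)

for name in ("scheme", "netloc", "path", "params", "query", "fragment"):
    print(f"{name:9}{getattr(parts, name)}")
```

The result also behaves like an ordinary tuple, so it can be unpacked directly into six variables when that is more convenient.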

A rudimentary parser for HTML files is available in the module htmllib. It currently supports a subset of HTML 1.0 (if you bring it up to date, I'd love to receive your fixes!). Unfortunately, Python seems to be too slow for the real-time parsing and formatting of HTML required by interactive WWW browsers, but it's good enough to write a ``robot'' (an automated WWW browser that searches the web for information).
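As a rough illustration of the parsing step such a robot needs, the sketch below collects the link targets from a page. It does not use htmllib, which is no longer part of Python 3; it swaps in the standard html.parser module instead. The names LinkCollector and extract_links, and the sample page, are hypothetical and serve only as illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href=...> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return the absolute URLs of all links found in html_text."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links


# Made-up page: two real links and one anchor without an href.
sample = (
    '<html><body><a href="a.html">A</a> '
    '<a href="/b/c.html">B</a> <a name="x">no href</a></body></html>'
)
links = extract_links(sample, "http://www.example.org/docs/index.html")
print(links)
```

A real robot would fetch each collected URL in turn (for instance with urllib) and feed the result back through the same parser; that loop, along with politeness rules such as rate limiting, is left out here.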



guido@cwi.nl