When looking to send network requests in Python, there are two main options: urllib and requests.
Unfortunately, there’s a lot of confusion surrounding these libraries, particularly the multiple different versions of urllib.
This article will clear up the urllib library naming mess, and explain a few key differences between urllib and requests.
The Origins of urllib
The original urllib library was added to the Python standard library in Python 1.4. It is long obsolete, and you’re unlikely to find any tutorials that reference this version of urllib.
urllib2 was added to the Python standard library in Python 1.6, intended as a replacement for urllib. The two share very little in common, with urllib2 essentially being a brand new library, rather than an incremental upgrade over the original.
urllib2 remained part of the Python standard library throughout the 2.x releases.
Python 3 urllib: The Standard
The urllib library in Python 3 has little to do with the original urllib. It’s essentially an implementation of urllib2 for Python 3, with a few changes, organised into submodules such as urllib.request, urllib.parse and urllib.error. Because the two are relatively similar, it’s possible to convert Python 2 urllib2 code to Python 3 urllib using 2to3.
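As a rough illustration of that similarity, the sketch below shows where some commonly used urllib2 names live in Python 3, and uses them together. The `fetch` helper, its timeout default and the User-Agent string are made up for illustration and are not part of either library.

```python
# Python 2                      ->  Python 3
# urllib2.urlopen               ->  urllib.request.urlopen
# urllib2.Request               ->  urllib.request.Request
# urllib2.URLError / HTTPError  ->  urllib.error.URLError / HTTPError
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def fetch(url, timeout=10):
    """Hypothetical helper: return the response body as bytes, or None on failure."""
    request = Request(url, headers={"User-Agent": "example-script"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except HTTPError as error:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print("Server returned status", error.code)
    except URLError as error:
        print("Could not complete the request:", error.reason)
    return None
```

Apart from the import paths, the body of `fetch` would look much the same when written against Python 2's urllib2.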
urllib not only supports HTTP requests, but can also connect through other protocols, such as FTP. A module for parsing URLs, urllib.parse, is included too.
Documentation for urllib can be a little tricky to find. That’s not because of a lack of documentation, but because you’ll find a mix of code based on urllib2 thrown in for good measure. If you’re searching for documentation on Python 3 urllib, ensure there are no references to Python 2 or urllib2.
For lower-level uses, or for situations where you need to stick to the Python standard library as much as possible, urllib is the library to use.
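To give a sense of that lower level, here is a minimal sketch of preparing a form-encoded POST request with only the standard library. The endpoint URL, the form fields and the `build_post` helper are placeholders for illustration. The request is only constructed here; passing it to `urllib.request.urlopen` is what would actually send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(url, fields):
    """Hypothetical helper: build (but do not send) a form-encoded POST request."""
    # urllib expects the request body as bytes, so the form is encoded by hand.
    body = urlencode(fields).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_post("https://example.com/api/login", {"user": "alice", "lang": "en"})
print(request.get_method())  # POST
print(request.data)          # b'user=alice&lang=en'
```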
A New Project: urllib3
Despite the name, urllib3 has nothing to do with urllib or urllib2. While the goal of the library may be similar (i.e. sending HTTP requests), that’s where the connection ends.
So why does the name suggest otherwise? The creator of urllib3 has stated that the name is one of their regrets, explaining that:
“The joke was: urllib and urllib2 have nothing to do with each other (the designs are vastly different and separate), so urllib3 will also have nothing to do with the other two. lol?”
shazow, urllib3 GitHub: https://github.com/urllib3/urllib3/issues/1065#issuecomment-265191841
urllib3 is not a part of the Python standard library, and there is no intention for it to ever be added.
An Easy Syntax: requests
The requests package is often considered the recommended way to send HTTP requests in Python. In fact, the documentation for the built-in urllib.request module itself recommends requests as a higher-level HTTP client interface.
requests makes it much easier to send a request. Parameters are encoded automatically: pass in a dictionary, and requests will encode it correctly for you.
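As a small sketch of that automatic encoding, the example below prepares a request without sending it and prints the resulting URL. The search endpoint and parameter names are placeholders; `requests.Request(...).prepare()` is used here only so the encoded URL can be inspected offline.

```python
import requests

params = {"q": "python urllib", "page": 2}

# Build the request without sending it, so the encoded URL can be inspected.
prepared = requests.Request("GET", "https://example.com/search", params=params).prepare()
print(prepared.url)  # https://example.com/search?q=python+urllib&page=2
```

In everyday code you would normally skip the preparation step and call `requests.get(url, params=params)` directly.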
Working with RESTful APIs is straightforward, with a function for each common HTTP request method.
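import requests

url = "https://example.com/api/items"  # placeholder URL for illustration
response = requests.get(url)     # retrieve a resource
response = requests.post(url)    # create a resource
response = requests.put(url)     # replace a resource
response = requests.delete(url)  # remove a resource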
Handling responses is easy too. Functionality to parse JSON, for example, is included.
jsondata = response.json()
Plain-text responses are also easy to access.
response.text
This is really just scratching the surface: requests also has features including (but not limited to) authentication support, cookie persistence, connection pooling and multipart file uploads.
At its core, the requests package uses the urllib3 library, which will be installed along with requests if it’s not installed already.
A possible downside of requests is that it lacks the ability to connect using protocols other than HTTP, such as FTP. It’s also not possible to parse URLs with requests; for this, use urllib.parse.
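A short sketch of URL parsing with urllib.parse follows. The URL itself is an arbitrary example.

```python
from urllib.parse import parse_qs, urljoin, urlparse

parts = urlparse("https://example.com:8443/docs/page.html?lang=en&tag=a&tag=b#intro")

print(parts.scheme)    # https
print(parts.hostname)  # example.com
print(parts.port)      # 8443
print(parts.path)      # /docs/page.html
print(parts.fragment)  # intro

# parse_qs collects repeated keys into lists.
query = parse_qs(parts.query)
print(query)           # {'lang': ['en'], 'tag': ['a', 'b']}

# urljoin resolves a relative link against a base URL.
logo_url = urljoin("https://example.com/docs/page.html", "../img/logo.png")
print(logo_url)        # https://example.com/img/logo.png
```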
Which Should I Use?
It’s difficult to recommend anything other than requests for HTTP requests, given that the urllib documentation suggests this itself. The easy-to-use syntax allows a request to be coded in very little time.
The main use case for urllib is lower-level HTTP requests, where you want a bit more control over the entire process.
As urllib is part of the standard library, no extra software needs to be installed, so for small code snippets urllib might also be preferable.
urllib should also be used for non-HTTP requests, as requests does not support them.
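As a minimal sketch of this, the same `urlopen` call used for HTTP also accepts other URL schemes. An FTP URL would need a reachable server, so this example uses a `file://` URL pointing at a temporary file, which runs offline. The `read_url` helper and the file contents are made up for illustration.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def read_url(url):
    """Hypothetical helper: return the raw bytes behind any URL scheme urllib supports."""
    with urlopen(url) as response:
        return response.read()


# Write a small local file and read it back through a file:// URL.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "note.txt"
    path.write_text("hello from a local file", encoding="utf-8")
    content = read_url(path.as_uri())

print(content)  # b'hello from a local file'
```

An `ftp://` URL is passed to `urlopen` in the same way.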
Hopefully, this guide has helped to clear up some of the confusion surrounding the urllib libraries and requests.