Tech Notes: 2009

Monday, December 14, 2009

Stop Firefox from Following Links That You Never Click

If you use Firefox, you may not be aware of one of its hidden features that not everyone likes. When you fetch a Web page using Firefox, that page typically contains many links to other pages. Since you fetched the page in the first place, you are likely to click on some of those links. Firefox prefetches links that the page marks as likely next destinations (for example, with a <link rel="prefetch"> tag): before you click on them, it follows those links and caches the downloaded pages. It does this to give you a faster browsing experience. If you actually click on one of those links, the resulting page loads very quickly, because Firefox has already fetched it.

On the face of it, this seems desirable, but there are some serious issues with this behavior. If you pay for your bandwidth by the byte — or if you have a bandwidth cap imposed by your ISP — then all that hidden page-fetching by Firefox still counts as downloading to your ISP.

But worse, if you visit a page that contains links to content that your employer finds objectionable or that your government finds illegal, then shouldn't it be your decision whether or not to click on those links? The fact is — unless you are surfing using anonymous free WiFi — your IP address can be tracked directly to you personally, and nearly every page you fetch is logged in some Web server's access log. So how do you turn off this "feature" in Firefox?

It's pretty simple. In the Firefox address bar, type:

about:config

and press Enter. If this is the first time you've done this, you'll have to click a button promising to be careful. This brings you to a list of Firefox's internal configuration options. In the Filter field at the top of the page, enter "prefetch". You should see a single line for an option named "network.prefetch-next" that has the value true. Double-click the option to change the value to false. That's all there is to it. The change should take effect immediately, though I suggest that you close and restart Firefox just to be safe, because it's possible that this configuration value is only consulted at startup.
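If you would rather set the preference from a file than click through about:config, here is a minimal Python sketch. It assumes that your Firefox profile reads a user.js file at startup and that you know where your profile directory is; the function name and the path in the usage note are hypothetical.

```python
from pathlib import Path

# The preference named above, written in the user_pref(...) syntax
# that Firefox uses in its prefs.js and user.js files.
PREF_LINE = 'user_pref("network.prefetch-next", false);'


def disable_prefetch(profile_dir):
    """Append the prefetch setting to user.js in profile_dir, unless it is already there."""
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if PREF_LINE not in existing:
        # Make sure the new setting starts on its own line.
        if existing and not existing.endswith("\n"):
            existing += "\n"
        user_js.write_text(existing + PREF_LINE + "\n", encoding="utf-8")
    return user_js
```

You would call it as disable_prefetch("/path/to/your/firefox/profile") and then restart Firefox.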

Friday, December 11, 2009

Digital Alchemy

Medieval alchemists sought to transmute base metals, such as lead and bronze, into gold. Modern science has taught us the futility of that pursuit. In the digital world, there's a kind of digital alchemy that exists thanks to the lowly Boolean XOR operation. The XOR operation is a cousin to the more widely known OR and AND operations. Here's a table showing how XOR works when computing A XOR B for all possible bitwise values of A and B:

A  B  A XOR B
0  0     0
0  1     1
1  0     1
1  1     0

As you can see, it's the same as the Boolean OR except the value of 1 XOR 1 is 0. This minor difference enables some truly unique properties, which are commonly used in the field of cryptography.
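As a quick illustration (a sketch using Python's built-in | and ^ operators on the bits 0 and 1; the function name is just for this example), you can print OR and XOR side by side and confirm the property that makes XOR special: applying the same bit twice gives back the original value.

```python
def xor_table():
    """Return rows of (a, b, a OR b, a XOR b) for every pair of bits."""
    return [(a, b, a | b, a ^ b) for a in (0, 1) for b in (0, 1)]


# Columns: A, B, A OR B, A XOR B. Only the last row differs between OR and XOR.
for a, b, either, exclusive in xor_table():
    print(a, b, either, exclusive)

# XOR undoes itself: (A XOR B) XOR B is always A.
assert all((a ^ b) ^ b == a for a in (0, 1) for b in (0, 1))
```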

Suppose you have a message that you want to encrypt (the plaintext). If you represent the message as a string of bits, and you generate a random sequence of bits of the same length to use as the encryption key, then you can encrypt the message by simply XOR'ing each bit of the key with the corresponding bit of the message. The resulting ciphertext is indecipherable without the key. If you send the ciphertext to someone, they can decrypt it simply by XOR'ing the ciphertext with the key, which gives back the original plaintext.

This can be summed up as follows:

C = P XOR K
P = C XOR K

where P is the plaintext, K is the key, and C is the ciphertext. Clearly, XOR'ing with K is a bidirectional transmutation of the plaintext into the ciphertext and vice versa.
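The two equations can be tried out in a few lines of Python. This is only a sketch: the message is made up, the helper name xor_bytes is mine, and the key comes from Python's secrets module as a stand-in for a truly random key.

```python
import secrets


def xor_bytes(data, key):
    """XOR two equal-length byte strings, byte by byte."""
    if len(data) != len(key):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"attack at dawn"
key = secrets.token_bytes(len(plaintext))  # random key, same length as the message
ciphertext = xor_bytes(plaintext, key)     # C = P XOR K
recovered = xor_bytes(ciphertext, key)     # P = C XOR K
print(recovered)
```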

This is called a One-Time Pad, and it's a mathematically perfect encryption system, but it's completely impractical to use, because the key is just as large as the message, and you need to have a way to distribute the key to the recipient securely. Oh, and you can never — ever — reuse a key, or it is trivial to crack the encryption. But this post isn't about using XOR as an encryption tool. Instead, I want to show how to use it to transmute any data into any other data, much like the medieval alchemists sought to transmute base metals into gold. This ability to turn any data into any other data has implications for file sharing.
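To see why reusing a key is fatal, here is a small sketch with two made-up messages of equal length. XOR'ing the two ciphertexts cancels the key entirely, so anyone who knows or guesses one message can read the other.

```python
import secrets


def xor_bytes(a, b):
    """XOR two byte strings of the same length, byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"meet at noon"
p2 = b"meet at nine"
key = secrets.token_bytes(len(p1))  # the same key used twice (the mistake)
c1 = xor_bytes(p1, key)
c2 = xor_bytes(p2, key)

# The key cancels out: C1 XOR C2 equals P1 XOR P2, and no key is needed to compute it.
leak = xor_bytes(c1, c2)

# Knowing one plaintext now reveals the other.
print(xor_bytes(leak, p1))
```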

Suppose I want to share the latest DVD image of Windows 7 with other people. If I take the DVD image of Windows 7 and XOR it with an equally large DVD containing, say, a free Linux distribution, then the resulting string of bits will appear to be garbage. Anyone who downloads that string of garbage bits can transmute it back into a Windows 7 DVD image by downloading the same free Linux distribution and XOR'ing it with the garbage bits. The result of that XOR operation is a Windows 7 DVD image.

Of course, it's probably still copyright infringement to do this, but the main idea of this post is that any string of bits can be transmuted into any other — arbitrary — string of bits by XOR'ing with the appropriate string of (apparent) garbage bits. Convert the CD image of Black Sabbath's first album into Handel's Messiah, the text of Mein Kampf into the New Testament, the source code of vi into Emacs, or a DVD of the worst movie ever made into an Oscar-winning classic. This is the modern world's digital alchemy.
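Here is the transmutation in miniature, as a Python sketch. The function names are mine, and the sketch assumes both inputs have the same length (a shorter file would first have to be padded somehow).

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings, byte by byte."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_mask(source, target):
    """Compute the 'garbage' bits that turn source into target."""
    return xor_bytes(source, target)


def transmute(data, mask):
    """Apply the mask; the same call works in both directions."""
    return xor_bytes(data, mask)


lead = b"lead"
gold = b"gold"
mask = make_mask(lead, gold)
print(transmute(lead, mask))  # the bytes of "gold"
print(transmute(gold, mask))  # and back to "lead"
```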

Friday, November 13, 2009

How to Break Out from Inside a Draconian Firewall

If you work at an organization where outbound Internet traffic is restricted — and possibly closely monitored — you may long for the freedom that we old-timers had back in the day, when most firewalls blocked only inbound traffic and allowed people inside the firewall to make arbitrary outbound connections. By "arbitrary", I mean the kinds of connections that let you do more than simply surf the Web using the HTTP and HTTPS protocols. For instance, most organizations these days don't allow you to make outbound connections that use these protocols:

FTP (file transfer)
TELNET (remote login)
NNTP (Usenet news)
SMTP/POP3/IMAP (sending and receiving email)
SSH (encrypted remote login)

Wouldn't it be nice if there was a way to make secure outbound connections using arbitrary protocols, even from behind a firewall that allows only monitored Web surfing? Well, you can, and here's how to do it.

First, you'll need a Linux machine outside of the firewall to which you have "root" access. You could take an old PC, install Linux on it, and hook it up to the Internet from your house. You'll probably need to ask your Internet Service Provider to give you a public IP address. Another way is to purchase a virtual Linux server from a Virtual Private Server (VPS) hosting company, such as Linode, where you can get a nice low-end server with a public IP address for US$19.95/month.

Next, install the OpenSSH server on that machine. If you're using Ubuntu Linux, you can do that by executing this command:

sudo apt-get install openssh-server

You probably won't have to do this, since most Linux distributions come with an OpenSSH server already installed. Next, install the stunnel utility with this command:

sudo apt-get install stunnel4

stunnel is a tool that listens for TLS (also known as SSL) connections on a specified port, decrypts the incoming client traffic, and forwards it to an arbitrary host and port. If you don't use apt-get to install stunnel, then you'll have to manually generate the TLS certificate file (see these instructions).

Next, we'll configure stunnel to listen on port 443, the HTTPS protocol port, which mimics the behavior of a secure Web server, and forward the traffic received there to port 22, the SSH server port, on the same machine. Do this by creating a text file named stunnel.conf containing this text:

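; Run in the background as a daemon and log to a file
foreground = no
output = /tmp/stunnel.log

; Accept TLS connections on the HTTPS port and pass the decrypted
; traffic to the SSH server on this machine
[sshtunnel]
accept = 443
connect = 22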
Then, start stunnel with this command:

sudo stunnel stunnel.conf

The stunnel process will continue to run in the background as a daemon, writing log information to the file /tmp/stunnel.log. Any connection made to port 443 (the HTTPS port) on your Linux box will be received as an encrypted TLS connection, and the decrypted traffic will be forwarded to port 22 (the SSH server port) on the same machine.
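One way to check that the server side is working is to open a TLS connection to port 443 yourself and see whether an SSH greeting comes back. The Python sketch below does that. It assumes the stunnel certificate is self-signed, so certificate verification is switched off for this test only; the function names are mine.

```python
import socket
import ssl


def looks_like_ssh_banner(data):
    """An SSH server opens every connection with a line that starts with 'SSH-'."""
    return data.startswith(b"SSH-")


def read_banner_over_tls(host, port=443, timeout=5.0):
    """Open a TLS connection to host:port and return the first bytes the far side sends."""
    context = ssl.create_default_context()
    # Assumption: the stunnel certificate is self-signed, so verification is
    # disabled for this quick check. Do not do this for real Web traffic.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.recv(256)
```

Calling looks_like_ssh_banner(read_banner_over_tls("yourlinuxbox")) from any machine that can reach your Linux box should return True if stunnel is forwarding to the SSH server.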

Maybe you can see where this is going. Most firewalls that let you surf the Web will let you make connections to port 443, the HTTPS port, because that's how Web browsers connect to secure Web servers. Thanks to the stunnel process running on your Linux box, any TLS connection made to port 443 on your Linux box is automatically forwarded to your SSH server. If we can get an SSH client to make a connection to port 443 on the Linux box, then we can log in to it.

Of course, this means you need to have an SSH client on a machine behind the firewall. If you're running Linux, you probably have the OpenSSH client installed already, but if not, you can get it under Ubuntu Linux with this command:

sudo apt-get install openssh-client

If you're running Windows, you have two options:

Install Cygwin, a free Linux emulation package that includes the OpenSSH client
Install PuTTY, a free Windows-based SSH client

I recommend installing Cygwin, because it contains both stunnel and OpenSSH. The commands shown below use the OpenSSH client syntax.

Now all we need to do is find a way to make the OpenSSH client that's behind your organization's firewall connect to port 443 on your Linux box. The OpenSSH client has a -p option that lets you specify the TCP port to which the client should connect. We could try to use that option to tell the OpenSSH client to connect to port 443 on your Linux box, but that wouldn't work. The problem is that the SSH client doesn't speak the TLS protocol, which is needed to communicate with the stunnel daemon listening on port 443. The solution is to use stunnel again, but running in client mode on your machine inside the firewall. Just create a file named stunnel.conf on your machine inside the firewall containing this text:

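; Run in the background as a daemon and log to a file
foreground = no
output = /tmp/stunnel.log

; Client mode: accept plain TCP on local port 9999 and wrap it in TLS
; on the way to port 443 on the Linux box
[sshtunnel]
client = yes
accept = 9999
connect = yourlinuxbox:443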

where yourlinuxbox is the hostname or IP address of your Linux machine outside the firewall. Start the client-side stunnel daemon like this:

stunnel stunnel.conf

Now you have an stunnel client listening for TCP connections on port 9999 on your machine inside the firewall. When an SSH client connects to port 9999 on your machine, the connection will be tunneled over a TLS-encrypted connection through the firewall to port 443 on your Linux box. From there, the traffic will be forwarded by the stunnel daemon to the SSH server on your Linux box, and you can log in to the Linux box. To make such a connection, execute this OpenSSH client command on the same machine that's running the stunnel client:

ssh -p 9999 localhost

This works because the outbound connection to port 443 looks like a normal Web browser connecting to a secure Web site! A firewall that does not intercept and decrypt TLS cannot tell that it's really carrying SSH protocol data, because the data is encrypted end-to-end using the TLS protocol, a Web standard for securing the traffic between browsers and Web servers.

So all of this lets you SSH to your Linux box from behind a firewall. What about all those other protocols? How will this help you connect to a public NNTP or IMAP server, or surf to a Web site that is blocked by your firewall? This is where the power of SSH comes in. The OpenSSH client has a -L option that forwards connections received on arbitrary local TCP ports over the encrypted SSH connection. We'll use that option to open any number of other tunnels through the firewall.

Suppose you want to connect to Google's secure IMAP mail server (which listens on TCP port 993) to read your email using a desktop email reader such as Thunderbird or Outlook, but your organization's firewall blocks outbound secure IMAP connections. After setting up the stunnel client and server as described above, use this OpenSSH client command to log in to your Linux box:

ssh -p 9999 -L 10993:imap.gmail.com:993 localhost

The -L option tells OpenSSH to listen for connections on port 10993 on your machine and forward them over the encrypted SSH connection (which is itself forwarded over the encrypted stunnel connection) to Google's IMAP server (imap.gmail.com) listening on port 993. Then simply configure your mail reader to connect to port 10993 on host localhost, and it will actually connect to Google's IMAP server. You can use as many -L options as you want, each one forwarding a different local TCP port to an arbitrary remote host and port.
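If you forward several ports at once, the command line gets long. As a sketch of how you might assemble it (this is not the tunnel script mentioned at the end of this post, and news.example.com is a hypothetical NNTP server), a few lines of Python can build the ssh command from a list of forwards:

```python
import shlex


def ssh_forward_command(tunnel_port, forwards, host="localhost"):
    """Build an ssh argument list with one -L option per (local_port, remote_host, remote_port)."""
    argv = ["ssh", "-p", str(tunnel_port)]
    for local_port, remote_host, remote_port in forwards:
        argv += ["-L", f"{local_port}:{remote_host}:{remote_port}"]
    argv.append(host)
    return argv


command = ssh_forward_command(9999, [
    (10993, "imap.gmail.com", 993),
    (10119, "news.example.com", 119),  # hypothetical NNTP server
])
# Print the command as you would type it in a shell (shlex.join needs Python 3.8 or later).
print(shlex.join(command))
```

The list form can be passed directly to subprocess.run if you want Python to start the connection for you.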

Clearly, this is a powerful way to make arbitrary TCP connections from behind a firewall when that firewall only allows Web surfing. But what if your organization's firewall implements a "net nanny", which is software that watches your outbound Web connections and blocks connections to sites deemed unacceptable? Since you can't forward hundreds or thousands of ports to every possible Web server you might want to browse, the above SSH client command doesn't help. Instead, we'll use OpenSSH's built-in SOCKS proxy.

A SOCKS proxy is a software server that lets applications connect to it, and it forwards those connections to arbitrary remote hosts and ports. The application needs to have built-in support for SOCKS proxies, but thankfully nearly every Web browser supports SOCKS. We can use this to enable unrestricted Web surfing over our SSH tunnel. We do this by using OpenSSH's -D option:

ssh -p 9999 -D 8888 localhost

The above command creates a SOCKS5 proxy listening on port 8888 on your machine inside the firewall. Applications that connect to port 8888 on your machine can request the connection to be forwarded to an arbitrary remote host and port. That forwarded connection is tunneled through the firewall over the encrypted SSH connection (which is itself tunneled over the encrypted stunnel connection).
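To confirm that the proxy is up before touching your browser settings, you can send it the opening message of the SOCKS5 protocol (defined in RFC 1928) and look at the two-byte reply. This Python sketch assumes the proxy is on localhost port 8888 as above and that it accepts the "no authentication" method; the function names are mine.

```python
import socket

# SOCKS5 greeting: version 5, one authentication method offered,
# method 0 ("no authentication required").
SOCKS5_GREETING = b"\x05\x01\x00"


def accepts_no_auth(reply):
    """True if the proxy's 2-byte reply selects SOCKS version 5 with no authentication."""
    return reply == b"\x05\x00"


def probe_socks5(host="localhost", port=8888, timeout=5.0):
    """Send the greeting to the proxy and report whether it answers like a SOCKS5 server."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(SOCKS5_GREETING)
        return accepts_no_auth(sock.recv(2))
```

With the ssh command above still running, probe_socks5() should return True.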

Now configure your Web browser to use a SOCKS5 proxy on port 8888 on host localhost, and you are good to go. If possible, configure your Web browser to do remote DNS queries, which causes it to tunnel DNS queries over the SOCKS proxy instead of doing them from behind your firewall (Firefox users see these instructions). You don't want your organization's DNS administrators seeing all those name resolution requests.

When using this tunnel, your organization's firewall sees only a single outbound TLS connection to port 443 on your Linux machine. This looks like a single, long-lived connection to a secure Web server. The only evidence that it's not normal Web-surfing is that it lasts for a long time and, if you are uploading a lot of data over the tunnel, the traffic is not dominated by data flowing inbound. If you think that might arouse suspicion, then don't upload data and don't leave the tunnel up for more than a few minutes at a time. If you are asked about the duration of the connection, you can always blame your Web browser for leaving the connection open. No matter what happens, the traffic over the tunnel is completely encrypted.

Yes, this is complicated. Yes, this is something only geeks will want to do. But if you've made it this far, it's time to admit you're a geek. Since this procedure requires running two separate commands on your machine behind the firewall, I've written a shell script called tunnel that automates the process. You still have to start the stunnel server on the Linux box outside the firewall, but this script can do that for you too. Download the script from here: http://li58-96.members.linode.com/~franl/code/tunnel. Invoke it with option --help to see a usage summary.
