Robust detection of HTTP client IP address including connections through reverse proxy, load balancer, web acceleration and application firewall

The Hypertext Transfer Protocol (HTTP), the protocol that drives the World Wide Web under the hood, runs over the Transmission Control Protocol (TCP), so a server can always learn the IP address of the other end by examining the properties of the TCP connection.

Why does one need to perform anything special to accurately detect the client IP address?

Here is why: examining the TCP connection properties for the client endpoint IP address only works when the connection between the client and the server is direct, meaning that no reverse proxy servers, application firewalls, load balancers, web accelerators or content delivery networks (CDNs, such as Limelight, Cotendo, Akamai and many others) sit in between. All of these involve a reverse proxy of some kind, one way or another.

What is a reverse proxy?

A reverse proxy, in a nutshell, is a web server acting as a proxy for the client, meaning that it forwards the client's requests or executes requests on the client's behalf. More than one reverse proxy server might be involved in delivering and serving a single HTTP request. As a result, the TCP client endpoint that the web server sees is the IP address of the last reverse proxy server in the chain, not the address of the original client.

This property of proxy servers is widely used for anonymization (hiding one’s Internet activity traces).

So, how does one know the IP address of the real client?

Since IP addresses are such a crucial piece of information for debugging, analysis and other intelligence, a mechanism has been devised to carry the client IP address from the first reverse proxy server in the chain to the web server servicing the request. It uses an HTTP request header for transport.

HTTP header fields control various aspects of the HTTP request and response, and the best part about them is that they are extensible: anyone can add headers to a request as long as they conform to the HTTP protocol specification.

X-Forwarded-For saves the day

The X-Forwarded-For (XFF) HTTP header field is a de facto standard for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer.

The general format of the field is:

X-Forwarded-For: client, proxy1, proxy2

where the value is a comma-and-space-separated list of IP addresses. The left-most address is the original client, and each successive proxy that passes the request on appends the IP address from which it received the request. In this example, the request passed through proxy1, proxy2, and then proxy3 (not shown in the header); proxy3 appears as the remote address of the request.
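
To make the format concrete, here is a minimal sketch of both sides of the header: what a proxy does when it passes a request on, and how a server can split the value back into addresses. The ForwardedForSketch class and its method names are hypothetical and exist only for this illustration; the addresses used below come from the ranges reserved for documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper, not part of any framework: illustrates the header format only.
public static class ForwardedForSketch
{
    // What a proxy does when passing a request on: append the address it
    // received the request from to any existing header value.
    public static string Append(string existingValue, IPAddress receivedFrom)
    {
        return string.IsNullOrWhiteSpace(existingValue)
            ? receivedFrom.ToString()
            : existingValue + ", " + receivedFrom;
    }

    // Split a header value into addresses, left-most (claimed client) first.
    // Entries that are not valid IP addresses (for example "unknown") are skipped.
    public static List<IPAddress> Parse(string headerValue)
    {
        var result = new List<IPAddress>();
        if (string.IsNullOrWhiteSpace(headerValue))
        {
            return result;
        }

        foreach (string part in headerValue.Split(','))
        {
            IPAddress address;
            if (IPAddress.TryParse(part.Trim(), out address))
            {
                result.Add(address);
            }
        }

        return result;
    }
}
```

For instance, calling Append with the existing value "203.0.113.7" and the address 198.51.100.10 yields "203.0.113.7, 198.51.100.10", and Parse on that value returns the two addresses in the same order.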

By examining the X-Forwarded-For HTTP header field value, you can track the real IP address of a client on the Internet accessing your web server in a reverse proxy scenario, even if your web server is not routable from the Internet.

Security considerations

Since it is easy to forge an X-Forwarded-For field, the given information should be used with care. The last IP address in the list is always the address that connected to the last proxy, which makes it the most reliable entry. X-Forwarded-For data can be used in a forward or reverse proxy scenario.

You should NOT trust all X-Forwarded-For information in this scenario, as you may have received bogus information from the Internet. A trust list should therefore be used to make sure that the proxy IP addresses in the X-Forwarded-For field are ones you trust.
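
One way to apply a trust list is to walk the header from right to left, starting from the TCP peer, and stop at the first address that is not a trusted proxy. The following is a rough sketch of that idea under stated assumptions, not a hardened implementation: the TrustedProxySketch class, the ResolveClient method and the shape of the trust list are all hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper: the trust list and all names here are illustrative assumptions.
public static class TrustedProxySketch
{
    // remoteAddress:  the TCP peer of the request, which headers cannot forge.
    // forwardedFor:   the raw X-Forwarded-For value, or null if absent.
    // trustedProxies: addresses of proxies you operate or otherwise trust.
    public static IPAddress ResolveClient(
        IPAddress remoteAddress,
        string forwardedFor,
        ICollection<IPAddress> trustedProxies)
    {
        // A direct or untrusted peer: ignore the header entirely.
        if (!trustedProxies.Contains(remoteAddress) || string.IsNullOrWhiteSpace(forwardedFor))
        {
            return remoteAddress;
        }

        string[] parts = forwardedFor.Split(',');
        IPAddress candidate = remoteAddress;

        // Walk right to left: each entry was added by the hop to its right.
        for (int i = parts.Length - 1; i >= 0; i--)
        {
            IPAddress address;
            if (!IPAddress.TryParse(parts[i].Trim(), out address))
            {
                break; // Malformed entry: stop and keep the last accepted address.
            }

            candidate = address;

            if (!trustedProxies.Contains(address))
            {
                break; // First address not on the trust list: treat it as the client.
            }
        }

        return candidate;
    }
}
```

With this approach, anything a client writes into the header before the request reaches your first trusted proxy sits to the left of the address that proxy appended, so the walk never reaches it.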

Code sample

For convenience, I have created a C# class which implements the approach described above for determining the client IP address.

Here’s a short usage example for this class:

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Net;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // HttpClientInfo contains the information regarding the
        // current client IP address
        HttpClientInfo client = HttpClientInfo.Create(HttpContext.Current);

        // The first member of the ClientIPAddresses array is the
        // client IP address
        // (either delivered through a reverse proxy, or directly)
        IPAddress ip = client.ClientIPAddresses[0];

        Response.Write("Real client IP address: " + ip.ToString());
    }
}

The HttpClientInfo class

The code sample above would be useless without the actual HttpClientInfo class implementation.

Here is the code listing for the class:

using System;
using System.Collections.Generic;
using System.Web;
using System.Net;
using System.Collections.Specialized;

/// <summary>
/// Http client information with reverse proxy support.
/// </summary>
public class HttpClientInfo
{
    // Private data members
    private IPAddress[] m_clientIPAddresses = new IPAddress[] { };
    private string[] m_reverseProxyChain = new string[] { };

    /// <summary>
    /// Client IP address chain, including reverse proxies.
    /// 
    /// The first IP address in the chain will be the address of the real client.
    /// </summary>
    public IPAddress[] ClientIPAddresses
    {
        get
        {
            return m_clientIPAddresses;
        }
    }

    /// <summary>
    /// Reverse proxy information chain.
    /// </summary>
    public string[] ReverseProxyChain
    {
        get
        {
            return m_reverseProxyChain;
        }
    }

    /// <summary>
    /// Private constructor to prevent instances of this class from being created directly.
    /// </summary>
    private HttpClientInfo()
    {
    }

    /// <summary>
    /// Create client info object from HttpRequest.
    /// </summary>
    /// <param name="request">HTTP request</param>
    /// <returns>HTTP client information</returns>
    public static HttpClientInfo Create(HttpRequest request)
    {
        HttpClientInfo result = new HttpClientInfo();

        result.m_clientIPAddresses = GetClientIPs(request);
        result.m_reverseProxyChain = GetReverseProxyServers(request);

        return result;
    }

    /// <summary>
    /// Create client info object from HttpContext.
    /// </summary>
    /// <param name="context">HTTP context</param>
    /// <returns>HTTP client information</returns>
    public static HttpClientInfo Create(HttpContext context)
    {
        return Create(context.Request);
    }

    /// <summary>
    /// Create client info object from current HTTP context.
    /// </summary>
    /// <returns>HTTP client information</returns>
    public static HttpClientInfo Create()
    {
        return Create(HttpContext.Current.Request);
    }

    /// <summary>
    /// Get reverse proxy servers chain information.
    /// </summary>
    /// <param name="request">HTTP request</param>
    /// <returns>Array of proxy servers descriptions</returns>
    private static string[] GetReverseProxyServers(HttpRequest request)
    {
        List<string> result = new List<string>();

        // Via is the standard header for proxy server chain identification;
        // X-Via is a non-standard variant
        result.AddRange(GetValues(request.Headers, "X-Via"));
        result.AddRange(GetValues(request.Headers, "Via"));

        return result.ToArray();
    }

    /// <summary>
    /// Get client IP address chain including reverse proxies.
    /// </summary>
    /// <param name="request">HTTP request</param>
    /// <returns>Array of client IP addresses</returns>
    private static IPAddress[] GetClientIPs(HttpRequest request)
    {
        List<IPAddress> result = new List<IPAddress>();

        // There might be several X-Forwarded-For headers
        foreach (string ipValues in GetValues(request.Headers, "X-Forwarded-For"))
        {
            // IP addresses are comma-delimited, usually with surrounding whitespace
            foreach (string ip in ipValues.Split(
                new char[] { ',', ' ', '\t', '\r', '\n' },
                StringSplitOptions.RemoveEmptyEntries))
            {
                // Skip entries that are not valid IP addresses (for example
                // "unknown" or forged garbage) instead of throwing
                IPAddress address;
                if (IPAddress.TryParse(ip.Trim(), out address))
                {
                    result.Add(address);
                }
            }
        }

        // Either real client IP or the last reverse
        // proxy server in the chain
        result.Add(IPAddress.Parse(request.UserHostAddress));

        return result.ToArray();
    }

    /// <summary>
    /// Get all values for specified NameValueCollection keys
    /// if they exist.
    /// </summary>
    /// <param name="nameValueCollection">Input name/value collection</param>
    /// <param name="keys">Keys to look for</param>
    /// <returns>Array of values</returns>
    private static string[] GetValues(NameValueCollection nameValueCollection, params string[] keys)
    {
        List<string> result = new List<string>();

        foreach (string key in keys)
        {
            string[] values = nameValueCollection.GetValues(key);

            if (values != null)
            {
                result.AddRange(values);
            }
        }

        return result.ToArray();
    }
}