Lab - 10 - Web browsing privacy issues

Introduction

Browsers are the main tools we use to access Web pages, through which we obtain many kinds of services. Many of those services are free of charge, but in practice users pay for them with their privacy.

The privacy-related add-ons that we are going to explore are for the Firefox browser, but equivalents may exist for other browsers.

User privacy and Web cookies

The privacy of a user behind a Web browser is a complex matter. First, Web services store many kinds of data items in order to recognize a user across separate interaction periods. HTTP cookies are usually used for this, but they are not the only way to do it. This is why, today, Web pages ask whether we agree to the use of cookies, or ask us which cookies we agree to use for a given service (session cookies, third-party cookies, efficiency-related cookies, etc.).

Sessions, cookies, logins and session cookies

Cookies were designed to track a given client across separate HTTP request/response interactions. Such tracking defines a session. Many sessions begin transparently for the user (when the browser installs a cookie provided by the service) and many never end, which allows a service to track a browser for a potentially long period of time.

A cookie is an arbitrary key-value pair that is set in a browser by a resource provider. The ultimate goal is to "mark" a client browser in order to recognize it again on a future contact between browser and service. Technically, a cookie is bound to a portion of the URL authority, or domain (the DNS part of the URL), that provided it. Before the browser fetches a resource, that portion is matched against the URL of the resource. For instance, www.google.com matches only that exact name, while .google.com matches any DNS name with that suffix. If a match occurs, the cookie is automatically sent along with the request.
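
The matching rule just described can be sketched in a few lines of JavaScript. This is a simplified illustration of the rule as stated in this guide, not the complete algorithm that browsers implement; the function name cookieDomainMatches is hypothetical.

```javascript
// Simplified sketch: decide whether a cookie bound to `cookieDomain`
// would be sent with a request to `requestHost`.
// Assumption: a leading dot means "this domain and any name with this suffix",
// and no leading dot means "this exact name only".
function cookieDomainMatches(cookieDomain, requestHost) {
    const domain = cookieDomain.toLowerCase();
    const host = requestHost.toLowerCase();
    if (domain.startsWith(".")) {
        // Suffix form: match the bare domain itself or any name ending in it.
        return host === domain.slice(1) || host.endsWith(domain);
    }
    // Exact form: the host name must be identical.
    return host === domain;
}

console.log(cookieDomainMatches("www.google.com", "www.google.com"));
console.log(cookieDomainMatches("www.google.com", "mail.google.com"));
console.log(cookieDomainMatches(".google.com", "mail.google.com"));
```

Note that the suffix test includes the leading dot, so a name such as evilgoogle.com does not match .google.com.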

You should not confuse a Web session with a (Web) login session. Upon a login operation, the server creates a login session, which is a Web session that uses a cookie that maps to a user identity. But a server can also keep a session with an anonymous user. In other words, an anonymous session is one where the service can correlate separate requests from the same Web browser without knowing the identity of its user. Thus, a login session is a Web session, but the converse is not necessarily true.

Although cookies are used to establish transparent or explicit sessions with Web services, the latter can control their persistence. Cookies have a lifetime and they can be tagged as session cookies. Session cookies are cookies that should be discarded when the browser that holds them is closed. Session cookies are not necessarily the cookies associated with a user login. It is up to the Web service to make this association or not.
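
A service sets a cookie with the Set-Cookie response header, and a cookie that carries neither an Expires nor a Max-Age attribute is a session cookie. The sketch below classifies a Set-Cookie value on that basis. The function name describeSetCookie and the cookie values are hypothetical, and the parser assumes a well-formed name=value pair at the start of the header.

```javascript
// Sketch: parse a Set-Cookie header value and tell whether it
// defines a session cookie (no Expires and no Max-Age attribute).
function describeSetCookie(headerValue) {
    const parts = headerValue.split(";").map((s) => s.trim());
    const pair = parts[0];
    const eq = pair.indexOf("="); // assumes "name=value" is present
    const cookie = {
        name: pair.slice(0, eq),
        value: pair.slice(eq + 1),
        session: true,
    };
    for (const attr of parts.slice(1)) {
        const key = attr.split("=")[0].trim().toLowerCase();
        if (key === "expires" || key === "max-age") {
            cookie.session = false; // an explicit lifetime makes it persistent
        }
    }
    return cookie;
}

// Hypothetical header values, for illustration only.
const anonymous = describeSetCookie("MoodleSession=abc123; Path=/; Secure; HttpOnly");
const persistent = describeSetCookie("prefs=lang-pt; Max-Age=31536000; Path=/");
console.log(anonymous.name, anonymous.session);
console.log(persistent.name, persistent.session);
```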

Technically, Web services do not need the user's consent to install cookies in the browser. In fact, the cookie-related dialogs we see are mandated by law, not by technology. And many services refuse to work without cookies, so they can force users to accept them.

Tracking, in this context, means being able to follow the actions of a user, or of their browser, as they navigate through many different Web pages.

Many services, namely those that users explicitly look for, are not interested in tracking users themselves, but they can help third-party trackers to do so (and earn money from that cooperation). For instance, if a Web resource provided by service A makes a request to a resource provided by a tracker T, the tracker has the opportunity to install a cookie to identify the browser, and to bind that cookie to service A (because the request names A as the "referrer" of the tracker resource, in the HTTP Referer header). If service B does the same, the tracker will be able to learn that the same browser used services A and B; and this can continue indefinitely, building a usage profile for the user of that browser.
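
The mechanism can be simulated without any network access. The following sketch models only the tracker's bookkeeping, under the assumption that it keys its profiles on the cookie value it handed out; the Tracker class and the site names are hypothetical.

```javascript
// Sketch of a third-party tracker's bookkeeping (no real HTTP involved).
class Tracker {
    constructor() {
        this.nextId = 1;
        this.profiles = new Map(); // cookie value -> Set of referring sites
    }

    // cookie:  the tracker cookie sent by the browser, or null if it has none.
    // referer: the site whose page embedded the tracker resource.
    // Returns the cookie value the tracker asks the browser to keep.
    handleRequest(cookie, referer) {
        const id = cookie !== null ? cookie : "browser-" + this.nextId++;
        if (!this.profiles.has(id)) {
            this.profiles.set(id, new Set());
        }
        this.profiles.get(id).add(referer);
        return id;
    }
}

const tracker = new Tracker();
let trackerCookie = null; // the browser starts with no tracker cookie

// Pages of service A and service B both embed a tracker resource.
trackerCookie = tracker.handleRequest(trackerCookie, "service-a.example");
trackerCookie = tracker.handleRequest(trackerCookie, "service-b.example");
console.log([...tracker.profiles.get(trackerCookie)]);

// If the user deletes the cookie, the tracker can no longer link new visits
// to the old profile and starts a fresh one.
const oldCookie = trackerCookie;
trackerCookie = null;
trackerCookie = tracker.handleRequest(trackerCookie, "service-c.example");
console.log(oldCookie !== trackerCookie);
```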

The cookies that are installed by resources explicitly requested by users (first-party resources) are called first-party cookies. On the other hand, the cookies installed upon references from first-party resources to third-party resources, provided by third-party services, are called third-party cookies.

Today many Web pages use third-party resources, namely static data items such as JavaScript/CSS-based frameworks (e.g. Bootstrap), fonts (e.g. Google Fonts), JavaScript packages (e.g. jQuery), etc. But many also use third-party ad providers, which are services that provide targeted advertising content (usually called ads). These services usually exploit third-party cookies to build a profile of the client, in order to tailor the ads they provide.

Cookies are an effective and evident way of eroding a user's privacy over a long period of Web browsing. The use of private tabs (their definition is not the same for all browsers) and the option to delete all cookies when closing the browser are two effective ways to clear all the cookies accumulated during a browsing session, and therefore to break the link between consecutive requests made to a tracker.

Install a cookie manager in your browser. For Firefox, you can use Cookie Quick Manager. Then, observe the cookies you have. If you have been browsing for a long time, you may have accumulated a lot of cookies. Try to identify first-party cookies and third-party cookies. Check how many third-party cookies are session cookies, and check the lifetime of the others.

Delete all cookies.

Access www.google.com. You should get a consent dialog, which you should accept (do not sign in!). Check the installed cookies (there should be 4) and confirm that none of them is a session cookie.

Access elearning.ua.pt, but do not log in. Check your cookies; you should have a session cookie called MoodleSession. Next, log in. Now, you should have another session cookie for elearning.ua.pt (named _shibsession_...) and another 3 new cookies for idp.ua.pt, two of them session cookies.

Finally, access a Portuguese newspaper Web site. Check your cookies now. See how many third-party cookies were installed. Draw your conclusions.

Web browser fingerprinting

Can a Web service identify your browser without using cookies? The answer is yes, although success is not guaranteed. The trick is called browser fingerprinting.

Browser fingerprinting consists of using the information that a browser provides to a Web service to build a browser profile. The more unique a profile is, the better a Web service can identify a given browser, and the lower the privacy (or anonymity) of its user.

Fingerprinting from HTTP header fields

The HTTP protocol used in Web interactions uses requests that carry a header. This header specifies details that can modify the way the request is going to be handled. For instance, the header specifies the browser type (called the User-Agent, which includes the browsing engine and the operating system), the preferred languages for the requested resource, how the resource can be provided (e.g. compressed content encoding), and the data formats accepted by the browser for the resources provided (e.g. HTML, PDF, image formats such as GIF and JPEG), among others. In short, two users using exactly the same browser on exactly the same operating system can still provide Web services with different headers in their HTTP requests.

Fingerprinting with header fields is usually not very fine-grained, unless your browser sends some very unusual header field or value that makes it close to unique.
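
As an illustration, a service could reduce a few request header fields to a short identifier. The sketch below is one possible way to do it, not a description of what any particular tracker does. The choice of header fields, the function names, and the sample header values are assumptions, and the hash is a simple FNV-1a-style function applied to UTF-16 code units, chosen only because it needs no library.

```javascript
// Header fields assumed to be used for the fingerprint (an illustrative choice).
const FINGERPRINT_HEADERS = ["user-agent", "accept", "accept-language", "accept-encoding"];

// Small non-cryptographic hash (FNV-1a style), returned as 8 hex digits.
function fnv1a32(text) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash.toString(16).padStart(8, "0");
}

// Build the string that gets hashed: one "name:value" line per selected field.
function fingerprintMaterial(headers) {
    const normalized = {};
    for (const [name, value] of Object.entries(headers)) {
        normalized[name.toLowerCase()] = value; // header names are case-insensitive
    }
    return FINGERPRINT_HEADERS
        .map((name) => name + ":" + (normalized[name] ?? ""))
        .join("\n");
}

function headerFingerprint(headers) {
    return fnv1a32(fingerprintMaterial(headers));
}

// Hypothetical request headers, for illustration only.
const request = {
    "User-Agent": "ExampleBrowser/1.0 (ExampleOS)",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "pt-PT,pt;q=0.8,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
};
console.log(headerFingerprint(request));
```

Two requests with the same values in these fields always yield the same identifier, with or without cookies.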

JavaScript fingerprinting

JavaScript can be used to fetch parameters from the JavaScript run-time environment. These include details about the JavaScript interpreter, the list of available plug-ins, the list of available audio formats, the dimensions of the screen, etc. Although some of these parameters may change over time, such as the screen dimensions, some are likely to be quite rare, which increases the odds of obtaining a unique fingerprint.

JavaScript fingerprinting only works if you enable JavaScript in your browser for a given resource. JavaScript blockers can prevent fingerprinting, but they also remove functionality from the page.
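
The sketch below shows how a script could collect a few such parameters. It reads standard properties of the navigator and screen objects. The function name collectAttributes and the selection of attributes are assumptions, and real fingerprinting scripts typically gather many more signals.

```javascript
// Sketch: gather a few run-time attributes that a fingerprinting script
// could read. `env` defaults to the global object (window, in a browser);
// missing objects or properties are reported as null.
function collectAttributes(env = globalThis) {
    const nav = env.navigator ?? {};
    const scr = env.screen ?? {};
    return {
        userAgent: nav.userAgent ?? null,
        language: nav.language ?? null,
        platform: nav.platform ?? null,
        hardwareConcurrency: nav.hardwareConcurrency ?? null,
        pluginCount: nav.plugins ? nav.plugins.length : null,
        screenSize: scr.width !== undefined ? scr.width + "x" + scr.height : null,
        colorDepth: scr.colorDepth ?? null,
        timeZone: Intl.DateTimeFormat().resolvedOptions().timeZone ?? null,
    };
}

console.log(collectAttributes());
```

Pasting the function into the browser console (in Firefox, Tools > Browser Tools > Web Developer Tools) and calling collectAttributes() shows the values your own browser exposes.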

Fingerprinting demonstration tools

Access the Web page www.amiunique.org/fp to get an idea of your browser fingerprint. Observe the elements used in the fingerprint and how unique they are (from the site's point of view). The lower the percentage, the more unique you are. Red marks highlight elements that are likely to be critical in making the fingerprint of your browser unique.

Access the Web page coveryourtracks.eff.org to get your browser analysed. Do not use a third-party tracker in the analysis (uncheck the checkbox). You should get an evaluation of how unique your browser is, which is also expressed as a number of bits of identifying information. With n bits, roughly one in 2^n of the browsers seen by the site shares your fingerprint. The higher the number of bits, the more unique you are.
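
The relation between a bit count and rarity can be worked out with two one-line functions. This is a sketch under the assumption that the bit count is the base-2 logarithm of how rare a value is among the observed browsers; the function names are hypothetical.

```javascript
// Bits of identifying information carried by a value that is shared
// by the given fraction of browsers (0 < fraction <= 1).
function identifyingBits(fraction) {
    return -Math.log2(fraction);
}

// Inverse view: with this many bits, about one browser in how many
// shares the same value?
function oneInHowMany(bits) {
    return Math.pow(2, bits);
}

console.log(identifyingBits(1 / 1024)); // a value seen in 1 of 1024 browsers
console.log(oneInHowMany(17.5));        // rarity implied by 17.5 bits
```

If the individual attributes were statistically independent, their bits would simply add up; in practice many attributes are correlated, so the total is usually lower than the sum.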

Repeat these experiments with different browsers and, if possible, with different hosts (e.g. with virtual machines). Draw your conclusions.

Bottom line

The more tweaked your browser is, the more unique you become, and the easier it is to fingerprint you. In order to disappear in the crowd, you cannot stand out.

Tracking with HTML5 storage: supercookies

There are several other ways to track a browser. One of them is based on HTML5 storage, available through JavaScript. There are 2 types of storage provided by the JavaScript engine: local and session. Local storage is scoped by the HTTP same-origin policy, thus different services use different variables. Local storage persists when the browser is terminated. Session storage is removed when the tab or window that created it is closed.

This storage can be used to implement something similar to a cookie-based tracking system, while escaping the tools that supervise cookies. This is why some people call them supercookies: they are more resilient to user control and cleaning actions than regular cookies.
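
A minimal sketch of such a supercookie follows. It assumes a storage object with the getItem and setItem methods of the Web Storage API; in a page this would be window.localStorage, while the MemoryStorage class below is a hypothetical stand-in so that the example also runs outside a browser. The key name visitorId and the way the identifier is generated are illustrative choices.

```javascript
// Return the identifier kept in `storage`, creating one on the first visit.
function getOrCreateVisitorId(storage, key = "visitorId") {
    let id = storage.getItem(key);
    if (id === null) {
        // First visit: invent an identifier and remember it.
        id = Math.random().toString(36).slice(2) + Date.now().toString(36);
        storage.setItem(key, id);
    }
    return id;
}

// In-memory stand-in offering the two Web Storage methods used above.
class MemoryStorage {
    constructor() {
        this.items = new Map();
    }
    getItem(key) {
        return this.items.has(key) ? this.items.get(key) : null;
    }
    setItem(key, value) {
        this.items.set(key, String(value));
    }
}

const storage = new MemoryStorage(); // in a page: window.localStorage
const firstVisit = getOrCreateVisitorId(storage);
const laterVisit = getOrCreateVisitorId(storage);
console.log(firstVisit === laterVisit); // same identifier on every visit
```

No cookie is involved, so deleting cookies alone does not remove the identifier; the site data for the origin has to be cleared as well.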

Use the following HTML page to manage storage variables (for the origin of this page only). You can add new variables, or change their value. Create a few variables, both in local and in session storage. Close the page (tab or window). Open it again. See what happened.

<html>

<head>
<title>Web storage inspector</title>
<script>
// Ask for a name/value pair; returns null if the user cancels a prompt.
function promptForVar( kind )
{
    var name = window.prompt( kind + " variable name:" );
    if (name === null) return null;
    var value = window.prompt( kind + " variable value:" );
    if (value === null) return null;
    return { name: name, value: value };
}

function addSessionVar()
{
    var v = promptForVar( "Session" );
    if (v) sessionStorage.setItem( v.name, v.value );
    inspectStorage();
}

function addLocalVar()
{
    var v = promptForVar( "Local" );
    if (v) localStorage.setItem( v.name, v.value );
    inspectStorage();
}

// Show the contents of one storage area in the element with the given id.
function showStorage( id, label, storage )
{
    var e = document.getElementById( id );
    e.textContent = label + ' storage has ' + storage.length + ' elements';
    Object.keys( storage ).forEach( function( key ) {
        // Text nodes are used so that stored values are never parsed as HTML.
        e.appendChild( document.createElement( 'br' ) );
        e.appendChild( document.createTextNode( key + ' = ' + storage.getItem( key ) ) );
    });
}

function inspectStorage()
{
    showStorage( "localStorage", "Local", localStorage );
    showStorage( "sessionStorage", "Session", sessionStorage );
}
</script>
</head>

<body onload="inspectStorage();">
<div id="localStorage">
</div>
<br>
<div id="sessionStorage">
</div>
<br>
<div>
<input onclick="addLocalVar();" type="button" value="Add variable to local storage">
<br>
<br>
<input onclick="addSessionVar();" type="button" value="Add variable to session storage">
</div>
</body>
</html>
