User Tracking on the Web via Cross-Browser Fingerprinting

Introduction

In the early days of the Web, users could be effectively identified by the IP addresses of their computers. Later on, as dynamic IP addresses and Network Address Translation became widespread, this piece of information alone was no longer enough; instead, the browsing habits of a user could be tracked by storing an identifier in a cookie in the web browser, so that it would supposedly identify the user in every HTTP request containing the cookie. This technique has two significant disadvantages: the cookie can only identify a single browser application, and the cookie database can be wiped, destroying the identifier. Nevertheless, these techniques still seem to be widely used, albeit somewhat aged.
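The cookie mechanism described above can be sketched on the server side as follows. This is a minimal illustration only: the cookie name `visitor_id` and the helper `identify` are hypothetical and do not come from the paper.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, not from the paper


def identify(cookie_header):
    """Return (visitor_id, set_cookie_value_or_None) for one HTTP request.

    cookie_header is the raw value of the request's Cookie header, or None.
    """
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    if COOKIE_NAME in jar:
        # Returning browser: the stored identifier comes back with the request.
        return jar[COOKIE_NAME].value, None
    # New (or wiped) browser: mint a fresh identifier and ask the browser to store it.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    return visitor_id, out[COOKIE_NAME].OutputString()
```

Both disadvantages are visible here: a second browser on the same machine sends no cookie and receives a new identifier, and so does the same browser after its cookie database has been wiped.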

Although there are cross-browser storage techniques that avoid the first of these problems (e.g. Local Shared Objects or LSOs, also known as Flash cookies), active tracking methods (i.e. those that rely on client-side storage) all share the shortcoming that the identifier can be destroyed, which fueled research into passive methods. These techniques do not store anything on the user’s computer; instead, they query certain parameters that are accessible through the web browser, e.g. time zone and screen resolution.

Passive techniques include history stealing attacks and browser fingerprinting algorithms. With history stealing, the attacker website tries to extract unique history entries from the browser – usually by exploiting unpatched vulnerabilities or misusing API functions. Browser fingerprinting checks certain properties of the browser and the computer it is run on, and tries to calculate a unique identifier from the gathered information. The first major fingerprinting experiment was Panopticlick, which has amassed more than 1.5 million records; it identifies users based on the so-called User Agent String (UAS, i.e. a line of text that includes the most important information about the system and the browser), parameters from the HTTP request, the list of plugins, the time zone, the screen resolution, the set of installed fonts, and the availability of some cookie-like storage techniques.
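As a rough illustration of how an identifier can be calculated from gathered attributes, the sketch below hashes a canonical serialisation of them. It is neither Panopticlick's nor the authors' algorithm; the function name `fingerprint`, the attribute keys, and the sample values are all made up for the example.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of collected browser attributes into a hex identifier."""
    # Serialise with sorted keys so that collection order does not matter.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up sample values; the keys mirror the kinds of attributes listed above.
sample = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 1.0)",
    "timezone_offset_min": -60,
    "screen": "1280x800x24",
    "plugins": ["Example PDF Viewer 1.0"],
    "fonts": ["Arial", "Verdana"],
}

identifier = fingerprint(sample)
```

Because the user agent string is part of the input, this kind of identifier changes when the user switches browsers, even though nothing else about the machine has changed.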

Switching browsers might provide some protection against fingerprinting, but it is unlikely that somebody would install several versions of multiple browsers to avoid being tracked. However, having one version each of several different browsers installed on a single computer is not uncommon (although defeating tracking techniques is presumably not the main motive; reasons could range from platform-exclusive extensions to selectively optimised webpages). Furthermore, browser extensions that allow spoofing certain settings (e.g. the UAS) can also be effective means of defence. It must be noted, however, that these measures are completely ineffective against cross-browser fingerprinting techniques, which rely on other parameters, for instance, on the detection of installed font types or plugins. In order to cross-browser fingerprint a user, a website operator has to choose some browser-independent features as a basis of identification. These are likely to include some of, but are not limited to, the following browser- and system-dependent properties:

Networking information. Since HTTP requests are sent via TCP/IP, the server always sees the IP address (and hostname), and the TCP port number. The location of the client can also be inferred from the IP address in most cases.
Application layer information. The user agent string is a standard HTTP header, and is sent with every request. It contains the type and version of the browser; the name and version of the operating system; the type and version of the layout engine (e.g. Gecko for Firefox); and the names and versions of certain extensions. It must be noted that some browsers (e.g. Opera) are extremely verbose about their version, so even minute patches change their UASes. Finally, the HTTP request usually contains a language preference code (e.g. ‘en-us’), too.
Information gained by querying the browser. JavaScript programs have access to the list of fonts, plugins (along with their version numbers), screen resolution, and the time zone. Additionally, some vulnerabilities may allow access to browser history or to other client-side databases that are otherwise inaccessible to the visited website.
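To make the idea of a browser-independent basis concrete, the following sketch builds an identifier only from properties that are expected to be the same in every browser on a machine. The choice of features in `BROWSER_INDEPENDENT`, the function name, and all sample values are assumptions made for illustration; they are not the algorithm described in the paper.

```python
import hashlib

# Assumption: which features count as browser-independent is an illustrative choice.
BROWSER_INDEPENDENT = ("fonts", "timezone_offset_min", "screen")


def cross_browser_fingerprint(attributes):
    """Hash only the browser-independent attributes, ignoring e.g. the UAS."""
    parts = []
    for key in BROWSER_INDEPENDENT:
        value = attributes.get(key)
        if isinstance(value, (list, tuple, set)):
            # Browsers may report fonts in different orders or spellings;
            # normalise to a sorted, lower-cased, de-duplicated list.
            value = ",".join(sorted({str(v).strip().lower() for v in value}))
        parts.append(f"{key}={value}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Two made-up browsers on the same machine: different UAS, same system properties.
browser_a = {
    "user_agent": "BrowserA/1.0",
    "fonts": ["Verdana", "Arial"],
    "timezone_offset_min": -60,
    "screen": "1280x800x24",
}
browser_b = {
    "user_agent": "BrowserB/2.0",
    "fonts": ["arial", "Verdana "],
    "timezone_offset_min": -60,
    "screen": "1280x800x24",
}

same_user = cross_browser_fingerprint(browser_a) == cross_browser_fingerprint(browser_b)
```

Leaving the user agent string out of the hash is what lets the two browsers map to the same identifier; the price is that fewer attributes remain to tell different machines apart.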

In this paper, we discuss a new browser-independent fingerprinting technique as our main contribution, and provide an analysis of the data collected in a related experiment. Our most important contribution is the analysis of font detection via JavaScript from the viewpoint of using the detected fonts as input for fingerprinting. Furthermore, we analyse other browser-related information sets, such as the UASes, which were eventually not incorporated into the aforementioned fingerprinting algorithm, but were collected during the same experiment; we show that these may also be of interest for a tracker with different goals than ours.

The paper is structured as follows. In Section 2, we briefly discuss the evolution of techniques that aim to track the browsing habits of users. Then, in Section 3, we describe our own browser fingerprinting experiment, and compare the gathered results to the Panopticlick dataset. Subsequently, in Section 4, we analyse the results collected in this experiment. Finally, we discuss improvements to the algorithm in Section 5, and then conclude our work in Section 6.

Read the rest of the paper

The article can be downloaded from the attachment below. The original publication will soon be available at www.springerlink.com.

Source: 16th Nordic Conference in Secure IT Systems, Tallinn, Estonia, 26–28 October 2011.

Permalink: https://pet-portal.eu/blog/read/37/2012-02-20-User-Tracking-on-the-Web-via-Cross-Browser-Fingerprint...

Attachments:

./files/articles/2011/fingerprinting/cross-browser_fingerprinting.pdf (363.501 Kb)
