In the early days of the Web, users could be effectively identified by the IP addresses of their computers. Later, as dynamic IP addresses and Network Address Translation became widespread, this piece of information alone was no longer enough; instead, the browsing habits of a user could be tracked by storing an identifier in a cookie in the web browser, so that it would, in principle, identify the user in every HTTP request containing the cookie. This technique has two significant disadvantages: the cookie can only identify a single browser application, and the cookie database can be wiped, destroying the identifier. Nevertheless, cookie-based tracking still seems to be widely used, albeit somewhat dated.
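The cookie-based scheme described above can be illustrated with a minimal server-side sketch. It is not taken from any particular tracker: the cookie name `tracking_id`, the one-year lifetime, and the function `identify` are assumptions made for illustration only.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "tracking_id"  # hypothetical cookie name, chosen for this sketch


def identify(cookie_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (identifier, Set-Cookie value or None).

    cookie_header is the raw value of the request's Cookie header, if any.
    A Set-Cookie value is returned only when a new identifier had to be issued.
    """
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)

    # Returning visitor: the browser sent the identifier back to us.
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, None

    # First visit, another browser, or a wiped cookie database:
    # a fresh identifier is issued and the earlier one is lost.
    new_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = new_id
    out[COOKIE_NAME]["max-age"] = 365 * 24 * 3600  # assumed lifetime: one year
    return new_id, out[COOKIE_NAME].OutputString()
```

Calling `identify(None)` twice yields two unrelated identifiers, which is exactly the weakness noted above: once the cookie is deleted, or the user switches to another browser, the server has no way to link the new identifier to the old one.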
Although there are cross-browser storage techniques that avoid the single-browser limitation (e.g. Local Shared Objects or LSOs, also known as Flash cookies), all active tracking methods (i.e. those that rely on client-side storage) share the shortcoming that the identifier can be destroyed, which fuelled research into passive methods. These techniques do not store anything on the user’s computer; instead, they query certain parameters that are accessible through the web browser, e.g. time zone and screen resolution.
Passive techniques include history stealing attacks and browser fingerprinting algorithms. With history stealing, the attacker website tries to extract unique history entries from the browser – usually by exploiting unpatched vulnerabilities or misusing API functions. Browser fingerprinting checks certain properties of the browser and the computer it is run on, and tries to calculate a unique identifier from the gathered information. The first major fingerprinting experiment was Panopticlick, which has amassed more than 1.5 million records; it identifies users based on the so-called User Agent String (UAS, i.e. a line of text that includes the most important information about the system and the browser), parameters from the HTTP request, the list of plugins, the time zone, the screen resolution, the set of installed fonts, and the availability of some cookie-like storage techniques.
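The idea of calculating an identifier from gathered properties can be sketched as follows. This is an illustrative assumption, not the method used by Panopticlick or by our experiment: the attribute names, the JSON canonicalisation, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Derive an identifier from a dictionary of collected properties.

    The attributes are serialised in a canonical form (sorted keys, fixed
    separators) so that the same set of values always yields the same hash,
    regardless of the order in which they were collected.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values of the kind listed above; nothing is stored client-side.
visitor = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "timezone_offset_minutes": -60,
    "screen": "1280x1024x24",
    "plugins": ["ExamplePlugin 1.2", "AnotherPlugin 3.0"],
    "fonts": ["Arial", "Courier New", "Times New Roman"],
}

visitor_id = fingerprint(visitor)
```

In this sketch a change in any single property (for example, a browser update that alters the UAS) produces a completely different identifier, so a practical scheme would also need some way of matching fingerprints that differ only slightly.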
Switching browsers might provide some protection against fingerprinting, but it is unlikely that somebody would install several versions of multiple browsers to avoid being tracked. However, having one version each of several different browsers installed on a single computer is not uncommon (although defeating tracking techniques is presumably not the main motive; reasons could range from platform-exclusive extensions to selectively optimised webpages). Furthermore, browser extensions that allow spoofing certain settings (e.g. the UAS) can also be an effective means of defence. It must be noted, however, that these measures are completely ineffective against cross-browser fingerprinting techniques, which rely on other parameters, for instance the detection of installed fonts or plugins. In order to fingerprint a user across browsers, a website operator has to choose some browser-independent features as a basis of identification. These are likely to include, but are not limited to, a subset of the following browser- and system-dependent properties:
The paper is structured as follows. In Section 2, we briefly discuss the evolution of techniques that aim to track the browsing habits of users. Then, in Section 3, we describe our own browser fingerprinting experiment, and compare the gathered results to the Panopticlick dataset. Subsequently, in Section 4, we analyse the results collected in this experiment. Finally, we discuss improvements to the algorithm in Section 5, and conclude our work in Section 6.
Source: 16th Nordic Conference in Secure IT Systems, Tallinn, Estonia, 26–28 October 2011.