The State of Data Logging: An Evaluation of Threat Levels and Security Practices

Psiphon Inc. has for years been leading the charge to open the internet to those living under censoring regimes. Despite facing no content restrictions online, users from Western countries are turning to VPNs such as Psiphon to protect their personal privacy online. Yet in March 2017, the Pew Research Center released a startling statistic: some 70% of American internet users are not sure what purpose a VPN serves.1

If the ever-increasing number of Psiphon accounts is any indication of the growing number of VPN users worldwide, a substantial influx of independent users is entering a market in which they have no metric for critically evaluating the products. This group now bears the burden of understanding risks in the unfamiliar domain of cybersecurity and is therefore highly susceptible to misinformation. One thing that puts users on high alert is the pervasiveness of the term data logging throughout VPN websites and forums, which often comes with a misrepresentation of individual risks and of the methods for efficiently protecting oneself.

According to Techopedia, “data logging is the process of collecting and storing data over a period of time in order to analyze specific trends or record the data-based events/actions of a system, network or IT environment. It enables the tracking of all interactions through which data, files or applications are stored, accessed or modified on a storage device or application.”2  Many VPNs market a zero-logging policy as a seemingly unique product feature, thus insinuating that the internet traffic of average citizens is otherwise at constant risk of being logged and possibly exploited for commercial and legal purposes.

This has become a hot-button issue following the April 2017 repeal of certain FCC protections that, had they taken effect as intended in December 2017, would have required American Internet Service Providers (ISPs) to obtain permission before sharing user data. The thought of their government advocating for ISPs to log their personal internet histories and then sell or disclose them without consent is alarming to most internet users.

As a strong proponent of internet freedom, Psiphon strives daily to ensure that our users receive fair and open access to an uncensored internet without repercussions. The expanding legality of the unauthorized use of user data and browsing history creates a troubling narrative that demonstrates that profits often outweigh ethics. But Psiphon took a deeper dive into understanding the landscape of traffic and data logging practices.

A Brief Comparative Analysis of Data Logging Policies in the US, the EU, and Canada

In general, Western nations have defined protections and regulations through the scope of stored data security. These laws set provisions pertaining to the standards of security to which firms are held while collecting and storing user data. They also outline the issues of legality pertaining to information disclosure. However, all the laws referenced in this paper leave the decisions regarding the type of data and the situations in which it is collected, as well as the duration for which such data is retained, at the discretion of the collecting organization. While in many cases these organizations must justify the logic behind their policies to a central governing body, such legislation creates a double-edged sword, as it tacitly supports data retention in many cases.

For example, there are no overarching regulations in the US pertaining to data logging at present. ISPs are free to collect and store user data, provided that the user is made aware that they are subject to such practices in the terms of service or user licensing agreement. However, companies are governed by the Stored Communications Act, which interdicts the disclosure of user data, including electronic communications, “to any person other than the addressee or intended recipient.”3 While the American government burdens all firms with the protection of user data, it does not expressly limit logging practices or define a reasonable duration of data storage.

The bigger question of ISP logging is not a new one, nor is it a stand-alone issue. In fact, the question of data logging is rooted in the same legal questions as the recently resurgent net neutrality debate in the US.

The European Union passed a piece of bloc-wide legislation titled the General Data Protection Regulation (GDPR) in April 2016 that builds upon the original Data Protection Directive of 1995 and outlines data security guidelines, user rights, and legal classifications of subjects. However, individual member states are obliged to determine the appropriate duration of data retention and are not subjected to a bloc-wide mandated timeframe.

Data retention regulations in the EU are designed to ensure the security of user data and its availability to law enforcement agencies, and much of the language used in the original Data Protection Directive suggests that data retention is necessary for the ease and timeliness of legal investigations. It goes as far as to codify Article 6, which allows for specific legal classifications such as suspected criminals, convicted criminals, and victims to be legally distinguished from other persons during logging and data retention processes.

Independent firms are responsible for determining the appropriate duration of retention for data collected, while additional periods of retention may be set for “archiving in the public interest scientific, statistical or historical use.”4  However, under the provisions of the GDPR, a private citizen has the “right to be forgotten” and may request that their personal data be removed from a server.

Finally, Canada’s federalist structure has created a multi-tiered jurisdictional organization regarding its data logging regulations. It has enacted the Personal Information Protection and Electronic Documents Act (PIPEDA), which outlines nationwide directives and regulations regarding the collection of information for commercial activity and sets a national standard definition for personally identifiable information, which includes:
age, name, ID numbers, income, ethnic origin, and blood type;
opinions, evaluations, comments, social status, or disciplinary actions, etc.
employee files, credit records, loan records, medical records, existence of a dispute between a consumer and a merchant, intentions (for example, to acquire goods or services, or change jobs).5

Similar to the EU Directive, provisions are included to allow for the disclosure of information to the Financial Transactions and Reports Analysis Centre of Canada (FINTRAC) pertaining to individuals who are suspected of criminal acts such as financing terrorism. However, PIPEDA does not explicitly outline processes for securing information and limiting the legal duration of storage of information. While paragraph 4.5.3 states that “information that is no longer required to fulfill the identified purposes should be destroyed,” individual firms are at liberty to implement their own procedures.

Further down the chain, provinces that are capable of creating their own laws to ensure data protection may supersede the PIPEDA provisions.

So, are the alarmist VPNs right?

VPNs successfully alarm prospective customers by pointing to three fundamental flaws in current logging regimes:
1. Your personally identifiable data may not be adequately secure
2. ISPs and online companies may use your personal data for commercial gain
3. Databases of personal information being held by ISPs and online companies are at the disposal of law enforcement and government agencies

All three evaluated jurisdictions have allowed for the legal collection and retention of personally identifiable user data. While ISPs and online companies are legally obliged to secure such information, at its core, this has created a weak trust-based system that lacks clear oversight in virtually every jurisdiction. It is easy to say that those who are not using the internet for compromising or illegal activities need not worry, but at Psiphon we believe that all internet users have a reasonable expectation of privacy, regardless of online behaviors.

As such, it can be assumed that ISPs have the right to log user data. This may be troubling to individual internet users who fear the unauthorized use of private information such as medical records or banking information, or the malicious intrusion into such databases. Additionally, if an ISP received a National Security Letter or warrant requesting user records, it would be bound by law to provide such information to law enforcement.

With the current laws in place, online companies are not at liberty to sell personally identifiable data for commercial purposes or to disclose such information. However, they are well within their rights to retain and use this data for a variety of internal processes such as customer statistics. With the potentially imminent erosion of net neutrality in the United States, stores of personally identifiable data may be exploited or released without consent. Yet, a leading cause for concern among VPN users is the risk of such data being maliciously exfiltrated, despite the best intentions of the service provider.

Internet users will always bear the burden of educating themselves on the practices of the companies with whom they engage. A primary concern is that some end users may be unable to understand the meaning or consequences of terms of use and may base their usage on vague and often misleading social and technical standards of whether the VPN is trustworthy and reliable. Users must also do market research to ensure that VPNs are truly abiding by the privacy policies they put forth to potential users, especially if the VPN service requires the creation of user accounts that are built around personally identifiable information.

What’s more, once an educated decision is reached, users in specific geographic locations and on specific sites may find that they feel comfortable using an unencrypted or standard HTTPS connection. But where there is still confusion or concern, a no-log VPN such as Psiphon may be used to protect personally identifiable data from being collected online.

Can a VPN protect me from data logging?

Many internet users turn to VPNs to anonymize themselves online and avoid the data logging practices of commercial services. However, it must be stated that no VPN can offer complete anonymity online. Those that are committed to a freely accessible internet, such as Psiphon, incorporate strong security measures into their services that protect the privacy of their users. When a user connects through a VPN proxy, internet traffic cannot be attributed to that specific user, but the user is still connected to and participating in the use of internet services.

In an effort to gain user trust and increase individual privacy, commercially available VPNs often highlight no-logging policies. They claim to discard all records of use, thus making it impossible to trace activity back to an individual user. However, many VPN service agreements do not provide detailed explanations of such policies. In order to create an account or open an encrypted session, a VPN might require some detailed user information. Once again, the burden falls on the individual user to carefully read through the terms of service and determine whether to trust the VPN.

The structure of the internet requires that certain information must be exchanged in order for online communication to occur. No data can be sent or received without a functioning IP address, which includes the legitimate IP address of the VPN user. However, because none of the legislation assessed in this paper mandates that VPNs store user data for a specified amount of time, the core values of Psiphon and other VPN services might cause them to erase such information from their networks upon termination of an encrypted session.

What makes Psiphon different from traditional VPN services? Quite simply, we are born from and exist predominantly for internet freedom. By design, the Psiphon platform does not store any personally identifiable information regarding end users’ browsing sessions. Because using Psiphon does not require an account, the software is virtually incapable of creating or storing information pertaining to any individual user. There is therefore no record of use, reassuring those who evade surveillance or fear the disclosure of personally identifiable information to unauthorized parties that use of the Psiphon network cannot be attributed to a unique user or geolocation.

To ensure that our network remains efficient and scales with changing levels of demand, we engage in internal data assessment, which includes traffic volume and user location information aggregated at the country level.6 This is the most detailed information retained by the service. If, for example, a nefarious outside actor were to breach this statistical information, no personally identifiable information would be at risk.
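The kind of threshold-based aggregation described here, and in footnote 6, can be illustrated with a minimal Python sketch. This is not Psiphon's actual implementation: the function name `aggregate_connections`, the `(country, city)` input format, and the threshold value are all hypothetical, chosen only to show how per-connection location data could be reduced to counts in which no individual user appears.

```python
from collections import Counter


def aggregate_connections(connections, country_threshold):
    """Reduce (country, city) pairs to aggregate counts.

    Hypothetical sketch: country totals are always kept, while city-level
    counts are included only for countries whose total number of
    connections surpasses `country_threshold` (an assumed parameter).
    """
    country_counts = Counter()
    city_counts = Counter()
    for country, city in connections:
        country_counts[country] += 1
        city_counts[(country, city)] += 1

    report = {}
    for country, total in country_counts.items():
        entry = {"connections": total}
        if total > country_threshold:
            # Enough activity in this country to expose city-level counts.
            entry["cities"] = {
                city: n for (c, city), n in city_counts.items() if c == country
            }
        report[country] = entry
    return report


# Made-up sample data: five connections from one country, one from another.
sample = [("CA", "Toronto")] * 3 + [("CA", "Ottawa")] * 2 + [("IS", "Reykjavik")]
report = aggregate_connections(sample, country_threshold=4)
print(report)
```

In this sketch only counters are retained once the input has been processed, so the resulting report contains no identifiers for individual connections, and the low-usage country is reported at the country level only.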

By: Jacob Klein

3 S.Rep. No. 99-541, 99th Cong. 2nd Sess. 37, reprinted in 1986 U.S.C.C.A.N. 3555, 3591.
4 Ibid. Paragraph 26. 5
6 However, city-level data will be automatically aggregated only after the usage within a country surpasses a threshold at which it would be statistically impossible for individual users to be identified via an evaluation of the number of active connections within the city.
