Prevalence of Third-Party Tracking on COVID-19–Related Web Pages | Coronavirus (COVID19) | JAMA | JAMA Network
Table 1. Third-Party Tracking Overall and by Website Type
Table 2. Most Prevalent Tracking Entities Overall and by Website Type
    1 Comment for this article
    Prevalent Indeed
    Les Landry, MSc | Government
    Just an observation: when visiting this site on JAMA Network, my browser security blocked no fewer than ten trackers and ads, four of which were third-party trackers, no doubt to "enhance my experience" as the Cookie Policy proclaims.
    CONFLICT OF INTEREST: None Reported
    Research Letter
    September 8, 2020

    Prevalence of Third-Party Tracking on COVID-19–Related Web Pages

    Author Affiliations
    • 1Perelman School of Medicine, University of Pennsylvania, Philadelphia
    • 2School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
    JAMA. 2020;324(14):1462-1464. doi:10.1001/jama.2020.16178

    The internet provides ready access to information related to coronavirus disease 2019 (COVID-19). With a simple web search, individuals can find symptom checkers, locate testing sites, and get tips for keeping themselves safe.

    However, online information seeking related to COVID-19 may carry privacy risks. Prior research has shown that web pages visited by individuals seeking health information frequently contain code that initiates data transfers to third parties, such as online advertisers.1 These transfers often include URLs of visited pages and users’ IP addresses. When third parties have code on multiple web pages, they can build detailed profiles of specific individuals’ browsing behaviors and interests. This practice, known as “web tracking,” can reveal sensitive information about individuals’ health conditions and concerns to parties who wish to profit from it.1

    To better understand the privacy risks of online information seeking related to COVID-19, we assessed the prevalence and characteristics of web tracking on COVID-19–related web pages.

    Methods

    To identify web pages likely to be visited by individuals seeking COVID-19–related information, we used Google Trends to identify the top 25 search queries related to COVID and coronavirus in the US on May 15, 2020. We retrieved the top 20 URLs for each query using nonpersonalized Google searches.

    We visited each unique web page using webXray, an automated tool that detects third-party tracking on websites.1 For each web page, we recorded data requests from third-party domains (that is, domains other than that of the website being visited). These requests are significant because they initiate data transfers from a user’s computer to third parties. We also recorded the presence of third-party cookies, small pieces of data stored on a user’s computer that often serve as persistent identifiers, allowing users to be tracked across multiple websites.
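
The distinction between first-party and third-party requests can be illustrated with a minimal Python sketch. This is not webXray's implementation; the function names and example URLs are hypothetical, and the sketch assumes that the last two labels of a hostname identify the site, which is a simplification (a real tool would consult the Public Suffix List to handle suffixes such as co.uk).

```python
from urllib.parse import urlsplit


def naive_site(url: str) -> str:
    """Return the last two hostname labels (a simplifying assumption, not PSL-aware)."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def third_party_requests(page_url: str, request_urls: list[str]) -> list[str]:
    """Keep only the requests whose site differs from the site of the visited page."""
    page_site = naive_site(page_url)
    return [u for u in request_urls if naive_site(u) != page_site]


# Hypothetical page and requests, for illustration only.
page = "https://www.example.gov/covid-19/testing"
requests_seen = [
    "https://static.example.gov/app.js",             # same site: first party
    "https://analytics.example.com/collect?id=abc",  # different site: third party
]
flagged = third_party_requests(page, requests_seen)
print(flagged)
```

In this toy case only the request to the hypothetical analytics host is flagged, because the static asset is served from the same site as the visited page.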

    We calculated the percentages and 95% confidence intervals of web pages that included any third-party data request or any third-party cookie and the median number of third-party data requests and third-party cookies per page, overall and by website type (categorized by top-level domain). We compared results across website types. Using webXray’s database of corporate owners of third-party domains, we calculated the most prevalent tracking entities. Analysis was conducted in R version 4.0.2 (R Foundation).
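
As a rough illustration of grouping pages by top-level domain and taking a per-type median, consider the Python sketch below. The mapping from top-level domain to website type is an assumption inferred from the category names used in this letter, and the page list is invented toy data, not study data.

```python
from collections import defaultdict
from statistics import median
from urllib.parse import urlsplit

# Assumed mapping; the letter states only that types were categorized by top-level domain.
TLD_TO_TYPE = {
    "com": "commercial",
    "gov": "government",
    "org": "nonprofit",
    "edu": "academic",
}


def website_type(url: str) -> str:
    """Classify a URL by the final label of its hostname."""
    host = (urlsplit(url).hostname or "").lower()
    tld = host.rsplit(".", 1)[-1]
    return TLD_TO_TYPE.get(tld, "other")


def median_by_type(pages):
    """pages: iterable of (url, third_party_request_count) pairs."""
    grouped = defaultdict(list)
    for url, n_requests in pages:
        grouped[website_type(url)].append(n_requests)
    return {site_type: median(counts) for site_type, counts in grouped.items()}


# Hypothetical toy data, for illustration only.
toy_pages = [
    ("https://news.example.com/a", 80),
    ("https://shop.example.com/b", 60),
    ("https://health.example.gov/c", 8),
    ("https://example.edu/d", 14),
]
medians = median_by_type(toy_pages)
print(medians)
```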

    Results

    Overall, 535 of 538 (99%; 95% CI, 98%-100%) unique web pages included a third-party data request, with no significant differences by website type, while 477 (89%; 95% CI, 86%-91%) included a third-party cookie (Table 1). Compared with commercial web pages, third-party cookies were slightly less common, although still highly prevalent, among government and academic web pages. However, the median numbers of third-party data requests and third-party cookies per page were both higher on commercial web pages (77 requests; 130 cookies) than on government (8 requests; 4 cookies), nonprofit (16 requests; 7 cookies), or academic (14 requests; 10 cookies) web pages.
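
The reported proportions and intervals can be approximated from the counts above. The letter does not state which interval method was used; the Python sketch below assumes a Wilson score interval, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, total: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts taken from the Results: 535 and 477 of 538 unique web pages.
for label, count in [("third-party data request", 535), ("third-party cookie", 477)]:
    low, high = wilson_interval(count, 538)
    print(f"{label}: {count / 538:.0%} (95% CI, {low:.0%}-{high:.0%})")
```

Under this assumption, the intervals round to the same whole percentages as those reported; other common methods may give slightly different bounds.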

    Most (95%; 95% CI, 93%-97%) web pages included a data request from a third-party domain owned by Google, while 7 other companies received data from at least 40% of web pages studied (Table 2).

    Discussion

    This study found that 99% of COVID-19–related web pages included a third-party data request, and 89% included a third-party cookie. By comparison, a prior study of 1 million popular web pages found that 91% included a third-party data request and 70% included a third-party cookie.2

    Third-party tracking was pervasive even among government and academic COVID-19–related web pages, on which visitors might reasonably expect greater privacy protections. Decision-makers at these institutions may be unaware of third-party tracking on their websites because they do not realize that tools used to monitor website traffic transmit data to third parties.

    This study had limitations. First, only 2 mechanisms of third-party tracking were investigated. Because other means of third-party tracking exist, including some designed to evade automated capture, these findings likely underestimate the extent of third-party tracking. Second, because this study was limited to web pages that appeared in the top 20 results for a given Google query, findings may not generalize to web pages with lower search rankings or searches performed using other search engines.

    Amid debate and legislative activity focused on the privacy implications of COVID-19 contact-tracing apps, these findings suggest that attention should also be paid to privacy risks of online information seeking.3

    Section Editor: Jody W. Zylke, MD, Deputy Editor.
    Article Information

    Corresponding Author: Matthew S. McCoy, PhD, Department of Medical Ethics & Health Policy, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Dr, Blockley Hall, Philadelphia, PA 19104 (mmcco@pennmedicine.upenn.edu).

    Accepted for Publication: August 10, 2020.

    Published Online: September 8, 2020. doi:10.1001/jama.2020.16178

    Author Contributions: Drs McCoy and Friedman had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

    Concept and design: McCoy, Grande, Friedman.

    Acquisition, analysis, or interpretation of data: All authors.

    Drafting of the manuscript: McCoy.

    Critical revision of the manuscript for important intellectual content: All authors.

    Statistical analysis: Libert, Buckler, Friedman.

    Administrative, technical, or material support: McCoy, Libert, Grande.

    Supervision: McCoy, Friedman.

    Conflict of Interest Disclosures: Dr McCoy reported being an uncompensated member of the University of Pennsylvania’s Data Ethics Working Group, funded in part through industry gifts to the university. Dr Libert reported receipt of grants from the Defense Advanced Research Projects Agency, CyLab Security and Privacy Institute, and Carnegie Mellon University and consulting with litigants and regulators on matters related to online privacy. No other disclosures were reported.

    References
    1.
    Libert T. Privacy implications of health information seeking on the web. Commun ACM. 2015;58(3):68-77. doi:10.1145/2658983
    2.
    Libert T. An automated approach to auditing disclosure of third-party data collection in website privacy policies. In: Proceedings of the 2018 World Wide Web Conference. International World Wide Web Conferences Steering Committee; 2018:207-216. doi:10.1145/3178876.3186087
    3.
    US Senate Committee on Commerce, Science, and Transportation. Committee leaders introduce data privacy bill. Published May 7, 2020. Accessed July 2, 2020. https://www.commerce.senate.gov/2020/5/committee-leaders-introduce-data-privacy-bill