{"id":34184,"date":"2020-03-18T09:06:21","date_gmt":"2020-03-18T13:06:21","guid":{"rendered":"https:\/\/www.kaspersky.com\/blog\/?post_type=emagazine&#038;p=34184"},"modified":"2020-06-19T16:09:00","modified_gmt":"2020-06-19T20:09:00","slug":"data-new-toxic-waste","status":"publish","type":"emagazine","link":"https:\/\/www.kaspersky.com\/blog\/secure-futures-magazine\/data-new-toxic-waste\/34184\/","title":{"rendered":"Data \u2013 the new oil, or potential for a toxic oil spill?"},"content":{"rendered":"<blockquote><p>Any data you collect will probably leak. Any data you retain will definitely leak, given enough time. <\/p>\n<\/blockquote>\n<p>Both of these statements were once controversial, but today, they\u2019re commonsense. If <a href=\"https:\/\/www.ftc.gov\/enforcement\/cases-proceedings\/refunds\/equifax-data-breach-settlement\" target=\"_blank\" rel=\"noopener nofollow\">Equifax<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Vault_7\" target=\"_blank\" rel=\"noopener nofollow\">the CIA<\/a>, <a href=\"https:\/\/www.theguardian.com\/us-news\/the-nsa-files\" target=\"_blank\" rel=\"noopener nofollow\">the NSA<\/a>, <a href=\"https:\/\/www.csoonline.com\/article\/3318238\/the-opm-hack-explained-bad-security-practices-meet-chinas-captain-america.html\" target=\"_blank\" rel=\"noopener nofollow\">The Office of Personnel Management<\/a>, <a href=\"https:\/\/www.nytimes.com\/2018\/09\/28\/technology\/facebook-hack-data-breach.html\" target=\"_blank\" rel=\"noopener nofollow\">Facebook<\/a> and <a href=\"https:\/\/www.wired.com\/story\/ok-cupid-dating-apps-hacks-breaches-security\/\" target=\"_blank\" rel=\"noopener nofollow\">dating sites<\/a> can\u2019t keep our secrets secret, then neither can your business.<\/p>\n<p>In truth, industry\u2019s old, ill-placed confidence in the security of data was always an example of motivated reasoning. 
Collecting data was so <em>cheap<\/em> and storing it was so <em>easy<\/em>, and there were <em>so many<\/em> analysts and investors and hustling grifters exclaiming that \u201cdata is the new oil\u201d that it seemed fiscally irresponsible <em>not<\/em> to collect everything you could, and retain it forever.<\/p>\n<p>Who knew how that data could be put to profitable use in the future? It was raining soup, so it was time to fill your boots \u2013 even if you couldn\u2019t find a market for soup-in-a-boot today, there was no doubt that such a market would appear in the foreseeable future.<\/p>\n<p>Given such a value proposition, it\u2019s not surprising that the people doing the collecting and the retaining of data talked themselves into the idea that both activities could be undertaken safely.<\/p>\n<p>But of course, they were wrong, and as history has caught up with them \u2013 as breach after breach has hit in ever-increasing waves \u2013 the rationale has changed. Now, rather than arguing that breaches are preventable, the story goes that breaches aren\u2019t a big deal. Every time there\u2019s a data breach, company spokespeople recite the catechism: \u201cWe take our customers\u2019 privacy very seriously. None of the data that leaked was compromising.\u201d<\/p>\n<p>Some of that is \u201cprivacy nihilism\u201d \u2013 it was all going to leak eventually, so what\u2019s the difference? But there\u2019s a more insidious version of this, which argues that breach data isn\u2019t a problem because bad people can\u2019t do much with it. This isn\u2019t just nihilism; it\u2019s <em>denialism<\/em>.<\/p>\n<p>Breach apologists argue that the data they leak isn\u2019t compromising because it\u2019s anonymized, or because key identifiers were removed from it. 
This profoundly misunderstands how data is used \u2013 and abused.<\/p>\n<p>Re-identification of anonymized data-sets is a hot research topic in computer science today, with researchers <a href=\"https:\/\/www.seas.harvard.edu\/news\/2020\/01\/imperiled-information\" target=\"_blank\" rel=\"noopener nofollow\">creating automatic tools that piece together disparate data-sets<\/a> to identify the people in them. For example, you can merge a health authority\u2019s database of anonymized prescribing data (doctor, medicine, date and time) with a breached database of taxi journeys, matching trips to hospitals against the prescribing times, to infer who is taking antipsychotic medications, antiretrovirals or cancer therapeutics.<\/p>\n<p>Many data-protection vendors have promised that they can inject noise into data-sets to prevent re-identification, but <a href=\"https:\/\/www.theguardian.com\/technology\/2017\/aug\/01\/data-browsing-habits-brokers\" target=\"_blank\" rel=\"noopener nofollow\">those promises<\/a> rarely survive contact with security researchers who evaluate their claims.<\/p>\n<p>It\u2019s been years since <a href=\"https:\/\/pursuit.unimelb.edu.au\/articles\/understanding-the-maths-is-crucial-for-protecting-privacy\" target=\"_blank\" rel=\"noopener nofollow\">the first significant<\/a> theoretical work on re-identification <a href=\"https:\/\/www.cs.princeton.edu\/~arvindn\/publications\/precautionary.pdf\" target=\"_blank\" rel=\"noopener nofollow\">was done<\/a>, and <a href=\"https:\/\/www.nature.com\/articles\/s41467-019-10933-3\" target=\"_blank\" rel=\"noopener nofollow\">things keep getting worse for those who insist that anonymization is possible<\/a>.<\/p>\n<p>Re-identification methods tell us a lot about how digital criminals operate, and about their incredible frugality and resourcefulness.<img decoding=\"async\" class=\"aligncenter size-large wp-image-34186\" 
src=\"https:\/\/media.kasperskydaily.com\/wp-content\/uploads\/sites\/92\/2020\/03\/18085851\/145_data_toxic_waste_inline-1024x576.png\" alt=\"face women data toxic waste\" width=\"1024\" height=\"576\"><\/p><blockquote><p>Like our 1930s Depression-era haunted ancestors, identity thieves never throw anything away, and they find ways to use every scrap of leftover to make something new. <\/p>\n<\/blockquote>\n<p>Usernames and passwords can be recycled in credential-stuffing attacks that allow them to break into security cameras from <a href=\"https:\/\/www.vice.com\/en_us\/article\/3a88k5\/how-hackers-are-breaking-into-ring-cameras\" target=\"_blank\" rel=\"noopener nofollow\">Ring<\/a> and <a href=\"https:\/\/www.siliconvalley.com\/2019\/10\/18\/the-voice-from-our-nest-camera-threatened-to-steal-our-baby\/\" target=\"_blank\" rel=\"noopener nofollow\">Nest<\/a>, <a href=\"https:\/\/techcrunch.com\/2019\/09\/26\/doordash-data-breach\/\" target=\"_blank\" rel=\"noopener nofollow\">order takeaway<\/a> meals, or <a href=\"https:\/\/www.vice.com\/en_us\/article\/zmpx4x\/hacker-monitor-cars-kill-engine-gps-tracking-apps\" target=\"_blank\" rel=\"noopener nofollow\">track and immobilize entire fleets of corporate vehicles<\/a>. Breach identities can <a href=\"https:\/\/www.buzzfeednews.com\/article\/jsvine\/net-neutrality-fcc-fake-comments-impersonation\" target=\"_blank\" rel=\"noopener nofollow\">be used to overwhelm regulatory proceedings with plausible fake comments<\/a> or to create fleets of Twitter identities.<\/p>\n<p>Criminals operate by combining and recombining data-sets, using one company\u2019s breach in combination with a public data source, and a third company\u2019s anonymous data release to wreak incredible havoc. 
They might even get enough data fragments to <a href=\"https:\/\/ftalphaville.ft.com\/2015\/12\/14\/2147811\/stealing-london-houses\/\" target=\"_blank\" rel=\"noopener nofollow\">fraudulently obtain a duplicate deed for your house<\/a> and sell it to someone else while you\u2019re on holiday.<\/p>\n<p>Never mind that no one can point to a specific piece of data you\u2019re liable to lose control over someday and say, \u201cThat, that\u2019s the data-point that will cost someone their house, or let their stalker find them or expose their retirement savings to thieves.\u201d<\/p>\n<p>It\u2019s similarly true that no one can point to a specific droplet of dioxin in a factory\u2019s illegal effluent pipe and say, \u201cThat, that is the carcinogen that will kill a young mother of three, some five miles downstream of the pipe.\u201d This doesn\u2019t stop companies that poison the water or the air from paying the price.<\/p>\n<p>The harms from breaches are stochastic (i.e., randomly determined), not deterministic.<br>\n<\/p><blockquote><p>We can\u2019t know for sure which data will do which harm, but we know that harm is inevitable and it gets worse the bigger the breach is. <\/p>\n<\/blockquote>\n<p>So far, remedies for those who have been injured by breaches have been severely limited, but they\u2019re getting stiffer. Home Depot\u2019s 2014 breach cost it <a href=\"https:\/\/www.csoonline.com\/article\/3041994\/home-depot-will-pay-up-to-195-million-for-massive-2014-data-breach.html\" target=\"_blank\" rel=\"noopener nofollow\">$0.34\/customer<\/a> in direct compensation. But that was then. Breached Yahoo! customers <a href=\"https:\/\/www.wired.com\/story\/how-to-get-yahoo-breach-settlement-money\/\" target=\"_blank\" rel=\"noopener nofollow\">may get compensated $100<\/a> each. 
<a href=\"https:\/\/www.ftc.gov\/news-events\/press-releases\/2019\/07\/ftc-imposes-5-billion-penalty-sweeping-new-privacy-restrictions\" target=\"_blank\" rel=\"noopener nofollow\">Facebook just got hit with a $5B fine, and the party\u2019s just getting started.<\/a><\/p>\n<p>The harms from breaches are cumulative: like toxic waste in nature, breaches build up in the information environment, and they are effectively immortal in their potential for damage. As the public \u2013 and the law \u2013 come to grips with this, we\u2019re likely to see greater and greater remedies for those whose data has been released into the wild (forever).<\/p>\n<p>Remember, breaches affect everyone alike \u2013 all political persuasions, rich and poor, including the governing classes and lawmakers themselves.<\/p>\n<p>Inevitably, we will see the framework for breach remedies transformed to look more like the remedies for other probabilistic harms, such as environmental harms.<\/p>\n<p>When that happens, it might be too late for you: the data you\u2019re warehousing today might already have been exfiltrated from your network without you even knowing that it\u2019s happened \u2013 until one of your customers finds out the hard way that you\u2019ve compromised them, and seeks legal remedies.<\/p>\n<p>Your insurer isn\u2019t going to write policies for you \u2013 or errors and omissions policies for your board \u2013 if you\u2019re warehousing all this digital toxic waste in leaky digital barrels, not once the penalties for losing control over it start to turn into real money.<\/p>\n<p>Maybe you could still justify all that risk if the profits from all that data were commensurate with it. 
But as researchers keep discovering, the benefits from data <a href=\"https:\/\/weis2019.econinfosec.org\/wp-content\/uploads\/sites\/6\/2019\/05\/WEIS_2019_paper_38.pdf\" target=\"_blank\" rel=\"noopener nofollow\">are wildly oversold<\/a> \u2013 the efficacy of ad targeting based on users\u2019 behavior is almost identical to that of targeting based on the content of the pages where the ads appear, which requires <em>no<\/em> user data.<\/p>\n<p>But if you\u2019re an ad-tech company or a Big Tech platform like Facebook or Google, the mystique about the ability of data to convert customers allows you to sell your product at a massive premium, while intimidating potential competitors who think that they will never get started because they can never collect as much data as the companies that are already in the space.<\/p>\n<p>The people who claim data is the new oil are people who are <em>selling the data<\/em>, and the claims they make about the ways that this data lets you do amazing things are <em>sales literature<\/em>, not peer-reviewed studies.<\/p>\n<p>Data was never the new oil. It was always the new toxic waste: pluripotent, immortal \u2013 and impossible to contain. You don\u2019t want to be making more of it, and you definitely should be getting rid of the supply you\u2019ve so unwisely stockpiled so far.<\/p>\n<p>Data minimization isn\u2019t just good practice; it\u2019s good business. Collect as little data as you can, and keep it as briefly as you can. If your privacy policy fits on the back of a napkin \u2013 because you\u2019re collecting almost nothing and processing it only for specific purposes, and then deleting it forever \u2013 you\u2019re on the right track!<\/p>\n<p><em>This article reflects the opinions of the author.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Any data you collect will probably leak. Any data you retain will definitely leak, given enough time. 
Cory Doctorow challenges us to think differently about customer data.<\/p>\n","protected":false},"author":2563,"featured_media":34185,"template":"","coauthors":[3724],"class_list":{"0":"post-34184","1":"emagazine","2":"type-emagazine","3":"status-publish","4":"has-post-thumbnail","6":"emagazine-category-data-and-privacy","7":"emagazine-category-data-breaches","8":"emagazine-category-opinions","9":"emagazine-tag-data","10":"emagazine-tag-data-privacy","11":"emagazine-tag-gdpr"},"hreflang":[{"hreflang":"x-default","url":"https:\/\/www.kaspersky.com\/blog\/secure-futures-magazine\/data-new-toxic-waste\/34184\/"},{"hreflang":"en-us","url":"https:\/\/usa.kaspersky.com\/blog\/secure-futures-magazine\/data-new-toxic-waste\/21732\/"},{"hreflang":"en-gb","url":"https:\/\/www.kaspersky.co.uk\/blog\/secure-futures-magazine\/data-new-toxic-waste\/20060\/"},{"hreflang":"es-mx","url":"https:\/\/latam.kaspersky.com\/blog\/secure-futures-magazine\/data-new-toxic-waste\/20499\/"},{"hreflang":"pt-br","url":"https:\/\/www.kaspersky.com.br\/blog\/secure-futures-magazine\/data-new-toxic-waste\/16399\/"}],"acf":[],"_links":{"self":[{"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/emagazine\/34184","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/emagazine"}],"about":[{"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/types\/emagazine"}],"author":[{"embeddable":true,"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/users\/2563"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/media\/34185"}],"wp:attachment":[{"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/media?parent=34184"}],"wp:term":[{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.kaspersky.com\/blog\/wp-json\/wp\/v2\/coauthors?post=34184"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}