Talk:List of data breaches/Archive 1

This is an archive of past discussions about List of data breaches. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

New column "publication"

I'm proposing to add a new column "year of publicization" to the table. For instance, the Yahoo! data breach entry has 2014 set as its year (the year the data was stolen), but the breach was only made public in 2016. --Fixuture (talk) 23:32, 22 September 2016 (UTC)

Personally, I wouldn't bother adding any more complexity to the table, which could make it harder for an editor to add to or modify. Plus, clicking on the source gives any earlier dates of when a hack occurred. It could also force a lot more updating work, since the date of the original hack isn't always known until much later, after it's been analyzed. --Light show (talk) 00:31, 23 September 2016 (UTC)

I agree with adding a new column. It's complex already, but the two dates are very significant. --Wazz4444 (talk) 20:37, 9 June 2018 (UTC)

One more vote for the new column. Both these dates are usually present alongside the same information that populates the other columns and wouldn't represent substantial additional burden for the editor adding new items. --Jsoverson (talk) 17:22, 13 July 2018 (UTC)

I think the title should be "List of Known Data Breaches." There are always breaches going on that have not been discovered yet. — Preceding unsigned comment added by 67.180.205.108 (talk) 23:11, 21 December 2018 (UTC)

Missing entries: leaks

https://www.animenewsnetwork.com/news/2017-02-22/report-2.5-million-funimation-accounts-compromised-in-data-breach/.112538 — Preceding unsigned comment added by 2601:840:8400:EC10:7854:7C9A:A8CF:A2D8 (talk) 01:37, 28 October 2020 (UTC)

As far as I understand it, all leaks are data breaches except the ones where the leaking was done by a whistleblower on the inside who already had access to the data, right? Because it seems like many such leaks are missing from the list. (Most can be found at Category:News leaks). --Fixuture (talk) 17:58, 23 September 2016 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 5 external links on List of data breaches. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 19:11, 29 December 2017 (UTC)

Google+ incident a data breach?

@Zazpot: Regarding the recent Google+ reports and your recent reversion of my edit, I've found multiple sources which cover this:

Google, a unit of Alphabet Inc., exposed the private data of some users of its Google+ social network to outside developers, but the company said it found no evidence that developers misused data. The phrase “data breach” in the headline for Tuesday’s Page One article about the exposure could be interpreted as suggesting that data was misused.

Corrections & Amplifications, The Wall Street Journal

Google said this incident represented an "exposure" rather than a "breach" of data. This means that personal data was exposed for any bad guy to take, but there's no evidence anyone did.

The company said private data in Google+ could have been viewed by third-party app developers, but there's no evidence any of these individuals even knew about the bug that caused the vulnerability, let alone exploited it.

"Google learned the hard way it's better to be transparent about privacy bugs than cover them up", CNBC

Now with what’s happening right now with Google, “breach” is the wrong word, although it’s certainly getting tossed around. Users of Google+ had some profile data “exposed,” meaning it was potentially accessible by third parties although that may not have actually happened.

"The wrong reaction to the Google data exposure", American Enterprise Institute

Given the distinction these articles discuss about a "data breach" and "data exposure" (including from The Wall Street Journal which first reported on the incident), it appears to me that this is out of scope for this article. ^Falling_Gravity 21:01, 10 October 2018 (UTC)

We probably should have a "data exposure" page too. This is not the first such case, eg [1]. Having that, and making sure the two pages link to each other would help greatly. --Masem (t) 01:44, 11 October 2018 (UTC)

I think what's happening here is that Google's PR representatives, obviously under instructions to limit the damage to Google's reputation, have contacted journalists to "educate" them by claiming that there is a distinction between a data exposure and a data breach and that Google Plus suffered the former rather than the latter (which, by implication, would absolve Google somewhat of its failure to notify users). Personally, I think that the distinction is ~~bollocks~~bogus. Quoting the Ars Technica piece, which seems to me to be much more level-headed: [Google] destroys most Google+ logs after two weeks. According to the WSJ, an internal memo acknowledged there was no way to know [therefore, whether the exposed data was accessed by people who should not have had access]. People who have used Google+ during the time the bugs were active should assume any exposed data is publicly available. Zazpot (talk) 11:52, 11 October 2018 (UTC); edited 19:28, 11 October 2018 (UTC)

The distinction between data exposure and data breach makes sense, because a data breach is an instance of data exposure, but data exposure is not necessarily a data breach (even if it's best practices to assume otherwise, as discussed in the Ars Technica piece). Your claim that the "distinction is bollocks" contradicts reliable sources, so I'm removing the entry for now. Perhaps we could have an article on data exposure to explain this distinction and discuss the Google+ and the voter records incident which Masem mentioned. ^Falling_Gravity 17:06, 11 October 2018 (UTC)

@FallingGravity: the idea that "data breaches" and "data exposures" are disjoint sets, no matter how plausible it sounds, is an artificial one promoted by Google's PR and regurgitated by gullible journalists.

It is not WP:OR to assert that in national and international public policy and in legal guidance from official public organisations, "data breach" is an umbrella term for incidents that include data exposure. I.e. "data exposures" are a proper subset of "data breaches". See:

You just learned that your business experienced a data breach. Whether hackers took personal information from your corporate server, an insider stole customer information, or information was inadvertently exposed on your company’s website, you are probably wondering what to do next.^[1]

A personal data breach can be broadly defined as a security incident that has affected the confidentiality, integrity or availability of personal data.^[2]

Zazpot (talk) 19:20, 11 October 2018 (UTC)

@Zazpot: I say we should follow the reliable sources which discuss the Google+ incident, not your WP:SYNTH of an FTC handbook for businesses and ICO guidelines for breaches of personal data. Also, what's your source for saying these journalists are "gullible"? ^Falling_Gravity 22:50, 11 October 2018 (UTC)

FallingGravity: you ask, what's your source for saying these journalists are "gullible"? Cicero.

Also, an FTC handbook for businesses about data breaches and the ICO guidelines for breaches of data are not WP:SYNTH about data breaches. They are authoritative sources about data breaches in general, which necessarily includes the Google Plus data breach. Your suggestion that they are not applicable here is akin to saying that Smoking and Health was irrelevant to Lucky Strikes because it didn't name that brand specifically.

Also, numerous WP:RS have used, in relation to the Google Plus revelations, the exact wording "data breach" (as though that were somehow the most important thing, which it isn't, but I'm mentioning it in order to address your concerns), e.g.: The Guardian, NPR, and CBS. Also slightly less WP:RS (but still WP:RS on this sort of topic, IMO): Politico and TheNextWeb. Zazpot (talk) 01:18, 12 October 2018 (UTC)

It looks like CBS News, The Guardian, and TheNextWeb have since corrected their stories to say "data exposure" or "data leak". But now I'm guessing those sources aren't reliable anymore because they've been duped by Google's PR team, right? ^Falling_Gravity 18:17, 13 October 2018 (UTC)

Again, I think it's a "layperson" issue. To the average person, "data breach" and "data exposure" both mean that their data was not kept private, and the result for them is the same. Computer experts know better. There's no problem with making that difference well known and excluding data exposures from this page, but I will repeat: if that is done, then we absolutely need a "List of data exposures", and we must make sure both pages are clear about what is included and link back to each other. --Masem (t) 19:10, 13 October 2018 (UTC)

@Masem: I appreciate your goodwill here, but what you are proposing does not make sense to me. As explained above, the set of "data exposures" is a subset of the set of "data breaches". Zazpot (talk) 22:47, 14 October 2018 (UTC)

@FallingGravity: please can you provide links to those "corrections"? If you are right about those sources, then:

It sounds as though they have fallen below their usual standards of journalism. I am disappointed in them. They should know better.
How do you feel about noting that WP:RS disagree about whether the Google breach was a "breach"? (What a ridiculous world this is that that sentence should be valid, but it is.)

Zazpot (talk) 22:45, 14 October 2018 (UTC); edited 06:41, 15 October 2018 (UTC)

@Zazpot: It's funny that you think reliable sources have "fallen below their usual standards of journalism" because they issue corrections, even though corrections are a hallmark of reliable sources. TheNextWeb article says "Removed references calling the issue a “breach,” to more accurately reflect that the Google+ security flaw was a “glitch” or “bug” which could have potentially resulted in a breach." The Wall Street Journal issued a similar correction. Even The Washington Post agrees that "The Google+ bug, it seems, was not a breach but a vulnerability." As for your second proposal, I think this article should list incidents that are definitely data breaches; any "debate" can go in the data breach or Google+ articles. ^Falling_Gravity 16:06, 15 October 2018 (UTC)

A willingness to issue corrections, when appropriate, is indeed a hallmark of a reliable source. This does not, however, imply that all corrections are appropriate. These particular "corrections" are inappropriate, and disappointing.

I agree that the article should list only definite data breaches; but as I have explained, data exposures are necessarily (because of the subset relationship) data breaches. Zazpot (talk) 23:06, 15 October 2018 (UTC)

Response to third opinion request:

Policy sidenote: More than two participants already present. Nevertheless, I think the subject is interesting and relatively simple to answer, so here.

The term "breach" suggests one of two events, or both: Intrusion through security measures into an inner network or physical facility; or a localized failure of security policy. The latter need not involve the former: A loss of a flash drive with some sensitive information by a corporate employee on a business trip would often be termed a "breach", regardless of whether any information was exposed or even if the drive is actually in anyone's hands rather than just stuck under a mattress somewhere.
As this article points out, the US Dept. of Justice takes a similar broad approach [2].
Other than that, given that there's no standardized taxonomy of infosec failures from which we can draw, and as we're all aware of what the non-technical usage of "breach" encompasses (and in all likelihood the reader is too), I'd argue it simply doesn't matter. As long as we use the lead to define what this list is about (and the DOJ's definition is as good as any), there's no significant risk of misleading, misrepresentation or inaccuracy by simply keeping the current choice of name. Security vulnerabilities come in all shapes and sizes (so to speak), and as long as we're all aware of what we're discussing here and it reflects common as well as academic use, it's just not that important. François Robere (talk) 19:19, 17 October 2018 (UTC)
Although, if you insist, a more accurate title would be "List of security vulnerabilities that resulted in large scale data exposure". But well.

Thanks for the comments. However, expanding the definition of this article might make it unwieldy, to the point where any vulnerability that could expose data, such as Row hammer and Spectre, is listed (because any device that isn't patched could be breached). I've started an RfC on the matter to determine if this particular incident should be included. ^Falling_Gravity 17:48, 20 October 2018 (UTC)

References

^ "Data Breach Response: A Guide for Business". Federal Trade Commission. Retrieved 2018-10-11.
^ "Personal data breaches". Information Commissioner's Office. Retrieved 2018-10-11.

RfC on the inclusion of the Google+ incident

CONSENSUS AGAINST

There is a weak consensus against inclusion of Google+'s data exposure. There was very little discussion, but the sources provided by Falling Gravity tipped me over. I'll also note that a "no consensus" close would default to not including it, per WP:NOCON. Accordingly, the material should be removed. Thanks, --DannyS712 (talk) 02:14, 23 February 2019 (UTC) (non-admin closure)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Should Google+'s reported data exposure be included or excluded from this list? RfC relisted by Cunard (talk) at 01:37, 13 January 2019 (UTC). RfC relisted by Cunard (talk) at 05:28, 2 December 2018 (UTC). ^Falling_Gravity 17:34, 20 October 2018 (UTC)

Exclude. Multiple sources that discuss the Google+ incident draw a distinction between a "data breach" and "data exposure", including The Wall Street Journal, CNBC, AEI, TheNextWeb and The Washington Post. We should follow the reliable sources, not our own WP:OR. ^Falling_Gravity 17:40, 20 October 2018 (UTC)

Include. Data exposures are a proper subset of data breaches, according to relevant guidance from government bodies, and trade guides, e.g.:

You just learned that your business experienced a data breach. Whether hackers took personal information from your corporate server, an insider stole customer information, or information was inadvertently exposed on your company’s website, you are probably wondering what to do next.^[1]
A personal data breach can be broadly defined as a security incident that has affected the confidentiality, integrity or availability of personal data.^[2]
The term 'breach' is used to include the loss of control, compromise, unauthorized disclosure, unauthorized acquisition, unauthorized access, or any similar term referring to situations where persons other than authorized users and for an other than authorized purpose have access or potential access to information, whether physical or electronic.^[3]
A data breach is an incident wherein an unauthorised person(s) or company (companies) receives access to the personal data of data subjects. This may be the result of intentional or unintentional action.^[4]

References

^ "Data Breach Response: A Guide for Business". Federal Trade Commission. Retrieved 2018-10-11.
^ "Personal data breaches". Information Commissioner's Office. Retrieved 2018-10-11.
^ "Incident Response Procedures for Data Breaches" (PDF). United States Department of Justice. 2013-08-06. Retrieved 2018-10-21.
^ Bhatia, Punit (2018). Intro to GDPR: A Plain English Guide to Compliance. Advisera Expert Solutions.

To suggest that it is somehow WP:OR or WP:SYNTH to recognise that these passages are applicable to the Google incident, is akin to suggesting that Smoking and Health was irrelevant to Lucky Strikes because it didn't name that brand specifically.

Zazpot (talk) 05:43, 21 October 2018 (UTC)

Both your examples (Google+ and Lucky Strikes) are original research unless secondary sources make such connections to these primary sources. I suggest you read WP:PSTS very carefully. ^Falling_Gravity 23:48, 24 October 2018 (UTC)

I have read it several times previously, I read it again recently, I am broadly supportive of it, and yet I still disagree with you. Wikipedia does not source everything: it does not source each English word or term that is used in each article, for example. But having established what tobacco cigarettes are, or what data breaches are, etc, from reliable sources, we as Wikipedians can then categorise entities or events in the world appropriately. If WP:RS disagree with each other, then we may note this, as I suggested above; but we should not pretend that reliably-sourced facts about what is what can be temporarily suspended because they look bad on a company or its products, even if normally reliable sources choose to do so. Zazpot (talk) 02:36, 27 October 2018 (UTC)

Exclude - exposure is not a breach. Cheers Markbassett (talk) 16:38, 19 December 2018 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Data breach omission: University of Delaware, 2013

Hey crew. I am new to this and I hope this is the right venue. I see an omission that has affected over 74,000 people employed, enrolled, or matriculated from the University of Delaware. Here is the source: http://www1.udel.edu/udaily/2014/jul/resources073013.html

I see that I cannot edit the list directly, so please help me understand how we can get this one on the list. Thanks! — Preceding unsigned comment added by Kinobaby (talk • contribs) 16:55, 20 November 2018 (UTC)

Wordpress

Hi, can someone confirm Wordpress has been hacked recently? If so, it should be added to this page. https://www.zdnet.com/article/thousands-of-wordpress-sites-backdoored-with-malicious-code/ Kathelijne (talk) 13:44, 5 December 2018 (UTC)

Collection #1

This is being called a data breach, despite the fact that it appears to be a collection of 773M+ from previous breaches and other data leaks; i.e., technically nothing new [3]. I believe we should include it, but I'm putting the question on the table. --Masem (t) 04:29, 18 January 2019 (UTC)

These aggregate dumps are different though and not infrequent. Collection #1 is already shown to be a part of a much larger collection and there have been other dumps that have included parts of the included data. What is definitely worth adding to this list, though, are the dumps that were found in collection #1, were publicly disclosed, but aren't in this list (e.g. elance, cdprojektred, nexus mods). -- Jsoverson (talk) 16:31, 28 January 2019 (UTC)

New column "country"

I'm proposing to add a new column "Country" to the table to provide the country of origin of the company that suffered the breach. For instance, the Yahoo! data breach would be US, OVH would be FR, ... — Preceding unsigned comment added by 194.3.119.2 (talk) 06:59, 17 July 2019 (UTC)

Add column to add more insight

Add some columns to give more insight into data breaches:

a) What percentage of users/employees/customers were affected.
b) What average compensation per breached account was settled in court.

Sample as below: https://www.linkedin.com/feed/update/urn:li:activity:6559118839525834752 — Preceding unsigned comment added by Tapan.allabadi (talk • contribs) 17:22, 22 July 2019 (UTC)

MongoDB entries are erroneous

Joe Drumgoole (talk) 13:51, 23 November 2020 (UTC) The two MongoDB entries imply that MongoDB (the company) was responsible for these breaches. In both instances the owner of the database was an (unknown?) third party. I don't want to make the edit myself, as I am an employee of MongoDB. If we were listing the vendors whose databases were involved in the breaches, every database vendor would be listed here. Can we amend the MongoDB entries to indicate the actual entity involved, or mark the entity as unknown?

Adding Philip Morris

As I am not registered, can someone add Philip Morris International? The data breach concerns 15 years of tobacco survey data belonging to major tobacco companies (valued at 70 million USD). The reference and source can be seen in a complaint filed at the New York State court: https://iapps.courts.state.ny.us/nyscef/DocumentList?docketId=ixdcabdUnWjejcynC/fJsQ==&display=all&courtType=New%20York%20County%20Supreme%20Court&resultsPageNum=1 — Preceding unsigned comment added by 2.53.134.87 (talk) 14:27, 30 November 2020 (UTC)

We can't use court documents as they are a primary source; it needs to be reported by third-party sources. --Masem (t) 14:44, 30 November 2020 (UTC)

So you can add it as it was reported by OCCRP https://www.occrp.org/en/daily/13413-complaint-phillip-morris-smuggled-smokes-distorted-data — Preceding unsigned comment added by 2.53.155.153 (talk) 22:07, 3 December 2020 (UTC)

It is still a claim and not proven, so we can't include it. --Masem (t) 22:09, 3 December 2020 (UTC)
