How Badly are Data Breach “Whales” Impacting the Breach Trend Lines?

Shortly after posting “Data Breaches: So How Bad is it Getting?”, it dawned on me that it might be interesting to factor out the big “whale” breaches (e.g. Yahoo! in 2016 with 3+ billion compromised records) to get a feel for what the “run rate” of breaches really is in terms of compromised records. So consider this blog post an addendum to that one.

Thar She Blows!

In taking the data from security vendor Risk Based Security’s (RBS) 2018 report that shows the number of records compromised per year in data breaches …

Source: Risk Based Security 2018 Report

… and factoring out what they consider to be the top 20 breaches of all time, this is what I get: a much more linear view of the increase in data breach records.

In other words, the original RBS data showed the number of compromised records decreasing from 2017 to 2018. But if you factor out the whales (i.e. the top 20 breaches over the last 5 years), there is actually an increase year over year. The “run rate” of compromised records is going up and has doubled in the last few years. Here are my calculations below.

# of records is in millions

The other interesting thing is that the whales represent a massive share of the annual records compromised (e.g. 70%, 74% and 59% in the last 3 years respectively). This echoes the point in my last blog post: while the number of reported breaches has been flattening out over the last few years, the number of reported compromised records per year is getting bigger. In other words, hackers are getting more bang for the buck.
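For anyone who wants to redo this arithmetic on their own numbers, here is a minimal sketch of the two calculations above: the whale-adjusted “run rate” (total records minus whale records) and the whale share of each year's total. The function name `factor_out_whales` and the sample values are my own placeholders for illustration; they are not the RBS figures.

```python
def factor_out_whales(total_by_year, whales_by_year):
    """Return {year: (run_rate, whale_share)}; record counts are in millions."""
    result = {}
    for year, total in total_by_year.items():
        whale = whales_by_year.get(year, 0)         # years with no whale count as 0
        run_rate = total - whale                    # records left once whales are removed
        share = whale / total if total else 0.0     # fraction of the year due to whales
        result[year] = (run_rate, share)
    return result


# Placeholder values only (millions of records), NOT taken from the RBS report.
total = {"year_1": 1000, "year_2": 800}
whales = {"year_1": 700, "year_2": 400}

adjusted = factor_out_whales(total, whales)
for year in sorted(adjusted):
    run_rate, share = adjusted[year]
    print(f"{year}: run rate {run_rate}M, whale share {share:.0%}")
```

With these made-up inputs the raw total falls from one year to the next while the run rate rises, which is the same pattern described above for 2017 to 2018.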

But we must caveat all of this with the fact that the data sets from RBS and others are based on scouring the Internet for known and reported breaches. Unfortunately, the US does not have a federal data breach notification law (unlike Europe’s GDPR), so we have a limited view of the severity of the problem. I like this quote in RBS’ 2019 mid-year report (which showed breach activity growing 50% year over year) about breaches being “swept under the rug.”

I think in my next blog post I am going to take a look at cybersecurity spending and compare it to the breach trend lines.



