[BreachExchange] 235 Million Instagram, TikTok And YouTube User Profiles Exposed In Massive Data Leak

Thu Aug 20 10:27:36 EDT 2020

https://www.forbes.com/sites/daveywinder/2020/08/19/massive-data-leak235-million-instagram-tiktok-and-youtube-user-profiles-exposed/#1bb385011111

The security research team at Comparitech today disclosed how an
unsecured database left almost 235 million Instagram, TikTok and
YouTube user profiles exposed online in what can only be described as
a massive data leak.

Recently there has been a spate of reports concerning account data
appearing on dark web cybercrime forums, from the dark web audit
suggesting there are currently 15 billion stolen logins from 100,000
breaches out there to the hacker giving away 386 million stolen
records for free. Not all of this data will have been hacked, at least
not in the usual sense of the word: some, as was likely the case in
the Utah Gun Exchange incident, will have been exposed by an unsecured
database.

The unsecured database problem

Unsecured databases are fast becoming such a huge data protection
problem that it's thought a vigilante security researcher is behind
the spate of "Meow" attacks that have overwritten the indexes of
thousands of such databases. And it was such an unsecured database
that the Comparitech researchers, led by Bob Diachenko, discovered on
August 1, leaving the personal profile data of nearly 235 million
Instagram, TikTok and YouTube users up for grabs.

The data was spread across several datasets. The two most significant
came in at just under 100 million each and contained profile records
apparently scraped from Instagram. The third-largest
was a dataset of some 42 million TikTok users, followed by just under
4 million YouTube user profiles.

Comparitech says that, based on the samples it collected, one in five
records contained either a telephone number or email address. Every
record also included at least some, and sometimes all, of the
following information:

- Profile name
- Full real name
- Profile photo
- Account description

Statistics about follower engagement, including:

- Number of followers
- Engagement rate
- Follower growth rate
- Audience gender
- Audience age
- Audience location
- Likes
- Last post timestamp
- Age
- Gender

"The information would probably be most valuable to spammers and
cybercriminals running phishing campaigns," Paul Bischoff, Comparitech
editor, says. "Even though the data is publicly accessible, the fact
that it was leaked in aggregate as a well-structured database makes it
much more valuable than each profile would be in isolation," Bischoff
adds. Indeed, Bischoff told me that it would be easy for a bot to use
the database to post targeted spam comments on any Instagram profile
matching criteria such as gender, age or number of followers.

Tracing the source of the leaked data

So, where did all this data originate? The researchers suggest that
the evidence, including dataset names, pointed to a company called
Deep Social. However, Deep Social was banned by both Facebook and
Instagram in 2018 after scraping user profile data. The company was
wound down sometime after this.

A Facebook company spokesperson told me that "scraping people's
information from Instagram is a clear violation of our policies. We
revoked Deep Social's access to our platform in June 2018 and sent a
legal notice prohibiting any further data collection."

Once the researchers found the database and the clues to its origin,
"we sent an alert to Deep Social, assuming the data belonged to them,"
Bischoff says. The administrators of Deep Social then forwarded the
disclosure to a Hong Kong-registered social media influencer
data-marketing company called Social Data. "Social Data shut down the
database about three hours after our initial email," Bischoff says.

Social Data responds to the database exposure incident

Social Data has denied any connection between itself and Deep Social,
according to the Comparitech report. It should also be made clear that
the leaked data, public social media profile data, is available to
anyone who visits the accounts of the users concerned. However, the
phishing risk is clearly amplified once such a hoard of profiles is
collected together in a well-structured database. It isn't known at
this time how long the database was exposed without a password before
the August 1 discovery. The Comparitech report points out that: "Our
honeypot experiments show that hackers can find and attack unsecured
databases within hours of being exposed."

I reached out to Social Data, and a spokesperson provided the
following statement:

"We collect data and enrich it with additional useful insights solely
on behalf of our reputable customers, who use it strictly for the
intended purposes. It is extremely sad that this incident has occurred
due to a mixture of unfortunate events. However, as soon as we learned
of the incident, we fixed it immediately. We have since been closely
working with the information security experts on auditing our security
infrastructure and increasing the required levels of information
security to avoid similar occurrences in the future."

A TikTok spokesperson told me: "TikTok places the highest priority on
user privacy, and we have anti-scraping policies in place. Our Terms
of Service prohibit third parties from running automated scripts to
collect information from our services, including public profile
information. If we identify any such practices, we will take rapid
action, including seeking legal redress."

I have also reached out to Google, who, at the time of
publication, was still looking into the matter and unable to provide a
statement. I will, of course, update this story if this changes.

Advice for concerned Instagram, TikTok and YouTube users

Meanwhile, I would advise users of all the services affected,
Instagram, TikTok and YouTube, to be especially alert to phishing
scams by email or posted as social media comments.

Likewise, if your company has any databases "in the cloud" then I
would strongly recommend you audit the access permissions and make
sure these are not open to anyone who comes looking. Elastic has an
excellent guide to securing Elasticsearch deployments.
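
As a rough illustration of the first step in such an audit, here is a
minimal sketch in Python that sends a single unauthenticated request to
a database's HTTP endpoint and reports whether it answered without
asking for credentials. The function names and the status-code mapping
are my own assumptions for illustration, not anything taken from the
Comparitech report or the Elastic guide, and a 200 response is only a
hint that access controls deserve a closer look, not proof of exposure.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a rough verdict.

    The mapping is an illustrative assumption, not an official rule.
    """
    if status == 200:
        return "open"            # answered without any credentials
    if status in (401, 403):
        return "auth-required"   # credentials or permissions were demanded
    return "inconclusive"        # anything else needs a manual look


def probe(base_url, timeout=5.0):
    """Send one unauthenticated GET to base_url and classify the outcome."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar network errors.
        return "unreachable"
```

Calling probe("http://localhost:9200/") against a hypothetical local
Elasticsearch node would be one way to try it. Only point it at systems
you own or are authorised to test.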