[BreachExchange] GitHub Leaks: Lessons Learned

Wed May 5 10:41:42 EDT 2021

https://www.databreachtoday.com/github-leaks-lessons-learned-a-16504

Recent incidents involving inadvertent exposure of patient data on GitHub,
a software development and version control platform designed for
collaboration, point to the need to ensure that data loss prevention tools
are implemented, available security controls are leveraged and employees
are made aware of the risks involved in using internet-facing platforms.

In the most recent incident, COVID-19 test results and other sensitive data
of 164,000 individuals collected by the Wyoming Department of Health were
exposed on GitHub.

In another incident earlier this month, the PHI of 136,000 individuals was
accidentally exposed on GitHub by an employee of revenue cycle management
vendor Med-Data Inc.

Those incidents follow the discovery last year by Dutch independent
security researcher Jelle Ursem of several other health data exposures on
GitHub.

Organizations can take steps to avoid having sensitive data loaded onto
GitHub, security experts say. Those steps include using data loss
prevention controls to scan for sensitive data before anything is uploaded
to the platform.

Cloud-based services, including GitHub, also have a range of
security-related capabilities, such as two-factor authentication, as well
as extensive controls limiting access to data on their platforms, says Bill
Santos, president of security services firm Cerberus Sentinel.

"Unfortunately, too many organizations don’t understand or leverage these
capabilities," he says. "Proper training and settings review of these
platforms are essential to using them securely."

Wyoming Incident
The Wyoming Department of Health said it became aware on March 10 of an
unintentional exposure of 53 files containing COVID-19 and influenza test
result data and one file containing breath alcohol test results. "These
files were mistakenly uploaded by a … workforce member to private and
public online storage locations, known as repositories, on servers
belonging to GitHub.com," the department said in a statement.

"This incident did not result from a compromise of GitHub or its systems,"
the state says. "While GitHub.com has privacy and security policies and
procedures in place regarding the use of data on their platform, the
mistakes made by the WDH employee still allowed the information to be
exposed."

The exposed health information included COVID-19 tests that were
electronically reported to the department, including name or patient ID,
address, date of birth, test results and dates of service, the department
says.

"While WDH staff intended to use this software service only for code
storage and maintenance rather than to maintain files containing health
information, a significant and very unfortunate error was made when the
test data was also uploaded to GitHub.com," said Michael Ceballos, director
of the department.

Med-Data Incident
The incident involving Texas-based Med-Data, which was reported to the
Department of Health and Human Services on April 1 as an "unauthorized
access/disclosure" breach, affected nearly 136,000 individuals, according
to the HHS HIPAA Breach Reporting Tool website, which lists health data
breaches affecting 500 or more individuals.

Several Med-Data healthcare clients have also issued their own breach
notification statements about that incident, including Houston, Texas-based
Memorial Hermann; Wausau, Wisconsin-based Aspirus Health Plan; Peoria,
Illinois-based OSF HealthCare; and the University of Chicago Medical Center.

Med-Data says a former worker saved files containing PHI in personal
folders on the GitHub platform sometime between December 2018 and September
2019. Those files were removed on Dec. 17, 2020, Med-Data says.

Last year, security researcher Ursem and privacy blogger DataBreaches.net
published a paper describing nine other PHI data leaks found on GitHub
public repositories (see: Medical Records Exposed Via GitHub Leaks).

Avoiding Mishaps
Healthcare entities can take several steps to reduce the risk of
unintentional data exposures on GitHub, security experts note.

"When using any internet-facing platform, it is imperative that
organizations have processes in place to reduce or eliminate human error,"
says Erich Kron, security awareness advocate at security vendor and
consulting firm KnowBe4.

"Sensitive data such as passwords embedded in code or documentation can be
used to get into otherwise secure systems, and live data can easily be
accidentally included in uploads that were supposed to contain data that
was scrubbed," he says.

To avoid this, organizations should deploy data loss prevention controls
that scan data for sensitive information before it is uploaded to external
sites, he says.
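
As a rough illustration of the kind of pre-upload scan described above, the
following Python sketch flags lines that look like they contain health
identifiers or embedded credentials. The rule names, the three patterns and
the functions scan_text and is_safe_to_upload are assumptions made for this
example; they are not taken from any vendor's product, and a real data loss
prevention tool would use far more thorough detection.

```python
import re

# Hypothetical rules for illustration only; a real DLP product ships far
# more thorough detectors than these three patterns.
RULES = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "dob_field": re.compile(r"\b(?:dob|date[ _]of[ _]birth)\b", re.IGNORECASE),
    "hardcoded_secret": re.compile(
        r"\b(?:password|passwd|secret|api[_-]?key)\b\s*[:=]\s*\S+", re.IGNORECASE
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every rule hit in text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


def is_safe_to_upload(text):
    """True only when no rule matched anywhere in the text."""
    return not scan_text(text)
```

A check like this could run before files leave a workstation, for example
as part of a commit or upload step, so that a flagged file is held back for
review instead of being pushed to an external site.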

"'Canary' data, such as fake records with obvious identifiers, can be used
to help spot when the wrong data set might be uploaded. If these fake
records are only present in the live dataset, not a dummy dataset, and are
found during an upload or through a search of the data on the external
platform, it should alert people to a significant problem."
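
The canary approach Kron describes can be sketched in a few lines of
Python. The identifiers in CANARY_IDS, the patient_id field name and the
functions find_canaries and check_upload are hypothetical, chosen only for
this example; the sketch assumes the fake records were seeded into the live
data set and never into the dummy one.

```python
# Hypothetical canary identifiers seeded into the live dataset only.
CANARY_IDS = frozenset({"CANARY-0001", "CANARY-0002"})


def find_canaries(records, id_field="patient_id", canary_ids=CANARY_IDS):
    """Return the canary identifiers found in an iterable of dict records."""
    seen = {record.get(id_field) for record in records}
    return sorted(seen & canary_ids)


def check_upload(records):
    """Raise ValueError if the data set contains any canary record."""
    hits = find_canaries(records)
    if hits:
        raise ValueError(f"live data suspected, canary records found: {hits}")
```

If check_upload raises, the data set being prepared is likely the live one
and the upload should stop. The same identifiers can also be searched for
on the external platform after the fact.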

In addition, consistent workforce training on the handling of sensitive
data can help keep employees mindful of the threats, Kron says.

Complex Landscape
As the threat landscape becomes more complex, it is imperative that
organizations across all sectors "think about data security differently
than we did 10 years ago," Santos says.

"Access is more extensive and interest in exfiltrating that data has
increased dramatically. A cultural shift in security awareness is essential
to the day-to-day activities of every employee - government or otherwise."