'Hidden' PHI in Medical Images Poses Risks
3 Radiology Groups Urge Scrutiny of Online Presentations to Ensure Privacy
Patient identifiers embedded in medical images used for online presentations are at risk of inadvertent discovery by advanced web-crawling technologies in search engines, three radiology associations warn.
"Advances in web-crawling and content processing technology employed by search engine vendors - for example Google, Bing and others - increasingly enable large-scale information extraction from previously stored files," warns the recent alert from the American College of Radiology, the Radiological Society of North America and the Society for Imaging Informatics in Medicine.
"Among other things, this technology can extract source images contained in PowerPoint presentations and Adobe PDF files and recognize alphanumeric character information that may be embedded in the image pixels," the groups point out.
As a result, an image with embedded patient information can be indexed by the search engines, the alert warns. "When explicit patient information becomes associated with images in the search engine database, it can be found on subsequent internet searches on the patient's personal information."
For example, if a patient searches their name in a search engine, images from a diagnostic imaging study performed several years ago could appear, the three groups warn. When a patient clicks on those images, they could be directed to the website of a professional imaging association that stored an Adobe PDF file as part of an educational presentation.
Those creating PowerPoint presentations could be unaware that the image files they used contained PHI that was not sufficiently de-identified, or that saving the presentation in Adobe PDF format had not preserved privacy, the groups point out.
A study released last year by Digital Shadows' Photon Research Team revealed the inadvertent online exposure of 4.7 million healthcare files - the majority being medical images - that contained patient names and other identifiers as well as details about the patient's healthcare encounter (see: 2.3 Billion Files Exposed Online: Root Causes).
Many of the files were exposed via Server Message Block, or SMB, protocol file shares, as well as through misconfigurations or a lack of security controls that made files accessible via the internet, Digital Shadows' research showed.
Last year, a joint investigation by news media site ProPublica and German broadcaster Bayerischer Rundfunk discovered millions of patients' medical image files exposed on the internet - including by a U.S. company, TridentUSA Health Services' MobilexUSA unit (see: Sen. Warner Asks HHS for Answers on Unsecured Medical Images).
In that incident, the researchers said they found 187 servers in the U.S. - including a MobilexUSA server - left unprotected by passwords or basic security precautions. In total, the exposed records included medical images and health data, such as X-rays, MRIs and CT scans, belonging to about 5 million Americans, plus "millions more around the world."
The names of more than 1 million patients were accessible on the unsecured MobilexUSA server "all by typing in a simple data query," ProPublica reported.
Incidents involving patient identifiers inadvertently exposed in online presentations, including those featuring medical images, are common, says former healthcare CISO Mark Johnson, who heads the healthcare practice at the consultancy LBMC Information Security.
"This happens all too often. It is due to an inattention to detail and a lack of process to review such presentations for these kinds of issues," he says.
"In several of my previous roles, the organizations had processes for having these types of presentations reviewed by legal and the privacy official. Occasionally, I was asked to review for technical or security points before the presentation was sent," he notes.
Including patient identifiers in medical imaging files serves a practical purpose, Johnson notes.
"We should recognize that embedding identifiers into these images is a good thing by helping to prevent confusing or mixing up patients' information," he says. "That said, we need to be careful where and when these images are stored, shared or viewed. The most common mistake I've seen is that people don't realize that the images themselves are PHI."
Precautions to Take
The three radiology groups also issued guidance to help healthcare entities prevent inadvertent PHI disclosures in medical images used for educational purposes or online presentations.
"The first place to pay attention to potential PHI exposure is the initial workflow step of exporting images from the PACS [picture archiving and communication systems] or another imaging device or application," the guidance notes.
"Every time an image is saved directly from a PACS as a file - as opposed to creating a limited screenshot - there is a risk that PHI gets into that file via patient data embedded as pixels within the image itself or in the form of metadata if a DICOM file is saved," the guidance says. "Even when images do, in fact, contain PHI data, it can be redacted using appropriate tools and processes."
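The two exposure paths the guidance describes, header metadata and text burned into the pixels, can be illustrated with a minimal Python sketch. It is not a de-identification tool: the image is modeled as a plain list of pixel rows and the header as a dictionary, the short field list is only an illustrative subset, and the banner coordinates are hypothetical. A real workflow would rely on a DICOM-aware redaction tool and a full de-identification profile.

```python
# Illustrative subset of identifying header fields (an assumption for this
# sketch); a real de-identification profile covers many more attributes.
IDENTIFYING_FIELDS = {"PatientName", "PatientID", "PatientBirthDate"}


def strip_header(header):
    """Return a copy of the header dict without the identifying fields."""
    return {k: v for k, v in header.items() if k not in IDENTIFYING_FIELDS}


def redact_region(pixels, top, left, height, width, fill=0):
    """Return a copy of a 2D pixel grid with one rectangle overwritten.

    The rectangle is clipped to the image bounds, and the input grid is
    left unmodified so the original can be kept under access control.
    """
    out = [row[:] for row in pixels]
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = fill
    return out


if __name__ == "__main__":
    # Hypothetical 3x4 image whose first row stands in for a text banner.
    image = [[9, 9, 9, 9], [5, 6, 7, 8], [1, 2, 3, 4]]
    header = {"PatientName": "DOE^JANE", "Modality": "CT"}
    print(strip_header(header))
    print(redact_region(image, top=0, left=0, height=1, width=4))
```

Both functions return copies rather than editing in place, which reflects the point that the export step is where a separate, cleaned file should be produced for presentation use.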
Once a presentation and other documents containing medical images have been converted to PDFs for sharing via the web, entities must take precautions before posting the material.
"Although you may not see hidden data when simply viewing a PDF through a common viewer, the PDF can contain PHI in hidden objects as well as metadata stored in tags," the guidance warns. "Adobe has a 'sanitize' function that will help you identify and redact hidden data."
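As a rough illustration of why a PDF can carry data a viewer does not show, the following Python sketch scans a file's raw bytes for a few standard document information entries (Title, Author, Subject, Keywords) stored as plain literal strings. It is only a spot check under stated assumptions: the function name is hypothetical, and the scan does not see entries inside compressed object streams, XMP metadata, hex-encoded strings, or strings containing nested parentheses. A clean result is therefore not proof that a file is free of PHI, and the scan does not replace a sanitizing tool.

```python
import re

# Standard document information keys checked by this sketch (a subset).
INFO_KEYS = (b"Title", b"Author", b"Subject", b"Keywords")

# Matches entries such as "/Author (Jane Doe)" written as literal strings.
PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)"
)


def find_info_strings(pdf_bytes):
    """Return (key, value) pairs for plain-text info entries in the bytes."""
    return [
        (m.group(1).decode("ascii"), m.group(2).decode("latin-1"))
        for m in PATTERN.finditer(pdf_bytes)
    ]


if __name__ == "__main__":
    # Hypothetical fragment of an uncompressed PDF object.
    sample = b"1 0 obj\n<< /Title (Chest CT teaching case) /Author (Jane Doe) >>\nendobj\n"
    for key, value in find_info_strings(sample):
        print(key, "=", value)
```

Any value the scan surfaces would then be reviewed by a person before the file is posted.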