De-Identify Patient Data

Have you ever provisioned a new environment and needed to populate it with quality, robust patient data? Or had an old environment with great patient data that needed to be de-identified? Well, so have we…

I was recently tasked with pulling a bundle of HL7 messages. The messages consisted of ORUs (lab and radiology results), MDMs (doctors' notes), ADTs (registration), and SIUs (scheduling) covering twenty-five patients' transactions over a five-day period. This resulted in 1,400 HL7 messages with live patient data. So, just to recap, we have real-life scenarios running from a patient's registration through radiology, lab, and transcription information. Awesome!

We’re done… right?

Well, this makes someone training providers or testing the new functionality of an application happy, but what about your HIPAA compliance officer? We need to make sure that if anyone were to try to track down a patient from this data, they would come up empty.


We need a way to use the patient data we compiled without compromising it, leaving no trace of the real patients. So what does that mean? We want to keep each patient's order date ranges consistent and adhere to the HL7 message protocol, without anyone being able to figure out whether your neighbor recently had a complete blood count (CBC). We could manually track down the sensitive patient data and update it, but that is unreasonable across 1,400 messages. Luckily, Inner Harbour Software has a great tool called HL7Spy that accomplishes everything above. Its custom code functionality uses C# and will transform the patient data for you while keeping the date ranges intact. For example, if a CBC was ordered and the specimen was collected two days later, HL7Spy ensures the two-day period is maintained in the de-identified messages.
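
To make the date-range idea concrete, here is a minimal sketch of interval-preserving date shifting. It is written in Python rather than the C# that HL7Spy's custom code uses, and it is not HL7Spy's implementation or API. The function names, and the choice of OBR-6 and OBR-7 as the only fields to shift, are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

# HL7 v2 timestamps written as YYYYMMDD, YYYYMMDDHHMM or YYYYMMDDHHMMSS.
_FORMATS = {8: "%Y%m%d", 12: "%Y%m%d%H%M", 14: "%Y%m%d%H%M%S"}

# Assumption for this sketch: only these (segment, field number) pairs hold
# dates to shift. A real message set has many more date fields than this.
DATE_FIELDS = {("OBR", 6), ("OBR", 7)}


def shift_timestamp(value, offset_days):
    """Shift one timestamp by offset_days; return unrecognized values as-is."""
    fmt = _FORMATS.get(len(value))
    if fmt is None or not value.isdigit():
        return value
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return value  # digits that do not form a valid date
    return (parsed + timedelta(days=offset_days)).strftime(fmt)


def shift_message(message, offset_days):
    """Apply the same offset to every listed date field in one message."""
    shifted_segments = []
    for segment in message.split("\r"):  # HL7 v2 segments end with \r
        fields = segment.split("|")
        for segment_id, number in DATE_FIELDS:
            if fields[0] == segment_id and number < len(fields):
                fields[number] = shift_timestamp(fields[number], offset_days)
        shifted_segments.append("|".join(fields))
    return "\r".join(shifted_segments)
```

Because every date belonging to one patient moves by the same offset, the gaps between events survive. Calling shift_message("OBR|1|||CBC||20240110|202401120830", -37) returns "OBR|1|||CBC||20231204|202312060830", so the order and the collection are still two days apart.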

HL7Spy is a very robust and versatile tool. It can analyze hundreds of thousands of messages and is a must-have tool for HL7 integrators! I highly recommend you try their 20-day free trial (

If you also have a lot of PHI in a train, test, or development Allscripts Enterprise EHR environment that needs to be de-identified, we can help! Galen can go into these environments and run a script to scramble the patient data. We use only a list of 200 first names and last names, and replace every patient's name with entries from it. Using only a few specified SSNs, MRNs, addresses, etc. ensures we protect the patients' data.
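
As a rough illustration of replacing names from a fixed list, the sketch below maps each patient to a replacement name drawn from a small pool. This is not Galen's script. The short name lists stand in for the list of 200, and the salted-hash lookup, the fake_name function, and its parameters are all hypothetical choices made so that the same patient always receives the same replacement.

```python
import hashlib

# Assumption: tiny stand-ins for the list of 200 first and last names.
FIRST_NAMES = ["Alex", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Smith", "Jones", "Brown", "Davis"]


def fake_name(patient_key, salt):
    """Pick a replacement (first, last) name for one patient.

    patient_key is any stable identifier for the patient. salt is a secret
    string that keeps the mapping from being reproduced by outsiders.
    """
    digest = hashlib.sha256((salt + patient_key).encode("utf-8")).digest()
    # Use separate slices of the digest so first and last names vary
    # independently of each other.
    first_index = int.from_bytes(digest[:4], "big") % len(FIRST_NAMES)
    last_index = int.from_bytes(digest[4:8], "big") % len(LAST_NAMES)
    return FIRST_NAMES[first_index], LAST_NAMES[last_index]
```

With a pool this small, many patients end up sharing a replacement name. That is the point of the approach: the replacement carries no information about the real person.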
