Monday, 8 July 2013

HL7 Analysis with NoSQL MongoDB - 2


I would like to thank all who read my article on HL7 analysis with NoSQL MongoDB - 1 and provided their valuable inputs. This has helped me improve the schema design & execute more queries for HL7 message analysis.

Some of the important considerations during this schema design were
  • We will be inserting data once while loading messages for analysis
  • There will not be any updates
  • Schema should be designed to facilitate faster reads

Here is a snapshot of schema that I have created. I have provided some sample queries using Map Reduce & Aggregation at the end of this article.


Let's take same example HL7 message from previous post
 MSH|^~\&|ADT|ADI|ADT-1|ADI-1|20050215||ADT^A01|MSGADT003|T|2.4  
 EVN|A01|20031016000000  
 PID|1|111222333|H123123^^^^MR^ADT~111-222-333^^^^SS^ADT||John^Smith|GARSEN^^Melissa|19380818|M||2028-9|241 AVE^^Lake City^WA^98125^^^^100|100|(425)111-2222|(425)111-2222||S|CHR|1234567|111-222-333  
 NK1|2|GARSEN^Melissa  
 PV1|1|E|||||D123^Jeff^Carron|||MED||||7|||D123^Jeff^Taylor|E|3454|R^20050215|||||||||||||||||||EM|||||20050215  
 IN1|1|I123|ICOMP1|INS COMP 1|PO BOX 1^^Lake City^WA^98125||||||||||1|John^Smith|01|19380818  
 IN2|1||RETIRED  
 IN1|2|I456|ICOMP2|INS COMP 1|PO BOX 2^^Lake City^WA^98125||||||||||8|John^Smith|01|19380818  
 IN2|2||RETIRED  

Here is a json representation of this sample HL7 message:
 {  
      "_id": "MSGADT003",  
      "Event": "A01",  
      "MsgDt": "20050215",  
      "Segments": [  
           {  
                "_id": "MSH",  
                "Rep": 1,  
                "Seq": 0,  
                "Val": "MSH|^~\\&|ADT|ADI|ADT-1|ADI-1|20050215||ADT^A01|MSGADT003|T|2.4",  
                "FC": 12,  
                "VF": 11,  
                "Fields": [  
                     {  
                          "_id": "MSH_1",  
                          "Val": "|"  
                     },  
                     {  
                          "_id": "MSH_2",  
                          "Val": "^~\\&"  
                     },  
                     {  
                          "_id": "MSH_3",  
                          "Val": "ADT"  
                     },  
                     {  
                          "_id": "MSH_4",  
                          "Val": "ADI"  
                     },  
                     {  
                          "_id": "MSH_5",  
                          "Val": "ADT-1"  
                     },  
                     {  
                          "_id": "MSH_6",  
                          "Val": "ADI-1"  
                     },  
                     {  
                          "_id": "MSH_7",  
                          "Val": "20050215"  
                     },  
                     {  
                          "_id": "MSH_9",  
                          "Val": "ADT^A01"  
                     },  
                     {  
                          "_id": "MSH_9_1",  
                          "Val": "ADT"  
                     },  
                     {  
                          "_id": "MSH_9_2",  
                          "Val": "A01"  
                     },  
                     {  
                          "_id": "MSH_10",  
                          "Val": "MSGADT003"  
                     },  
                     {  
                          "_id": "MSH_11",  
                          "Val": "T"  
                     },  
                     {  
                          "_id": "MSH_12",  
                          "Val": "2.4"  
                     }  
                ]  
           },  
           {  
                "_id": "EVN",  
                "Rep": 1,  
                "Seq": 1,  
                "Val": "EVN|A01|20031016000000",  
                "FC": 2,  
                "VF": 2,  
                "Fields": [  
                     {  
                          "_id": "EVN_1",  
                          "Val": "A01"  
                     },  
                     {  
                          "_id": "EVN_2",  
                          "Val": "20031016000000"  
                     }  
                ]  
           },  
           {  
                "_id": "PID",  
                "Rep": 1,  
                "Seq": 2,  
                "Val": "PID|1|111222333|H123123^^^^MR^ADT~111-222-333^^^^SS^ADT||John^Smith|GARSEN^^Melissa|19380818|M|||241 AVE^^Lake City^WA^98125||(425)111-2222|||S|CHR|1234567",  
                "FC": 18,  
                "VF": 12,  
                "Fields": [  
                     {  
                          "_id": "PID_1",  
                          "Val": "1"  
                     },  
                     {  
                          "_id": "PID_2",  
                          "Val": "111222333"  
                     },  
                     {  
                          "_id": "PID_3",  
                          "Val": "H123123^^^^MR^ADT~111-222-333^^^^SS^ADT",  
                          "Repetitions": [  
                               {  
                                    "_id": "PID_3",  
                                    "Val": "H123123^^^^MR^ADT",  
                                    "Rep": 1  
                               },  
                               {  
                                    "_id": "PID_3_1",  
                                    "Val": "H123123",  
                                    "Rep": 1  
                               },  
                               {  
                                    "_id": "PID_3_5",  
                                    "Val": "MR",  
                                    "Rep": 1  
                               },  
                               {  
                                    "_id": "PID_3_6",  
                                    "Val": "ADT",  
                                    "Rep": 1  
                               },  
                               {  
                                    "_id": "PID_3",  
                                    "Val": "111-222-333^^^^SS^ADT",  
                                    "Rep": 2  
                               },  
                               {  
                                    "_id": "PID_3_1",  
                                    "Val": "111-222-333",  
                                    "Rep": 2  
                               },  
                               {  
                                    "_id": "PID_3_5",  
                                    "Val": "SS",  
                                    "Rep": 2  
                               },  
                               {  
                                    "_id": "PID_3_6",  
                                    "Val": "ADT",  
                                    "Rep": 2  
                               }  
                          ]  
                     },  
                     {  
                          "_id": "PID_5",  
                          "Val": "John^Smith"  
                     },  
                     {  
                          "_id": "PID_5_1",  
                          "Val": "John"  
                     },  
                     {  
                          "_id": "PID_5_2",  
                          "Val": "Smith"  
                     },  
                     {  
                          "_id": "PID_6",  
                          "Val": "GARSEN^^Melissa"  
                     },  
                     {  
                          "_id": "PID_6_1",  
                          "Val": "GARSEN"  
                     },  
                     {  
                          "_id": "PID_6_3",  
                          "Val": "Melissa"  
                     },  
                     {  
                          "_id": "PID_7",  
                          "Val": "19380818"  
                     },  
                     {  
                          "_id": "PID_8",  
                          "Val": "M"  
                     },  
                     {  
                          "_id": "PID_11",  
                          "Val": "241 AVE^^Lake City^WA^98125"  
                     },  
                     {  
                          "_id": "PID_11_1",  
                          "Val": "241 AVE"  
                     },  
                     {  
                          "_id": "PID_11_3",  
                          "Val": "Lake City"  
                     },  
                     {  
                          "_id": "PID_11_4",  
                          "Val": "WA"  
                     },  
                     {  
                          "_id": "PID_11_5",  
                          "Val": "98125"  
                     },  
                     {  
                          "_id": "PID_13",  
                          "Val": "(425)111-2222"  
                     },  
                     {  
                          "_id": "PID_16",  
                          "Val": "S"  
                     },  
                     {  
                          "_id": "PID_17",  
                          "Val": "CHR"  
                     },  
                     {  
                          "_id": "PID_18",  
                          "Val": "1234567"  
                     }  
                ]  
           },  
           {  
                "_id": "PV1",  
                "Rep": 1,  
                "Seq": 4,  
                "Val": "PV1|1|E|||||D123^Jeff^Carron|||MED||||7|||D123^Jeff^Carron|E|3454|R^20050215|||||||||||||||||||EM|||||20050215",  
                "FC": 44,  
                "VF": 11,  
                "Fields": [  
                     {  
                          "_id": "PV1_1",  
                          "Val": "1"  
                     },  
                     {  
                          "_id": "PV1_2",  
                          "Val": "E"  
                     },  
                     {  
                          "_id": "PV1_7",  
                          "Val": "D123^Jeff^Carron"  
                     },  
                     {  
                          "_id": "PV1_7_1",  
                          "Val": "D123"  
                     },  
                     {  
                          "_id": "PV1_7_2",  
                          "Val": "Jeff"  
                     },  
                     {  
                          "_id": "PV1_7_3",  
                          "Val": "Carron"  
                     },  
                     {  
                          "_id": "PV1_10",  
                          "Val": "MED"  
                     },  
                     {  
                          "_id": "PV1_14",  
                          "Val": "7"  
                     },  
                     {  
                          "_id": "PV1_17",  
                          "Val": "D123^Jeff^Carron"  
                     },  
                     {  
                          "_id": "PV1_17_1",  
                          "Val": "D123"  
                     },  
                     {  
                          "_id": "PV1_17_2",  
                          "Val": "Jeff"  
                     },  
                     {  
                          "_id": "PV1_17_3",  
                          "Val": "Carron"  
                     },  
                     {  
                          "_id": "PV1_18",  
                          "Val": "E"  
                     },  
                     {  
                          "_id": "PV1_19",  
                          "Val": "3454"  
                     },  
                     {  
                          "_id": "PV1_20",  
                          "Val": "R^20050215"  
                     },  
                     {  
                          "_id": "PV1_20_1",  
                          "Val": "R"  
                     },  
                     {  
                          "_id": "PV1_20_2",  
                          "Val": "20050215"  
                     },  
                     {  
                          "_id": "PV1_39",  
                          "Val": "EM"  
                     },  
                     {  
                          "_id": "PV1_44",  
                          "Val": "20050215"  
                     }  
                ]  
           },  
           {  
                "_id": "IN1",  
                "Rep": 1,  
                "Seq": 5,  
                "Val": "IN1|1|I123|ICOMP1|INS COMP 1|PO BOX 1^^Lake City^WA^98125||||||||||1|John^Smith|01|19380818",  
                "FC": 18,  
                "VF": 9,  
                "Fields": [  
                     {  
                          "_id": "IN1_1",  
                          "Val": "1"  
                     },  
                     {  
                          "_id": "IN1_2",  
                          "Val": "I123"  
                     },  
                     {  
                          "_id": "IN1_3",  
                          "Val": "ICOMP1"  
                     },  
                     {  
                          "_id": "IN1_4",  
                          "Val": "INS COMP 1"  
                     },  
                     {  
                          "_id": "IN1_5",  
                          "Val": "PO BOX 1^^Lake City^WA^98125"  
                     },  
                     {  
                          "_id": "IN1_5_1",  
                          "Val": "PO BOX 1"  
                     },  
                     {  
                          "_id": "IN1_5_3",  
                          "Val": "Lake City"  
                     },  
                     {  
                          "_id": "IN1_5_4",  
                          "Val": "WA"  
                     },  
                     {  
                          "_id": "IN1_5_5",  
                          "Val": "98125"  
                     },  
                     {  
                          "_id": "IN1_15",  
                          "Val": "1"  
                     },  
                     {  
                          "_id": "IN1_16",  
                          "Val": "John^Smith"  
                     },  
                     {  
                          "_id": "IN1_16_1",  
                          "Val": "John"  
                     },  
                     {  
                          "_id": "IN1_16_2",  
                          "Val": "Smith"  
                     },  
                     {  
                          "_id": "IN1_17",  
                          "Val": "01"  
                     },  
                     {  
                          "_id": "IN1_18",  
                          "Val": "19380818"  
                     }  
                ]  
           },  
           {  
                "_id": "IN1",  
                "Rep": 2,  
                "Seq": 7,  
                "Val": "IN1|2|I456|ICOMP2|INS COMP 1|PO BOX 2^^Lake City^WA^98125||||||||||8|John^Smith|01|19380818",  
                "FC": 18,  
                "VF": 9,  
                "Fields": [  
                     {  
                          "_id": "IN1_1",  
                          "Val": "2"  
                     },  
                     {  
                          "_id": "IN1_2",  
                          "Val": "I456"  
                     },  
                     {  
                          "_id": "IN1_3",  
                          "Val": "ICOMP2"  
                     },  
                     {  
                          "_id": "IN1_4",  
                          "Val": "INS COMP 1"  
                     },  
                     {  
                          "_id": "IN1_5",  
                          "Val": "PO BOX 2^^Lake City^WA^98125"  
                     },  
                     {  
                          "_id": "IN1_5_1",  
                          "Val": "PO BOX 2"  
                     },  
                     {  
                          "_id": "IN1_5_3",  
                          "Val": "Lake City"  
                     },  
                     {  
                          "_id": "IN1_5_4",  
                          "Val": "WA"  
                     },  
                     {  
                          "_id": "IN1_5_5",  
                          "Val": "98125"  
                     },  
                     {  
                          "_id": "IN1_15",  
                          "Val": "8"  
                     },  
                     {  
                          "_id": "IN1_16",  
                          "Val": "John^Smith"  
                     },  
                     {  
                          "_id": "IN1_16_1",  
                          "Val": "John"  
                     },  
                     {  
                          "_id": "IN1_16_2",  
                          "Val": "Smith"  
                     },  
                     {  
                          "_id": "IN1_17",  
                          "Val": "01"  
                     },  
                     {  
                          "_id": "IN1_18",  
                          "Val": "19380818"  
                     }  
                ]  
           },  
           {  
                "_id": "IN2",  
                "Rep": 1,  
                "Seq": 6,  
                "Val": "IN2|1||RETIRED",  
                "FC": 3,  
                "VF": 2,  
                "Fields": [  
                     {  
                          "_id": "IN2_1",  
                          "Val": "1"  
                     },  
                     {  
                          "_id": "IN2_3",  
                          "Val": "RETIRED"  
                     }  
                ]  
           },  
           {  
                "_id": "IN2",  
                "Rep": 2,  
                "Seq": 8,  
                "Val": "IN2|2||RETIRED",  
                "FC": 3,  
                "VF": 2,  
                "Fields": [  
                     {  
                          "_id": "IN2_1",  
                          "Val": "2"  
                     },  
                     {  
                          "_id": "IN2_3",  
                          "Val": "RETIRED"  
                     }  
                ]  
           }  
      ]  
 }  

Here is a snapshot of sample document from MongoVUE
HL7_MongoDB_Schema


Example: different Trigger Events with count using Map Reduce
 function Map() {  
   emit(this.Event, 1);  
 }  
 function Reduce(key, values) {  
   returnArray.sum(values);  
 }  

Example: different Trigger Events with count using Aggregation
 { $project : { "Event" : 1 }},  
 { $group: { _id: "$Event", count: {$sum: 1} } }  

Example: different Segments with count using Map Reduce
 function Map() {  
   this.Segments.forEach(function (s) {  
    emit(s._id, 1);  
   });  
 }  
 function Reduce(key, values) {  
   returnArray.sum(values);  
 }  

Example: different Segments with count using Aggregation
 { $unwind: "$Segments" },  
 { $project : { "Segments._id": 1}},  
 { $group: { _id: "$Segments._id", count: {$sum: 1} } }  

Example: distinct values for MSH_3 using Map Reduce
 function Map() {  
   this.Segments.forEach(function (s) {  
    if(s._id == "MSH") {  
      s.Fields.forEach(function (f) {  
       if(f._id == "MSH_3")  
        emit(f.Val, 1);  
      });  
    }  
   });  
 }  
 function Reduce(key, values) {  
 returnArray.sum(values);  
 }  

Example: distinct values for MSH_3 using Aggregation
 { $unwind: "$Segments" },  
 { $unwind: "$Segments.Fields" },  
 { $match : { "Segments.Fields._id": "MSH_3"} },  
 { $group: { _id: "$Segments.Fields.Val" } }  

Example: List of MSH_10 with PV1_2="E" using Aggregation
 { $unwind: "$Segments" },  
 { $unwind: "$Segments.Fields" },  
 { $match : { "Segments.Fields._id": "PV1_2", "Segments.Fields.Val": "E"}},  
 { $project : { "_id": 1}}  

Here is the list of queries that I have executed successfully with this schema in MongoDB.
  1. Total number of messages in sample
  2. Distinct MSH_9_2 (Trigger Events) [with count]
    1. This will tell you which Messages you should handle in your interface
  3. Different Segments in all messages [with count]
    1. This will tell you which segments you should handle in you interface
  4. Different Fields (with values)
    1. This will tell you which fields you need to process
    2. This list contains only those fields for which we have received values or HL7 null in any of the message.
    3. If the field is empty in all the messages in sample then it will not appear here in the list.
  5. List of fields which HL7 null (“”) value [with count]
  6. List of fields with components  [with count]
  7. Check if particular field has components
  8. Check if particular field has specified leading character
  9. List of fields with repetitions  [with count]
  10. Maximum number of repetitions received for each field
  11. List of different Z Segments  [with count]
  12. Find messages with particular segment
  13. Find messages without particular segment
  14. Find messages with particular field. For example messages where PID_19 is present
  15. Find messages with particular field & value. For example messages where PV1_2 = E
  16. Find messages with particular field excluding given value. For example messages where PV1_2 != E
  17. Find messages without particular field
  18. Distinct values of particular field. For example different values for MSH_3
  19. Maximum number of fields received in given segment
  20. Total fields with values (excluding empty fields) in given segment

You can download python script to insert HL7 messages into MongoDB database from my github. I am working on other scripts
https://github.com/j4jayant/HL7-2-MongoDB-Analysis/blob/master/python/hl7-2-mongodb.py
https://github.com/j4jayant/HL7-2-MongoDB-Analysis/blob/master/python/readme.txt

I welcome any comments or criticism to help improve this.

No comments:

Post a Comment

Blockchain - Healthcare Implementation Use Case - Part 1

It's been a long time I am following Blockchain and my interest continued to increase and increased enough to dive into it. This ...