How to parse the Multiple OBR Segment in HL7 using HAPI - java

The following text is the HL7 message. I am able to parse many segments except the NTE segment. I am using HAPI to parse the HL7 messages. I am a newbie to HL7, so can anyone suggest the relevant classes in HAPI for parsing NTE segments? It would be better if the explanation came with a few examples.
MSH|^~\&|LCS|LCA|LIS|TEST9999|199807311532||ORU^R01|3629|P|2.2
PID|2|2161348462|20809880170|1614614|20809880170^TESTPAT||19760924|M|||^^^^
00000-0000|||||||86427531^^^03|SSN# HERE
ORC|NW|8642753100012^LIS|20809880170^LCS||||||19980727000000|||HAVILAND
OBR|1|8642753100012^LIS|20809880170^LCS|008342^UPPER RESPIRATORY
CULTURE^L|||19980727175800||||||SS#634748641 CH14885 SRC:THROA
SRC:PENI|19980727000000||||||20809880170||19980730041800||BN|F
OBX|1|ST|008342^UPPER RESPIRATORY CULTURE^L||FINALREPORT|||||N|F||| 19980729160500|BN
ORC|NW|8642753100012^LIS|20809880170^LCS||||||19980727000000|||HAVILAND
OBR|2|8642753100012^LIS|20809880170^LCS|997602^.^L|||19980727175800||||G|||
19980727000000||||||20809880170||19980730041800|||F|997602|||008342
OBX|2|CE|997231^RESULT 1^L||M415|||||N|F|||19980729160500|BN
NTE|1|L|MORAXELLA (BRANHAMELLA) CATARRHALIS
NTE|2|L| HEAVY GROWTH
NTE|3|L| BETA LACTAMASE POSITIVE
OBX|3|CE|997232^RESULT 2^L||MR105|||||N|F|||19980729160500|BN
NTE|1|L|ROUTINE RESPIRATORY FLORA
EDIT: Here I am supposed to parse multiple OBR segments. Can anybody please guide me?

It looks like the message you have is valid; the issue you are having is with the formatting of the sample. A couple of the lines were wrapped. If you rejoin them properly, the message can be parsed correctly.
In HL7 2.x, all new lines must start with a segment identifier (e.g. MSH, PID, OBX, ...). If the line does not start with one of these identifiers, then the parser will not know how to interpret that line or the remainder of the message.
If you are using HAPI and looking to test messages, I would recommend the HAPI TestPanel. It is an extremely easy-to-use tool that can help you validate a message and test message transmission.
Once the formatting was cleaned up, the message parsed cleanly in the TestPanel.
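As a rough illustration of that cleanup (my own sketch, not part of HAPI; the three-character-identifier heuristic is an assumption about your data, not an HL7 rule), you could rejoin wrapped lines before handing the text to the parser:

import java.util.regex.Pattern;

public class SegmentUnwrapper {

    // A line is treated as a new segment only if it starts with a three-character
    // segment identifier followed by the field separator; anything else is assumed
    // to be a wrapped continuation of the previous segment.
    private static final Pattern SEGMENT_START = Pattern.compile("^[A-Z0-9]{3}\\|.*");

    static String unwrap(String raw) {
        StringBuilder out = new StringBuilder();
        for (String line : raw.split("\\r?\\n")) {
            if (SEGMENT_START.matcher(line).matches()) {
                if (out.length() > 0) {
                    out.append('\r'); // HL7 v2 segments end with a carriage return
                }
                out.append(line);
            } else {
                out.append(line.trim()); // re-attach the wrapped fragment
            }
        }
        return out.toString();
    }
}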

I have solved the issue by adding an NTE loop to the handling of every other segment. Every segment can carry optional NTE segments, so I iterate over the NTEs of each segment. Now it is working fine.
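For anyone landing here, a minimal sketch of that approach using only HAPI's generic model API (the message string is abbreviated from the sample above; the recursive walk avoids depending on the version-specific ORU_R01 group classes):

import ca.uhn.hl7v2.HL7Exception;
import ca.uhn.hl7v2.model.Group;
import ca.uhn.hl7v2.model.Message;
import ca.uhn.hl7v2.model.Segment;
import ca.uhn.hl7v2.model.Structure;
import ca.uhn.hl7v2.parser.PipeParser;
import ca.uhn.hl7v2.util.Terser;

public class NteWalker {

    public static void main(String[] args) throws HL7Exception {
        String msg = "MSH|^~\\&|LCS|LCA|LIS|TEST9999|199807311532||ORU^R01|3629|P|2.2\r"
                + "PID|2|2161348462|20809880170|1614614|20809880170^TESTPAT\r"
                + "OBR|1|8642753100012^LIS|20809880170^LCS|008342^UPPER RESPIRATORY CULTURE^L\r"
                + "OBX|2|CE|997231^RESULT 1^L||M415|||||N|F|||19980729160500|BN\r"
                + "NTE|1|L|MORAXELLA (BRANHAMELLA) CATARRHALIS\r"
                + "NTE|2|L| HEAVY GROWTH\r";
        Message parsed = new PipeParser().parse(msg);
        printNtes(parsed);
    }

    // Recursively visit every group in the parsed structure; NTE segments can
    // appear under OBR as well as under each OBX, so a generic walk finds them all.
    static void printNtes(Group group) throws HL7Exception {
        for (String name : group.getNames()) {
            for (Structure s : group.getAll(name)) {
                if (s instanceof Group) {
                    printNtes((Group) s);
                } else if (name.startsWith("NTE")) {
                    // NTE-3 is the comment text: (segment, field, repetition, component, subcomponent)
                    System.out.println(Terser.get((Segment) s, 3, 0, 1, 1));
                }
            }
        }
    }
}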

Related

"Strict" Avro Parsing Mode (No dropping additional fields)

This is tangentially related to avro json additional field
The issue I have is that JSON Avro decoding allows additional fields on the root-level record while disallowing them on inner records, where they cause a parsing failure. In my current project we have a requirement that we cannot drop any data, which means I need to find a solution somehow.
See this code example https://gist.github.com/GrafBlutwurst/4d5c108b026b34ce83d2569bc8991b3d
(Avro 1.8.2)
Does anyone know if there's a "strict" mode for the Avro parser, or something similar? This ticket also seems somewhat related: https://issues.apache.org/jira/browse/AVRO-2034
Thanks!
EDIT: After more research, it seems there's an open PR to fix this, https://github.com/apache/avro/pull/321, but only for Ruby.
EDIT II: It most likely is a parser bug. It shows up not only in nested objects but also when the string contains several JSON objects and the first one contains additional fields. There's a drain method that is supposed to pop leftover tokens from the stack, but it doesn't seem to work, as the current parsing position is always 1 (top of the stack) when it is entered. As of yet I haven't figured out why.
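Since no strict flag seems to exist in Avro 1.8.2, one workaround (my own sketch, not Avro API; it assumes Jackson's com.fasterxml ObjectMapper is on the classpath and only checks the top level) is to reject undeclared fields before decoding:

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.avro.Schema;

public class StrictJsonCheck {

    // Rejects a JSON record that carries a field its Avro record schema doesn't declare.
    static void assertNoExtraFields(Schema schema, String json) throws Exception {
        Set<String> declared = new HashSet<>();
        for (Schema.Field field : schema.getFields()) {
            declared.add(field.name());
        }
        JsonNode root = new ObjectMapper().readTree(json);
        for (Iterator<String> it = root.fieldNames(); it.hasNext(); ) {
            String name = it.next();
            if (!declared.contains(name)) {
                throw new IllegalArgumentException("Undeclared field: " + name);
            }
        }
    }
}

Nested records would need the same check applied recursively, descending into each sub-record's schema via schema.getField(name).schema().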

deeplearning4j generate response to input

I have recently been trying to learn DL4J but have run into some issues. They have an example of a neural network generating Shakespeare-like text based off an input character, but I can't seem to find anything that would indicate a possible way of generating a response to an input statement.
I would like to use an input string such as "Hello" and have it generate a response of varying length depending on the input. I would like to know if this is possible using an LSTM, and to get a pointer in the right direction, as I have no idea where to even start.
We have plenty of documentation on this, actually. This gives you a layout of what an RNN looks like:
http://deeplearning4j.org/usingrnns
The model you are describing is character-level; in general, what you want is question answering, though. You may want to look at an architecture like this: https://cs.umd.edu/~miyyer/pubs/2014_qb_rnn.pdf
If you are completely new to NLP, I would look at this class:
https://www.youtube.com/playlist?list=PLhVhwi0Pz282aSA2uZX4jR3SkF3BKyMOK
It covers question answering as well.
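For orientation, here is a minimal sketch of the kind of character-level network the Shakespeare example is built around (the layer sizes, seed, and loss/activation choices are my own arbitrary picks, not taken from the DL4J docs):

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CharRnnSketch {

    // vocabSize = number of distinct characters; input and output are one-hot vectors.
    public static MultiLayerNetwork build(int vocabSize) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(42)
                .list()
                // Hidden recurrent layer: carries context from earlier characters.
                .layer(0, new GravesLSTM.Builder()
                        .nIn(vocabSize).nOut(200)
                        .activation(Activation.TANH)
                        .build())
                // Output: a softmax distribution over the next character at each step.
                .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(200).nOut(vocabSize)
                        .activation(Activation.SOFTMAX)
                        .build())
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        return net;
    }
}

Turning this into question answering means encoding a question and decoding a variable-length answer, which is why the architectures in the links above matter more than the network config itself.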

Defining a manual Split algorithm for File Input

I'm new to Spark and the Hadoop ecosystem and have already fallen in love with it.
Right now, I'm trying to port an existing Java application over to Spark.
This Java application is structured the following way:
Read the files one by one with a BufferedReader and a custom Parser class that does some heavy computing on the input data. The input files are 1 GB up to at most 2.5 GB each.
Store the data in memory (in a HashMap<String, TreeMap<DateTime, List<DataObjectInterface>>>).
Write out the in-memory data store as JSON. The JSON files are smaller in size.
I wrote a Scala application that processes my files with a single worker, but that obviously doesn't get the most performance out of Spark.
Now to my problem with porting this over to Spark:
The input files are line-based. I usually have one message per line. However, some messages depend on preceding lines to form an actual valid message in the Parser. For example, it could happen that I get data in the following order in an input file:
{timestamp}#0x033#{data_bytes} \n
{timestamp}#0x034#{data_bytes} \n
{timestamp}#0x035#{data_bytes} \n
{timestamp}#0x0FE#{data_bytes}\n
{timestamp}#0x036#{data_bytes} \n
To form an actual message out of the "composition message" 0x036, the parser also needs the lines from messages 0x033, 0x034 and 0x035. Other messages can also arrive in between this set of needed messages. Most messages, though, can be parsed by reading a single line.
Now finally my question:
How do I get Spark to split my file correctly for my purposes? The files cannot be split "randomly"; they must be split in a way that makes sure all my messages can be parsed, and the parser will not wait for input that it will never get. This means that each composition message (messages that depend on preceding lines) needs to be in one split.
I guess there are several ways to achieve a correct output but I'll throw some ideas that I had into this post as well:
Define a manual split algorithm for the file input? This would check that the last few lines of a split do not contain the start of a "big" message [0x033, 0x034, 0x035].
Split the file however Spark wants, but also add a fixed number of lines (let's say 50, which will do the job for sure) from the previous split to the next split. Duplicate data will be handled correctly by the Parser class and would not introduce any issues.
The second way might be easier; however, I have no clue how to implement it in Spark. Can someone point me in the right direction?
Thanks in advance!
I saw your comment on my blogpost on http://blog.ae.be/ingesting-data-spark-using-custom-hadoop-fileinputformat/ and decided to give my input here.
First of all, I'm not entirely sure what you're trying to do. Help me out here: your file contains lines with the 0x033, 0x034, 0x035 and 0x036 messages, so Spark will process them separately, while these lines actually need to be processed together?
If this is the case, you shouldn't interpret this as a "corrupt split". As you can read in the blog post, Spark splits files into records that it can process separately. By default it does this by splitting records on newlines. In your case, however, your "record" is actually spread over multiple lines. So yes, you can use a custom FileInputFormat. I'm not sure this will be the easiest solution, however.
You can try to solve this using a custom FileInputFormat that does the following: instead of giving out line by line like the default FileInputFormat does, you parse the file and keep track of encountered records (0x033, 0x034, etc.). In the meanwhile you may filter out records like 0x0FE (not sure if you want to use them elsewhere). The result is that Spark gets all these physical records as one logical record.
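The simplest variant of that idea (a sketch of my own, not from the blog post) is to keep Hadoop from splitting the files at all, so each 1-2.5 GB file is handled by a single task and the parser always sees every preceding line it needs:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// TextInputFormat still does the line reading; we only veto the splitting,
// so one file = one split = one Spark partition.
public class NonSplittableTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;
    }
}

You would read it with sc.newAPIHadoopFile(path, NonSplittableTextInputFormat.class, LongWritable.class, Text.class, sc.hadoopConfiguration()). The trade-off is that you lose parallelism inside a single file, so this only pays off when there are at least as many files as cores.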
On the other hand, it might be easier to read the file line by line and map the records using a functional key (e.g. [object 33, 0x033], [object 33, 0x034], ...). This way you can combine these lines using the key you chose.
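A sketch of that second approach in Spark's Java API (compositionKey is a hypothetical helper; you would have to decide what ties 0x033 through 0x036 together in your data):

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class GroupByCompositionKey {

    // Hypothetical helper: derive the "functional key" from a line such as
    // "{timestamp}#0x033#{data_bytes}". A real implementation would map
    // 0x033..0x036 (and nothing else) onto one shared key per message group.
    static String compositionKey(String line) {
        return line.split("#", 3)[1];
    }

    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "composition-demo");
        JavaPairRDD<String, Iterable<String>> grouped = sc
                .textFile("input.log")
                .mapToPair(line -> new Tuple2<>(compositionKey(line), line))
                .groupByKey(); // all lines belonging to one logical message end up together
        grouped.foreach(t -> System.out.println(t._1() + " -> " + t._2()));
        sc.close();
    }
}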
There are certainly other options. Whichever you choose depends on your use case.

ROME : Unable to get full description of post for a feed

I am parsing the feed from http://feeds.feedburner.com/Commercial_LCD_Monitors, but when getting the description of each post I only get a few lines before the text is truncated with a trailing [...]. For example:
Stand out from the crowds with a higher level of professionalism with the L305 mobile data projector. The 3000:1 ANSI lumens and advanced 3-chip LCD technology delivers images that are of the highest quality, realistic and sharp. Colours are not only [...]
Can anyone explain what is the issue and possible resolution if any?
Thanks in advance.
Regards,
Amit
The feed only provides a truncated description. You'll either have to find another feed with a longer description or build something to retrieve the content from each item.
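A sketch of that second option (this assumes the modern com.rometools ROME coordinates; older ROME used com.sun.syndication, and real code would extract the article body with an HTML parser such as jsoup rather than just downloading the page):

import java.net.URL;
import java.util.Scanner;

import com.rometools.rome.feed.synd.SyndEntry;
import com.rometools.rome.feed.synd.SyndFeed;
import com.rometools.rome.io.SyndFeedInput;
import com.rometools.rome.io.XmlReader;

public class FullContentFetcher {

    public static void main(String[] args) throws Exception {
        URL feedUrl = new URL("http://feeds.feedburner.com/Commercial_LCD_Monitors");
        SyndFeed feed = new SyndFeedInput().build(new XmlReader(feedUrl));

        for (SyndEntry entry : feed.getEntries()) {
            System.out.println(entry.getTitle());
            System.out.println(entry.getDescription().getValue()); // the truncated teaser

            // Follow the item link to retrieve the full page the feed points at.
            try (Scanner s = new Scanner(new URL(entry.getLink()).openStream(), "UTF-8")) {
                String html = s.useDelimiter("\\A").next();
                System.out.println("fetched " + html.length() + " chars of HTML");
            }
        }
    }
}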

Rewriting Binary Streams using Java

I have been studying Netty and Mina but am confused as to the best way to rewrite binary streams. For example, I would like to create a proxy that replaces snippets of XML and forwards the result along.
Examples appreciated.
I think you're thinking at too low a level. XML is not so much "binary" as it is an abstraction on top of binary. If you want to replace snippets of XML as they come across your wire, you'll have to poke into the payload portion of the packets and look for patterns of XML. A simple way is to use a regular expression after temporarily rebuilding the bytes into content.
Once you have this search and you have matched what you want, you can replace what you want to replace and re-send.
The hard part is that you will likely need to cache some input before it leaves your machine so that you are able to find the beginning and end of what you are searching for. What makes this difficult is that often you don't know what constitutes the "beginning" and the "end" of a data payload.
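A minimal sketch of that buffering-and-rewriting step with Netty 4 (the </envelope> framing rule and the <status> rewrite are made-up examples of mine; a real proxy would forward to an outbound channel rather than writing back to the source):

import java.nio.charset.StandardCharsets;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

public class XmlRewriteHandler extends ChannelInboundHandlerAdapter {

    private static final String END_MARKER = "</envelope>"; // assumed end of one document
    private final StringBuilder buffered = new StringBuilder();

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        ByteBuf in = (ByteBuf) msg;
        buffered.append(in.toString(StandardCharsets.UTF_8)); // cache until we see a full document
        in.release();

        int end = buffered.indexOf(END_MARKER);
        while (end >= 0) {
            String doc = buffered.substring(0, end + END_MARKER.length());
            buffered.delete(0, end + END_MARKER.length());
            // The actual rewrite: a regex over the reassembled content.
            String rewritten = doc.replaceAll("<status>.*?</status>", "<status>PROXIED</status>");
            ctx.writeAndFlush(Unpooled.copiedBuffer(rewritten, StandardCharsets.UTF_8));
            end = buffered.indexOf(END_MARKER);
        }
    }
}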
