HL7 parsing to get ORC-2 - java

I am having trouble reading the ORC-2 field from an ORM^O01 order message. I am using HapiStructures-v23-1.2.jar to read it, but the getFillerOrderNumber() method is returning a null value:
MSH|^~\\&|recAPP|20010|BIBB|HCL|20110923192607||ORM^O01|11D900220|D|2.3|1\r
PID|1|11D900220|11D900220||TEST^FOURTYONE||19980808|M|||\r
ZRQ|1|11D900220||CHARTMAXX TESTING ACCOUNT 2|||||||||||||||||Y\r
ORC|NW|11D900220||||||||||66662^NOT INDICATED^X^^^^^^^^^^U|||||||||CHARTMAXX TESTING ACCOUNT 2|^695 S.BROADWAY^DENVER^CO^80209\r
OBR|1|11D900220||66^BHL, 9P21 GENOTYPE^L|NORMAL||20110920001800|||NOTAVAILABLE|N||Y|||66662^NOT INDICATED^X^^^^^^^^^^U\r
I want to parse this message, read the ORC-2 field, and save it to the database:
public static String getOrderNumber() {
    Message hapiMsg = null;
    Parser p = new GenericParser();
    p.setValidationContext(null);
    try {
        hapiMsg = p.parse(hl7Message);
    } catch (Exception e) {
        logger.error(e);
    }
    Terser terser = new Terser(hapiMsg);
    String fn = null;
    try {
        ORM_O01 getOrc = (ORM_O01)hapiMsg;
        ORC orc = new ORC(getOrc, null);
        fn = orc.getFillerOrderNumber().toString();
    } catch (Exception e) {
        logger.error(e);
    }
    return fn;
}
I read in some posts that I have to ladder through to reach the ORC, OBR and NTE segments. Can someone show me how to do this with a piece of code? Thanks in advance.

First I have to point out that ORC-2 is the Placer Order Number and ORC-3 is the Filler Order Number, not the other way round. So, what you might want to do is this:
ORM_O01 msg = ...
ORC orc = msg.getORDER().getORC();
String placerOrderNumber =
orc.getPlacerOrderNumber().getEntityIdentifier().getValue();
String fillerOrderNumber =
orc.getFillerOrderNumber().getEntityIdentifier().getValue();
I would suggest you read the HAPI documentation yourself: http://hl7api.sourceforge.net/v23/apidocs/index.html

Based on this code:
ORM_O01 getOrc = (ORM_O01)hapiMsg;
ORC orc = new ORC(getOrc, null);
fn = orc.getFillerOrderNumber().toString();
It looks like you are creating a new ORC rather than pulling out the existing one from the message. I unfortunately can't provide the exact code as I'm only familiar with HL7, not HAPI.
EDIT: It looks like you may be able to do ORC orc = getOrc.getORDER().getORC();
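Putting the two answers together, a minimal corrected version of the method might look like the sketch below (assuming the HAPI v2.3 message classes and the same logger field as in the question; remember that ORC-2 is the Placer Order Number):
public static String getPlacerOrderNumber(String hl7Message) {
    try {
        Parser p = new GenericParser();
        // Parse into the v2.3 ORM_O01 structure and navigate to the existing ORC segment
        ORM_O01 msg = (ORM_O01) p.parse(hl7Message);
        ORC orc = msg.getORDER().getORC();
        // ORC-2 (Placer Order Number); use getFillerOrderNumber() for ORC-3
        return orc.getPlacerOrderNumber().getEntityIdentifier().getValue();
    } catch (Exception e) {
        logger.error(e);
        return null;
    }
}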

Related

how to write parquet files in java with apache arrow

I am trying to write data into Apache Parquet from Java. So far, what I've done is use Apache Arrow via the examples here: https://arrow.apache.org/cookbook/java/schema.html#creating-fields and create an Arrow-format dataset.
The question is: how do I write it into Parquet after that? Also, do I need to use Apache Arrow to output the data as a Parquet file, or can I use Apache Parquet directly to serialize the data and then output it as a Parquet file?
What I've done:
try (BufferAllocator allocator = new RootAllocator()) {
Field name = new Field("name", FieldType.nullable(new ArrowType.Utf8()), null);
Field age = new Field("age", FieldType.nullable(new ArrowType.Int(32, true)), null);
Schema schemaPerson = new Schema(asList(name, age));
try(
VectorSchemaRoot vectorSchemaRoot = VectorSchemaRoot.create(schemaPerson, allocator)
){
VarCharVector nameVector = (VarCharVector) vectorSchemaRoot.getVector("name");
nameVector.allocateNew(3);
nameVector.set(0, "David".getBytes());
nameVector.set(1, "Gladis".getBytes());
nameVector.set(2, "Juan".getBytes());
IntVector ageVector = (IntVector) vectorSchemaRoot.getVector("age");
ageVector.allocateNew(3);
ageVector.set(0, 10);
ageVector.set(1, 20);
ageVector.set(2, 30);
vectorSchemaRoot.setRowCount(3);
File file = new File("randon_access_to_file.arrow");
try (
FileOutputStream fileOutputStream = new FileOutputStream(file);
ArrowFileWriter writer = new ArrowFileWriter(vectorSchemaRoot, null, fileOutputStream.getChannel())
) {
writer.start();
writer.writeBatch();
writer.end();
System.out.println("Record batches written: " + writer.getRecordBlocks().size() + ". Number of rows written: " + vectorSchemaRoot.getRowCount());
} catch (IOException e) {
e.printStackTrace();
}
}
}
But this outputs an Arrow file, not a Parquet file. Any ideas how I can output this to a Parquet file instead? And do I need Arrow to generate a Parquet file to begin with, or can I just use Parquet directly?
Arrow Java does not yet support writing to Parquet files, but you can use Parquet to do that.
There is some code in the Arrow dataset test classes that may help. See
org.apache.arrow.dataset.ParquetWriteSupport;
org.apache.arrow.dataset.file.TestFileSystemDataset;
The second class has some tests that use the utilities in the first one.
You can find them on GitHub here:
https://github.com/apache/arrow/tree/master/java/dataset/src/test/java/org/apache/arrow/dataset
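If you go the plain Parquet route, one common option is the parquet-avro module: build Avro records that mirror your schema and write them with AvroParquetWriter. A minimal sketch under that assumption (parquet-avro and its Hadoop dependencies on the classpath; class and file names are placeholders):
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class ParquetWriteSketch {
    public static void main(String[] args) throws Exception {
        // Avro schema mirroring the Arrow schema from the question: name (string), age (int)
        Schema schema = SchemaBuilder.record("Person").fields()
                .requiredString("name")
                .requiredInt("age")
                .endRecord();

        try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
                .<GenericRecord>builder(new Path("people.parquet"))
                .withSchema(schema)
                .build()) {
            GenericRecord person = new GenericData.Record(schema);
            person.put("name", "David");
            person.put("age", 10);
            writer.write(person);
        }
    }
}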

Creating test data from Confluent Control Center JSON representation

I'm trying to write some unit tests for Kafka Streams and have a number of quite complex schemas that I need to incorporate into my tests.
Instead of just creating objects from scratch each time, I would ideally like to instantiate them using some real data and perform tests on that. We use Confluent with records in Avro format, and can extract both the schema and a text JSON-like representation from the Control Center application. The JSON is valid JSON, but it's not really in the form you'd write if you were hand-writing JSON representations of the data, so I assume it's some text representation of the underlying Avro.
I've already used the schema to create a Java SpecificRecord class (price_assessment) and would like to use the JSON string copied from the Control Center message to populate a new instance of that class to feed into my unit test InputTopic.
The code I've tried so far is
var testAvroString = "{JSON copied from Control Center topic}";
Schema schema = price_assessment.getClassSchema();
DecoderFactory decoderFactory = new DecoderFactory();
Decoder decoder = null;
try {
DatumReader<price_assessment> reader = new SpecificDatumReader<price_assessment>();
decoder = decoderFactory.get().jsonDecoder(schema, testAvroString);
return reader.read(null, decoder);
} catch (Exception e)
{
return null;
}
which is adapted from another SO answer that was using GenericRecords. When I try running this though I get the exception Cannot invoke "org.apache.avro.Schema.equals(Object)" because "writer" is null on the reader.read(...) step.
I'm not massively familiar with Streams testing or Java, and I'm not sure what exactly I've done wrong. This is written in Java 17 with Kafka Streams 3.1.0, though I'm flexible on versions.
The solution that I've managed to come up with is the following, which seems to work:
private static <T> T avroStringToInstance(Schema classSchema, String testAvroString) {
DecoderFactory decoderFactory = new DecoderFactory();
GenericRecord genericRecord = null;
try {
Decoder decoder = decoderFactory.jsonDecoder(classSchema, testAvroString);
DatumReader<GenericData.Record> reader =
new GenericDatumReader<>(classSchema);
genericRecord = reader.read(null, decoder);
} catch (Exception e)
{
return null;
}
var specific = (T) SpecificData.get().deepCopy(genericRecord.getSchema(), genericRecord);
return specific;
}
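For the price_assessment class from the question, the helper would then be called along these lines (a sketch; the JSON string stays as copied from Control Center, and the inputTopic field is an assumption for a TestInputTopic set up elsewhere in the test):
String testAvroString = "{JSON copied from Control Center topic}";
price_assessment record = avroStringToInstance(price_assessment.getClassSchema(), testAvroString);
// feed the populated SpecificRecord into the Streams test topology
inputTopic.pipeInput(record);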

Spark Save as Text File grouped by Key

I would like to save an RDD to text files grouped by key. Currently I can't figure out how to split the output into multiple files: all output spanning multiple keys that share the same partition gets written to the same file. I would like to have different files for each key. Here's my code snippet:
JavaPairRDD<String, Iterable<Customer>> groupedResults = customerCityPairRDD.groupByKey();
groupedResults.flatMap(x -> x._2().iterator())
.saveAsTextFile(outputPath + "/cityCounts");
This can be achieved by using foreachPartition to save each partition into a separate file.
You can develop your code as follows:
groupedResults.foreachPartition(new VoidFunction<Iterator<Tuple2<String, Iterable<Customer>>>>() {
    @Override
    public void call(Iterator<Tuple2<String, Iterable<Customer>>> rec) throws Exception {
        FSDataOutputStream fsOutputStream = null;
        BufferedWriter writer = null;
        try {
            fsOutputStream = FileSystem.get(new Configuration()).create(new Path("path1"));
            writer = new BufferedWriter(new OutputStreamWriter(fsOutputStream));
            while (rec.hasNext()) {
                Tuple2<String, Iterable<Customer>> entry = rec.next();
                for (Customer cust : entry._2()) {
                    writer.write(cust.toString());
                    writer.newLine();
                }
            }
        } catch (Exception exp) {
            exp.printStackTrace();
            // Handle exception
        } finally {
            // close writer (this also flushes and closes the underlying stream)
            if (writer != null) {
                writer.close();
            }
        }
    }
});
Hope this helps.
Ravi
So I figured out how to solve this: convert the RDD to a DataFrame and then just partition by key during the write.
Dataset<Row> dataFrame = spark.createDataFrame(customerRDD, Customer.class);
dataFrame.write()
.partitionBy("city")
.text("cityCounts"); // write as text file at file path cityCounts

convert rdfxml into turtle triples

I've written a Java program that ingests data from a .csv file and converts those data into RDF/XML. I used the Sesame framework when writing this program, and the program successfully does what it was written to do.
However, I am trying to unit test this program using JUnit, and I need to test a method which converts RDF triples (in Turtle format) to RDF/XML. To show that the method works correctly, I would like to convert the RDF/XML back into triples and compare them to the original triples I passed into the method. So far, I have not found anything in Sesame's documentation that does this. Any suggestions?
I just solved the problem a few minutes ago. Here's my solution:
@Test
public void testWriteStmtToRDFPos(){
RDFParser parser = new RDFXMLParser();
String baseURI = "";
Model origStmts = new LinkedHashModel();
Model processedStmts = new LinkedHashModel();
StatementCollector collector = new StatementCollector(processedStmts);
parser.setRDFHandler(collector);
origStmts.add(sexOffend,predicate,object);
try{
converter.writeStmtToRDF(origStmts, rdfFile);
FileReader reader = new FileReader(rdfFile);
parser.parse(reader, baseURI);
// fails if the round-tripped statements differ from the originals
assertEquals(origStmts, processedStmts);
}catch(FileNotFoundException e){
e.printStackTrace();
fail();
}catch(Exception e){
e.printStackTrace();
fail();
}
}
When you set the collector for the parser above, it simply collects any statements that the parser ingests. After doing this, you can compare the collected statements (processedStmts) with origStmts. This wasn't immediately obvious, but it's really useful once you find it!
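The converter.writeStmtToRDF method isn't shown in the test; for context, a minimal sketch of what such a method could look like using Sesame's Rio utilities (the method name and signature here are assumptions chosen to match the test above):
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import org.openrdf.model.Model;
import org.openrdf.rio.RDFFormat;
import org.openrdf.rio.Rio;

public void writeStmtToRDF(Model statements, File rdfFile) throws Exception {
    // Serialize the in-memory statements as RDF/XML; Rio picks the writer for the given format
    try (OutputStream out = new FileOutputStream(rdfFile)) {
        Rio.write(statements, out, RDFFormat.RDFXML);
    }
}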

Java: CSV file read & write

I'm reading 2 csv files: store_inventory & new_acquisitions.
I want to be able to compare the store_inventory csv file with new_acquisitions.
1) If the item names match just update the quantity in store_inventory.
2) If new_acquisitions has a new item that does not exist in store_inventory, then add it to the store_inventory.
Here is what I have done so far, but it's not very good. I added comments where I need to add tasks 1 & 2.
Any advice or code to do the above tasks would be great! Thanks.
File new_acq = new File("/src/test/new_acquisitions.csv");
Scanner acq_scan = null;
try {
acq_scan = new Scanner(new_acq);
} catch (FileNotFoundException ex) {
Logger.getLogger(mainpage.class.getName()).log(Level.SEVERE, null, ex);
}
String itemName;
int quantity;
Double cost;
Double price;
File store_inv = new File("/src/test/store_inventory.csv");
Scanner invscan = null;
try {
invscan = new Scanner(store_inv);
} catch (FileNotFoundException ex) {
Logger.getLogger(mainpage.class.getName()).log(Level.SEVERE, null, ex);
}
String itemNameInv;
int quantityInv;
Double costInv;
Double priceInv;
while (acq_scan.hasNext()) {
String line = acq_scan.nextLine();
if (line.charAt(0) == '#') {
continue;
}
String[] split = line.split(",");
itemName = split[0];
quantity = Integer.parseInt(split[1]);
cost = Double.parseDouble(split[2]);
price = Double.parseDouble(split[3]);
while(invscan.hasNext()) {
String line2 = invscan.nextLine();
if (line2.charAt(0) == '#') {
continue;
}
String[] split2 = line2.split(",");
itemNameInv = split2[0];
quantityInv = Integer.parseInt(split2[1]);
costInv = Double.parseDouble(split2[2]);
priceInv = Double.parseDouble(split2[3]);
if(itemName == itemNameInv) {
//update quantity
}
}
//add new entry into csv file
}
Thanks again for any help. =]
Suggest you use one of the existing CSV parsers, such as Commons CSV or Super CSV, instead of reinventing the wheel. It should make your life a lot easier.
Your implementation makes the common mistake of breaking the line on commas by using line.split(","). This does not work because the values themselves might have commas in them. If that happens, the value must be quoted, and you need to ignore commas within the quotes. The split method cannot do this; I see this mistake a lot.
Here is the source of an implementation that does it correctly:
http://agiletribe.purplehillsbooks.com/2012/11/23/the-only-class-you-need-for-csv-files/
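For illustration, a rough sketch of reading one of the question's files with Commons CSV, which handles quoted values containing commas (the column names passed to withHeader are placeholders for the file's actual columns, and '#' lines are treated as comments as in the sample files):
import java.io.FileReader;
import java.io.Reader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

Reader in = new FileReader("/src/test/store_inventory.csv");
// DEFAULT understands quoted fields, so embedded commas are parsed correctly
CSVFormat format = CSVFormat.DEFAULT
        .withCommentMarker('#')
        .withHeader("item", "quantity", "cost", "price");
try (CSVParser parser = new CSVParser(in, format)) {
    for (CSVRecord record : parser) {
        String itemName = record.get("item");
        int quantity = Integer.parseInt(record.get("quantity"));
        double cost = Double.parseDouble(record.get("cost"));
        double price = Double.parseDouble(record.get("price"));
        // update or insert into the inventory structure here
    }
}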
With the help of the open source library uniVocity-parsers, you could write pretty clean code like the following:
private void processInventory() throws IOException {
/**
* ---------------------------------------------
* Read CSV rows into list of beans you defined
* ---------------------------------------------
*/
// 1st, config the CSV reader with row processor attaching the bean definition
CsvParserSettings settings = new CsvParserSettings();
settings.getFormat().setLineSeparator("\n");
BeanListProcessor<Inventory> rowProcessor = new BeanListProcessor<Inventory>(Inventory.class);
settings.setRowProcessor(rowProcessor);
settings.setHeaderExtractionEnabled(true);
// 2nd, parse all rows from the CSV file into the list of beans you defined
CsvParser parser = new CsvParser(settings);
parser.parse(new FileReader("/src/test/store_inventory.csv"));
List<Inventory> storeInvList = rowProcessor.getBeans();
parser.parse(new FileReader("/src/test/new_acquisitions.csv"));
List<Inventory> newAcqList = rowProcessor.getBeans();
// 3rd, process the beans with business logic
// (iterate the store list afresh for every acquisition so it is never exhausted)
for (Inventory newAcq : newAcqList) {
boolean isItemIncluded = false;
for (Inventory storeInv : storeInvList) {
// 1) If the item names match, just update the quantity in store_inventory
if (storeInv.getItemName().equalsIgnoreCase(newAcq.getItemName())) {
storeInv.setQuantity(newAcq.getQuantity());
isItemIncluded = true;
}
}
// 2) If new_acquisitions has a new item that does not exist in store_inventory,
// then add it to the store_inventory.
if (!isItemIncluded) {
storeInvList.add(newAcq);
}
}
}
Just follow this code sample I worked out according to your requirements. Note that the library provides a simplified API and significant performance for parsing CSV files.
The operation you are performing requires that, for each item in your new acquisitions, you search every item in inventory for a match. This is not only inefficient, but the scanner that you have set up for your inventory file would need to be reset after each item.
I would suggest that you add your new acquisitions and your inventory to collections, then iterate over your new acquisitions and look up each new item in your inventory collection. If the item exists, update it. If it doesn't, add it to the inventory collection. For this, it might be good to write a simple class to represent an inventory item; it could be used for both the new acquisitions and the inventory. For a fast lookup, I would suggest a HashSet or HashMap for your inventory collection, as sketched below.
At the end of the process, don't forget to persist the changes to your inventory file.
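A rough sketch of that idea, assuming a simple InventoryItem bean (with getName(), getQuantity(), setQuantity()) and that the two files have already been parsed into lists storeItems and newItems with one of the CSV libraries mentioned above:
import java.util.LinkedHashMap;
import java.util.Map;

// Key the inventory by item name for constant-time lookups
Map<String, InventoryItem> inventory = new LinkedHashMap<String, InventoryItem>();
for (InventoryItem item : storeItems) {              // storeItems: beans parsed from store_inventory.csv
    inventory.put(item.getName(), item);
}
for (InventoryItem acquired : newItems) {            // newItems: beans parsed from new_acquisitions.csv
    InventoryItem existing = inventory.get(acquired.getName());
    if (existing != null) {
        // 1) item names match: update the quantity in the store inventory
        existing.setQuantity(existing.getQuantity() + acquired.getQuantity());
    } else {
        // 2) new item that does not exist yet: add it to the inventory
        inventory.put(acquired.getName(), acquired);
    }
}
// finally, write inventory.values() back out to store_inventory.csv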
As Java doesn't support parsing of CSV files natively, we have to rely on a third party library. Opencsv is one of the best libraries available for this purpose. It's open source and is shipped with an Apache 2.0 licence, which makes it possible for commercial use.
Here, this link should help you and others in similar situations!
For writing to CSV (note that the snippet below uses Apache Commons CSV's CSVPrinter):
public void writeCSV() {
    // Delimiter used in CSV file
    final String NEW_LINE_SEPARATOR = "\n";
    // CSV file header
    final Object[] FILE_HEADER = { "Employee Name", "Employee Code", "In Time", "Out Time", "Duration", "Is Working Day" };
    String fileName = "fileName.csv";
    // Replace YourBean with your own type that exposes the values to write
    List<YourBean> objects = new ArrayList<YourBean>();
    FileWriter fileWriter = null;
    CSVPrinter csvFilePrinter = null;
    // Create the CSVFormat object with "\n" as a record delimiter
    CSVFormat csvFileFormat = CSVFormat.DEFAULT.withRecordSeparator(NEW_LINE_SEPARATOR);
    try {
        fileWriter = new FileWriter(fileName);
        csvFilePrinter = new CSVPrinter(fileWriter, csvFileFormat);
        csvFilePrinter.printRecord(FILE_HEADER);
        // Write each bean in the list as one CSV record
        for (YourBean object : objects) {
            List<String> record = new ArrayList<String>();
            record.add(object.getValue1().toString());
            record.add(object.getValue2().toString());
            record.add(object.getValue3().toString());
            csvFilePrinter.printRecord(record);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            if (fileWriter != null) {
                fileWriter.flush();
                fileWriter.close();
            }
            if (csvFilePrinter != null) {
                csvFilePrinter.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
You can use Apache Commons CSV api.
FYI, see this answer: https://stackoverflow.com/a/42198895/6549532
Read / Write Example
