How to read a Parquet file in standalone Java code? [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
The Parquet docs from Cloudera show examples of integration with Pig/Hive/Impala, but in many cases I want to read the Parquet file itself for debugging purposes.
Is there a straightforward Java reader API for reading a Parquet file?
Thanks
Yang

Old method (deprecated):
// file is an org.apache.hadoop.fs.Path pointing at the Parquet file
AvroParquetReader<GenericRecord> reader = new AvroParquetReader<GenericRecord>(file);
GenericRecord nextRecord = reader.read();
New method:
ParquetReader<GenericRecord> reader = AvroParquetReader.<GenericRecord>builder(file).build();
// read() returns null once there are no more records
GenericRecord nextRecord = reader.read();
I got this from here and have used this in my test cases successfully.

You can use AvroParquetReader from the parquet-avro library to read a Parquet file as a set of Avro GenericRecord objects.

Related

Parse a pdf containing tabular data and obtain a list of key value pairs in java? [closed]

Closed 7 days ago.
I want to parse a PDF that contains a table, but I couldn't find any Java library that reads a PDF in the format in which it appears.
Basically, the existing libraries read a table as a plain string without preserving the layout. So, what is the best way to parse it and get a list of key-value pairs in Java?
Pdf screenshot sample

CSV with JSON columns [closed]

Closed 7 years ago.
I intend to store a few columns of my dataset as JSON, for grouping purposes, while preserving the general CSV format (or whatever delimiter is used).
Is there a Java library that can parse, read, and write CSVs whose columns contain JSON?
I don't know of any library that can do this. However, it is simple enough to use your go-to CSV library, e.g. Commons CSV, and run the value string through Jackson or Gson.
However, I would recommend using JSON Lines as a substitute for CSV. It can be parsed out of the box with Jackson or Gson. You would end up with CSV-like files like this:
["Name", "Session", "Score", "Completed", "some object"]
["Gilbert", "2013", 24, true, {"foo": "bar"}]

How to create Data file for storing input data such as employee info [closed]

Closed 7 years ago.
Guys, I saw this code while browsing the net. According to the page, it creates a data file for storing input such as employee information or registered accounts. Can anyone please give me an example of this?
backupRecords();
outputstream = new DataOutputStream(new FileOutputStream("data.dat"));
reWriteRecords(outputstream);
inputstream = new DataInputStream(new FileInputStream("temp.dat"));
I am writing this answer assuming that you wanted to know what the contents of the input file look like.
The easy answer would be something like this:
data.txt:
Yuri;Gagarin;Russia;52;Male
Booch;Gary;USA;40;Male
Randy;Orton;USA;52;Male
Anna;Tereshkova;Russia;40;Female
Maria;Sharapova;Russia;32;Female
Notice the delimiter ';'. After opening the file for reading, the Java code reads its contents on the assumption that each occurrence of the delimiter ';' separates one piece of information from the next: FirstName, LastName, Country, Age, or Gender.
A better solution for reading data is XML.
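As a minimal sketch of the delimiter approach described above, the following splits each line on ';' and maps the five pieces to fields. The class names Person and DelimitedPersonParser, and the field order FirstName;LastName;Country;Age;Gender, are assumptions made for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class for one line of data.txt; the field names are assumptions.
final class Person {
    final String firstName;
    final String lastName;
    final String country;
    final int age;
    final String gender;

    Person(String firstName, String lastName, String country, int age, String gender) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.country = country;
        this.age = age;
        this.gender = gender;
    }
}

// Hypothetical parser for semicolon-delimited records.
final class DelimitedPersonParser {
    private static final String DELIMITER = ";";

    // Splits one line on ';' and expects exactly five fields.
    static Person parseLine(String line) {
        String[] parts = line.split(DELIMITER, -1); // -1 keeps trailing empty fields
        if (parts.length != 5) {
            throw new IllegalArgumentException(
                    "Expected 5 fields but got " + parts.length + ": " + line);
        }
        return new Person(
                parts[0].trim(),
                parts[1].trim(),
                parts[2].trim(),
                Integer.parseInt(parts[3].trim()),
                parts[4].trim());
    }

    // Parses every non-blank line; blank lines are skipped.
    static List<Person> parseAll(List<String> lines) {
        List<Person> people = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                people.add(parseLine(line));
            }
        }
        return people;
    }
}
```

To read an actual file, the lines could be loaded with Files.readAllLines(Paths.get("data.txt")) and passed to parseAll.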
data.xml:
<PersonInfo>
<FirstName>Yuri</FirstName>
<LastName>Gagarin</LastName>
<Country>Russia</Country>
<Age>52</Age>
<Gender>Male</Gender>
....
</PersonInfo>
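One way to read such a file is the DOM parser that ships with the JDK. This is only a sketch: the class name PersonXmlReader and its two helper methods are hypothetical, and it assumes each field appears as a child element of the root, as in the sample above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper that reads fields out of a <PersonInfo> document.
final class PersonXmlReader {

    // Parses the XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the trimmed text of the first element with the given tag, or null if absent.
    static String childText(Element parent, String tagName) {
        Node node = parent.getElementsByTagName(tagName).item(0);
        return node == null ? null : node.getTextContent().trim();
    }
}
```

For example, childText(parseRoot(xml), "FirstName") would give the first name, and Integer.parseInt on the "Age" text would give the age as a number.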

Transfer text file to csv format Java [closed]

Closed 7 years ago.
I have a "|"-separated data file and need to convert it to CSV format.
The CSV should have 9 columns, and the data is in this format:
257|30|1666|4906|3712|1|1.00|4.99|1.04|2|27.28|2.00|4.92|1|4.99|1.00|1.04|9|0.222222|0.000000|0.111111|-1.000000
254|1|1578|3713|4900|1|1.00|1.99|1.26|16|53.30|25.00|12.23|39|125.30|55.00|62.48|320|0.050000|0.000000|0.003125|0.000000
256|38|227|25303|25306|1|1.00|11.99|1.99|1|6.99|1.00|1.67|7|62.28|9.00|9.08|16|0.062500|1.000000|0.062500|0.000000
Are there built-in functions for this? What do you advise using?
Thanks
If you use an input stream to read from your data file, replace every occurrence of the pipe symbol with a comma, and use an output stream to write the result back, you will be a happy man. You can use every 9th pipe as a signal to move your output to a new line. That's about 10 lines of code.
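A rough sketch of that idea, using only the JDK, might look like the following. The class name PipeToCsv and its methods are hypothetical. As a simplifying assumption it writes one CSV row per input line rather than starting a new row at every 9th pipe, and it quotes any field that itself contains a comma or a double quote.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical converter from pipe-separated lines to CSV lines.
final class PipeToCsv {

    // Converts one "|"-separated line into one comma-separated line.
    static String toCsvLine(String pipeLine) {
        String[] fields = pipeLine.split("\\|", -1); // -1 keeps trailing empty fields
        StringBuilder csv = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                csv.append(',');
            }
            csv.append(escape(fields[i]));
        }
        return csv.toString();
    }

    // Quotes a field if it contains a comma, a quote, or a line break.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Streams the input file line by line and writes the converted lines.
    static void convert(Path input, Path output) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(toCsvLine(line));
                writer.newLine();
            }
        }
    }
}
```

Calling PipeToCsv.convert with the paths of the source and target files would process the whole file without loading it into memory.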

Java REST Multi-Node API design [closed]

Closed 8 years ago.
I am new to REST design patterns. I am trying to write an API with the following design in mind.
GET http://www.example.com/customers/33245/orders/8769/lineitems/1
I am able to write basic REST services with Java (JAX-RS) using Jersey:
GET|PUT|DELETE http://www.example.com/customers/{id}
Any tutorial that explains how to do such multi-node routing in Java would be really helpful.
Thanks,
Kush
You should probably go with something like this (this is a snippet from the resource class):
...
@GET
@Path("customers/{customer-id}/orders/{order-id}/lineitems/{lineitem-id}")
public Response get(@PathParam("customer-id") String customerId,
                    @PathParam("order-id") String orderId,
                    @PathParam("lineitem-id") String lineItemId) {
    // fetch logic goes here...
}
...
