Bindy skipping empty csv files - java

I built a Camel route that unmarshals a CSV file using Camel Bindy and writes the contents to a database. This works perfectly except when the file is empty, which by the looks of it will happen a lot. The file does contain CSV headers, but no actual data, e.g.:
CODE;CATEGORY;PRICE;
In this case the following error is thrown:
java.lang.IllegalArgumentException: No records have been defined in the CSV
I tried adding allowEmptyStream = true to the Bindy object that I use for unmarshalling. This, however, does not seem to do much, as the same error appears.
Any ideas on how to skip processing these empty files are very welcome.

In your use case, the option allowEmptyStream must be set to true at the Bindy DataFormat level, as follows:
BindyCsvDataFormat bindy = new BindyCsvDataFormat(SomeModel.class);
bindy.setAllowEmptyStream(true);

from("file:some/path")
    .unmarshal(bindy)
    // The rest of the route
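For context, here is a minimal, self-contained sketch of how the whole route could look. The SomeModel class comes from the question; the split and the persistence endpoint are assumptions added for illustration:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat;

public class CsvImportRoute extends RouteBuilder {
    @Override
    public void configure() {
        BindyCsvDataFormat bindy = new BindyCsvDataFormat(SomeModel.class);
        // With allowEmptyStream enabled, a header-only file no longer
        // triggers "No records have been defined in the CSV"
        bindy.setAllowEmptyStream(true);

        from("file:some/path")
            .unmarshal(bindy)
            // If the file had no records, the split produces no sub-exchanges
            .split(body())
                .to("direct:saveToDatabase") // hypothetical persistence route
            .end();
    }
}

This way an empty file is effectively skipped without any extra filtering logic.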

Related

Parsing double quote with new line from CSV using jackson-dataformat-csv

I have the following CSV format:
id,name,description,age
23,Anna,"Self-made
Chef
Shoemaker",23
The double quotes are only present if the attribute is multi-line. I am already able to read normal CSV correctly:
@Bean
public CsvMapper csvMapper() {
    CsvMapper csvMapper = new CsvMapper();
    csvMapper.registerModule(new JavaTimeModule());
    csvMapper.configure(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS, false);
    return csvMapper;
}
I tried adding a new feature:
csvMapper.configure(Feature.FAIL_ON_MISSING_COLUMNS, true);
But it makes the library skip failed rows. How do I parse the given format and get the whole Self-made\n Chef\n Shoemaker into the description attribute?
According to my reading of the code in the CsvDecoder class (link), the Jackson CSV parser in version 2.14 will correctly parse double-quoted strings that have embedded line breaks. Look for the _nextQuotedString method. No attribute needs to be set to enable this; it happens unconditionally.
Indeed, it looks like the relevant lines of the code have not changed in at least 5 years.
So ... if the CSV parser is not working for you (without that Feature), my diagnosis is:
either you are using a really old version of Jackson,
or the apparently incomplete values you are seeing are injected somewhere / somehow after CSV parsing.
If this doesn't solve your problem, please add a minimal reproducible example to the Question, and tell us the version(s) of the Jackson dependencies you are using.
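To see this in action, here is a minimal, self-contained sketch (the class name and sample data are invented for the example) showing a plain CsvMapper parsing an embedded line break inside a double-quoted field:

import java.util.Map;

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

public class MultiLineCsvDemo {
    public static void main(String[] args) throws Exception {
        String csv = "id,name,description,age\n"
                + "23,Anna,\"Self-made\nChef\nShoemaker\",23\n";

        CsvMapper mapper = new CsvMapper();
        // Use the first row of the input as the header
        CsvSchema schema = CsvSchema.emptySchema().withHeader();

        try (MappingIterator<Map<String, String>> rows =
                mapper.readerFor(Map.class).with(schema).readValues(csv)) {
            while (rows.hasNext()) {
                // Expected output: Self-made\nChef\nShoemaker (with real line breaks)
                System.out.println(rows.next().get("description"));
            }
        }
    }
}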

How to log the content of csv in Apache Camel?

I have the following code
DataFormat bindy = new BindyCsvDataFormat(Employee.class);
from("file:src/main/resources/csv2?noop=true").routeId("route3").unmarshal(bindy).to("mock:result").log("${body[0].name}");
I am trying to log every line of the CSV file; currently I can only hardcode which value to print.
Do I have to use a loop even though I don't know the number of lines in the CSV? Or do I have to use a processor? What's the easiest way to achieve what I want?
The unmarshalling step produces an exchange whose body is a list of objects, one per CSV line. You can therefore use the Camel Splitter to slice the original exchange into 1-N sub-exchanges (one per line/item of the list) and then log each of them:
from("file:src/main/resources/csv2?noop=true")
.unmarshal(bindy)
.split().body()
.log("${name}");
If you do not want to alter the original message, you can use the wiretap pattern in order to log a copy of the exchange:
from("file:src/main/resources/csv2?noop=true")
.unmarshal(bindy)
.wireTap("direct:logBody")
.to("mock:result");
from("direct:logBody")
.split().body()
.log("Row# ${exchangeProperty.CamelSplitIndex} : ${name}");

Apache Beam - Reading JSON and Stream

I am writing Apache Beam code where I have to read a JSON file placed in the project folder, read the data, and stream it.
This is the sample code to read the JSON. Is this the correct way of doing it?
PipelineOptions options = PipelineOptionsFactory.create();
options.setRunner(SparkRunner.class);
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply("ReadMyFile", TextIO.read().from("/Users/xyz/eclipse-workspace/beam-prototype/test.json"));
System.out.println("lines: " + lines);
Or should I use:
p.apply(FileIO.match().filepattern("/Users/xyz/eclipse-workspace/beam-prototype/test.json"))
I just need to read the JSON file below, read the complete testdata from it, and then stream it.
{
  "testdata": {
    "siteOwner": "xxx",
    "siteInfo": {
      "siteID": "id_member",
      "siteplatform": "web",
      "siteType": "soap",
      "siteURL": "www"
    }
  }
}
The above code is not reading the JSON file; it prints
lines: ReadMyFile/Read.out [PCollection]
Could you please guide me with a sample reference?
This is the sample code to read JSON. Is this the correct way of doing it?
To quickly answer your question: yes. Your sample code is the correct way to read a file containing JSON, where each line of the file contains a single JSON element. The TextIO input transform reads a file line by line, so if a single JSON element spans multiple lines, it will not be parseable.
The second code sample is different: FileIO.match() only matches file names and produces file metadata rather than file contents, so you would have to follow it with FileIO.readMatches() and a transform that reads each matched file.
The above code is not reading the json file, it is printing like
The printed result is expected. The variable lines does not actually contain the JSON strings in the file. lines is a PCollection of Strings; it simply represents the state of the pipeline after the transform is applied. Accessing elements in the pipeline is done by applying subsequent transforms; the actual JSON strings can be accessed in the implementation of a transform, as sketched below.
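For illustration, here is a minimal sketch of that idea; the DoFn and the printed output are assumptions for the example, and the default runner is used instead of the SparkRunner from the question:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class PrintJsonLines {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.create());

        PCollection<String> lines = p.apply("ReadMyFile",
                TextIO.read().from("/Users/xyz/eclipse-workspace/beam-prototype/test.json"));

        // Element values only become visible inside a transform such as this ParDo
        lines.apply("PrintLines", ParDo.of(new DoFn<String, Void>() {
            @ProcessElement
            public void processElement(@Element String line) {
                System.out.println("line: " + line);
            }
        }));

        p.run().waitUntilFinish();
    }
}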

Remove spaces between column names in csv headers using apache camel csvdataformat

I am reading a CSV file using the Apache Camel CSV data format and unmarshalling the content of the file.
When reading the data, I have to remove the spaces from the header column names.
For example, if a column name is "First Name", the space should be removed so the column is processed as "FirstName".
CsvDataFormat csvdataformat = new CsvDataFormat();
csvdataformat.setSkipHeaderRecord(true);
csvdataformat.setUseMaps(true);

from("file:/folder1/?fileName=sample.csv")
    .routeId("samplerouteid")
    .autoStartup(false)
    .unmarshal(csvdataformat)
    .process(new Processor() {
        @Override
        public void process(Exchange exchange) throws Exception {
            /* code to process data */
        }
    });
How do I achieve this using Apache Camel CSV?
Which version of Camel are you using? In 2.15 you have the option of overriding the header names, defining your own header:
header String[]
Overrides the header of the reference format. This option is null by default; when null, it keeps the value of the reference format, which is null for CSVFormat.DEFAULT.
In the XML DSL, this option is configured using child tags:
<csv>
    <header>orderId</header>
    <header>amount</header>
</csv>
This documentation is available on the Camel website. We encountered the same problem and were able to solve it using the header option. We use the XML DSL, but it should work the same way in Java; a sketch follows.
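For reference, here is a minimal Java DSL sketch of the same idea, assuming camel-csv's CsvDataFormat exposes the header option via a setHeader(String[]) setter; the column names are invented for the example:

CsvDataFormat csv = new CsvDataFormat();
csv.setSkipHeaderRecord(true); // skip the "First Name;..." header row in the file
csv.setUseMaps(true);
// Override the headers so the resulting map keys contain no spaces
csv.setHeader(new String[]{"FirstName", "LastName", "Age"});

from("file:/folder1/?fileName=sample.csv")
    .unmarshal(csv)
    .log("${body}");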

Can camel generate File from java object

I need to generate a large file in Java: a method is hit, and the service goes off to the repository and returns a list of type X as Java objects. I then need to place this list in a file and send it off to an FTP server.
I know I can put files on FTP servers using Camel, but I wanted to know whether Camel can generate the file from the Java objects and then place it on the FTP server.
My code would look like this:
List<ObjectX> xList = someRepo.getListOfx();
So I need to write xList to a file and place it on the FTP server.
Generally speaking, to convert your POJO messages to/from a text (or binary) representation, you can use a Camel DataFormat. In your route, you use the marshal and unmarshal keywords to perform the conversion.
There are several Camel data formats available to marshal/unmarshal CSV, including the CSV data format and the Bindy data format (a few others are listed on the DataFormat page, under the "Flat data structure marshalling" header). One advantage of Bindy is that it also supports other flat formats (such as fixed-width records).
Also note:
With Bindy, you will have to add annotations to your model class (ObjectX).
With CSV, you will have to convert your model objects (of type ObjectX) to Maps (or register an appropriate TypeConverter with Camel to do this conversion automatically).
If you check the other available data formats, they may have different requirements too.
Here is a very simple example with Bindy:
package mypackage;

@CsvRecord(separator = ",")
public class MyPojo {

    @DataField(pos = 1) // first column in the CSV file
    private int foo;

    @DataField(pos = 2) // second column in the CSV file
    private String bar;

    // Plus constructors, accessors, etc.
}

// In the RouteBuilder:
DataFormat bindy = new BindyCsvDataFormat("mypackage");

from("...") // Message received as a List<MyPojo>
    .marshal(bindy)
    .to("ftp:...");
