Camel-bindy : How to escape double quotes when marshalling CSV? - java

I am trying to escape double quotes inside a CSV field content. This is required because the double quote character is already used to enclose the fields.
But I don't see how to do that (escaping) with Camel/Camel-bindy.
What I want to produce (note how double quotes inside field content are escaped by doubling them):
"Some";"people";"Never ""finish""";"their"
What I'm actually producing (this won't parse as CSV):
"Some";"people";"Never "finish"";"their"
So,
Is there any option that I can add to tell Camel to escape quotes inside (all) CSV fields values?
Otherwise, is there an alternative solution that I could use to get the same result?
Here is what I have so far:
My Camel version is 2.15.
I use a POJO which is later marshalled to CSV by Bindy.
This is what the POJO looks like:
@CsvRecord(separator = ";", crlf = "UNIX", generateHeaderColumns = true, quote = "\"", quoting = true)
public class MyCsvPOJO
{
@DataField(pos = 1)
private String prop1 = "";
// Some other properties + getters + setters
}
This is the camel route code that produces the CSV file (using Camel Java DSL) :
from("myRouteId")
.beanRef("myPojoProducerBean")
.marshal()
.bindy(BindyType.Csv, MyCsvPOJO.class)
.convertBodyTo(String.class, "UTF-8")
.to("/path/to/the/ouput-file.csv");
I considered using a custom-crafted CsvDataFormat (https://camel.apache.org/maven/camel-2.15.0/camel-csv/apidocs/org/apache/camel/dataformat/csv/CsvDataFormat.html) that I could feed to .marshal(myCustomCsvDataFormat), but then I can no longer chain a call to bindy(...) because of incompatible return types.
So, at this point I'm stuck and any hint will be very much appreciated.
Thanks.

Camel 2.19.0 introduced the quotingEscaped option on @CsvRecord:
Indicate if the values must be escaped when quoting
Source (CAMEL-7519)
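On Camel 2.15, where quotingEscaped is not available, one possible workaround is to double the quotes yourself before the value reaches Bindy. The sketch below is plain Java and assumes, as the output in the question suggests, that Bindy wraps the field content in quotes without touching it. CsvQuoteEscaper and escape are hypothetical names, not part of Camel.

```java
/** Hypothetical helper, not part of Camel or Bindy. */
final class CsvQuoteEscaper {
    private CsvQuoteEscaper() {}

    /** Doubles every double quote in the value; null is passed through unchanged. */
    static String escape(String value) {
        return value == null ? null : value.replace("\"", "\"\"");
    }

    public static void main(String[] args) {
        // Only the content is escaped here; the enclosing quotes are left to Bindy.
        System.out.println(escape("Never \"finish\""));
    }
}
```

The escaped value would have to be stored in the POJO field itself (for example where myPojoProducerBean populates it), since Bindy, as far as I know, reads the annotated fields directly rather than going through getters. This only helps in the marshalling direction.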
UPDATE:
I was able to get your desired result ("Some";"people";"Never ""finish""";"their") in the output file using Camel CSV:
from("myRouteId")
.process(exchange -> {
ObjectMapper mapper = new ObjectMapper();
Map<String, String> map = mapper.convertValue(new SimplePojo(), Map.class);
exchange.getIn().setBody(map);
})
.marshal(new CsvDataFormat().setDelimiter(';').setQuoteMode(QuoteMode.ALL))
.to("file:out/?fileName=ouput-file.csv");
Where SimplePojo content is:
private String name = "Some";
private String people = "people";
private String never = "Never \"finish\"";
private String finish = "their";

Related

OWASP library custom policy - adding spaces where removing html

I am currently trying to use the OWASP library to remove some html from a string. A string that I have is a list:
one
two
three
In markup, the string looks like "<ul><li>one</li><li>two</li><li>three</li></ul>".
When I use the OWASP library with a policy like:
PolicyFactory policy = new HtmlPolicyBuilder()
.allowUrlProtocols("https")
.allowStandardUrlProtocols()
.requireRelNofollowOnLinks()
.toFactory();
OWASP converts the list to "onetwothree". I would instead like to add spaces between the list items, so that the string becomes "one two three". Is there a way to do this with OWASP? I am new to using this, so any advice is appreciated!
This is how it works, in case someone faces the same issue in the future.
It's in Kotlin, but it is simple to convert to Java.
private class SeparateTextPostProcessor(htmlStreamEventReceiver: HtmlStreamEventReceiver, val separator: String = " ") :
HtmlStreamEventReceiverWrapper(htmlStreamEventReceiver) {
override fun text(elem: String) {
if (elem.isNotEmpty()) {
// adds a space between elements
underlying.text("$elem$separator")
} else {
underlying.text(elem)
}
}
}
fun String.stripHTML(): String {
val policy =
HtmlPolicyBuilder().withPostprocessor { SeparateTextPostProcessor(it) }.toFactory()
val sanitized = policy.sanitize(this)
// Sanitized text will contain additional escaped characters.
return sanitized.trim()
}

How to convert JsonNode to String without escaping quote - with \" instead of "

I have the following class. I use ObjectMapper.convertValue() to convert this class to a Jackson ObjectNode. ObjectNode.toString() then returns a String like "{"playing": false}".
But what I am looking for is the JSON string with escaped quotes, like "{\"playing\": false}".
I am currently using String.replace("\"", "\\\""), and it works. Is there a better way to achieve that?
Update Note: As mentioned in the comments, I need to send this state String to the server, but it looks like my server only recognizes "{\"playing\": false}".
public class MyState {
public Boolean playing;
}
private String getStateString()
throws Exception
{
ObjectNode objectNode = objectMapper.convertValue(currentState, ObjectNode.class);
String pretty = objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(objectNode);
return objectNode.toString().replace("\"", "\\\"");
}
Update Solution: As mentioned in the comments, the root cause of the service failing to process my state string was not string escaping but an incorrect attribute name (which is in turn converted to the JSON key string).

Hive - Remove substring from string

I need to remove a substring from a given string, where the substring can appear at different positions in the string.
I want to remove "fruit":"apple" from each of these possible strings and get the corresponding result:
{"client":"web","fruit":"apple"} --> {"client":"web"}
{"fruit":"apple","client":"web"} --> {"client":"web"}
{"client":"web","fruit":"apple","version":"v1.0"} --> {"client":"web","version":"v1.0"}
{"fruit":"apple"} --> null or empty string
I used regexp_replace(str, "\,*\"fruit\"\:\"apple\"", "") but that didn't get me the expected results. What is the right way to construct the regex?
It seems that you are working with data in JSON format. Depending on the dependencies available, you can do this entirely without regular expressions.
For example, if you are using Google's Gson library, you can parse the String into a JsonObject and then remove the property from it:
String input = "your data";
JsonParser parser = new JsonParser();
JsonObject o = parser.parse(input).getAsJsonObject();
try {
String foundValue = o.getAsJsonPrimitive("fruit").getAsString();
if ("apple".equals(foundValue)) {
o.remove("fruit");
}
} catch (Exception e) {
e.printStackTrace();
}
String filteredData = o.toString();
P.S. This code is not a final version; it may need to handle some further situations (when there is no such field, or when it contains a non-primitive value). Further details are needed to cover them.
P.P.S. IMO, using regex in such situations makes the code less readable and less flexible.
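If a regex is still preferred, the separator comma has to be consumed on whichever side it appears. Hive's regexp_replace uses Java regular expression syntax, so a candidate pattern can be checked in plain Java first. This is a sketch that assumes the exact spacing shown in the question (no whitespace around the colon or commas); FruitRemover and removeFruit are hypothetical names.

```java
import java.util.regex.Pattern;

final class FruitRemover {
    // First alternative: the pair preceded by a comma (middle or end of the object).
    // Second alternative: the pair at the start, with an optional trailing comma.
    private static final Pattern FRUIT_APPLE =
            Pattern.compile(",\"fruit\":\"apple\"|\"fruit\":\"apple\",?");

    /** Removes every "fruit":"apple" pair together with one adjacent comma. */
    static String removeFruit(String json) {
        return FRUIT_APPLE.matcher(json).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeFruit("{\"client\":\"web\",\"fruit\":\"apple\",\"version\":\"v1.0\"}"));
        System.out.println(removeFruit("{\"fruit\":\"apple\"}"));
    }
}
```

Note that the last case from the question yields {} rather than null or an empty string, so that value would still need separate handling. The same pattern should carry over to Hive as regexp_replace(str, ',"fruit":"apple"|"fruit":"apple",?', ''), though I have not run it there.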

Spring REST Controller understanding arrays of strings when having special characters like blank spaces or commas

I am trying to write a Spring REST Controller getting an array of strings as input parameter of a HTTP GET request.
The problem arises when some of the strings in the array contain special characters such as commas (,), blank spaces or forward slashes (/), regardless of whether I URL-encode the query part of the GET request URL.
That means that the string "1/4 cup ricotta, yogurt" (edit: it needs to be treated as a single ingredient, i.e. one string element of the input array) in either this format:
http://127.0.0.1:8080/[...]/parseThis?[...]&ingredients=1/4 cup ricotta, yogurt
This format (please note the blank spaces encoded as + plus, rather than the hex code):
http://127.0.0.1:8080/[...]/parseThis?[...]&ingredients=1%2F4+cup+ricotta%2C+yogurt
Or this format (please note the blank space encoded as hex code %20):
http://127.0.0.1:8080/[...]/parseThis?[...]&ingredients=1%2F4%20cup%20ricotta%2C%20yogurt
is not rendered properly.
The system does not recognize the input string as one single element of the array.
In the 2nd and 3rd case the system splits the input string on the comma and returns an array of 2 elements rather than 1 element. I am expecting 1 element here.
The relevant code for the controller is:
@RequestMapping(
value = "/parseThis",
params = {
"language",
"ingredients"
}, method = RequestMethod.GET, headers = HttpHeaders.ACCEPT + "=" + MediaType.APPLICATION_JSON_VALUE)
@ResponseBody
public HttpEntity<CustomOutputObject> parseThis(
@RequestParam String language,
@RequestParam String[] ingredients){
try {
CustomOutputObject responseFullData = parsingService.parseThis(ingredients, language);
return new ResponseEntity<>(responseFullData, HttpStatus.OK);
} catch (Exception e) {
// TODO
}
}
I need to perform HTTP GET request against this Spring controller, that's a requirement (so no HTTP POST can be used here).
Edit 1:
If I add HttpServletRequest request to the signature of the controller method and then add a log statement like log.debug("The query string is: '" + request.getQueryString() + "'");, I see a line in the log like The query string is: '&language=en&ingredients=1%2F4+cup+ricotta%2C+yogurt' (so still URL-encoded).
Edit 2:
On the other hand, if I add WebRequest request to the signature of the method and log it as log.debug("The query string is: '" + request.getParameter("ingredients") + "'");, I get a line in the log like The query string is: '1/4 cup ricotta, yogurt' (so URL-decoded).
I am using Apache Tomcat as a server.
Is there any filter or something I need to add/review to the Spring/webapp configuration files?
Edit 3:
The main problem is in the interpretation of a comma:
@ResponseBody
@RequestMapping(value="test", method=RequestMethod.GET)
public String renderTest(@RequestParam("test") String[] test) {
return test.length + ": " + Arrays.toString(test);
// /app/test?test=foo,bar => 2: [foo, bar]
// /app/test?test=foo,bar&test=baz => 2: [foo,bar, baz]
}
Can this behavior be prevented?
The path of a request parameter to your method argument goes through parameter value extraction and then parameter value conversion. Now what happens is:
Extraction:
The parameter is extracted as a single String value. This is probably to allow simple attributes to be passed as simple string values for later value conversion.
Conversion:
Spring uses ConversionService for the value conversion. In its default setup, StringToArrayConverter is used, which unfortunately treats the string as a comma-delimited list.
What to do:
You are pretty much screwed with the way Spring handles single valued request parameters. So I would do the binding manually:
// Method annotations
public HttpEntity<CustomOutputObject> handlerMethod(WebRequest request) {
String[] ingredients = request.getParameterValues("ingredients");
// Do other stuff
}
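To make the two steps concrete, here is a simplified, servlet-free model in plain Java. It is not Spring's actual code: RequestParamModel, extract and defaultBinding are hypothetical names, and the comma splitting only approximates what StringToArrayConverter does. It shows that extraction alone keeps the ingredient intact, and that the split only happens when a lone value is converted to an array.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

/** Simplified model of extraction and conversion; not Spring's implementation. */
final class RequestParamModel {

    /** Extraction: URL-decodes every value of the named parameter, without comma splitting. */
    static List<String> extract(String rawQuery, String name) {
        List<String> values = new ArrayList<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            if (decode(key).equals(name)) {
                values.add(decode(value));
            }
        }
        return values;
    }

    /** Conversion as described above: a lone value is split on commas, several values are kept as they are. */
    static List<String> defaultBinding(List<String> extracted) {
        if (extracted.size() != 1) {
            return extracted;
        }
        List<String> parts = new ArrayList<>();
        for (String part : extracted.get(0).split(",")) {
            parts.add(part.trim());
        }
        return parts;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

With the query string from the question, extract returns the single element "1/4 cup ricotta, yogurt", which is what getParameterValues gives you when binding manually, while defaultBinding turns it into two elements.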
You can also check what Spring guys have to say about this.. and the related SO question.
Well, you could register a custom conversion service (from this SO answer), but that seems like a lot of work. :) If it were me, I would drop the @RequestParam declaration from the method signature and parse the value using the incoming request object.
May I suggest you try the following format:
ingredients=egg&ingredients=milk&ingredients=butter
Appending &ingredients to the end will handle the case where the array only has a single value.
ingredients=egg&ingredients=milk&ingredients=butter&ingredients
ingredients=milk,skimmed&ingredients
The extra entry would need to be removed from the array; using a List<String> would make this easier.
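A minimal sketch of that clean-up step, assuming the bare &ingredients entry reaches the controller as an empty string (this may depend on the servlet container). IngredientParams and withoutBlanks are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class IngredientParams {
    private IngredientParams() {}

    /** Copies the values into a list, skipping null or empty entries such as the one from a bare "&ingredients". */
    static List<String> withoutBlanks(String[] values) {
        List<String> result = new ArrayList<>();
        if (values == null) {
            return result;
        }
        for (String value : values) {
            if (value != null && !value.isEmpty()) {
                result.add(value);
            }
        }
        return result;
    }
}
```

In the controller this would be applied to the bound String[] ingredients before passing the values on to the service.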
Alternatively if you are trying to implement a REST controller to pipe straight into a database with spring-data-jpa, you should take a look at spring-data-rest. Here is an example.
You basically annotate your repository with @RepositoryRestResource and Spring does the rest :)
A solution from here
public String get(WebRequest req) {
String[] ingredients = req.getParameterValues("ingredients");
for(String ingredient:ingredients ) {
System.out.println(ingredient);
}
...
}
This works for the case when you have a single ingredient containing commas

Can't parse JSON property "null"

I ran into a problem when trying to parse a JSON property named "null"; please help me understand what the real problem is. I had the following JSON:
{
"properties" : {
"null" : {
"value" : false
}
}
}
I used http://jsonlint.com to validate that this JSON is valid. I tried to parse it from java:
import net.sf.json.JSONObject;
import java.io.IOException;
public class Test {
public static void main(String[] args) throws IOException {
String st = "{" +
" 'properties' : {" +
" 'null' : {" +
" 'value' : false" +
" }" +
" }" +
"}";
JSONObject.fromObject(st);
}
}
But got the exception:
Exception in thread "main" java.lang.ClassCastException: JSON keys must be strings.
at net.sf.json.JSONObject._fromJSONObject(JSONObject.java:927)
at net.sf.json.JSONObject.fromObject(JSONObject.java:155)
at net.sf.json.JSONSerializer.toJSON(JSONSerializer.java:108)
at net.sf.json.AbstractJSON._processValue(AbstractJSON.java:238)
at net.sf.json.JSONObject._processValue(JSONObject.java:2655)
at net.sf.json.JSONObject.processValue(JSONObject.java:2721)
at net.sf.json.JSONObject.element(JSONObject.java:1786)
at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:1036)
at net.sf.json.JSONObject._fromString(JSONObject.java:1201)
at net.sf.json.JSONObject.fromObject(JSONObject.java:165)
at net.sf.json.JSONObject.fromObject(JSONObject.java:134)
I used json-lib-2.4-jdk15.jar from http://json-lib.sourceforge.net to parse it. Could anybody please clarify this? Why does this library throw an exception when the online validator says the JSON is valid? Is it a bug in the library, or did I do something wrong?
JSON-lib initially parses and populates a Java Map with the input JSON. Unfortunately, JSON-lib then checks whether every JSON object element name is a JSON null. Its null check is performed in the JSONNull.equals(Object) method. This method returns true for a "null" JSON string, which of course is not actually a JSON null value.
I recommend filing a bug with the JSON-lib project for this issue. The implementation of JSONNull.equals(Object) is flawed.
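The flaw can be illustrated with a toy model in plain Java. This is not JSON-lib's source: ToyJsonNull, looseIsNull and strictIsNull are hypothetical names, and the loose check only mirrors the behavior described above, where the text "null" is treated like a JSON null.

```java
/** Toy stand-in for a JSON null marker; not JSON-lib's JSONNull class. */
final class ToyJsonNull {
    static final ToyJsonNull INSTANCE = new ToyJsonNull();

    private ToyJsonNull() {}

    /** Loose check: also accepts the string "null", which is the behavior described above. */
    static boolean looseIsNull(Object candidate) {
        return candidate == null || candidate == INSTANCE || "null".equals(candidate);
    }

    /** Stricter check: only a real null or the marker instance counts as a JSON null. */
    static boolean strictIsNull(Object candidate) {
        return candidate == null || candidate == INSTANCE;
    }
}
```

With the loose check, a property name of "null" is indistinguishable from a missing key, which is consistent with the "JSON keys must be strings" exception in the question.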
Unfortunately, it's not possible to handle this with a custom PropertyNameProcessor.
Options available for a more immediate solution include altering the JSON-lib code yourself, or switching libraries.
If you can switch libraries, I highly recommend Jackson. Following is an example of using it to deserialize the example JSON in the original question.
/*
{
"properties" : {
"null" : {
"value" : false
}
}
}
*/
String json = "{\"properties\":{\"null\":{\"value\":false}}}";
ObjectMapper mapper = new ObjectMapper();
Map<String, Object> map = mapper.readValue(json, Map.class);
System.out.println(map);
// output: {properties={null={value=false}}}
Map<String, Object> propertiesMap = (Map) map.get("properties");
System.out.println(propertiesMap);
// output: {null={value=false}}
Map<String, Object> nullMap = (Map) propertiesMap.get("null");
System.out.println(nullMap);
// output: {value=false}
The first JSON posted is valid JSON: the JSON in the Java, however, is not valid -- only " is valid for the [required] key quote. From json.org:
A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes....
However, that sounds like a bug, assuming it was not triggered by the invalid JSON fed to it (the library can do whatever it wants with invalid JSON)... one would have to look at the source (or bug reports / user experience) to say conclusively if this is indeed a "bug". I have added some suggestions of things to try below which may either show expected behavior or outline the cause/issue in further detail.
Consider this minimal test-case (with valid JSON):
String st = "{ \"null\": \"hello world!\" }";
This may also shed more light, depending on if the first item is "null" or null when extracted:
String st = "[ \"null\" ]";
Happy coding.
The gson library link is:
http://code.google.com/p/google-gson/
I normally use Gson to generate JSON strings, and I found an example that someone else posted on Stack Overflow for parsing a JSON string with Gson; see the link:
Converting JSON to Java
I suggest you use Gson:
construct the JSON data using a Java Map and List,
then use Gson to output the Map or List object.
