My data is newline-delimited JSON and looks like the sample below. I am reading it from a Kafka topic.
{"sender":"S1","senderHost":"ip-10-20-30-40","timestamp":"2018-08-13T16:17:12.874Z","topic":"test","messageType":"type_1","data":{"name":"John Doe", "id":"12DROIY321"}}
I want to build an Apache Beam pipeline that reads this data from Kafka and parses the JSON to produce output like this:
S1,2018-08-13T16:17:12.874Z,type_1,12DROIY321
The output is basically a comma delimited string consisting of the sender, timestamp, messageType and id from within data.
My code so far is as below:
public class Pipeline1{
public static void main(String[] args){
PipelineOptions options = PipelineOptionsFactory.create();
// Create the Pipeline object with the options we defined above.
Pipeline p = Pipeline.create(options);
p.apply(KafkaIO.<Long, String>read()
.withBootstrapServers("localhost:9092")
.withTopic("test")
.withKeyDeserializer(LongDeserializer.class)
.withValueDeserializer(StringDeserializer.class)
.updateConsumerProperties(ImmutableMap.of("auto.offset.reset", (Object)"earliest"))
// We're writing to a file, which does not support unbounded data sources. This line makes it bounded to
// the first 35 records.
// In reality, we would likely be writing to a data source that supports unbounded data, such as BigQuery.
.withMaxNumRecords(35)
.withoutMetadata() // PCollection<KV<Long, String>>
)
.apply(Values.<String>create())
.apply(TextIO.write().to("test"));
p.run().waitUntilFinish();
}
}
I can't figure out how to parse the JSON into the required CSV format within the pipeline. With the code above I can write the same JSON lines to a file, and with the code below I can parse the JSON. Can anyone help me figure out how to do this as an additional step in the Beam pipeline?
JSONParser parser = new JSONParser();
Object obj = null;
try {
obj = parser.parse(strLine);
} catch (ParseException e) {
e.printStackTrace();
}
JSONObject jsonObject = (JSONObject) obj;
String sender = (String) jsonObject.get("sender");
String messageType = (String) jsonObject.get("messageType");
String timestamp = (String) jsonObject.get("timestamp");
System.out.println(sender+","+timestamp+","+messageType);
According to the documentation, you will need to write a transformation (or find one that matches your use case).
https://beam.apache.org/documentation/programming-guide/#composite-transforms
The documentation also provides an excellent example.
Example that should produce your output:
.apply(Values.<String>create())
.apply(
    "JSONtoData", // the transform name
    ParDo.of(new DoFn<String, String>() { // a DoFn as an anonymous inner class instance
        @ProcessElement
        public void processElement(@Element String line, OutputReceiver<String> out) {
            JSONParser parser = new JSONParser();
            JSONObject jsonObject;
            try {
                jsonObject = (JSONObject) parser.parse(line);
            } catch (ParseException e) {
                e.printStackTrace();
                return; // skip lines that are not valid JSON
            }
            String sender = (String) jsonObject.get("sender");
            String messageType = (String) jsonObject.get("messageType");
            String timestamp = (String) jsonObject.get("timestamp");
            // "id" is nested inside the "data" object
            JSONObject data = (JSONObject) jsonObject.get("data");
            String id = (String) data.get("id");
            out.output(sender + "," + timestamp + "," + messageType + "," + id);
        }
    }));
To return CSV values, just change the generics to:
new DoFn<String, YourCSVClassHere>()
OutputReceiver<YourCSVClassHere> out
I didn't test this code, use at own risk.
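If any of the fields could itself contain a comma or a quote, plain concatenation produces ambiguous lines. As a rough sketch (the class name CsvLine and its methods are hypothetical, not part of Beam or json-simple), the joining step could be moved into a small helper that quotes fields only when needed, and the DoFn would then call it instead of concatenating with "+":

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Beam: builds one CSV line from already-extracted fields.
final class CsvLine {
    private CsvLine() {}

    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled, which is the usual CSV convention.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields with commas.
    static String join(List<String> fields) {
        return fields.stream().map(CsvLine::escape).collect(Collectors.joining(","));
    }
}
```

Inside processElement this would be used as out.output(CsvLine.join(Arrays.asList(sender, timestamp, messageType, id))). For the sample record nothing needs quoting, so the line is the same as with plain concatenation.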
I am using ObjectMapper to convert an object to a string. During the conversion the mapper produces an invalid string. How can I keep the JsonNode/JsonObject string intact when using ObjectMapper?
JsonObject:
{
"provision": " purpose of usefuleness.\n\n shall not use personal-provided facilities.\n\nWe shall not be required to pay you."
}
Is Converted to
{
"provision": " purpose of usefuleness.
\shall not use personal-provided facilities.
\We shall not be required to pay you.
}
Used:
new ObjectMapper().writeValuesAsString(json);
How can keep the original String intact.
with
new JsonObject(String)
or
when using
new ObjectMapper().writeValuesAsString(json);
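Escape the backslash of the special character \n in the Java string literal (write \\n), so that the JSON text contains the two-character escape sequence instead of a raw line break: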
String body = "{\"provision\" : \"purpose of usefuleness.\\n\\n shall not use personal-provided facilities.\\n\\nWe shall not be required to pay you.\"}";
ObjectMapper mapper = new ObjectMapper();
String result = null;
try {
JsonNode node = mapper.readTree(body);
result = mapper.writerWithDefaultPrettyPrinter().writeValueAsString(node);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(result);
{
"provision" : "purpose of usefuleness.\n\n shall not use personal-provided facilities.\n\nWe shall not be required to pay you."
}
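The difference between the two literals can be checked without Jackson. This is only an illustrative sketch (the class EscapeDemo and its method names are made up for this example): in Java source, "\\n" is a backslash followed by the letter n, which is what a JSON string value may legally contain, while "\n" is a single raw line-break character, which is not allowed unescaped inside a JSON string.

```java
// Hypothetical demo class: compares the escaped and unescaped Java literals.
final class EscapeDemo {
    private EscapeDemo() {}

    // Contains the two characters '\' and 'n': valid inside a JSON string value.
    static String escapedForJson() {
        return "line one\\nline two";
    }

    // Contains one raw line-feed character: not valid unescaped inside a JSON string value.
    static String rawLineBreak() {
        return "line one\nline two";
    }

    // True when the text contains a raw line feed.
    static boolean hasRawLineBreak(String text) {
        return text.indexOf('\n') >= 0;
    }
}
```

The escaped variant is one character longer than the raw one, and only the raw one reports a line break.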
OK, so I have been given a project to build a TST (completed), and I am supposed to use a JSON parser on a dictionary file to load the values into my data structure. We were given a basic class as example code. This is the very first time I have been exposed to this utility and I have no idea how it works. Typically, when I want to parse input, I would simply do something along the lines of
String[] parse = txt.split("|");
yet this obviously isn't going to work here. At the end of the code I see where it separates (or I think it does, anyway) the key and the value. I need to read those entry by entry to feed them into another method, which I would typically do with a for loop, but I have no clue what syntax this API uses:
for (int i = 0; i < JSON.Size; i++) {
key = get.JSON_Key(i);
value = get.JSON_Value(i);
tst.put(key, value);
}
Obviously that is closer to pseudocode. I don't know whether this stores separate values in separate containers and, if so, what to use to get hold of those values. The following is the example code we were given:
public class ReadJSON
{
public static void main( String[] args )
{
String infile = "dictionary.json";
JsonReader jsonReader;
JsonObject jobj = null;
try
{
jsonReader = Json.createReader( new FileReader(infile) );
// assumes the top level JSON entity is an "Object", i.e. a dictionary
jobj = jsonReader.readObject();
}
catch(FileNotFoundException e)
{
System.out.println("Could not find the file to read: ");
e.printStackTrace();
}
catch(JsonParsingException e)
{
System.out.println("There is a problem with the JSON syntax; could not parse: ");
e.printStackTrace();
}
catch(JsonException e)
{
System.out.println("Could not create a JSON object: ");
e.printStackTrace();
}
catch(IllegalStateException e)
{
System.out.println("JSON input was already read or the object was closed: ");
e.printStackTrace();
}
if( jobj == null )
return;
Iterator< Map.Entry<String,JsonValue> > it = jobj.entrySet().iterator();//Not sure what this is doing
Map.Entry<String,JsonValue> me = it.next();//not sure what this is doing
String word = me.getKey();
String definition = me.getValue().toString();
for(int i =0; i < jsonReader.; i++) {
}
}
}
Any help in understanding this a bit more, and the correct syntax for that for loop, would be appreciated.
The code uses JSR 353: Java API for JSON Processing. See https://jsonp.java.net/.
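In JSR 353, JsonObject is also a Map<String, JsonValue>, so the loop the question asks about can be an ordinary for-each over entrySet() rather than an index-based loop. The sketch below uses a plain Map<String, String> as a stand-in so that it runs without the JSON-P library on the classpath. The class EntryLoop and the sample words are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: shows the entrySet() loop on a plain Map.
final class EntryLoop {
    private EntryLoop() {}

    // Stand-in for the parsed dictionary; with JSON-P this would be the JsonObject itself.
    static Map<String, String> sampleDictionary() {
        Map<String, String> dict = new LinkedHashMap<>();
        dict.put("apple", "a fruit");
        dict.put("tree", "a plant");
        return dict;
    }

    // Visit every key/value pair once; each Map.Entry carries one word and its definition.
    static List<String> collect(Map<String, String> dict) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, String> entry : dict.entrySet()) {
            String word = entry.getKey();
            String definition = entry.getValue();
            // In the question's code this is where tst.put(word, definition) would go.
            pairs.add(word + " -> " + definition);
        }
        return pairs;
    }
}
```

With the real JsonObject the loop header would be for (Map.Entry<String, JsonValue> me : jobj.entrySet()), and the two it.next() lines in the example code are simply the first step of that same iteration done by hand.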
I would like to seek help on how to parse this string
{"success":false,"error":{"code":500,"message":"No keyword found."}}
I want to get the error code and the error message. The only problem is finding a regex that captures those values. I'm stuck at:
Pattern pattern = Pattern.compile(REGEX?);
Matcher matcher = pattern.matcher(result);
Since your response is JSON, parse it as JSON and read the values rather than using a regex.
JSONObject response = new JSONObject(yourResponse);
// read the nested "error" object, then its fields
JSONObject error = response.getJSONObject("error");
int code = error.getInt("code");
String message = error.getString("message");
In Java, we do not often compile Patterns for such trivial tasks.
Quick Answer
code = Integer.parseInt(result.split("\"")[6].split(",")[0].substring(1));
msg = result.split("\"")[9].split(",")[0];
This breaks if the message contains commas or quotes, or if the order of the fields changes.
If you want a regex, this is it:
s = s.replaceAll(".*\"code\":(.+?),.*", "$1");
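For completeness, the Pattern/Matcher form the question started from might look roughly like the sketch below. The class name ErrorRegex and both patterns are assumptions written for this one response shape. They do not decode JSON escape sequences and would pick up the first "code" or "message" key anywhere in the text, so a JSON parser remains the more robust choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts two fields from the sample error response with regexes.
final class ErrorRegex {
    private ErrorRegex() {}

    // "code" followed by a colon and an integer.
    private static final Pattern CODE =
            Pattern.compile("\"code\"\\s*:\\s*(\\d+)");

    // "message" followed by a colon and a quoted string (backslash escapes are skipped over, not decoded).
    private static final Pattern MESSAGE =
            Pattern.compile("\"message\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    static int code(String json) {
        Matcher m = CODE.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no numeric \"code\" field found");
        }
        return Integer.parseInt(m.group(1));
    }

    static String message(String json) {
        Matcher m = MESSAGE.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no string \"message\" field found");
        }
        return m.group(1); // raw text between the quotes
    }
}
```

Unlike the split-based one-liners, this does not depend on the position of the fields in the string, only on their names.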
The org.json library is easy to use. Example code is as below:
JSONObject obj = new JSONObject(" .... ");
int errCode= obj.getJSONObject("error").getInt("code");
This string is in json format, better use a json parser. Try this:
String s = "{\"success\":false,\"error\":{\"code\":500,\"message\":\"No keyword found.\"}}";
JSONObject jsonObject;
try {
jsonObject = new JSONObject(s);
JSONObject error = (JSONObject) jsonObject.get("error");
System.out.println(error.get("message"));
System.out.println(error.get("code"));
} catch (JSONException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Also have a look at this site.
This is my method
public String buildJsonData(String username , String message)
{
JsonObject jsonObject = Json.createObjectBuilder().add("Username",username+":"+message).build();
StringWriter stringWriter = new StringWriter();
try(JsonWriter jsonWriter = Json.createWriter(stringWriter))
{
jsonWriter.write(jsonObject);
}
catch(Exception e)
{
System.out.print("buildJsonData ="+e);
}
return stringWriter.toString();
}
If I input the username john and the message hello, I get this output:
{"Username":"john:hello"}
But I want the output without braces and double quotes, like this:
john:hello
I tried to split it and take array[0], but that didn't give the output I wanted. Is it possible with JSON to get my desired output (without braces and quotes)?
On the sending end, you would put the Username and Message entities into a JSONObject and send the resulting string over the network.
On the receiving end, you would unmarshal the JSON to extract the entities. You can then format them however you like.
Please read about JSON encoding here.
This is a simple example:
private String getResponse(){
JSONObject json = new JSONObject();
try {
json.put("Username", "John");
json.put("Message", "Hello");
} catch (JSONException e) {
e.printStackTrace();
}
return json.toString();
}
private void receiver(){
try {
JSONObject response = new JSONObject(getResponse());
String username = response.getString("Username");
String message = response.getString("Message");
System.out.println(String.format("%s : %s", username,message));
} catch (JSONException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Your structure is valid JSON, but it packs two values into a single field.
A more conventional JSON structure would be
{
"Username": "John",
"Message": "Hello"
}
And if you really want to use JSON, there is no way to remove the braces and quotes. They are part of JSON.
If you want to output only the part you quoted, store the JSON value in a variable
String myoutput = stringWriter.toString();
and then remove the parts you don't want with replace() or a regexp.
Braces are part of the JSON notation: they indicate an object. If you remove them, it's not JSON any more. The same goes for double quotes. You are creating your JSON object as:
Json.createObjectBuilder().add("Username",username+":"+message)
This creates an object with property named Username and value john:hello. Again, this is the JSON notation. It's not intended to be read directly, but to facilitate data transfer between applications (on the same or different devices).
If all you want to create is john:hello, then instead of creating a JSON object, you should simply do:
String result = username + ":" + message;
return result;
I want to modify JSON content without converting it into a POJO. I am using the GSON library.
Here is the use case:
String jsonString = "[{\"key1\":\"Hello\",\"key2\":\"World\"},{\"key1\":\"Nice\",\"key2\":\"Town\"}]";
JsonElement jsonElement = gson.fromJson(jsonString, JsonElement.class);
Is there any way to set the value of key1 to some value (let's say "Test") in each object of the array, without converting things into a POJO?
Here's the shortest I came up with.
JsonElement je = new Gson().fromJson(jsonString, JsonElement.class);
JsonObject jo = je.getAsJsonObject();
jo.add("key", value);
Once you have the JsonObject, gson has many methods to manipulate it.
You can always get a different type than JsonElement, or use JsonElement.getAsJsonObject to cast to an Object (if possible).
String jsonString = "[{\"key1\":\"Hello\",\"key2\":\"World\"}, ...]";
JsonArray jsonArray = gson.fromJson(jsonString, JsonElement.class).getAsJsonArray();
for (JsonElement element : jsonArray) {
    // addProperty replaces the existing value of "key1"
    element.getAsJsonObject().addProperty("key1", "Test");
}
I was wrong earlier; there seems to be no JsonArray adapter; you'll have to get a JsonElement and use the casting tool.
One approach would be to just convert the JSON to a java.util.Map, modify the Map, and go from there (which may mean serializing the Map back to JSON).
This approach meets my preference to work with the right API for the right job, minimizing the use of tools like Gson to just handle serialization/deserialization (which is what I understand it was designed for). That is, to not use the Gson API as a replacement data structure.
GSON has two separate APIs (that can be combined): one is used for serialization and deserialization, and the other for streaming. If you want to process streams of JSON without memory overhead or using dynamic structures (rather than static POJOs) you can do something like:
create a JsonWriter (in my example I use StringWriter);
create a JsonReader;
make a loop that consumes events from the reader and feeds them to the writer, possibly making changes, additions, omissions etc.
The loop consists of a single switch statement that must have a case for each of the possible events (10 of them). Even the simplest example must handle all of them, so the code below looks rather verbose. But it is very easy to extend, and further extensions will not make it much longer.
An example that adds a "test": 1 pair at the start of each object looks something like:
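import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import com.google.gson.stream.JsonWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.math.BigDecimal;
public class Whatever {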
static void streamandmodify(JsonReader reader, JsonWriter writer) throws IOException {
while (true) {
JsonToken token = reader.peek();
switch (token) {
// most cases are just consume the event
// and pass an identical one to the writer
case BEGIN_ARRAY:
reader.beginArray();
writer.beginArray();
break;
case END_ARRAY:
reader.endArray();
writer.endArray();
break;
case BEGIN_OBJECT:
reader.beginObject();
writer.beginObject();
// this is where the change happens:
writer.name("test");
writer.value(1);
break;
case END_OBJECT:
reader.endObject();
writer.endObject();
break;
case NAME:
String name = reader.nextName();
writer.name(name);
break;
case STRING:
String s = reader.nextString();
writer.value(s);
break;
case NUMBER:
String n = reader.nextString();
writer.value(new BigDecimal(n));
break;
case BOOLEAN:
boolean b = reader.nextBoolean();
writer.value(b);
break;
case NULL:
reader.nextNull();
writer.nullValue();
break;
case END_DOCUMENT:
return;
}
}
}
public static void main(String[] args) throws IOException {
// just for test:
JsonReader jr = new JsonReader(new StringReader("{\"a\":1, \"b\":{\"c\":[1,2,3,{},{}]}}"));
StringWriter sw = new StringWriter();
JsonWriter jw = new JsonWriter(sw);
streamandmodify(jr, jw);
System.out.println(sw.getBuffer().toString());
}
}
The jsonString is a plain, ordinary Java String, so you can modify it however you like using the standard String functions of Java, for example by replacing the value Hello of key1 with Test:
jsonString = "[{\"key1\":\"Test\",\"key2\":\"World\"},{\"key1\":\"Nice\",\"key2\":\"Town\"}]";
Of course, Strings in Java are immutable, so every such call creates a new String. If you make many edits, converting it to a StringBuilder first may reduce memory usage.
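A minimal sketch of that plain-String approach follows. The class ReplaceDemo is hypothetical, and the sketch assumes the exact compact formatting of the sample string. String.replace returns a new String and leaves the original untouched, and it only changes objects whose key1 currently holds "Hello", so it does not set key1 in every array element the way the GSON-based answers do.

```java
// Hypothetical demo class: textual replacement on the raw JSON string.
final class ReplaceDemo {
    private ReplaceDemo() {}

    // Replace the exact text "key1":"Hello" with "key1":"Test".
    // Brittle by design: any extra whitespace or a different current value means no match.
    static String setHelloToTest(String json) {
        return json.replace("\"key1\":\"Hello\"", "\"key1\":\"Test\"");
    }
}
```

Because the original String is never modified, the result has to be assigned back, for example jsonString = ReplaceDemo.setHelloToTest(jsonString).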
Modify json with GSON JsonArray Java 8
Example of how to use GSON to modify a value within a JSON
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
public class ModifyJson {
public static void main(String[] args) {
String data = "[{\"ct_pk\":24,\"ct_name\":\"SISTEMA DE PRUEBAS\"},"
+ "{\"ct_pk\":1,\"ct_name\":\"CAPITAL FEDERAL\"}," +
"{\"ct_pk\":5,\"ct_name\":\"SISTEMA DE PRUEBAS DOS\"}]";
System.out.println("before................." + data);
JsonArray jsonArray = new Gson().fromJson(data, JsonElement.class).getAsJsonArray();
JsonArray jsonArray2 = new JsonArray();
for (JsonElement pa : jsonArray) {
JsonObject jsonObject2 = pa.getAsJsonObject();
String ct_name = jsonObject2.get("ct_name").getAsString();
if (ct_name.equals("SISTEMA DE PRUEBAS")) {
jsonObject2.addProperty("ct_name", "TODOS");
}
jsonArray2.add(jsonObject2);
}
System.out.println("after.................." +jsonArray2);
}
}