How to count particular pattern in a string? - java

We have some requests that contain a lot of experiments. I just want to count the number of experiments; if it is greater than some threshold, I will block those requests.
{
"context": {
"requestId": "",
"locale": "",
"deviceId": "",
"currency": "",
"memberId": 0,
"cmsOrigin":,
"experiments": {
"forceByVariant":,
"forceByExperiment": [
{
"id": "test",
"variant": "A"
}
]
}
}
In this request, I just want to check how many id/variant pairs there are inside forceByExperiment. I tried to do it with a regular expression but could not get it to work. Has anyone done something similar before?
For now I just split the string on "variant" and count the pieces. I am not sure that is a good idea, but the end goal is to detect requests that carry a lot of experiments.

Using the circe library and Scala, here is an easy solution:
import io.circe._, io.circe.parser._
val jsonString = """{
"context": {
"requestId": "",
"locale": "",
"deviceId": "",
"currency": "",
"memberId": 0,
"cmsOrigin": "foo",
"experiments": {
"forceByVariant": [],
"forceByExperiment": [
{
"id": "test",
"variant": "A"
}
]
}
}
}"""
val parseResult = parse(jsonString)
val nElems = for {
json <- parse(jsonString)
array <- json.hcursor.downField("context").downField("experiments").downField("forceByExperiment").as[Seq[Json]]
} yield array.length
println(nElems) // Right(1)

If you really want to use a regex, and if there is no other id field in your JSON structure, you can use the expression "id": "(\w+)" and count the number of matches.
Example: https://regex101.com/r/GCeByw/1/
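Since the question is tagged java, here is a minimal sketch of the same regex-counting idea using only java.util.regex. The class name ExperimentCounter, the method countIds and the maxExperiments threshold are made up for illustration, and the sketch assumes that "id" appears as a key only inside forceByExperiment. If that assumption does not hold, a real JSON parser is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExperimentCounter {
    // Matches "id": "<word>" with optional whitespace around the colon.
    // Keys such as "requestId" do not match because the opening quote
    // must come directly before id.
    private static final Pattern ID_FIELD =
            Pattern.compile("\"id\"\\s*:\\s*\"(\\w+)\"");

    // Counts how many "id" entries occur in the raw JSON text.
    static int countIds(String json) {
        Matcher m = ID_FIELD.matcher(json);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String json = "{\"experiments\": {\"forceByExperiment\": ["
                + "{\"id\": \"test\", \"variant\": \"A\"},"
                + "{\"id\": \"other\", \"variant\": \"B\"}]}}";
        int maxExperiments = 1; // hypothetical threshold
        int n = countIds(json);
        System.out.println(n + " experiments, blocked = " + (n > maxExperiments));
    }
}
```

Counting matches this way avoids building the whole object graph, but it cannot tell where in the document a match came from, which is why the parser-based answer above is more robust.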

Related

Custom PrintFormatting for GSON

I'm having some trouble with GSON in regards to printing. GSON has two options when it comes to printing.
Pretty Printing
Compact Printing
I intend to use a modified form of Pretty Printing, and even though the documentation says JsonPrintFormatter is the class used to modify the output format, I can't find that class in the GSON repository!
Any ideas on why this is the case, or any way I can modify the GSON printing?
Apart from that, any libraries used to modify spacing or formatting of JSON in the Java language would also be helpful.
Pretty Print:
{
"classname": "something",
"type": "object",
"version": 1,
"properties": [
{
"propertyname": "something1",
"type": "String",
"length": 255
},
{
"propertyname": "something2",
"type": "Date",
"length": 10
}
]
}
Compact Print:
{"classname":"something","type":"object","version":1,"properties":[{"propertyname":"something1","type":"String","length":255},{"propertyname":"something2","type":"Date","length":10}]}
My Print Style:
{
"classname": "something",
"type": "object",
"version": 1,
"properties": [
{"propertyname": "something1","type": "String","length": 255},
{"propertyname": "something2","type": "Date","length": 10}
]
}
Well, it's just a work in progress for now, but this should do the trick for strings with only one array. I will look into making it more stable and able to handle more complex structures.
private static String reformat(String og){
String reformattable = og;
String[] parts = reformattable.split("\\[",2);
String arrayPart = parts[1];
String arrayOnly = arrayPart.split("]",2)[0];
reformattable = arrayOnly.replaceAll("\\{\n","{");
reformattable = reformattable.replaceAll("\",\n", "\\\",");
reformattable = reformattable.replaceAll(" +"," ");
reformattable = reformattable.replaceAll("\\{ "," {");
reformattable = reformattable.replaceAll("\n }","}");
return og.replace(arrayOnly,reformattable);
}
Result should look like this (at least for my simple class):
{
"classname": "test",
"properties": [
{"propertyname": "1", "length": 1},
{"propertyname": "1", "length": 1}
]
}

how to find the string in a Json array response in JAVA based on conditions

Apologies if this is a duplicate post. I am trying to find a string in the following array response based on the conditions specified below.
{
"MRData": {
"xmlns": "http://ergast.com/mrd/1.4",
"series": "f1",
"url": "http://ergast.com/api/f1/2016/drivers.json",
"limit": "30",
"offset": "0",
"total": "24",
"DriverTable": {
"season": "2016",
"Drivers": [
{
"driverId": "alonso",
"permanentNumber": "14",
"code": "ALO",
"url": "http://en.wikipedia.org/wiki/Fernando_Alonso",
"givenName": "Fernando",
"familyName": "Alonso",
"dateOfBirth": "1981-07-29",
"nationality": "Spanish"
},
{
"driverId": "bottas",
"permanentNumber": "77",
"code": "BOT",
"url": "http://en.wikipedia.org/wiki/Valtteri_Bottas",
"givenName": "Valtteri",
"familyName": "Bottas",
"dateOfBirth": "1989-08-28",
"nationality": "Finnish"
},
{
"driverId": "button",
"permanentNumber": "22",
"code": "BUT",
"url": "http://en.wikipedia.org/wiki/Jenson_Button",
"givenName": "Jenson",
"familyName": "Button",
"dateOfBirth": "1980-01-19",
"nationality": "British"
}
]
}
}
}
1) I would like to find the permanent number of driverId "alonso", assuming that it does not always come first in the response, i.e. the array is reshuffled each time the request is made. The logic here would be to get the array index of the driverId alonso and insert it into the query below:
"MRData.DriverTable.Drivers[insert the array index of alonso here].permanentNumber"
2) I would like to get the permanent numbers that are less than 20. I would also like to get the driverIds of the drivers whose permanent numbers are less than 20.
thanks a lot for viewing!
Try to build the classes "MRData" and "Driver" with all the necessary fields, and let org.json or GSON do the magic. You should really look at "How to parse JSON in Java", as Lars mentioned.
Got that sorted!
Answer to my first question:
public void extraResponseWithInRange(String url) {
    Response response = given().when().get(url);
    // @ refers to the current array element, so the filter finds alonso wherever it sits in the array
    List<Map<String, String>> responseFromArray = JsonPath.parse(response.asString()).read("$.MRData.DriverTable.Drivers[?(@.driverId == 'alonso')]");
    for (Map<String, String> rfa : responseFromArray) {
        assertThat(rfa.get("permanentNumber"), equalToIgnoringCase("14"));
    }
}
Answer to my second question:
List<Map<String, String>> driversBetween = JsonPath.parse(response.asString()).read("$.MRData.DriverTable.Drivers[?(@.permanentNumber > '0' && @.permanentNumber < '20')]");
for (Map<String, String> dbsmall : driversBetween) {
    System.out.println(dbsmall.get("permanentNumber"));
}
Please let me know if I could write this in a better way.
Thanks a lot!
Either marshal the data into a POJO and check the values of the fields there, or use something like JsonPath. A filter expression returns a list of matches, so read the result as a list:
List<String> permanentNumbers = JsonPath.read(json, "$..Drivers[?(@.driverId == 'alonso')].permanentNumber");
Disclaimer: I don't currently have an environment to run this, but their docs are pretty good.
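For the POJO route mentioned above, here is a rough sketch of both lookups using plain Java streams. It assumes the Drivers array has already been unmarshalled into objects by a JSON library of your choice. The DriverLookup class, the trimmed-down Driver type (with permanentNumber converted to an int) and both method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class DriverLookup {
    // Hypothetical POJO holding only the two fields needed here.
    static final class Driver {
        final String driverId;
        final int permanentNumber;

        Driver(String driverId, int permanentNumber) {
            this.driverId = driverId;
            this.permanentNumber = permanentNumber;
        }
    }

    // Question 1: permanent number of a given driver, regardless of array order.
    static Optional<Integer> permanentNumberOf(List<Driver> drivers, String driverId) {
        return drivers.stream()
                .filter(d -> d.driverId.equals(driverId))
                .map(d -> d.permanentNumber)
                .findFirst();
    }

    // Question 2: ids of all drivers whose permanent number is below the limit.
    static List<String> idsWithNumberBelow(List<Driver> drivers, int limit) {
        return drivers.stream()
                .filter(d -> d.permanentNumber < limit)
                .map(d -> d.driverId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Driver> drivers = Arrays.asList(
                new Driver("bottas", 77),
                new Driver("alonso", 14),
                new Driver("button", 22));
        System.out.println(permanentNumberOf(drivers, "alonso"));
        System.out.println(idsWithNumberBelow(drivers, 20));
    }
}
```

Comparing the numbers as int rather than as strings avoids surprises with lexicographic ordering, where "100" would sort before "20".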

java : search for substring in elasticsearch

I'm trying to search for substrings in Elasticsearch, but what I've learned and coded so far doesn't match substrings the way I want.
Here's what I've coded :
BoolQueryBuilder query = new BoolQueryBuilder();
query.must(new QueryStringQueryBuilder("tagName : *"+tagName+"*"));
SearchResponse response = esclient.prepareSearch(index).setTypes(type)
.setQuery(query)
.execute().actionGet();
SearchHit[] hits = response.getHits().getHits();
for (SearchHit hit : hits) {
Map map = hit.getSource();
list.add((String) map.get("tagName"));
}
list = list.stream().distinct().collect(Collectors.toList());
for(int i = 0; i < list.size(); i++) {
jsonArrayBuilder.add((String) list.get(i));
}
What I'm trying to implement: if any part of the given tag name matches, the tag should be listed.
For example, if I'm looking for a tag named "social_security_number" and I type "social security", I would like it to be listed.
But what actually happens is that if I leave out the underscore, it is not listed.
Is this possible? Should I modify this code to search that way?
Here is my index structure :
POST arempris/emptagnames
{
"mappings" : {
"emptags":{
"properties": {
"employeeid": {
"type":"integer"
},
"tagName": {
"type": "text",
"fielddata": true,
"analyzer": "lowercase_keyword",
"search_analyzer": "lowercase_keyword"
}
}
}
}
}
I would greatly appreciate your help. Thanks a lot in advance.
The analyzer that you have set does not tokenize anything, so the space is important. Specifying a custom analyzer that will split on whitespaces and underscores and anything you might find useful is a good solution. The below will work, but check really carefully what the analyzer does and visit the documentation for every part you don't understand.
PUT stackoverflow
{
"settings": {
"analysis": {
"analyzer": {
"customanalyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"standard",
"generatewordparts"
]
}
},
"filter": {
"generatewordparts": {
"type": "word_delimiter",
"split_on_numerics": false,
"split_on_case_change": false,
"generate_word_parts": true,
"generate_number_parts": false,
"stem_english_possessive": false,
"catenate_all": false
}
}
}
},
"mappings": {
"emptags": {
"properties": {
"employeeid": {
"type": "integer"
},
"tagName": {
"type": "text",
"fielddata": true,
"analyzer": "customanalyzer",
"search_analyzer": "customanalyzer"
}
}
}
}
}
POST stackoverflow/emptags/1
{
"employeeid": 1,
"tagName": "social_security_number"
}
GET stackoverflow/_analyze
{
"analyzer" : "customanalyzer",
"text" : "social_security_number123"
}
GET stackoverflow/_search
{
"query": {
"query_string": {
"default_field": "tagName",
"query": "*curi*"
}
}
}
Another solution would be to normalize your input and replace any symbol that you want to treat as a whitespace (e.g. underscore) with a whitespace.
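On the Java side, that normalization idea might look roughly like the sketch below. TagQueryNormalizer and normalize are made-up names, and the set of characters treated as separators (underscore and whitespace) is an assumption. The same function would need to be applied both to the values you index and to the text the user types, otherwise the two sides will not line up.

```java
import java.util.Locale;

class TagQueryNormalizer {
    // Lowercases the input and collapses every run of underscores and
    // whitespace into a single space, so "social_security" and
    // "Social  Security" end up as the same string.
    static String normalize(String input) {
        return input.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[_\\s]+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("social_security_number"));
        System.out.println(normalize("  Social   Security "));
    }
}
```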
Read more here:
http://nocf-www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-normalizers.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenizers.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html

Parse json response of rest API and delete certain jsonObjects - JAVA

I have a json file as below which I am getting as a response from rest API:
{
"label": " MARA LEYZIN",
"ClassCode": "PROFESSIONAL",
"actvFlg": "A",
"name": "MARA LEYZIN",
"Typ": {
"label": "C_TYP_LU",
"TypCode": "PROFESSIONAL "
},
"Address": {
"link": [],
"firstRecord": 1,
"pageSize": 10,
"searchToken": "multi",
"item": [
{
"label": "Address",
"addrTypFk": {
"label": "C_ADDRESS_TYPE_LU",
"addrTypCd": "INDUSTRY",
"addrTypDesc": "Industry"
}
}
]
}
I am trying to parse this in Java and remove some unwanted JSON objects. For example, I want the following text to be replaced with an empty string:
"link": [],
"firstRecord": 1,
"pageSize": 10,
"searchToken": "multi",
"item":
To achieve this I am trying the following approach:
String jsonStr = new String(Files.readAllBytes(Paths.get(inputFile)));
System.out.println(jsonStr);
jsonStr.replaceAll("link", "");
But it is not replacing the required string with an empty string. Please help me with this.
A String object is immutable, so replaceAll never changes the original string; it returns a new one. This prints the replaced string but does not affect the original:
System.out.println(jsonStr.replaceAll("link", ""));
However, if you assign the result back, the variable refers to the replaced string and this prints it:
jsonStr = jsonStr.replaceAll("link", "");
System.out.println(jsonStr);
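A small self-contained sketch of the difference, in case it helps. ReplaceDemo and withoutWord are hypothetical names, and String.replace is used instead of replaceAll only because the search text here is a literal rather than a regex.

```java
class ReplaceDemo {
    // Returns a new string with every occurrence of the word removed.
    // The argument itself is never modified because String is immutable.
    static String withoutWord(String text, String word) {
        return text.replace(word, "");
    }

    public static void main(String[] args) {
        String jsonStr = "{\"link\": [], \"pageSize\": 10}";

        jsonStr.replace("link", "");                  // result is thrown away
        System.out.println(jsonStr.contains("link")); // the word is still there

        jsonStr = withoutWord(jsonStr, "link");       // result assigned back
        System.out.println(jsonStr.contains("link")); // now it is gone
    }
}
```

Note that removing just the word leaves an empty key ("": []) behind, so to drop whole entries from the document a JSON parser is the more reliable tool.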
First of all:
Your JSON is not valid. You're missing a closing curly bracket at the end of it.
{
"label": " MARA LEYZIN",
"ClassCode": "PROFESSIONAL",
"actvFlg": "A",
"name": "MARA LEYZIN",
"Typ": {
"label": "C_TYP_LU",
"TypCode": "PROFESSIONAL "
},
"Address": {
"link": [],
"firstRecord": 1,
"pageSize": 10,
"searchToken": "multi",
"item": [{
"label": "Address",
"addrTypFk": {
"label": "C_ADDRESS_TYPE_LU",
"addrTypCd": "INDUSTRY",
"addrTypDesc": "Industry"
}
}]
}
}
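Second, you should assign the result of replaceAll back to the variable before printing, because replaceAll returns a new string and leaves the original unchanged:
jsonStr = jsonStr.replaceAll("link", "");
System.out.println(jsonStr);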
Important addition:
I would suggest using the org.json library or, even better, Jackson to parse JSON files.
Here's a tutorial on how to use Jackson; it's my warmest suggestion.
You will save a lot of time and you can do whatever you like.

Extract all status codes from a JSON response using a regex

I would like to extract all the status codes from a JSON response (Elasticsearch response to a bulk request) so that I can count how many documents have been created and how many errored.
Which regex should I use in the following code?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
...
List<String> allCodes = new ArrayList<String>();
Matcher m = Pattern.compile("regex").matcher(jsonResponseString);
while (m.find()) {
allCodes.add(m.group());
}
Example of JSON response:
{
"took": 9,
"errors": false,
"items": [
{
"index": {
"_index": "movies",
"_type": "drama",
"_id": "123",
"_version": 68,
"result": "updated",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": false,
"status": 200
}
},
{
"index": {
"_index": "movies",
"_type": "drama",
"_id": "456",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true,
"status": 201
}
}
]
}
Thanks!
List<String> allCodes = new ArrayList<String>();
Matcher m = Pattern.compile("\"status\":\\s*(\\d+)").matcher(YOUR_TEXT);
while (m.find()) {
allCodes.add(m.group(1));
}
System.out.println(allCodes);
But I would create a POJO with just the information you want, e.g.:
public class Response {
int took;
List<Item> itemList;
class Item{
int status;
}
}
and then use Jackson to convert.
By the way, there is a Java API for Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html so you don't need to handle the parsing yourself.
If status is always a number, you can use something like \"status\":\s+(\d+). Note that group() returns the whole matched sequence ("status": 200, "status": 201), whereas group(1) returns only the digits.
List<String> allCodes = new ArrayList<>();
Matcher m = Pattern.compile("\"status\":\\s+(\\d+)").matcher(jsonResponseString);
while (m.find()) {
allCodes.add(m.group(1));
}
System.out.println(allCodes);
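Since the goal was to count how many documents were created and how many errored, the extracted codes can be tallied directly. This is only a sketch built on the same regex idea; StatusTally and countByStatus are made-up names, and it assumes that "status" appears as a key only once per bulk item.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatusTally {
    // Captures the digits that follow a "status" key.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*(\\d+)");

    // Maps each status code found in the response to its number of occurrences.
    static Map<Integer, Integer> countByStatus(String json) {
        Map<Integer, Integer> counts = new TreeMap<>();
        Matcher m = STATUS.matcher(json);
        while (m.find()) {
            counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String response = "{\"items\": ["
                + "{\"index\": {\"_id\": \"123\", \"status\": 200}},"
                + "{\"index\": {\"_id\": \"456\", \"status\": 201}}]}";
        Map<Integer, Integer> counts = countByStatus(response);
        System.out.println(counts);
        System.out.println("created: " + counts.getOrDefault(201, 0));
    }
}
```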
As suggested in the comments, you shouldn't use a regex. If you think parsing the full JSON graph is too memory-consuming and don't want to use an Elasticsearch Java client, you may want to consider:
https://github.com/jayway/JsonPath
A lightweight library that is similar to XPath for XML, but for JSON. It extracts only the matching elements, based on the path you supply, so you do not have to map the whole response onto your own classes. In this case the path would be something like:
$.items[?(@.index.status == 200)]
I think you can even have the expression immediately return the count you're looking for by using .length(), without the need to count the matching elements manually.
