I'm having trouble with a scenario in which I need to read, in Java, a JSON object that has no double quotes around its keys or its values, like the example below:
"{id: 267107086801, productCode: 02-671070868, lastUpdate: 2018-07-15, lastUpdateTimestamp: 2018-07-15 01:49:58, user: {pf: {document: 123456789, name: Luis Fernando}, address: {street: Rua Pref. Josu00e9 Alves Lima,number:37}, payment: [{sequential: 1, id: CREDIT_CARD, value: 188, installments: 9}]}"
I was able to add the double quotes in the fields using the code below, with replaceAll and the Gson library:
String jsonString = gson.toJson(obj);
jsonString = jsonString.replaceAll("([\\w]+)[ ]*:", "\"$1\" :"); // to quote before: value
jsonString = jsonString.replaceAll(":[ ]*([\\w#\\.]+)", ": \"$1\""); // to quote after: value, add special character as needed to the exclusion list in regex
jsonString = jsonString.replaceAll(":[ ]*\"([\\d]+)\"", ": $1"); // to un-quote decimal value
jsonString = jsonString.replaceAll("\"true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"false\"", "false"); // to un-quote boolean
However, fields containing dates are being split incorrectly, for example:
"{"id" : 267107086801,"productCode" : 02-671070868,"lastUpdate" : 2018-07-15,"lastUpdateTimestamp" : 2018-07-15 "01" : 49 : 58,"user" :{"pf":{"document" : 123456789, "name" : "Luis" Fernando},"address" :{"street" : "Rua"Pref.Josu00e9AlvesLima,"number" : 37},"payment" : [{"sequential" : 1,"id" : "CREDIT_CARD","value" : 188,"installments" : 9}]}"
Strings with spaces come out wrong as well. How could I correct this logic? What am I doing wrong? Thanks in advance.
String incorrectJson = "{id: 267107086801, productCode: 02-671070868,"
+ " lastUpdate: 2018-07-15, lastUpdateTimestamp: 2018-07-15 01:49:58,"
+ " user: {pf: {document: 123456789, name: Luis Fernando},"
+ " address: {street: Rua Pref. Josu00e9 Alves Lima,number:37},"
+ " payment: [{sequential: 1, id: CREDIT_CARD, value: 188, installments: 9}]}";
String correctJson = incorrectJson.replaceAll("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})", "\"$1\"");
System.out.println(correctJson);
Output:
{id: "267107086801", productCode: "02-671070868", lastUpdate:
"2018-07-15", lastUpdateTimestamp: "2018-07-15 01:49:58", user: {pf:
{document: "123456789", name: "Luis Fernando"}, address: {street: "Rua
Pref. Josu00e9 Alves Lima",number:"37"}, payment: [{sequential: "1",
id: "CREDIT_CARD", value: "188", installments: "9"}]}
One downside of non-trivial regular expressions is that they can be hard to read. The one I use here matches each literal value (but not values that are objects or arrays). I use colons, commas and curly braces to guide the matching, so I don’t need to care what is inside each string value: it may contain any characters except a comma or a right curly brace. The parts mean:
(?<=: ?): there’s a colon and optionally a blank before the value (lookbehind)
(?![ \\{\\[]): the value does not start with a blank, curly brace or square bracket (negative lookahead; blank because we don’t want a blank between the colon and the value to be taken as part of the value)
(.+?): the value consists of at least one character, as few as possible (reluctant quantifier; otherwise the regex would try to take the rest of the string)
(?=,|}): after the value comes either a comma or a right curly brace (positive lookahead).
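The regex explained above quotes every literal value, numbers included. If numbers and booleans should stay unquoted, as the question's own attempt tried to do, one option is to decide per match instead of using a fixed replacement string. The following is only a sketch: the class name ValueQuoter, the method quoteValues and the BARE pattern for what counts as a number are assumptions made for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ValueQuoter {

    // Same value-matching regex as in the answer above.
    private static final Pattern VALUE =
            Pattern.compile("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})");

    // Assumption: values that look like a JSON number, true, false or null stay bare.
    private static final Pattern BARE =
            Pattern.compile("-?(0|[1-9]\\d*)(\\.\\d+)?|true|false|null");

    static String quoteValues(String lenient) {
        Matcher m = VALUE.matcher(lenient);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = m.group(1);
            String replacement = BARE.matcher(value).matches() ? value : "\"" + value + "\"";
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteValues(
                "{id: 267107086801, productCode: 02-671070868, name: Luis Fernando, ok: true}"));
    }
}
```

A value such as 02-671070868 is still quoted because it does not match the number pattern as a whole. Names are left untouched here, exactly as in the first regex of the answer.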
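I am not well versed in JSON, but strict JSON does require the names to be double-quoted too (some lenient parsers accept unquoted names). You can quote them like this:
String correctJson = incorrectJson.replaceAll(
        "(?<=\\{|, ?)([a-zA-Z]+?): ?(?![ \\{\\[])(.+?)(?=,|})", "\"$1\": \"$2\"");
Output (names whose value is an object or array, such as user and address, are still left unquoted):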
{"id": "267107086801", "productCode": "02-671070868", "lastUpdate":
"2018-07-15", "lastUpdateTimestamp": "2018-07-15 01:49:58", user: {pf:
{"document": "123456789", "name": "Luis Fernando"}, address:
{"street": "Rua Pref. Josu00e9 Alves Lima","number": "37"}, payment:
[{"sequential": "1", "id": "CREDIT_CARD", "value": "188",
"installments": "9"}]}
The following code also takes care of single quotes present in the JSON string, as well as keys containing digits:
jsonString = jsonString.replaceAll(" :", ":"); // to strip the space before the colon
jsonString = jsonString.replaceAll(": ,", ":,");
jsonString = jsonString.replaceAll("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})", "\"$1\"");
jsonString = jsonString.replaceAll("(?<=\\{|, ?)([a-zA-Z0-9]+?)(?=:)", "\"$1\"");
jsonString = jsonString.replaceAll("\"true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"false\"", "false"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"null\"", "null"); // to un-quote null
jsonString = jsonString.replaceAll("\":\",", "\":\"\" ,"); // to remove unnecessary double quotes
jsonString = jsonString.replaceAll("true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("'\",", "',"); // to handle single quote within json string
jsonString = jsonString.replaceAll("'},", "'}\","); // to put double quote after string ending with single quote
I have a String:
String s = "msqlsum81pv 0 0 25 25 25 2 -sn D:\\workdir\\PV 81\\config\\sum81pv.pwf -C 5000";
I want to get the path (in this case D:\\workdir\\PV 81\\config\\sum81pv.pwf) from this string. This path is an argument of a command option -sn or -n, so this path always appears after these options.
The path may or may not contain whitespace, and both cases need to be handled.
public class TestClass {
    public static void main(String[] args) {
        String path;
        String s = "msqlsum81pv 0 0 25 25 25 2 -sn D:\\workdir\\PV 81\\config\\sum81pv.pwf -C 5000";
        path = s.replaceAll(".*(-sn|-n) \"?([^ ]*)?", "$2");
        System.out.println("Path: " + path);
    }
}
Current output: Path: D:\workdir\PV 81\config\sum81pv.pwf -C 5000
Expected output: Path: D:\workdir\PV 81\config\sum81pv.pwf
The answers below work fine for the earlier case. But what would the regex be to find the password file in the cases below?
String s1 = "msqllab91 0 0 1 50 50 60 /mti/root/bin/msqlora -n \"tmp/my.pwf\" -s";
String s2 = "msqllab92 0 0 1 50 50 60 /mti/root/bin/msqlora -s -n /mti/root/my.pwf";
String s3 = "msqllab93 0 0 1 50 50 60 msqlora -s -n \"/mti/root/my.pwf\" -C 10000";
String s4 = "msqllab94 0 0 1 50 50 60 msqlora.exe -sn /mti/root/my.pwf";
String s5 = "msqllab95 0 0 1 50 50 60 msqlora.exe -sn \"/mti/root\"/my.pwf";
String s6 = "msqllab96 0 0 1 50 50 60 msqlora.exe -sn\"/mti/root\"/my.pwf";
String s7 = "msqllab97 0 0 1 50 50 60 \"/mti/root/bin/msqlora\" -s -n /mti/root/my.pwf -s";
String s8 = "msqllab98 0 0 1 50 50 60 /mti/root/bin/msqlora -s";
String s9 = "msqllab99 0 0 1 50 50 60 /mti/root/bin/msqlora -s -n /mti/root/my.NOTpwf -s -n /mti/root/my.pwf";
String s10 = "msqllab90 0 0 1 50 50 60 /mti/root/bin/msqlora -sn /mti/root/my.NOTpwf -sn /mti/root/my.pwf";
String s11 = "msqllab901 0 0 1 50 50 60 /mti/root/bin/msqlora";
String s12 = "msqllab902 0 0 1 50 50 60 /mti/root/msqlora-n NOTmy.pwf";
String s13 = "msqllab903 0 0 1 50 50 60 /mti/root/msqlora-n.exe NOTmy.pwf";
I need a regex that returns the *.pwf path whether the option is given as -sn, -n, -s, -s -n, or without -s or -n.
The path must have the *.pwf file extension only, not NOTpwf or any other extension, and the code should work for all the cases except the last two, because those are invalid commands.
Note: I already asked this type of question but didn't get anything working as per my requirement. (How to get specific substring with option vale using java)
You can use:
path = s.replaceFirst(".*\\s-s?n\\s*(.+?)(?:\\s-.*|$)", "$1");
//=> D:\workdir\PV 81\config\sum81pv.pwf
Try this
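String s = "msqlsum81pv 0 0 25 25 25 2 -sn D:\\workdir\\PV 81\\config\\sum81pv.pwf -C 5000";
int l = s.indexOf("-sn");  // start of the -sn option
int l1 = s.indexOf("-C");  // start of the option that follows the path
// skip "-sn " (4 characters) and stop just before the space that precedes "-C"
System.out.println(s.substring(l + 4, l1 - 1));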
You can also use: [A-Z]:.*\.\w+
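A quick way to try that pattern from Java is sketched below. It assumes a Windows-style path that starts with an upper-case drive letter; the class and method names are made up for this example. Because .* is greedy, the match runs to the last dot followed by word characters, so an argument after the path that itself contains a dot would be swallowed as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DrivePathDemo {

    // Returns the first "X:...<dot><word chars>" span, or null if there is none.
    static String findPath(String command) {
        Matcher m = Pattern.compile("[A-Z]:.*\\.\\w+").matcher(command);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "msqlsum81pv 0 0 25 25 25 2 -sn D:\\workdir\\PV 81\\config\\sum81pv.pwf -C 5000";
        System.out.println(findPath(s));
    }
}
```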
Rather than using complex regexps for replacing, I'd rather suggest a simpler one for matching:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "msqlsum81pv 0 0 25 25 25 2 -sn D:\\workdir\\PV 81\\config\\sum81pv.pwf -C 5000";
Pattern pattern = Pattern.compile("\\s-s?n\\s*(.*?)\\s*-C\\s+\\d+$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // Group 1 holds the path
}
// => D:\workdir\PV 81\config\sum81pv.pwf
If the -C <NUMBER> is optional at the end, wrap with an optional group -> (?:\\s*-C\\s+\\d+)?$.
Pattern details:
\\s - a whitespace
-s?n - a -sn or -n (as s? matches an optional s)
\\s* - 0+ whitespaces
(.*?) - Group 1 matching any 0+ chars other than a newline, as few as possible
\\s* - 0+ whitespaces again
-C - a literal -C
\\s+ - 1+ whitespaces
\\d+ - 1 or more digits
$ - end of string.
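For the extended cases listed in the question (s1 to s13), the answers above are not enough, because the path may be quoted, may directly follow the option, and may be preceded by an earlier -n whose argument is not a .pwf file. One possible sketch follows. It assumes that the path never contains whitespace followed by a hyphen and that only -n or -sn introduces the path; PwfFinder and findPwf are hypothetical names, and the regex is my own rather than one taken from the answers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PwfFinder {

    // whitespace, -n or -sn, optional whitespace, then the shortest run of characters
    // that does not cross another " -option" and ends in ".pwf"
    private static final Pattern PWF = Pattern.compile(
            "\\s-s?n\\s*((?:(?!\\s-).)*?\\.pwf)(?=[\"\\s]|$)");

    // Returns the path with double quotes removed, or null when no -n/-sn argument ends in .pwf.
    static String findPwf(String command) {
        Matcher m = PWF.matcher(command);
        return m.find() ? m.group(1).replace("\"", "").trim() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "msqllab91 0 0 1 50 50 60 /mti/root/bin/msqlora -n \"tmp/my.pwf\" -s",
            "msqllab96 0 0 1 50 50 60 msqlora.exe -sn\"/mti/root\"/my.pwf",
            "msqllab99 0 0 1 50 50 60 /mti/root/bin/msqlora -s -n /mti/root/my.NOTpwf -s -n /mti/root/my.pwf",
            "msqllab902 0 0 1 50 50 60 /mti/root/msqlora-n NOTmy.pwf"
        };
        for (String sample : samples) {
            System.out.println(findPwf(sample));
        }
    }
}
```

The leading \\s is what rejects the last two commands: in msqlora-n the hyphen is not preceded by whitespace, so it is not treated as an option.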
I have the following data, which I want to split:
1111|AAA|DDDD|CCC00021|RR13|600999922|101111287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000
2222|BBB|DDDD|CCC00031|RR15|600911122|101000287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000
3333|AAA|DDDD|CCC11021|RR01|600955522|101122287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000
Each of these should be treated as one line. I need to store each element,
to get an output of:
1111
AAA
DDDD
CCC00021
RR13
600999922
101111287
0
0
2011-06-20 15:38:31.549000
2011-06-30 08:57:20.114000
Next line
2222
BBB
DDDD
CCC00031
RR15
600911122
101000287
0
0
2011-06-20 15:38:31.549000
2011-06-30 08:57:20.114000
Next Line
3333
AAA
DDDD
CCC11021
RR01
600955522
101122287
0
0
2011-06-20 15:38:31.549000
2011-06-30 08:57:20.114000
I am using Scanner class.
Code:
String var = "1111|AAA|DDDD|CCC00021|RR13|600999922|101111287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000 2222|BBB|DDDD|CCC00031|RR15|600911122|101000287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000 3333|AAA|DDDD|CCC11021|RR01|600955522|101122287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000";
for (String x : var.split("\\|")) {
    System.out.println(x);
}
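In the code above the three records are joined by a single space, so the last timestamp of one record and the first field of the next end up in the same token (for example "2011-06-30 08:57:20.114000 2222"). A minimal sketch of one way around this, assuming the records really arrive one per line as in the data listing; the class name PipeRecords and the method parse are made up for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class PipeRecords {

    // Reads one record per line and splits each line on the literal '|'.
    static List<String[]> parse(String data) {
        List<String[]> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(data)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (!line.isEmpty()) {
                    // '|' is a regex metacharacter, so it has to be escaped
                    records.add(line.split("\\|"));
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String data =
                "1111|AAA|DDDD|CCC00021|RR13|600999922|101111287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000\n"
              + "2222|BBB|DDDD|CCC00031|RR15|600911122|101000287|0|0|2011-06-20 15:38:31.549000|2011-06-30 08:57:20.114000\n";
        for (String[] record : parse(data)) {
            for (String field : record) {
                System.out.println(field);
            }
            System.out.println("Next line");
        }
    }
}
```

Splitting per line first keeps the space inside each timestamp intact, because the space is never used as a separator.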
I am creating an IRC client in Java. It works fine, but the messages from the server are a bit "messed up",
for example: :User1!webirc@1.9.com PRIVMSG #channel :test. So I want to know how to parse an IRC message into something human-readable. Here is a regex that I found for IRC messages: ^(:(\\S+) )?(\\S+)( (?!:)(.+?))?( :(.+))?$
The IRC Protocol is documented here: https://www.rfc-editor.org/rfc/rfc2812
2.3.1 Message format in Augmented BNF
The protocol messages must be extracted from the contiguous stream of
octets. The current solution is to designate two characters, CR and
LF, as message separators. Empty messages are silently ignored,
which permits use of the sequence CR-LF between messages without
extra problems.
The extracted message is parsed into the components <prefix>,
<command> and list of parameters (<params>).
The Augmented BNF representation for this is:
message = [ ":" prefix SPACE ] command [ params ] crlf
prefix = servername / ( nickname [ [ "!" user ] "@" host ] )
command = 1*letter / 3digit
params = *14( SPACE middle ) [ SPACE ":" trailing ]
=/ 14( SPACE middle ) [ SPACE [ ":" ] trailing ]
nospcrlfcl = %x01-09 / %x0B-0C / %x0E-1F / %x21-39 / %x3B-FF
; any octet except NUL, CR, LF, " " and ":"
middle = nospcrlfcl *( ":" / nospcrlfcl )
trailing = *( ":" / " " / nospcrlfcl )
SPACE = %x20 ; space character
crlf = %x0D %x0A ; "carriage return" "linefeed"
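The regex from the question maps onto this grammar fairly directly: group 2 is the prefix, group 3 the command, group 5 the middle parameters and group 7 the trailing parameter. A minimal sketch of using it from Java, assuming the CR-LF has already been stripped (for example by BufferedReader.readLine); the class IrcLineParser and its methods are hypothetical names, and the output format chosen for PRIVMSG is only an illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IrcLineParser {

    // Regex quoted in the question: optional ":prefix ", command,
    // optional middle parameters, optional " :trailing".
    private static final Pattern IRC =
            Pattern.compile("^(:(\\S+) )?(\\S+)( (?!:)(.+?))?( :(.+))?$");

    // Returns {prefix, command, middle params, trailing}; absent parts are null.
    static String[] parse(String line) {
        Matcher m = IRC.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(2), m.group(3), m.group(5), m.group(7) };
    }

    // Turns a PRIVMSG into "<nick> target: text"; anything else is returned unchanged.
    static String humanReadable(String line) {
        String[] p = parse(line);
        if (p == null || p[0] == null || p[2] == null || p[3] == null
                || !"PRIVMSG".equals(p[1])) {
            return line;
        }
        int bang = p[0].indexOf('!');
        String nick = bang >= 0 ? p[0].substring(0, bang) : p[0];
        return "<" + nick + "> " + p[2] + ": " + p[3];
    }

    public static void main(String[] args) {
        System.out.println(humanReadable(":User1!webirc@1.9.com PRIVMSG #channel :test"));
    }
}
```

Note that the regex is looser than the ABNF (it does not limit the number of parameters or check the command syntax), so it is a convenience for display rather than a validator.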
I want to split a number of strings similar to name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST] into only these tokens:
john
20
toledo
seattle
[2/8/12 15:48:01:837 MST]
I'm doing this
String delims = "(name|id|dest|from|date_time)?[:,\\s]+";
String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
String[] lineTokens = line.split(delims, 5);
for (String t : lineTokens)
{
// for debugging
System.out.println (t);
// other processing I want to do
}
but every even element in lineTokens turns out to be either empty or just whitespace. Each odd element in lineTokens is what I want, i.e. lineTokens[0] is "", lineTokens[1] is "john", lineTokens[2] is "", lineTokens[3] is "20", etc. Can anyone explain what I'm doing wrong?
The problem is that your regex does not match ", id: " as a whole; it matches ", " as one match and then "id: " as a second match. Between these two matches there is an empty string. You need to modify the regex so that it matches the whole thing, something like this:
String delims = "(, )?(name|id|dest|from|date_time)?[:\\s]+";
http://ideone.com/Qgs8y
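A small sketch of how that delimiter behaves when it is dropped into the question's code. One assumption here is mine: the limit passed to split is raised from 5 to 6, because the result now consists of one leading empty string plus five values. With a limit of 5 the last two fields would stay together in one token, and with no limit the date would be split further at its own spaces and colons. The class name DelimiterDemo is made up for the example.

```java
import java.util.Arrays;

public class DelimiterDemo {

    static String[] tokens(String line) {
        // Delimiter from the answer: optional ", ", optional field name, then colons/whitespace.
        String delims = "(, )?(name|id|dest|from|date_time)?[:\\s]+";
        // Limit 6 (an assumption of this sketch): leading "" + 5 values,
        // so the date at the end is kept in one piece.
        return line.split(delims, 6);
    }

    public static void main(String[] args) {
        String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
        System.out.println(Arrays.toString(tokens(line)));
    }
}
```

The first element is still the empty string in front of "name: ", so the values start at index 1.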
Why not use a slightly less complicated regex solution?
String str = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
String[] expr = str.split(", ");
for(String e : expr)
System.out.println(e.split(": ")[1]);
Output =
john
20
toledo
seattle
[2/8/12 15:48:01:837 MST]
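I made some changes to your code: the field name is now mandatory, an optional leading ", " is part of the delimiter (so it does not stay attached to the preceding value), and the limit is dropped:
String delims = "(,\\s*)?(name|id|dest|from|date_time)[:\\s]+";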
String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
String[] lineTokens = line.split(delims);
for (String t : lineTokens)
{
// for debugging
System.out.println (t);
// other processing I want to do
}
Also, you should ignore the first element in lineTokens, since it is the (empty) text from the beginning of the line up to "name:".
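An alternative to splitting is to match the values directly, which avoids the empty first element altogether. This is only a sketch under the assumption that the field names are exactly the five from the question and that no value contains ", " followed by one of those names and a colon; FieldValues and values are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldValues {

    private static final String NAMES = "(?:name|id|dest|from|date_time)";

    // A field name, a colon, then the shortest text that is followed by
    // ", <next field name>:" or by the end of the line.
    private static final Pattern FIELD = Pattern.compile(
            "\\b" + NAMES + ":\\s*(.*?)(?=,\\s*" + NAMES + ":|$)");

    static List<String> values(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
        for (String value : values(line)) {
            System.out.println(value);
        }
    }
}
```

Because the lookahead only stops at a comma that is followed by a known field name, the colons and spaces inside the date do not end the last value early.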