java.net.URI.create("localhost:8080/foo") // Works
java.net.URI.create("127.0.0.1:8080/foo") // Throws exception
java.net.URI.create("//127.0.0.1:8080/foo") // Works
Is the double slash required when the host is an IP address? I glanced through the URI RFC (https://www.rfc-editor.org/rfc/rfc3986) but could not find anything pertaining to this.
java.net.URI.create uses the syntax described in RFC 2396.
java.net.URI.create("localhost:8080/foo")
This doesn't produce an exception, but the URI is parsed in a way which you probably don't expect. Its scheme (not host!) is set to localhost, and the 8080/foo isn't port + path, but a scheme-specific part. So this doesn't really work.
java.net.URI.create("//localhost:8080/foo")
parses the URL without scheme, as a net_path grammar element (see RFC 2396 for details).
Here's the relevant grammar excerpt from RFC 2396:
URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
// This is how 'localhost:8080/foo' is parsed:
absoluteURI = scheme ":" ( hier_part | opaque_part )
relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
...
// This is how '//127.0.0.1:8080/foo' is parsed:
net_path = "//" authority [ abs_path ]
...
// Scheme must start with a letter,
// hence 'localhost' is parsed as a scheme, but '127' isn't:
scheme = alpha *( alpha | digit | "+" | "-" | "." )
One proper way would be:
java.net.URI.create("http://localhost:8080/foo")
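To see how each input is split into components, here is a minimal sketch that prints what java.net.URI reports for the strings discussed above. The class name UriParsingDemo and the helpers describe and parses are made up for illustration; only the java.net.URI getters are real API.

```java
import java.net.URI;

final class UriParsingDemo {
    /** Summarizes the components java.net.URI extracted from the input. */
    static String describe(String input) {
        URI uri = URI.create(input);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " path=" + uri.getPath()
                + " opaque=" + uri.isOpaque();
    }

    /** Returns false when URI.create rejects the input. */
    static boolean parses(String input) {
        try {
            URI.create(input);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 'localhost' becomes the scheme; '8080/foo' is an opaque scheme-specific part.
        System.out.println(describe("localhost:8080/foo"));
        // scheme=localhost host=null port=-1 path=null opaque=true

        // With '//' the authority is parsed, so host and port are populated.
        System.out.println(describe("//127.0.0.1:8080/foo"));
        // scheme=null host=127.0.0.1 port=8080 path=/foo opaque=false

        // An explicit scheme gives the result most callers expect.
        System.out.println(describe("http://localhost:8080/foo"));
        // scheme=http host=localhost port=8080 path=/foo opaque=false

        // A scheme cannot start with a digit, so this input is rejected.
        System.out.println(parses("127.0.0.1:8080/foo")); // false
    }
}
```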
I'm having difficulty with a scenario where I need to read, in Java, a JSON object that has no double quotes around its keys or values, like the example below:
"{id: 267107086801, productCode: 02-671070868, lastUpdate: 2018-07-15, lastUpdateTimestamp: 2018-07-15 01:49:58, user: {pf: {document: 123456789, name: Luis Fernando}, address: {street: Rua Pref. Josu00e9 Alves Lima,number:37}, payment: [{sequential: 1, id: CREDIT_CARD, value: 188, installments: 9}]}"
I was able to add the double quotes in the fields using the code below, with replaceAll and the Gson library:
String jsonString = gson.toJson(obj);
jsonString = jsonString.replaceAll("([\\w]+)[ ]*:", "\"$1\":"); // to quote the name before the colon
jsonString = jsonString.replaceAll(":[ ]*([\\w#\\.]+)", ": \"$1\""); // to quote the value after the colon; add special characters to the character class as needed
jsonString = jsonString.replaceAll(":[ ]*\"([\\d]+)\"", ": $1"); // to un-quote integer values
jsonString = jsonString.replaceAll("\"true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"false\"", "false"); // to un-quote boolean
However, fields with dates are being broken down erroneously, for example:
"{"id" : 267107086801,"productCode" : 02-671070868,"lastUpdate" : 2018-07-15,"lastUpdateTimestamp" : 2018-07-15 "01" : 49 : 58,"user" :{"pf":{"document" : 123456789, "name" : "Luis" Fernando},"address" :{"street" : "Rua"Pref.Josu00e9AlvesLima,"number" : 37},"payment" : [{"sequential" : 1,"id" : "CREDIT_CARD","value" : 188,"installments" : 9}]}"
Also, strings with spaces are wrong as well. How could I correct this logic? What am I doing wrong? Thanks in advance.
String incorrectJson = "{id: 267107086801, productCode: 02-671070868,"
+ " lastUpdate: 2018-07-15, lastUpdateTimestamp: 2018-07-15 01:49:58,"
+ " user: {pf: {document: 123456789, name: Luis Fernando},"
+ " address: {street: Rua Pref. Josu00e9 Alves Lima,number:37},"
+ " payment: [{sequential: 1, id: CREDIT_CARD, value: 188, installments: 9}]}";
String correctJson = incorrectJson.replaceAll("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})", "\"$1\"");
System.out.println(correctJson);
Output:
{id: "267107086801", productCode: "02-671070868", lastUpdate:
"2018-07-15", lastUpdateTimestamp: "2018-07-15 01:49:58", user: {pf:
{document: "123456789", name: "Luis Fernando"}, address: {street: "Rua
Pref. Josu00e9 Alves Lima",number:"37"}, payment: [{sequential: "1",
id: "CREDIT_CARD", value: "188", installments: "9"}]}
One downside of non-trivial regular expressions is that they can be hard to read. The one I use here matches each literal value (but not values that are objects or arrays). I use colons, commas and curly braces to guide the matching, so I don't need to care what is inside each value: it may contain any characters except a comma or a right curly brace. The parts mean:
(?<=: ?): there's a colon and optionally a blank before the value (lookbehind)
(?![ \\{\\[]): the value does not start with a blank, curly brace or square bracket (negative lookahead; blank because we don't want a blank between the colon and the value to be taken as part of the value)
(.+?): the value consists of at least one character, as few as possible (reluctant quantifier; otherwise the regex would try to take the rest of the string)
(?=,|}): after the value comes either a comma or a right curly brace (positive lookahead).
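The breakdown above can be checked on a smaller input. The following is a minimal sketch, assuming the same pattern as in the answer; the class name QuoteValuesDemo, the constant LITERAL_VALUE and the method quoteValues are made up for illustration.

```java
final class QuoteValuesDemo {
    // lookbehind ": " or ":", then a value that does not start with blank, '{' or '[',
    // taken reluctantly up to the next ',' or '}'
    static final String LITERAL_VALUE = "(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})";

    /** Wraps every literal value in double quotes; object and array values are skipped. */
    static String quoteValues(String relaxedJson) {
        return relaxedJson.replaceAll(LITERAL_VALUE, "\"$1\"");
    }

    public static void main(String[] args) {
        String input = "{ts: 2018-07-15 01:49:58, user: {name: Luis Fernando}}";
        // The colons inside the timestamp are consumed as part of the first match,
        // so they do not start new values.
        System.out.println(quoteValues(input));
        // {ts: "2018-07-15 01:49:58", user: {name: "Luis Fernando"}}
    }
}
```

As the explanation says, a value that itself contains a comma or a right curly brace would be cut short, so this only suits inputs where that cannot happen.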
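Strict JSON also requires the names to be double-quoted, although some lenient parsers accept unquoted names. The following pattern quotes each name together with its value; it only covers names whose values are literals, so names such as user and payment stay unquoted: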
String correctJson = incorrectJson.replaceAll(
"(?<=\\{|, ?)([a-zA-Z]+?): ?(?![ \\{\\[])(.+?)(?=,|})", "\"$1\": \"$2\"");
{"id": "267107086801", "productCode": "02-671070868", "lastUpdate":
"2018-07-15", "lastUpdateTimestamp": "2018-07-15 01:49:58", user: {pf:
{"document": "123456789", "name": "Luis Fernando"}, address:
{"street": "Rua Pref. Josu00e9 Alves Lima","number": "37"}, payment:
[{"sequential": "1", "id": "CREDIT_CARD", "value": "188",
"installments": "9"}]}
The following code also handles single quotes in the JSON string and keys that contain digits:
jsonString = jsonString.replaceAll(" :", ":"); // to strip the space after a key
jsonString = jsonString.replaceAll(": ,", ":,");
jsonString = jsonString.replaceAll("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})", "\"$1\"");
jsonString = jsonString.replaceAll("(?<=\\{|, ?)([a-zA-Z0-9]+?)(?=:)", "\"$1\"");
jsonString = jsonString.replaceAll("\"true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"false\"", "false"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"null\"", "null"); // to un-quote null
jsonString = jsonString.replaceAll(":\",", ":\"\" ,"); // to remove unnecessary double quotes
jsonString = jsonString.replaceAll("true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("'\",", "',"); // to handle single quote within json string
jsonString = jsonString.replaceAll("'},", "'}\","); // to put double quote after string ending with single quote
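The core of this multi-pass approach can be sketched on a small input. This is only an illustration under the assumption that values contain no commas or right curly braces; the class name RelaxedJsonQuoter and the method quote are made up, and the single-quote handling from the snippet above is left out.

```java
final class RelaxedJsonQuoter {
    /** Quotes values, then names, then restores the true/false/null literals. */
    static String quote(String relaxedJson) {
        // Pass 1: quote every literal value (objects and arrays are skipped).
        String s = relaxedJson.replaceAll("(?<=: ?)(?![ \\{\\[])(.+?)(?=,|})", "\"$1\"");
        // Pass 2: quote every name that follows '{' or ',' and precedes ':'.
        s = s.replaceAll("(?<=\\{|, ?)([a-zA-Z0-9]+?)(?=:)", "\"$1\"");
        // Pass 3: un-quote the literals that JSON does not treat as strings.
        return s.replace("\"true\"", "true")
                .replace("\"false\"", "false")
                .replace("\"null\"", "null");
    }

    public static void main(String[] args) {
        String input = "{id: 7, active: true, note: null, user: {name: Luis Fernando}}";
        System.out.println(quote(input));
        // {"id": "7", "active": true, "note": null, "user": {"name": "Luis Fernando"}}
    }
}
```

Numbers stay quoted here ("7"), because no pass un-quotes them; a string value that is exactly true, false or null would also lose its quotes.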
I got the below result in spark after using GSON library.
[
"{\"A\":\"1\",\"A-Description\":\"Eastern \"}",
"{\"B\":\"2\",\"B-Description\":\"Western \"}",
"{\"C\":\"3\",\"C-Description\":\"Northern \"}",
"{\"D\":\"4\",\"D-Description\":\"Southern\"}"
]
I want to remove the double quotes from the start and end of each JSON string.
The final result should be as below:
[
{"A":"1","A-Description":"Eastern "},
{"B":"2","B-Description":"Western "},
{"C":"3","C-Description":"Northern "},
{"D":"4","D-Description":"Southern"}
]
I solved the issue as follows:
val jsonString = str.replaceAll("\\\\", "").replaceAll("\"(.+)\"", "$1")
where str is some string.
Please suggest a more efficient way if one is available.
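One possible alternative, sketched here in Java rather than Scala, is to avoid regular expressions and unwrap each element with plain string operations. This assumes each element is a JSON object that was serialized a second time as a string, so only the outer quotes and the \" and \\ escapes need undoing; the class name UnwrapJsonDemo and the method unwrap are made up for illustration.

```java
final class UnwrapJsonDemo {
    /**
     * Removes one layer of string quoting: strips the outer double quotes,
     * then turns \" back into " and \\ back into \.
     * Input that is not wrapped in quotes is returned unchanged.
     */
    static String unwrap(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            String inner = s.substring(1, s.length() - 1);
            // String.replace works on literal text, so no regex is compiled here.
            return inner.replace("\\\"", "\"").replace("\\\\", "\\");
        }
        return s;
    }

    public static void main(String[] args) {
        String wrapped = "\"{\\\"A\\\":\\\"1\\\",\\\"A-Description\\\":\\\"Eastern \\\"}\"";
        System.out.println(unwrap(wrapped));
        // {"A":"1","A-Description":"Eastern "}
    }
}
```

Other escape sequences (such as \n or \u00e9) are not decoded by this sketch. Where possible, it is usually cleaner to avoid the double serialization in the first place than to undo it afterwards.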
I want my grammar to recognize the following expression &COL[0]. I have built the following grammar:
array:
ARRAY_NAME L_RIGHT_PAR (ARRAY_DIGIT|STRING) R_RIGHT_PAR;
ARRAY_DIGIT:DIGIT+;
ARRAY_NAME: '&''COL';
STRING : QUOT ('\\"' | ~'"')* QUOT
;
L_RIGHT_PAR : '[' ;
R_RIGHT_PAR : ']' ;
fragment
DIGIT : '0'..'9' ;
This gives the error:
mismatched input '[1]' expecting '['
It only works if I write &COL[ 0] with spaces between the [ and ]
I changed the grammar a bit to make it complete enough to run. The text &COL[0] lexes fine with this amended grammar.
grammar test1; // different name for my test rig
test1: ARRAY_NAME L_RIGHT_PAR (ARRAY_DIGIT|STRING) R_RIGHT_PAR;
ARRAY_DIGIT:DIGIT+;
ARRAY_NAME: '&''COL';
STRING : QUOT ('\\"' | ~'"')* QUOT
;
QUOT: '"'; // assumed this
L_RIGHT_PAR : '[' ;
R_RIGHT_PAR : ']' ;
fragment
DIGIT : '0'..'9' ;
WS : [ \t\r\n] -> skip; // added whitespace just so I could add \r\n
Here's the tokenized output:
[#0,0:3='&COL',<ARRAY_NAME>,1:0]
[#1,4:4='[',<'['>,1:4]
[#2,5:5='0',<ARRAY_DIGIT>,1:5]
[#3,6:6=']',<']'>,1:6]
[#4,9:8='<EOF>',<EOF>,2:0]
So this answers the question you asked but I'm still not sure about your definition of STRING. But &COL[0] parses great now.
I have the following string input (from a netstat -a command):
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ]         DGRAM                    11453    /run/systemd/shutdownd
unix  2      [ ]         DGRAM                    7644     /run/systemd/notify
unix  2      [ ]         DGRAM                    7646     /run/systemd/cgroups-agent
unix  5      [ ]         DGRAM                    7657     /run/systemd/journal/socket
unix  14     [ ]         DGRAM                    7659     /dev/log
unix  3      [ ]         STREAM     CONNECTED     16620
unix  3      [ ]         STREAM     CONNECTED     16621
I'm attempting to parse the above string like this:
// lines is an array representing each line above
for (int i = 0; i < lines.length; i++) {
String[] tokens = lines[i].split("\\s+");
}
I want tokens to be an array of 7 entries [Proto, RefCnt, Flags, Type, State, I-Node, Path]. Instead, I get an array that splits the brackets under Flags into two tokens and drops the empty State:
["unix", "2", "[", "]", "DGRAM", "11453", "/run/systemd/shutdownd"]
instead of
["unix", "2", "[]", "DGRAM", "", "11453", "/run/systemd/shutdownd"]
How can I fix my regex to produce the correct output?
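You need to require at least two whitespace characters per separator, so that the single space inside [ ] is no longer a split point. Because the run is capped at 16, a wider gap, such as the one spanning the empty State column, is matched as two adjacent separators, which leaves an empty token between them. Try splitting like this: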
String[] tokens = lines[i].split("\\s{2,16}+");
Or, as #revo suggests, use lookarounds so that a separator is never matched directly after [ or directly before ]:
String[] tokens = lines[i].split("(?<!\\[)\\s{2,16}+(?!\\])");
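The effect of the bounded quantifier is easier to see on a line whose column gaps are known. The sketch below builds such a line explicitly; the gap widths (2, 6, 9, 20 and 4 spaces) are an assumption about netstat's fixed-width layout, and the class name NetstatSplitDemo and its methods are made up for illustration. String.repeat requires Java 11 or later.

```java
import java.util.Arrays;

final class NetstatSplitDemo {
    /** Builds one sample line with explicit, assumed column gaps. */
    static String sampleLine() {
        return "unix" + " ".repeat(2)
                + "2" + " ".repeat(6)
                + "[ ]" + " ".repeat(9)
                + "DGRAM" + " ".repeat(20)   // Type padding plus the empty State column
                + "11453" + " ".repeat(4)
                + "/run/systemd/shutdownd";
    }

    /** Splits on runs of 2 to 16 whitespace characters, never next to '[' or ']'. */
    static String[] tokenize(String line) {
        return line.split("(?<!\\[)\\s{2,16}+(?!\\])");
    }

    public static void main(String[] args) {
        String[] tokens = tokenize(sampleLine());
        // The 20-space gap is consumed as 16 + 4, producing an empty State token.
        System.out.println(tokens.length);           // 7
        System.out.println(Arrays.toString(tokens));
        // [unix, 2, [ ], DGRAM, , 11453, /run/systemd/shutdownd]
    }
}
```

This relies on the gap for an empty column being between 18 and 32 characters wide; with other widths the empty token would be missing or duplicated. Lines without a Path (the STREAM rows) yield fewer tokens, because split drops trailing empty strings.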