I am working on parsing the web page view-source:https://massive.ucsd.edu/ProteoSAFe/datasets.jsp. I want to parse the .jsp page and extract the JSON object from it.
I am using Jsoup to fetch the page:
Document doc = Jsoup.connect("https://massive.ucsd.edu/ProteoSAFe/datasets.jsp").maxBodySize(0).get();
Then I use a Java Pattern to extract the JSON as a string:
Pattern p = Pattern.compile(String.format("\"%s\":\\s*(.*),", "dataset","\"%s\":\\s*(.*),", "datasetNum","\"%s\":\\s*(.*),", "title","\"%s\":\\s*(.*),", "user","\"%s\":\\s*(.*),", "site","\"%s\":\\s*(.*),", "flowname","\"%s\":\\s*(.*),", "createdMillis","\"%s\":\\s*(.*),", "created","\"%s\":\\s*(.*),", "fileCount","\"%s\":\\s*(.*),", "fileSizeKB","\"%s\":\\s*(.*),", "psms","\"%s\":\\s*(.*),", "peptides","\"%s\":\\s*(.*),", "variants","\"%s\":\\s*(.*),", "proteins","\"%s\":\\s*(.*),", "species","\"%s\":\\s*(.*),", "instrument","\"%s\":\\s*(.*),", "modification","\"%s\":\\s*(.*),", "pi","\"%s\":\\s*(.*),", "complete","\"%s\":\\s*(.*),", "status","\"%s\":\\s*(.*),", "private","\"%s\":\\s*(.*),", "hash","\"%s\":\\s*(.*),", "px","\"%s\":\\s*(.*),", "task","\"%s\":\\s*(.*),", "id"));
Matcher m = p.matcher(script.html());
While doing so I get an error: the last line is not parsed correctly.
The match is cut off at the end, so I get the error
'A JSONObject text must end with '}' at character 577'.
Can anyone suggest a better way to parse this page and get the data?
It seems like a bad idea to parse any HTML with regex, but this pattern works for me:
Pattern.compile("(?s)var datasets = (\\[.*?\\]);")
(Tested via Python, since that's all I have available).
Note that this captures a JSON array, so parse the result as a JSONArray, not a JSONObject.
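For reference, here is how that pattern might be applied on the Java side. This is only a sketch using java.util.regex: the class name DatasetsExtractor, the method extractDatasets and the sample script text are made up for illustration, and it assumes the page really assigns the array with "var datasets = [...];". The lazy quantifier stops at the first "];" it finds, so it would cut the match short if that sequence ever appeared inside a string value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatasetsExtractor {
    // (?s) lets '.' match line breaks; '.*?' is lazy and stops at the first "];".
    private static final Pattern DATASETS =
            Pattern.compile("(?s)var datasets = (\\[.*?\\]);");

    /** Returns the array literal assigned to `datasets`, or null if it is not found. */
    static String extractDatasets(String scriptText) {
        Matcher m = DATASETS.matcher(scriptText);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the text of the page's <script> element.
        String script = "var datasets = [\n"
                + "  {\"dataset\":\"example\",\"fileCount\":\"3\"}\n"
                + "];\n"
                + "var other = 1;";
        System.out.println(extractDatasets(script));
    }
}
```

The captured text can then be handed to a JSON library as an array (for example new JSONArray(text) with org.json).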
I am getting the following JSON as the response of a REST call, and I am unable to parse it. Replacing "\" with "" doesn't work because the string contains many escape sequences like "\n".
"[{\"message_id\":50870,\"message\":\"4d074d54-6e08-a140-fb7a-ee1300b01fbf.png\"},
{\"message_id\":50823,\"message\":\"1\\n2\\n3\\n4\\n5\\n6\"},{\"message_id\":50341,\"message\":\"I am getting a \\\"Server Error\\\" }]"
I have tried JSONTokener and URLDecoder, but nothing seems to work.
I have also tried using
JsonString = JsonString.replace("\\\"", "\"");
This works, but is there a better way to do the conversion?
JSONSerialiser serialiser = new JSONSerialiser();
String jsonOutput= serialiser.include("id","message").exclude("*").serialize(javaobject);
JSONObject jObject = JSONFactoryUtil.createJSONObject(jsonOutput);
I get a String in the following format:
String buffer = "[{\"field1\": 11,\"field2\": 12,\"field3\": 13}]";
and want to convert it to a JSONArray.
Thus I use the following code:
JSONArray Jarray = CDL.toJSONArray(buffer);
My problem is that I now get the following exception:
org.json.JSONException: Bad character ':' (58). at 24 [character 25 line 1]
at org.json.JSONTokener.syntaxError(JSONTokener.java:432)
at org.json.CDL.rowToJSONArray(CDL.java:113)
at org.json.CDL.toJSONArray(CDL.java:193)
at org.json.CDL.toJSONArray(CDL.java:182)
at MyDataexchange.MyCVSConverter.convertJson(MyCVSConverter.java:44)
at Mainexe.DataTest.main(DataTest.java:22)
As you can see in the stack trace, I ultimately want to convert the string to .csv.
Since I don't know a better way to do it, I'd like to know how to fix this exception.
Do I need to substitute the ':' with something?
(Substituting ',' for ':' produces null, for example, and throws no exception, but that still doesn't help me.)
If so, please tell me with what; otherwise any suggestions are welcome.
org.json.CDL is for parsing and serializing comma-delimited text. However, your sample string is not comma-delimited text; it is JSON. You probably wanted: JSONArray Jarray = new JSONArray(buffer);
Ok, here's a better way of doing this (similar to what "guest" said):
String s = "[{\"field1\": 11,\"field2\": 12,\"field3\": 13}]";
Object obj=JSONValue.parse(s);
JSONArray array=(JSONArray)obj;
The code looks like this:
JSONObject solution = new JSONObject();
String variableName = "TEST";
System.err.println("1:"+value);
solution.put(variableName, value);
System.err.println("2:"+solution);
Here is the output result:
1:{"min":10,"max":40}
2:{"TEST":"{\"min\":10,\"max\":40}"}
How can I get rid of the annoying '\'?
Thank you very much!
You are seeing \ in the printed output because value is a String, so it is stored as a JSON string and every " character inside it must be escaped. There is nothing wrong with this: \" simply signifies that the quote does not terminate the string and is instead part of that string's value.
Valid JSON uses double quotes, not single quotes - see related Q here.
But if you really want to get rid of the backslashes, create value as a JSONObject like this:
JSONObject value = new JSONObject("{'min':10,'max':40}");
Then you should get the desired output from your existing code:
1:{"min":10,"max":40}
2:{"TEST":{"min":10,"max":40}}
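To make the difference between the two outputs concrete, here is a small sketch that uses only the JDK instead of org.json. The class JsonQuoteDemo and its quote helper are hypothetical; the helper escapes only backslashes and double quotes, so it is an illustration of the escaping rule and not a complete JSON encoder.

```java
class JsonQuoteDemo {
    /** Wraps s in double quotes, escaping backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String value = "{\"min\":10,\"max\":40}";

        // value treated as a string member: its inner quotes must be escaped.
        System.out.println("{\"TEST\":" + quote(value) + "}");
        // prints {"TEST":"{\"min\":10,\"max\":40}"}

        // value treated as a nested object: it is embedded as-is.
        System.out.println("{\"TEST\":" + value + "}");
        // prints {"TEST":{"min":10,"max":40}}
    }
}
```

In both cases the data is intact; the backslashes only appear in the serialized text when the inner JSON is carried as a string.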
When I send TextEdit data as JSON and the data contains the combination "; the app fails every time.
In detail, if I enter anything as my username but "; as the password, the resulting JSON looks like this:
{"UserName":"qa#1.com","Password":"\";"}
I have searched a lot, what I could understand is the resultant JSON data violates the syntax which results in throwing Default exception. I tried to get rid of special symbol by using URLEncoder.encode() method. But now the problem is in decoding.
Any help at any step would be greatly appreciated.
Logcat:
I/SW_HttpClient(448): sending post: {"UserName":"qa#1.com","Password":"\";"}
I/SW_HttpClient(448): HTTPResponse received in [2326ms]
I/SW_HttpClient(448): stream returned: <!DOCTYPE html PUBLIC ---- AN HTML PAGE.... A DEFAULT HANDLER>
Hi, try the following code:
String EMPLOYEE_SERVICE_URI = Utils.authenticate+"?UserName="+uid+"&Email="+eid+"&Password="+URLEncoder.encode(pwd,"UTF-8");
The JSON you provided in the Question is valid.
The JSON spec requires double quotes in strings to be escaped with a backslash. Read the syntax graphs here - http://www.json.org/.
If something is throwing an exception while parsing that JSON, then either the parser is buggy or the exception means something else.
I have searched a lot, what I could understand is the resultant JSON data violates the syntax
Your understanding is incorrect.
I tried to get rid of special symbol by using URLEncoder.encode() method.
That is a mistake, and is only going to make matters worse:
The backslash SHOULD be there.
The server or whatever that processes the JSON will NOT be expecting random escaping from a completely different standard.
But now the problem is in decoding.
Exactly.
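As a quick illustration of why URL-encoding changes the payload, here is a sketch using only java.net.URLEncoder and URLDecoder. The class UrlEncodingDemo and its helper methods are made up for this example. It shows that the encoded password is a different string, so a server expecting plain JSON would have to decode it explicitly with the same charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlEncodingDemo {
    /** URL-encodes s as UTF-8 (form encoding, not JSON escaping). */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Reverses encode(); the receiver must do this to get the original text back. */
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String password = "\";";
        String encoded = encode(password);
        System.out.println(encoded);          // prints %22%3B
        System.out.println(decode(encoded));  // prints ";
    }
}
```

If the receiver never calls the decode step, it stores or compares %22%3B instead of the real password.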
The provided JSON can be parsed with the Gson library using the code below:
private String sampledata = "{\"UserName\":\"qa#1.com\",\"Password\":\"\\\";\"}";
Gson g = new Gson();
g.fromJson(sampledata, sample.class);
public class sample {
public String UserName;
public String Password;
}
For decoding the text, I found the solution:
URLDecoder.decode(String, String);