I am working on parsing a web page, view-source:https://massive.ucsd.edu/ProteoSAFe/datasets.jsp. I want to parse the .jsp page and extract the JSON object from it.
I am using Jsoup to extract the data:
Document doc = Jsoup.connect("https://massive.ucsd.edu/ProteoSAFe/datasets.jsp").maxBodySize(0).get();
Then I use a Java Pattern to extract the JSON as a string:
Pattern p = Pattern.compile(String.format("\"%s\":\\s*(.*),", "dataset","\"%s\":\\s*(.*),", "datasetNum","\"%s\":\\s*(.*),", "title","\"%s\":\\s*(.*),", "user","\"%s\":\\s*(.*),", "site","\"%s\":\\s*(.*),", "flowname","\"%s\":\\s*(.*),", "createdMillis","\"%s\":\\s*(.*),", "created","\"%s\":\\s*(.*),", "fileCount","\"%s\":\\s*(.*),", "fileSizeKB","\"%s\":\\s*(.*),", "psms","\"%s\":\\s*(.*),", "peptides","\"%s\":\\s*(.*),", "variants","\"%s\":\\s*(.*),", "proteins","\"%s\":\\s*(.*),", "species","\"%s\":\\s*(.*),", "instrument","\"%s\":\\s*(.*),", "modification","\"%s\":\\s*(.*),", "pi","\"%s\":\\s*(.*),", "complete","\"%s\":\\s*(.*),", "status","\"%s\":\\s*(.*),", "private","\"%s\":\\s*(.*),", "hash","\"%s\":\\s*(.*),", "px","\"%s\":\\s*(.*),", "task","\"%s\":\\s*(.*),", "id"));
Matcher m = p.matcher(script.html());
This fails with an error: the last line is not parsed correctly. The match is cut off at the end, so I get
'A JSONObject text must end with '}' at character 577'.
Can anyone suggest a better way to parse this page and get the data?
Parsing HTML with regex is generally a bad idea, but this pattern works for me:
Pattern.compile("(?s)var datasets = (\\[.*?\\]);")
(Tested via Python, since that's all I have available.)
Note that the captured group is a JSON array, so parse it as a JSONArray, not a JSONObject.
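For reference, here is a minimal Java sketch of how that pattern could be applied. It assumes the page's inline script really contains a `var datasets = [...];` assignment; the class name, method name, and sample script text are made up for illustration. Only java.util.regex is used, so the extracted text still has to be handed to a JSON library afterwards.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatasetsExtractor {
    // (?s) lets '.' match line breaks; the lazy group stops at the first "];".
    private static final Pattern DATASETS =
            Pattern.compile("(?s)var datasets = (\\[.*?\\]);");

    /** Returns the text of the array literal, or null if the script has none. */
    static String extractDatasets(String script) {
        Matcher m = DATASETS.matcher(script);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the page's inline script.
        String script = "var x = 1;\n"
                + "var datasets = [\n"
                + "{\"dataset\":\"MSV1\",\"title\":\"demo\"}\n"
                + "];\n"
                + "var y = 2;";
        System.out.println(extractDatasets(script));
    }
}
```

Because the group is lazy, it ends at the first `];` it meets, so a string value that happens to contain `];` would truncate the match.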
I have a website that serves plain text in a format like this:
{"code1":"Text I want copied","code2":"Second text I want to copy"}
Every time the page refreshes, though, the texts I want to copy change in length. How can I retrieve the text that starts after ' :" ' and ends before ' ", ', using Java? I want the same for the second text. I would also like to remove any HTML tags. Help will be greatly appreciated.
Using the org.json library, you could parse the JSON like:
String myJSONString = "{\"code1\":\"Text I want copied\",\"code2\":\"Second text I want to copy\"}";
JSONObject object = new JSONObject(myJSONString);
// Look the values up by key: JSONObject.getNames(object) does not guarantee key order.
String firstText = object.getString("code1");
String secondText = object.getString("code2");
For parsing the web page, you can use the JSoup library. See an example from this answer.
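If adding a library is not an option, the "text between the quotes" idea from the question can be sketched with the standard library alone. This is only a rough sketch, assuming a flat object whose keys and values are plain strings with no escaped quotes or backslashes; anything nested or escaped needs a real JSON parser. The class and method names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlatJsonStrings {
    // Matches "key":"value" where neither part contains a quote or a backslash.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"\\\\]*)\"\\s*:\\s*\"([^\"\\\\]*)\"");

    /** Collects every key/value pair in the order it appears in the text. */
    static Map<String, String> extract(String text) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            values.put(m.group(1), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        String page = "{\"code1\":\"Text I want copied\",\"code2\":\"Second text I want to copy\"}";
        Map<String, String> values = extract(page);
        System.out.println(values.get("code1"));
        System.out.println(values.get("code2"));
    }
}
```

The values are returned without the surrounding quotes, so their length can change freely between refreshes.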
I am getting the following JSON as the response of a REST call and I am unable to parse it. Replacing "\" with "" doesn't work, as the string contains many escape sequences like "\n".
"[{\"message_id\":50870,\"message\":\"4d074d54-6e08-a140-fb7a-ee1300b01fbf.png\"},
{\"message_id\":50823,\"message\":\"1\\n2\\n3\\n4\\n5\\n6\"},{\"message_id\":50341,\"message\":\"I am getting a \\\"Server Error\\\" }]"
I have tried JSONTokener and URLDecoder, but nothing seems to work.
I have also tried using
JsonString.replace("\\\"", "\"");
This works, but is there a better way to do the conversion?
JSONSerialiser serialiser = new JSONSerialiser();
String jsonOutput= serialiser.include("id","message").exclude("*").serialize(javaobject);
JSONObject jObject = JSONFactoryUtil.createJSONObject(jsonOutput);
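One possible reading of the response in the question is that it is a JSON document that was serialized a second time as a JSON string. Under that assumption, the robust fix is to decode the outer string literal once, handling every escape sequence, and then parse the result as usual, rather than replacing only \". A JSON library can do that decoding; the sketch below does it by hand with the standard library only, so the steps are visible. The class name and the sample response are hypothetical.

```java
class JsonStringUnescaper {
    /** Decodes one JSON string literal, surrounding quotes included. */
    static String unescape(String literal) {
        int end = literal.length() - 1; // index of the closing quote
        if (end < 1 || literal.charAt(0) != '"' || literal.charAt(end) != '"') {
            throw new IllegalArgumentException("not a quoted JSON string");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < end; i++) {
            char c = literal.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i >= end) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = literal.charAt(i);
            switch (e) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                case 'u':
                    // Four hex digits must fit before the closing quote.
                    if (i + 4 >= end) {
                        throw new IllegalArgumentException("truncated unicode escape");
                    }
                    out.append((char) Integer.parseInt(literal.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("bad escape: " + e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical doubly encoded response: a JSON string whose content is a JSON array.
        String response = "\"[{\\\"message_id\\\":50823,\\\"message\\\":\\\"1\\\\n2\\\"}]\"";
        System.out.println(unescape(response));
    }
}
```

After one decoding pass the inner escapes (such as the backslash-n inside a message) are still intact, so the inner document remains valid JSON for the second parse.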
I get a String in the following format:
String buffer = "[{\"field1\": 11,\"field2\": 12,\"field3\": 13}]";
and want to convert it to a JSONArray.
So I use the following code:
JSONArray Jarray = CDL.toJSONArray(buffer);
My problem is that I now get the following exception:
org.json.JSONException: Bad character ':' (58). at 24 [character 25 line 1]
at org.json.JSONTokener.syntaxError(JSONTokener.java:432)
at org.json.CDL.rowToJSONArray(CDL.java:113)
at org.json.CDL.toJSONArray(CDL.java:193)
at org.json.CDL.toJSONArray(CDL.java:182)
at MyDataexchange.MyCVSConverter.convertJson(MyCVSConverter.java:44)
at Mainexe.DataTest.main(DataTest.java:22)
As you can see in the stack trace, I ultimately want to use this to convert the string to .csv.
Since I don't know a better way to do it, I'd like to know how to fix this exception.
Do I need to substitute the ':' with something?
(Substituting ',' for ':' produces null, for example, but throws no exception; still, it doesn't help me.)
If so, please tell me with what; otherwise any suggestions are welcome.
org.json.CDL is for parsing and serializing comma-delimited text. However, your sample string is not comma-delimited text; it's JSON. You probably wanted JSONArray Jarray = new JSONArray(buffer). Once you have the JSONArray, CDL.toString(Jarray) produces the comma-delimited output.
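OK, here's a better way of doing this (similar to what "guest" said). Note that JSONValue and JSONArray here come from the json-simple library (org.json.simple), not org.json: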
String s = "[{\"field1\": 11,\"field2\": 12,\"field3\": 13}]";
Object obj=JSONValue.parse(s);
JSONArray array=(JSONArray)obj;
I have a big JSON string that I will receive as a request from the UI; it will be converted to a String and parsed.
I want to simulate a similar environment for local testing, so I captured the JSON.
Currently I am manually adding "\" before each double quote in this big JSON string.
For example, I got this JSON
{"age":29,"messages":["msg 1","msg 2","msg 3"],"name":"Preethi"}
and converted that into
String str = "{\"age\":\"29\",\"messages\":[\"msg 1\",\"msg 2\",\"msg 3\"],\"name\":\"mkyong\"}";
Is there any other way to achieve this?
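On the client side, do a regex "replace all" of double quotes with single quotes on the desired form field before actually sending the request. Note that single-quoted strings are not valid JSON, so this only works if the parser on the receiving end is lenient about quotes.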
Actually, Java doesn't have verbatim string literals.
If you want a Java-like (and Java-VM-based) language that does, however, you might want to look at Groovy which has various forms of string literal.
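If the goal is only to avoid escaping by hand, the escaping itself can be automated. The sketch below is a hypothetical helper (the class name, method name, and file argument are made up) that reads captured JSON from a file and prints it as a ready-to-paste Java string literal. It covers backslashes, double quotes, and the common whitespace escapes, which should be enough for typical captured JSON.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class JavaLiteralEscaper {
    /** Wraps raw text in quotes and escapes what a Java string literal requires. */
    static String toJavaLiteral(String raw) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JavaLiteralEscaper <file-with-captured-json>");
            return;
        }
        // The path is whatever file the captured JSON was saved to.
        String json = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("String str = " + toJavaLiteral(json) + ";");
    }
}
```

Two alternatives avoid the literal altogether: keep the JSON in a test resource file and read it at run time, or, on Java 15 and later, paste it into a text block (delimited by """), where double quotes need no escaping.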
There is a built-in method to convert a JSONObject to a String. Why not use that?
JSONObject json = new JSONObject();
String str = json.toString();