How to parse a file containing many JSON objects? - java

I need to parse a large file with more than one JSON object in it. I couldn't find any way to do it. The file looks like BSON for MongoDB.
File example:
{"column" : value, "column_2" : value}
{"column" : value, "column_2" : value}
....

You need to determine where one JSON object ends and the next begins within the file. If each JSON object is on its own line, this is easy: read the file line by line. If not, you can loop through the characters counting opening and closing braces; every time the count returns to zero, you have reached the end of a top-level object.
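// fileContents is an assumed name: the whole file read into a String.
char[] characters = fileContents.toCharArray();
int openBraceCount = 0;
ArrayList<Integer> breakPoints = new ArrayList<>();
for (int i = 0; i < characters.length; i++) {
    if (characters[i] == '{') {
        openBraceCount++;
    } else if (characters[i] == '}') {
        openBraceCount--;
        // Back at depth zero: a top-level object has just closed.
        if (openBraceCount == 0) {
            breakPoints.add(i + 1);
        }
    }
}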
You can then break the file apart at each break point and pass each individual JSON object to your favorite JSON library. Note that this simple count is fooled by braces that appear inside string values.
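The brace-counting idea can be sketched as a small self-contained class. This is only an illustration under some assumptions: the class and method names (JsonSplitSketch, splitObjects) are made up here, the whole input is assumed to fit in memory as a String, and string literals are tracked so that a brace inside a quoted value does not change the depth.

```java
import java.util.ArrayList;
import java.util.List;

final class JsonSplitSketch {

    /** Returns one substring per top-level JSON object found in text. */
    static List<String> splitObjects(String text) {
        List<String> objects = new ArrayList<>();
        int depth = 0;            // current brace nesting depth
        int start = -1;           // index of the '{' that opened the current object
        boolean inString = false; // true while inside a quoted JSON string
        boolean escaped = false;  // true right after a backslash inside a string
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (ch == '\\') {
                    escaped = true;
                } else if (ch == '"') {
                    inString = false;
                }
            } else if (ch == '"') {
                inString = true;
            } else if (ch == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (ch == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    objects.add(text.substring(start, i + 1));
                }
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        String text = "{\"a\": 1, \"b\": \"x}y\"}\n{\"a\": {\"c\": 2}}";
        for (String json : splitObjects(text)) {
            System.out.println(json);
        }
    }
}
```

Each returned string can then be handed to a JSON library. For a file too large to hold in memory, the same state machine could be fed from a Reader one character at a time instead.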

Related

How to find a root word in an ArrayList

I'm working on an NLP project and trying to match a specific input against a root in an ArrayList.
For example, the user enters لاعبون and the code should find the word لعب in an ArrayList, but when I run my code it gives me more than one root.
for(String dbData : rootList) {
    //System.out.println(dbData);
    // if(dbData.contains(x)) {
    // System.out.println(dbData);
    // }
    for (int i = 0; i < dbData.length(); i++) {
        c = dbData.charAt(i);
        for (int j = 0; i < x.length(); i++) {
            d = x.charAt(i);
            if (c == d && m != rootList.size()) {
                match = true;
                //System.out.println(dbData);
            } else {
                ++m;
                match = false;
                //System.out.println("لا يوجد تطابق");
            }
            if(match) {
                System.out.println(dbData);
                container = dbData;
            }
        }
    }
}
This does not look like the right approach to stemming. Here is a simple way to find stems in Arabic.
First, you need a list of stems, which you evidently already have.
Then you need to write down the Arabic morphological rules and forms that map a word to its stem.
Finally, convert those rules to Java regular expressions.
For example, to find لعب from لاعبون, first remove ون, which marks person and number; then check whether لاعب is derived from one of the stems. As you know, لاعب is the فاعل form of لعب, so you should choose لعب.
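As a rough illustration of turning such rules into Java regex, here is a minimal sketch covering only the one example above: strip a trailing ون (or ين), then test the فاعل form. Everything in it is an assumption made for illustration: the class name, the two-entry suffix list, and the single pattern are nowhere near a complete rule set. The Arabic letters are written as Unicode escapes so the source compiles regardless of file encoding; the main method builds لعب and لاعبون the same way.

```java
import java.util.Collections;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArabicRootSketch {

    // Hypothetical suffix rule: a trailing waw-nun or ya-nun.
    private static final Pattern PLURAL_SUFFIX =
            Pattern.compile("(\u0648\u0646|\u064A\u0646)$");

    // The fa'il form: first radical, alef, second radical, third radical.
    private static final Pattern FAAIL_FORM =
            Pattern.compile("^(\\p{InArabic})\u0627(\\p{InArabic})(\\p{InArabic})$");

    /** Returns the matching root from roots, if the two rules above lead to one. */
    static Optional<String> findRoot(String word, Set<String> roots) {
        String stripped = PLURAL_SUFFIX.matcher(word).replaceFirst("");
        if (roots.contains(stripped)) {
            return Optional.of(stripped);
        }
        Matcher m = FAAIL_FORM.matcher(stripped);
        if (m.matches()) {
            // Drop the alef and keep the three radicals.
            String candidate = m.group(1) + m.group(2) + m.group(3);
            if (roots.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> roots = Collections.singleton("\u0644\u0639\u0628");
        String word = "\u0644\u0627\u0639\u0628\u0648\u0646";
        System.out.println(findRoot(word, roots).isPresent());
    }
}
```

Because the candidate is checked against the known root list, a word that merely happens to fit the pattern does not produce a false root.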

Java/Angularjs - convert variable names to normal English conventions

My goal here is to retrieve the attribute names from a class, which I have already done using Java reflection. But I want to be able to transform the variable naming convention, say firstName to First Name.
My current idea is to use .split(), convert position 0 (usually lower-case) to upper case, then loop looking for subsequent upper-case letters and insert a blank space before each one. Is there a better way to do this?
EDIT: This is my current method, if any of you are interested:
public List<String> getProfileConstraintTemplateEnglish() {
    //what I want to return
    List<String> transformedList = new ArrayList<>();
    //The reflection that I'm getting
    List<ResultProfileConstraintTemplate> tmp = constraintService.getProfileCTml();
    //loop each obj in reflection list
    for (ResultProfileConstraintTemplate r : tmp) {
        //get the letters first from the title in obj
        String[] field = r.getTitle().split("");
        //this is the transformed string in each tmp.
        String transformed = "";
        //converting the array to a list for simpler addition.
        List<String> fieldString = Arrays.asList(field);
        //adding a counter to know which is the "first" position.
        int counter = 0;
        for (String s : fieldString) {
            //first letter
            if (counter == 0) {
                transformed += s.toUpperCase();
            }
            //everything else
            if (counter != 0 && s.equals(s.toUpperCase())) {
                transformed += " ";
                transformed += s;
            }
            else if (counter != 0 && s.equals(s.toLowerCase())) {
                transformed += s;
            }
            //increment counter
            counter++;
        }
        //add the transformed word to list.
        transformedList.add(transformed);
    }
    return transformedList;
}
Result:
I think your way is the only way. If you post your code, maybe we can shed more light on the matter.
You can use Character.isUpperCase(): wherever it returns true, insert a space before that letter. Always convert the first letter, i.e. charAt(0), with toUpperCase().
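The suggestion above can be sketched in a few lines. The class and method names (CamelCaseSketch, toTitle) are invented for this illustration, and the sketch makes no attempt to handle acronyms: a name such as userID would come out as "User I D".

```java
final class CamelCaseSketch {

    /** Converts a camelCase identifier such as firstName to "First Name". */
    static String toTitle(String name) {
        if (name == null || name.isEmpty()) {
            return name;
        }
        StringBuilder out = new StringBuilder(name.length() + 4);
        // The first letter is always upper-cased.
        out.append(Character.toUpperCase(name.charAt(0)));
        for (int i = 1; i < name.length(); i++) {
            char ch = name.charAt(i);
            // An upper-case letter starts a new word, so put a space before it.
            if (Character.isUpperCase(ch)) {
                out.append(' ');
            }
            out.append(ch);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTitle("firstName"));
    }
}
```

Working on char values with a StringBuilder avoids both the split("") array and the repeated String concatenation in the loop.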

Character extract challenge

My database value for the bus column is 12,34,56,8,9, ... I'm trying to extract only the bus numbers, not the commas, and add them to a String ArrayList. Does anyone have any idea?
I'm really confused. Here's my code:
for(int i =0; i< buses1.length() ; i++ )
{
    if(buses1.charAt(i) == ',')
    {
    }
    else
    {
        bus1 += Character.toString(buses1.charAt(i));
        buses.add(bus1);
    }
}
At this point, the code adds them like this: "1", "2", "3", "4", not "12", "34", ...
Does anyone have any idea?
Get rid of your current logic. You just need String#split() with "," as the delimiter, which returns your bus numbers as an array.
The line below is enough:
String[] numbers = columnValue.split(",");
Then your ArrayList declaration becomes:
List<String> busesList = new ArrayList<String>(Arrays.asList(numbers));
All you have to do is to split the String by using the ',' delimiter.
List<String> buses = Arrays.asList(buses1.split(","));
EDIT: Be aware that Arrays.asList returns a fixed-size list, so you cannot add elements to or remove elements from buses. If you need a resizable list, you can easily wrap it in one:
List<String> buses = new LinkedList<String>(Arrays.asList(buses1.split(",")));
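Putting the split-based answers together, a minimal sketch might look like the following. The names BusSplitSketch and parseBuses are made up for illustration, and the trimming of whitespace around each number is an added assumption, since the question does not say whether the column ever contains spaces.

```java
import java.util.ArrayList;
import java.util.List;

final class BusSplitSketch {

    /** Splits a comma-separated column value into a resizable list of bus numbers. */
    static List<String> parseBuses(String columnValue) {
        List<String> buses = new ArrayList<>();
        for (String token : columnValue.split(",")) {
            String trimmed = token.trim();
            // Skip empty tokens such as the one produced by ",,".
            if (!trimmed.isEmpty()) {
                buses.add(trimmed);
            }
        }
        return buses;
    }

    public static void main(String[] args) {
        System.out.println(parseBuses("12,34,56,8,9,"));
    }
}
```

Since the result is a plain ArrayList, elements can be added or removed afterwards without any extra wrapping.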
Other answers have explained the recommended way to solve this problem. I just want to point out why your current attempt fails.
To get the effect you describe, it must actually be something like this:
for (int i = 0; i < buses1.length(); i++ ) {
    String bus1 = "";
    if (buses1.charAt(i) == ',') {
    } else {
        bus1 += Character.toString(buses1.charAt(i));
        buses.add(bus1);
    }
}
The problem is that you are adding a "bus" to the list at the wrong point. You need to add it when you've got the last character of a (single- or multi-digit) bus number. But you are adding it for each digit.
Your code actually needs to be something like this:
String bus1 = "";
for (int i = 0; i < buses1.length(); i++ ) {
    if (buses1.charAt(i) == ',') {
        // When we see a comma, we know that is the end of the bus number.
        if (!bus1.isEmpty()) {
            buses.add(bus1);
            bus1 = "";
        }
    } else {
        // Accumulate the digits of the current bus number.
        bus1 += Character.toString(buses1.charAt(i));
    }
}
// Deal with stuff after the last comma.
if (!bus1.isEmpty()) {
    buses.add(bus1);
}
Note that we could improve on that in a couple of important ways. But it is easier to see the relationship with your (hypothesized) original code with this version.
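The answer does not say which improvements it has in mind, so the following is only one plausible reading: accumulate characters in a StringBuilder instead of concatenating Strings, and move the duplicated "add if non-empty" step into a helper. The names BusScanSketch, scan, and flush are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BusScanSketch {

    /** Scans the input once, collecting the text between commas. */
    static List<String> scan(String buses1) {
        List<String> buses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < buses1.length(); i++) {
            char ch = buses1.charAt(i);
            if (ch == ',') {
                // A comma ends the current bus number.
                flush(current, buses);
            } else {
                current.append(ch);
            }
        }
        // Handle whatever follows the last comma.
        flush(current, buses);
        return buses;
    }

    /** Adds the accumulated number to out, if any, and resets the buffer. */
    private static void flush(StringBuilder current, List<String> out) {
        if (current.length() > 0) {
            out.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("12,34,56,8,9,"));
    }
}
```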

Get similar part of string array items

I have an array of Strings:
qTrees[0] = "023012311312201123123130110332";
qTrees[1] = "023012311130023103123130110332";
qTrees[2] = "023013200020123103123130110333";
qTrees[3] = "023013200202301123123130110333";
Using this loop, I'm trying to retrieve the part they have in common:
String similarPart = "";
for (int i = 0; i < qTrees[0].length(); i++){
    if (qTrees[0].charAt(i) == qTrees[1].charAt(i) &&
        qTrees[1].charAt(i) == qTrees[2].charAt(i) &&
        qTrees[2].charAt(i) == qTrees[3].charAt(i) ){
        similarPart += qTrees[0].charAt(i);
    } else {
        break;
    }
}
But this is wrong. As you can see, it returns only "02301", although a longer similarity exists further along the strings.
Please suggest a better way to do it. Thanks.
You need to better define what you are trying to achieve. Do you want to:
find the longest common starting sequence between any two entries in the array;
find the longest common starting sequence across all of the entries in the array;
find the longest common sequence (i.e. same characters in same position) between any two entries;
find the longest common sequence across all entries in the array.
Each of these calls for a slightly different approach, but they all boil down to using break and continue correctly in your loops.
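To make the difference between two of these readings concrete, here is a minimal sketch of the "across all entries" variants: the longest common starting sequence, and the longest run of positions at which every entry has the same character. The class and method names are invented for illustration, and strings of unequal length are simply compared up to the shortest one, which is an assumption.

```java
final class CommonPartSketch {

    /** Longest prefix shared by every entry. */
    static String commonPrefix(String[] items) {
        if (items.length == 0) {
            return "";
        }
        int limit = shortestLength(items);
        int i = 0;
        while (i < limit && allEqualAt(items, i)) {
            i++;
        }
        return items[0].substring(0, i);
    }

    /** Longest run of consecutive positions where all entries agree. */
    static String longestAlignedRun(String[] items) {
        if (items.length == 0) {
            return "";
        }
        int limit = shortestLength(items);
        int bestStart = 0;
        int bestLength = 0;
        int runStart = 0;
        for (int i = 0; i < limit; i++) {
            if (allEqualAt(items, i)) {
                int runLength = i - runStart + 1;
                if (runLength > bestLength) {
                    bestLength = runLength;
                    bestStart = runStart;
                }
            } else {
                // A mismatch ends the current run; the next one starts after it.
                runStart = i + 1;
            }
        }
        return items[0].substring(bestStart, bestStart + bestLength);
    }

    private static int shortestLength(String[] items) {
        int min = Integer.MAX_VALUE;
        for (String s : items) {
            min = Math.min(min, s.length());
        }
        return min;
    }

    private static boolean allEqualAt(String[] items, int index) {
        char expected = items[0].charAt(index);
        for (String s : items) {
            if (s.charAt(index) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] qTrees = {
            "023012311312201123123130110332",
            "023012311130023103123130110332",
            "023013200020123103123130110333",
            "023013200202301123123130110333"
        };
        System.out.println(commonPrefix(qTrees));
        System.out.println(longestAlignedRun(qTrees));
    }
}
```

For the four strings in the question, the common prefix is "02301", while the longest aligned run is "312313011033" (positions 17 through 28).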
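Remove the else branch from your code. The loop will then check every position up to the end of the string. Note that the result concatenates the characters from all matching positions, so it is not necessarily one contiguous run.
The code:
String similarPart = "";
for (int i = 0; i < qTrees[0].length(); i++){
    if (qTrees[0].charAt(i) == qTrees[1].charAt(i) &&
        qTrees[1].charAt(i) == qTrees[2].charAt(i) &&
        qTrees[2].charAt(i) == qTrees[3].charAt(i) ){
        similarPart += qTrees[0].charAt(i);
    }
}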

How to insert a StringBuilder element into a GWT app?

So, I am getting a StringBuilder element as the return value from some already established code, and I need to insert it into my GWT app. This StringBuilder has been formatted into a table before being returned.
For more clarity, below is the code showing how the StringBuilder is generated and what is returned.
private static String formatStringArray(String header, String[] array, int[] removeCols) {
    StringBuilder buf = new StringBuilder("<table bgcolor=\"DDDDDD\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">");
    if (removeCols != null)
        Arrays.sort(removeCols);
    if (header != null) {
        buf.append("<tr bgcolor=\"99AACC\">");
        String[] tokens = header.split(",");
        //StringTokenizer tokenized = new StringTokenizer(header, ",");
        //while (tokenized.hasMoreElements()) {
        for (int i = 0; i < tokens.length; i++) {
            if (removeCols == null || Arrays.binarySearch(removeCols, i) < 0) {
                buf.append("<th>");
                buf.append(tokens[i]);
                buf.append("</th>");
            }
        }
        buf.append("</tr>");
    }
    if (array.length > 0) {
        for (String element : array) {
            buf.append("<tr>");
            String[] tokens = element.split(",");
            if (tokens.length > 1) {
                for (int i = 0; i < tokens.length; i++) {
                    if (removeCols == null || Arrays.binarySearch(removeCols, i) < 0) {
                        buf.append("<td>");
                        buf.append(tokens[i]);
                        buf.append("</td>");
                    }
                }
            } else {
                // Let any non tokenized row get through
                buf.append("<td>");
                buf.append(element);
                buf.append("</td>");
            }
            buf.append("</tr>");
        }
    } else {
        buf.append("<tr><td>No results returned</td></tr>");
    }
    buf.append("</table>");
    return buf.toString();
}
So, the buf.toString() returned above is to be received in a GWT class, added to a panel and displayed... Now the question is: how do I make all this happen?
I'm absolutely clueless as I'm a newbie and would be very thankful for any help.
Regards,
Chirayu
Could you be more specific, Chirayu? The "already established code" (is that a servlet? Does it run on the server side or the client side?) that supposedly returns a StringBuilder obviously returns a String, which can easily be transferred via GWT-RPC, JSON, etc.
But like Eyal mentioned, "you are doing it wrong": you are generating HTML code by hand, which is additional work, leads to security holes (XSS, etc.) and is more error-prone. The correct way would be:
Instead of generating the view/HTML code on the server (I'm assuming the above code is executed on the server), you just fetch the relevant data - via any transport that is available in GWT
On the client, put the data from the server in some nice Widgets. If you prefer to work with HTML directly, check out UiBinder. Otherwise, the old widgets, composites, etc way is ok too.
This way, you'll minimize the data sent between the client and the server and get better separation (to take it further, check out MVP). Plus, less load on the server - win-win.
And to stop being a newbie, RTFM - it's all there. Notice that all the links I've provided here lead to the official docs :)
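The XSS concern in the answer comes from formatStringArray appending each token into the markup unescaped. If the table does have to be built by hand for now, every cell value should at least be escaped first. The sketch below is plain Java with made-up names (HtmlEscapeSketch, escapeHtml, cell) and only shows the idea; GWT ships its own SafeHtml utilities for this job, and those would be the natural choice inside a GWT project.

```java
final class HtmlEscapeSketch {

    /** Replaces the characters that are significant in HTML with entities. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(ch);
            }
        }
        return out.toString();
    }

    /** Builds one table cell whose content cannot inject markup. */
    static String cell(String value) {
        return "<td>" + escapeHtml(value) + "</td>";
    }

    public static void main(String[] args) {
        System.out.println(cell("<script>alert('x')</script>"));
    }
}
```

Escaping only narrows the hole; sending plain data and letting client-side widgets render it, as the answer recommends, avoids hand-built markup altogether.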
