How to insert a StringBuilder element into a GWT app? - java

So, I am getting a StringBuilder as the return value from some already established code, and I need to insert it into my GWT app. This StringBuilder has been formatted into a table before being returned.
For clarity, below is the code showing how the StringBuilder is generated and what is returned.
private static String formatStringArray(String header, String[] array, int[] removeCols) {
    StringBuilder buf = new StringBuilder("<table bgcolor=\"DDDDDD\" border=\"1\" cellspacing=\"0\" cellpadding=\"3\">");
    if (removeCols != null)
        Arrays.sort(removeCols);
    if (header != null) {
        buf.append("<tr bgcolor=\"99AACC\">");
        String[] tokens = header.split(",");
        //StringTokenizer tokenized = new StringTokenizer(header, ",");
        //while (tokenized.hasMoreElements()) {
        for (int i = 0; i < tokens.length; i++) {
            if (removeCols == null || Arrays.binarySearch(removeCols, i) < 0) {
                buf.append("<th>");
                buf.append(tokens[i]);
                buf.append("</th>");
            }
        }
        buf.append("</tr>");
    }
    if (array.length > 0) {
        for (String element : array) {
            buf.append("<tr>");
            String[] tokens = element.split(",");
            if (tokens.length > 1) {
                for (int i = 0; i < tokens.length; i++) {
                    if (removeCols == null || Arrays.binarySearch(removeCols, i) < 0) {
                        buf.append("<td>");
                        buf.append(tokens[i]);
                        buf.append("</td>");
                    }
                }
            } else {
                // Let any non-tokenized row get through
                buf.append("<td>");
                buf.append(element);
                buf.append("</td>");
            }
            buf.append("</tr>");
        }
    } else {
        buf.append("<tr><td>No results returned</td></tr>");
    }
    buf.append("</table>");
    return buf.toString();
}
So, the buf.toString() returned above is to be received in a GWT class, added to a panel, and displayed... Now the question is: how do I make all this happen?
I'm absolutely clueless as I'm a newbie and would be very thankful for any help.
Regards,
Chirayu

Could you be more specific, Chirayu? The "already established code" (is that a servlet? Does it run on the server side or the client side?) that supposedly returns a StringBuilder obviously returns a String, which can be easily transferred via GWT-RPC, JSON, etc.
But like Eyal mentioned, "you are doing it wrong" - you are generating HTML code by hand, which is additional work, leads to security holes (XSS, etc) and is more error-prone. The correct way would be:
Instead of generating the view/HTML code on the server (I'm assuming the above code is executed on the server), you just fetch the relevant data - via any transport that is available in GWT
On the client, put the data from the server in some nice Widgets. If you prefer to work with HTML directly, check out UiBinder. Otherwise, the old widgets, composites, etc way is ok too.
This way, you'll minimize the data sent between the client and the server and get better separation (to take it further, check out MVP). Plus, less load on the server - win-win.
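To make the client-side part concrete, here is a minimal sketch; htmlFromServer and rows are placeholders for whatever your RPC callback actually delivers:
// Quick and dirty: wrap the server-built markup in an HTML widget
// (this works, but inherits the hand-built-HTML/XSS concerns mentioned above)
HTML table = new HTML(htmlFromServer);
RootPanel.get().add(table);

// Preferred: send only the raw comma-separated rows and build the table on the client
FlexTable grid = new FlexTable();
for (int r = 0; r < rows.length; r++) {
    String[] cols = rows[r].split(",");
    for (int c = 0; c < cols.length; c++) {
        grid.setText(r, c, cols[c]);
    }
}
RootPanel.get().add(grid);
HTML, FlexTable and RootPanel all live in com.google.gwt.user.client.ui.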
And to stop being a newbie, RTFM - it's all there. Notice that all the links I've provided here lead to the official docs :)

Related

Determine file extension for image urls

Is there a reliable and fast way to determine the file extension of an image URL? There are a few options I see, but none of them work consistently for images in the format below:
https://cdn-image.blay.com/sites/default/files/styles/1600x1000/public/images/12.jpg?itok=e-zA1T
I have tried:
new MimetypesFileTypeMap().getContentType(url)
Results in the generic "application/octet-stream" in which case I use the below two:
Files.getFileExtension
FilenameUtils.getExtension
I would like to avoid regex when possible, so is there another utility that properly handles links that have query arguments (.jpeg?blahblah)? I would also like to avoid downloading the image or connecting to the URL in any way, as this should be a performant call.
If you can trust that the URLs are not malformed, how about this:
FilenameUtils.getExtension(URI.create(url).getPath())
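For example, with the sample URL from the question, URI.getPath() drops the ?itok=... query string before the extension is extracted (a small sketch assuming Apache Commons IO is on the classpath):
import java.net.URI;
import org.apache.commons.io.FilenameUtils;

public class ExtensionDemo {
    public static void main(String[] args) {
        String url = "https://cdn-image.blay.com/sites/default/files/styles/1600x1000/public/images/12.jpg?itok=e-zA1T";
        // getPath() returns the path component only, so the query string is ignored
        String ext = FilenameUtils.getExtension(URI.create(url).getPath());
        System.out.println(ext); // prints: jpg
    }
}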
Can't you just look at the file extension in the URL? So that would be something like:
public static String getFileExtension(String url) {
    int phpChar = url.length();
    for (int i = 0; i < url.length(); i++) {
        if (url.charAt(i) == '?') {
            phpChar = i;
            break;
        }
    }
    int character = phpChar - 1;
    while (url.charAt(character) != '.') character -= 1;
    return url.substring(character + 1, phpChar);
}
Maybe not the most elegant solution, but it works, even with the ? query part in the URL.

Checking each line of data in a text file and identifying invalid data

So I've looked around and couldn't find anything specifically related to what I'm trying to accomplish, so I'm here to ask if some of you folks could help. I am a uni student, and am struggling to wrap my head around a specific task.
The task revolves around the following:
Being able to have the program we develop check each line of data in a file we input, and report any errors (such as missing data) to the console via messages.
I am currently using Scanner to scan the file and .split to split the text at each hyphen that it finds and then placing that data into a String[] splitText array... the code for that is as follows:
File Fileobject = new File(importFile);
Scanner fileReader = new Scanner(Fileobject);
while (fileReader.hasNext())
{
    String line = fileReader.nextLine();
    String[] splitText = line.split("-");
}
The text contained within the file we are scanning, is formatted as follows:
Title - Author - Price - Publisher - ISBN
Title, Author and Publisher are of varying lengths, the ISBN is 11 characters, and the Price is to two decimal places. I am able to easily print valid data to the console, though it's the whole validating and printing errors (such as: "The book title may be missing.") to the console which has my head twisted.
Would IF statements be suited to checking each line of data? And if so, how would those be structured?
If you want to check the length/presence of each of the five columns, then consider the following:
while (fileReader.hasNext()) {
    String line = fileReader.nextLine();
    String[] splitText = line.split("-");
    if (splitText.length < 5) {
        System.out.println("One or more columns is entirely missing.");
        continue; // skip this line
    }
    if (splitText[0].length() == 0) {
        System.out.println("Title is missing");
    }
    if (splitText[1].length() == 0) {
        System.out.println("Author is missing");
    }
    boolean isValidPrice = true;
    try {
        Double.parseDouble(splitText[2]);
    } catch (Exception e) {
        isValidPrice = false;
    }
    if (!isValidPrice) {
        System.out.println("Found an invalid price " + splitText[2] + " but expected a decimal.");
    }
    if (splitText[4].length() != 11) {
        System.out.println("Found an invalid ISBN.");
    }
}
I do a two-level validation above. If splitting the current line on the dash does not yield 5 terms, then we have missing columns and we do not attempt to even guess what data might actually be there. If all 5 columns are present, then we do a validation on each field by length and/or by expected value.
Yes, your best bet is to use if statements (I can't think of another way?). For cleanliness, I recommend you create a validateData(String data) method, or multiple validator functions.
For example, because you know each line is going to be in the Title - Author - Price - Publisher - ISBN format, you can write code like this:
public void validatePrice(String data) {
    // Write your logic to validate.
}
public void validateAuthor(String data) {
    // Write your logic to validate.
}
...
Then in your while loop you can call
validatePrice(splitText[2]);
validateAuthor(splitText[1]);
for each validator method.
Depending on your needs you can make this a bit more OOP-style, but this is one clean-ish way to do it.
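As a rough illustration, validatePrice might look something like the sketch below; the error messages and the exactly-two-decimals rule are guesses at the assignment's requirements:
public void validatePrice(String data) {
    String trimmed = data.trim();
    try {
        Double.parseDouble(trimmed);
    } catch (NumberFormatException e) {
        System.out.println("The price may be missing or is not a number: " + trimmed);
        return;
    }
    // expect exactly two digits after the decimal point, e.g. 12.99
    int dot = trimmed.indexOf('.');
    if (dot < 0 || trimmed.length() - dot - 1 != 2) {
        System.out.println("The price should have two decimal places: " + trimmed);
    }
}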
The first thing you want to check for validation is that you have the proper number of entries (in this case, check that the array is of size 5), and after that you want to check each piece of data.
If statements are a good way to go, and you can do something as simple as:
if(title.isBad()) print("error");
if(author.isBad()) print("error");
if(price.isBad()) print("error");
if(publisher.isBad()) print("error");
if(isbn.isBad()) print("error");
Replace .isBad() with whichever checks you are doing, such as strings[i].isEmpty(), the length of the ISBN, etc.
For the ones that take longer to check, such as the Price, you'll want some nested loops, checking that it contains a period, contains only numbers, and only has 2 digits after the period.
Something helpful to know is the wrapper classes for the primitive data types, which allow you to do
Character.isLetter(strings[i].charAt(j))
in the place of
(strings[i].charAt(j) >= 'A' && strings[i].charAt(j) <= 'Z') ||
(strings[i].charAt(j) >= 'a' && strings[i].charAt(j) <= 'z')
and
try {
    Double.parseDouble(strings[i]);
} catch (NumberFormatException e) {
    // not a valid price
}
instead of manually checking the price.
Hope this helps!

How to parse a file with a lot of JSON?

I need to parse a large file with more than one JSON object in it. I haven't found any way to do it. The file looks like BSON for MongoDB.
File example:
{"column" : value, "column_2" : value}
{"column" : valeu, "column_2" : value}
....
You will need to determine where one JSON object begins and another ends within the file. If each object is on an individual line, then this is easy; if not, you can loop through looking for the opening and closing braces, locating the boundaries between the objects.
// characters: the whole file read into a char array (e.g. read it into a String and call toCharArray())
char[] characters;
int openBraceCount = 0;
ArrayList<Integer> breakPoints = new ArrayList<>();
for (int i = 0; i < characters.length; i++) {
    if (characters[i] == '{') {
        openBraceCount++;
    } else if (characters[i] == '}') {
        openBraceCount--;
        if (openBraceCount == 0) {
            // index just past a complete top-level JSON object
            breakPoints.add(i + 1);
        }
    }
}
You can then break the file apart at each break point and pass the individual JSON objects to whatever your favorite JSON library is.
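For instance, since the example file has one object per line, a line-by-line approach might look like the following sketch (Gson is assumed here purely as an example; any JSON library would do):
import java.io.BufferedReader;
import java.io.FileReader;
import com.google.gson.Gson;
import com.google.gson.JsonObject;

public class MultiJsonReader {
    public static void main(String[] args) throws Exception {
        Gson gson = new Gson();
        try (BufferedReader reader = new BufferedReader(new FileReader("data.json"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) continue; // skip blank lines
                JsonObject obj = gson.fromJson(line, JsonObject.class);
                System.out.println(obj.get("column"));
            }
        }
    }
}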

Good Practices in parsing text files in Java

I want to parse a text file that represents a log. I want it to be robust enough to handle all errors that might occur, although I am clueless about the best practices and the errors I should account for. I will be using Java to implement this.
Sample log :
2012-07-16 10:23:40,558 - 127.0.0.1 - Paremter array[param1=1,param2=1,param3=0,] - 383
I already wrote parsing code that works as follows:
public Parser(String log) {
    this.log = log;
    this.parse();
}
public void parse() {
    String[] temp = new String[10];
    String[] temp2 = new String[10];
    temp = log.split(" - ");
    key = temp[3];
    id = Integer.parseInt(key);
    String IP = temp[1];
    String str;
    String temp3 = temp[2].substring(temp[2].indexOf("g"), temp[2].indexOf("]"));
    temp = temp3.split(",");
    str = "param1";
    boolean ordered = CheckOrder(temp);
    if (ordered) {
        for (int q = 0; q < temp.length; q++) {
            temp[q] = temp[q].substring(temp[q].indexOf("=") + 1);
        }
        if (temp[0].equals("q")) {
            param = 0;
        } else if (temp[0].equals("k")) {
            param = 1;
        } else {
            param = 2;
        }
        // Same way for all parameters
    }
}
Check the javadoc of all the methods you use, and make sure to handle all the nominal and exceptional cases:
the file doesn't exist: an exception is thrown. Handle this exception correctly
String.indexOf() doesn't find what it looks for: it returns -1. Handle this case correctly
String.split() doesn't return an array of the length you expect. Handle this case correctly
...
Split your big method into several sub-methods, each doing only one thing.
Write unit tests to check that your methods do what they're supposed to do, with all the possible inputs.
Note that "handling things correctly" might very well mean: throw an exception because the input is incorrect, if the contract is that the logs follow a well-defined format. In this case, it's the code generating the logs that is incorrect. But it's better to have an exception telling which format you expected and which format you got instead, rather than an obscure NullPointerException or ArrayIndexOutOfBoundsException.
The above applies to any kind of code you write, and not just to file parsing.
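To illustrate the first points, a more defensive version of the extraction might look something like the sketch below (the field meanings are guessed from the sample log line, and the names mirror the question's code):
public void parse() {
    String[] parts = log.split(" - ");
    if (parts.length < 4) {
        // the line does not match the expected format: say so explicitly
        throw new IllegalArgumentException(
                "Expected 4 ' - ' separated fields but got " + parts.length + ": " + log);
    }
    int id; // local here; in the real class this would be the existing 'id' field
    try {
        id = Integer.parseInt(parts[3].trim());
    } catch (NumberFormatException e) {
        throw new IllegalArgumentException("Last field is not a number: " + parts[3], e);
    }
    int start = parts[2].indexOf('[');
    int end = parts[2].indexOf(']');
    if (start < 0 || end <= start) {
        // indexOf() returned -1, or the brackets are in the wrong order
        throw new IllegalArgumentException("Parameter block not found in: " + parts[2]);
    }
    String[] params = parts[2].substring(start + 1, end).split(",");
    // params may be shorter than expected; check its length before indexing into it
}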
Side note:
String[] temp = new String[10];
temp = log.split(" - ");
What's the point in creating an array of 10 elements only to discard it right away and replace it with another array (the one returned by log.split(" - "))?

Debugging Java Out of Memory Error

I'm still a relatively new programmer, and an issue I keep having in Java is Out of Memory Errors. I don't want to increase the memory using -Xmx, because I feel that the error is due to poor programming, and I want to improve my coding rather than rely on more memory.
The work I do involves processing lots of text files, each around 1GB when compressed. The code I have here is meant to loop through a directory where new compressed text files are being dropped. It opens the second most recent text file (not the most recent, because this is still being written to), and uses the Jsoup library to parse certain fields in the text file (fields are separated with custom delimiters: "|nTa|" designates a new column and "|nLa|" designates a new row).
I feel there should be no reason for using a lot of memory. I open a file, scan through it, parse the relevant bits, write the parsed version into another file, close the file, and move onto the next file. I don't need to store the whole file in memory, and I certainly don't need to store files that have already been processed in memory.
I'm getting errors when I start parsing the second file, which suggests that I'm not dealing with garbage collection. Please have a look at the code, and see if you can spot things that I'm doing that mean I'm using more memory than I should be. I want to learn how to do this right so I stop getting memory errors!
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Scanner;
import java.util.TreeMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import org.jsoup.Jsoup;

public class ParseHTML {
    public static int commentExtractField = 3;
    public static int contentExtractField = 4;
    public static int descriptionField = 5;

    public static void main(String[] args) throws Exception {
        File directoryCompleted = null;
        File filesCompleted[] = null;
        while (true) {
            // find second most recent file in completed directory
            directoryCompleted = new File(args[0]);
            filesCompleted = directoryCompleted.listFiles();
            if (filesCompleted.length > 1) {
                TreeMap<Long, File> timeStamps = new TreeMap<Long, File>(Collections.reverseOrder());
                for (File f : filesCompleted) {
                    timeStamps.put(getTimestamp(f), f);
                }
                File fileToProcess = null;
                int counter = 0;
                for (Long l : timeStamps.keySet()) {
                    fileToProcess = timeStamps.get(l);
                    if (counter == 1) {
                        break;
                    }
                    counter++;
                }
                // start processing file
                GZIPInputStream gzipInputStream = null;
                if (fileToProcess != null) {
                    gzipInputStream = new GZIPInputStream(new FileInputStream(fileToProcess));
                } else {
                    System.err.println("No file to process!");
                    System.exit(1);
                }
                Scanner scanner = new Scanner(gzipInputStream);
                scanner.useDelimiter("\\|nLa\\|");
                GZIPOutputStream output = new GZIPOutputStream(new FileOutputStream("parsed/" + fileToProcess.getName()));
                while (scanner.hasNext()) {
                    Scanner scanner2 = new Scanner(scanner.next());
                    scanner2.useDelimiter("\\|nTa\\|");
                    ArrayList<String> row = new ArrayList<String>();
                    while (scanner2.hasNext()) {
                        row.add(scanner2.next());
                    }
                    for (int index = 0; index < row.size(); index++) {
                        if (index == commentExtractField ||
                                index == contentExtractField ||
                                index == descriptionField) {
                            output.write(jsoupParse(row.get(index)).getBytes("UTF-8"));
                        } else {
                            output.write(row.get(index).getBytes("UTF-8"));
                        }
                        String delimiter = "";
                        if (index == row.size() - 1) {
                            delimiter = "|nLa|";
                        } else {
                            delimiter = "|nTa|";
                        }
                        output.write(delimiter.getBytes("UTF-8"));
                    }
                }
                output.finish();
                output.close();
                scanner.close();
                gzipInputStream.close();
            }
        }
    }

    public static Long getTimestamp(File f) {
        String name = f.getName();
        String removeExt = name.substring(0, name.length() - 3);
        String timestamp = removeExt.substring(7, removeExt.length());
        return Long.parseLong(timestamp);
    }

    public static String jsoupParse(String s) {
        if (s.length() == 4) {
            return s;
        } else {
            return Jsoup.parse(s).text();
        }
    }
}
How can I make sure that when I finish with objects, they are destroyed and not using any resources? For example, each time I close the GZIPInputStream, GZIPOutputStream and Scanner, how can I make sure they're completely destroyed?
For the record, the error I'm getting is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
at java.lang.StringBuilder.append(StringBuilder.java:203)
at org.jsoup.parser.TokeniserState$47.read(TokeniserState.java:1171)
at org.jsoup.parser.Tokeniser.read(Tokeniser.java:42)
at org.jsoup.parser.TreeBuilder.runParser(TreeBuilder.java:101)
at org.jsoup.parser.TreeBuilder.parse(TreeBuilder.java:53)
at org.jsoup.parser.Parser.parse(Parser.java:24)
at org.jsoup.Jsoup.parse(Jsoup.java:44)
at ParseHTML.jsoupParse(ParseHTML.java:125)
at ParseHTML.main(ParseHTML.java:81)
I haven't spent very long analysing your code (nothing stands out), but a good general-purpose start would be to familiarise yourself with the free VisualVM tool. This is a reasonable guide to its use, though there are many more articles.
There are better commercial profilers in my opinion - JProfiler for one - but it will at the very least show you what objects/classes most memory is being assigned to, and possibly the method stack traces that caused that to happen. More simply it shows you heap allocation over time, and you can use this to judge whether you are failing to clear something or whether it is an unavoidable spike.
I suggest this rather than looking at the specifics of your code because it is a useful diagnostic skill to have.
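If attaching a profiler isn't convenient right away, a crude first check is to log heap usage at a few checkpoints (say, after each processed file) and watch whether it keeps climbing; this is only a rough indicator, not a replacement for VisualVM:
// Rough heap-usage logging; call this at the end of each file's processing
Runtime rt = Runtime.getRuntime();
long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
long maxMb = rt.maxMemory() / (1024 * 1024);
System.out.println("Heap used: " + usedMb + " MB of " + maxMb + " MB max");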
Update: This issue was fixed in JSoup 1.6.2
It looks to me like it's probably a bug in the JSoup parser that you're using...at present the documentation for JSoup.parse() has a warning "BETA: if you do get an exception raised, or a bad parse-tree, please file a bug." Which suggests they aren't confident that it's completely safe for use in production code.
I also found several bug reports mentioning out of memory exceptions, one of which suggests that it's due to parse error objects being held statically by JSoup, and that downgrading from JSoup 1.6.1 to 1.5.2 may be a work-around.
I am wondering if your parse is failing because you have bad HTML (e.g. unclosed tags, unpaired quotes or whatnot) being parsed? You could do an output/println to see how far you are getting in the document, if at all. The Java library may not detect the end of the document/file before running out of memory.
From the Jsoup javadoc for parse:
public static Document parse(String html) — Parse HTML into a Document. As no base URI is specified, absolute URL detection relies on the HTML including a <base href> tag.
http://jsoup.org/apidocs/org/jsoup/Jsoup.html#parse(java.lang.String)
It's a little hard to tell what's going on but two things come to my mind.
1) In some weird circumstances (depending on the input file), the following loop might load the entire file into memory:
while (scanner2.hasNext()) {
    row.add(scanner2.next());
}
2) Looking at the stack trace, it seems that jsoupParse is the problem. I believe that the line Jsoup.parse(s).text(); loads s into memory first, and depending on the string size (which again depends on the particular input file) this might cause the OutOfMemoryError.
Maybe a combination of the two points above is the issue. Again, it's hard to tell just by looking at the code.
Does this happen always with the same file? Did you check the input content and the custom delimiters in it?
Assuming the problem is not in the JSoup code, we can do some memory optimization. For example, the ArrayList<String> row could be stripped out, since it buffers a whole parsed row in memory when the fields can be written out one at a time.
Inner loop with row removed:
// Caution! May contain obvious bugs!
while (scanner.hasNext()) {
    String scanStr = scanner.next();
    // manually count the fields to replace 'row.size()'
    int rowCount = 0;
    int offset = 0;
    while ((offset = scanStr.indexOf("|nTa|", offset)) >= 0) {
        rowCount++;
        offset++;
    }
    rowCount++;
    Scanner scanner2 = new Scanner(scanStr);
    scanner2.useDelimiter("\\|nTa\\|");
    int index = 0;
    while (scanner2.hasNext()) {
        String curRow = scanner2.next();
        if (index == commentExtractField
                || index == contentExtractField
                || index == descriptionField) {
            output.write(jsoupParse(curRow).getBytes("UTF-8"));
        } else {
            output.write(curRow.getBytes("UTF-8"));
        }
        String delimiter = "";
        if (index == rowCount - 1) {
            delimiter = "|nLa|";
        } else {
            delimiter = "|nTa|";
        }
        output.write(delimiter.getBytes("UTF-8"));
        index++; // advance the field counter (missing in the original sketch)
    }
}
