How to read a simple xpm image and display it using Java? - java

I have been assigned the task of building a simple XPM image viewer, and I can't use any existing toolkit library for it.
I know that XPM images are string arrays like this (I can write one):
/* XPM */
static const char *const hi[] = {
"7 5 2 1",
"  c black",
". c yellow",
"..   ..",
". . . .",
".  .  .",
".     .",
".     ."
};
I want to use Java for this. My questions are:
1. How do I create a String array variable (hi[]) from this XPM file so that I can use it in my main class?
2. What is a good way to display it in a GUI?
3. Any other dictation...
Many thanks for your help

First you'll have to write a parser: a program, method, or class that reads this file line by line and extracts the necessary data.
BufferedReader r =
new BufferedReader(new InputStreamReader(new FileInputStream(file),
"US-ASCII"));
gives you a BufferedReader, which has a readLine() method.
You throw away the first few lines or handle them specially (the first string holds the width, height, number of colors, and characters per pixel, and the color definitions follow it); the remaining lines are the actual image data. There you strip the quotes and commas, and you have the plain data in string form.
To put it in an image, look at the classes in java.awt.image, especially BufferedImage and the classes used by it (Raster/WritableRaster, IndexColorModel).
Alternatively, you could simply keep the data in its String[] form and access the individual pixels in the paint method of a custom component. This would be a bit slower, I think.
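The steps above could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the class and method names (XpmSketch, unquote, toImage, colorOf) are made up for illustration, only the "c" color key is handled, and only a handful of color names are mapped, whereas a real viewer would need the full X11 color table.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class XpmSketch {
    // Returns the text between the first and last double quote of a source
    // line, or null if the line contains no quoted string (comment, braces).
    static String unquote(String line) {
        int a = line.indexOf('"');
        int b = line.lastIndexOf('"');
        return (a >= 0 && b > a) ? line.substring(a + 1, b) : null;
    }

    // Assumption: only "#rrggbb" values and a few named colors are supported.
    static int colorOf(String name) {
        if (name.startsWith("#") && name.length() == 7) {
            return 0xFF000000 | Integer.parseInt(name.substring(1), 16);
        }
        switch (name.toLowerCase()) {
            case "black":  return Color.BLACK.getRGB();
            case "white":  return Color.WHITE.getRGB();
            case "yellow": return Color.YELLOW.getRGB();
            case "none":   return 0x00000000; // fully transparent
            default: throw new IllegalArgumentException("Unknown color: " + name);
        }
    }

    // rows: the strings of the XPM array, already unquoted, in file order.
    static BufferedImage toImage(List<String> rows) {
        String[] header = rows.get(0).trim().split("\\s+");
        int width = Integer.parseInt(header[0]);
        int height = Integer.parseInt(header[1]);
        int colors = Integer.parseInt(header[2]);
        int cpp = Integer.parseInt(header[3]); // characters per pixel

        // Color table: the first cpp characters are the pixel code,
        // the rest is expected to look like "c <color>".
        Map<String, Integer> palette = new HashMap<>();
        for (int i = 1; i <= colors; i++) {
            String row = rows.get(i);
            String[] spec = row.substring(cpp).trim().split("\\s+");
            palette.put(row.substring(0, cpp), colorOf(spec[1]));
        }

        // Pixel rows follow the color table, one string per image row.
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < height; y++) {
            String row = rows.get(1 + colors + y);
            for (int x = 0; x < width; x++) {
                img.setRGB(x, y, palette.get(row.substring(x * cpp, (x + 1) * cpp)));
            }
        }
        return img;
    }
}
```

To use it, you would call unquote on every line returned by readLine(), collect the non-null results in a list, and pass that list to toImage. The resulting BufferedImage can then be drawn with Graphics.drawImage in the paint method of a custom component.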

I don't know if this will work for you: http://www.bolthole.com/java/Xpm.html , but I reckon that once the image is converted into a Java image, you should be able to do whatever you want with it in Java.

Related

Does BufferedWriter require getters?

Hi, I know that getters are used to restrict access. I have a BufferedWriter that uses some information from another class, so I used getters to obtain that information and wrote it to a file with the BufferedWriter.
The problem arises when I try to use information from the same class as the BufferedWriter. It doesn't write those details. There is no error in the code; it just doesn't write them. If the data is in the same class as the BufferedWriter, I assume it doesn't need to be accessed through getters, although the values are stored in another method. Can you explain this?
Thanks a lot.
bufWrite.write("Your Character Class:" + character_Class + "\n");
bufWrite.write("Your Character Level:" + level + "\n");
The character class and level here come from a class called character. To access this information I used getters, since the BufferedWriter is inside another class. (Basically I have 3 classes: the character details are in one, and the BufferedWriter that stores these details in a file is in a different one.)
These details are written to the file completely.
for(Object o:skillInfo){
bufWrite.write("" + o + "- Rank(skill Points) :" + rank);
}
I am trying to use this for-each loop to write the contents of a LinkedList. This linked list is in the same class as the BufferedWriter statements, but it doesn't get written, while the other details (listed above) do. The only difference is that those details use getters, since they are not in the same class as the BufferedWriter, while the linked list is in the same class and thus doesn't use a getter.
I hope this is clear enough.
Update:
Also please note that character.Level and rank are user-entered values.
To answer your question, a BufferedWriter by itself does not 'require' getters.
If your BufferedWriter is not writing data, then the problem might be in how you are actually writing it. A BufferedWriter keeps output in memory until its buffer fills or it is flushed, so also check that you call .close() (or .flush()) when you are done writing.
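As an illustration of the close() point, here is a minimal sketch that uses try-with-resources so the writer is always closed. The class name, method name, and parameters are hypothetical and only loosely modelled on the snippets in the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class WriterCloseSketch {
    // Writes the two character lines and then one line per skill.
    static void save(Path file, String characterClass, int level, List<String> skills)
            throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write("Your Character Class:" + characterClass);
            out.newLine();
            out.write("Your Character Level:" + level);
            out.newLine();
            for (String skill : skills) {
                out.write(skill);
                out.newLine();
            }
        } // close() runs here and flushes whatever is still in the buffer
    }
}
```

If the loop over the list writes nothing even with the writer closed properly, the next thing to check would be whether the list is actually populated at the time the method runs.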
Another problem might be how your data is returned from the getter.
Your question is rather ambiguous, so any answer can only be a guess. I recommend reading up on how to make a good question, so that people can provide better answers.
https://stackoverflow.com/help/how-to-ask
https://stackoverflow.com/help/mcve
Edit:
Thank you for updating your question.
If you want to write out the details of character_Class, you do need to call the getters for the specific information that you want from the class.
The reason why bufWrite.write("Your Character Class:" + character_Class + "\n"); doesn't throw an exception or crash is the way Java converts objects to String.
The line of code:
"Your Character Class:" + character_Class
is the same as:
"Your Character Class:" + character_Class.toString()
Same happens with objects from skillInfo:
bufWrite.write("" + o.toString() + "- Rank(skill Points) :" + rank);
Basically, if you concatenate a String and an object with '+', Java calls toString() on the object (a null reference becomes the text "null"). If you concatenate a String and a primitive, the primitive is converted to a String first.
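A small sketch of that behaviour, using two made-up classes (Plain and Skill are not from the question): one relies on the toString() inherited from Object, the other overrides it.

```java
class ConcatSketch {
    // No toString() override: the version inherited from Object is used.
    static class Plain { }

    // Overrides toString(), so concatenation produces readable text.
    static class Skill {
        private final String name;
        private final int rank;

        Skill(String name, int rank) {
            this.name = name;
            this.rank = rank;
        }

        @Override
        public String toString() {
            return name + " - Rank(skill points): " + rank;
        }
    }

    public static void main(String[] args) {
        System.out.println("" + new Skill("Stealth", 3)); // uses Skill.toString()
        System.out.println("" + new Plain());             // class name, '@', hash code
        System.out.println("Level: " + 7);                // primitive converted to text
    }
}
```

So if the objects in skillInfo come out as unreadable text, overriding toString() in their class (or calling a getter explicitly) is the usual fix.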

Convert string into utf-8 in php

There is a script written in Java and I am trying to convert it into PHP, but I'm not getting the same output.
How can I get it in PHP same as in Java?
Script written in Java
String key = "hghBGJH/gjhgRGB+rfr4654FeVw12y86GHJGHbnhgnh+J+15F56H";
byte[] kSecret = ("AWS4" + key).getBytes("UTF8");
Output: [B@167cf4d
Script written in PHP
$secret_key = "hghBGJH/gjhgRGB+rfr4654FeVw12y86GHJGHbnhgnh+J+15F56H";
utf8_encode("AWS4".$secret_key);
Output: AWS4hghBGJH/gjhgRGB+rfr4654FeVw12y86GHJGHbnhgnh+J+15F56H
The result [B@167cf4d you are getting comes from a toString() call on the byte array. The [B means the value is a byte array, and 167cf4d is the object's identity hash code in hexadecimal (often described as its memory location, although it need not be one). You simply cannot get the PHP script to give you the same result. What you need to do is fix the printing on the Java side, but we don't know what API you're using to print it and where. Is it a web application, a local application, etc.?
edit:
Since you're using it in a Java web-application, there are two likely scenarios. The code is either in a Servlet, or in a JSP. If it's in a servlet, you have to set your output to UTF-8, then you can simply print the string like so:
response.setCharacterEncoding("UTF-8");
PrintWriter out = response.getWriter();
out.write("AWS4");
out.write(key);
If it's in a JSP, that makes it a bit more inconvenient, because you have to make sure you don't leave whitespace such as newlines before the printing begins. It's not a problem for regular HTML pages, but it is a problem if you need precisely formatted output down to the last byte. The response and out objects are readily available there. You will likely have some <%@page tags with attributes there, so you have to make sure you open the next one right as you close the last. A newline at the end of the file could also skew the result, so in general JSP is not recommended for whitespace-sensitive data output.
You can't print a byte array with System.out without converting it to a String first. toString() is called on the array when you do something like:
System.out.println("Weird output: " + byteArray);
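This will output the garbage you mentioned above. Instead, create a new String from the bytes. With no charset argument the platform default charset is used (UTF-8 by default since Java 18), so naming the charset explicitly is safer.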
System.out.println("UTF-8: " + new String(byteArray));
In Java you get the object reference of the byte array, not its value. Try:
new String(kSecret, "UTF-8");
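To compare the Java and PHP sides byte for byte, it may help to print the array's contents rather than its reference. The following is a minimal sketch; the class name, the toHex helper, and the key value are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BytesSketch {
    // Lowercase hex dump of the bytes, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String key = "exampleKey"; // hypothetical key, not the one from the question
        byte[] kSecret = ("AWS4" + key).getBytes(StandardCharsets.UTF_8);

        System.out.println(kSecret);                  // type tag and hash code, e.g. [B@167cf4d
        System.out.println(Arrays.toString(kSecret)); // the byte values as decimal numbers
        System.out.println(new String(kSecret, StandardCharsets.UTF_8)); // the original text
        System.out.println(toHex(kSecret));           // hex form, easy to compare across languages
    }
}
```

On the PHP side, bin2hex("AWS4" . $secret_key) should give the same hex string as toHex, provided the key contains only ASCII characters as it does in the question.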

How to write the items of an ArrayList<ArrayList> to a file with Java?

I read a text file containing a list of words with their tags and put each row as an ArrayList into a wrapping ArrayList (ArrayList<ArrayList>).
[[1, that, that, that, DT, DT], [2, table, table, table, NN, NN]]
Now I want to write them to a text file in the following format:
1 that that that DT DT
2 table table table NN NN
Each of the above rows is an ArrayList with 6 columns.
The following code produces a file with Ԁ inside.
public void setPPOSOfWordInDevelopmentList(ArrayList<ArrayList> trainingList){
try{
FileOutputStream streamFile = new FileOutputStream("developmentFile.txt");
ObjectOutputStream streamFileWriter = new ObjectOutputStream(streamFile);
for(ArrayList word: developmentWordsList){
String inputWord = (String)word.get(1);
extractTag(inputWord,trainingList);
String extractedPPOSofWord =(String)findMaxTag().get(1);
word.set(5, extractedPPOSofWord);
}
streamFileWriter.close();
System.out.println(developmentWordsList);
}
catch(Exception e){
System.out.println("Something went wrong, check the code");
}
}
This code is coupled with some other code, so it is not easy to change the format of the objects returned by the functions.
If you want to write a simple text file, it would be better to use a BufferedWriter. You can build the content in a StringBuffer or a StringBuilder if it is long. In the post below I answered a question about the kind of formatting you're trying to do, but you will need to adapt it to your format and to the logic of the wrapping array.
Export array values to csv file java
I think the loop, or "enhanced for" statement, should look something like:
for (ArrayList<String> innerArray : wrapperArray) {
    for (String word : innerArray) {
        // Adapt to your required format using a StringBuilder
    }
}
// Here at the end, save the content of your StringBuilder or StringBuffer using the BufferedWriter.
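Filling in that outline, a minimal sketch could look like the one below. The class and method names are hypothetical, and it assumes every cell should be written with its toString() value, separated by single spaces, as in the desired output shown in the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class RowsWriterSketch {
    // Writes one text line per inner list, cells separated by a single space.
    static void writeRows(Path file, List<? extends List<?>> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (List<?> row : rows) {
                StringBuilder line = new StringBuilder();
                for (Object cell : row) {
                    if (line.length() > 0) {
                        line.append(' ');
                    }
                    line.append(cell);
                }
                out.write(line.toString());
                out.newLine();
            }
        }
    }
}
```

Unlike ObjectOutputStream, which writes Java's binary serialization format (the likely source of the odd characters in the file), a Writer produces plain text that any editor can open.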
Hope you can get an idea on how to achieve this. Best regards :)
What you want sounds eerily like a standard CSV file. This stackoverflow thread will set you straight on how to parse that sort of content. I would strongly recommend that you refactor along the lines of a CSV file instead of using the ObjectInput/OutputStreams. It'll be easier to maintain and you'll be able to use tools like Excel and OpenOffice Calc to view your files when debugging.
If you are set on using a custom file format, you can use formatted printing and add padding accordingly. It's pretty easy:
for (ArrayList<String> list : trainingList) {
    // writeToStream is a placeholder for whatever method writes a line to your output
    writeToStream(
        String.format(
            "%s, %-5s, %-5s, %-5s",
            list.get(0), list.get(1), list.get(2), list.get(3)));
}
This should work if your strings aren't longer than five characters. Just keep in mind that blank characters are bad delimiters, and you will face alignment problems if you view the file in anything other than a monospaced font.

How can I speed up my Java text file parser?

I am reading about 600 text files (about 400MB in total), parsing each file individually, and adding all the terms to a map so I can know the frequency of each word across the 600 files.
My parser function includes the following steps (in order):
find the text between two tags, which is the relevant text to read in each file.
lowercase all the text.
string.split with multiple delimiters.
create an ArrayList with words like "aaa-aa", add them to the array produced by the split above, and remove "aaa" and "aa" from the String[]. (I did this because I wanted "-" to be a delimiter, but I also wanted "aaa-aa" to be one word only, not "aaa" and "aa".)
take the String[] and add it to a Map = new HashMap ... (word, frequency)
print everything.
It is taking about 8 minutes and 48 seconds on a dual-core 2.2GHz machine with 2GB RAM. I would like advice on how to speed this process up. Should I expect it to be this slow? And if possible, how can I find out (in NetBeans) which functions take the most time to execute?
Unique words found: 398752.
CODE:
File file = new File(dir);
String[] files = file.list();
for (int i = 0; i < files.length; i++) {
BufferedReader br = new BufferedReader(
new InputStreamReader(
new BufferedInputStream(
new FileInputStream(dir + files[i])), encoding));
try {
String line;
while ((line = br.readLine()) != null) {
parsedString = parseString(line); // parse the string
m = stringToMap(parsedString, m);
}
} finally {
br.close();
}
}
EDIT: Check this:
[screenshot]
I don't know what to conclude.
EDIT: 80% TIME USED WITH THIS FUNCTION
public String [] parseString(String sentence){
// separators; ,:;'"\/<>()[]*~^ºª+&%$ etc..
String[] parts = sentence.toLowerCase().split("[,\\s\\-:\\?\\!\\«\\»\\'\\´\\`\\\"\\.\\\\\\/()<>*º;+&ª%\\[\\]~^]");
Map<String, String> o = new HashMap<String, String>(); // save the hyphened words, aaa-bbb like Map<aaa,bbb>
Pattern pattern = Pattern.compile("(?<![A-Za-zÁÉÍÓÚÀÃÂÊÎÔÛáéíóúàãâêîôû-])[A-Za-zÁÉÍÓÚÀÃÂÊÎÔÛáéíóúàãâêîôû]+-[A-Za-zÁÉÍÓÚÀÃÂÊÎÔÛáéíóúàãâêîôû]+(?![A-Za-z-])");
Matcher matcher = pattern.matcher(sentence);
// Find all matches like this: ("aaa-bb or bbb-cc") and put it to map to later add this words to the original map and discount the single words "aaa-aa" like "aaa" and "aa"
for(int i=0; matcher.find(); i++){
String [] tempo = matcher.group().split("-");
o.put(tempo[0], tempo[1]);
}
//System.out.println("words: " + o);
ArrayList temp = new ArrayList();
temp.addAll(Arrays.asList(parts));
for (Map.Entry<String, String> entry : o.entrySet()) {
String key = entry.getKey();
String value = entry.getValue();
temp.add(key+"-"+value);
if(temp.indexOf(key)!=-1){
temp.remove(temp.indexOf(key));
}
if(temp.indexOf(value)!=-1){
temp.remove(temp.indexOf(value));
}
}
String []strArray = new String[temp.size()];
temp.toArray(strArray);
return strArray;
}
600 files, each file about 0.5MB
EDIT 3: The pattern is no longer compiled each time a line is read. The new images are:
[screenshot]
Be sure to increase your heap size, if you haven't already, using -Xmx. For this app, the impact may be striking.
The parts of your code that are likely to have the largest performance impact are the ones that are executed the most - which are the parts you haven't shown.
Update after memory screenshot
Look at all those Pattern$6 objects in the screenshot. I think you're recompiling the pattern a lot - maybe for every line. That would take a lot of time.
Update 2 - after code added to question.
Yup - two patterns compiled on every line - the explicit one, and also the "-" in the split (much cheaper, of course). I wish they hadn't added split() to String without it taking a compiled pattern as an argument. I see some other things that could be improved, but nothing else like the big compile. Just compile the pattern once, outside this function, maybe as a static class member.
Try to use a single regex with a group that matches each word that is within the tags, so that one regex could be used for your entire input and there would be no separate "split" stage.
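One possible reading of that suggestion, covering only the word-matching part and not the tag handling, is sketched below. The class name and the pattern are assumptions, and the pattern is not equivalent to the one in the question: it treats any run of Unicode letters, optionally joined by single hyphens, as one word, and it skips digits entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordScanSketch {
    // Compiled once. Matches "aaa" as well as "aaa-bb" as a single word.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:-\\p{L}+)*");

    // Collects every word of the line, lowercased, without a split step.
    static List<String> words(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(line.toLowerCase());
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }
}
```

Because hyphenated words come out whole, there is no need for the second pass that adds "aaa-aa" and then removes "aaa" and "aa" from the list.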
Otherwise your approach seems reasonable, although I don't understand what you mean by "get the String [] ..." - I thought you were using an ArrayList. In any event, try to minimize the creation of objects, for both construction cost and garbage collection cost.
Is it just the parsing that's taking so long, or is it the file reading as well?
For the file reading, you can probably speed that up by reading the files on multiple threads. But first step is to figure out whether it's the reading or the parsing that's taking all the time so you can address the right issue.
Run the code through the Netbeans profiler and find out where it is taking the most time (right mouse click on the project and select profile, make sure you do time not memory).
Nothing in the code that you have shown us is an obvious source of performance problems. The problem is likely to be something to do with the way that you are parsing the lines or extracting the words and putting them into the map. If you want more advice you need to post the code for those methods, and the code that declares / initializes the map.
My general advice would be to profile the application and see where the bottlenecks are, and use that information to figure out what needs to be optimized.
@Ed Staub's advice is also sound. Running an application with a heap that is too small can result in serious performance problems.
If you aren't already doing it, use BufferedInputStream and BufferedReader to read the files. Double-buffering like that is measurably better than using BufferedInputStream or BufferedReader alone. E.g.:
BufferedReader rdr = new BufferedReader(
new InputStreamReader(
new BufferedInputStream(
new FileInputStream(aFile)
)
/* add an encoding arg here (e.g., ', "UTF-8"') if appropriate */
)
);
If you post relevant parts of your code, there'd be a chance we could comment on how to improve the processing.
EDIT:
Based on your edit, here are a couple of suggestions:
Compile the pattern once and save it as a static variable, rather than compiling every time you call parseString.
Store the values of temp.indexOf(key) and temp.indexOf(value) when you first call them and then use the stored values instead of calling indexOf a second time.
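Both suggestions could look roughly like this. It is a sketch only: the class and method names are hypothetical, and the separator set is shortened compared with the one in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledSketch {
    // Compiled once when the class is loaded, not once per line.
    private static final Pattern SEPARATORS = Pattern.compile("[,\\s:;?!.()]+");

    // Pattern.split reuses the compiled pattern; String.split would go
    // through the regex machinery again on each call for a pattern like this.
    static List<String> split(String sentence) {
        return new ArrayList<>(Arrays.asList(SEPARATORS.split(sentence.toLowerCase())));
    }

    // Looks the word up once and reuses the index for the removal.
    static void removeFirst(List<String> words, String word) {
        int index = words.indexOf(word);
        if (index != -1) {
            words.remove(index);
        }
    }
}
```

Note that indexOf on an ArrayList is a linear scan, so halving the number of calls helps, but a different data structure would help more if lines are long.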
It looks like it's spending most of its time in regular expressions. I would first try writing the code without regular expressions, and then use multiple threads if the process still appears to be CPU bound.
For the counter, I would look at using TObjectIntHashMap to reduce the overhead of the counter. I would use only one map, rather than creating an array of strings and counts that is then used to build another map; that could be a significant waste of time.
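The "only one map" idea can be sketched with the standard library alone. Note the substitution: this uses a plain HashMap<String, Integer> rather than Trove's TObjectIntHashMap, so it avoids the extra dependency but still pays for boxing the counts. The class and method names are hypothetical.

```java
import java.util.Map;

class CountSketch {
    // Adds the words straight into the shared frequency map,
    // without building an intermediate array or a per-line map.
    static void count(Iterable<String> words, Map<String, Integer> freq) {
        for (String w : words) {
            if (!w.isEmpty()) {
                freq.merge(w, 1, Integer::sum); // insert 1 or add 1 to the existing count
            }
        }
    }
}
```

The same map instance would be passed in for every line of every file, so nothing is copied between maps.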
Precompile the pattern instead of compiling it every time through that method, and get rid of the double buffering: use new BufferedReader(new FileReader(...)).

Rename Pdf from Pdf title

I want to organize the PDF files I have downloaded from the internet. Many of them are clearly ill-named, so I want to extract the real title from each file. Many of them are generated from LaTeX, and I think we can find the \title{} keyword or something like it in the compiled PDF. I then want to use this to rename the file.
I can read the metadata using pypdf, but most PDFs do not contain the title in their metadata. I have tried it with my whole collection and found none!
Two questions:
1. Is it possible to read the title from a PDF compiled from LaTeX?
2. Which library (mainly in C/C++, Java, or Python) can I use to get that information?
Thanks in advance.
I think this is not really possible. The LaTeX information is no longer present in the pdf. If the title is not present in the metadata, you might be able to deduce the title from the structure information if it is a "tagged pdf". Most pdfs aren't however, and those that are will probably provide the metadata anyway.
This leaves you with layout analysis: try to determine what is the title from the document by looking at layout characteristics. For python, you might want to have a look at pdfminer.
The following example uses pdfminer to determine the title using a rather simplistic approach:
we assume that the title is somewhere on the first page
we leave it to pdfminer to recognize "blocks of text" on the first page
we assume that the title is printed "bigger" than the rest of the page. Looking at the height of each line in the text blocks, we determine which block contains the "tallest" line, and assume that that block contains the title
we let pdfminer extract the text from the block,
the text will probably contain newlines (placed by pdfminer) because the title might contain more than one line, and other needless whitespace, so we do some simple whitespace normalization (replace consecutive whitespace by a single space, and strip leading and trailing whitespace), and that's it!
As I said: this approach is rather simplistic, and might or might not give good results for your documents, but it may point you in the right direction. Here it goes:
import sys
import re
from pdfminer.pdfparser import PDFParser, PDFDocument
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox
filename = sys.argv[1]
fp = open(filename, 'rb')
parser = PDFParser(fp)
doc = PDFDocument()
parser.set_document(doc)
doc.set_parser(parser)
doc.initialize()
rsrcmgr = PDFResourceManager()
laparams = LAParams()
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
interp = PDFPageInterpreter(rsrcmgr, device)
pages = doc.get_pages()
first_page = pages.next()
interp.process_page(first_page)
layout = device.get_result()
textboxes = [i for i in layout if isinstance(i, LTTextBox)]
box_with_tallest_line = max(textboxes, key=lambda x: max(i.height for i in x))
text = box_with_tallest_line.get_text()
print re.sub('\s+', ' ', text).strip()
I'll leave renaming the file to you (note that the title might contain characters that you might not want, or that are not even valid in filenames). Pdfminer documentation is rather sparse at the moment, so you might want to ask on the mailing list if you need to know more. (don't know very much about it myself, but couldn't resist trying ;-)). Or you might try a similar approach with other pdf libraries/other languages.
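For the renaming step, a minimal Java sketch might look like this. Everything in it is an assumption rather than part of the answer above: the class and method names are made up, the set of characters treated as invalid is a conservative guess that covers the usual Windows and Unix restrictions, and the 100-character limit is arbitrary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RenameSketch {
    // Replaces characters that are commonly invalid in file names with spaces,
    // collapses whitespace, and falls back to "untitled" for empty results.
    static String safeName(String title) {
        String cleaned = title.replaceAll("[\\\\/:*?\"<>|\\p{Cntrl}]", " ")
                              .replaceAll("\\s+", " ")
                              .trim();
        if (cleaned.length() > 100) {
            cleaned = cleaned.substring(0, 100).trim();
        }
        return cleaned.isEmpty() ? "untitled" : cleaned;
    }

    // Renames the PDF next to its current location. Files.move throws
    // FileAlreadyExistsException if a file with the new name already exists.
    static Path renameTo(Path pdf, String title) throws IOException {
        Path target = pdf.resolveSibling(safeName(title) + ".pdf");
        return Files.move(pdf, target);
    }
}
```

Letting the move fail on an existing target is deliberate here, since two PDFs with the same guessed title should probably be looked at by hand.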
In python, your best bet is to look at pyPdf (Debian package: python-pypdf). Here's some code:
import pyPdf, sys
filename=sys.argv[1]
i=pyPdf.PdfFileReader(open(filename,"rb"))
d=i.getDocumentInfo()
print d["/Title"]
In my experience, few PDFs have the "/Title" attribute set, though, so your mileage may vary. In that case, you'll have to guess the title from the contents, which is bound to be error-prone. pyPdf may help you with that as well.
Try iText (Java). I found this example, try it (you may add generics, if supported):
PdfReader reader = new PdfReader("yourpdf.pdf");
HashMap map= reader.getInfo();
Set keys = map.keySet();
Iterator i = keys.iterator();
while(i.hasNext()) {
String thiskey = (String)i.next();
System.out.println(thiskey + ":" + (String)map.get(thiskey));
}
Another option for C++ is Poppler.
I tried to do something similar in the past (and was asking advice here:
Extracting text from PDF with Poppler (C++) ) but never really got it working. At the end of the day I realised that at least for my use, it was easier to manually rename the files.
The best solution I found for renaming PDF files using not just the title but any text you need from the PDF file is the A-PDF Rename app; it worked very well for all the files I tried.
