I am trying to read content from a SequenceInputStream,
but it returns null, even though the debugger shows that it contains a value (please refer to the screenshot).
The following is my code:
InputStream is = m.getContent(InputStream.class);
SequenceInputStream si = new SequenceInputStream(is, null);
int j;
int i=0;
while((i=si.read())!=-1) {
System.out.print((char)i);
}
I also tried to read it directly from the InputStream,
but I get the same output.
The following is my InputStream reading code:
CachedOutputStream bos = new CachedOutputStream();
IOUtils.copy(is,bos);
String soapMessage = new String(bos.getBytes());
System.out.println("-------------------------------------------");
System.out.println("incoming message is " + soapMessage);
System.out.println("-------------------------------------------");
bos.flush();
Both streams passed to SequenceInputStream must be non-null. If the second stream is null, you get a NullPointerException as soon as the first one has been read to the end.
If you change your code to SequenceInputStream si = new SequenceInputStream(null, is); you get the NPE immediately.
Why do you need a SequenceInputStream to read the content at all? You can use the code below, or any other stream reader:
StringWriter writer = new StringWriter();
IOUtils.copy(inputStream, writer, encoding);
String theString = writer.toString();
Thanks,
Gowtham
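To make the point about non-null streams concrete, here is a minimal sketch that chains two in-memory streams. The class name SequenceDemo and the helper streamOf are made up for illustration, and UTF-8 is an assumption about the message encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SequenceDemo {
    // Reads two non-null streams back to back and decodes the bytes as UTF-8.
    static String readBoth(InputStream first, InputStream second) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try (InputStream in = new SequenceInputStream(first, second)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Hypothetical helper: wraps a string so the demo needs no real message.
    static InputStream streamOf(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // An empty stream is a safe stand-in when there is nothing to append.
        System.out.println(readBoth(streamOf("<soap/>"), streamOf("")));
    }
}
```

If there really is only one stream to read, passing an empty stream instead of null avoids the exception, although reading the single stream directly is simpler.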
I am trying to copy some data from a PDF to a txt file. Here is the code:
public void readPDFFile() throws IOException {
InputStreamReader reader;
OutputStreamWriter writer;
FileInputStream inputstream;
FileOutputStream outputStream;
BufferedReader bufferedReader = null;
BufferedWriter bufferedWriter = null;
String str;
File rfile = new File(
"C://Documents and Settings/Administrator/My Documents/EGDownloads/source.pdf");
File wFile = new File("C://Documents and Settings/Administrator/My Documents/Folder/destination.txt");
try {
inputstream = new FileInputStream(rfile);
outputStream = new FileOutputStream(wFile);
reader = new InputStreamReader(inputstream, "UTF-8");
writer = new OutputStreamWriter(outputStream, "UTF-8");
bufferedReader = new BufferedReader(reader);
bufferedWriter = new BufferedWriter(writer);
while ((str = bufferedReader.readLine()) != null) {
writer.write(str);
}
} catch (IOException es) {
System.out.println(es.getMessage());
es.printStackTrace(System.out);
} finally {
if (bufferedReader != null) {
bufferedReader.close();
}
if (bufferedWriter != null)
bufferedWriter.close();
}
}
The expected output is in another language, but all I get is random boxes. I have tried both UTF-16 and UTF-8.
I also tried PDFBox, but it still does not work: all I get is the accents of the original language, and the rest in English.
Note:
1. I am not trying to print the data to the console; I am copying it from a PDF to a txt file.
2. The other file contains non-English words.
Can anyone help me solve this,
or point me to a link that might help?
Thanks.
The PDF format is a binary format. You must have a really special PDF, as all the ones I know of are compressed in some way. Use a proper library to read it, such as PDFBox or iText. Be aware that for some PDFs it is impossible to extract the text. You can check this with Acrobat: if Acrobat can't do it, nobody can.
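As a small illustration of why a Reader is the wrong tool here, the sketch below checks whether a file starts with the "%PDF-" marker that PDF files begin with. If it does, the bytes should go to a PDF library rather than an InputStreamReader. The class and method names are made up for this example, and it does not extract any text itself:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class PdfSniffer {
    // Returns true when the data starts with the ASCII marker "%PDF-".
    static boolean looksLikePdf(byte[] firstBytes) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (firstBytes.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (firstBytes[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to inspect (whole file read for brevity).
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(looksLikePdf(data)
                ? "Binary PDF: hand it to a PDF library"
                : "No PDF header found");
    }
}
```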
I have to use a method whose signature is like this
aMethod(FileInputStream);
I call that method like this
FileInputStream inputStream = new FileInputStream(someTextFile);
aMethod(inputStream);
I want to remove or edit some characters read from someTextFile before the stream is passed into aMethod(inputStream);
I cannot change aMethod's signature or overload it, and it only takes an InputStream.
If the method took a String as a parameter, I wouldn't be asking this question.
I am an InputStream noob. Please advise.
You can convert a String into an InputStream:
String str = "Converted stuff from reading the other inputfile and modifying it";
InputStream is = new ByteArrayInputStream(str.getBytes());
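Putting the two steps together (read and edit the text, then wrap it in a stream) might look like the sketch below. The names EditedStream, editedStream and drain are invented for illustration, UTF-8 is an assumed file encoding, and '#' is just a sample character to remove:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class EditedStream {
    // Decodes the bytes as UTF-8 (an assumption), drops every occurrence of
    // `unwanted`, and serves the edited text as a new in-memory stream.
    static InputStream editedStream(byte[] original, char unwanted) {
        String text = new String(original, StandardCharsets.UTF_8);
        String edited = text.replace(String.valueOf(unwanted), "");
        return new ByteArrayInputStream(edited.getBytes(StandardCharsets.UTF_8));
    }

    // Demo helper: reads a stream to the end and decodes it as UTF-8.
    static String drain(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        InputStream edited = editedStream(original, '#');
        System.out.println(drain(edited)); // in real code: aMethod(edited)
    }
}
```

This only helps if aMethod accepts a plain InputStream. If it really insists on a FileInputStream, one option is to write the edited text to a temporary file and open that instead.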
Here is something that might help. It will grab your .txt file. Then it will load it and go through line by line. You have to fill in the commented areas to do what you want.
public void parseFile() {
String inputLine;
String filename = "YOURFILE.txt";
Thread thisThread = Thread.currentThread();
ClassLoader loader = thisThread.getContextClassLoader();
InputStream is = loader.getResourceAsStream(filename);
try {
FileWriter fstream = new FileWriter("path/to/NEWFILE.txt");
BufferedWriter out = new BufferedWriter(fstream);
BufferedReader reader = new BufferedReader(
new InputStreamReader(is));
while((inputLine = reader.readLine()) != null) {
String[] str = inputLine.split("\t");
if(/* IF WHAT YOU WANT IS IN THE FILE ADD IT */) {
// DO SOMETHING OR ADD WHAT YOU WANT
out.append(inputLine); // append the line itself; a String[] cannot be appended
out.newLine();
}
}
reader.close();
out.close();
} catch (Exception e) {
e.printStackTrace(); // report the failure instead of silently discarding it
}
}
Have you looked at FilterInputStream? It also extends InputStream and may fit your requirement.
From the documentation for the class:
A FilterInputStream contains some other input stream, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
Also have a look at this question, which seems to be similar to yours.
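A FilterInputStream subclass that drops one byte value on the fly could be sketched as follows. SkipByteInputStream and its filter helper are hypothetical names. Because it works on raw bytes, it is only safe for single-byte characters (for example ASCII characters in an ASCII or UTF-8 file), and skip() and available() are left untouched for brevity:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SkipByteInputStream extends FilterInputStream {
    private final int unwanted; // byte value (0-255) to drop

    SkipByteInputStream(InputStream in, int unwanted) {
        super(in);
        this.unwanted = unwanted & 0xFF;
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = super.read();
        } while (b == unwanted); // -1 (end of stream) never equals a byte value
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int kept;
        do {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            kept = 0;
            // Compact the chunk in place, keeping only the wanted bytes.
            for (int i = 0; i < n; i++) {
                byte b = buf[off + i];
                if ((b & 0xFF) != unwanted) {
                    buf[off + kept++] = b;
                }
            }
        } while (kept == 0); // chunk held only unwanted bytes, so read again
        return kept;
    }

    // Demo helper: filters an ASCII string through the stream.
    static String filter(String ascii, char unwanted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try (InputStream in = new SkipByteInputStream(
                new ByteArrayInputStream(ascii.getBytes(StandardCharsets.US_ASCII)), unwanted)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

The wrapped stream can be passed anywhere an InputStream is expected, for example new SkipByteInputStream(new FileInputStream(someTextFile), ',').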
How can I convert an InputStreamReader to an InputStream? I have an InputStream that contains some string data and some byte data, and I want to parse it. So I wrap my InputStream in a BufferedReader and read 3 lines from it. After that I want to get the rest of the data (bytes) as is, but when I try to read it, nothing happens.
Code snippet:
BufferedReader br = new BufferedReader(new InputStreamReader(is,"UTF-8"));
String endOfData = br.readLine();
String contentDisposition = br.readLine();
String contentType = br.readLine();
file = new File(filename);
if(file.exists()) file.delete();
file.createNewFile();
FileOutputStream fos = new FileOutputStream(file);
byte[] data = new byte[8192];
int len = 0;
while (-1 != (len = is.read(data)) )
{
fos.write(data, 0, len);
Log.e("len", len+"");
}
fos.flush();
fos.close();
is.close();
The file is empty. If I don't wrap the InputStream it works fine, but I need to read the 3 lines and remove them.
Thanks.
If you want to mix text and byte data, and you control the writing side, you should write those 3 lines with DataOutputStream.writeUTF and read them back with DataInputStream.readUTF. This way a single InputStream will be able to retrieve all the data that you need.
Take a look at commons-io's ReaderInputStream: it is a little heavy handed, but you can wrap the BufferedReader with that and read it as an input stream again.
It's pretty hard to mix byte and character input correctly, especially once you start throwing buffered readers / streams into the mix. I'd suggest that you either pick one and stick with it (converting your bytes to strings as necessary; care with the encoding!) or wrap the entire thing in a ZipOutputStream so you can have multiple logical "files" with different contents.
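The file in the question ends up empty because BufferedReader reads ahead: it pulls a large block of bytes out of the underlying stream into its own buffer, so the bytes after the third line are already gone when is.read(data) is called. One way around this, sketched below, is to read the header lines byte by byte from the same stream that later delivers the body. HeaderThenBytes and its methods are made-up names, and UTF-8 headers are an assumption:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class HeaderThenBytes {
    // Reads one line directly from the stream, one byte at a time, so that
    // nothing after the line terminator is consumed. Returns null at end of stream.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--; // tolerate Windows-style CRLF endings
        }
        return new String(bytes, 0, len, StandardCharsets.UTF_8);
    }

    // Demo helper: skips `lineCount` text lines and returns the remaining raw bytes.
    static byte[] bodyAfterLines(byte[] data, int lineCount) {
        InputStream in = new ByteArrayInputStream(data);
        ByteArrayOutputStream rest = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < lineCount; i++) {
                readLine(in);
            }
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                rest.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rest.toByteArray();
    }
}
```

For a network or file source, wrapping the stream once in a BufferedInputStream and using that same object for both the lines and the body keeps the byte-by-byte reads cheap.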
Virtually every code example out there reads a TXT file line-by-line and stores it in a String array. I do not want line-by-line processing because I think it's an unnecessary waste of resources for my requirements: All I want to do is quickly and efficiently dump the .txt contents into a single String. The method below does the job, however with one drawback:
private static String readFileAsString(String filePath) throws java.io.IOException{
byte[] buffer = new byte[(int) new File(filePath).length()];
BufferedInputStream f = null;
try {
f = new BufferedInputStream(new FileInputStream(filePath));
f.read(buffer);
if (f != null) try { f.close(); } catch (IOException ignored) { }
} catch (IOException ignored) { System.out.println("File not found or invalid path.");}
return new String(buffer);
}
... the drawback is that the line breaks are converted into long spaces e.g. " ".
I want the line breaks to be converted from \n or \r to <br> (HTML tag) instead.
Thank you in advance.
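What about using a Scanner and adding the line breaks yourself:
// Pass a File: new Scanner("sample.txt") would scan the literal string instead.
java.util.Scanner sc = new java.util.Scanner(new java.io.File("sample.txt"));
StringBuilder buf = new StringBuilder();
while (sc.hasNextLine()) {
    buf.append(sc.nextLine());
    buf.append("<br />");
}
sc.close();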
I don't see where you get your long spaces from.
You can read directly into the buffer and then create a String from the buffer:
File f = new File(filePath);
FileInputStream fin = new FileInputStream(f);
byte[] buffer = new byte[(int) f.length()];
new DataInputStream(fin).readFully(buffer);
fin.close();
String s = new String(buffer, "UTF-8");
You could add this code:
return new String(buffer).replaceAll("(\r\n|\r|\n|\n\r)", "<br>");
Is this what you are looking for?
The code reads the file contents exactly as they appear in the file, including line breaks.
If you want to turn the breaks into something else, for example to display the text in HTML, you either need to post-process the string or read the file line by line. Since you do not want the latter, you can replace your return statement with the following, which should do the conversion:
return (new String(buffer)).replaceAll("\r\n|\r|\n", "<br>");
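A self-contained version of this post-processing approach, as a rough sketch: the class and method names are invented for illustration, and UTF-8 is an assumption about the file encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class BrConverter {
    // Replaces every line break with <br>. "\r\n" is listed first so that a
    // Windows line ending produces one <br> rather than two.
    static String toHtmlBreaks(String text) {
        return text.replaceAll("\r\n|\r|\n", "<br>");
    }

    // Reads the whole file in one go, without line-by-line processing.
    static String readFileAsHtml(String filePath) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        return toHtmlBreaks(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFileAsHtml(args[0]));
    }
}
```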
StringBuilder sb = new StringBuilder();
try {
InputStream is = getAssets().open("myfile.txt");
byte[] bytes = new byte[1024];
int numRead = 0;
try {
while((numRead = is.read(bytes)) != -1)
sb.append(new String(bytes, 0, numRead));
}
catch(IOException e) {
}
is.close();
}
catch(IOException e) {
}
The resulting String is: String result = sb.toString();
Then replace whatever you want in this result.
I agree with the general approach by @Sanket Patel, but with Commons IO you would likely want FileUtils.
So your code would look like:
String myString = FileUtils.readFileToString(new File(filePath));
There is also another version to specify an alternate character encoding.
You should try org.apache.commons.io.IOUtils.toString(InputStream is) to get the file content as a String. You can pass it the InputStream object that you get from
getAssets().open("xml2json.txt") (this call belongs to Android and returns an InputStream)
in your Activity. To get the String, use this:
String xml = IOUtils.toString((getAssets().open("xml2json.txt")));
So, in general:
String xml = IOUtils.toString(pass_your_InputStream_object_here);
In Java, I am trying to parse an HTML file that contains complex text such as Greek symbols.
I encounter a known problem when text contains a left facing quotation mark. Text such as
mutations to particular “hotspot” regions
becomes
mutations to particular “hotspot�? regions
I have isolated the problem by writing a simple text copy method:
public static int CopyFile()
{
try
{
StringBuffer sb = null;
String NullSpace = System.getProperty("line.separator");
Writer output = new BufferedWriter(new FileWriter(outputFile));
String line;
BufferedReader input = new BufferedReader(new FileReader(myFile));
while((line = input.readLine())!=null)
{
sb = new StringBuffer();
//Parsing would happen
sb.append(line);
output.write(sb.toString()+NullSpace);
}
return 0;
}
catch (Exception e)
{
return 1;
}
}
Can anybody offer some advice as how to correct this problem?
My solution:
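InputStream in = new FileInputStream(myFile);
Reader reader = new InputStreamReader(in, "utf-8");
Reader buffer = new BufferedReader(reader);
Writer output = new BufferedWriter(new FileWriter(outputFile));
int r;
// Read through the buffered reader rather than the raw one.
while ((r = buffer.read()) != -1)
{
    if (r < 126)
    {
        // Characters below 126 are written unchanged.
        output.write(r);
    }
    else
    {
        // Everything else becomes a numeric character reference.
        output.write("&#" + Integer.toString(r) + ";");
    }
}
buffer.close();
output.close(); // close() also flushes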
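One caveat with the loop above: Reader.read() returns UTF-16 code units, so a character outside the Basic Multilingual Plane would be written as two surrogate references. A variant that walks over code points instead might look like this sketch. The names are hypothetical, and the threshold of 128 (all of ASCII) is my assumption rather than the value used above:

```java
class HtmlEscaper {
    // Writes ASCII unchanged and every other code point as a decimal
    // numeric character reference such as &#8220;.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp); // 2 for surrogate pairs, otherwise 1
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("mutations to particular \u201Chotspot\u201D regions"));
    }
}
```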
The file read is not in the same encoding (probably UTF-8) as the file written (probably ISO-8859-1).
Try the following to generate a file with UTF-8 encoding:
BufferedWriter output = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outputFile),"UTF8"));
Unfortunately, determining the encoding of a file is very difficult. See Java : How to determine the correct charset encoding of a stream
In addition to what Thierry-Dimitri Roy wrote, if you know the encoding you have to create your FileReader with a bit of extra work. From the docs:
Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.
The Javadoc for FileReader says:
The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.
In your case the default character encoding is probably not appropriate. Find what encoding the input file uses, and specify it. For example:
FileInputStream fis = new FileInputStream(myFile);
InputStreamReader isr = new InputStreamReader(fis, "charset name goes here");
BufferedReader input = new BufferedReader(isr);
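Combining the advice from these answers, a copy routine that names the charset on both the reading and the writing side could be sketched like this. CharsetCopy, copy and transcode are illustrative names, and which charsets to pass depends on your actual files:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

class CharsetCopy {
    // Decodes `in` with one charset and encodes to `out` with another, so
    // neither side falls back to the platform default encoding.
    static void copy(InputStream in, Charset inCharset,
                     OutputStream out, Charset outCharset) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, inCharset));
        Writer writer = new BufferedWriter(new OutputStreamWriter(out, outCharset));
        char[] chunk = new char[4096];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            writer.write(chunk, 0, n);
        }
        writer.flush(); // the caller stays responsible for closing the streams
    }

    // Demo helper: converts an in-memory byte array between charsets.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(input), from, sink, to);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

For the file case, pass a FileInputStream for myFile and a FileOutputStream for outputFile, for example with StandardCharsets.UTF_8 on both sides if the input is known to be UTF-8.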