I have a question about Java. I have used URLConnection to retrieve a DataInputStream, and I want to convert the DataInputStream into a String variable. How should I do this? Can anyone help me? Thank you.
The following is my code:
URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();
DataInputStream dis = new DataInputStream(dataConnection.getInputStream());
String data_string;
// convert the DataInputStream to the String
import java.net.*;
import java.io.*;
class ConnectionTest {
public static void main(String[] args) {
try {
URL google = new URL("http://www.google.com/");
URLConnection googleConnection = google.openConnection();
DataInputStream dis = new DataInputStream(googleConnection.getInputStream());
StringBuffer inputLine = new StringBuffer();
String tmp;
while ((tmp = dis.readLine()) != null) {
inputLine.append(tmp);
System.out.println(tmp);
}
// inputLine.toString() now holds the whole page source (readLine() drops the line breaks)
dis.close();
} catch (MalformedURLException me) {
System.out.println("MalformedURLException: " + me);
} catch (IOException ioe) {
System.out.println("IOException: " + ioe);
}
}
}
This is what you want.
You can use commons-io IOUtils.toString(dataConnection.getInputStream(), encoding) in order to achieve your goal.
DataInputStream is not meant for what you want, which is to read the content of a website as a String.
If you want to read data from a generic URL (such as www.google.com), you probably don't want to use a DataInputStream at all. Instead, create a BufferedReader and read line by line with the readLine() method. Use the URLConnection.getContentType() method to find out the content's charset (you will need it to create your reader properly).
Example:
URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();
// Find out charset, default to ISO-8859-1 if unknown
String charset = "ISO-8859-1";
String contentType = dataConnection.getContentType();
if (contentType != null) {
int pos = contentType.indexOf("charset=");
if (pos != -1) {
charset = contentType.substring(pos + "charset=".length());
}
}
// Create reader and read string data
BufferedReader r = new BufferedReader(
new InputStreamReader(dataConnection.getInputStream(), charset));
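// A StringBuilder avoids copying the whole text on every appended line
StringBuilder content = new StringBuilder();
String line;
while ((line = r.readLine()) != null) {
content.append(line).append("\n");
}
r.close();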
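The substring approach above keeps everything that follows charset=, including quotes or further parameters if the server sends any. A slightly more defensive sketch, assuming a hypothetical helper named charsetFrom and the same ISO-8859-1 default, might look like this:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ContentTypeCharset {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // header value, falling back to ISO-8859-1 when it is missing or unknown.
    static Charset charsetFrom(String contentType) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        if (contentType == null) {
            return fallback;
        }
        String key = "charset=";
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the parameter name
            if (p.regionMatches(true, 0, key, 0, key.length())) {
                String name = p.substring(key.length()).trim();
                // Strip optional surrounding quotes
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

The result can be passed straight to the reader, for example new InputStreamReader(dataConnection.getInputStream(), ContentTypeCharset.charsetFrom(dataConnection.getContentType())).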
I have a file of nearly 110 MB on an Apache server. I read that file into an input stream and then convert the input stream to a List of String, following the suggestions I found on Stack Overflow, but I still run out of memory.
Below is my code.
private List<String> readFromHttp(String url, PlainDiff diff) throws Exception {
HttpUrlConnection con = new HttpUrlConnection();
con.setGetUrl(url);
List<String> lines = new ArrayList<String>();
final String PREFIX = "stream2file";
final String SUFFIX = ".tmp";
final File tempFile = File.createTempFile(PREFIX, SUFFIX);
tempFile.deleteOnExit();
StringBuilder sb = new StringBuilder();
try {
InputStream data = con.sendGetInputStream();
if(data==null)
throw new UserAuthException("diff is not available at the location");
else {
try (FileOutputStream out = new FileOutputStream(tempFile)) {
IOUtils.copy(data, out);
LineIterator it = FileUtils.lineIterator(tempFile, "UTF-8");
try {
while (it.hasNext()) {
String line = it.nextLine();
lines.add(line);
sb.append(line);
}
} finally {
LineIterator.closeQuietly(it);
}
}
data.close();
diff.setLineAsString(sb.toString());
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
//System.out.println(lines);
return lines;
}
public InputStream sendGetInputStream() throws IOException {
String encoding = Base64.getEncoder().encodeToString(("abc:$xyz$").getBytes("UTF-8"));
URL obj = new URL(getGetUrl());
// Setup the connection
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
// Set the parameters from the headers
con.setRequestMethod("GET");
con.setDoOutput(true);
con.setRequestProperty ("Authorization", "Basic " + encoding);
InputStream is;
int responseCode = con.getResponseCode();
logger.info("GET Response Code :: " + responseCode);
if (responseCode == HttpURLConnection.HTTP_OK) {
is = con.getInputStream();
}
else {
is = null;
}
return is;
}
Is something I am doing consuming a lot of heap? Is there a better way to do it?
Your code has multiple issues. I am not going to solve every one of them, but I will point them out so that you can review your code and learn to write better code.
In method readFromHttp(..):
There is no need to create a new file with IOUtils.copy(data, out);
There is no need for the StringBuilder: StringBuilder sb = new StringBuilder();
There is no need for the line iterator: LineIterator
There are other memory-related issues as well, but for the time being correct these points and test with the code below.
After correcting the above, read the lines directly from the stream in this much simpler way:
try(BufferedReader reader = new BufferedReader(new InputStreamReader(data, StandardCharsets.UTF_8))) {
for (String line; (line = reader.readLine()) != null;) {
lines.add(line);
}
}
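If the lines do not all have to be in memory at the same time, another option is to handle each line as it is read. The sketch below assumes a hypothetical forEachLine helper and UTF-8 content; it is only an illustration, not a drop-in replacement for readFromHttp.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

final class LineStreaming {

    // Hypothetical helper: passes each line to the handler and returns the
    // number of lines read. The helper itself keeps only the current line.
    static long forEachLine(InputStream data, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(data, StandardCharsets.UTF_8))) {
            for (String line; (line = reader.readLine()) != null; ) {
                handler.accept(line);
                count++;
            }
        }
        return count;
    }
}
```

Whether this helps depends on what the caller does with the lines: a handler that adds every line to a list uses as much heap as before.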
I'm doing a simple JSON grab from two links with the same code. I'm doing it two separate times, so the cause of my issue isn't because they're running into each other or something.
Here is my code:
@Override
protected String doInBackground(Object... params) {
try {
URL weatherUrl = new URL("my url goes here");
HttpURLConnection connection = (HttpURLConnection) weatherUrl
.openConnection();
connection.connect();
responseCode = connection.getResponseCode();
if (responseCode == HttpURLConnection.HTTP_OK) {
InputStream inputStream = connection.getInputStream();
Reader reader = new InputStreamReader(inputStream);
int contentLength = connection.getContentLength();
char[] charArray = new char[contentLength];
reader.read(charArray);
String responseData = new String(charArray);
Log.v("test", responseData);
When I try this with:
http://www.google.com/calendar/feeds/developer-calendar@google.com/public/full?alt=json
I get an error about an array length of -1.
For this link:
http://api.openweathermap.org/data/2.5/weather?id=5815135
It returns fine and I get a log of all of the JSON. Does anyone have any idea why?
Note: I tried stepping through my code in debug mode, but I couldn't catch anything. I also downloaded a Google chrome extension for parsing json in the browser and both urls look completely valid. I'm out of ideas.
Log this value: int contentLength = connection.getContentLength();
I don't see the Google URL returning a Content-Length header, and in that case getContentLength() returns -1.
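Allocating new char[contentLength] fails as soon as the length is reported as -1, and even with a known length a single read(charArray) call is not guaranteed to fill the array. A minimal sketch that avoids both problems by reading until end of stream, assuming a hypothetical readAll helper:

```java
import java.io.IOException;
import java.io.Reader;

final class ReaderDrain {

    // Hypothetical helper: reads until end of stream instead of sizing a
    // buffer from the Content-Length header.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        // read() returns the number of chars actually read, or -1 at the end
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }
}
```

In the question's code this would be called as ReaderDrain.readAll(new InputStreamReader(inputStream, "UTF-8")), where UTF-8 is an assumption about the response encoding.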
If you just want String output from a url, you can use Scanner and URL like so:
Scanner s = new Scanner(new URL("http://www.google.com").openStream(), "UTF-8").useDelimiter("\\A");
String out = s.next();
s.close();
(don't forget try/finally block and exception handling)
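The \A delimiter matches only the beginning of the input, so the first token is the entire stream. Here is a sketch with the cleanup handled by try-with-resources and a guard for empty input, since next() throws NoSuchElementException when there is no token. The slurp name is hypothetical, and UTF-8 is assumed as in the snippet above.

```java
import java.io.InputStream;
import java.util.Scanner;

final class ScannerSlurp {

    // Hypothetical helper: returns the whole stream as one String,
    // or an empty String when the stream has no content.
    static String slurp(InputStream in) {
        try (Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }
}
```

With a URL it could be used as ScannerSlurp.slurp(new URL("http://www.google.com").openStream()), with the IOException from openStream() handled by the caller.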
The longer way (which allows for progress reporting and such):
String convertStreamToString(InputStream is) throws UnsupportedEncodingException {
BufferedReader reader = new BufferedReader(new
InputStreamReader(is, "UTF-8"));
StringBuilder sb = new StringBuilder();
String line = null;
try {
while ((line = reader.readLine()) != null)
sb.append(line + "\n");
} catch (IOException e) {
// Handle exception
} finally {
try {
is.close();
} catch (IOException e) {
// Handle exception
}
}
return sb.toString();
}
}
and then call String response = convertStreamToString( inputStream );
I am reading HTTP response from a Perl page in a Servlet like this:
public String getHTML(String urlToRead) {
URL url;
HttpURLConnection conn;
BufferedReader rd;
String line;
String result = "";
try {
url = new URL(urlToRead);
conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Accept-Charset", "UTF-8");
conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
rd = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes();
result += new String(b, "UTF-8");
}
rd.close();
} catch (Exception e) {
e.printStackTrace();
}
return result;
}
I am displaying this result with this code:
response.setContentType("text/plain; charset=UTF-8");
PrintWriter out = new PrintWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-8"), true);
try {
String query = request.getParameter("query");
String type = request.getParameter("type");
String res = getHTML(url);
out.write(res);
} finally {
out.close();
}
But the response still is not encoded as UTF-8. What am I doing wrong?
Thanks in advance.
That call to line.getBytes() looks suspicious. You should probably make it line.getBytes("UTF-8") if you are certain that what is returned is UTF-8 encoded. Additionally, I'm not sure why it is even necessary. A typical approach to getting data out of a BufferedReader is to use a StringBuilder to continue appending each String retrieved from readLine into a result. The conversion back and forth between String and byte[] is unnecessary.
Change result into a StringBuilder and do this:
while ((line = rd.readLine()) != null) {
result.append(line);
}
Here is where you break the chain of character encoding conversions:
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes(); // NOT UTF-8
result += new String(b, "UTF-8");
}
From String#getBytes() javadoc:
Encodes this String into a sequence of bytes using the platform's
default charset, storing the result into a new byte array
And the default charset is probably not UTF-8.
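The effect is easy to reproduce by naming the charsets explicitly. In this sketch ISO-8859-1 stands in for a non-UTF-8 platform default (an assumption; the real default depends on the JVM and the operating system), and reencode is a hypothetical helper that mirrors the loop body:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetRoundTrip {

    // Mirrors: new String(line.getBytes(<some charset>), "UTF-8")
    static String reencode(String line, Charset encodeWith) {
        byte[] b = line.getBytes(encodeWith);
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String line = "caf\u00e9"; // "café"
        // Same charset both ways: the text survives
        System.out.println(reencode(line, StandardCharsets.UTF_8).equals(line));      // true
        // Mismatched charsets: the accented letter is damaged
        System.out.println(reencode(line, StandardCharsets.ISO_8859_1).equals(line)); // false
    }
}
```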
But why do all the conversions in the first place? Just read the raw bytes from the source and write the raw bytes to the consumer. It's supposed to be UTF-8 all the way.
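A sketch of such a pass-through, assuming a hypothetical copy helper. In the servlet it would be called with conn.getInputStream() and response.getOutputStream(); that wiring is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class RawCopy {

    // Copies bytes unchanged, so no charset is involved at any point.
    // Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only what was actually read
            total += n;
        }
        out.flush();
        return total;
    }
}
```

The response's Content-Type still has to declare charset=UTF-8 so that the client decodes the bytes correctly.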
I faced the same problem in another scenario. I believe this will work:
byte[] b = line.getBytes(UTF8_CHARSET);
in the while loop:
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes(); // NOT UTF-8
result += new String(b, "UTF-8");
}
In my case, I had to add another configuration.
Previously, I was writing the page this way:
try (PrintStream printStream = new PrintStream(response.getOutputStream())) {
printStream.print(pageInjecting);
}
I changed to:
try (PrintStream printStream = new PrintStream(response.getOutputStream(), false, "UTF-8")) {
printStream.print(pageInjecting);
}
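The difference between the two constructors can be seen without a servlet container. The sketch below writes to a ByteArrayOutputStream in place of response.getOutputStream() (an assumption made only to keep the example runnable), and encode is a hypothetical helper.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

final class PrintStreamEncoding {

    // Hypothetical helper: prints the text through a PrintStream created
    // with an explicit charset name and returns the bytes it produced.
    static byte[] encode(String text, String charsetName) throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream printStream = new PrintStream(sink, false, charsetName)) {
            printStream.print(text);
        } // closing the PrintStream flushes it into the sink
        return sink.toByteArray();
    }
}
```

For a string containing "é", encode(text, "UTF-8") yields two bytes for that letter while encode(text, "ISO-8859-1") yields one, which is the kind of mismatch the one-argument constructor can introduce when the platform default is not UTF-8.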
How do I retrieve the contents of a file and assign it to a string?
The file is located on an HTTPS server and the content is plain text.
I suggest Apache HttpClient: easy, clean code and it handles the character encoding sent by the server -- something that java.net.URL/java.net.URLConnection force you to handle yourself:
String url = "http://example.com/file.txt";
HttpClient client = new DefaultHttpClient();
HttpResponse response = client.execute(new HttpGet(url));
String contents = EntityUtils.toString(response.getEntity());
Look at the URL Class in the Java API.
Pretty sure all you need is there.
First, download the file from the server using Java's URL class.
String url = "http://url";
java.io.BufferedInputStream in = new java.io.BufferedInputStream(new
java.net.URL(url).openStream());
java.io.FileOutputStream fos = new java.io.FileOutputStream("file.txt");
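java.io.BufferedOutputStream bout = new java.io.BufferedOutputStream(fos, 1024);
byte data[] = new byte[1024];
int count;
// write only the bytes actually read; read() may return fewer than 1024
while ((count = in.read(data, 0, 1024)) != -1)
{
bout.write(data, 0, count);
}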
bout.close();
in.close();
Then read the downloaded file using Java's FileInputStream class.
File file = new File("file.txt");
int ch;
StringBuffer strContent = new StringBuffer("");
FileInputStream fin = null;
try {
fin = new FileInputStream(file);
while ((ch = fin.read()) != -1)
strContent.append((char) ch);
fin.close();
} catch (Exception e) {
System.out.println(e);
}
System.out.println(strContent.toString());
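Appending (char) ch for every byte only gives correct text for single-byte encodings such as ISO-8859-1. A sketch that decodes the whole file with an explicit charset instead, using java.nio.file (Java 7 or later); the readFile name is hypothetical, and the caller is assumed to know the file's encoding:

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileToString {

    // Hypothetical helper: reads all bytes of the file and decodes them
    // with the given charset in one step.
    static String readFile(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }
}
```

For the file written above this would be FileToString.readFile(Paths.get("file.txt"), StandardCharsets.UTF_8), assuming the server sent UTF-8.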
Best answer I found:
public static String readPage(String url, String delimeter)
{
try
{
URL URL = new URL(url);
URLConnection connection = URL.openConnection();
BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String line, lines = "";
while ((line = reader.readLine()) != null)
{
// compare the content; != on Strings compares references
if(!lines.isEmpty())
{
lines += delimeter;
}
lines += line;
}
return lines;
}
catch (Exception e)
{
return null;
}
}
I have a strange problem with a BufferedReader reading from the web.
The content of this URL is different in a browser than in the Java code pasted below.
In the content fetched using Java, the result of the first element is empty; in the browser it is not.
My code:
public static void main(String[] args) {
try {
String url = "https://api.freebase.com/api/service/mqlread?queries={\"q1\":{\"query\":[{\"name\":\"Pulp Fiction\",\"*\":null,\"type\":\"/film/film\"}]},\"q3\":{\"query\":[{\"name\":\"Portal\",\"*\":null,\"type\":\"/cvg/computer_videogame\"}]}}";
URL u = new URL(url);
System.out.println(u.toString());
URLConnection urlConn = u.openConnection();
InputStreamReader is = new InputStreamReader(urlConn.getInputStream());
BufferedReader br = new BufferedReader(is);
String line = null;
String data = "";
while ((line = br.readLine()) != null) {
data += line + "\n";
}
br.close();
System.out.println(data);
} catch (Exception ex) {
System.err.println(ex);
}
}
EDIT: Ahh. Figured it out. Space characters are not allowed in URLs. Just replace them with %20.
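More generally, a query parameter value can be percent-encoded before the URL is built, which also takes care of the quotes and braces in a JSON query like the one above. A minimal sketch, assuming a hypothetical encodeQueryValue helper; URLEncoder produces form encoding, where a space becomes +, so the sketch rewrites that to %20:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryEncoding {

    // Hypothetical helper: percent-encodes a single query parameter value.
    static String encodeQueryValue(String value) {
        try {
            // A literal '+' in the input is already encoded as %2B, so any
            // '+' left in the result stands for a space.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeQueryValue("Pulp Fiction")); // Pulp%20Fiction
    }
}
```

Only the value after queries= should be passed through the helper; encoding the whole URL would also escape the ://, ? and = characters that give it its structure.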