Get File Length from header Java Android - java

I have Java code and have tried many variants, but none of them helped.
Some hosts do not expose the file's content length (the size of the file), so please help me read the file size from the header.
Here is part of my streaming code. Do I need to get the header list and read the int value from it? Am I right?
is = ucon.getInputStream();
fileLength = ucon.getContentLength();
List headersize = ucon.getHeaderFields().get("content-Lenght");

You could do something like this:
URL url = new URL("https://www.google.com/logos/classicplus.png");
URLConnection openConnection = url.openConnection();
System.out.println(openConnection.getContentLength());
According to the documentation:
public int getContentLength()
Returns the value of the content-length header field.
Returns: the content length of the resource that this connection's URL references, or -1 if the content length is not known.
The method does the same thing you are trying to do manually. If the header is not defined, you will receive -1.

You have spelt Content-Length wrong. Use this spelling.
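A minimal sketch of reading the header defensively, assuming the server may omit or mangle it. The class and method names (ContentLengthReader, parseContentLength, contentLengthOf) are made up for illustration; getHeaderField is the standard URLConnection call and returns the header value as a String, or null when the header is absent.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of the original question or answers.
final class ContentLengthReader {

    /** Parses a Content-Length value; returns -1 if it is missing or malformed. */
    static long parseContentLength(String headerValue) {
        if (headerValue == null) {
            return -1L; // the server did not send the header
        }
        try {
            long length = Long.parseLong(headerValue.trim());
            return length >= 0 ? length : -1L;
        } catch (NumberFormatException e) {
            return -1L; // header present, but not a valid number
        }
    }

    /** Reads the length from the response headers of the given URL. */
    static long contentLengthOf(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        // Note the spelling: "Content-Length", not "content-Lenght".
        return parseContentLength(connection.getHeaderField("Content-Length"));
    }
}
```

Returning -1 for every failure case mirrors what getContentLength() does when the length is not known, so the caller only has one special value to check.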

Related

Get URL from InputStream

I'm wondering how InputStream in Java is implemented from a low-level perspective.
Suppose I write the Java code below to make a connection to a website.
url = new URL("[some url info]");
URLConnection urlcon = url.openConnection();
InputStream in = urlcon.getInputStream();
while ((readcount = in.read(buffer)) != -1) {
    fos.write(buffer, 0, readcount);
}
Can I get the URL directly from the InputStream ("in" in the code block above) by casting it to some type and calling an appropriate method, as below? Are there any other ways to get the URL from an InputStream?
(newtype) new = (newtype) in;
String Url = new.appropriatemethod();
I searched all subclasses of InputStream, but I couldn't find any class that has an interface to give its URL.
(https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html)
However, I think the InputStream must somehow hold the URL in order to receive data from the website at that URL.
I might have a big misunderstanding about "Stream".
Thank you for reading my question. :)
An InputStream is just a stream, not a URLConnection. When you write InputStream in = urlcon.getInputStream(); you get an input stream, not a URL connection. When you call in.read, you are reading from the stream, not from the URLConnection, so the stream does not expose the URL.
An InputStream is a reference to a source of data (be it a file, network connection etc.) that we want to process as follows:
we generally want to read the data as "raw bytes" and then write our own code to do something interesting with the bytes;
we generally want to read the data in sequential order: that is, to get to the nth byte of data, we have to read all the preceding bytes first, and we're not guaranteed to be able to "jump back" again once we've read them.
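Since the stream itself does not carry the URL, one option is to keep the two together yourself. The sketch below assumes a small holder class (SourcedStream is a made-up name) and uses URLConnection.getURL(), which is the standard way to ask a connection which URL it is bound to.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical holder that keeps a stream together with the URL it came from.
final class SourcedStream implements AutoCloseable {
    private final URL url;
    private final InputStream stream;

    private SourcedStream(URL url, InputStream stream) {
        this.url = url;
        this.stream = stream;
    }

    static SourcedStream open(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        InputStream in = connection.getInputStream();
        // Ask the connection, not the stream, which URL it is bound to.
        return new SourcedStream(connection.getURL(), in);
    }

    URL url() {
        return url;
    }

    InputStream stream() {
        return stream;
    }

    @Override
    public void close() throws IOException {
        stream.close();
    }
}
```

Code that only receives the InputStream still cannot recover the URL; it has to be handed the holder (or the URLConnection) instead.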

Checking content type from URL

I asked this question before and Evgeniy Dorofeev answered it. Although his answer worked for direct links only, I accepted it. He showed me how to check the content type of a direct link:
String requestUrl = "https://dl-ssl.google.com/android/repository/android-14_r04.zip";
URL url = new URL(requestUrl);
URLConnection c = url.openConnection();
String contentType = c.getContentType();
As far as I know, there are two types of URL for downloading a file:
Direct link. For example: https://dl-ssl.google.com/android/repository/android-14_r04.zip. From this link we can download the data directly and get the file name, including the file extension (in this link, .zip). So we know what file will be downloaded. You can try to download from that link.
Indirect link. For example: http://www.example.com/directory/download?file=52378. Have you ever tried to download data from Google Drive? When downloading from Google Drive, it gives you an indirect link like the one above. We never know whether the link leads to a file or a web page. We also don't know the file name or the file extension, because this type of link is unclear and random.
I need to check whether it is a file or a web page. I must download it if the content type is a file.
So my question:
How do I check the content type of an indirect link?
As shown in the comments on this question, can HTTP redirects solve the problem?
Thanks for your help.
After you open a URLConnection, the response headers are available. They contain some information about the file, and you can pull what you want from there. For example:
URLConnection u = url.openConnection();
long length = Long.parseLong(u.getHeaderField("Content-Length"));
String type = u.getHeaderField("Content-Type");
length is the size of the file in bytes; type is something like application/x-dosexec or application/x-rar.
Such links redirect browsers to the actual content using HTTP redirects. To get the correct content type, all you have to do is tell HttpURLConnection to follow the redirects by calling setFollowRedirects(true) (documented here).
MimeTypeMap.getFileExtensionFromUrl(url)
This one worked for me: use Retrofit to check the headers of the response. First, define an endpoint that you can call with the URL you want to check:
@GET
suspend fun getContentType(@Url url: String): Response<Unit>
Then you call it like this to get the content type header:
api.getContentType(url).headers()["content-type"]
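For the plain HttpURLConnection route, a rough sketch of the file-versus-web-page check might look like this. The class and method names are invented for illustration, and treating only text/html and application/xhtml+xml as "web page" is an assumption you may want to adjust. Note also that HttpURLConnection does not follow redirects that switch protocol (for example http to https), and that some servers reject HEAD requests.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper for deciding whether a link leads to a page or a file.
final class ContentTypeCheck {

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the media type. */
    static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String type = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Assumption: HTML and XHTML count as web pages, everything else as a file. */
    static boolean isWebPage(String contentTypeHeader) {
        String type = mediaType(contentTypeHeader);
        return type.equals("text/html") || type.equals("application/xhtml+xml");
    }

    /** Sends a HEAD request and returns the Content-Type of the final response. */
    static String fetchContentType(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setInstanceFollowRedirects(true); // per-connection setting
        connection.setRequestMethod("HEAD");         // headers only, no body
        try {
            return connection.getContentType();
        } finally {
            connection.disconnect();
        }
    }
}
```

A caller would download the resource only when isWebPage(fetchContentType(url)) is false.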

How to programmatically read the source code of a webpage after JavaScript has altered the DOM?

I want to view the source code of a web page, but JavaScript changes it.
For example, take the page https://delicious.com/search/ali. Pressing Ctrl+U shows the source without the changes made by JavaScript, not the actual current one. Viewing the code with Inspect Element shows the complete source code, so that is what I want to get.
Kindly let me know whether there is any technique to get the source code shown by Inspect Element. I am building software and this is one of its requirements. It would be good if the technique or API you refer me to is in Java.
The software I am going to build gets URLs from this site.
But because of the changes made by JavaScript, I can't get the actual source code.
I'm not sure, but this might be what you are asking for. The code takes a URL object, gets the server's response, and returns the body of the response. This should be an HTML document in your case. Note that it returns the HTML as the server sent it; it does not run any JavaScript.
// Requires: java.io.BufferedInputStream, java.io.IOException,
//           java.net.HttpURLConnection, java.net.URL
String getSource(URL url) throws IOException {
    // openConnection() returns a URLConnection, so a cast is needed.
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    byte[] bytes = new byte[512];
    try (BufferedInputStream bis = new BufferedInputStream(connection.getInputStream())) {
        StringBuilder response = new StringBuilder(500);
        int in;
        while ((in = bis.read(bytes)) != -1) {
            // Decodes with the platform default charset.
            response.append(new String(bytes, 0, in));
        }
        // getInputStream() yields the body only; the headers are not in the stream.
        return response.toString();
    }
}

How to find if url containing special characters or space exists in java

I have tried to find whether a url exists or not with the following code. The requirement is to find whether a file exists or not in the url.
try{
HttpURLConnection.setFollowRedirects(false);
HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
con.setRequestMethod("HEAD");
responseCode = con.getResponseCode();
return Response.ok(Integer.toString(responseCode)).build();
}
catch(Exception e){
return Response.ok(e.getMessage()).build();
}
This works perfectly if the url doesn't contain any spaces/special characters. But if it has any, it always returns code 404. Can I know how this can be solved? Thanks in advance.
URLs with spaces in them are invalid. The correct way to create a properly encoded URL from a filename that may contain spaces is to build a URI, call URI.toASCIIString(), and pass the result to new URL(). Make sure to use a URI constructor that takes multiple arguments so that the filename part gets encoded: see the Javadoc.
However, I question the requirement. The best way to test whether any resource is available is to try to use it. In this case you are presumably going to read from the URL if it exists, so just do that and catch the FileNotFoundException.
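The encoding step described above can be sketched as follows. The helper name and the example host and path are placeholders; the four-argument URI constructor (scheme, host, path, fragment) and toASCIIString() are standard java.net.URI members. The path must start with "/" when a host is given.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; the names are for illustration only.
final class UrlEncoding {

    /** Builds an encoded URL string from its parts, escaping illegal path characters. */
    static String encode(String scheme, String host, String path) throws URISyntaxException {
        // The multi-argument constructor quotes characters such as spaces in the path.
        return new URI(scheme, host, path, null).toASCIIString();
    }
}
```

For example, encode("http", "example.com", "/files/my report.pdf") returns "http://example.com/files/my%20report.pdf", which can then be passed to new URL().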

URLConnection and content length: how much data is downloaded?

I've created a servlet which reads the content of a file to a byte array which subsequently is written to the OutputStream of the response:
// set headers
resp.setHeader("Content-Disposition","attachment; filename=\"file.txt\"");
resp.setHeader("Content-Length", "" + fileSize);
// output file content.
OutputStream out = resp.getOutputStream();
out.write(fileBytes);
out.close();
Now, I've also written a "client" which needs to find out how big the file is. This should be easy enough as I've added the "Content-Length" header.
URLConnection conn = url.openConnection();
long fileSize = conn.getContentLength();
However, I am a little uncertain about the big picture. As I understand my own servlet, the entire file content is dumped to the OutputStream of the response. However, does calling getContentLength() also result in the actual file data somehow partially or fully being downloaded? In other words, when I invoke conn.getContentLength(), how much of the file will be returned from the server? Do the headers come "separate" from the content?
All input highly appreciated!
However, does calling getContentLength() also result in the actual file data somehow partially or fully being downloaded?
No, the getContentLength() method just returns the value of the Content-Length header, parsed as an int.
In other words, when i invoke conn.getContentLength(), how much of the file will be returned from the server?
None of the file has to be read: the call only needs the response headers.
Does the headers come "separate" from the content?
Yes, the headers come "separate" from the content.
Now you're certain :D
See the javadocs
Returns the value of the content-length header field.
So a call to getContentLength() merely reads the header value (connecting first if necessary) and does not read the body. You have to call getInputStream() or getContent() for that.
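To illustrate why the headers can be read without touching the content: in an HTTP/1.1 response the header lines come first and are separated from the body by an empty line. The sketch below splits a raw response string at that boundary. It is only an illustration (RawResponse is a made-up class); URLConnection does this parsing internally and you would not normally write it yourself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of HTTP response framing, not production parsing.
final class RawResponse {
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    RawResponse(String raw) {
        // An empty line (CRLF CRLF) ends the header section.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // lines[0] is the status line, e.g. "HTTP/1.1 200 OK"; skip it.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }
}
```

Because the Content-Length line arrives before the first byte of the body, a client can know the size as soon as the header section has been read.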
