I need to automatically download the files linked from a page (possibly more than 100 files, each with its own link). I know the login URL and I have credentials.
I would like to automate this in a Java program. The download page can only be reached after logging in to the site.
Would the cURL command help with this?
Please advise how to do this.
You can use wget which can download log files:
wget -r --no-parent --user=user --password=password --no-check-certificate <URL>
You can pass headers with --header, e.g. --header "Cookie: JSESSIONID=3433434343434"
You can pass POST data with --post-data "email=$EMAIL&password=$PASSWRD" (use double quotes so that the shell expands the variables).
Or you can use HttpClient in Java:
Here are examples of using HttpClient to log in and to pass POST/GET/header information.
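As a rough illustration of the login step, here is a minimal sketch that uses the JDK's built-in java.net.http client (Java 11+) rather than Apache HttpClient. The class name LoginSketch and its methods are hypothetical, and the form field names email and password are simply borrowed from the wget example above; the real site may expect different fields.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class LoginSketch {

    // Encodes form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // The CookieManager stores cookies set by the login response,
    // so later requests made with the same client reuse the session.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    // Builds the POST request; the field names are assumptions.
    static HttpRequest loginRequest(URI loginUri, String email, String password) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("email", email);
        form.put("password", password);
        return HttpRequest.newBuilder(loginUri)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(form)))
                .build();
    }
}
```

Sending the request with newSessionClient().send(request, HttpResponse.BodyHandlers.ofString()) and then reusing that same client for the download page should carry the session cookie along, assuming the site uses cookie-based sessions.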
First, get the whole HTML page as a String.
Then either parse that String to extract the file links, or convert it to Java objects using an XML-to-object mapper such as https://github.com/FasterXML/jackson-dataformat-xml
Once you have the links, download the files using HttpClient:
// httpClient is assumed to be an existing HttpClient field of the enclosing class.
public void saveFile(String url, String filePath) throws ClientProtocolException, IOException {
    HttpGet httpget = new HttpGet(url);
    HttpResponse response = httpClient.execute(httpget);
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        // try-with-resources closes both streams even if the copy fails
        try (InputStream is = entity.getContent();
             FileOutputStream fos = new FileOutputStream(new File(filePath))) {
            IOUtils.copy(is, fos);
        }
    }
}
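The link-extraction step mentioned above could look roughly like the following sketch. It uses a regular expression for brevity; the class name LinkExtractor is hypothetical, and a real HTML parser such as jsoup would cope better with unusual markup.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the distinct link targets of a page.
final class LinkExtractor {

    // Matches href="..." or href='...' (a rough approximation of HTML parsing).
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    static List<URI> extractLinks(String html, URI pageUri) {
        Set<URI> links = new LinkedHashSet<>(); // keeps order, drops duplicates
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String raw = m.group(2).trim();
            if (raw.isEmpty() || raw.startsWith("#") || raw.startsWith("javascript:")) {
                continue; // not a downloadable target
            }
            try {
                // Resolve relative links against the URL of the page itself.
                links.add(pageUri.resolve(raw));
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs.
            }
        }
        return new ArrayList<>(links);
    }
}
```

Each returned URI can then be passed (as a string) to a download method such as saveFile.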
If you just want to copy a file from a site to a local file, you can use java.nio.file:
Files.copy(new URL("http://host/site/filename").openStream(), Paths.get(localfile));
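A slightly fuller version of that one-liner might look like this sketch, which closes the stream and overwrites an existing target. The class name UrlCopy is hypothetical. Note that a plain URL.openStream() call does not log in for you, so this only suits files that are reachable without a session.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper around Files.copy.
final class UrlCopy {

    // Copies the content behind the URL to the target path and
    // returns the number of bytes written.
    static long copy(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```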
My requirement is to download (pull) a file from an Azure Git repo and convert it to a byte array. I searched the Azure Git repo API but couldn't find the REST API call. Please help me find a solution.
I tried the URL below, but it returns a Unicode value in the content object.
GET https://dev.azure.com/{organization}/{project}/_apis/git/repositories/{repositoryId}/items?path={path}&versionDescriptor.version={versionDescriptor.version}&versionDescriptor.versionType={versionDescriptor.versionType}&includeContent=true&api-version=6.0
Are you saying you want to make a GET request to Microsoft Azure? I would recommend using fetch to retrieve your data, something along the lines of:
fetch("https://westus.api.cognitive.microsoft.com/face/v1.0/detect?returnFaceId=true&returnFaceLandmarks=false&returnFaceAttributes=emotion&recognitionModel=recognition_01&returnRecognitionModel=false&detectionModel=detection_01", {
method: 'post',
headers: {
'Content-Type': 'application/octet-stream',
'Ocp-Apim-Subscription-Key': '<subscription key>'
},
body: makeblob(contents)
}).then((response) => response.json()).then(success => {
that.setState({selectedFile: url1});
that.setState({facesArray: success});
console.log("facesArray is", that.state.facesArray);
console.log("new selected state is", that.state.selectedFile);
// console.log(success);
}).catch(error =>
console.log("did not work ",error))
I understand I'm using a POST request, but if you change the subscription key, body, fetch URL, and content type, you might be able to get what you are looking for. You can also contact Microsoft Azure support for more help with their API.
Have you tried this? Make sure you add httpclient to your Gradle/Maven build. Pass your Azure URL as the uri argument.
// httpclient is assumed to be an existing HttpClient field of the enclosing class.
public byte[] executeBinary(URI uri) throws IOException, ClientProtocolException {
    HttpGet httpget = new HttpGet(uri);
    HttpResponse response = httpclient.execute(httpget);
    HttpEntity entity = response.getEntity();
    // Buffer the whole response body in memory and return it as bytes
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    entity.writeTo(baos);
    return baos.toByteArray();
}
You could use the API https://dev.azure.com/{organization}/{project}/_apis/git/repositories/{repositoryId}/items?path={path}&versionDescriptor.version={versionDescriptor.version}&download=true&api-version=6.0 to download the target file, then read that file and convert its content to a byte array. See "Items - Get" for more details.
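A minimal sketch of that approach, using the JDK's java.net.http client (Java 11+), might look like the following. The class and method names are hypothetical. The authentication scheme shown (HTTP Basic with an empty user name and a personal access token) is an assumption that is not stated in this thread, so check it against your organization's setup.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for the "Items - Get" call quoted above.
final class AzureRepoFile {

    // Percent-encodes a value; %20 is used for spaces so the result
    // is valid in both path segments and query values.
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Builds the URL with download=true, as in the answer above.
    static URI itemUri(String organization, String project,
                       String repositoryId, String path, String version) {
        return URI.create("https://dev.azure.com/" + enc(organization) + "/" + enc(project)
                + "/_apis/git/repositories/" + enc(repositoryId) + "/items"
                + "?path=" + enc(path)
                + "&versionDescriptor.version=" + enc(version)
                + "&download=true&api-version=6.0");
    }

    // Fetches the file body directly as a byte array.
    // ASSUMPTION: Basic auth with an empty user name and a personal access token.
    static byte[] download(HttpClient client, URI uri, String personalAccessToken)
            throws IOException, InterruptedException {
        String basic = Base64.getEncoder().encodeToString(
                (":" + personalAccessToken).getBytes(StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Authorization", "Basic " + basic)
                .GET()
                .build();
        HttpResponse<byte[]> response =
                client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

Reading the body with BodyHandlers.ofByteArray() avoids the intermediate file and any string decoding of the content.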
Using URL class in java.net package.
Method 1
String sourceUrl = "https://thumbor.thedailymeal.com/P09kUdGYdBReFSJne1qjVDIphDM=//https://videodam-assets.thedailymeal.com/filestore/5/3/0/2_37ec80e4c368169/5302scr_43fcce37a98877f.jpg%3Fv=2020-03-16+21%3A06%3A42&version=0";
java.net.URL url = new URL(sourceUrl);
InputStream inputStream = url.openStream();
Files.copy(inputStream, Paths.get("/Users/test/rr.png"), StandardCopyOption.REPLACE_EXISTING);
Using Apache's HttpClient class.
Method 2
String sourceUrl = "https://thumbor.thedailymeal.com/P09kUdGYdBReFSJne1qjVDIphDM=//https://videodam-assets.thedailymeal.com/filestore/5/3/0/2_37ec80e4c368169/5302scr_43fcce37a98877f.jpg%3Fv=2020-03-16+21%3A06%3A42&version=0";
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet(sourceUrl);
HttpResponse httpresponse = httpclient.execute(httpget);
InputStream inputStream = httpresponse.getEntity().getContent();
Files.copy(inputStream, Paths.get("/Users/test/rr.png"), StandardCopyOption.REPLACE_EXISTING);
I downloaded rr.png using both methods. The two files differ, even in size, and method 2 downloads a blank image. I have read that both methods are equivalent, so I do not understand why method 1 downloads the correct file and method 2 the wrong one. Please clarify this, and let me know if there is a fix for method 2 so that it downloads the correct file.
First: this is cross-posted at https://coderanch.com/t/728266/java/URL-openStream-HttpResponse-getEntity-getContent
Second: I guess the issue is the URL and how it is handled differently by Java's internal class and the Apache library. Use a debugger and step through both to see which URL really gets sent over the TLS stream.
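Before reaching for the debugger, it can help to see how such a URL breaks down into components. The sketch below only uses java.net.URI; the class name UriInspector and the example.com URL are hypothetical stand-ins with the same shape as the URL in the question (an embedded second URL and an escaped question mark in the path). It does not prove what either library sends, it just shows what to compare against.

```java
import java.net.URI;

// Hypothetical diagnostic helper.
final class UriInspector {

    // Lists the raw (as written) and decoded forms of the URL parts.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "scheme=" + uri.getScheme()
                + "\nhost=" + uri.getHost()
                + "\nrawPath=" + uri.getRawPath()
                + "\npath=" + uri.getPath()
                + "\nrawQuery=" + uri.getRawQuery();
    }

    public static void main(String[] args) {
        System.out.println(describe(
                "https://images.example.com/abc=//https://cdn.example.com/pic.jpg%3Fv=1&version=0"));
    }
}
```

If the request line seen in the debugger or in the client's wire log differs from rawPath (for example a collapsed double slash or a decoded %3F), the server is being asked for a different resource.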
I can't manage to upload a JAR file using WebDAV and Apache HttpClient without getting "invalid or corrupt jarfile" when I try to launch it.
Here's my Setup:
Webdav server, using tomcat 8.5 on an external directory (defined in $CATALINA_HOME/conf/Catalina/localhost/webdav.xml)
Apache HTTP Client (org.apache.httpcomponents:httpclient:4.5.5)
Custom Maven Plugin using HTTP Client to upload the file
File is uploaded using a custom maven plugin (which uses HTTP Client internally) after building the JAR.
If I use HTTP Client to upload the file to the remote server, the file ends up corrupted. But I can launch the JAR without any problem if I send the exact same file using the curl command:
curl -u <user>:<pass> -T <myjar>.jar http://<remotehost>/<myjar>.jar
Here is the sample code using HTTP Client:
class FileSender {
public static void main(String[] args) {
// [...]
RequestConfig.Builder cfg = RequestConfig.copy(RequestConfig.DEFAULT);
cfg = cfg.setConnectTimeout(timeout)
.setConnectionRequestTimeout(timeout)
.setSocketTimeout(timeout);
CredentialsProvider credentialsProvider = authentication.credentials();
HttpClientBuilder builder = HttpClientBuilder.create()
.setDefaultRequestConfig(cfg.build())
.setDefaultCredentialsProvider(credentialsProvider);
try(CloseableHttpClient client = builder.build()) {
HttpPut httpPut = new HttpPut("http://<remote>/<myJar>.jar");
httpPut.setEntity(MultipartEntityBuilder.create()
.addBinaryBody("file", new File("path/to/<myJar>.jar"))
.build());
try (CloseableHttpResponse response = client.execute(httpPut)) {
// Check response HTTP status
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
Do you have any idea what might cause my issue?
Edit: the MD5 hashes differ between the HTTP Client upload and the curl upload, but the curl and FTP copies share the same hash.
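For reference, the hash comparison can also be done from Java with the standard MessageDigest class. This is only a sketch; the class name Md5Check is hypothetical, and it reads whole files into memory, which is fine for a JAR of ordinary size.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing the local and the uploaded file.
final class Md5Check {

    // Returns the MD5 digest as 32 lowercase hex characters.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // True when both files have identical MD5 digests.
    static boolean sameContent(Path a, Path b) throws IOException {
        return md5Hex(Files.readAllBytes(a)).equals(md5Hex(Files.readAllBytes(b)));
    }
}
```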
This is how you can upload any file with HttpClient:
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
HttpEntity requestEntity = MultipartEntityBuilder.create().addBinaryBody("file", new File("myfile")).build();
HttpPost post = new HttpPost("http://...");
post.setEntity(requestEntity);
try (CloseableHttpResponse response = httpClient.execute(post)) {
System.out.print(response.getStatusLine());
}
Usually the POST method is used to upload form or file content.
I solved it by replacing this
MultipartEntityBuilder.create()
.addBinaryBody("file", new File("path/to/<myJar>.jar"))
.build()
with this
new InputStreamEntity(new FileInputStream(new File("path/to/<myJar>.jar")))
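The reason this helps is that a multipart body wraps the file bytes in boundary lines and part headers, and a WebDAV PUT stores the request body verbatim, so the stored file is no longer a valid JAR. curl -T sends the raw bytes, as does InputStreamEntity. For comparison, the same raw upload with the JDK's java.net.http client (Java 11+) could be sketched as follows; the class name RawUpload is hypothetical, and Basic authentication is assumed because the curl command uses -u.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: builds a PUT whose body is exactly the file's bytes.
final class RawUpload {

    static HttpRequest buildPut(URI target, Path file, String user, String password)
            throws FileNotFoundException {
        // ASSUMPTION: the server accepts HTTP Basic credentials.
        String credentials = Base64.getEncoder().encodeToString(
                (user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(target)
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/octet-stream")
                // ofFile streams the file as-is, with no multipart framing.
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding()), checking the returned status code.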
I want to upload a video file (.mp4) to JIRA with a POST request. The file gets uploaded to the server, but the video becomes corrupt (i.e. it cannot be opened). Sending other attachments, like screenshots (.png) and text files (.txt), works fine without corrupting the file.
I am using the Apache HttpComponents HttpClient 4.3.6.
Here is example code:
File file = new File("location/to/file.mp4");
MultipartEntityBuilder multipartEntity = MultipartEntityBuilder.create().addBinaryBody("file", file);
HttpPost postRequest = new HttpPost();
postRequest.addHeader(HttpHeaders.AUTHORIZATION, BASIC_AUTH);
postRequest.addHeader("X-Atlassian-Token", "nocheck");
postRequest.setEntity(multipartEntity.build());
postRequest.setURI(uri);
CloseableHttpClient client = HttpClients.createDefault();
try {
HttpResponse response = client.execute(postRequest);
} finally {
client.close();
}
I attempted to add a video/mp4 MIME type, but that didn't seem to help:
MultipartEntityBuilder.create().addBinaryBody("file", file, ContentType.create("video/mp4"), file.getName())
The issue I had here was that QuickTime on Mac wasn't compatible with the .mp4 file format. I downloaded VLC media player and the file worked just fine without specifying a MIME type.
Is it possible to load the login page once using HttpClient and get the image file of an img element from the cache, not from its src link, without reloading? This matters because I need to save the captcha for the page that was just loaded; if I load it from the src link, it will be a different captcha. I tried:
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://www.mysite/login.jsp");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
InputStream instream = entity.getContent();
OutputStream outstream = new FileOutputStream("d://file.html");
org.apache.commons.io.IOUtils.copy(instream, outstream);
outstream.close();
instream.close();
but there are no images. I also tried HtmlUnitDriver from the Selenium library, and there are no images either. Maybe I should try something else? Can you help me with this?
Thanks and sorry for my English.
As mentioned here: "HttpClient Get images from response", DefaultHttpClient/HttpClient fetches only one resource per request, which in your case is an HTML page (served from http://www.mysite/login.jsp). You then need to parse that HTML page, find the relevant img tag and its src, and download only that image (without resending the login.jsp request!). If you download a captcha image, you need to fetch it as soon as possible, or it could be overwritten by another user who tries to log in.
You need to do what the browser does: download the HTML, then parse it, then request each src/link/etc. depending on what you need.
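The "parse it, then request the src" step could be sketched as below. The class name CaptchaLocator and the marker parameter are hypothetical, and the regular expression is a rough stand-in for a proper HTML parser. Whether the image fetched this way matches the one belonging to the loaded page depends on the server; reusing the same client instance (and therefore the same cookies) for both requests is usually the first thing to try.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the URL of an <img> whose src contains a marker.
final class CaptchaLocator {

    // Captures the quoted src attribute of each <img> tag (approximation).
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    static Optional<URI> firstImage(String html, URI pageUri, String marker) {
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(2);
            if (src.contains(marker)) {
                // Relative src values are resolved against the page URL.
                return Optional.of(pageUri.resolve(src));
            }
        }
        return Optional.empty();
    }
}
```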
DefaultHttpClient doesn't cache by default.
With CachingHttpClient, caching is enabled by default. In that case it analyzes the If-Modified-Since and If-None-Match headers to decide whether the request is sent to the remote server or the result is returned from the cache. If nothing has changed on the server and the response was cached previously, you will get the cached data.
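To make the role of those two headers concrete, here is a small sketch of a hand-built conditional request using the JDK's java.net.http client (Java 11+). It is not how CachingHttpClient is implemented; the class name ConditionalGet is hypothetical, and the etag and lastModified values are assumed to have been saved from an earlier response's ETag and Last-Modified headers.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper illustrating HTTP conditional requests.
final class ConditionalGet {

    // Adds the validators saved from a previous response, when available.
    static HttpRequest build(URI uri, String etag, String lastModified) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri).GET();
        if (etag != null) {
            builder.header("If-None-Match", etag);
        }
        if (lastModified != null) {
            builder.header("If-Modified-Since", lastModified);
        }
        return builder.build();
    }

    // 304 Not Modified means the previously stored copy is still valid.
    static boolean canUseCachedCopy(int statusCode) {
        return statusCode == 304;
    }
}
```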