I am trying to write a program that scrapes websites using HttpClient. I am trying to import the classes it needs, but the compiler says the package does not exist. I have looked through the API documentation and still cannot work out why. My code is:
import java.io.IOException;
import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.methods.*;
import java.util.Scanner;
public class Scraper3 {
    public static void scrapeWebsite() throws IOException {
        HttpClient client = new DefaultHttpClient();
        HttpGet get = new HttpGet("http://ichart.finance.yahoo.com/table.csv?s=MSFT");
        HttpResponse response = client.execute(get);
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            Scanner scanner = new Scanner(entity.getContent());
            while (scanner.hasNextLine()) {
                System.out.println(scanner.nextLine());
            }
        }
    }
}
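Try org.apache.http.client.HttpClient. The classes used in the code (DefaultHttpClient, HttpGet, HttpResponse, HttpEntity) belong to Apache HttpClient 4.x, whose packages start with org.apache.http; org.apache.commons.httpclient is the package of the older Commons HttpClient 3.x.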
So these are the imports that I use:
import org.apache.hc.client5.http.classic.HttpClient;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.HttpEntity;
import org.apache.hc.core5.http.HttpResponse;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.apache.hc.core5.net.URIBuilder;
import java.net.URI;
And this is the code I wrote:
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
But I get this error:
Cannot resolve method 'getEntity' in 'HttpResponse'
I tried looking for solutions, but they were all for Android. I am using plain Java with IntelliJ.
This code comes from this sample:
// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
public class JavaSample
{
    public static void main(String[] args)
    {
        HttpClient httpclient = HttpClients.createDefault();
        try
        {
            URIBuilder builder = new URIBuilder("https://gateway.apiportal.ns.nl/reisinformatie-api/api/v2/departures");
            builder.setParameter("station", "{string}");
            builder.setParameter("uicCode", "{string}");
            builder.setParameter("dateTime", "{integer}");
            builder.setParameter("lang", "nl");
            builder.setParameter("maxJourneys", "{integer}");
            URI uri = builder.build();
            HttpGet request = new HttpGet(uri);
            request.setHeader("Ocp-Apim-Subscription-Key", "{subscription key}");
            // Request body
            StringEntity reqEntity = new StringEntity("{body}");
            request.setEntity(reqEntity);
            HttpResponse response = httpclient.execute(request);
            HttpEntity entity = response.getEntity();
            if (entity != null)
            {
                System.out.println(EntityUtils.toString(entity));
            }
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
What am I doing wrong?
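This solved the problem. In HttpClient 5, getEntity() is declared on ClassicHttpResponse rather than on HttpResponse, so the result needs a cast: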
ClassicHttpResponse response = (ClassicHttpResponse) httpclient.execute(request);
HttpEntity entity = response.getEntity();
I am wondering how I can send form data using EUC-JP encoding. My attempt below still sends Japanese text as ? and other odd characters. Thank you!
This is how I am currently doing it (not working properly):
HttpPost request = new HttpPost("http://httpbin.org/post");
List<NameValuePair> params = new ArrayList<>();
params.add(new BasicNameValuePair("Testing", "雄大"));
request.setEntity(new UrlEncodedFormEntity(params, forName("EUC-JP")));
Your code looks good to me. httpbin.org doesn't seem to handle EUC-JP in its response. You can use putsreq.com instead to inspect your request parameters.
import java.util.*;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.client.methods.*;
import org.apache.http.NameValuePair;
import java.nio.charset.*;
import org.apache.http.impl.client.*;
import org.apache.http.client.*;
import org.apache.http.*;
import java.io.*;
class Main {
    public static void main(String[] args) throws Exception {
        HttpClient httpclient = new DefaultHttpClient();
        // Create new PutsReq URL by yourself
        HttpPost request = new HttpPost("https://putsreq.com/xxxxxxxxxxxxxxxxxxxx");
        List<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("Testing", "雄大"));
        request.setEntity(new UrlEncodedFormEntity(params, Charset.forName("euc-jp")));
        HttpResponse response = httpclient.execute(request);
        BufferedReader reader = new BufferedReader(new InputStreamReader((response.getEntity().getContent())));
        // Read each line once; calling readLine() twice per iteration would skip lines
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
    }
}
And you will see
Testing=%CD%BA%C2%E7
on the inspect page. 0xCDBA is the EUC-JP encoding of 雄.
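The percent-encoded value can also be reproduced locally, without sending a request. The following is a minimal sketch that uses only the JDK's java.net.URLEncoder instead of HttpClient. The class and method names (EucJpFormEncoding, encodePair) are made up for illustration, and it assumes Java 10 or later and a JDK that ships the EUC-JP charset. The Japanese text is written with Unicode escapes so that the result does not depend on the encoding the source file is saved in.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EucJpFormEncoding {
    // Percent-encodes a single form field (name=value) using the given charset.
    static String encodePair(String name, String value, Charset charset) {
        return URLEncoder.encode(name, charset) + "=" + URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        // U+96C4 U+5927: the two kanji used as the form value in the question.
        String value = "\u96C4\u5927";
        // Assumption: the running JDK provides the EUC-JP charset.
        System.out.println(encodePair("Testing", value, Charset.forName("EUC-JP")));
        // Same field in UTF-8, for comparison.
        System.out.println(encodePair("Testing", value, StandardCharsets.UTF_8));
    }
}
```

The first printed line should match the Testing=%CD%BA%C2%E7 value shown on the inspect page. If the request carries different bytes, one thing worth checking is whether the source file's encoding matches what the compiler assumes.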
I'm trying to connect to my server by HTTP and obtain a JSON object, but it seems that I can't do it.
Here's my code:
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
public class Conector {
    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            HttpGet httpGet = new HttpGet("https://falseweb/select_all.php");
            CloseableHttpResponse response1 = httpclient.execute(httpGet);
            try {
                System.out.println(response1.getStatusLine());
                HttpEntity entity1 = response1.getEntity();
                System.out.println(entity1.getContentEncoding());
                EntityUtils.consume(entity1);
            } finally {
                response1.close();
            }
        } finally {
            httpclient.close();
        }
    }
}
But instead of the JSON, null is printed. I have already tested the PHP file and it works, returning JSON. Any idea what I'm doing wrong?
The connection works, because I get this message:
HTTP/1.1 200 OK
You are printing the content encoding rather than the actual content of the response. getContentEncoding() returns null when the response has no Content-Encoding header, which is probably why you see null instead of your JSON.
Give the following a go (using Apache Commons IOUtils):
System.out.println(IOUtils.toString(entity1.getContent(), "UTF-8"));
If you can't use IOUtils, any other way of converting the InputStream into a String will do, for example EntityUtils.toString(entity1), which your code already imports.
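As one way of converting an InputStream into a String without IOUtils, here is a minimal sketch that relies only on the JDK (InputStream.readAllBytes needs Java 9 or later). The class and method names (StreamToString, readUtf8) are hypothetical, the ByteArrayInputStream stands in for entity1.getContent(), and the JSON text is made up; it also assumes the server really sends UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamToString {
    // Reads the whole stream, closes it, and decodes the bytes as UTF-8.
    static String readUtf8(InputStream in) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for entity1.getContent(); the payload is invented for the demo.
        InputStream body = new ByteArrayInputStream(
                "{\"id\": 1}".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(body));
    }
}
```

In the code from the question, the call would go where getContentEncoding() is printed, before EntityUtils.consume(entity1) discards the content.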
I am trying to use some code from a website that serves sports data publicly via an API (http://developer.fantasydata.com).
The site provides some sample Java code to make the HTTP request. For some reason, the setEntity call on the declared request (request) produces a "cannot find symbol" error.
package epl.fixtures.test.app;
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
public class EPLFixturesTestApp {
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // TODO code application logic here
        HttpClient httpclient = HttpClients.createDefault();
        try
        {
            URIBuilder builder = new URIBuilder("https://api.fantasydata.net/soccer/v2/json/CompetitionDetails/EPL");
            URI uri = builder.build();
            HttpGet request = new HttpGet(uri);
            request.setHeader("Ocp-Apim-Subscription-Key", "****************");
            // Request body
            StringEntity reqEntity = new StringEntity("{body}");
            request.setEntity(reqEntity);
            HttpResponse response = httpclient.execute(request);
            HttpEntity entity = response.getEntity();
            if (entity != null)
            {
                System.out.println(EntityUtils.toString(entity));
            }
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
The line causing the issue is request.setEntity(reqEntity);
Can anyone explain this to me please? I have all the relevant jar files from apache added to the project libraries directory.
Thanks
HttpGet does not have a setEntity method.
This makes sense, since the request body has no meaning in GET requests.
Only classes implementing HttpEntityEnclosingRequest have this method.
I don't know why the documentation uses it, but the code works once those two lines (which look meaningless anyway) are removed:
URIBuilder builder = new URIBuilder("https://api.fantasydata.net/soccer/v2/json/CompetitionDetails/EPL");
URI uri = builder.build();
HttpGet request = new HttpGet(uri);
request.setHeader("Ocp-Apim-Subscription-Key", "****************");
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
if (entity != null)
{
    System.out.println(EntityUtils.toString(entity));
}
There is a JSON file in my website's folder.
Here is the content:
{
    "IsUpdateForcibly": "false",
    "Version": "1.0",
    "ReleaseNote": "OHOHOHOHOHO",
    "DownloadLink": "http://192.168.1.37:11604/APK/FrauleinProject.apk"
}
If I open it in the browser, e.g. http://localhost:11604/Content/CheckVersion.json, the result is the same as the file's content.
When I fetch it with Java code, the response content is slightly different:
?{
    "IsUpdateForcibly": "false",
    "Version": "1.0",
    "ReleaseNote": "OHOHOHOHOHO",
    "DownloadLink": "http://192.168.1.37:11604/APK/FrauleinProject.apk"
}
Why is there a question mark at the front of the string?
Here is my HttpClient code:
import java.io.IOException;
import java.io.PrintWriter;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import sun.misc.BASE64Decoder;
import sun.misc.BASE64Encoder;
import sun.misc.IOUtils;
import sun.net.www.http.HttpClient;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.InputStreamEntity;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.entity.mime.content.StringBody;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.client.WinHttpClients;
import org.apache.http.util.EntityUtils;
public class DesUtil {
    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = WinHttpClients.createDefault();
        // There is no need to provide user credentials
        // HttpClient will attempt to access current user security context through
        // Windows platform specific methods via JNI.
        try {
            HttpGet httpget = new HttpGet("http://localhost:11604/Content/CheckVersion.json");
            System.out.println("Executing request " + httpget.getRequestLine());
            CloseableHttpResponse response = httpclient.execute(httpget);
            try {
                System.out.println("----------------------------------------");
                ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
                    @Override
                    public String handleResponse(
                            final HttpResponse response) throws ClientProtocolException, IOException {
                        int status = response.getStatusLine().getStatusCode();
                        if (status >= 200 && status < 300) {
                            HttpEntity entity = response.getEntity();
                            return entity != null ? EntityUtils.toString(entity) : null;
                        } else {
                            throw new ClientProtocolException("Unexpected response status: " + status);
                        }
                    }
                };
                String json= new String(httpclient.execute(httpget, responseHandler).getBytes("ISO-8859-1"),"UTF-8");
                PrintWriter out = new PrintWriter("filename.txt");
                out.println(json);
                out.close();
                System.out.println(json);
                JSONObject obj = JSONObject.fromObject(json);
                System.out.println(obj==null);
                Sb newSB= (Sb)JSONObject.toBean(obj,Sb.class);
                System.out.println(newSB==null);
                System.out.println(newSB.IsUpdateForcibly);
                System.out.println(newSB.Version);
                System.out.println(newSB.ReleaseNote);
                System.out.println(newSB.DownloadLink);
            }
            catch(Exception ex){
                System.out.println(ex.getMessage());
            }
            finally {
                response.close();
            }
        }
        catch(Exception ex){
            System.out.println(ex.getMessage());
        }
        finally {
            httpclient.close();
        }
        System.out.println("end");
    }
}
I had a similar problem. I solved it by passing "UTF-8" as the charset, that is, by changing
String str= EntityUtils.toString(entity2);
to
String str= EntityUtils.toString(entity2,"UTF-8");
demo:
private static void sendPost() throws ClientProtocolException, IOException
{
    CloseableHttpClient httpClient = HttpClients.createDefault();
    HttpPost httpPost = new HttpPost("http://127.0.0.1:8911/crr");
    ArrayList<NameValuePair> nvps = new ArrayList<NameValuePair>();
    nvps.add(new BasicNameValuePair("crawlId", "123"));
    nvps.add(new BasicNameValuePair("transType", "0"));
    httpPost.setEntity(new UrlEncodedFormEntity(nvps));
    CloseableHttpResponse response2 = httpClient.execute(httpPost);
    try {
        System.out.println(response2.getStatusLine());
        HttpEntity entity2 = response2.getEntity();
        String str= EntityUtils.toString(entity2,"UTF-8");
        System.out.println(str);
    } finally {
        response2.close();
    }
}
This probably stems from a Unicode BOM (byte order mark), the character \uFEFF, a zero-width no-break space that is placed at the beginning of a UTF-8, UTF-16LE, or UTF-16BE file to mark it as Unicode. In UTF-8 it is redundant and, as seen here, causes problems.
It was replaced with a question mark, as the character encoding of the saved text could not represent the BOM character.
As @zhizhi mentioned, it is better to save the JSON as UTF-8. Better still is to remove the BOM.
PrintWriter out = new PrintWriter("filename.txt", "UTF-8");
json = json.replaceFirst("^\uFEFF", "");
Bear in mind that without the BOM, Notepad may fail to recognize the file as UTF-8.
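To see both effects in isolation, here is a small sketch that uses only the JDK. The class and method names (BomDemo, stripBom, viaLatin1) are hypothetical, and the five-byte array is a made-up stand-in for a UTF-8 file that starts with a BOM. It shows that Java's UTF-8 decoder keeps the BOM as a character, that an encoding unable to represent it turns it into a question mark, and that stripping it leaves clean JSON.

```java
import java.nio.charset.StandardCharsets;

class BomDemo {
    // Removes a single leading U+FEFF, if present.
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    // Round-trips text through ISO-8859-1; characters that charset cannot
    // represent (such as U+FEFF) are replaced with '?' when encoding.
    static String viaLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // UTF-8 BOM (EF BB BF) followed by "{}", standing in for the JSON file.
        byte[] fileBytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '{', '}'};
        String decoded = new String(fileBytes, StandardCharsets.UTF_8);

        System.out.println(decoded.length());    // 3: the BOM is kept as one char
        System.out.println(viaLatin1(decoded));  // ?{}
        System.out.println(stripBom(decoded));   // {}
    }
}
```

Checking with startsWith and substring is equivalent to the replaceFirst call for this purpose and avoids a regular expression; either should be applied before the JSON is parsed or written out.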