I'm trying to download the HTML of this page.
I'm using this code:
Document doc = null;
try {
doc =Jsoup.connect(link).userAgent("Mozilla").get();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Log.i ("html", doc.toString());
UPDATE:
I also tried this:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(link);
HttpResponse response = null;
try {
response = client.execute(request);
} catch (ClientProtocolException e1) {
//
e1.printStackTrace();
} catch (IOException e1) {
//
e1.printStackTrace();
}
InputStream in = null;
try {
in = response.getEntity().getContent();
} catch (IllegalStateException e1) {
//
e1.printStackTrace();
} catch (IOException e1) {
//
e1.printStackTrace();
}
BufferedReader reader = null;
try {
reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
} catch (UnsupportedEncodingException e) {
//
e.printStackTrace();
}
StringBuilder str = new StringBuilder();
String line = null;
try {
while((line = reader.readLine()) != null)
{
str.append(line);
}
} catch (IOException e1) {
//
e1.printStackTrace();
}
try {
in.close();
} catch (IOException e1) {
//
e1.printStackTrace();
}
String html = str.toString();
Log.e("html", html);
Again I get a response like this one:
<html>
<body>
<script>document.cookie="BPC=f563534535121d5a1ba5bd1e153b";
document.location.href="http://...link.../all?attempt=1";</script>
</body>
</html>
I can't find any solution. Maybe the page can't be downloaded because I don't have the cookie, or is it something else?
In the script tag, you have this statement:
document.location.href="....link..../all?attempt=1";
Normally this forces the browser to reload the page from that location. I think the page you actually want to download is "....link...?attempt=1".
It may not work unless you also send the cookie defined in the script, but it is worth a try.
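As a rough illustration of that idea, the sketch below pulls the cookie and the redirect target out of a challenge page shaped like the one in the question. The class name ScriptRedirect, its fields, and the example.com URL are hypothetical, and the regular expressions assume the script keeps exactly this form; a site that builds the cookie differently would need a different approach.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the cookie and redirect target from a script-based challenge page. */
final class ScriptRedirect {
    // Matches: document.cookie="NAME=VALUE";
    private static final Pattern COOKIE =
            Pattern.compile("document\\.cookie\\s*=\\s*\"([^=\"]+)=([^\";]*)");
    // Matches: document.location.href="URL";
    private static final Pattern HREF =
            Pattern.compile("document\\.location\\.href\\s*=\\s*\"([^\"]+)\"");

    final String cookieName;
    final String cookieValue;
    final String target;

    private ScriptRedirect(String cookieName, String cookieValue, String target) {
        this.cookieName = cookieName;
        this.cookieValue = cookieValue;
        this.target = target;
    }

    /** Returns null when the page does not contain both statements. */
    static ScriptRedirect parse(String html) {
        Matcher cookie = COOKIE.matcher(html);
        Matcher href = HREF.matcher(html);
        if (!cookie.find() || !href.find()) {
            return null;
        }
        return new ScriptRedirect(cookie.group(1), cookie.group(2), href.group(1));
    }

    public static void main(String[] args) {
        // Made-up page with the same shape as the response in the question.
        String html = "<html><body><script>document.cookie=\"BPC=abc123\";\n"
                + "document.location.href=\"http://example.com/all?attempt=1\";</script></body></html>";
        ScriptRedirect r = parse(html);
        System.out.println(r.cookieName + "=" + r.cookieValue + " -> " + r.target);
    }
}
```

With Jsoup, the second request could then be something like Jsoup.connect(r.target).cookie(r.cookieName, r.cookieValue).userAgent("Mozilla").get(), assuming the server accepts the cookie when it is sent this way.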
I have created a Java GUI that takes values from the user, sends them to a Python file for processing, and then displays the output of the Python file in the Java GUI. This works perfectly in Eclipse, but when I export it to a JAR file the output is not displayed. I've seen a number of similar questions, but none of them gives a solution that helps me.
This is how I connect my Python script to Java:
public void connection(String name)
{
ProcessBuilder pb= new ProcessBuilder("python","recomold.py","--movie_name",name);
///System.out.println("running file");
Process process = null;
try {
process = pb.start();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
int err = 0;
try {
err = process.waitFor();
} catch (InterruptedException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
// System.out.println("any errors?"+(err==0 ? "no" : "yes"));
/* try {
System.out.println("python output "+ output(process.getInputStream()));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}*/
try {
matches.setText(output(process.getInputStream()));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private String output(InputStream inputStream) throws IOException {
StringBuilder sb = new StringBuilder();
BufferedReader br = null;
try{
br= new BufferedReader(new InputStreamReader(inputStream));
String line = null;
while((line=br.readLine())!=null)
{
sb.append(line+"\n");
//descp.setText("<html><br/><html>");
//sb.append("\n");
}
}
finally
{
br.close();
}
return sb.toString();
}
My actual class accepts a URL as input and calls url.openStream(), which should return an InputStream.
public static Map<String, Object> parseA(URL url) throws Exception {
byte[] readData = new byte[25*1024*1024];
// Here url.openStream() returning null
InputStream is = url.openStream();
while((readLength = is.read(readData, 0, 25*1024*1024)) != -1){
br = new BufferedReader(new InputStreamReader(new
ByteArrayInputStream(readData)));
// All CW_* strings are collected first
}
My test class is
@Rule
public TemporaryFolder folder= new TemporaryFolder();
@Test(enabled = true)
public void parseATest() {
File file=null;
try {
file =folder.newFile("testingData.txt");
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
final URLConnection mockConnection = EasyMock.createMock(URLConnection.class);
final URLStreamHandler handler = new URLStreamHandler() {
@Override
protected URLConnection openConnection(final URL arg0)
throws IOException {
return mockConnection;
}
};
URL url=null;
try {
url = new URL("http://foo.bar", "foo.bar", 80, "", handler);
} catch (MalformedURLException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
InputStream is=null;
try {
is = new FileInputStream(file);
} catch (FileNotFoundException e2) {
// TODO Auto-generated catch block
e2.printStackTrace();
}
try {
EasyMock.expect(url.openStream()).andReturn(is).anyTimes();
} catch (FileNotFoundException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
try {
// imageHeaderParser is object of actual class
imageHeaderParser.parseA(url);
} catch (IfmSwimParserException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Here TemporaryFolder is used to create a temp file. I don't want the URL to hit the network. It can be just a dummy URL, and when I call url.openStream() it should return a stream of the temp file I mentioned.
A file can be converted to a URL, so just do that:
URL testUrl = Paths.get("folder", "testingData.txt").toUri().toURL();
Map<String, Object> map = parseA(testUrl);
// assert map content
Besides, you don't need any mock if you want to test the file-processing behavior.
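To make the file-to-URL idea concrete, here is a minimal sketch using only the JDK. The class FileUrlDemo and the method firstLine are hypothetical stand-ins for the real parser, and the file content is made up; the point is only that url.openStream() on a file: URL reads the local file without touching the network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileUrlDemo {
    /** Stand-in for the method under test: reads the first line from the URL's stream. */
    static String firstLine(URL url) throws IOException {
        try (InputStream in = url.openStream();
             BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    /** Writes the content to a temp file and returns a file: URL pointing at it. */
    static URL tempFileUrl(String content) throws IOException {
        Path file = Files.createTempFile("testingData", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
        return file.toUri().toURL();
    }

    public static void main(String[] args) throws IOException {
        URL url = tempFileUrl("CW_HEADER=1\n");
        // The protocol is "file", so openStream() never goes to the network.
        System.out.println(url.getProtocol() + " " + firstLine(url));
    }
}
```

In a JUnit test that already has a TemporaryFolder rule, the same conversion should work on the File it returns, via file.toURI().toURL().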
I have created a program that converts text to XML using the ReverseXSL API.
This program is to be executed by an application that calls a static method (static int transformXSL).
I am able to run it and produce output from Eclipse. However, when I ran the program (JAR) from the application, it got stuck somewhere and I couldn't find out why.
Then I debugged it with "Debug As... -> Remote Java Application" in Eclipse from the application and found an "InvocationTargetException" at ClassLoaders.callStaticFunction.
The static method below is called by the application.
public class MyTest4 {
public MyTest4()
{
}
public static int transformXSL(String defFile, String inputFile, String XSLFile, String OutputFile) {
System.out.println("Dheeraj's method is called");
// start time
FileWriter fw=null;
try {
fw = new FileWriter("D://Countime.txt");
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
BufferedWriter output=new BufferedWriter(fw);
DateFormat sd=new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
Date dt= new Date();
System.out.println("Date is calculated");
try {
output.write("Start Time:"+sd.format(dt).toString());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println(sd.format(dt));
FileReader myDEFReader=null, myXSLReader=null;
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t=null;
FileInputStream inStream = null;
ByteArrayOutputStream outStream = null;
// Step 1:
//instantiate a transformer with the specified DEF and XSLT
if (new File(defFile).canRead())
{
try {
myDEFReader = new FileReader(defFile);
System.out.println("Definition file is read");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
else myDEFReader = null;
if (new File(XSLFile).canRead())
try {
myXSLReader = new FileReader(XSLFile);
System.out.println("XSL file is read");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
else myXSLReader = null;
try {
t = tf.newTransformer(myDEFReader, myXSLReader);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Step 1: DEF AND XSLT Transformation completed");
// Step 2:
// Read Input data
try {
inStream = new FileInputStream(inputFile);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
outStream = new ByteArrayOutputStream();
System.out.println("Step 2: Reading Input file: completed");
// Step 3:
// Transform Input
try {
try (BufferedReader br = new BufferedReader(new FileReader("D://2.txt"))) {
String line = null;
while ((line = br.readLine()) != null) {
System.out.println("Content: "+line);
}
}
System.out.println("File: "+inputFile.toString());
System.out.println("\n content: \n"+ inStream.toString());
System.out.println("Calling Transform Function");
t.transform(inStream, outStream);
System.out.println("Transformation is called");
outStream.close();
try(OutputStream outputStream = new FileOutputStream(OutputFile)) {
outStream.writeTo(outputStream);
System.out.println("Outstream is generated; Output file is creating");
}
System.out.println(outStream.toString());
} catch (TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ParserException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (FactoryConfigurationError e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (TransformerFactoryConfigurationError e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (javax.xml.transform.TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println("output file is created");
// End time
Date dt2= new Date();
System.out.println(sd.format(dt2));
System.out.println("End time:"+dt2.toString());
try {
output.append("End Time:"+sd.format(dt2).toString());
output.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return 0;
}
}
I am creating an application that makes calls to the Hitbox API. I am trying to get the game name (listed as category_name) from a list.
Thus far, I have managed to get the game name once while the program is running; however, when I change where to get the game name from, the program doesn't do anything. I am at a loss as to what could cause it not to send another request to the server.
public void apiConnect(){
String channel = text.getText();
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet("http://api.hitbox.tv/media/live/" + channel);
HttpResponse response = null;
try {
response = client.execute(request);
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
// Get the response
BufferedReader rd = null;
try {
rd = new BufferedReader
(new InputStreamReader(response.getEntity().getContent()));
} catch (UnsupportedOperationException | IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
String line = "";
try {
while ((line = rd.readLine()) != null) {
hitbox.append(line);
}
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
try {
FileUtils.writeStringToFile(new File("hitbox.json"), hitbox.getText());
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
String game = null;
FileInputStream fileHitbox = null;
try {
fileHitbox = new FileInputStream(new File("hitbox.json"));
} catch (FileNotFoundException e2) {
// TODO Auto-generated catch block
e2.printStackTrace();
}
String strHitbox = null;
try {
strHitbox = IOUtils.toString(fileHitbox, "UTF-8");
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
JSONObject obj = new JSONObject(strHitbox);
JSONArray ar = obj.getJSONArray("livestream");
for (int i = 0; i < ar.length(); i++)
{
game = ar.getJSONObject(i).getString("category_name");
nameOf.setText("Game Name: " + game);
}
File hb = new File("hitbox.json");
if(hb.exists()){
hb.delete();
}
}
The above sample is the function definition, and the code for the Get Game Name button is below:
btnGetGameName.addSelectionListener(new SelectionAdapter() {
@Override
public void widgetSelected(SelectionEvent e) {
apiConnect();
}
});
Could anyone suggest what is causing it to not work after the first request, and if possible suggest a solution?
EDIT: I have found the issue. The data read from the API is appended to the hitbox variable, so it kept accumulating across requests. I have added a snippet that clears the "hitbox" variable when the button is pressed, and the code now works without issues.
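One way to avoid this kind of carry-over altogether is to read each response into a buffer that exists only for that call, instead of appending to a long-lived widget or field. The sketch below is an illustration under that assumption; ResponseBuffer and readAll are hypothetical names, and a StringReader stands in for the HTTP response body.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ResponseBuffer {
    /** Reads the whole body into a buffer created for this call only. */
    static String readAll(Reader source) throws IOException {
        StringBuilder body = new StringBuilder(); // fresh per request, nothing carried over
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two "requests" with made-up bodies; the second result contains only the second body.
        String first = readAll(new StringReader("{\"livestream\":[{\"category_name\":\"A\"}]}"));
        String second = readAll(new StringReader("{\"livestream\":[{\"category_name\":\"B\"}]}"));
        System.out.println(first);
        System.out.println(second);
    }
}
```

The returned string could be handed straight to the JSON parser, which would also make the intermediate hitbox.json file unnecessary.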
Try consuming your response after you have read it, to release the resource:
rd = new BufferedReader
(new InputStreamReader(response.getEntity().getContent()));
// ... read from rd first, then release the connection:
response.getEntity().consumeContent();
// Or, if you have EntityUtils:
EntityUtils.consume(response.getEntity());
I got some code from "java httpurlconnection cutting off html", and I am using pretty much the same code to fetch HTML from websites in Java.
Except for one particular website that I am unable to make this code work with:
I am trying to get HTML from this website:
http://www.geni.com/genealogy/people/William-Jefferson-Blythe-Clinton/6000000001961474289
But I keep getting junk characters, although it works very well with any other website, such as http://www.google.com.
And this is the code that I am using:
public static String PrintHTML(){
URL url = null;
try {
url = new URL("http://www.geni.com/genealogy/people/William-Jefferson-Blythe-Clinton/6000000001961474289");
} catch (MalformedURLException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
HttpURLConnection connection = null;
try {
connection = (HttpURLConnection) url.openConnection();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6");
try {
System.out.println(connection.getResponseCode());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
String line;
StringBuilder builder = new StringBuilder();
BufferedReader reader = null;
try {
reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
try {
while ((line = reader.readLine()) != null) {
builder.append(line);
builder.append("\n");
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
String html = builder.toString();
System.out.println("HTML " + html);
return html;
}
I don't understand why it doesn't work with the URL that I mentioned above.
Any help will be appreciated.
That site is incorrectly gzipping the response regardless of the client's capabilities. Normally a server should only gzip the response when the client supports it (signaled by Accept-Encoding: gzip). You need to decompress it using GZIPInputStream.
reader = new BufferedReader(new InputStreamReader(new GZIPInputStream(connection.getInputStream()), "UTF-8"));
Note that I also added the right charset to the InputStreamReader constructor. Normally you'd extract it from the Content-Type header of the response.
For more hints, see also "How to use URLConnection to fire and handle HTTP requests?". If all you ultimately want is to parse or extract information from the HTML, then I strongly recommend using an HTML parser like Jsoup instead.
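A slightly more defensive variant of the same idea is sketched below: it wraps the stream in GZIPInputStream only when the response declares Content-Encoding: gzip, and takes the charset from the Content-Type value when one is given. BodyDecoder and its methods are hypothetical names, and the sketch assumes the server labels its gzipped responses with that header, which the question does not confirm for this site.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class BodyDecoder {
    /** Wraps the raw stream in GZIPInputStream only when Content-Encoding is gzip. */
    static InputStream unwrap(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    /** Extracts charset=... from a Content-Type value, falling back to UTF-8. */
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).replace("\"", "").trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed charset name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Reads the decoded body as text, one "\n" appended per line. */
    static String read(InputStream raw, String contentEncoding, String contentType)
            throws IOException {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(unwrap(raw, contentEncoding), charsetOf(contentType)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                builder.append(line).append('\n');
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzipped response body in memory instead of calling a real server.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            gzip.write("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        }
        InputStream body = new ByteArrayInputStream(compressed.toByteArray());
        System.out.print(read(body, "gzip", "text/html; charset=UTF-8"));
    }
}
```

With the HttpURLConnection from the question, the call would be along the lines of BodyDecoder.read(connection.getInputStream(), connection.getContentEncoding(), connection.getContentType()).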