Google image search: How do I construct a reverse image search URL? - java

How can I programmatically, in Java, convert an image to "some string" that I can pass as a parameter to Google image search? I have tried Base64-encoding the image, but the result differs from the string Google produces in its image search engine. This is my conversion (Java 7):
import javax.xml.bind.DatatypeConverter;
...
Path p = Paths.get("my_photo.JPG");
try (PrintWriter writer = new PrintWriter("base64.txt")) {
    // Read the whole file; available() plus a single read() is not reliable.
    byte[] bytes = Files.readAllBytes(p);
    String base64 = DatatypeConverter.printBase64Binary(bytes);
    writer.println(base64);
} catch (IOException ex) {
    ex.printStackTrace();
}
The output of this simple program differs from Google's string in the URL, i.e. the string that follows tbs=sbi:AMhZZ...

This is my best guess for how the image search works:
The data in the URL is not an encoded form of the image. The data is an image fingerprint used for fuzzy matching.
You should notice that when you upload an image for searching, it is a two-step process. The first step uploads the image via the URL http://images.google.com/searchbyimage/upload. The Google server returns the fingerprint, and the browser is then redirected to a search page with a query string based on that fingerprint.
Unless Google publishes the algorithm for generating the fingerprint, you will be unable to generate the search query string from within your application. Until then, you can have your application post the image to the upload URI. You should be able to parse the response and construct the query string.
EDIT
These are the keys and values sent to the server when I uploaded a file.
image_url =
btnG = Search
encoded_image = // the binary image content goes here
image_content =
filename =
hl = en
bih = 507
biw = 1920
"bih" and "biw" look like dimensions, but do not corrispond to the uploaded file.
Use this information at your own risk. It is an undocumented api that could change and break your application.

Using Google's image search:
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.entity.mime.content.StringBody;
import org.apache.http.impl.client.DefaultHttpClient;

public class HttpFileUpload {
    public static void main(String args[]) {
        try {
            HttpClient client = new DefaultHttpClient();
            String url = "https://www.google.co.in/searchbyimage/upload";
            String imageFile = "c:\\temp\\shirt.jpg";

            HttpPost post = new HttpPost(url);
            MultipartEntity entity = new MultipartEntity();
            entity.addPart("encoded_image", new FileBody(new File(imageFile)));
            entity.addPart("image_url", new StringBody(""));
            entity.addPart("image_content", new StringBody(""));
            entity.addPart("filename", new StringBody(""));
            entity.addPart("hl", new StringBody("en"));
            entity.addPart("bih", new StringBody("179"));
            entity.addPart("biw", new StringBody("1600"));
            post.setEntity(entity);

            HttpResponse response = client.execute(post);
            BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
            String line = "";
            // The response is a redirect page; print the line containing the result URL.
            while ((line = rd.readLine()) != null) {
                if (line.indexOf("HREF") > 0)
                    System.out.println(line.substring(8));
            }
        } catch (ClientProtocolException cpx) {
            cpx.printStackTrace();
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
    }
}

Based on @Ajit's answer, this does the same but using the curl command (Linux / Cygwin / etc.):
curl -s -F "image_url=" -F "image_content=" -F "filename=" -F "hl=en" -F "bih=179" -F "biw=1600" -F "encoded_image=@my_image_file.jpg" https://www.google.co.in/searchbyimage/upload
This will print a URL on standard output. You can download that URL with curl or wget, but you may have to change the User-Agent to that of a graphical web browser like Chrome.

This is what works for me. No encoding is actually needed:
https://www.google.com/searchbyimage?image_url=YOUR_IMAGE_URL
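A minimal Java sketch of that approach, assuming the image is already reachable at a public URL (the URL below is a placeholder); the only detail to get right is URL-encoding the image URL before appending it:
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SearchByImageUrl {
    public static void main(String[] args) throws Exception {
        // Placeholder: replace with the publicly reachable URL of your image.
        String imageUrl = "http://example.com/my_photo.jpg";

        // URL-encode the parameter so slashes and query characters survive intact.
        String query = "https://www.google.com/searchbyimage?image_url="
                + URLEncoder.encode(imageUrl, StandardCharsets.UTF_8.name());

        System.out.println(query);
    }
}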

Use the Google Vision API for that. There are also lots of examples available from Google.
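A rough sketch of what that can look like with the google-cloud-vision Java client (class and method names follow Google's published samples as I recall them, so treat them as assumptions to verify); "web detection" is the feature closest to a reverse image search:
import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.WebDetection;
import com.google.protobuf.ByteString;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;

public class VisionWebDetection {
    public static void main(String[] args) throws Exception {
        byte[] data = Files.readAllBytes(Paths.get("my_photo.JPG"));

        try (ImageAnnotatorClient vision = ImageAnnotatorClient.create()) {
            Image image = Image.newBuilder().setContent(ByteString.copyFrom(data)).build();
            Feature feature = Feature.newBuilder().setType(Feature.Type.WEB_DETECTION).build();
            AnnotateImageRequest request = AnnotateImageRequest.newBuilder()
                    .setImage(image)
                    .addFeatures(feature)
                    .build();

            BatchAnnotateImagesResponse response =
                    vision.batchAnnotateImages(Collections.singletonList(request));
            WebDetection web = response.getResponses(0).getWebDetection();

            // Pages on the web that contain a matching image.
            web.getPagesWithMatchingImagesList()
               .forEach(page -> System.out.println(page.getUrl()));
        }
    }
}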

Related

Zip Archives get corrupted when uploading to Azure Blob Store using REST API

I have been really banging my head against the wall with this one: uploading text files is fine, but when I upload a zip archive into my blob store it gets corrupted and cannot be opened once downloaded.
A hex compare (image below) of the original versus the file that has been through Azure shows some subtle replacements have happened, but I cannot find the source of the change/corruption.
I have tried forcing UTF-8/ASCII/UTF-16, and found UTF-8 is probably correct, but none of them resolved the issue.
I have also tried different HTTP libraries but got the same result.
The deployment environment forces Unirest, so I cannot use the Microsoft API (which seems to work fine).
package blobQuickstart.blobAzureApp;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Base64;

import org.junit.Test;

import kong.unirest.HttpResponse;
import kong.unirest.Unirest;

public class StackOverflowExample {

    @Test
    public void uploadSmallZip() throws Exception {
        File testFile = new File("src/test/resources/zip/simple.zip");
        String blobStore = "secretstore";

        UploadedFile testUploadedFile = new UploadedFile();
        testUploadedFile.setName(testFile.getName());
        testUploadedFile.setFile(testFile);

        String contentType = "application/zip";
        String body = readFileContent(testFile);
        String url = "https://" + blobStore + ".blob.core.windows.net/naratest/" + testFile.getName()
                + "?sv=2020-02-10&ss=b&srt=o&sp=c&se=2021-09-07T20%3A10%3A50Z&st=2021-09-07T18%3A10%3A50Z&spr=https&sig=xvQTkCQcfMTwWSP5gXeTB5vHlCh2oZXvmvL3kaXRWQg%3D";

        HttpResponse<String> response = Unirest.put(url)
                .header("x-ms-blob-type", "BlockBlob")
                .header("Content-Type", contentType)
                .body(body)
                .asString();

        if (!response.isSuccess()) {
            System.out.println(response.getBody());
            throw new Exception("Failed to Upload File! Unexpected response code: " + response.getStatus());
        }
    }

    private static String readFileContent(File file) throws Exception {
        InputStream is = new FileInputStream(file);
        ByteArrayOutputStream answer = new ByteArrayOutputStream();
        byte[] byteBuffer = new byte[8192];
        int nbByteRead;
        while ((nbByteRead = is.read(byteBuffer)) != -1) {
            answer.write(byteBuffer, 0, nbByteRead);
        }
        is.close();

        byte[] fileContents = answer.toByteArray();
        String s = Base64.getEncoder().encodeToString(fileContents);
        byte[] resultBytes = Base64.getDecoder().decode(s);
        String encodedContents = new String(resultBytes);
        return encodedContents;
    }
}
Please help!
byte[] resultBytes = Base64.getDecoder().decode(s);
String encodedContents = new String(resultBytes);
You are creating a String from a byte array containing binary data; String is only for printable characters. You are also doing multiple pointless encode/decode round trips that just take more memory.
If the content is in ZIP format it is binary, so just return the byte array. Or you can encode the content, but then you should return it encoded. As a further weakness, you're doing it all in memory, which limits the potential size of the content.
Unirest file handlers will, by default, force a multipart body, which is not supported by Azure.
A byte array can be provided directly, as per https://github.com/Kong/unirest-java/issues/248:
Unirest.put("http://somewhere")
       .body("abc".getBytes())

Best way to download all images from a site using Java? Currently getting an 403 Status Error

I am trying to download all the images from a site, but I'm not sure if this is the best way, as I have tried setting a user agent and referrer to no avail. The 403 status error only occurs when trying to download the images from the src page; the page that lists all the images in one place doesn't show any errors and provides the src for each image. Is there a way to download the images without visiting the src page, or a better way to do this entirely?
Here is my code so far.
private static void getPages() throws IOException {
    Document doc = Jsoup.connect("https://manganelo.com/chapter/read_bleach_manga_online_for_free2/chapter_686")
            .get();
    Elements media = doc.getElementsByTag("img");
    System.out.println(media);

    Iterator<Element> ie = media.iterator();
    int i = 1;
    while (ie.hasNext()) {
        Response resultImageResponse = Jsoup.connect(ie.next().attr("src")).ignoreContentType(true)
                .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0")
                .referrer("www.google.com").timeout(120000).execute();

        FileOutputStream out = new FileOutputStream(new java.io.File("image #" + i++ + ".jpg"));
        out.write(resultImageResponse.bodyAsBytes());
        out.close();
    }
}
You have a few problems with your suggested approach:
You're trying to use JSoup to download file content. JSoup is designed for text data and won't return the image content/values; to download image content you need a plain HTTP request.
To download the images you also need to copy the request a browser would make. Open Chrome, open developer tools, and open the Network tab. Enter the URL for the page you want to scrape images from and you'll see a bunch of requests being made; there will be an individual request for each image somewhere in the view. If you click on the one labelled 1.jpg you'll see the request made to download the first image, and you can then copy all the headers used to make that request (both request and response headers are shown in this view). Once you've replicated the request successfully, you can start testing which headers/cookies are actually required. I found the only real requirement was the "referer" header.
I've stripped out most of what you might need/want, but something similar to the code below is what you're after. I pulled the comic book images in their entirety at full quality. I introduced a small sleep timer so as not to overload the server, since sometimes you'll get rate limited. Even without it you should be fine, but you don't want to get blocked for a lengthy period of time, so the slower you allow the requests to come back the better. You could even make the requests in parallel.
You could almost certainly cut back even more on some of the code below to get a cleaner result, but it works, and I'm assuming that's more than enough of a result.
Interesting question.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Iterator;

public class JSoupExample {

    private static int TIMEOUT = 30000;
    private static final int BUFFER_SIZE = 4096;

    public static void main(String... args) throws InterruptedException, IOException {
        String url = "https://manganelo.com/chapter/read_bleach_manga_online_for_free2/chapter_686";
        Document doc = Jsoup.connect(url).get();

        // Select only urls where the source starts with the relevant url (not all images)
        Elements media = doc.select("img[src^=\"https://s5.mkklcdnv5.com/mangakakalot/r1/read_bleach_manga_online_for_free2/chapter_686_death_and_strawberry/\"]");

        Iterator<Element> ie = media.iterator();
        int i = 1;
        while (ie.hasNext()) {
            String imageUrlString = ie.next().attr("src");
            System.out.println(imageUrlString + " ");
            try {
                HttpURLConnection response = makeImageRequest(url, imageUrlString);
                if (response.getResponseCode() == 200) {
                    writeToFile(i, response);
                }
            } catch (IOException e) {
                // skip file and move to next if unavailable
                e.printStackTrace();
                System.out.println("Unable to download file: " + imageUrlString);
            }
            i++; // increment image ID whatever the result of the request.
            Thread.sleep(200l); // prevent yourself from being blocked due to rate limiting
        }
    }

    private static void writeToFile(int i, HttpURLConnection response) throws IOException {
        // opens input stream from the HTTP connection
        InputStream inputStream = response.getInputStream();

        // opens an output stream to save into file
        FileOutputStream outputStream = new FileOutputStream("image_" + i + ".jpg");

        int bytesRead = -1;
        byte[] buffer = new byte[BUFFER_SIZE];
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            outputStream.write(buffer, 0, bytesRead);
        }

        outputStream.close();
        inputStream.close();
        System.out.println("File downloaded");
    }

    private static HttpURLConnection makeImageRequest(String referer, String imageUrlString) throws IOException {
        URL imageUrl = new URL(imageUrlString);
        HttpURLConnection response = (HttpURLConnection) imageUrl.openConnection();
        response.setRequestMethod("GET");
        response.setRequestProperty("referer", referer);
        response.setConnectTimeout(TIMEOUT);
        response.setReadTimeout(TIMEOUT);
        response.connect();
        return response;
    }
}
I'd also want to ensure I set the right file extension based on the content type, as I believe some images were coming back as .png rather than .jpeg. I'm also fairly sure the write-to-file code can be cleaned up to be simpler/clearer rather than reading in a byte stream.
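For that first point, a small sketch of how the extension could be picked from the response's Content-Type header (a hypothetical helper, not part of the answer's code):
// Hypothetical helper: derive the file extension from the Content-Type header
// of an already-opened HttpURLConnection, defaulting to .jpg.
private static String extensionFor(HttpURLConnection response) {
    String contentType = response.getContentType(); // e.g. "image/png" or "image/jpeg"
    if (contentType != null && contentType.contains("image/png")) {
        return ".png";
    }
    return ".jpg";
}
writeToFile could then build the file name as "image_" + i + extensionFor(response) instead of hard-coding ".jpg".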

Lotus Notes Java API mistake with encoding

I'm a newbie in Java and I need to extract data from NSF files into DXL XML files. I tried it with Python, but the OLE wrapper is missing some major functionality. I get two different formats of DXL files: with rawitemdata and with richtext.
When I switched to the Java Lotus Notes API I started getting problems with rawitemdata fields. I call doc.convertToMIME(2); to get the mail body in HTML format, so that every type of document looks the same. As a result all fields are stored as rawitemdata, and that is great; it's what I want. But some rawitemdata fields have broken encoding.
It looks like this:
<rawitemdata type="19">
AgAPAAAAAgAmAj4AhQEAAAAAAAANCi0tMF9fPUNDQkIwRTFFREZBRjk4Mjg4ZjllOGE5M2RmOTM4NjkwOTE4Y0NDQkIwRTFFREZBRjk4
MjgNCkNvbnRlbnQtVHJhbnNmZXItRW5jb2Rpbmc6IGJpbmFyeQ0KQ29udGVudC10eXBlOiBhcHBsaWNhdGlvbi9vY3RldC1zdHJlYW07
IA0KCW5hbWU9Ij0/S09JOC1SP0I/OFBMdjlPL3I3K3d1Wkc5amVBPT0/PSINCkNvbnRlbnQtRGlzcG9zaXRpb246IGF0dGFjaG1lbnQ7
IGZpbGVuYW1lPSI9P0tPSTgtUj84UEx2OU8vcjcrd3VaRzlqZUE9PT89Ig0KQ29udGVudC1JRDogPDNfXz1DQ0JCMEUxRURGQUY5ODI4
OGY5ZThhOTNkZjkzODY5MDkxQGxvY2FsPg0KDQoFzwXQBc4F0gXOBcoFzgXLLmRvY3g=
</rawitemdata>
After decoding it from base64 I get this:
\x02\x00\x0f\x00\x00\x00\x02\x00&\x02>\x00\x85\x01\x00\x00\x00\x00\x00\x00\r\n-
-0__=CCBB0E1EDFAF98288f9e8a93df938690918cCCBB0E1EDFAF9828\r\nContent-Transfer-Encoding:
binary\r\nContent-type: application/octet-stream; \r\n\tname="=?KOI8-R?B?8PLv9O/r7+wuZG9jeA==?
="\r\nContent-Disposition: attachment; filename="=?KOI8-R?8PLv9O/r7+wuZG9jeA==?="\r\nContent-ID:
<3__=CCBB0E1EDFAF98288f9e8a93df93869091#local>\r\n\r\n\x05\xcf\x05\xd0\x05\xce\x05\xd2\x05\xce\x05\xca\x
05\xce\x05\xcb.docx
It's easy to explain:
The first 20 bytes are a header: \x02\x00\x0f\x00\x00\x00\x02\x00&\x02>\x00\x85\x01\x00\x00\x00\x00\x00\x00
To unpack it I use struct.unpack("<hhhh hhLL", data[:20])
It returns a tuple like (2, 15, 0, 2, 550, 62, 389, 0), where the 5th element is the length of the body. (I changed the body and did not bother to recalculate the header for the new body.)
The rest is the body, with RFC 822 headers.
After decoding the base64 and then the KOI8-R encoded Content-Disposition header, we can see the filename ПРОТОКОЛ.docx, which is the correct content.
If I try to decode the last part of the body, the bytes \x05\xcf\x05\xd0\x05\xce\x05\xd2\x05\xce\x05\xca\x05\xce\x05\xcb.docx, I get an exception. This part can't be decoded correctly.
It looks like cp1251 encoding underneath Unicode, because the text ПРОТОКОЛ.docx in the cp1251 code page is the byte sequence '\xcf\xd0\xce\xd2\xce\xca\xce\xcb.docx'.
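A small Java sketch that reproduces that observation on the bytes quoted above (it only demonstrates the hypothesis; it is not a fix):
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class Cp1251UnderUnicodeDemo {
    public static void main(String[] args) {
        // The suspicious tail of the decoded rawitemdata body, as quoted in the question.
        byte[] raw = {
            0x05, (byte) 0xcf, 0x05, (byte) 0xd0, 0x05, (byte) 0xce, 0x05, (byte) 0xd2,
            0x05, (byte) 0xce, 0x05, (byte) 0xca, 0x05, (byte) 0xce, 0x05, (byte) 0xcb,
            '.', 'd', 'o', 'c', 'x'
        };

        // Drop the 0x05 prefix byte of each pair and decode the remainder as cp1251.
        ByteArrayOutputStream cleaned = new ByteArrayOutputStream();
        for (byte b : raw) {
            if (b == 0x05) {
                continue; // skip the prefix byte
            }
            cleaned.write(b);
        }
        System.out.println(new String(cleaned.toByteArray(), Charset.forName("windows-1251")));
        // prints: ПРОТОКОЛ.docx
    }
}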
My Java code is quite simple:
import lotus.domino.Database;
import lotus.domino.Session;
import lotus.domino.NotesFactory;
import java.io.File;
import java.io.FileWriter;
import java.io.BufferedWriter;
import lotus.domino.*;

public class ExportDXL {
    public static void main(String[] args) throws Exception {
        String server = "";
        String dbPath = "C:\\share\\test.nsf";

        NotesThread.sinitThread();
        Session session = NotesFactory.createSession((String) null, (String) null, "test");
        Database db = session.getDatabase(server, dbPath);
        System.out.println("Db Title: " + db.getTitle());

        DxlExporter exporter = session.createDxlExporter();
        exporter.setConvertNotesBitmapsToGIF(true);

        View view = db.getView("$ALL");
        Document doc = view.getFirstDocument();
        while (doc != null) {
            doc.convertToMIME(2); // or set 1 to get plain text
            String xmldoc = doc.generateXML();

            String id = doc.getUniversalID();
            // Note: FileWriter writes with the platform default charset, not UTF-8.
            FileWriter fw = new FileWriter("C:\\lotus_test\\" + id + ".xml");
            fw.write(xmldoc);
            fw.flush();
            fw.close();

            doc = view.getNextDocument(doc);
        }
    }
}
How can I get the correct encoding in this case? Or how do I set the code page for the Lotus Notes API?

Decode base64 PDF and write to file in JMeter

I am trying to decode a pdf from a response and write it to a file.
The file gets created and appears to be the correct file size, but when I go to open it, I get an error that says, "There was an error opening this document. The file is damaged and could not be repaired."
I am using the code from this post to decode and create the file.
I store the base64-encoded file returned from the API in the JMeter variable documentText, which I read with vars.get("documentText").
Here is how my BeanShell PostProcessor code looks:
import org.apache.commons.io.FileUtils;
import org.apache.commons.codec.binary.Base64;
String Createresponse= vars.get("documentText");
vars.put("response",new String(Base64.decodeBase64(Createresponse.getBytes("UTF-8"))));
Output = vars.get("response");
f = new FileOutputStream("C:\\Users\\user\\Desktop\\Test.pdf");
p = new PrintStream(f);
this.interpreter.setOut(p);
print(Output);
f.close();
Am I doing something incorrectly?
I have also done the following, but get the same result:
byte[] data = Base64.decodeBase64(vars.get("documentText"));
FileOutputStream out = new FileOutputStream("C:\\Users\\user\\Desktop\\Test.pdf");
out.write(data);
out.close();
EDIT:
The entire PDF from the Response looks like the following: (these are just the first 5 lines (of approx. 7,548 lines), but they are all similar):
JVBERi0xLjQKMSAwIG9iago8PAovVGl0bGUgKP7/KQovQ3JlYXRvciAo/v8pCi9Qcm9kdWNlciAo
/v8AUQB0ACAANQAuADUALgAxKQovQ3JlYXRpb25EYXRlIChEOjIwMTcwMzI3MTgwNTEzKQo+Pgpl
bmRvYmoKMiAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMyAwIFIKPj4KZW5kb2JqCjQg
MCBvYmoKPDwKL1R5cGUgL0V4dEdTdGF0ZQovU0EgdHJ1ZQovU00gMC4wMgovY2EgMS4wCi9DQSAx
LjAKL0FJUyBmYWxzZQovU01hc2sgL05vbmU+PgplbmRvYmoKNSAwIG9iagpbL1BhdHRlcm4gL0Rl
I'm assuming this is what is causing an issue? Is there a way to convert the response to a single String that can be decoded?
EDIT 2:
So the &#xd; in the response is definitely my problem. I looked up the hex code character and it translates to a carriage return. If I manually copy the response from within JMeter, paste it into Notepad++, remove the &#xd; occurrences, and then decode it manually, the PDF opens as it should.
I tried modifying my BeanShell script to remove the carriage return and then decode it, but it still isn't fully functional: the PDF now opens, but it is just blank white pages. Here is my updated code:
String Createresponse = vars.get("documentText");
String b64 = Createresponse.replace("&#xd;", "");
vars.put("response",new String(Base64.decodeBase64(b64)));
Output = vars.get("response");
f = new FileOutputStream("C:\\Users\\user\\Desktop\\Test.pdf");
p = new PrintStream(f);
this.interpreter.setOut(p);
print(Output);
f.close();
This works for me. Your input data is wrong.
package com.test;

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

import org.junit.Test;

public class TestBase64 {

    String data =
        "JVBERi0xLjQKMSAwIG9iago8PAovVGl0bGUgKP7/KQovQ3JlYXRvciAo/v8pCi9Qcm9kdWNlciAo/v8AUQB0ACAANQAuADUALgAxKQovQ3JlYXRpb25EYXRlIChEOjIwMTcwMzI3MTgwNTEzKQo+Pgpl";

    @Test
    public void decodeBase64() {
        byte[] localData = Base64.getDecoder().decode(data);
        try (FileOutputStream out = new FileOutputStream("/testout64.dat")) {
            out.write(localData);
            out.close();
        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
This results in
%PDF-1.4
1 0 obj
<<
/Title (þÿ)
/Creator (þÿ)
/Producer (þÿ Q t 5 . 5 . 1)
/CreationDate (D:20170327180513)
>>
e
and seems to be a valid PDF.
What is the &#xd; part? It seems to be some custom format characters.
I basically had the answer to my question; the problem was that the base64-encoded response I was trying to decode was multi-line and included the carriage-return hex code (&#xd;).
My solution was to remove the carriage-return codes from the response, condense it into a single string of base64-encoded text, and then write the file out.
import org.apache.commons.io.FileUtils;
import org.apache.commons.codec.binary.Base64;

String response = vars.get("documentText");
String encodedFile = response.replace("&#xd;", "").replaceAll("[\n]+", "");

// Decode the response
vars.put("decodedFile", new String(Base64.decodeBase64(encodedFile)));

// Write out the decoded file
Output = vars.get("decodedFile");
file = new FileOutputStream("C:\\Users\\user\\Desktop\\decodedFile.pdf");
p = new PrintStream(file);
this.interpreter.setOut(p);
print(Output);
p.flush();
file.close();
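An alternative worth noting, a sketch along the lines of the asker's own second attempt (assuming the same documentText variable and commons-codec on the classpath): write the decoded bytes straight to the file instead of round-tripping through a String and PrintStream, so nothing can re-encode the binary PDF data:
import org.apache.commons.codec.binary.Base64;
import java.io.FileOutputStream;

String encodedFile = vars.get("documentText").replace("&#xd;", "").replaceAll("[\n]+", "");
byte[] pdfBytes = Base64.decodeBase64(encodedFile);

// Write the raw bytes; no String conversion, so no charset damage.
FileOutputStream out = new FileOutputStream("C:\\Users\\user\\Desktop\\decodedFile.pdf");
out.write(pdfBytes);
out.close();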

How do I scrape website with term acceptance page? [duplicate]

This question already has answers here:
How to code an automated bot that can browse and do operations on a webpage
(6 answers)
Closed 7 years ago.
I am new to writing code and I am trying to write code to scrape a specific website. The issue is that this website has a page where you must accept the conditions of use and privacy policy, which you can see at http://cpdocket.cp.cuyahogacounty.us/
I need to bypass this page somehow and I have no idea how. I am writing my code in Java, and so far I have working code that scrapes the source of any website. This code is:
import java.net.URL;
import java.net.URLConnection;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.lang.StringBuilder;
import java.io.IOException;

// Scraper class takes an input of a string, and returns the source code of the website
public class Scraper {

    private static String url; // the input website to be scraped

    // constructor
    public Scraper(String url) {
        this.url = url;
    }

    // scrapeWebsite runs the method to scrape the input variable. As of now it returns a string.
    // This string ideally should be saved so it is able to be parsed by another method.
    public static String scrapeWebsite() throws IOException {
        URL urlconnect = new URL(url); // creates the url from the variable
        URLConnection connection = urlconnect.openConnection(); // connects to the created url
        BufferedReader in = new BufferedReader(new InputStreamReader(
                connection.getInputStream(), "UTF-8")); // streams the website

        String inputLine; // creates a new variable of string
        StringBuilder a = new StringBuilder(); // creates stringbuilder

        // loop appends to the string builder as long as there is information
        while ((inputLine = in.readLine()) != null)
            a.append(inputLine);
        in.close();

        return a.toString();
    }
}
Any suggestions on how to go about doing this would be greatly appreciated.
I am rewriting the code based off some Ruby code. The code is:
def initializeSession()
  ## SETUP # POST headers
  post_header = Hash.new()
  post_header['Host'] = 'cpdocket.cp.cuyahogacounty.us'
  post_header['User-Agent'] = 'Mozilla/5.0 (Windows NT 5.1; rv:20.0) Gecko/20100101 Firefox/20.0'
  post_header['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
  post_header['Accept-Language'] = 'en-US,en;q=0.5'
  post_header['Accept-Encoding'] = 'gzip, deflate'
  post_header['X-Requested-With'] = 'XMLHttpRequest'
  post_header['X-MicrosoftAjax'] = 'Delta=true'
  post_header['Cache-Control'] = 'no-cache'
  post_header['Content-Type'] = 'application/x-www-form-urlencoded; charset=utf-8'
  post_header['Referer'] = 'http://cpdocket.cp.cuyahogacounty.us/Search.aspx' # may have to alter this per request
  # post_header['Content-Length'] = '12197'
  post_header['Connection'] = 'keep-alive'
  post_header['Pragma'] = 'no-cache'

  # STEP # set up simulated browser and make first request
  @browser = SimBrowser.new()
  #logname = 'log.txt'
  #s = Scribe.new(logname)
  session_cookie = 'ASP.NET_SessionId'
  url = 'http://cpdocket.cp.cuyahogacounty.us/'
  @browser.http_get(url)
  #puts browser.get_body() # debug
  puts 'DEBUG: session cookie: ' + @browser.get_cookie_var(session_cookie)
  #log.slog('DEBUG: home page response code: expected 200, actual ' + @browser.get_response().code)
  # s.flog('### HOME PAGE RESPONSE')
  # s.flog(browser.get_body()) # debug

  # STEP # send our acceptance of the terms of service
  data = {
    'ctl00$SheetContentPlaceHolder$btnYes' => 'Yes',
    '__EVENTARGUMENT' => '',
    '__EVENTTARGET' => '',
    '__EVENTVALIDATION' => '/wEWBwKc78CQCQLn3/HqCQLZw/fZCgLipuudAQK42duKDQL33NjnAwKn6+K4CIM3TSmrbrsn2xBRJf2DRwg01Vsbdk+oJV9lhG/in+xD',
    '__VIEWSTATE' => '/wEPDwUKLTI4MzA1ODM0OA9kFgJmD2QWAgIDD2QWDgIDD2QWAgIBD2QWCAIBDxYCHgRUZXh0BQ9BbmRyZWEgRi4gUm9jY29kAgMPFgIfAAUfQ3V5YWhvZ2EgQ291bnR5IENsZXJrIG9mIENvdXJ0c2QCBQ8PFgIeB1Zpc2libGVoZGQCBw8PFgIfAWhkZAIHDw9kFgIeB29uY2xpY2sFGmphdmFzY3JpcHQ6d2luZG93LnByaW50KCk7ZAILDw9kFgIfAgUiamF2YXNjcmlwdDpvbkNsaWNrPXdpbmRvdy5jbG9zZSgpO2QCDw8PZBYCHwIFRmRpc3BsYXlQb3B1cCgnaF9EaXNjbGFpbWVyLmFzcHgnLCdteVdpbmRvdycsMzcwLDIyMCwnbm8nKTtyZXR1cm4gZmFsc2VkAhMPZBYCZg8PFgIeC05hdmlnYXRlVXJsBRMvVE9TLmFzcHg/aXNwcmludD1ZZGQCFQ8PZBYCHwIFRWRpc3BsYXlQb3B1cCgnaF9RdWVzdGlvbnMuYXNweCcsJ215V2luZG93JywzNzAsMzcwLCdubycpO3JldHVybiBmYWxzZWQCFw8WAh8ABQYxLjAuNTRkZEnXSWiVLEPsDmlc7dX4lH/53vU1P1SLMCBNASGt4T3B'
  }

  #post_header['Referer'] = url
  @browser.http_post(url, data, post_header)
  #log.slog('DEBUG: accept terms response code: expected 200, actual ' + @browser.get_response().code)
  #log.flog('### TOS ACCEPTANCE RESPONSE')
  # #log.flog(@browser.get_body()) # debug
end
Can this be done in Java as well?
If you don't understand how to do this, the best way to learn is to do it manually while watching what happens with Firebug (on Firefox) or the equivalent tools for IE, Chrome or Safari.
You must duplicate in your code whatever happens in the protocol when the user accepts the terms & conditions manually.
You must also be aware that the UI presented to the user may not be sent directly as HTML; it may be constructed dynamically by JavaScript that would normally run in the browser. If you are not prepared to fully emulate a browser, to the point of maintaining a DOM and executing JavaScript, then this may not be possible.
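To give a concrete idea, here is a minimal Java sketch of the same flow as the Ruby snippet above, using only HttpURLConnection: GET the page once to obtain the ASP.NET_SessionId cookie, then POST the acceptance form back with that cookie. The hidden __VIEWSTATE and __EVENTVALIDATION values below are placeholders and would have to be parsed out of the first response:
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class TermsAcceptanceExample {
    public static void main(String[] args) throws Exception {
        String url = "http://cpdocket.cp.cuyahogacounty.us/";

        // 1. GET the page once to pick up the session cookie (and the hidden form fields).
        HttpURLConnection get = (HttpURLConnection) new URL(url).openConnection();
        String cookie = get.getHeaderField("Set-Cookie"); // e.g. "ASP.NET_SessionId=...; path=/"

        // 2. POST the acceptance form back with the same cookie.
        Map<String, String> form = new LinkedHashMap<>();
        form.put("ctl00$SheetContentPlaceHolder$btnYes", "Yes");
        form.put("__EVENTARGUMENT", "");
        form.put("__EVENTTARGET", "");
        form.put("__EVENTVALIDATION", "...parsed from the GET response...");
        form.put("__VIEWSTATE", "...parsed from the GET response...");

        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : form.entrySet()) {
            if (body.length() > 0) body.append('&');
            body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), "UTF-8"));
        }

        HttpURLConnection post = (HttpURLConnection) new URL(url).openConnection();
        post.setRequestMethod("POST");
        post.setDoOutput(true);
        post.setRequestProperty("Content-Type", "application/x-www-form-urlencoded; charset=utf-8");
        post.setRequestProperty("Cookie", cookie);
        try (OutputStream out = post.getOutputStream()) {
            out.write(body.toString().getBytes("UTF-8"));
        }
        System.out.println("Accept-terms response code: " + post.getResponseCode());
    }
}
Whether this is enough depends on the site; as noted above, if the acceptance flow relies on JavaScript you may need a full browser emulation instead.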
