Google search using custom search - java

I've been asked to write an inverted index, so as a first step I would like to write a Java program that runs a Google search for a word and puts the results into an ArrayList.
Here's my code:
String search = "Dan";
String google = "http://www.google.com/cse/publicurl?cx=012216303403008813404:kcqyeryhhm8&q=" + search;
URL url = new URL(google);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Accept", "application/json");
BufferedReader reader = new BufferedReader(new InputStreamReader(
        (conn.getInputStream())));
// Gather the response lines into a list of strings
List<String> resultsList = new ArrayList<String>();
String r;
while ((r = reader.readLine()) != null)
    resultsList.add(r);
conn.disconnect();
System.out.println("Google Search for: " + search + " Is Done!");
The program runs without crashing, but all I get is the HTML source of a page, which does not contain any result links.
What do I need to change in the code? Or do I need a completely different approach?

If you want to use Google search in your app, you should use Google's API for that:
Custom Search API
It returns the search results in JSON format.
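As a rough illustration of that advice, the sketch below builds a request URL for the Custom Search JSON API and pulls the "link" values out of a JSON response. The class and method names (CustomSearchSketch, buildUrl, extractLinks) are hypothetical, the key and cx values are placeholders you have to supply yourself, and the regex extraction is a deliberate simplification; a real program should parse the response with a JSON library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class CustomSearchSketch {
    // Endpoint of the Custom Search JSON API.
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    // Builds the request URL; key and cx are placeholders for your own API key and engine ID.
    static String buildUrl(String key, String cx, String query) {
        return ENDPOINT + "?key=" + encode(key) + "&cx=" + encode(cx) + "&q=" + encode(query);
    }

    // URL-encodes one parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Naive extraction of every "link" value from the JSON text (assumes no escaped quotes).
    static List<String> extractLinks(String json) {
        List<String> links = new ArrayList<>();
        Matcher m = Pattern.compile("\"link\"\\s*:\\s*\"([^\"]*)\"").matcher(json);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        // Prints the URL you would open with HttpURLConnection.
        System.out.println(buildUrl("YOUR_KEY", "YOUR_CX", "Dan"));
    }
}
```

To use it, open the URL returned by buildUrl with HttpURLConnection as in the question, read the whole body into one string, and pass that string to extractLinks.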

Related

Google custom image search returns bad images

I want to get images from the Google Custom Search API. My problem is that I am getting very odd images, no matter what I change in the settings.
keywords: empty
edition: free, with ads
image search: on
safe search: off
speech input: off
language: english
sites to search: -
restrictions: empty
search entire web: on
(Sorry if something is translated incorrectly; my UI is in German.)
Another user had the same problem, but his solution didn't help me: Google custom search - poor image results
So no matter what I change in the settings, I get the same images.
If I search for "apfel" (English: apple), I get this image link:
https://scontent-atl3-1.cdninstagram.com/v/t51.2885-19/s150x150/31514744_140795226776868_4684314220345425920_n.jpg?_nc_ht=scontent-atl3-1.cdninstagram.com&_nc_ohc=FdhVBUbROnkAX9AJdVR&oh=ea552d4c8b23acd0a3c82d83632e0895&oe=5ECA7F0E
But when I search for it in the UI I get this:
It should not be the issue, but here is the code:
public static void main(String[] args) throws Exception {
    String key = "";
    String cx = "";
    String keyword = "apfel";
    URL url = new URL("https://www.googleapis.com/customsearch/v1?key=" + key + "&cx=" + cx + "&q=" + keyword);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
    BufferedReader br = new BufferedReader(new InputStreamReader((conn.getInputStream())));
    String output;
    System.out.println("Output from Server .... \n");
    while ((output = br.readLine()) != null) {
        if(output.contains("\"src\": \"")){
            System.out.println(output); //Will print the google search links
        }
    }
    conn.disconnect();
}

Slow HTTPURLConnection in Android

I use the following code
private String resultGET(String addr)
{
    try
    {
        String result = "";
        HttpURLConnection conn = null;
        addr = (isFull)?addr:Statics.fullURL(addr);
        try
        {
            URL url = new URL(addr);
            conn = (HttpURLConnection)url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("User-Agent", Statics.USER_AGENT);
            InputStream ips = conn.getInputStream();
            int responseCode = conn.getResponseCode();
            if (200 != responseCode)
            {
                Feedback.setError("GET error: " + responseCode + " on " + addr);
                return "";
            }
            BufferedReader bufr = new BufferedReader(new InputStreamReader(ips));
            String line;
            while ((line = bufr.readLine()) != null) result += line;
            bufr.close();
        } finally{if (null != conn) conn.disconnect();}
        return result;
    } catch(Exception e)
    {
        Feedback.setError("get fault " + Utils.stackTrace(e));
        return "";
    }
}
Feedback is simply a Java class I use internally to handle all messages that I send back to the Android app front end (this is a hybrid app and the code above is part of a plugin I have written for the app).
I find that when any significant amount of data is returned, the resultGET call gets excruciatingly slow. For instance, a 43 KB JavaScript file, which I later use to run JS code via Duktape, takes the best part of a minute to download and save.
I am still quite a newbie when it comes to Java so I imagine that I am doing something wrong here which is causing the issue. I'd be most obliged to anyone who might be able to put me on the right track.
A while later...
I have now tested the issue on an Android 6 device instead of my default Android 4.4.2 device. On Android 6 the download plus file save takes a decent 5 seconds. On Android 4.4.2 it takes over 40 seconds. Are there any known issues with HttpURLConnection on earlier versions of Android?
String result = "";
The += operator on a String is slow, because every concatenation copies the whole string built so far. If you have a lot of lines, create a StringBuilder sb = new StringBuilder(); and add each line with sb.append(line).append("\n");
At the end, use result = sb.toString();
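The StringBuilder advice can be sketched as a small helper that drains any Reader into a String. The names ReadAllSketch and readAll are hypothetical. In the question's code the Reader would be new InputStreamReader(ips). This only shows the technique and makes no claim about how much time it saves on a particular Android version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
class ReadAllSketch {
    // Reads every line from the source and joins them with '\n', using one growing buffer.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // append() extends the buffer in place instead of copying the whole text each time
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the network stream here.
        System.out.print(readAll(new StringReader("first line\nsecond line")));
    }
}
```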

Trying to extract from JSON when it's null

I'll try to explain my question here.
I have a program that is supposed to parse an incoming JSON file that I receive from a web crawler.
public static void Scan(Article article) throws Exception
{
    //When the program runs, it creates an error text file inside the Java project folder
    File file = new File("errorlogg.txt");
    FileWriter fileWriter = new FileWriter(file, true);
    // if the file doesn't exist, create it
    if (!file.exists())
    {
        file.createNewFile();
    }
    //Set up a URL for the HttpURLConnection from the given DOI
    URL urlDoi = new URL (article.GetElectronicEdition());
    //Used for debugging
    System.out.println("Initial DOI: " + urlDoi);
    //Transform from URL to String
    String doiCheck = urlDoi.toString();
    //Redirect from IEEE Xplore to IEEE Computer Society
    if(doiCheck.startsWith("http://dx."))
    {
        doiCheck = doiCheck.replace("http://dx.doi.org/", "http://doi.ieeecomputersociety.org/");
        urlDoi = new URL(doiCheck);
    }
    HttpURLConnection connDoi = (HttpURLConnection) urlDoi.openConnection();
    // Disable automatic redirects so the logic below can detect redirections more easily
    connDoi.setInstanceFollowRedirects(false);
    String doi = "{\"url\":\"" + connDoi.getHeaderField("Location") + "\",\"sessionid\":\"abc123\"}";
    //Set up a URL to the translation server
    URL url = new URL("http://127.0.0.1:1969/web");
    URLConnection conn = url.openConnection();
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");
    OutputStreamWriter writer = new OutputStreamWriter(conn.getOutputStream());
    writer.write(doi);
    writer.flush();
    String line;
    BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    while ((line = reader.readLine()) != null )
    {
        //Used to see if we get something from the stream
        System.out.println(line);
        //Incoming data is a JSONArray, so create a new array and fill it with the incoming information
        JSONArray jsonArr = new JSONArray(line);
        JSONObject obj = jsonArr.getJSONObject(0);
        //Check if the names from DBLP are the same as the ones the translator gets
        //AuthorName, from field creators
        JSONArray authorNames = obj.getJSONArray("creators");
        ArrayList<Author> TranslationAuthors = new ArrayList<Author>();
Here is the bit of the code that I'm talking about. As you can see, I want to run this code when I get some information from the BufferedReader.
My problem is that my program doesn't skip the parsing when I don't get valid JSON. Instead it runs to this line of code:
JSONArray authorNames = obj.getJSONArray("creators")
and is then forced to exit, because there is no field "creators" to get.
How can I make sure that my program doesn't run into this problem? And how can I easily record in the error-log file I create that I couldn't collect any information?
I think you are working with org.json.JSONObject? If so, there is a has method that you can use to avoid the JSONException when the key does not exist.
JSONArray authorNames = null;
if (obj.has("creators")) {
    authorNames = obj.getJSONArray("creators");
}
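The second half of the question, recording the failure in the error log, is not covered by the answer. One possible way to do it with only the standard library is sketched below. ErrorLogSketch and logError are hypothetical names, and this is an alternative to the FileWriter in the question, not the asker's method.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

// Hypothetical helper class, for illustration only.
class ErrorLogSketch {
    // Appends one line to the log file, creating the file first if it does not exist.
    static void logError(Path logFile, String message) {
        try {
            Files.write(logFile, Collections.singletonList(message), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // File name taken from the question; the message text is an assumption.
        logError(Paths.get("errorlogg.txt"), "No \"creators\" field in translator response");
    }
}
```

Inside the read loop this could be combined with the has check: when obj.has("creators") is false, call logError with the DOI and skip to the next line instead of calling getJSONArray.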

HTTP URL connection response

I am trying to request a URL and read the response from my Java code.
I am using URLConnection to get the response and writing it to an HTML file.
When I open this HTML file in a browser after running the Java class, I only see the Google home page, not the search results.
What's wrong with my code? Here it is:
FileWriter fWriter = null;
BufferedWriter writer = null;
URL url = new URL("https://www.google.co.in/?gfe_rd=cr&ei=aS-BVpPGDOiK8Qea4aKIAw&gws_rd=ssl#q=google+post+request+from+java");
byte[] encodedBytes = Base64.encodeBase64("root:pass".getBytes());
String encoding = new String(encodedBytes);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.setRequestProperty("User-Agent", "Mozilla/5.0");
connection.setRequestProperty("Accept-Charset", "UTF-8");
connection.setDoInput(true);
connection.setRequestProperty("Authorization", "Basic " + encoding);
connection.connect();
InputStream content = (InputStream) connection.getInputStream();
BufferedReader in = new BufferedReader(new InputStreamReader(content));
String line;
try {
    fWriter = new FileWriter(new File("f:\\fileName.html"));
    writer = new BufferedWriter(fWriter);
    while ((line = in.readLine()) != null) {
        String s = line.toString();
        writer.write(s);
    }
    writer.close();
} catch (Exception e) {
    e.printStackTrace();
}
}
The same code worked a couple of days ago, but not now.
The reason is that this URL does not return the search results itself. To see why, you have to understand how Google's page works. Open this URL in your browser and view its source: you will see mostly JavaScript.
In short, Google uses Ajax requests to process search queries.
To do what you want, you either have to use a headless browser that can execute JavaScript/Ajax (the hard way) or, better, use the Google search API as suggested by anand.
Scraping the search page like this is not advised and is bound to fail; you should use the Google search APIs for this kind of work.
Note: Google uses redirects and tokens, so even if you find a clever way to handle them, it is likely to fail in the long run.
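One detail that is not in the answer but fits its explanation: in the question's URL the search terms come after the # character, which makes them a URL fragment. Fragments are processed by the browser (here, by Google's JavaScript) and are not sent to the server as part of the HTTP request, so a plain HttpURLConnection effectively requests the home page. The sketch below, with the hypothetical names FragmentSketch, queryOf and fragmentOf, splits the question's URL to show which part is which.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class FragmentSketch {
    // The query string: the part between '?' and '#', which is sent to the server.
    static String queryOf(String url) {
        return URI.create(url).getRawQuery();
    }

    // The fragment: the part after '#', which stays on the client side.
    static String fragmentOf(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "https://www.google.co.in/?gfe_rd=cr&ei=aS-BVpPGDOiK8Qea4aKIAw&gws_rd=ssl#q=google+post+request+from+java";
        System.out.println("query:    " + queryOf(url));
        System.out.println("fragment: " + fragmentOf(url));
    }
}
```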
Edit:
This is a sample of how you can get the job done reliably using the Google search APIs; please refer to the source for more information.
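// Requires Gson on the classpath; GoogleResults is a POJO mirroring the JSON response (see the source).
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLEncoder;
import com.google.gson.Gson;
public static void main(String[] args) throws Exception {
    // Legacy AJAX Search endpoint from the original answer; Google has deprecated it, so it may no longer respond.
    String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
    String search = "stackoverflow";
    String charset = "UTF-8";
    // Encode the query so spaces and special characters are safe in the URL.
    URL url = new URL(google + URLEncoder.encode(search, charset));
    Reader reader = new InputStreamReader(url.openStream(), charset);
    GoogleResults results = new Gson().fromJson(reader, GoogleResults.class);
    // Show title and URL of 1st result.
    System.out.println(results.getResponseData().getResults().get(0).getTitle());
    System.out.println(results.getResponseData().getResults().get(0).getUrl());
}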
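The sample relies on a GoogleResults class that is not shown here. Judging only from the getters it calls, the class could look roughly like the sketch below. The field names are assumptions inferred from those getters; Gson matches fields to JSON keys by name, so they would have to agree with the actual response. The setters are not needed by Gson and are included only so the class can be exercised without it.

```java
import java.util.Collections;
import java.util.List;

// Assumed shape of the response object, inferred from the getters used in the sample.
class GoogleResults {
    private ResponseData responseData;

    public ResponseData getResponseData() { return responseData; }
    public void setResponseData(ResponseData responseData) { this.responseData = responseData; }

    // Wrapper that holds the list of individual results.
    static class ResponseData {
        private List<Result> results;

        public List<Result> getResults() { return results; }
        public void setResults(List<Result> results) { this.results = results; }
    }

    // One search hit with its title and URL.
    static class Result {
        private String title;
        private String url;

        public String getTitle() { return title; }
        public String getUrl() { return url; }
        public void setTitle(String title) { this.title = title; }
        public void setUrl(String url) { this.url = url; }
    }

    public static void main(String[] args) {
        // Build an instance by hand to show how the getters chain together.
        Result hit = new Result();
        hit.setTitle("Stack Overflow");
        hit.setUrl("https://stackoverflow.com/");
        ResponseData data = new ResponseData();
        data.setResults(Collections.singletonList(hit));
        GoogleResults results = new GoogleResults();
        results.setResponseData(data);
        System.out.println(results.getResponseData().getResults().get(0).getTitle());
    }
}
```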

Searching image with its description on google

I have successfully created an API key for the Google Custom Search API. What I now want to do is upload an image from my hard drive and get results from the websites I specified in the control panel of the Google console when I obtained my API key. I have tried the code from this question on Stack Overflow (also given below):
public static void main(String[] args) throws Exception {
    String key="YOUR KEY";
    String qry="Android";
    URL url = new URL(
            "https://www.googleapis.com/customsearch/v1?key="+key+ "&cx=013036536707430787589:_pqjad5hr1a&q="+ qry + "&alt=json");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", "application/json");
    BufferedReader br = new BufferedReader(new InputStreamReader(
            (conn.getInputStream())));
    String output;
    System.out.println("Output from Server .... \n");
    while ((output = br.readLine()) != null) {
        if(output.contains("\"link\": \"")){
            String link=output.substring(output.indexOf("\"link\": \"")+("\"link\": \"").length(), output.indexOf("\","));
            System.out.println(link); //Will print the google search links
        }
    }
    conn.disconnect();
}
Now, how can I search with my image and get the results? Also, this piece of code searches the whole of Google, but I want it to search only the websites I specified in the control panel of the Google console when creating the API key.
