Remove query parameters that Google appends to search results - java

Thanks for taking the time to answer my question.
I have written a function in Java which connects to google.com and collects the first 200 search results returned for a given query. However, Google appends some "funky" parameters to the original URL, so the original:
http://en.wikipedia.org/wiki/World_Chess_Championship_2013
becomes:
http://en.wikipedia.org/wiki/World_Chess_Championship_2013&sa=U&ei=EPiIUuSGB5OV7AbB0YCQCA&ved=0CCMQFjAD&usg=AFQjCNEsQZZJUO1CU7cCwBaUDAXP9LSsjQ.
Now this would not be a problem, since I could just cut the String at the point where I encounter "&sa". However, Google appends different parameters for different data types: a PDF link contains one set of parameters, images another, websites a third, etc.
Do you know of a way to programmatically remove the parameters that Google appends, in order to get the original URL?
Thanks
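One possible approach, as a minimal sketch: it assumes the links scraped from the result page have the redirect form /url?q=<target>&sa=..., with the target percent-encoded inside the q parameter. That form is an assumption and is not confirmed by the question. The class and method names are hypothetical. Under that assumption, reading only q makes the varying trailing parameters irrelevant:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class GoogleLinkCleaner {
    // Assumes hrefs of the form "/url?q=<target>&sa=...".
    // Returns the input unchanged when no "q" parameter is found.
    static String extractTarget(String href) {
        int queryStart = href.indexOf('?');
        if (queryStart < 0) {
            return href;
        }
        for (String pair : href.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("q=")) {
                // URLDecoder.decode(String, Charset) needs Java 10 or later.
                return URLDecoder.decode(pair.substring(2), StandardCharsets.UTF_8);
            }
        }
        return href;
    }
}
```

If the scraped string no longer contains the /url?q= prefix, this sketch does not apply as written, because a target with its own query string could not be told apart from the appended parameters.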

Related

SharePoint REST API: Not able to get list by getByTitle when list name contains spaces

I am trying to fetch SharePoint list using the SharePoint REST API from my Java application.
When the list's name does not contain a space, it works. If it contains a space it does NOT work. For example when I do the following:
http:(site URL)/_api/web/lists/getbytitle('ONEWORD')/items --> it works
But when I try on another list like this:
http:(site URL)/_api/web/lists/getbytitle('TWO WORDS')/items --> it DOESN'T work
I tried encoding the list name with the following in Java:
String encoded= URLEncoder.encode("TWO WORDS", "UTF-8");
But it didn't work.
I know there are many questions about this same issue, however everyone is suggesting to get list items by List GUID but I can't use this solution as I'm developing a dynamic tool for several lists with the same name. (Not the same GUID).
Any suggestions?
Thank you
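You need to encode the space as %20. URLEncoder.encode performs form encoding, which turns a space into +, and a + is not read as a space in a URL path. In your case the URL would look like this: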
http:(site URL)/_api/web/lists/getbytitle('TWO%20WORDS')/items
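A small sketch of how this could be done in Java, keeping URLEncoder and converting its + into %20 afterwards. The class name, method names, and the example site URL are hypothetical. Because URLEncoder encodes a literal + as %2B, any + left in its output stands for a space, so the replacement is safe:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ListUrlBuilder {
    // Form-encodes the value, then rewrites "+" (an encoded space) as "%20".
    // URLEncoder.encode(String, Charset) needs Java 10 or later.
    static String encodePathValue(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // siteUrl is a placeholder for the real site URL, without a trailing slash.
    static String itemsUrl(String siteUrl, String listTitle) {
        return siteUrl + "/_api/web/lists/getbytitle('" + encodePathValue(listTitle) + "')/items";
    }
}
```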

Reading a lot of data from the internet

I am currently working on a project for my portfolio. I am having a little trouble finding the right solution to this problem, mainly because I have never tried it before.
I am using a free API service that I found online. I have created the database to match all the information, and now I just need to download the information and parse it into my application.
I have parsed data from the API (JSON) into my database. One suggestion I have found is to read 10 records at a time, but I want to try reading everything at once and then updating accordingly (let us say every 24 hours).
The API I am using is a free Game of Thrones API, and below is the list of URLs used to access each part of the data as I move through it.
https://anapioficeandfire.com/api/characters/
https://anapioficeandfire.com/api/books/
https://anapioficeandfire.com/api/houses/
At the end of each of these URLs is a number that indicates the record number I am trying to get. I have done this before when getting information from a single page that contained multiple JSON objects. This time I need to move through multiple pages to get the single object on each page.
To give you an idea of the steps that I am looking for:
Go to the page
Download the information
Move on to the next page
Break when I have reached the end.
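The steps above could be sketched roughly as follows. This assumes record numbers start at 1 and are contiguous, and that the API answers with a non-200 status (for example 404) once the number runs past the last record. Neither is confirmed here, so treat both as assumptions. The class and method names are hypothetical, and the page fetcher is passed in as a function so the loop can be tried without network access:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.IntFunction;

final class RecordCrawler {
    // Requests record 1, 2, 3, ... and stops at the first missing record.
    static List<String> fetchAll(IntFunction<Optional<String>> fetchPage) {
        List<String> records = new ArrayList<>();
        for (int id = 1; ; id++) {
            Optional<String> body = fetchPage.apply(id);
            if (!body.isPresent()) {
                break; // assumed end of data
            }
            records.add(body.get());
        }
        return records;
    }

    // Returns the response body, or empty for any non-200 status.
    static Optional<String> httpGet(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return Optional.empty();
            }
            try (InputStream in = conn.getInputStream()) {
                // InputStream.readAllBytes needs Java 9 or later.
                return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call could then look like RecordCrawler.fetchAll(id -> RecordCrawler.httpGet("https://anapioficeandfire.com/api/books/" + id)), with each returned string handed to the existing JSON parsing code.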

Java - use searchbar on given website

Let me just start by saying that this is a soft question.
I am rather new to application development, which is why I'm asking a question without presenting you with any actual code. I know the basics of Java coding, and I was wondering if anyone could enlighten me on the following topic:
Say I have an external website, Craigslist, or some other site that allows me to search through products/services/results manually by typing a query into a searchbox somewhere on the page. The trouble is, that there is no API for this site for me to use.
However I do know that http://sfbay.craigslist.org/search/sss?query=QUERYHERE&sort=rel points me to a list of results, where QUERYHERE is replaced by what I'm looking for.
What I'm wondering here is: is it possible to store these results in an Array (or List or some form of Collection) in Java?
Is there perhaps some library or external tool that can allow me to specify a query to search for, have it paste the query into a search link, perform the search, and fill an Array with the results?
Or is what I am describing impossible without an API?
This depends. If the site can return the results as XML or JSON (usually by adding .xml or .json at the end of the URL), you can parse them easily: use a DOM parser for XML in Java, or download a JSON library to parse JSON.
Otherwise you will receive the HTML page that a user would see in a browser. You can try to parse it as XML, but it will take a lot of work to map all the fields in the HTML to the list you want.
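The first step in either case, putting the query into the search link, could be sketched like this. The class and method names are hypothetical, and the URL pattern is the one given in the question. The query is form-encoded so that spaces and characters such as & do not break the link:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SearchUrlBuilder {
    // Inserts the encoded query where QUERYHERE appears in the question's URL.
    // URLEncoder.encode(String, Charset) needs Java 10 or later.
    static String buildSearchUrl(String query) {
        return "http://sfbay.craigslist.org/search/sss?query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&sort=rel";
    }
}
```

Downloading that URL gives the HTML page. Turning it into a List of results would still need an HTML parser (jsoup is one commonly used option), with selectors that depend on the page markup and are not shown here.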

Request.getParameter java to vb.net

I have a question about retrieving data from a client who is using Java, while I am using VB.NET.
I am expecting a form to be posted to me so that I can read the data.
My issue is that when I do Request.Form("DATA") I get nothing from the client.
Now if I create an HTML form and post it to my URL with the field "DATA", I can read everything fine. I can also loop through my form and see the field and the button if I write them out to the screen or to a text file. The code is below:
response.write(Request.Form("DATA"))
OR
Dim entryName As String
For Each entryName In Request.Form
response.write("Entity Name: " & entryName)
Next
Either method above works fine for me but not for the client. When the client hits my page I see nothing at all: no buttons, no fields, nothing.
I am concerned he is not posting properly to me. I spoke with the developer and he said he would retrieve the data on his end by doing something like "Request.getparameter".
I do not know Java at all, but from what I read it sounds like "Request.getparameter" will grab any field out of a form or query string that has the name specified, i.e. the "DATA" field that I am looking for.
Can anyone explain to me what request.getparameter means in Java and what the equivalent code would be in VB.NET?
Again, I do not know Java at all and have searched on this for a while, but can't quite find a definitive answer.
Thanks in advance.
It is correct that in Java, request.getParameter("DATA") will look in both the query string and posted form data, while in .NET, Request.Form("DATA") only looks at posted form data. Therefore, it seems likely that your client is sending the data in the query string, since you are not seeing it.
You have a few options. You could use Request.QueryString("DATA") to check only the query string, or either Request.Item("DATA") / Request("DATA") or Request.Params("DATA") to check both the query string and posted form data, plus cookies and server variables. I think Item and Params may be a little different in what they return, e.g. for multiple values. They are probably the closest equivalent to the Java request.getParameter function.

get number of google search results

I searched a lot for a way to retrieve the number of search results in Google using Java, but nothing worked.
I have tried the Google Custom Search API as well.
I don't want the title/URL of the results, just the total number of results found.
Can someone please guide me?
By using the Custom Search API, you're on the right track.
There's a totalResults key in the response JSON that you get from your query. Just grab its value and you're done.
If you want your JSON to contain only that value, add the fields parameter to your query like this:
https://www.googleapis.com/customsearch/v1?key={YOUR_API_KEY}&cx={YOUR_SEARCH_ENGINE_ID}&q={YOUR_SEARCH_STRING}&alt=json&fields=queries(request(totalResults))
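Once the response body has been downloaded, pulling the number out in Java could look roughly like this. It is a dependency-free sketch that uses a regular expression and assumes the value arrives either as a quoted string or as a bare number. A real JSON library would be more robust. The class and method names are hypothetical:

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TotalResultsParser {
    // Matches "totalResults": "123" as well as "totalResults": 123.
    private static final Pattern TOTAL =
            Pattern.compile("\"totalResults\"\\s*:\\s*\"?(\\d+)\"?");

    // Returns the first totalResults value found, or empty if there is none.
    static OptionalLong parseTotalResults(String json) {
        Matcher m = TOTAL.matcher(json);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }
}
```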
