WebFlux expand is not retrieving the second request - java

I'm trying to use Spring's webflux to create an http endpoint to stream github users using Github's api. I tried to do what is described here and here but it seems that the expand is not fetching the second page of results from github's api. What am I doing wrong?
Here's the code I currently have:
@RestController
@RequestMapping("/user")
public class GithubUserController {
    private static final String GITHUB_API_URL = "https://api.github.com";
    private final WebClient client = WebClient.create(GITHUB_API_URL);

    @GetMapping(value = "/search/stream", produces = MediaType.APPLICATION_STREAM_JSON_VALUE)
    public Flux<GithubUser> search(
            @RequestParam String location,
            @RequestParam String language,
            @RequestParam String followers) {
        return fetchUsers(
                uriBuilder ->
                        uriBuilder
                                .path("/search/users")
                                .queryParam(
                                        "q",
                                        String.format(
                                                "location:%s+language:%s+followers:%s", location, language, followers))
                                .build())
                .expand(
                        response -> {
                            var links = response.headers().header("link");
                            Pattern p = Pattern.compile("<(.*)>; rel=\"next\".*");
                            for (String link : links) {
                                Matcher m = p.matcher(link);
                                if (m.matches()) {
                                    return client.get().uri(m.group(1)).exchange();
                                }
                            }
                            return Flux.empty();
                        })
                .flatMap(response -> response.bodyToFlux(GithubUsersResponse.class))
                .flatMap(parsedResponse -> Flux.fromIterable(parsedResponse.getItems()))
                .log();
    }

    private Mono<ClientResponse> fetchUsers(Function<UriBuilder, URI> url) {
        return client.get().uri(url).exchange();
    }
}
I can see that the regex for the second page works, because a print statement placed inside the if gets printed. However, when I test this in the browser or in Postman, I only get the first page of results returned by GitHub's API:
{"login":"chrisbanes","id":"227486"}
{"login":"keyboardsurfer","id":"336005"}
{"login":"lucasr","id":"730395"}
{"login":"hitherejoe","id":"3879281"}
{"login":"StylingAndroid","id":"933874"}
{"login":"rstoyanchev","id":"401908"}
{"login":"RichardWarburton","id":"328174"}
{"login":"slightfoot","id":"906564"}
{"login":"tomwhite","id":"85085"}
{"login":"jstrachan","id":"30140"}
{"login":"wakaleo","id":"55986"}
{"login":"cesarferreira","id":"277426"}
{"login":"kevalpatel2106","id":"20060162"}
{"login":"jodastephen","id":"213212"}
{"login":"caveofprogramming","id":"19751656"}
{"login":"AlmasB","id":"3594742"}
{"login":"scottyab","id":"404105"}
{"login":"makovkastar","id":"1076309"}
{"login":"salaboy","id":"271966"}
{"login":"blundell","id":"655860"}
{"login":"PierfrancescoSoffritti","id":"7457011"}
{"login":"0xddr","id":"4354177"}
{"login":"irsdl","id":"1798313"}
{"login":"andreban","id":"1733592"}
{"login":"TWiStErRob","id":"2906988"}
{"login":"geometer","id":"344328"}
{"login":"neomatrix369","id":"1570917"}
{"login":"nebraslabs","id":"32421477"}
{"login":"lucko","id":"8352868"}
{"login":"isabelcosta","id":"11148726"}

The Link header in the GitHub API provides the URI in an already escaped format. The String you pass to client.get().uri() is expected to be unescaped, so WebClient escapes the escaped string a second time, and you end up with a URL that returns nothing.
Instead, you probably want to pass a URI object, which is used as-is:
if (m.matches()) {
return client.get().uri(URI.create(m.group(1))).exchange();
}
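The double-escaping effect can be reproduced with the JDK alone. The following is only an analogy, not WebClient's actual code path: the multi-argument java.net.URI constructors always quote the percent character, while URI.create keeps existing escapes. The class name and the query value are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DoubleEncodingDemo {
    // An already-escaped query, similar to what a Link header might carry (made-up value).
    static final String ESCAPED_QUERY = "q=location%3Alondon&page=2";

    // The multi-argument URI constructors always quote '%',
    // so the already-escaped input is escaped a second time.
    static String reEncoded() {
        try {
            return new URI("https", "api.github.com", "/search/users", ESCAPED_QUERY, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // URI.create parses the string as given and leaves existing escapes untouched.
    static String preserved() {
        return URI.create("https://api.github.com/search/users?" + ESCAPED_QUERY).toString();
    }
}
```

In reEncoded() the sequence %3A turns into %253A, which is the kind of URL that no longer matches anything on the server side.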
Side note: your regular expression should also allow for any number of characters before the "next" link, otherwise you won't be able to get past the second page. You probably want to prepend .* to it:
Pattern p = Pattern.compile(".*<(.*)>; rel=\"next\".*");
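As a sketch of an alternative to matching the whole header, the "next" entry can also be located with Matcher.find() and a capture group that cannot run past a closing angle bracket. This is a different pattern from the one above, not a correction of it; the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NextLinkParser {
    // Captures the URL of the entry whose rel is "next", wherever it sits in the header.
    // [^>]* keeps the capture inside a single <...> pair.
    private static final Pattern NEXT = Pattern.compile("<([^>]*)>;\\s*rel=\"next\"");

    static Optional<String> nextUrl(String linkHeader) {
        Matcher m = NEXT.matcher(linkHeader);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

With this shape, an empty Optional maps naturally onto returning Flux.empty() inside expand.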
Second side note - Github's API is rate limited (heavily rate limited if you're unauthenticated), so you may well run into those rate limits. You'll probably want to handle that situation elegantly somehow, but that's a reasonably big topic that's beyond the scope of this question.

Related

Spring request mapping with regex like in javax.ws.rs

I'm trying to rewrite this Google App Engine Maven repository server in Spring.
I have a problem with URL mapping.
Maven repo server standard looks like this:
URL with slash at the end, points to a folder, example:
http://127.0.0.1/testDir/
http://127.0.0.1/testDir/testDir2/
all others (without slash at the end) point to files, example:
http://127.0.0.1/testFile.jar
http://127.0.0.1/testFile.jar.sha1
http://127.0.0.1/testDir/testFile2.pom
http://127.0.0.1/testDir/testFile2.pom.md5
The original app has separate mappings for directories and for files.
It used @javax.ws.rs.Path annotations, which handle regexes differently than Spring does.
I tried a bunch of combinations, for example something like this:
@ResponseBody
@GetMapping("/{file: .*}")
public String test1(@PathVariable String file) {
    return "test1 " + file;
}

@ResponseBody
@GetMapping("{dir: .*[/]{1}$}")
public String test2(@PathVariable String dir) {
    return "test2 " + dir;
}
But I can't figure out how to do this the right way in a Spring application.
I'd like to avoid writing a custom servlet dispatcher.
I had a similar problem once, also regarding a Spring implementation of a maven endpoint.
For the file endpoints, you could do something like this
/**
* An example Maven endpoint for Jar files
*/
@GetMapping("/**/{artifactId}/{version}/{artifactId}-{version}.jar")
public ResponseEntity<String> getJar(@PathVariable("artifactId") String artifactId, @PathVariable("version") String version) {
...
}
This gives you the artifactId and the version, but for the groupId you would need to do some string parsing. You can get the current requestUri with the help of the ServletUriComponentsBuilder
String requestUri = ServletUriComponentsBuilder.fromCurrentRequestUri().build().toUri().toString();
// requestUri = /api/v1/com/my/groupId/an/artifact/v1/an-artifact-v1.jar
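The string parsing for the groupId mentioned above might look roughly like this. It is a sketch under assumptions: the /api/v1/ prefix is taken from the comment in the snippet, the helper name and the coordinates in the usage note are made up, and the path is expected to follow the standard Maven layout groupId/artifactId/version/file.

```java
final class GroupIdExtractor {
    // Strips the API prefix and the trailing /artifactId/version/file part,
    // then turns the remaining folders into a dotted groupId.
    static String groupId(String requestPath, String prefix, String artifactId, String version) {
        String suffix = "/" + artifactId + "/" + version + "/";
        int end = requestPath.lastIndexOf(suffix);
        if (!requestPath.startsWith(prefix) || end < prefix.length()) {
            throw new IllegalArgumentException("Not a Maven artifact path: " + requestPath);
        }
        return requestPath.substring(prefix.length(), end).replace('/', '.');
    }
}
```

For example, groupId("/api/v1/com/example/tools/my-lib/1.0/my-lib-1.0.jar", "/api/v1/", "my-lib", "1.0") yields com.example.tools.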
For the folder endpoints, I'm not sure if this will work, but you can give it a try:
@GetMapping("/**/{artifactId}/{version}")
public ResponseEntity<String> getJar(@PathVariable("artifactId") String artifactId, @PathVariable("version") String version) {
// groupId extracted as before from the requestUri
...
}
I don't know about your Java code, but if you are verifying one path at a time, you can just check whether the string ends in "/": those that do are folders, and those that don't are files.
\/{1}$
This regular expression only checks that the string ends with "/". If there is a match, you have a folder; if there is not, you have a file.
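The trailing-slash check can also be written without a regex. A minimal sketch, with a hypothetical class and enum name, using the paths from the question:

```java
final class MavenPathKind {
    enum Kind { DIRECTORY, FILE }

    // A path ending in "/" points to a folder; anything else is treated as a file.
    static Kind classify(String path) {
        return path.endsWith("/") ? Kind.DIRECTORY : Kind.FILE;
    }
}
```

A catch-all handler could call classify(request.getRequestURI()) and branch on the result.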
There is no specific standard in Spring other than the way you have used it. However, if you can customize the URL, there is a way to differentiate directories and files explicitly. That improves the scalability and readability of the application and saves you a lot of code.
Your code as of now:
@ResponseBody
@GetMapping("/{file: .*}")
public String test1(@PathVariable String file) {
    return "test1 " + file;
}

@ResponseBody
@GetMapping("{dir: .*[/]{1}$}")
public String test2(@PathVariable String dir) {
    return "test2 " + dir;
}
Change the above code in your controller class as follows:
private final Map<String, String> managedEntities = ImmutableMap.of(
        "file", "Type_Of_Operation_You_want_For_File",
        "directory", "Type_Of_Operation_You_want_For_Directory"
);

@GetMapping(path = "/{type:file|directory}")
public String myFileOperationControl(@PathVariable String type) {
    return "Test" + managedEntities.get(type);
}
And proceed further the way you want to per your business logic. Let me know if you have any questions.
Note: Please simply enhance endpoint per your need.
Spring doesn't allow a match to span multiple path segments. Path segments are the parts of the path delimited by the path separator (/), so no regex combination will get you there. Spring 5 does allow spanning multiple path segments, but only at the end of the path, using ** or {*foobar} (which captures the rest into the foobar URI template variable) on the reactive stack. I don't think that will be useful for you.
Your options are limited. I think the best option, if possible, is to use a delimiter other than /, so that you can use a regex.
The other option (which is messy) is to have a catch-all (**) endpoint, read the path from the request, determine whether it is a file or a directory path, and act accordingly.
Try this solution:
@GetMapping("**/{file:.+?\\..+}")
public String processFile(@PathVariable String file, HttpServletRequest request) {
    return "test1 " + file;
}

@GetMapping("**/{dirName:\\w+}")
public String processDirectory(@PathVariable String dirName, HttpServletRequest request) {
    String dirPath = request.getRequestURI();
    return "test2 " + dirPath;
}
Results for URIs from the question:
test2 /testDir/
test2 /testDir/testDir2/
test1 testFile.jar
test1 testFile.jar.sha1
test1 testFile2.pom
test1 testFile2.pom.md5

REST Assured doesn't accept curly brackets

I am unable to use a query in my endpoint URL.
I have tried using .queryParams(), but it does not seem to work. I am getting the following error:
java.lang.IllegalArgumentException: Invalid number of path parameters.
Expected 1, was 0.Undefined path parameters are:
cycle-id[12345];test.name[Validate_REST_Assured_Curly_Brackets].
Can someone help me out
almQuery=https://{almurl}/qcbin/rest/domains/{domain}/projects/{project}/test-instances?query={cycle-id[12345];test.name[Validate_REST_Assured_Curly_Brackets]}
Response response = RestAssured.given().relaxedHTTPSValidation()
.contentType("application/xml")
.cookie(cookie) .get(getEntityEndpoint(almQuery)).then().extract().response();
This is how the RestAssured implementation works. Whenever your URL contains curly braces, it expects a path param for each of them. For example, if your URL contains {project}, you should provide a path param named project.
The only way to avoid this is to manually encode the { and } characters in your URL. You could use URLEncoder.encode(), but it will also encode your other characters, so try simply replacing all { and } with their encoded values:
public class App {
public static void main(String[] args) {
String url = "http://www.example.com/path/{project}";
String encoded = encodeUrlBraces(url);
RestAssured.given()
.when()
.get(encoded);
}
private static String encodeUrlBraces(String url) {
return url.replaceAll("\\{", "%7B").replaceAll("}", "%7D");
}
}
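To see why URLEncoder.encode() is too aggressive here, the two approaches can be compared using only the JDK. This is a small sketch with hypothetical class and method names; the URL is the one from the example above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class BraceEncodingDemo {
    // Encodes every reserved character, including the ':' and '/' that structure the URL.
    static String encodeEverything(String url) {
        return URLEncoder.encode(url, StandardCharsets.UTF_8);
    }

    // Encodes only the curly braces and leaves the rest of the URL intact.
    static String encodeBracesOnly(String url) {
        return url.replace("{", "%7B").replace("}", "%7D");
    }
}
```

For http://www.example.com/path/{project}, encodeEverything produces http%3A%2F%2Fwww.example.com%2Fpath%2F%7Bproject%7D, which is no longer usable as a URL, while encodeBracesOnly produces http://www.example.com/path/%7Bproject%7D.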
Here's an answer for this from Rest Assured founder and contributor https://github.com/rest-assured/rest-assured/issues/682

Rest Assured code not allowing to use println

I am trying to automate the Twitter API. When I try to print js.get("text") using
System.out.println(js.get("text")); I get the error
"The method println(boolean) is ambiguous for the type PrintStream"
I downloaded the jars "scribejava-apis-2.5.3" and "scribejava-core-4.2.0" and added them to the build path as well.
The code below does not let me use println for js.get("text"):
public class Basicfunc {
String Consumerkeys= "**************";
String Consumersecretkeys="*******************";
String Token="*******************";
String Tokensecret="***************************";
@Test
public void getLatestTweet(){
RestAssured.baseURI = "https://api.twitter.com/1.1/statuses";
Response res = given().auth().oauth(Consumerkeys, Consumersecretkeys, Token, Tokensecret).
queryParam("count","1").
when().get("/home_timeline.json").then().extract().response();
String response = res.asString();
System.out.println(response);
JsonPath js = new JsonPath(response);
System.out.println(js.get("text"));
}
}
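Use System.out.println(js.getString("text")); instead of System.out.println(js.get("text"));. get is generic, so its return type is inferred from the call site and the compiler cannot decide which println overload to use; getString always returns a String.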
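The ambiguity can be illustrated without REST Assured. In the sketch below, get is a hypothetical stand-in with the same generic shape (a method returning an inferred T), not the library's own code; the class name and data are made up.

```java
import java.util.Map;

final class GenericGetDemo {
    private static final Map<String, Object> DATA = Map.of("text", "hello");

    // Stand-in for a generic getter: the caller's context decides what T is.
    @SuppressWarnings("unchecked")
    static <T> T get(String key) {
        return (T) DATA.get(key);
    }

    // A getter with a fixed return type leaves nothing to infer.
    static String getString(String key) {
        return String.valueOf(DATA.get(key));
    }

    // Assigning to a typed variable pins T to String before println sees the value.
    // Passing get("text") straight to System.out.println is what the compiler
    // typically rejects, because several println overloads could accept a T.
    static String textViaTypedVariable() {
        String text = get("text");
        return text;
    }
}
```

Either route works: call the typed getter, or store the result in a String variable first and print that.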
I think your problem is that your Twitter response is actually a list.
Try System.out.println(js.getString("[0].text")); and be aware that this reads only the first ([0]) entry and ignores the rest.

Changing the format of list / array in URL [Retrofit 2]

As stated in the Retrofit documentation above the @Query annotation:
Passing a List or array will result in a query parameter for each
non-null item.
As of now my call looks something like this:
@GET("questions")
Call<List<QuestionHolder>> getQuestionsExcludingTheSpecified(
        @Query("exclude_ids") long[] excludedQuestionIds
);
This works, but it results in fairly long URLs quite fast.
E.g. for excludedQuestionIds = new long[]{1L, 4L, 16L, 64L} the request URL will already be /questions?exclude_ids=1&exclude_ids=4&exclude_ids=16&exclude_ids=64.
Is there an easy way to change this behaviour so that arrays are formatted as exclude_ids=[1,4,16,64] or something similar?
What has come to my mind so far:
use JsonArray as parameter, but then I need to convert every array / list before making the call
intercept every request and compress duplicated keys
override the built-in @Query decorator
Any ideas?
I decided to go with the Interceptor approach. I simply change any outgoing request that includes more than one value for a single query parameter.
public class QueryParameterCompressionInterceptor implements Interceptor {
    @Override
    public Response intercept(Interceptor.Chain chain) throws IOException {
        Request request = chain.request();
        HttpUrl url = request.url();
        // Collect all changes in one builder so that every multi-valued parameter is compressed.
        HttpUrl.Builder urlBuilder = url.newBuilder();
        for (String parameterName : url.queryParameterNames()) {
            List<String> queryParameterValues = url.queryParameterValues(parameterName);
            if (queryParameterValues.size() > 1) {
                String formattedValues = "[" + TextUtils.join(",", queryParameterValues) + "]";
                urlBuilder
                        .removeAllQueryParameters(parameterName)
                        .addQueryParameter(parameterName, formattedValues);
            }
        }
        return chain.proceed(request.newBuilder().url(urlBuilder.build()).build());
    }
}
Non-Android solution
TextUtils is part of the Android SDK. If you're not developing for Android, you can replace TextUtils.join with a method like this:
public static String concatListOfStrings(String separator, Iterable<String> strings) {
    StringBuilder sb = new StringBuilder();
    for (String str : strings) {
        sb.append(separator).append(str);
    }
    // Drop the separator that was written before the first element.
    sb.delete(0, separator.length());
    return sb.toString();
}
You may also have a look at this SO question for more solutions regarding the concatenation.
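On Java 8 and later, String.join from the standard library covers the same concatenation, so no helper is needed. A minimal sketch with a hypothetical class and method name:

```java
import java.util.List;

final class QueryValueFormatter {
    // Joins the values with commas and wraps them in brackets, e.g. for exclude_ids.
    static String compress(List<String> values) {
        return "[" + String.join(",", values) + "]";
    }
}
```

For the values 1, 4, 16 and 64 from the question this yields [1,4,16,64].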

Using java to get the google search results [duplicate]

Does anyone know if and how it is possible to search Google programmatically - especially if there is a Java API for it?
Some facts:
Google offers a public search webservice API which returns JSON: http://ajax.googleapis.com/ajax/services/search/web. Documentation here
Java offers java.net.URL and java.net.URLConnection to fire and handle HTTP requests.
In Java, JSON can be converted to a full-fledged JavaBean object using any Java JSON API. One of the best is Google Gson.
Now do the math:
public static void main(String[] args) throws Exception {
String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
String search = "stackoverflow";
String charset = "UTF-8";
URL url = new URL(google + URLEncoder.encode(search, charset));
Reader reader = new InputStreamReader(url.openStream(), charset);
GoogleResults results = new Gson().fromJson(reader, GoogleResults.class);
// Show title and URL of 1st result.
System.out.println(results.getResponseData().getResults().get(0).getTitle());
System.out.println(results.getResponseData().getResults().get(0).getUrl());
}
With this Javabean class representing the most important JSON data as returned by Google (it actually returns more data, but it's left up to you as an exercise to expand this Javabean code accordingly):
public class GoogleResults {
private ResponseData responseData;
public ResponseData getResponseData() { return responseData; }
public void setResponseData(ResponseData responseData) { this.responseData = responseData; }
public String toString() { return "ResponseData[" + responseData + "]"; }
static class ResponseData {
private List<Result> results;
public List<Result> getResults() { return results; }
public void setResults(List<Result> results) { this.results = results; }
public String toString() { return "Results[" + results + "]"; }
}
static class Result {
private String url;
private String title;
public String getUrl() { return url; }
public String getTitle() { return title; }
public void setUrl(String url) { this.url = url; }
public void setTitle(String title) { this.title = title; }
public String toString() { return "Result[url:" + url +",title:" + title + "]"; }
}
}
See also:
How to fire and handle HTTP requests using java.net.URLConnection
How to convert JSON to Java
Update: since November 2010 (2 months after the above answer), the public search webservice has been deprecated (and the last day on which the service was offered was September 29, 2014). Your best bet is now to query http://www.google.com/search directly with an honest user agent and then parse the result using an HTML parser. If you omit the user agent, you get a 403 back. If you lie in the user agent and simulate a web browser (e.g. Chrome or Firefox), you get a much larger HTML response back, which is a waste of bandwidth and performance.
Here's a kickoff example using Jsoup as HTML parser:
String google = "http://www.google.com/search?q=";
String search = "stackoverflow";
String charset = "UTF-8";
String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!
Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");
for (Element link : links) {
String title = link.text();
String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");
if (!url.startsWith("http")) {
continue; // Ads/news/etc.
}
System.out.println("Title: " + title);
System.out.println("URL: " + url);
}
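The redirect unwrapping done inside the loop above can be isolated into a small helper. This is a sketch that assumes the "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>" format described in the code comment; the class and method names are made up.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class GoogleRedirect {
    // Extracts and decodes the value of the q parameter.
    // Returns the input unchanged if it has no "?q=" part.
    static String targetUrl(String redirectUrl) {
        int start = redirectUrl.indexOf("?q=");
        if (start < 0) {
            return redirectUrl;
        }
        start += 3; // skip past "?q="
        int end = redirectUrl.indexOf('&', start);
        String raw = end < 0 ? redirectUrl.substring(start) : redirectUrl.substring(start, end);
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }
}
```

Unlike the inline version, this does not fail when the link has no & after the q parameter.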
To search Google through an API you should use Google Custom Search; scraping the web page is not allowed.
In Java you can use the CustomSearch API Client Library for Java.
The maven dependency is:
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-customsearch</artifactId>
<version>v1-rev57-1.23.0</version>
</dependency>
Example code searching using Google CustomSearch API Client Library
public static void main(String[] args) throws GeneralSecurityException, IOException {
String searchQuery = "test"; //The query to search
String cx = "002845322276752338984:vxqzfa86nqc"; //Your search engine
//Instance Customsearch
Customsearch cs = new Customsearch.Builder(GoogleNetHttpTransport.newTrustedTransport(), JacksonFactory.getDefaultInstance(), null)
.setApplicationName("MyApplication")
.setGoogleClientRequestInitializer(new CustomsearchRequestInitializer("your api key"))
.build();
//Set search parameter
Customsearch.Cse.List list = cs.cse().list(searchQuery).setCx(cx);
//Execute search
Search result = list.execute();
if (result.getItems()!=null){
for (Result ri : result.getItems()) {
//Get title, link, body etc. from search
System.out.println(ri.getTitle() + ", " + ri.getLink());
}
}
}
As you can see, you will need to request an API key and set up your own search engine ID, cx.
Note that you can search the whole web by selecting "Search entire web" on the basic tab settings during setup of cx, but the results will not be exactly the same as a normal Google search in a browser.
Currently (date of answer) you get 100 API calls per day for free; beyond that, Google would like a share of your profit.
In the Terms of Service of google we can read:
5.3 You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.
So I guess the answer is no. Moreover, the SOAP API is no longer available.
Google TOS have been relaxed a bit in April 2014. Now it states:
"Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide."
So the passage about "automated means" and scripts is gone now. It evidently still is not the desired (by google) way of accessing their services, but I think it is now formally open to interpretation of what exactly an "interface" is and whether it makes any difference as of how exactly returned HTML is processed (rendered or parsed). Anyhow, I have written a Java convenience library and it is up to you to decide whether to use it or not:
https://github.com/afedulov/google-web-search
Indeed there is an API to search google programmatically. The API is called google custom search. For using this API, you will need an Google Developer API key and a cx key. A simple procedure for accessing google search from java program is explained in my blog.
Now dead, here is the Wayback Machine link.
As an alternative to BalusC's answer, which relies on a deprecated service and would require proxies, you can use this package. Code sample:
Map<String, String> parameter = new HashMap<>();
parameter.put("q", "Coffee");
parameter.put("location", "Portland");
GoogleSearchResults serp = new GoogleSearchResults(parameter);
JsonObject data = serp.getJson();
JsonArray results = (JsonArray) data.get("organic_results");
JsonObject first_result = results.get(0).getAsJsonObject();
System.out.println("first coffee: " + first_result.get("title").getAsString());
Library on GitHub
In light of those TOS alterations last year we built an API that gives access to Google's search. It was for our own use only but after some requests we decided to open it up. We're planning to add additional search engines in the future!
Should anyone be looking for an easy way to implement / acquire search results you are free to sign up and give the REST API a try: https://searchapi.io
It returns JSON results and should be easy enough to implement with the detailed docs.
It's a shame that Bing and Yahoo are miles ahead of Google in this regard. Their APIs aren't cheap, but they are at least available.
