How to exit akka stream after n elements received? - java

I'm brand new to Akka and I'm just trying to get the hang of it.
As an experiment, I want to read from a Kinesis stream and collect n messages and stop.
The only sink I found that would stop reading records was Sink.head(). But that only returns one record, and I'd like to get more than that.
I can't quite figure out how to stop reading from the stream after receiving the n messages though.
Here's the code I have tried so far:
@Test
public void testReadingFromKinesisNRecords() throws ExecutionException, InterruptedException {
final ActorSystem system = ActorSystem.create("foo");
final Materializer materializer = ActorMaterializer.create(system);
ProfileCredentialsProvider profileCredentialsProvider = ProfileCredentialsProvider.create();
final KinesisAsyncClient kinesisClient = KinesisAsyncClient.builder()
.credentialsProvider(profileCredentialsProvider)
.region(Region.US_WEST_2)
.httpClient(AkkaHttpClient.builder()
.withActorSystem(system).build())
.build();
system.registerOnTermination(kinesisClient::close);
String streamName = "akka-test-stream";
String shardId = "shardId-000000000000";
int numberOfRecordsToRead = 3;
final ShardSettings settings = ShardSettings.create(streamName, shardId)
.withRefreshInterval(Duration.ofSeconds(1))
.withLimit(numberOfRecordsToRead) // return a maximum of n records (and quit?!)
.withShardIterator(ShardIterators.latest());
final Source<Record, NotUsed> sourceKinesisBasic = KinesisSource.basic(settings, kinesisClient);
Flow<Record, String, NotUsed> flowMapRecordToString = Flow.of(Record.class).map(record -> extractDataFromRecord(record));
Flow<String, String, NotUsed> flowPrinter = Flow.of(String.class).map(s -> debugPrint(s));
// Flow<String, List<String>, NotUsed> flowGroupedWithinMinute =
// Flow.of(String.class).groupedWithin(
// numberOfRecordsToRead, // group size
// Duration.ofSeconds(60) // group time
// );
Source<String, NotUsed> sourceStringsFromKinesisRecords = sourceKinesisBasic
.via(flowMapRecordToString)
.via(flowPrinter);
// .via(flowGroupedWithinMinute); // nope
// sink to list of strings
// Sink<String, CompletionStage<List<String>>> sinkToList = Sink.seq();
Sink<String, CompletionStage<List<String>>> sink10 = Sink.takeLast(10);
// Sink<String, CompletionStage<String>> sinkHead = Sink.head(); // only gives you one message
CompletionStage<List<String>> streamCompletion = sourceStringsFromKinesisRecords
.runWith(sink10, materializer);
CompletableFuture<List<String>> completableFuture = streamCompletion.toCompletableFuture();
completableFuture.join(); // never stops running...
List<String> result = completableFuture.get();
int foo = 1;
}
private String extractDataFromRecord(Record record) {
String encType = record.encryptionTypeAsString();
Instant arrivalTimestamp = record.approximateArrivalTimestamp();
String data = record.data().asString(StandardCharsets.UTF_8);
return data;
}
private String debugPrint(String s) {
System.out.println(s);
return s;
}
Thank you for any clues

I found out the answer is to use take(n) at the flow level:
...
Flow<String, String, NotUsed> flowTakeN = Flow.of(String.class).take(numberOfRecordsToRead);
Source<String, NotUsed> sourceStringsFromKinesisRecords = sourceKinesisBasic
.via(flowMapRecordToString)
.via(flowPrinter)
.via(flowTakeN);
...

Just to add on to the answer you found, it is also possible to express things more directly without via:
Source<String, NotUsed> sourceStringsFromKinesisRecords = sourceKinesisBasic
.map(record -> extractDataFromRecord(record))
.map(s -> debugPrint(s))
.take(10);

Related

How can JDBC URLs be parsed for extracting connection properties in mysql-connector-java 8.0? [duplicate]

I've got the URI like this:
https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback
I need a collection with parsed elements:
NAME VALUE
------------------------
client_id SS
response_type code
scope N_FULL
access_type offline
redirect_uri http://localhost/Callback
To be exact, I need a Java equivalent for the C#/.NET HttpUtility.ParseQueryString method.
If you are looking for a way to achieve it without using an external library, the following code will help you.
public static Map<String, String> splitQuery(URL url) throws UnsupportedEncodingException {
Map<String, String> query_pairs = new LinkedHashMap<String, String>();
String query = url.getQuery();
String[] pairs = query.split("&");
for (String pair : pairs) {
int idx = pair.indexOf("=");
query_pairs.put(URLDecoder.decode(pair.substring(0, idx), "UTF-8"), URLDecoder.decode(pair.substring(idx + 1), "UTF-8"));
}
return query_pairs;
}
You can access the returned Map using <map>.get("client_id"), with the URL given in your question this would return "SS".
UPDATE URL-Decoding added
UPDATE As this answer is still quite popular, I made an improved version of the method above, which handles multiple parameters with the same key and parameters with no value as well.
public static Map<String, List<String>> splitQuery(URL url) throws UnsupportedEncodingException {
final Map<String, List<String>> query_pairs = new LinkedHashMap<String, List<String>>();
final String[] pairs = url.getQuery().split("&");
for (String pair : pairs) {
final int idx = pair.indexOf("=");
final String key = idx > 0 ? URLDecoder.decode(pair.substring(0, idx), "UTF-8") : pair;
if (!query_pairs.containsKey(key)) {
query_pairs.put(key, new LinkedList<String>());
}
final String value = idx > 0 && pair.length() > idx + 1 ? URLDecoder.decode(pair.substring(idx + 1), "UTF-8") : null;
query_pairs.get(key).add(value);
}
return query_pairs;
}
UPDATE Java8 version
public Map<String, List<String>> splitQuery(URL url) {
if (Strings.isNullOrEmpty(url.getQuery())) {
return Collections.emptyMap();
}
return Arrays.stream(url.getQuery().split("&"))
.map(this::splitQueryParameter)
.collect(Collectors.groupingBy(SimpleImmutableEntry::getKey, LinkedHashMap::new, mapping(Map.Entry::getValue, toList())));
}
public SimpleImmutableEntry<String, String> splitQueryParameter(String it) {
final int idx = it.indexOf("=");
final String key = idx > 0 ? it.substring(0, idx) : it;
final String value = idx > 0 && it.length() > idx + 1 ? it.substring(idx + 1) : null;
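return new SimpleImmutableEntry<>(
URLDecoder.decode(key, StandardCharsets.UTF_8),
// a parameter without a value stays null; passing null to URLDecoder.decode would throw
value == null ? null : URLDecoder.decode(value, StandardCharsets.UTF_8)
);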
}
Running the above method with the URL
https://stackoverflow.com?param1=value1&param2=&param3=value3&param3
returns this Map:
{param1=["value1"], param2=[null], param3=["value3", null]}
org.apache.http.client.utils.URLEncodedUtils
is a well-known library class that can do it for you:
import org.apache.http.client.utils.URLEncodedUtils;
String url = "http://www.example.com/something.html?one=1&two=2&three=3&three=3a";
List<NameValuePair> params = URLEncodedUtils.parse(new URI(url), Charset.forName("UTF-8"));
for (NameValuePair param : params) {
System.out.println(param.getName() + " : " + param.getValue());
}
Outputs
one : 1
two : 2
three : 3
three : 3a
If you are using Spring Framework:
public static void main(String[] args) {
String uri = "http://my.test.com/test?param1=ab&param2=cd&param2=ef";
MultiValueMap<String, String> parameters =
UriComponentsBuilder.fromUriString(uri).build().getQueryParams();
List<String> param1 = parameters.get("param1");
List<String> param2 = parameters.get("param2");
System.out.println("param1: " + param1.get(0));
System.out.println("param2: " + param2.get(0) + "," + param2.get(1));
}
You will get:
param1: ab
param2: cd,ef
Use Google Guava and do it in 2 lines:
import java.util.Map;
import com.google.common.base.Splitter;
public class Parser {
public static void main(String... args) {
String uri = "https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback";
String query = uri.split("\\?")[1];
final Map<String, String> map = Splitter.on('&').trimResults().withKeyValueSeparator('=').split(query);
System.out.println(map);
}
}
which gives you
{client_id=SS, response_type=code, scope=N_FULL, access_type=offline, redirect_uri=http://localhost/Callback}
The shortest way I've found is this one:
MultiValueMap<String, String> queryParams =
UriComponentsBuilder.fromUriString(url).build().getQueryParams();
UPDATE: UriComponentsBuilder comes from Spring.
For Android, if you are using OkHttp in your project, you might take a look at this. It is simple and helpful.
final HttpUrl url = HttpUrl.parse(query);
if (url != null) {
final String target = url.queryParameter("target");
final String id = url.queryParameter("id");
}
PLAIN Java 11
Given the URL to analyse:
URL url = new URL("https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback");
This solution collects a list of pairs:
List<Map.Entry<String, String>> list = Pattern.compile("&")
.splitAsStream(url.getQuery())
.map(s -> Arrays.copyOf(s.split("=", 2), 2))
.map(o -> Map.entry(decode(o[0]), decode(o[1])))
.collect(Collectors.toList());
This solution, on the other hand, collects a map (since a URL can contain several parameters with the same name but different values).
Map<String, List<String>> list = Pattern.compile("&")
.splitAsStream(url.getQuery())
.map(s -> Arrays.copyOf(s.split("=", 2), 2))
.collect(groupingBy(s -> decode(s[0]), mapping(s -> decode(s[1]), toList())));
Both solutions must use a utility function to properly decode the parameters.
private static String decode(final String encoded) {
return Optional.ofNullable(encoded)
.map(e -> URLDecoder.decode(e, StandardCharsets.UTF_8))
.orElse(null);
}
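Decoding has to happen after splitting, because an encoded `&` (`%26`) or `=` (`%3D`) inside a value would otherwise be mistaken for a separator. A minimal sketch of that ordering, assuming Java 10+ for `URLDecoder.decode(String, Charset)`; the class name `SplitThenDecode` and the sample query are made up for illustration:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SplitThenDecode {
    private static final Pattern AMP = Pattern.compile("&");

    // Null-safe decode: a parameter without '=' has no value part.
    static String decode(String encoded) {
        return encoded == null ? null : URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    // Splits the RAW (still percent-encoded) query first, then decodes each piece.
    static Map<String, List<String>> group(String rawQuery) {
        if (rawQuery == null || rawQuery.isEmpty()) {
            return Map.of();
        }
        return AMP.splitAsStream(rawQuery)
                // pad to two slots so a missing value shows up as null
                .map(s -> Arrays.copyOf(s.split("=", 2), 2))
                .collect(Collectors.groupingBy(
                        kv -> decode(kv[0]),
                        LinkedHashMap::new, // keep first-seen key order
                        Collectors.mapping(kv -> decode(kv[1]), Collectors.toList())));
    }

    public static void main(String[] args) {
        // %26 and %3D are an encoded '&' and '=' that belong to the value of q
        System.out.println(group("q=a%26b%3Dc&x=1&q=2&flag"));
    }
}
```

When the query comes from a java.net.URI, `getRawQuery()` is the method to pass in here, since `getQuery()` returns an already-decoded string.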
On Android, there is a Uri class in the android.net package. Note that Uri is part of android.net, whereas URI is part of java.net.
The Uri class has many functions for extracting key-value pairs from a query.
The following function returns the key-value pairs as a HashMap.
In Java:
Map<String, String> getQueryKeyValueMap(Uri uri){
HashMap<String, String> keyValueMap = new HashMap();
String key;
String value;
Set<String> keyNamesList = uri.getQueryParameterNames();
Iterator iterator = keyNamesList.iterator();
while (iterator.hasNext()){
key = (String) iterator.next();
value = uri.getQueryParameter(key);
keyValueMap.put(key, value);
}
return keyValueMap;
}
In Kotlin:
fun getQueryKeyValueMap(uri: Uri): HashMap<String, String> {
val keyValueMap = HashMap<String, String>()
var key: String
var value: String
val keyNamesList = uri.queryParameterNames
val iterator = keyNamesList.iterator()
while (iterator.hasNext()) {
key = iterator.next() as String
value = uri.getQueryParameter(key) as String
keyValueMap.put(key, value)
}
return keyValueMap
}
If you are using servlet doGet try this
request.getParameterMap()
Returns a java.util.Map of the parameters of this request.
Returns:
an immutable java.util.Map containing parameter names as keys and parameter values as map values. The keys in the parameter map are of type String. The values in the parameter map are of type String array.
(Java doc)
Netty also provides a nice query string parser called QueryStringDecoder.
In one line of code, it can parse the URL in the question.
I like it because it doesn't require catching or throwing java.net.MalformedURLException.
In one line:
Map<String, List<String>> parameters = new QueryStringDecoder(url).parameters();
See javadocs here: https://netty.io/4.1/api/io/netty/handler/codec/http/QueryStringDecoder.html
Here is a short, self contained, correct example:
import io.netty.handler.codec.http.QueryStringDecoder;
import org.apache.commons.lang3.StringUtils;
import java.util.List;
import java.util.Map;
public class UrlParse {
public static void main(String... args) {
String url = "https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback";
QueryStringDecoder decoder = new QueryStringDecoder(url);
Map<String, List<String>> parameters = decoder.parameters();
print(parameters);
}
private static void print(final Map<String, List<String>> parameters) {
System.out.println("NAME VALUE");
System.out.println("------------------------");
parameters.forEach((key, values) ->
values.forEach(val ->
System.out.println(StringUtils.rightPad(key, 19) + val)));
}
}
which generates
NAME VALUE
------------------------
client_id SS
response_type code
scope N_FULL
access_type offline
redirect_uri http://localhost/Callback
If you're using Java 8 and you're willing to write a few reusable methods, you can do it in one line.
private Map<String, List<String>> parse(final String query) {
return Arrays.asList(query.split("&")).stream().map(p -> p.split("=")).collect(Collectors.toMap(s -> decode(index(s, 0)), s -> Arrays.asList(decode(index(s, 1))), this::mergeLists));
}
private <T> List<T> mergeLists(final List<T> l1, final List<T> l2) {
List<T> list = new ArrayList<>();
list.addAll(l1);
list.addAll(l2);
return list;
}
private static <T> T index(final T[] array, final int index) {
return index >= array.length ? null : array[index];
}
private static String decode(final String encoded) {
try {
return encoded == null ? null : URLDecoder.decode(encoded, "UTF-8");
} catch(final UnsupportedEncodingException e) {
throw new RuntimeException("Impossible: UTF-8 is a required encoding", e);
}
}
But that's a pretty brutal line.
There is a new version of the Apache HTTP client, org.apache.httpcomponents.client5, where URLEncodedUtils is now deprecated. URIBuilder should be used instead:
import org.apache.hc.core5.http.NameValuePair;
import org.apache.hc.core5.net.URIBuilder;
private static Map<String, String> getQueryParameters(final String url) throws URISyntaxException {
return new URIBuilder(new URI(url), StandardCharsets.UTF_8).getQueryParams()
.stream()
.collect(Collectors.toMap(NameValuePair::getName,
nameValuePair -> URLDecoder.decode(nameValuePair.getValue(), StandardCharsets.UTF_8)));
}
A ready-to-use solution for decoding of URI query part (incl. decoding and multi parameter values)
Comments
I wasn't happy with the code provided by @Pr0gr4mm3r in https://stackoverflow.com/a/13592567/1211082. The Stream-based solution does not do URL decoding, and the mutable version is clumsy.
Thus I elaborated a solution that:
Can decompose a URI query part into a Map<String, List<Optional<String>>>
Can handle multiple values for the same parameter name
Can represent parameters without a value properly (Optional.empty() instead of null)
Decodes parameter names and values correctly via URLDecoder
Is based on Java 8 Streams
Is directly usable (see code including imports below)
Allows for proper error handling (here by turning the checked exception UnsupportedEncodingException into a runtime exception, RuntimeUnsupportedEncodingException, which works inside a stream). Wrapping regular functions into functions that throw checked exceptions is a pain, and Scala's Try is not available in the Java language by default.
Java Code
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.*;
import static java.util.stream.Collectors.*;
public class URIParameterDecode {
/**
* Decode parameters in query part of a URI into a map from parameter name to its parameter values.
* For parameters that occur multiple times each value is collected.
* Proper decoding of the parameters is performed.
*
* Example
* <pre>a=1&b=2&c=&a=4</pre>
* is converted into
* <pre>{a=[Optional[1], Optional[4]], b=[Optional[2]], c=[Optional.empty]}</pre>
* @param query the query part of a URI
* @return map of parameter names to a list of their values.
*
*/
public static Map<String, List<Optional<String>>> splitQuery(String query) {
if (query == null || query.isEmpty()) {
return Collections.emptyMap();
}
return Arrays.stream(query.split("&"))
.map(p -> splitQueryParameter(p))
.collect(groupingBy(e -> e.get0(), // group by parameter name
mapping(e -> e.get1(), toList())));// keep parameter values and assemble into list
}
public static Pair<String, Optional<String>> splitQueryParameter(String parameter) {
final String enc = "UTF-8";
List<String> keyValue = Arrays.stream(parameter.split("="))
.map(e -> {
try {
return URLDecoder.decode(e, enc);
} catch (UnsupportedEncodingException ex) {
throw new RuntimeUnsupportedEncodingException(ex);
}
}).collect(toList());
if (keyValue.size() == 2) {
return new Pair<>(keyValue.get(0), Optional.of(keyValue.get(1)));
} else {
return new Pair<>(keyValue.get(0), Optional.empty());
}
}
/** Runtime exception (instead of checked exception) to denote unsupported encoding */
public static class RuntimeUnsupportedEncodingException extends RuntimeException {
public RuntimeUnsupportedEncodingException(Throwable cause) {
super(cause);
}
}
/**
* A simple pair of two elements
* @param <U> first element
* @param <V> second element
*/
public static class Pair<U, V> {
U a;
V b;
public Pair(U u, V v) {
this.a = u;
this.b = v;
}
public U get0() {
return a;
}
public V get1() {
return b;
}
}
}
Scala Code
... and for the sake of completeness, I cannot resist providing the solution in Scala, which wins on brevity and beauty:
import java.net.URLDecoder
object Decode {
def main(args: Array[String]): Unit = {
val input = "a=1&b=2&c=&a=4";
println(separate(input))
}
def separate(input: String) : Map[String, List[Option[String]]] = {
case class Parameter(key: String, value: Option[String])
def separateParameter(parameter: String) : Parameter =
parameter.split("=")
.map(e => URLDecoder.decode(e, "UTF-8")) match {
case Array(key, value) => Parameter(key, Some(value))
case Array(key) => Parameter(key, None)
}
input.split("&").toList
.map(p => separateParameter(p))
.groupBy(p => p.key)
.mapValues(vs => vs.map(p => p.value))
}
}
Using the comments and solutions mentioned above, I store all the query parameters in a Map<String, Object>, where each value is either a String or a Set<String>. The solution is given below. It is recommended to validate the URL with some kind of URL validator first and then call the convertQueryStringToMap method.
private static final String DEFAULT_ENCODING_SCHEME = "UTF-8";
public static Map<String, Object> convertQueryStringToMap(String url) throws UnsupportedEncodingException, URISyntaxException {
List<NameValuePair> params = URLEncodedUtils.parse(new URI(url), DEFAULT_ENCODING_SCHEME);
Map<String, Object> queryStringMap = new HashMap<>();
for(NameValuePair param : params){
queryStringMap.put(param.getName(), handleMultiValuedQueryParam(queryStringMap, param.getName(), param.getValue()));
}
return queryStringMap;
}
private static Object handleMultiValuedQueryParam(Map responseMap, String key, String value) {
if (!responseMap.containsKey(key)) {
return value.contains(",") ? new HashSet<String>(Arrays.asList(value.split(","))) : value;
} else {
Set<String> queryValueSet = responseMap.get(key) instanceof Set ? (Set<String>) responseMap.get(key) : new HashSet<String>();
if (value.contains(",")) {
queryValueSet.addAll(Arrays.asList(value.split(",")));
} else {
queryValueSet.add(value);
}
return queryValueSet;
}
}
I had a go at a Kotlin version seeing how this is the top result in Google.
@Throws(UnsupportedEncodingException::class)
fun splitQuery(url: URL): Map<String, List<String>> {
val queryPairs = LinkedHashMap<String, ArrayList<String>>()
url.query.split("&".toRegex())
.dropLastWhile { it.isEmpty() }
.map { it.split('=') }
.map { it.getOrEmpty(0).decodeToUTF8() to it.getOrEmpty(1).decodeToUTF8() }
.forEach { (key, value) ->
if (!queryPairs.containsKey(key)) {
queryPairs[key] = arrayListOf(value)
} else {
if(!queryPairs[key]!!.contains(value)) {
queryPairs[key]!!.add(value)
}
}
}
return queryPairs
}
And the extension methods
fun List<String>.getOrEmpty(index: Int) : String {
return getOrElse(index) {""}
}
fun String.decodeToUTF8(): String {
return URLDecoder.decode(this, "UTF-8")
}
Also, I would recommend a regex-based implementation of URLParser:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class URLParser {
private final String query;
public URLParser(String query) {
this.query = query;
}
public String get(String name) {
String regex = "(?:^|\\?|&)" + name + "=(.*?)(?:&|$)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(this.query);
if (matcher.find()) {
return matcher.group(1);
}
return "";
}
}
This class is easy to use. It just needs the URL or the query string on initialization and parses value by given key.
class Main {
public static void main(String[] args) {
URLParser parser = new URLParser("https://www.google.com/search?q=java+parse+url+params&oq=java+parse+url+params&aqs=chrome..69i57j0i10.18908j0j7&sourceid=chrome&ie=UTF-8");
System.out.println(parser.get("q")); // java+parse+url+params
System.out.println(parser.get("sourceid")); // chrome
System.out.println(parser.get("ie")); // UTF-8
}
}
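Two properties of the regex lookup above are worth keeping in mind: the parameter name is pasted into the pattern unescaped, so a name such as `a.b` is treated as a regex, and the returned value is still URL-encoded (hence `java+parse+url+params`). A minimal sketch of a variant that quotes the name and decodes the value, assuming Java 10+; the class name `QuotedParamLookup` and its `Optional` return type are illustrative choices, not part of the answer above:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedParamLookup {

    // Returns the decoded value of the first occurrence of the parameter, if any.
    static Optional<String> get(String urlOrQuery, String name) {
        // Pattern.quote keeps characters such as '.' or '[' in the name literal;
        // the value stops at the next '&' or at a '#' fragment marker.
        Pattern pattern = Pattern.compile("(?:^|\\?|&)" + Pattern.quote(name) + "=([^&#]*)");
        Matcher matcher = pattern.matcher(urlOrQuery);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(URLDecoder.decode(matcher.group(1), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String url = "https://www.google.com/search?q=java+parse+url+params&ie=UTF-8";
        System.out.println(get(url, "q").orElse(""));
        System.out.println(get(url, "missing").isPresent());
    }
}
```

Returning an Optional instead of an empty string makes it possible to tell a missing parameter from one that is present with an empty value.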
A Kotlin answer, starting from https://stackoverflow.com/a/51024552/3286489 but tidied up, offered in two versions, and using immutable collection operations.
Use java.net.URI to extract the query, then use the extension functions provided below.
Assuming you only want the last value for a parameter, i.e. page=2&page=3 yields {page=3}, use this extension function:
fun URI.getQueryMap(): Map<String, String> {
if (query == null) return emptyMap()
return query.split("&")
.mapNotNull { element -> element.split("=")
.takeIf { it.size == 2 && it.none { it.isBlank() } } }
.associateBy({ it[0].decodeUTF8() }, { it[1].decodeUTF8() })
}
private fun String.decodeUTF8() = URLDecoder.decode(this, "UTF-8") // decode page=%22ABC%22 to page="ABC"
Assuming you want a list of all values for a parameter, i.e. page=2&page=3 yields {page=[2, 3]}:
fun URI.getQueryMapList(): Map<String, List<String>> {
if (query == null) return emptyMap()
return query.split("&")
.distinct()
.mapNotNull { element -> element.split("=")
.takeIf { it.size == 2 && it.none { it.isBlank() } } }
.groupBy({ it[0].decodeUTF8() }, { it[1].decodeUTF8() })
}
private fun String.decodeUTF8() = URLDecoder.decode(this, "UTF-8") // decode page=%22ABC%22 to page="ABC"
Use it as follows:
val uri = URI("schema://host/path/?page=&page=2&page=2&page=3")
println(uri.getQueryMapList()) // Result is {page=[2, 3]}
println(uri.getQueryMap()) // Result is {page=3}
There are plenty of answers which work for your query as you've indicated when it has single parameter definitions. In some applications it may be useful to handle a few extra query parameter edge cases such as:
a list of parameter values, such as param1&param1=value&param1=, meaning param1 is set to List.of("", "value", "")
invalid permutations, such as querypath?&=&&=noparamname&
an empty string rather than null in the map, so a= means "a" is List.of(""), to match web servlet handling
This uses a Stream with filters and groupingBy to collect to Map<String, List<String>>:
public static Map<String, List<String>> getParameterValues(URL url) {
return Arrays.stream(url.getQuery().split("&"))
.map(s -> s.split("="))
// filter out empty parameter names (as in Tomcat) "?&=&&=value&":
.filter(arr -> arr.length > 0 && arr[0].length() > 0)
.collect(Collectors.groupingBy(arr -> URLDecoder.decode(arr[0], StandardCharsets.UTF_8),
// drop this line for not-name definition order Map:
LinkedHashMap::new,
Collectors.mapping(arr -> arr.length < 2 ? "" : URLDecoder.decode(arr[1], StandardCharsets.UTF_8), Collectors.toList())));
}
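The edge cases listed above can be exercised with a small stand-alone sketch. It is a variation, not the exact method above: it takes the raw query string (so a null query can be handled), and it swaps in `split("=", 2)` so that a value containing `=` is kept whole. The class name `EdgeCaseQueryParser` is hypothetical, and Java 10+ is assumed:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class EdgeCaseQueryParser {

    static Map<String, List<String>> parse(String rawQuery) {
        if (rawQuery == null || rawQuery.isEmpty()) {
            return Map.of();
        }
        return Arrays.stream(rawQuery.split("&"))
                // limit 2: only the first '=' separates name and value
                .map(s -> s.split("=", 2))
                // drop entries with an empty parameter name, e.g. "", "=" or "=value"
                .filter(kv -> !kv[0].isEmpty())
                .collect(Collectors.groupingBy(
                        kv -> URLDecoder.decode(kv[0], StandardCharsets.UTF_8),
                        LinkedHashMap::new, // keep parameters in definition order
                        Collectors.mapping(
                                // a missing value becomes "" rather than null
                                kv -> kv.length < 2 ? "" : URLDecoder.decode(kv[1], StandardCharsets.UTF_8),
                                Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(parse("param1&param1=value&param1="));
        System.out.println(parse("&=&&=noparamname&"));
        System.out.println(parse("token=a=b"));
    }
}
```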
If you are using Spring, add an argument of type @RequestParam Map<String,String> to your controller method, and Spring will construct the map for you!
Just an update to the Java 8 version
public Map<String, List<String>> splitQuery(URL url) {
if (Strings.isNullOrEmpty(url.getQuery())) {
return Collections.emptyMap();
}
return Arrays.stream(url.getQuery().split("&"))
.map(this::splitQueryParameter)
.collect(Collectors.groupingBy(SimpleImmutableEntry::getKey, LinkedHashMap::new, Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
}
The mapping and toList() methods have to be qualified with Collectors (or statically imported), which was not mentioned in the top answer. Otherwise the code does not compile.
Answering here because this is a popular thread. This is a clean solution in Kotlin that uses the recommended UrlQuerySanitizer api. See the official documentation. I have added a string builder to concatenate and display the params.
var myURL: String? = null
if (intent.hasExtra("my_value")) {
myURL = intent.extras.getString("my_value")
} else {
myURL = intent.dataString
}
val sanitizer = UrlQuerySanitizer(myURL)
// We don't want to manually define every expected query *key*, so we set this to true
sanitizer.allowUnregisteredParamaters = true
val parameterNamesToValues: List<UrlQuerySanitizer.ParameterValuePair> = sanitizer.parameterList
val parameterIterator: Iterator<UrlQuerySanitizer.ParameterValuePair> = parameterNamesToValues.iterator()
// Helper simply so we can display all values on screen
val stringBuilder = StringBuilder()
while (parameterIterator.hasNext()) {
val parameterValuePair: UrlQuerySanitizer.ParameterValuePair = parameterIterator.next()
val parameterName: String = parameterValuePair.mParameter
val parameterValue: String = parameterValuePair.mValue
// Append string to display all key value pairs
stringBuilder.append("Key: $parameterName\nValue: $parameterValue\n\n")
}
// Set a textView's text to display the string
val paramListString = stringBuilder.toString()
val textView: TextView = findViewById(R.id.activity_title) as TextView
textView.text = "Paramlist is \n\n$paramListString"
// to check if the url has specific keys
if (sanitizer.hasParameter("type")) {
val type = sanitizer.getValue("type")
println("sanitizer has type param $type")
}
Here is my solution with reduce and Optional:
private Optional<SimpleImmutableEntry<String, String>> splitKeyValue(String text) {
String[] v = text.split("=");
if (v.length == 1 || v.length == 2) {
String key = URLDecoder.decode(v[0], StandardCharsets.UTF_8);
String value = v.length == 2 ? URLDecoder.decode(v[1], StandardCharsets.UTF_8) : null;
return Optional.of(new SimpleImmutableEntry<String, String>(key, value));
} else
return Optional.empty();
}
private HashMap<String, String> parseQuery(URI uri) {
HashMap<String, String> params = Arrays.stream(uri.getQuery()
.split("&"))
.map(this::splitKeyValue)
.filter(Optional::isPresent)
.map(Optional::get)
.reduce(
// initial value
new HashMap<String, String>(),
// accumulator
(map, kv) -> {
map.put(kv.getKey(), kv.getValue());
return map;
},
// combiner
(a, b) -> {
a.putAll(b);
return a;
});
return params;
}
I ignore duplicate parameters (I take the last one).
I use Optional<SimpleImmutableEntry<String, String>> so that garbage can be filtered out later.
The reduction starts with an empty map, then populates it with each SimpleImmutableEntry.
In case you ask, reduce requires this weird combiner in the last parameter, which is only used in parallel streams. Its goal is to merge two intermediate results (here HashMap).
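On the combiner remark: `reduce` is meant for combining immutable values, while the three-argument `Stream.collect` is the form intended for filling a mutable container such as a HashMap. A minimal sketch of the same last-value-wins parsing written with `collect`, assuming Java 10+; the class name `LastValueWins` is hypothetical, and unlike the code above it takes the raw query string and splits with `split("=", 2)`:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;

final class LastValueWins {

    static HashMap<String, String> parse(String rawQuery) {
        if (rawQuery == null || rawQuery.isEmpty()) {
            return new HashMap<>();
        }
        return Arrays.stream(rawQuery.split("&"))
                .map(pair -> pair.split("=", 2))
                .filter(kv -> !kv[0].isEmpty()) // skip entries without a name
                .collect(
                        // supplier: a fresh map for the stream (one per chunk if parallel)
                        HashMap::new,
                        // accumulator: a later put overwrites an earlier value for the same key
                        (map, kv) -> map.put(
                                URLDecoder.decode(kv[0], StandardCharsets.UTF_8),
                                kv.length == 2 ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : null),
                        // combiner: only used to merge partial maps of a parallel stream
                        HashMap::putAll);
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1&b=2&a=3"));
    }
}
```

Here the supplier creates a new map for every evaluation, so no shared identity object is mutated.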
If you happen to have cxf-core on the classpath and you know you have no repeated query params, you may want to use UrlUtils.parseQueryString.
The Eclipse Jersey REST framework supports this through UriComponent. Example:
import org.glassfish.jersey.uri.UriComponent;
String uri = "https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback";
MultivaluedMap<String, String> params = UriComponent.decodeQuery(URI.create(uri), true);
for (String key : params.keySet()) {
System.out.println(key + ": " + params.getFirst(key));
}
If you just want the parameters after the URL from a String, the following code will work. It assumes a simple URL, with no strict checking and no decoding. In one of my test cases, for example, I had the URL and knew I only needed the parameter values; the URL was simple, so no encoding or decoding was needed.
String location = "https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback";
String location1 = "https://stackoverflow.com?param1=value1&param2=value2&param3=value3";
String location2 = "https://stackoverflow.com?param1=value1&param2=&param3=value3&param3";
Map<String, String> paramsMap = Stream.of(location)
.filter(l -> l.indexOf("?") != -1)
.map(l -> l.substring(l.indexOf("?") + 1, l.length()))
.flatMap(q -> Pattern.compile("&").splitAsStream(q))
.map(s -> s.split("="))
.filter(a -> a.length == 2)
.collect(Collectors.toMap(
a -> a[0],
a -> a[1],
(existing, replacement) -> existing + ", " + replacement,
LinkedHashMap::new
));
System.out.println(paramsMap);
Thanks
This seems to me the tidiest way:
static Map<String, String> decomposeQueryString(String query, Charset charset) {
return Arrays.stream(query.split("&"))
.map(pair -> pair.split("=", 2))
.collect(Collectors.toMap(
pair -> URLDecoder.decode(pair[0], charset),
pair -> pair.length > 1 ? URLDecoder.decode(pair[1], charset) : "") // Collectors.toMap rejects null values, so a missing value becomes ""
);
}
The prerequisite is that your query syntax does not allow repeated parameters.
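If repeated parameters can occur after all, the two-argument `Collectors.toMap` throws an `IllegalStateException` on the second occurrence of a key. A hedged sketch of one way around it, passing a merge function that keeps the last value; the class name `DecomposeWithMerge` is hypothetical, Java 10+ is assumed, and a missing value is mapped to an empty string because `Collectors.toMap` does not accept null values:

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class DecomposeWithMerge {

    static Map<String, String> decompose(String query, Charset charset) {
        return Arrays.stream(query.split("&"))
                .map(pair -> pair.split("=", 2))
                .collect(Collectors.toMap(
                        pair -> URLDecoder.decode(pair[0], charset),
                        // toMap rejects null values, so "no value" is represented as ""
                        pair -> pair.length > 1 ? URLDecoder.decode(pair[1], charset) : "",
                        // merge function: on a repeated key, keep the later value
                        (earlier, later) -> later,
                        // LinkedHashMap keeps keys in first-seen order
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(decompose("a=1&a=2&b", StandardCharsets.UTF_8));
    }
}
```

Swapping the merge function for `(earlier, later) -> earlier` keeps the first value instead.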
The Hutool framework supports this through HttpUtil. Example:
import cn.hutool.http.HttpUtil;
String url ="https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback";
Map<String, List<String>> stringListMap = HttpUtil.decodeParams(url, "UTF-8");
System.out.println("decodeParams:" + stringListMap);
You will get:
decodeParams:{client_id=[SS], response_type=[code], scope=[N_FULL], access_type=[offline], redirect_uri=[http://localhost/Callback]}
A Kotlin version of the answer provided by matthias:
fun decomposeQueryString(query: String, charset: Charset): Map<String, String?> {
return if (query.split("?").size <= 1)
emptyMap()
else {
query.split("?")[1]
.split("&")
.map { it.split(Pattern.compile("="), 2) }
.associate {
Pair(
URLDecoder.decode(it[0], charset.name()),
if (it.size > 1) URLDecoder.decode(it[1], charset.name()) else null
)
}
}
}
This also takes care of the first parameter after the question mark '?'.
Plain Java, No Special Libraries, Nothing Fancy
// assumes you are parsing a line that looks like:
// /path/resource?key=value&parameter=value
// which you got from a request header line that looks like:
// GET /path/resource?key=value&parameter=value HTTP/1.1
public Map<String, String> parseQuery(String path){ // return type matches kvMap, which is declared as Map
if(path == null || path.isEmpty()){ //basic sanity check
return null;
}
int indexOfQ = path.indexOf("?"); //where the query string starts
if(indexOfQ == -1){return null;} //check query exists
String queryString = path.substring(indexOfQ + 1);
String[] queryStringArray = queryString.split("&");
Map<String, String> kvMap = new HashMap<>();
for(String kvString : queryStringArray){
int indexOfE = kvString.indexOf("="); //check query is formed correctly
if(indexOfE == -1 || indexOfE == 0){return null;}
String[] kvPairArray = kvString.split("=");
kvMap.put(kvPairArray[0], kvPairArray[1]);
}
return kvMap;
}
org.keycloak.common.util.UriUtils
I had to parse URIs and query parameters in a Keycloak extension and found these utility classes very useful:
org.keycloak.common.util.UriUtils:
static MultivaluedHashMap<String,String> decodeQueryString(String queryString)
There is also a useful method to delete one query parameter:
static String stripQueryParam(String url, String name)
And to parse the URL there is
org.keycloak.common.util.KeycloakUriBuilder:
KeycloakUriBuilder uri(String uriTemplate)
String getQuery()
and lots of other goodies.

Empty data is returned when querying using Kafka tumbling window

I'm trying to query the state store to get the data in a window of 5 minutes, using a tumbling window. I have added a REST endpoint to query the data.
I have stream A, which consumes data from topic1, performs some transformations, and outputs a key-value pair to topic2.
In stream B, I'm doing a tumbling window operation on the topic2 data. When I run the code and query it using REST, I see empty data in my browser, even though I can see data flowing into the state store.
What I've observed is that if, instead of topic2 getting its data from stream A, I use a producer class to inject the data into topic2, I am able to query the data from the browser. But when topic2 gets its data from stream A, I get empty data.
Here is my stream A code :
public static void main(String[] args) {
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("topic1");
KStream<String, String> output = source
.map((k,v)->
{
Map<String, Object> Fields = new LinkedHashMap<>();
Fields.put("FNAME","ABC");
Fields.put("LNAME","XYZ");
Map<String, Object> nFields = new LinkedHashMap<>();
nFields.put("ADDRESS1","HY");
nFields.put("ADDRESS2","BA");
nFields.put("addF",Fields);
Map<String, Object> eve = new LinkedHashMap<>();
eve.put("nFields", nFields);
Map<String, Object> fevent = new LinkedHashMap<>();
fevent.put("eve", eve);
LinkedHashMap<String, Object> newMap = new LinkedHashMap<>(fevent);
return new KeyValue<>("JAY1234",newMap.toString());
});
output.to("topic2");
}
Here is my stream B code (where the tumbling window operation happens):
public static void main(String[] args) {
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> eventStream = builder.stream("topic2");
eventStream.groupByKey()
.windowedBy(TimeWindows.of(300000))
.reduce((v1, v2) -> v1 + ";" + v2, Materialized.as("TumblingWindowPoc"));
final Topology topology = builder.build();
KafkaStreams streams = new KafkaStreams(topology, props);
streams.start();
}
REST code :
@GET()
@Path("/{storeName}/{key}")
@Produces(MediaType.APPLICATION_JSON)
public List<KeyValue<String, String>> windowedByKey(@PathParam("storeName") final String storeName,
@PathParam("key") final String key) {
final ReadOnlyWindowStore<String, String> store = streams.store(storeName,
QueryableStoreTypes.<String, String>windowStore());
if (store == null) {
throw new NotFoundException(); }
long timeTo = System.currentTimeMillis();
long timeFrom = timeTo - 30000;
final WindowStoreIterator<String> results = store.fetch(key, timeFrom, timeTo);
final List<KeyValue<String,String>> windowResults = new ArrayList<>();
while (results.hasNext()) {
final KeyValue<Long, String> next = results.next();
windowResults.add(new KeyValue<String,String>(key + "#" + next.key, next.value));
}
return windowResults;
}
And this is what my key-value data looks like:
JAY1234 {eve = {nFields = {ADDRESS1 = HY,ADDRESS2 = BA,Fields = {FNAME = ABC,LNAME = XYZ,}}}}
I should be able to get the data when querying using REST. Any help is greatly appreciated.
Thanks!
To fetch the window, timeFrom should be before the window start. So if you want the data for the last 30 seconds, you can subtract the window duration when fetching, like timeTo - 30000 - 300000, and then filter the required events out of the whole window's data.
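To see why the extra subtraction matters, here is a small plain-Java sketch with no Kafka dependency. It assumes, as the advice above implies, that the window store matches windows by their start timestamp and that tumbling windows are aligned to the epoch. The class and method names are made up for illustration, and the "now" value is arbitrary.

```java
public class WindowFetchRange {

    static final long WINDOW_SIZE_MS = 300_000L; // 5-minute tumbling window, as in stream B
    static final long LOOKBACK_MS = 30_000L;     // "last 30 seconds", as in the REST code

    // Start of the tumbling window that contains the given timestamp
    // (assumes windows are aligned to the epoch).
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_SIZE_MS);
    }

    // Widened lower bound: go back one extra window size.
    static long widenedTimeFrom(long timeTo) {
        return timeTo - LOOKBACK_MS - WINDOW_SIZE_MS;
    }

    // True if a window starting at 'start' falls inside [timeFrom, timeTo].
    static boolean inRange(long start, long timeFrom, long timeTo) {
        return start >= timeFrom && start <= timeTo;
    }

    public static void main(String[] args) {
        long timeTo = 1_000_000L;          // arbitrary "now" in milliseconds
        long start = windowStart(timeTo);  // 900000: the current window began 100 s ago

        // Original query range: the window start lies before timeFrom, so nothing matches.
        System.out.println(inRange(start, timeTo - LOOKBACK_MS, timeTo));     // false

        // Widened range: the current window's start is now covered.
        System.out.println(inRange(start, widenedTimeFrom(timeTo), timeTo));  // true
    }
}
```

With the narrow range the query only returns data during the first 30 seconds of each 5-minute window, which would look like "empty data" most of the time.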

AWS Firehose Transformation lambda putting all messages in same s3 folder

I have a Kinesis stream and created a Firehose delivery stream that saves all the data to S3; it was saving correctly into hourly folders. Then I wrote a Firehose transformation Lambda, and after deploying it all the messages go to the same folder. I am not sure what I am missing. The response from my Lambda function has the fields below:
result.put("recordId", record.getRecordId());
result.put("result", "Ok");
result.put("approximateArrivalEpoch", record.getApproximateArrivalEpoch());
result.put("approximateArrivalTimestamp",record.getApproximateArrivalTimestamp());
result.put("kinesisRecordMetadata", record.getKinesisRecordMetadata());
result.put("data", Base64.getEncoder().encodeToString(jsonData.getBytes()));
Edit:
Here is my code in Java. I am using KinesisFirehoseEvent; decoding was not needed in my case, since KinesisFirehoseEvent already gives me a ByteBuffer.
public JSONObject handler(KinesisFirehoseEvent kinesisFirehoseEvent, Context context) {
final LambdaLogger logger = context.getLogger();
final JSONArray resultArray = new JSONArray();
for (final KinesisFirehoseEvent.Record record: kinesisFirehoseEvent.getRecords()) {
final byte[] data = record.getData().array();
final Optional<TestData> testData = deserialize(data, logger);
if (testData.isPresent()) {
final JSONObject jsonObj = new JSONObject();
final String jsonData = gson.toJson(testData.get());
jsonObj.put("recordId", record.getRecordId());
jsonObj.put("result", "Ok");
jsonObj.put("approximateArrivalEpoch", record.getApproximateArrivalEpoch());
jsonObj.put("approximateArrivalTimestamp", record.getApproximateArrivalTimestamp());
jsonObj.put("kinesisRecordMetadata", record.getKinesisRecordMetadata());
jsonObj.put("data", Base64.getEncoder().encodeToString
(jsonData.getBytes()));
resultArray.add(jsonObj);
}
else {
logger.log("testData not deserialized");
}
}
final JSONObject jsonFinalObj = new JSONObject();
jsonFinalObj.put("records", resultArray);
return jsonFinalObj;
}
The data returned by the Lambda function is not in the correct format.
Check out the example below:
'use strict';
console.log('Loading function');
/* Stock Ticker format parser */
const parser = /^\{\"TICKER_SYMBOL\"\:\"[A-Z]+\"\,\"SECTOR\"\:"[A-Z]+\"\,\"CHANGE\"\:[-.0-9]+\,\"PRICE\"\:[-.0-9]+\}/;
exports.handler = (event, context, callback) => {
let success = 0; // Number of valid entries found
let failure = 0; // Number of invalid entries found
let dropped = 0; // Number of dropped entries
/* Process the list of records and transform them */
const output = event.records.map((record) => {
const entry = Buffer.from(record.data, 'base64').toString('utf8');
let match = parser.exec(entry);
if (match) {
let parsed_match = JSON.parse(match);
var milliseconds = new Date().getTime();
/* Add timestamp and convert to CSV */
const result = `${milliseconds},${parsed_match.TICKER_SYMBOL},${parsed_match.SECTOR},${parsed_match.CHANGE},${parsed_match.PRICE}`+"\n";
const payload = Buffer.from(result, 'utf8').toString('base64');
if (parsed_match.SECTOR != 'RETAIL') {
/* Dropped event, notify and leave the record intact */
dropped++;
return {
recordId: record.recordId,
result: 'Dropped',
data: record.data,
};
}
else {
/* Transformed event */
success++;
return {
recordId: record.recordId,
result: 'Ok',
data: payload,
};
}
}
else {
/* Failed event, notify the error and leave the record intact */
console.log("Failed event : "+ record.data);
failure++;
return {
recordId: record.recordId,
result: 'ProcessingFailed',
data: record.data,
};
}
/* This transformation is the "identity" transformation, the data is left intact
return {
recordId: record.recordId,
result: 'Ok',
data: record.data,
} */
});
console.log(`Processing completed. Successful records ${output.length}.`);
callback(null, { records: output });
};
The documentation below has more details on the return data format:
https://aws.amazon.com/blogs/compute/amazon-kinesis-firehose-data-transformation-with-aws-lambda/
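Since the question is in Java, here is a rough Java sketch of the same response shape as the JavaScript example: each returned record carries only recordId, result and data. It uses plain maps rather than any particular JSON library, and the class and method names are made up for illustration, so treat it as an outline rather than a drop-in handler.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirehoseResponseSketch {

    // One successfully transformed record: id, status and Base64-encoded payload.
    static Map<String, Object> okRecord(String recordId, String transformedJson) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("recordId", recordId); // echoes the id of the incoming record
        record.put("result", "Ok");
        record.put("data", Base64.getEncoder()
                .encodeToString(transformedJson.getBytes(StandardCharsets.UTF_8)));
        return record;
    }

    // Mirrors the 'ProcessingFailed' branch of the JavaScript example:
    // the record is still returned, with its original data left intact.
    static Map<String, Object> failedRecord(String recordId, String originalBase64Data) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("recordId", recordId);
        record.put("result", "ProcessingFailed");
        record.put("data", originalBase64Data);
        return record;
    }

    // Wraps the per-record results in the top-level "records" list.
    static Map<String, Object> response(List<Map<String, Object>> records) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("records", records);
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<>();
        records.add(okRecord("record-1", "{\"id\":1}\n"));
        records.add(failedRecord("record-2", "bm90LWpzb24="));
        System.out.println(response(records));
    }
}
```

The difference from the handler in the question is that records which fail to deserialize are returned as ProcessingFailed instead of being skipped, the same way the JavaScript example handles them.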
Hope it helps.
I got this working using the above code only; it looks like the stream is just slow, so the data for the new hours hadn't arrived yet.

Apache Spark merge after updateStateByKey()

I'm trying to merge two streams, and one of them should be stateful (like static data with infrequent updates):
SparkConf conf = new SparkConf().setAppName("Test Application").setMaster("local[*]");
JavaStreamingContext context = new JavaStreamingContext(conf, Durations.seconds(10));
context.checkpoint(".");
JavaDStream<String> dataStream = context.socketTextStream("localhost", 9998);
JavaDStream<String> refDataStream = context.socketTextStream("localhost", 9999);
JavaPairDStream<String, String> pairDataStream = dataStream.mapToPair(e -> {
String[] tmp = e.split(" ");
return new Tuple2<>(tmp[0], tmp[1]);
});
JavaPairDStream<String, String> pairRefDataStream = refDataStream.mapToPair(e -> {
String[] tmp = e.split(" ");
return new Tuple2<>(tmp[0], tmp[1]);
}).updateStateByKey((Function2<List<String>, Optional<String>, Optional<String>>) (strings, stringOptional) -> {
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return Optional.absent();
});
pairDataStream.join(pairRefDataStream).print();
context.start();
context.awaitTermination();
When I write 1 aaa into the first stream and 1 111 into the second immediately afterwards, everything works fine and I see the result of the merge. But when I write 1 bbb into the first stream one minute later, I see nothing.
Do I understand correctly what updateStateByKey() does? Or am I wrong?
updateStateByKey does exactly what you ask it to. In particular, if the current batch contains no data for the key (strings.isEmpty()), you instruct it to forget the state (return Optional.absent();):
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return Optional.absent();
while what you probably want is to return the previous state:
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return stringOptional;
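The difference between the two versions can be seen without Spark. Below is a minimal plain-Java sketch that runs both update functions over one batch with data followed by one empty batch. It uses java.util.Optional as a stand-in for the Optional class used in the Spark code (so Optional.empty() plays the role of Optional.absent()), and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class KeepStateSketch {

    // Original update logic: the state is dropped when the batch has no values.
    static Optional<String> forgetting(List<String> values, Optional<String> state) {
        if (!values.isEmpty()) {
            return Optional.of(values.get(0));
        }
        return Optional.empty(); // stand-in for Optional.absent()
    }

    // Corrected update logic: an empty batch keeps the previous state.
    static Optional<String> keeping(List<String> values, Optional<String> state) {
        if (!values.isEmpty()) {
            return Optional.of(values.get(0));
        }
        return state;
    }

    public static void main(String[] args) {
        List<String> batch1 = Arrays.asList("111");    // "1 111" arrives
        List<String> batch2 = Collections.emptyList(); // later batch: nothing for key 1

        Optional<String> a = forgetting(batch2, forgetting(batch1, Optional.empty()));
        Optional<String> b = keeping(batch2, keeping(batch1, Optional.empty()));

        System.out.println("forgetting: " + a); // forgetting: Optional.empty
        System.out.println("keeping: " + b);    // keeping: Optional[111]
    }
}
```

With the first version there is nothing left to join against by the time 1 bbb arrives, which matches what the question describes.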

Output matrix correctly in Spark Java

I would like to know how to get the correct output: I want the output to have the same format as the input. I'm just not quite sure how to map a RowMatrix to produce this output.
Input File
0,0,0.0
0,1,1.0
0,2,2.0
0,3,3.0
0,4,4.0
1,0,5.0
1,1,6.0
1,2,7.0
1,3,8.0
1,4,9.0
Code
String inputPathA = "data/At.txt";
SparkConf conf = new SparkConf().setMaster("local");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> fileA = sc.textFile(inputPathA);
JavaRDD<MatrixEntry> matrixA = fileA.map(new Function<String, MatrixEntry>() {
public MatrixEntry call(String x){
String[] indeceValue = x.split(",");
long i = Long.parseLong(indeceValue[0]);
long j = Long.parseLong(indeceValue[1]);
double value = Double.parseDouble(indeceValue[2]);
return new MatrixEntry(i, j, value );
}
});
CoordinateMatrix cooMatrixA = new CoordinateMatrix(matrixA.rdd());
BlockMatrix matA = cooMatrixA.toBlockMatrix();
BlockMatrix ata = matA.transpose().multiply(matA);
IndexedRowMatrix id = ata.toIndexedRowMatrix();
RowMatrix rm = id.toRowMatrix();
RDD<Vector> result = rm.rows();
result.saveAsTextFile("data/output1");
The output I get:
(5,[0,1,2,3,4],[45.0,58.0,71.0,84.0,97.0])
(5,[0,1,2,3,4],[25.0,30.0,35.0,40.0,45.0])
(5,[0,1,2,3,4],[30.0,37.0,44.0,51.0,58.0])
(5,[0,1,2,3,4],[40.0,51.0,62.0,73.0,84.0])
(5,[0,1,2,3,4],[35.0,44.0,53.0,62.0,71.0])
How do I map that correctly in Spark (Java) to be the same as my input?
RowMatrix has no meaningful row indices, so it cannot be converted back to the same shape as the input. Instead, you can simply convert the BlockMatrix back to a CoordinateMatrix and prepare a JavaRDD<String>, which can be saved:
JavaRDD<MatrixEntry> entries = ata.toCoordinateMatrix().entries().toJavaRDD();
JavaRDD<String> output = entries.map(new Function<MatrixEntry, String>() {
public String call(MatrixEntry e) {
return String.format("%d,%d,%s", e.i(), e.j(), e.value());
}
});
output.saveAsTextFile("data/output1");
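If you want something to check the saved file against, here is a plain-Java sketch (no Spark) that computes the same transpose-times-matrix product for the sample input and formats it with the same "%d,%d,%s" pattern. The class and method names are made up for illustration, and the lines Spark writes may well come out in a different order than this loop produces.

```java
import java.util.ArrayList;
import java.util.List;

public class CoordinateFormatSketch {

    // Computes A^T * A for a dense matrix and returns it as "i,j,value" lines.
    static List<String> ataAsLines(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < cols; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < rows; k++) {
                    sum += a[k][i] * a[k][j]; // (A^T A)[i][j] = sum over k of A[k][i] * A[k][j]
                }
                lines.add(String.format("%d,%d,%s", i, j, sum));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // The 2 x 5 matrix from the input file in the question.
        double[][] a = {
            {0.0, 1.0, 2.0, 3.0, 4.0},
            {5.0, 6.0, 7.0, 8.0, 9.0}
        };
        for (String line : ataAsLines(a)) {
            System.out.println(line); // first line: 0,0,25.0
        }
    }
}
```

The first five lines it prints (25.0, 30.0, 35.0, 40.0, 45.0 in the value column) correspond to the second vector in the RowMatrix output shown in the question.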
