Convert first character of string to uppercase using Java 8 lambdas only - java

I want to write a basic program that converts the first character of a string to uppercase using lambdas.
Input
singhakash
Output
Singhakash
I tried
String st = "singhakash";
//approach 1
System.out.print(st.substring(0, 1).toUpperCase());
st.substring(1).codePoints()
.forEach(e -> System.out.print((char) e));
System.out.println();
//approach 2
System.out.print(st.substring(0, 1).toUpperCase());
IntStream.range(0, st.length())
.filter(i -> i > 0)
.mapToObj(st::charAt)
.forEach(System.out::print);
But in both cases I have to print the first character separately. Is there any way I can do that without a separate print statement?
Note: I can do that normally with a loop or any other approach, but I am looking for a lambdas-only solution.
Thanks

You could do it like this:
String st = "singhakash";
IntStream.range(0, st.length())
.mapToObj(i -> i == 0 ? Character.toUpperCase(st.charAt(i)) : st.charAt(i))
.forEach(System.out::print);

The simplest way to do it would be
String result = Character.toUpperCase(st.charAt(0))+st.substring(1);
If you feel like you have to optimize it, i.e. reduce the number of copying operations (instead of letting the JVM do it), you may use:
StringBuilder sb=new StringBuilder(st);
sb.setCharAt(0, Character.toUpperCase(sb.charAt(0)));
String result=sb.toString();
But if it really has to be done using the fancy new Java 8 feature, you can use
String result=IntStream.concat(
IntStream.of(st.codePointAt(0)).map(Character::toUpperCase), st.codePoints().skip(1) )
.collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
.toString();
This solution will even handle supplementary code points correctly, so it has an advantage over the simple solutions (though it would not be too hard to make those supplementary-code-point aware too).
If you want to print directly, you can use
IntStream.concat(
IntStream.of(st.codePointAt(0)).map(Character::toUpperCase), st.codePoints().skip(1))
.forEach(cp -> System.out.print(Character.toChars(cp)));
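As a follow-up to the remark that the simple solutions could be made supplementary-code-point aware, here is a minimal sketch of one way to do that without streams. The class and method names are made up for illustration, and the null/empty handling is an assumption that the original answers do not discuss.

```java
final class CapitalizeCodePoint {

    // Hypothetical helper: upper-cases the first code point (not the first char),
    // so a leading surrogate pair is treated as one character.
    static String capitalizeFirst(String s) {
        if (s == null || s.isEmpty()) {
            return s; // assumption: nothing to capitalize, return input unchanged
        }
        int first = s.codePointAt(0);               // may span two chars
        int firstLength = Character.charCount(first); // 1 or 2
        return new StringBuilder(s.length())
                .appendCodePoint(Character.toUpperCase(first))
                .append(s, firstLength, s.length())  // rest of the string, unchanged
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeFirst("singhakash"));
    }
}
```

Like `Character.toUpperCase` in the answers above, this uses a one-to-one case mapping, so characters whose uppercase form is more than one character long are not expanded.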

String is immutable in Java. Just uppercase the first character, and append the rest. Something like,
System.out.println(Character.toUpperCase(st.charAt(0)) + st.substring(1));

st.replaceFirst(st.subSequence(0,1).toString(),st.subSequence(0,1).toString().toUpperCase()).codePoints().forEach(e -> System.out.print((char)e));

Related

How to append a separator string when using System.out::print to print String[]?

I am trying to learn and get comfortable using lambdas, streams and method references - all the new-fangled Java 11 stuff.
I want to sort and print out the array of strings, keeping the null intact.
In the last line, how can I print the names with spaces?
(a general case would be printing each name with a message).
The current output is:
nullGoyleMalfoyCrabbe
I want
null Goyle Malfoy Crabbe
without a space after the last element.
ArrayList<String> enemies = new ArrayList<>(Arrays.asList("Malfoy", "Crabbe", "Goyle", null));
List<String> es = enemies.stream().map(object -> Objects.toString(object,null)).collect(Collectors.toList());
String[] as = es.toArray(String[]::new);
Arrays.sort(as, Comparator.nullsFirst((a,b) -> a.length() - b.length()));
Arrays.stream(as).forEach(System.out::print);
The simplest solution would be to add a whitespace after each list entry:
Arrays.stream(as).map(s -> s + " ").forEach(System.out::print);
But this will also add a whitespace at the end.
Since Java 8 there's StringJoiner which can be configured with a separator, prefix and suffix:
StringJoiner s = new StringJoiner("my separator", "my prefix", "my suffix");
s.add("str 1");
s.add("str 2");
System.out.print(s.toString());
There's also a Collector that can join Streams (which uses StringJoiner under the hood):
String s = myStream.collect(Collectors.joining("my separator", "my prefix", "my suffix"));
System.out.print(s);
StringJoiner only adds the separator between the elements, not at the end. If you want to add something at the end with StringJoiner, you must add a suffix.
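To make the separator, prefix and suffix behaviour concrete, here is a small sketch with real values in place of the placeholders. The class and method names are illustrative only.

```java
import java.util.StringJoiner;

final class JoinerDemo {

    // Joins the items with ", " between them and wraps the result in brackets.
    static String bracketed(String... items) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String item : items) {
            joiner.add(item);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketed("a", "b", "c"));
    }
}
```

The separator appears only between elements, while the prefix and suffix are emitted once each, even when nothing was added.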
All you need is a join:
System.out.print(String.join(" ", as));
You can use a lambda expression like below (note that this also prints a space after the last element):
Arrays.stream(as).forEach(enem -> System.out.print(enem + " "));
One of the nice things about streams (IMHO) is the method chaining. I would try something like this, doing everything within the stream pipeline:
String output = Stream.of("Malfoy", "Crabbe", "Goyle", null)
.map(object -> Objects.toString(object,null))
.sorted(Comparator.nullsFirst(Comparator.comparingInt(String::length)))
.collect(Collectors.joining(" "));
System.out.format("Result is \"%s\".%n", output);
Output is:
Result is "null Goyle Malfoy Crabbe".
Notice that there is no space after Crabbe.
If you already have an array, you can pass it to either Arrays.stream() or Stream.of() to create the first stream.
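The array variant mentioned above might look like the following sketch. The class and method names are invented for the example, and the explicit `Objects.toString` step is an assumption added so that a null element is rendered as the text "null" regardless of how the joining collector treats nulls.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Objects;
import java.util.stream.Collectors;

final class JoinSortedArray {

    // Sorts by length with nulls first, then joins with single spaces.
    static String joinByLength(String[] names) {
        return Arrays.stream(names)
                .sorted(Comparator.nullsFirst(Comparator.comparingInt(String::length)))
                .map(name -> Objects.toString(name)) // null becomes "null"
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String[] as = {"Malfoy", "Crabbe", "Goyle", null};
        System.out.println(joinByLength(as));
    }
}
```

Because the sort is stable for an ordered stream, names of equal length keep their original relative order.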

How do I search for a list of strings inside another string?

Here is some code that works, but looks inelegant. What is a better way to search for any occurrence of these strings inside another string?
String AndyDaltonInjury = "broken right thumb";
if (AndyDaltonInjury.toLowerCase().contains("broken") &&
(AndyDaltonInjury.toLowerCase().contains("knee") ||
AndyDaltonInjury.toLowerCase().contains("leg") ||
AndyDaltonInjury.toLowerCase().contains("ankle") ||
AndyDaltonInjury.toLowerCase().contains("thumb") ||
AndyDaltonInjury.toLowerCase().contains("wrist")))
{
System.out.println("Marvin sends in the backup quarterback.");
}
Put the words into a Set, split the input string on the space (" ") delimiter, and stream the resulting array, testing each token with Set::contains:
Set<String> set = new HashSet<>(Arrays.asList("knee", "leg", "ankle", "thumb", "wrist"));
String lower = "broken right thumb".toLowerCase();
String split[] = lower.split(" ");
if (lower.contains("broken") && Arrays.stream(split).anyMatch(set::contains)) {
System.out.println("Marvin sends in the backup quarterback.");
}
Moreover, I highly recommend starting variable names with a lowercase letter.
As an alternative to an already posted Set-based solution (which I find better by the way, in the sense of readability), this can be done using a regular expression:
final Pattern brokeStuffPattern = Pattern.compile(
".*\\bbroken?\\b.*\\b(?:knee|leg|ankle|thumb|wrist)s?\\b.*"
+ "|.*\\b(?:knee|leg|ankle|thumb|wrist)s?\\b.*\\bbroken?\\b.*",
Pattern.CASE_INSENSITIVE
);
if (brokeStuffPattern.matcher(AndyDaltonInjury).matches()) {
...
}
This would account for plurals and the verb's perfect tense as well, e.g. it would match "broken legs".
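A short sketch for trying the pattern above against a few inputs. The pattern text is copied from the answer; the class name, constant name and sample strings are made up for the example.

```java
import java.util.regex.Pattern;

final class InjuryPatternDemo {

    // Same regular expression as in the answer: "broke"/"broken" plus a body part,
    // in either order, each matched as a whole word.
    static final Pattern BROKEN_LIMB = Pattern.compile(
            ".*\\bbroken?\\b.*\\b(?:knee|leg|ankle|thumb|wrist)s?\\b.*"
            + "|.*\\b(?:knee|leg|ankle|thumb|wrist)s?\\b.*\\bbroken?\\b.*",
            Pattern.CASE_INSENSITIVE);

    static boolean matches(String text) {
        return BROKEN_LIMB.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"broken right thumb", "Broken legs", "sore thumb", "unbroken leg"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

The word boundaries mean that "unbroken" does not count as "broken", which the plain `contains` checks in the question would accept.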
You could create the missing functions (contains all/any) as methods, or express them using lambda expressions:
BiPredicate<String, List<String>> containsAll = (text, words) ->
words.stream().allMatch(word -> text.toLowerCase().contains(word));
BiPredicate<String, List<String>> containsAny = (text, words) ->
words.stream().anyMatch(word -> text.toLowerCase().contains(word));
if (containsAll.test(AndyDaltonInjury, Arrays.asList("broken")) &&
containsAny.test(AndyDaltonInjury, Arrays.asList("knee", "leg", "ankle", "thumb", "wrist"))) {
System.out.println("Marvin sends in the backup quarterback.");
}
You can try this:
String test = "broken right thumb";
Predicate<? super String> matchCriteria = s -> Stream.of("knee", "leg", "ankle", "thumb", "wrist").anyMatch(e -> e.equals(s.toLowerCase()));
String result = Pattern.compile(" ").splitAsStream(test).anyMatch(matchCriteria) ? "Marvin sends in the backup quarterback." : "";
System.out.println(result);
Hash based algorithms are likely going to give you better performance if you need to check a lot of text against occurrences within a huge set.
HashSet would be a good first attempt as the search (test if key contained within the set) will be between O(1) and O(n).
However, I would strongly advise looking into the benefit of employing a Bloom filter. It will serve well as a prefilter as it gives a predictable performance of O(k), where k is the number of hash functions. Because the filter will produce a small percentage of false positives, you will need to run a second stage as well.
Look into the Guava BloomFilter for a good implementation.
Another benefit of the Bloom Filter is that it does not contain the original dataset, just a reduced hash, meaning that its size is minimal. This means that it is more suitable for distributed systems as it copies over very efficiently. In an environment like Apache Spark, you would even set this up as a Broadcast variable, as once produced it is typically constant in time.
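To illustrate the two-stage idea (a cheap prefilter followed by an exact check), here is a deliberately tiny sketch. It is not Guava's BloomFilter: the class, its hashing scheme and the sizes are all assumptions chosen to keep the example self-contained, and a real filter would use better hash functions and sizing.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TwoStageLookup {
    private final BitSet bits;       // stage 1: compact Bloom-style bit array
    private final int size;
    private final int hashCount;     // "k" in O(k)
    private final Set<String> exact; // stage 2: exact membership check

    TwoStageLookup(List<String> words, int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
        this.exact = new HashSet<>(words);
        for (String word : words) {
            for (int i = 0; i < hashCount; i++) {
                bits.set(index(word, i));
            }
        }
    }

    // Derives the i-th bit position from two values based on String.hashCode()
    // (illustrative double hashing, not a production-quality hash).
    private int index(String word, int i) {
        int h1 = word.hashCode();
        int h2 = (h1 >>> 16) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    // May return true for words that were never added, never false for added ones.
    boolean mightContain(String word) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(word, i))) {
                return false;
            }
        }
        return true;
    }

    // Only words that pass the prefilter reach the exact check.
    boolean contains(String word) {
        return mightContain(word) && exact.contains(word);
    }
}
```

In this sketch both stages live in one object for brevity; in a distributed setting only the bit array would be shipped around, with the exact check done wherever the full data set is available.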

Elegant way to make sure string is no longer than a given length

I want to know if there is a quicker and/or more elegant way to achieve this with Java 8.
I would like to have a string no longer than a max length (say 4 character max)
input "" -> ""
input null -> null
input abc -> abc
input abcde -> abcd
String truncate(String s) {
if (s == null)
return null;
if (s.length() > 4)
return s.substring(0, 4);
return s;
}
If you don't mind pulling in an external dependency, Apache Commons StringUtils is full of handy helper methods that do exactly this kind of thing, including a null-safe and length-safe substring.
https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringUtils.html#substring(java.lang.String, int)
I personally find with pure Java projects it is almost always beneficial to make use of Apache Commons for the myriad of helpers.
The above said, your code is easily readable, so I wouldn't go out of your way to change it.
If the string is not null, you can always take a substring between 0 and the minimum between 4 and the length of the string. Couple this with Java 8's Optional, and you can do this in single statement:
private static String noMoreThanFour(String str) {
return Optional.ofNullable(str)
.map(s -> s.substring(0, Math.min(4, s.length())))
.orElse(null);
}
One fewer return statement, keeping the if-else structure:
if(s==null || s.length()<=4){
return s;
}else return s.substring(0,4);
Another step would be to use a conditional expression, but it won't be more readable:
return (s==null || s.length()<=4)? s:s.substring(0,4);
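The variants above can be checked against the inputs listed in the question with a sketch like the following. The class name, the method name and the `maxLength` parameter (generalizing the fixed 4) are introduced here for illustration.

```java
final class Truncate {

    // Returns s unchanged when it is null or short enough,
    // otherwise its first maxLength characters.
    static String truncate(String s, int maxLength) {
        return (s == null || s.length() <= maxLength) ? s : s.substring(0, maxLength);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"", null, "abc", "abcde"}) {
            System.out.println(input + " -> " + truncate(input, 4));
        }
    }
}
```

Note that `substring` counts UTF-16 chars, so a cut can fall in the middle of a surrogate pair; whether that matters depends on the input.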

How to get lines before and after matching from java 8 stream like grep?

I have a text file that contains a lot of lines. If I want to find the lines before and after a match with grep, I do it like this:
grep -A 10 -B 10 "ABC" myfile.txt
How can I implement the equivalent in Java 8 using streams?
If you're willing to use a third party library and don't need parallelism, then jOOλ offers SQL-style window functions as follows
Seq.seq(Files.readAllLines(Paths.get(new File("/path/to/Example.java").toURI())))
.window(-1, 1)
.filter(w -> w.value().contains("ABC"))
.forEach(w -> {
System.out.println("-1:" + w.lag().orElse(""));
System.out.println(" 0:" + w.value());
System.out.println("+1:" + w.lead().orElse(""));
// ABC: Just checking
});
Yielding
-1: .window(-1, 1)
0: .filter(w -> w.value().contains("ABC"))
+1: .forEach(w -> {
-1: System.out.println("+1:" + w.lead().orElse(""));
0: // ABC: Just checking
+1: });
The lead() function accesses the next value in traversal order from the window; the lag() function accesses the previous value.
Disclaimer: I work for the company behind jOOλ
Such a scenario is not well supported by the Stream API, as the existing methods do not provide access to an element's neighbors in the stream. The closest solution I can think of without creating custom iterators/spliterators or calling third-party libraries is to read the input file into a List and then use a Stream of indices:
List<String> input = Files.readAllLines(Paths.get(fileName));
Predicate<String> pred = str -> str.contains("ABC");
int contextLength = 10;
IntStream.range(0, input.size()) // line numbers
// filter them leaving only numbers of lines satisfying the predicate
.filter(idx -> pred.test(input.get(idx)))
// add nearby numbers
.flatMap(idx -> IntStream.rangeClosed(idx-contextLength, idx+contextLength))
// remove numbers which are out of the input range
.filter(idx -> idx >= 0 && idx < input.size())
// sort numbers and remove duplicates
.distinct().sorted()
// map to the lines themselves
.mapToObj(input::get)
// output
.forEachOrdered(System.out::println);
The grep output also includes a special delimiter like "--" to designate the omitted lines. If you want to go further and mimic that behavior as well, I suggest trying my free StreamEx library, as it has an intervalMap method which is helpful in this case:
// Same as IntStream.range(...).filter(...) steps above
IntStreamEx.ofIndices(input, pred)
// same as above
.flatMap(idx -> IntStream.rangeClosed(idx-contextLength, idx+contextLength))
// remove numbers which are out of the input range
.atLeast(0).less(input.size())
// sort numbers and remove duplicates
.distinct().sorted()
.boxed()
// merge adjacent numbers into single interval and map them to subList
.intervalMap((i, j) -> (j - i) == 1, (i, j) -> input.subList(i, j + 1))
// flatten all subLists prepending them with "--"
.flatMap(list -> StreamEx.of(list).prepend("--"))
// skipping first "--"
.skip(1)
.forEachOrdered(System.out::println);
As Tagir Valeev noted, this kind of problem isn't well supported by the streams API. If you want to read lines from the input incrementally and print out matching lines with context, you'd have to introduce a stateful pipeline stage (or a custom collector or spliterator), which adds quite a bit of complexity.
If you're willing to read all the lines into memory, it turns out that BitSet is a useful representation for manipulating groups of matches. This bears some similarity to Tagir's solution, but instead of using integer ranges to represent lines to be printed, it uses 1-bits in a BitSet. Some advantages of BitSet are that it has a number of built-in bulk operations, and it has a compact internal representation. It can also produce a stream of indexes of the 1-bits, which is quite useful for this problem.
First, let's start out by creating a BitSet that has a 1-bit for each line that matches the predicate:
void contextMatch(Predicate<String> pred, int before, int after, List<String> input) {
int len = input.size();
BitSet matches = IntStream.range(0, len)
.filter(i -> pred.test(input.get(i)))
.collect(BitSet::new, BitSet::set, BitSet::or);
Now that we have the bit set of matching lines, we stream out the indexes of each 1-bit. We then set the bits in the bitset that represent the before and after context. This gives us a single BitSet whose 1-bits represent all of the lines to be printed, including context lines.
BitSet context = matches.stream()
.collect(BitSet::new,
(bs,i) -> bs.set(Math.max(0, i - before), Math.min(i + after + 1, len)),
BitSet::or);
If we just want to print out all the lines, including context, we can do this:
context.stream()
.forEachOrdered(i -> System.out.println(input.get(i)));
The actual grep -A a -B b command prints a separator between each group of context lines. To figure out when to print a separator, we look at each 1-bit in the context bit set. If there's a 0-bit preceding it, or if it's at the very beginning, we set a bit in the result. This gives us a 1-bit at the beginning of each group of context lines:
BitSet separators = context.stream()
.filter(i -> i == 0 || !context.get(i-1))
.collect(BitSet::new, BitSet::set, BitSet::or);
We don't want to print the separator before each group of context lines; we want to print it between each group. That means we have to clear the first 1-bit (if any):
// clear the first bit
int first = separators.nextSetBit(0);
if (first >= 0) {
separators.clear(first);
}
Now, we can print out the result lines. But before printing each line, we check to see if we should print a separator first:
context.stream()
.forEachOrdered(i -> {
if (separators.get(i)) {
System.out.println("--");
}
System.out.println(input.get(i));
});
}
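For the incremental case mentioned at the start of this answer, where the whole file is not held in memory, a plain loop with a bounded buffer is one alternative to a stateful stream stage. The following is only a sketch under that assumption; it is not stream-based, and the class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

final class IncrementalContextGrep {

    // Reads lines one at a time, keeping at most `before` unprinted lines in memory.
    // Returns the lines grep -B before -A after would print, with "--" between groups.
    static List<String> grep(Iterator<String> lines, Predicate<String> pred,
                             int before, int after) {
        List<String> out = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>(); // candidate "before" context
        int afterLeft = 0;       // "after" context lines still owed to the last match
        long lineNo = 0;         // index of the current line
        long lastEmitted = -1;   // index of the last line added to out
        boolean emittedAny = false;

        while (lines.hasNext()) {
            String line = lines.next();
            if (pred.test(line)) {
                // Buffered lines directly precede the current line.
                long firstIdx = lineNo - pending.size();
                if (emittedAny && firstIdx > lastEmitted + 1) {
                    out.add("--"); // gap between this group and the previous one
                }
                out.addAll(pending);
                pending.clear();
                out.add(line);
                lastEmitted = lineNo;
                emittedAny = true;
                afterLeft = after;
            } else if (afterLeft > 0) {
                out.add(line);
                lastEmitted = lineNo;
                afterLeft--;
            } else if (before > 0) {
                if (pending.size() == before) {
                    pending.removeFirst(); // drop the oldest buffered line
                }
                pending.addLast(line);
            }
            lineNo++;
        }
        return out;
    }
}
```

The iterator could come from `Files.lines(path).iterator()` or a `BufferedReader`, so only the context window is buffered rather than the whole input.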

Java 8 Stream of sentences

I’d like to use Java 8 streams to take a stream of strings (for example read from a plain text file) and produce a stream of sentences. I assume sentences can cross line boundaries.
So for example, I want to go from:
"This is the", "first sentence. This is the", "second sentence."
to:
"This is the first sentence.", "This is the second sentence."
I can see that it’s possible to get a stream of parts of sentences as follows:
Pattern p = Pattern.compile("\\.");
Stream<String> lines
= Stream.of("This is the", "first sentence. This is the", "second sentence.");
Stream<String> result = lines.flatMap(s -> p.splitAsStream(s));
But then I’m not sure how to produce a stream to join the fragments into sentences. I want to do this in a lazy way so that only what is needed from the original stream is read. Any ideas?
Breaking text into sentences is not as easy as just looking for dots. E.g., you don’t want to split in between “Mr.Smith”…
Thankfully, there is already a JRE class which takes care of that, the BreakIterator. What it doesn’t have is Stream support, so in order to use it with streams, some support code around it is required:
public class SentenceStream extends Spliterators.AbstractSpliterator<String>
implements Consumer<CharSequence> {
public static Stream<String> sentences(Stream<? extends CharSequence> s) {
return StreamSupport.stream(new SentenceStream(s.spliterator()), false);
}
Spliterator<? extends CharSequence> source;
CharBuffer buffer;
BreakIterator iterator;
public SentenceStream(Spliterator<? extends CharSequence> source) {
super(Long.MAX_VALUE, ORDERED|NONNULL);
this.source = source;
iterator=BreakIterator.getSentenceInstance(Locale.ENGLISH);
buffer=CharBuffer.allocate(100);
buffer.flip();
}
@Override
public boolean tryAdvance(Consumer<? super String> action) {
for(;;) {
int next=iterator.next();
if(next!=BreakIterator.DONE && next!=buffer.limit()) {
action.accept(buffer.subSequence(0, next-buffer.position()).toString());
buffer.position(next);
return true;
}
if(!source.tryAdvance(this)) {
if(buffer.hasRemaining()) {
action.accept(buffer.toString());
buffer.position(0).limit(0);
return true;
}
return false;
}
iterator.setText(buffer.toString());
}
}
@Override
public void accept(CharSequence t) {
buffer.compact();
if(buffer.remaining()<t.length()) {
CharBuffer bigger=CharBuffer.allocate(
Math.max(buffer.capacity()*2, buffer.position()+t.length()));
buffer.flip();
bigger.put(buffer);
buffer=bigger;
}
buffer.append(t).flip();
}
}
With that support class, you can simply say, e.g.:
Stream<String> lines = Stream.of(
"This is the ", "first sentence. This is the ", "second sentence.");
sentences(lines).forEachOrdered(System.out::println);
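For readers who have not used BreakIterator before, its basic use on a single in-memory string, without the stream wrapper above, looks roughly like this. The class and method names are illustrative, and trimming the pieces is a choice made for this sketch.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BreakIteratorDemo {

    // Splits the text at the sentence boundaries reported by BreakIterator.
    static List<String> split(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        split("This is the first sentence. This is the second sentence.")
                .forEach(System.out::println);
    }
}
```

The SentenceStream class above does the same boundary detection, but feeds the iterator from a growing buffer so that lines can arrive one at a time.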
This is a sequential, stateful problem, which Stream's designer is not too fond of.
In a more general sense, you are implementing a lexer, which converts a sequence of tokens to a sequence of another type of tokens. While you might use Stream to solve it with tricks and hacks, there is really no reason to. Just because Stream is there doesn't mean we have to use it for everything.
That being said, an answer to your question is to use flatMap() with a stateful function that holds intermediary data and emits the whole sentence when a dot is encountered. There is also the issue of EOF - you'll need a sentinel value for EOF in the source stream so that the function can react to it.
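The stateful flatMap() idea described above could be sketched as follows. This is an illustration only: it works for sequential streams, splits naively on every dot as in the question, and the class name and sentinel object are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class StatefulSentences {

    // Sentinel appended to the source to signal end of input; compared by identity.
    private static final String EOF = new String("<EOF>");

    static Stream<String> sentences(Stream<String> lines) {
        StringBuilder pending = new StringBuilder(); // text not yet ended by a dot
        return Stream.concat(lines, Stream.of(EOF)).flatMap(line -> {
            List<String> done = new ArrayList<>();
            if (line == EOF) {
                // Flush whatever is left as a final, unterminated fragment.
                String rest = pending.toString().trim();
                if (!rest.isEmpty()) {
                    done.add(rest);
                }
                pending.setLength(0);
                return done.stream();
            }
            if (pending.length() > 0) {
                pending.append(' '); // a line break counts as a space
            }
            pending.append(line);
            int dot;
            while ((dot = pending.indexOf(".")) >= 0) {
                done.add(pending.substring(0, dot + 1).trim());
                pending.delete(0, dot + 1);
            }
            return done.stream();
        });
    }
}
```

The mutable StringBuilder captured by the lambda is exactly the kind of hidden state the streams documentation warns about, so the result is undefined for parallel streams.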
My StreamEx library has a collapse method which is designed to solve such tasks. First let's change your regexp to a look-behind one, to keep the ending dots, so we can use them later:
StreamEx.of(input).flatMap(Pattern.compile("(?<=\\.)")::splitAsStream)
Here the input can be an array, a list, a JDK stream or just comma-separated strings.
Next we collapse two strings if the first one does not end with a dot. The merging function should join both parts into a single string, adding a space between them:
.collapse((a, b) -> !a.endsWith("."), (a, b) -> a + ' ' + b)
Finally we should trim the leading and trailing spaces if any:
.map(String::trim);
The whole code is here:
List<String> lines = Arrays.asList("This is the", "first sentence. This is the",
"second sentence. Third sentence. Fourth", "sentence. Fifth sentence.", "The last");
Stream<String> stream = StreamEx.of(lines)
.flatMap(Pattern.compile("(?<=\\.)")::splitAsStream)
.collapse((a, b) -> !a.endsWith("."), (a, b) -> a + ' ' + b)
.map(String::trim);
stream.forEach(System.out::println);
The output is the following:
This is the first sentence.
This is the second sentence.
Third sentence.
Fourth sentence.
Fifth sentence.
The last
Update: since StreamEx version 0.3.4 you can safely do the same with a parallel stream.
