Parse hashtags between symbols - java

I need to parse hashtags from a String such as "test comment #georgios#gsabanti sefse #afa".
String text = "test comment #georgios#gsabanti sefse #afa";
String[] words = text.split(" ");
List<String> tags = new ArrayList<String>();
for ( final String word : words) {
if (word.substring(0, 1).equals("#")) {
tags.add(word);
}
}
In the end I need an array with the elements "#georgios", "#gsabanti" and "#afa".
But right now #georgios#gsabanti shows up as a single hashtag.
How can I fix it?

+1 for the Regular Expressions:
Matcher matcher = Pattern.compile("(#[^#\\s]*)")
.matcher("test comment #georgios#gsabanti sefse #afa");
List<String> tags = new ArrayList<>();
while (matcher.find()) {
tags.add(matcher.group());
}
System.out.println(tags);

Here is a simple way of doing that
String text = "test comment #georgios#gsabanti sefse #afa";
String patternst = "#[a-zA-Z0-9]*";
Pattern pattern = Pattern.compile(patternst);
Matcher matcher = pattern.matcher(text);
List<String> tags = new ArrayList<String>();
while (matcher.find()) {
tags.add(matcher.group(0));
}
I hope it will work for you :)

Use an ArrayList instead of an array:
String text = "test comment #georgios#gsabanti sefse #afa";
List<String> hashTags = new ArrayList<>();
char[] c = text.toCharArray();
for(int i=0;i<c.length;i++) {
if(c[i]=='#') {
// Collect characters up to the next space, '#', or the end of the text,
// so a tag at the very end (such as #afa) is not lost.
StringBuilder hash = new StringBuilder("#");
int j = i+1;
while(j<c.length && c[j]!=' ' && c[j]!='#') {
hash.append(c[j]);
j++;
}
hashTags.add(hash.toString());
}
}

String text = "test comment #georgios#gsabanti sefse #afa";
String[] words = text.split("(?=#)|\\s+");
List<String> tags = new ArrayList<String>();
for ( final String word : words) {
if (!word.isEmpty() && word.startsWith("#")) {
tags.add(word);
}
}

You can split your string before each " " or "#", keeping the delimiters, and then keep only the pieces that start with "#", like below:
public static void main(String[] args){
String text = "test comment #georgios#gsabanti sefse #afa";
String[] tags = Stream.of(text.split("(?=#)|(?= )")).filter(e->e.startsWith("#")).toArray(String[]::new);
System.out.println(Arrays.toString(tags));
}
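The regex answers above can be pulled together into one runnable class. This is only a sketch: the class name HashtagExtractor and the method extract are made up for illustration, and the pattern uses + instead of * on the assumption that a lone "#" should not count as a tag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not taken from any answer above.
class HashtagExtractor {
    // '#' followed by at least one character that is neither '#' nor whitespace.
    private static final Pattern TAG = Pattern.compile("#[^#\\s]+");

    // Returns every hashtag in the text, in order of appearance.
    static String[] extract(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tags = extract("test comment #georgios#gsabanti sefse #afa");
        System.out.println(String.join(", ", tags)); // #georgios, #gsabanti, #afa
    }
}
```

Because the character class excludes '#', a second '#' ends the current tag and starts the next one, which is what separates #georgios from #gsabanti.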

Related

How to take input in expected form in java?

I want to read string input in the format
%d+%d
in Java. How do I do it?
I know that I can do this with the String.split() method, but I feel it gets much more complex once the input contains more parts, like
%d+%d-%d
I am looking for solutions that are close to scanf in C.
I tried this for %d+%d
Scanner scanner = new Scanner(System.in);
String str = scanner.next();
String first,second;
String[] arr = str.split("\\+");
first = arr[0];
second = arr[1];
scanner.close();
And this for %d+%d-%d+%d..........=%d-%d+%d.....+%d...
private final String[] splitLoL(String txt) {
LinkedList<String> strList1 = new LinkedList<String>();
LinkedList<String> strList2 = new LinkedList<String>();
LinkedList<String> strList3 = new LinkedList<String>();
strList1.addAll(Arrays.asList(txt.split("\\+")));
for(String str : strList1) {
String[] proxy = str.split("-");
strList2.addAll(Arrays.asList(proxy));
}
for(String str : strList2) {
String[] proxy = str.split("=");
strList3.addAll(Arrays.asList(proxy));
}
String[] strArr = new String[strList3.size()];
for(int i = 0; i < strArr.length; i++) {
strArr[i] = new String(strList3.get(i));
}
return strArr;
}
Try this:
String str = scanner.nextLine();
List<String> str2 = new ArrayList<>();
Matcher m = Pattern.compile("\\d+").matcher(str);
while(m.find()) {
str2.add(m.group());
}
Or you can do the following using JDK 9+:
import java.util.Scanner;
public class ScannerTrial {
public static void main(String[] args) {
Scanner scanner = new Scanner(" 4 z zz ggg 22 e");
scanner.findAll("\\d+").forEach((e) -> System.out.println(e.group()));
}
}
This would print each match on its own line:
4
22
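For something closer to scanf, one option is to give Scanner the operators as its delimiter and read the numbers with nextInt(). This is a minimal sketch, assuming the operands are non-negative integers separated by exactly one of +, - or =; the class name DelimitedInts and the method parseOperands are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class for illustration only.
class DelimitedInts {
    // Reads every integer operand from an expression such as "12+34-5=41".
    static List<Integer> parseOperands(String expr) {
        List<Integer> operands = new ArrayList<>();
        // Treat each single '+', '-' or '=' as a token separator.
        try (Scanner sc = new Scanner(expr).useDelimiter("[-+=]")) {
            while (sc.hasNextInt()) {
                operands.add(sc.nextInt());
            }
        }
        return operands;
    }

    public static void main(String[] args) {
        System.out.println(parseOperands("12+34-5=41")); // [12, 34, 5, 41]
    }
}
```

The same Scanner could wrap System.in instead of a String. Reading stops at the first token that is not an integer, so malformed input yields a shorter list rather than an exception.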

How to remove extra line breaks in a string

I have got a text like this in my String s (which I have already read from txt.file)
trump;Donald Trump;trump#yahoo.eu
obama;Barack Obama;obama#google.com
bush;George Bush;bush#inbox.com
clinton,Bill Clinton;clinton#mail.com
Then I'm trying to cut off everything besides an e-mail address and print out on console
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i]);
}
and I have output like this:
trump#yahoo.eu
obama#google.com
bush#inbox.com
clinton#mail.com
How can I avoid such output? I mean, how can I get the output text without line breaks?
Try using below approach. I have read your file with Scanner as well as BufferedReader and in both cases, I don't get any line break. file.txt is the file that contains text and the logic of splitting remains the same as you did
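import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
public class CC {
public static void main(String[] args) throws IOException {
Scanner scan = new Scanner(new File("file.txt"));
while (scan.hasNextLine()) {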
String f1[] = null;
f1 = scan.nextLine().split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
scan.close();
BufferedReader br = new BufferedReader(new FileReader(new File("file.txt")));
String str = null;
while ((str = br.readLine()) != null) {
String f1[] = null;
f1 = str.split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
br.close();
}
}
You may just remove all line breaks, as shown in the code below:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i].replaceAll("\r", "").replaceAll("\n", ""));
}
This removes every line break without inserting anything in its place.
Instead of split, you might match an email-like format: use the negated character class [^\\s;]+ to match one or more characters that are neither whitespace nor a semicolon, followed by a #, followed by the same character class again.
final String regex = "[^\\s;]+#[^\\s;]+";
final String string = "trump;Donald Trump;trump#yahoo.eu \n"
+ " obama;Barack Obama;obama#google.com \n"
+ " bush;George Bush;bush#inbox.com \n"
+ " clinton,Bill Clinton;clinton#mail.com";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final List<String> matches = new ArrayList<String>();
while (matcher.find()) {
matches.add(matcher.group());
}
System.out.println(String.join("", matches));
package com.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = "trump;Donald Trump;trump#yahoo.eu "
+ "obama;Barack Obama;obama#google.com "
+ "bush;George Bush;bush#inbox.com "
+ "clinton;Bill Clinton;clinton#mail.com";
String spaceStrings[] = s.split("[\\s,;]+");
String output="";
for(String word:spaceStrings){
if(validate(word)){
output+=word;
}
}
System.out.println(output);
}
public static final Pattern VALID_EMAIL_ADDRESS_REGEX = Pattern.compile(
"^[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,6}$",
Pattern.CASE_INSENSITIVE);
public static boolean validate(String emailStr) {
Matcher matcher = VALID_EMAIL_ADDRESS_REGEX.matcher(emailStr);
return matcher.find();
}
}
Just remove the '\n' characters that may appear at the start and end of each piece.
Write it this way:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
f1[i] = f1[i].replace("\n", "");
System.out.print(f1[i]);
}
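The line breaks survive the original split because . does not match line terminators by default, so the delimiter (.*?); never consumes the break at the end of each record. An alternative sketch is to split on line breaks first and then keep only the text after the last semicolon of each line. The class name LastField and the method joinLastFields are hypothetical, and the code assumes the address is always the last ;-separated field.

```java
// Hypothetical helper class for illustration only.
class LastField {
    // Concatenates the last ';'-separated field of every non-blank line.
    static String joinLastFields(String s) {
        StringBuilder out = new StringBuilder();
        // \R matches any line break sequence (\n, \r\n, \r, ...), Java 8+.
        for (String line : s.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // lastIndexOf returns -1 when there is no ';', so the whole line is kept.
            out.append(trimmed.substring(trimmed.lastIndexOf(';') + 1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "trump;Donald Trump;trump#yahoo.eu\n"
                + "obama;Barack Obama;obama#google.com\r\n"
                + "bush;George Bush;bush#inbox.com";
        System.out.println(joinLastFields(s));
        // trump#yahoo.euobama#google.combush#inbox.com
    }
}
```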

How can I split a String in Java with a custom pattern

I am trying to get the location data from this string using String.split("[,\\:]");
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
String[] str = location.split("[,\\:]");
How can I get the data like this?
str[0] = 27.980194
str[1] = 46.090199
str[2] = 0.48
str[3] = 1
str[4] = 6
Thank you for any help!
If you just want to keep the numbers (including dot separator), you can use:
String[] str = location.split("[^\\d\\.]+");
You will need to ignore the first element in the array which is an empty string.
That will only work if the data names don't contain numbers or dots.
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
Matcher m = Pattern.compile( "\\d+\\.*\\d*" ).matcher(location);
List<String> allMatches = new ArrayList<>();
while (m.find( )) {
allMatches.add(m.group());
}
System.out.println(allMatches);
Quick and Dirty:
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
List<String> strList = Arrays.asList(location.split("[,\\:]"));
String[] str = new String[5];
int count=0;
for(String s : strList){
try {
Double.parseDouble(s); // throws NumberFormatException for non-numeric tokens
str[count] = s; // keep the original text, so "1" stays "1" rather than "1.0"
System.out.println("In String Array:"+str[count]);
count++;
} catch (NumberFormatException e) {
System.out.println("s:"+s);
}
}
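If the field names matter as well, another option is to split on commas and then on the first colon of each piece, collecting the pairs into a map. This is a sketch under the assumption that every field has the form name:value; the class name LocationFields and the method parseFields are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
class LocationFields {
    // Parses "name:value" pairs separated by commas; pieces without ':' (such as "$") are skipped.
    static Map<String, String> parseFields(String location) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        for (String part : location.split(",")) {
            int colon = part.indexOf(':');
            if (colon > 0) {
                fields.put(part.substring(0, colon), part.substring(colon + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
        Map<String, String> fields = parseFields(location);
        System.out.println(fields);
        // {lat=27.980194, lng=46.090199, speed=0.48, fix=1, sats=6}
        String[] str = fields.values().toArray(new String[0]);
        System.out.println(str[0]); // 27.980194
    }
}
```

Unlike the digit-only patterns, this does not break if a field name happens to contain a digit or a dot.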

String reverse using Java's StringBuilder

I am working on a small Java project.
I want to reverse each word in a String while keeping the word order.
For example, "I am a girl" should be printed as "I ma a lrig".
I have already tried StringBuilder and StringBuffer, but it does not print the result I want.
How can I do this? Here is my attempt:
public String reverse() {
String[] words = str.split("\\s");
StringTokenizer stringTokenizer = new StringTokenizer(str, " ");
for (String string : words) {
System.out.print(string);
}
String a = Arrays.toString(words);
StringBuilder builder = new StringBuilder(a);
System.out.println(words[0]);
for (String st : words){
System.out.print(st);
}
return "";
}
Java 8 code to do this :
public static void main(String[] args) {
String str = "I am a girl";
StringBuilder sb = new StringBuilder();
// split() returns an array of Strings, for each string, append it to a StringBuilder by adding a space.
Arrays.asList(str.split("\\s+")).stream().forEach(s -> {
sb.append(new StringBuilder(s).reverse() + " ");
});
String reversed = sb.toString().trim(); // remove trailing space
System.out.println(reversed);
}
O/P :
I ma a lrig
If you do not want to use a lambda, you can try this solution too:
String str = "I am a girl";
String finalString = "";
String s[] = str.split(" ");
for (String st : s) {
finalString += new StringBuilder(st).reverse().append(" ").toString();
}
System.out.println(finalString.trim());
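The per-word reversal shown in the answers above can also be written as a stream pipeline that joins the words directly, which avoids the trailing space and the final trim(). This is a sketch; the class name WordReverser and the method reverseEachWord are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
class WordReverser {
    // Reverses the characters of every word while keeping the word order.
    static String reverseEachWord(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .map(word -> new StringBuilder(word).reverse().toString())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(reverseEachWord("I am a girl")); // I ma a lrig
    }
}
```

Note that runs of whitespace collapse to a single space in the result, because the words are split on \s+ and joined with " ".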

Java - Add numbers to matching words

I'm trying to add a count number for matching words, like this:
Match word: "Text"
Input: Text Text Text TextText ExampleText
Output: Text1 Text2 Text3 Text4Text5 ExampleText6
I have tried this:
String text = "Text Text Text TextText ExampleText";
String match = "Text";
int i = 0;
while(text.indexOf(match)!=-1) {
text = text.replaceFirst(match, match + i++);
}
Doesn't work because it would loop forever, the match stays in the string and IndexOf will never stop.
What would you suggest me to do?
Is there a better way doing this?
Here is one with a StringBuilder but no need to split:
public static String replaceWithNumbers( String text, String match ) {
int matchLength = match.length();
StringBuilder sb = new StringBuilder( text );
int index = 0;
int i = 1;
while ( ( index = sb.indexOf( match, index )) != -1 ) {
String iStr = String.valueOf(i++);
sb.insert( index + matchLength, iStr );
// Continue searching from the end of the inserted text
index += matchLength + iStr.length();
}
return sb.toString();
}
First create a StringBuffer for the result, then split the source text on the match word.
The split yields an array of the pieces between the matches: empty strings and the remaining text, without "Text".
Then check each piece with isEmpty() and append the match word plus its running number accordingly.
String text = "Text Text Text TextText ExampleText";
String match = "Text";
StringBuffer result = new StringBuffer();
String[] split = text.split(match);
for(int i=0;i<split.length;){
if(split[i].isEmpty())
result.append(match+ ++i);
else
result.append(split[i]+match+ ++i);
}
System.out.println("Result is =>"+result);
O/P
Result is =>Text1 Text2 Text3 Text4Text5 ExampleText6
Try this solution; it is tested:
String text = "Text Text Text TextText Example";
String match = "Text";
String lastWord=text.substring(text.length() -match.length());
boolean lastChar=(lastWord.equals(match));
String[] splitter=text.split(match);
StringBuilder sb = new StringBuilder();
for(int i=0;i<splitter.length;i++)
{
// Number occurrences from 1; the last piece gets a number only if the text ends with the match.
if(i!=splitter.length-1)
splitter[i]=splitter[i]+match+Integer.toString(i+1);
else
splitter[i]=(lastChar)?splitter[i]+match+Integer.toString(i+1):splitter[i];
sb.append(splitter[i]);
}
String joined = sb.toString();
System.out.print(joined+"\n");
One possible solution could be
String text = "Text Text Text TextText ExampleText";
String match = "Text";
StringBuilder sb = new StringBuilder(text);
int occurence = 1;
int offset = 0;
while ((offset = sb.indexOf(match, offset)) != -1) {
// fixed this after comment from @RealSkeptic
String insertOccurence = Integer.toString(occurence);
sb.insert(offset + match.length(), insertOccurence);
offset += match.length() + insertOccurence.length();
occurence++;
}
System.out.println("result: " + sb.toString());
This will work for you :
public static void main(String[] args) {
String s = "Text Text Text TextText ExampleText";
int count=0;
while(s.contains("Text")){
s=s.replaceFirst("Text", "*"+ ++count); // replace each occurrence of "Text" with some place holder which is not in your main String.
}
s=s.replace("*","Text");
System.out.println(s);
}
O/P:
Text1 Text2 Text3 Text4Text5 ExampleText6
I refactored @DeveloperH's code to this:
public class Snippet {
public static void main(String[] args) {
String matchWord = "Text";
String input = "Text Text Text TextText ExampleText";
String output = addNumbersToMatchingWords(matchWord, input);
System.out.print(output);
}
private static String addNumbersToMatchingWords(String matchWord, String input) {
String[] inputsParts = input.split(matchWord);
StringBuilder outputBuilder = new StringBuilder();
// Count from 1. The split pieces already carry the original spacing,
// so no extra separator is appended. Assumes the input ends with matchWord.
int i = 1;
for (String inputPart : inputsParts) {
outputBuilder.append(inputPart);
outputBuilder.append(matchWord);
outputBuilder.append(i);
i++;
}
return outputBuilder.toString();
}
}
We can solve this using StringBuilder, which provides a simple way to insert characters into a string. Here is the code:
String text = "Text Text Text TextText ExampleText";
String match = "Text";
StringBuilder sb = new StringBuilder(text);
int beginIndex = 0, i =0;
int matchLength = match.length();
while((beginIndex = sb.indexOf(match, beginIndex))!=-1) {
i++;
sb.insert(beginIndex+matchLength, i);
beginIndex++;
}
System.out.println(sb.toString());
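Another way to number the matches is the standard Matcher.appendReplacement / appendTail loop, which rebuilds the string in one pass and never rescans text it has already written. This is a sketch; the class name MatchNumberer and the method numberMatches are hypothetical. Pattern.quote and Matcher.quoteReplacement are used so that the match word is treated as literal text even if it contains regex or replacement metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class MatchNumberer {
    // Appends a running number (starting at 1) after every occurrence of match.
    static String numberMatches(String text, String match) {
        Matcher m = Pattern.compile(Pattern.quote(match)).matcher(text);
        StringBuffer sb = new StringBuffer();
        int count = 0;
        while (m.find()) {
            count++;
            // Copies the text before the match, then the match plus its number.
            m.appendReplacement(sb, Matcher.quoteReplacement(match + count));
        }
        m.appendTail(sb); // copies whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(numberMatches("Text Text Text TextText ExampleText", "Text"));
        // Text1 Text2 Text3 Text4Text5 ExampleText6
    }
}
```

Unlike the split-based answers, this does not depend on whether the text ends with the match word.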
