Get word part from a variable String - java

I'm implementing an application that has to retrieve a word from an incoming String parameter. The parameter varies because it is a URL, but the pattern is the same for almost all incoming URLs. For instance, I could have:
GET /com.myapplication.v4.ws.monitoring.ModuleSystemMonitor HTTP/1.1
or
GET /com.myapplication.filesystem.ws.ModuleFileSystem/getIdFolders/jsonp?idFolder=idFis1&callback=__gwt_jsonp__.P0.onSuccess&failureCallback=__gwt_jsonp__.P0.onFailure HTTP/1.1
So in any case, I want to extract the word that starts with Module, for example, for the first incoming parameter I want to get: ModuleSystemMonitor. And for the second one I want to get the word: ModuleFileSystem.
This is the requirement, and I'm not allowed to do anything else: just a method that receives a line and tries to extract the words I mentioned, ModuleSystemMonitor and ModuleFileSystem.
I've been thinking of using the StringTokenizer class or the String#split method, but I'm not sure they are the best option. It is easy to find where the word beginning with Module starts using indexOf, but how do I cut the word when in some cases it is followed by a space, as in the first sample, and in others by a "/" (slash), as in the second? I know I can write an "if" statement and cut at either a space or a slash, but I wonder whether there is another, more dynamic way.
Thanks in advance for your time and help. Best regards.

I'm not sure this is the best solution, but you could try this:
String[] tmp = yourString.split("\\.|/| "); // split on dots, slashes and spaces
for (int i = 0; i < tmp.length; i++) {
    if (tmp[i].matches("^Module.*")) {
        return tmp[i];
    }
}
return null; // no token starts with "Module"

You can just use String.indexOf and String.substring like this:
int startIndex = url.indexOf("Module"); // assumes "Module" is present (indexOf returns -1 otherwise)
for (int index = startIndex + "Module".length(); index < url.length(); index++) {
    if (!Character.isLetter(url.charAt(index))) {
        return url.substring(startIndex, index);
    }
}
This is based on the assumption that the first non-letter character marks the end of the word.

String stringToSearch = "GET /com.myapplication.v4.ws.monitoring.ModuleSystemMonitor HTTP/1.1";
Pattern pattern = Pattern.compile("(Module[a-zA-Z]*)");
Matcher matcher = pattern.matcher(stringToSearch);
if (matcher.find()){
System.out.println(matcher.group(1));
}
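As a sketch, the regex approach can be wrapped in a single method and checked against request lines like the two in the question. The class and method names (ModuleNameExtractor, extractModuleName) are made up for this example, the second sample line is shortened, and the code assumes the word consists only of ASCII letters after "Module".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModuleNameExtractor {
    // Compiled once and reused; matches "Module" followed by any number of ASCII letters.
    static final Pattern MODULE_WORD = Pattern.compile("Module[a-zA-Z]*");

    /** Returns the first word starting with "Module", or null if the line has none. */
    static String extractModuleName(String line) {
        Matcher matcher = MODULE_WORD.matcher(line);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String first = "GET /com.myapplication.v4.ws.monitoring.ModuleSystemMonitor HTTP/1.1";
        String second = "GET /com.myapplication.filesystem.ws.ModuleFileSystem/getIdFolders/jsonp"
                + "?idFolder=idFis1 HTTP/1.1";
        System.out.println(extractModuleName(first));  // ModuleSystemMonitor
        System.out.println(extractModuleName(second)); // ModuleFileSystem
    }
}
```

Because the character class stops at the first non-letter, the same pattern handles both the trailing space and the trailing slash without an explicit "if".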

Related

Look for a certain String inside another and count how many times it appears

I am trying to search for a String inside a file content which I got into a String.
I've tried to use Pattern and Matcher, which worked for this case:
Pattern p = Pattern.compile("(</machine>)");
Matcher m = p.matcher(text);
while(m.find()) //if the text "(</machine>)" was found, enter
{
Counter++;
}
return Counter;
Then, I tried to use the same code to find how many tags I have:
Pattern tagsP = Pattern.compile("(</");
Matcher tagsM = tagsP.matcher(text);
while(tagsM.find()) //if the text "(</" was found, enter
{
CounterTags++;
}
return CounterTags;
which in this case, the return value was always 0.
Try the code below, which counts the matches with String#split instead of a Matcher:
String actualString = "hello hi how(</machine>) are you doing. Again hi (</machine>) friend (</machine>) hope you are (</machine>)doing good.";
// actualString is what you get from the file content
String toMatch = Pattern.quote("(</machine>)"); // convert to a regex literal
int count = actualString.split(toMatch, -1).length - 1; // split actualString on toMatch; the match count is one less than the array length
System.out.println(count);
Output: 4
You can use the Apache commons-lang utility library, which has a countMatches function that does exactly this:
int count = StringUtils.countMatches(text, "substring");
Also this function is null-safe.
I recommend you to explore Apache commons libraries, they provide a lot of useful common util methods.
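For the tag-counting attempt in the question, note that an unescaped ( opens a capturing group in a regex, so a pattern such as "(</" is not a literal search (in Java, compiling it throws a PatternSyntaxException because the group is never closed). Below is a minimal sketch that keeps the Matcher loop but quotes the search text with Pattern.quote. The class name, method name, and sample text are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralCounter {
    /** Counts non-overlapping occurrences of the literal text needle in text. */
    static int countLiteral(String text, String needle) {
        // Pattern.quote builds a regex that matches the needle character for character.
        Matcher m = Pattern.compile(Pattern.quote(needle)).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "<machine><name>a</name></machine><machine><name>b</name></machine>";
        System.out.println(countLiteral(text, "</machine>")); // 2
        System.out.println(countLiteral(text, "</"));         // 4
    }
}
```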

Java String- How to get a part of package name in android?

It's basically about getting the string value between two characters. SO has many questions related to this, such as:
How to get a part of a string in java?
How to get a string between two characters?
Extract string between two strings in java
and more.
But I found it quite confusing when dealing with multiple dots in the string and getting the value between two particular dots.
I have got the package name as :
au.com.newline.myact
I need to get the value between "com." and the next "dot(.)". In this case "newline". I tried
Pattern pattern = Pattern.compile("com.(.*).");
Matcher matcher = pattern.matcher(beforeTask);
while (matcher.find()) {
int ct = matcher.group();
I tried using substrings and IndexOf also. But couldn't get the intended answer. Because the package name in android varies by different number of dots and characters, I cannot use fixed index. Please suggest any idea.
As you probably know (based on .* part in your regex) dot . is special character in regular expressions representing any character (except line separators). So to actually make dot represent only dot you need to escape it. To do so you can place \ before it, or place it inside character class [.].
Also to get only part from parenthesis (.*) you need to select it with proper group index which in your case is 1.
So try with
String beforeTask = "au.com.newline.myact";
Pattern pattern = Pattern.compile("com[.](.*)[.]");
Matcher matcher = pattern.matcher(beforeTask);
while (matcher.find()) {
String ct = matcher.group(1);//remember that regex finds Strings, not int
System.out.println(ct);
}
Output: newline
If you want to get only one element before next . then you need to change greedy behaviour of * quantifier in .* to reluctant by adding ? after it like
Pattern pattern = Pattern.compile("com[.](.*?)[.]"); // note the ? after .*
Another approach is instead of .* accepting only non-dot characters. They can be represented by negated character class: [^.]*
Pattern pattern = Pattern.compile("com[.]([^.]*)[.]");
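To see how the three patterns differ, here is a small sketch that runs each of them against a package name with an extra segment. The helper firstGroup and the class name are hypothetical and exist only for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    /** Returns group 1 of the first match of regex in input, or null if nothing matches. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String name = "au.com.newline.myact.modelact"; // sample with one more dot

        // Greedy: .* runs to the end and backtracks only to the LAST dot.
        System.out.println(firstGroup("com[.](.*)[.]", name));    // newline.myact
        // Reluctant: .*? stops at the FIRST dot that lets the match succeed.
        System.out.println(firstGroup("com[.](.*?)[.]", name));   // newline
        // Negated class: [^.]* can never cross a dot.
        System.out.println(firstGroup("com[.]([^.]*)[.]", name)); // newline
    }
}
```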
If you don't want to use regex you can simply use indexOf method to locate positions of com. and next . after it. Then you can simply substring what you want.
String beforeTask = "au.com.newline.myact.modelact";
int start = beforeTask.indexOf("com.") + 4; // +4 since we also want to skip 'com.' part
int end = beforeTask.indexOf(".", start); //find next `.` after start index
String result = beforeTask.substring(start, end);
System.out.println(result);
You can use reflection to get the name of any class. For example, if I have a class Runner in com.some.package, I can run
Runner.class.getName() // string is "com.some.package.Runner"
to get the fully qualified name of the class, which contains the package name. (Runner.class.toString() returns the same name prefixed with "class ".)
To get the part after 'com', you can use Runner.class.getName().split("\\.") (the dot must be escaped because split takes a regex) and then iterate over the returned array with a boolean flag.
All you have to do is split the strings by "." and then iterate through them until you find one that equals "com". The next string in the array will be what you want.
So your code would look something like:
String[] parts = packageName.split("\\.");
int i = 0;
for (String part : parts) {
    if (part.equals("com")) {
        break;
    }
    ++i;
}
String result = parts[i + 1]; // assumes "com" is present and is not the last part
private String getStringAfterComDot(String packageName) {
String strArr[] = packageName.split("\\.");
for(int i=0; i<strArr.length - 1; i++){ // stop one element early so strArr[i+1] is always valid
if(strArr[i].equals("com"))
return strArr[i+1];
}
return "";
}
I have done heaps of projects dealing with website scraping, and I just had to create my own functions/utils to get the job done. Regex might be overkill sometimes if you just want to extract a substring from a given string like the one you have. Below is the function I normally use for this kind of task.
private String GetValueFromText(String sText, String sBefore, String sAfter)
{
String sRetValue = "";
int nPos = sText.indexOf(sBefore);
if ( nPos > -1 )
{
int nLast = sText.indexOf(sAfter, nPos + sBefore.length()); // search for sAfter starting right after sBefore
if ( nLast > -1)
{
sRetValue = sText.substring(nPos+sBefore.length(),nLast);
}
}
return sRetValue;
}
To use it just do the following:
String sValue = GetValueFromText("au.com.newline.myact", ".com.", ".");

Get number of exact substrings

I want to get the number of substrings out of a string.
The inputs are excel formulas like IF(....IF(...))+IF(...)+SUM(..) as a string. I want to count all IF( substrings. It's important that SUMIF(...) and COUNTIF(...) will not be counted.
I thought to check that there is no capital letter before the "IF", but this is giving (certainly) index out of bound. Can someone give me a suggestion?
My code:
for(int i = input.indexOf("IF(",input.length());
i != -1;
i= input.indexOf("IF(,i- 1)){
if(!isCapitalLetter(tmpFormulaString, i-1)){
ifStatementCounter++;
}
}
You can do the parsing yourself as you were doing (which is possibly better for learning to debug, so that you understand what your problem is).
However, it can be done easily with a regular expression:
String s = "FOO()FOOL()SOMEFOO()FOO";
Pattern p = Pattern.compile("\\bFOO\\b");
Matcher m = p.matcher(s);
int count = 0;
while (m.find()) {
count++;
}
// count= 2
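The main trick here is \b in the regex. \b is a word boundary: it matches only at a position where a word character (letter, digit, or underscore) is on one side and a non-word character, or the start or end of the input, is on the other. That is why FOO is not matched inside FOOL or SOMEFOO.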
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
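Applied to the formulas from the question, the same idea could look like the sketch below. The pattern \bIF\( is an adaptation of the FOO example rather than code from the answer, and the class name, method name, and sample formula are invented. It relies on the letter before IF in SUMIF and COUNTIF being a word character, so no word boundary exists there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IfCounter {
    // \b requires a word boundary before "IF"; \( matches the literal opening parenthesis.
    static final Pattern IF_CALL = Pattern.compile("\\bIF\\(");

    /** Counts IF( occurrences that are not the tail of a longer name such as SUMIF(. */
    static int countIfCalls(String formula) {
        Matcher m = IF_CALL.matcher(formula);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String formula = "IF(A1>0,IF(B1>0,1,2),3)+IF(C1>0,1,0)+SUMIF(A:A,1)+COUNTIF(B:B,2)";
        System.out.println(countIfCalls(formula)); // 3
    }
}
```

Unlike the loop in the question, this never reads the character at index -1, so an IF( at the very start of the formula is handled without a special case.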
I think you can solve your problem by searching for the String IF(.
Try doing the same thing in another way.
For example:
String inputString = "IF(hello)IF(hello)....IF(helloIF(hello))....";
inputString.indexOf("IF(");
Does that solve your problem?
You can also use a regular expression.

search for string in line

I have a problem dealing with a string in a line.
For example, I have the following lines in a .txt file:
ussdserver link from host /127.0.0.1:38978(account smpp34) is up|smpp34|2012-10-28 17:02:19
ussdserver link from host localhost/127.0.0.1:8088(account callme) is up|callme|2012-10-28 17:02:20
I need my code to get the word after "account" (in the first line it's smpp34) and the word "up" (after the "is" word).
I thought about using String.charAt() method but it doesn't work here because the words that I need can be in different places, as shown in the example above.
Try using the following methods from the String class.
String inputStr = "ussdserver link from host /127.0.0.1:38978(account smpp34) is up|smpp34|2012-10-28 17:02:19";
int index1 = inputStr.indexOf("account");
int index2 = inputStr.indexOf(')',index1);
String accountName = inputStr.substring(index1+8,index2); // you get smpp34
index1 = inputStr.indexOf("|");
index2 = inputStr.lastIndexOf(' ', index1);
String state = inputStr.substring(index2+1, index1); // you get up
Yes, it's best to use a regex in this sort of case. But a simpler method can be used specifically for the case above: simply try splitting at ) and then reading from the length of the string minus 5 up to its length, which will give you the first word, and try something similar for the second word.
Do this if and only if the string pattern above never changes. Otherwise I'd recommend using a regex.
Try RegEx like this.
Pattern pattern = Pattern.compile(".*account\\s([^\\s]*)\\)\\sis\\s([^|]*)\\|.*"); // the literal | must be escaped, otherwise it means "or"
Matcher matcher = pattern.matcher("ussdserver link from host /127.0.0.1:38978(account smpp34) is up|smpp34|2012-10-28 17:02:19");
while (matcher.find()) {
System.out.println(matcher.group(1));//will give you 'smpp34'
System.out.println(matcher.group(2));//will give you 'up'
return;
}
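As a rough sketch, the regex idea can be turned into a reusable method and run against both sample lines from the question. The class name, method name, and the slightly simplified pattern are assumptions made for this example. The pattern expects the "(account NAME) is STATE|" shape shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkLineParser {
    // Group 1: the word after "account"; group 2: the state between "is" and the first '|'.
    static final Pattern LINK_LINE =
            Pattern.compile("\\(account\\s+(\\S+)\\)\\s+is\\s+([^|]+)\\|");

    /** Returns {account, state}, or null if the line does not have the expected shape. */
    static String[] parse(String line) {
        Matcher m = LINK_LINE.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "ussdserver link from host /127.0.0.1:38978(account smpp34) is up|smpp34|2012-10-28 17:02:19",
            "ussdserver link from host localhost/127.0.0.1:8088(account callme) is up|callme|2012-10-28 17:02:20"
        };
        for (String line : lines) {
            String[] parts = parse(line);
            System.out.println(parts[0] + " -> " + parts[1]); // smpp34 -> up, then callme -> up
        }
    }
}
```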

Java Find word in a String

I need to find a word in a HTML source code. Also I need to count occurrence. I am trying to use regular expression. But it says 0 match found.
I am using regular expression as I thought its the best way. In case of any better way, please let me know.
I need to find the occurrence of the word "hsw.ads" in HTML source code.
I have taken following steps.
int count = 0;
{
Pattern p = Pattern.compile(".*(hsw.ads).*");
Matcher m = p.matcher(SourceCode);
while(m.find())count++;
}
But the count is 0;
Please let me know your solutions.
Thank you.
Help Seeker
You are not matching any "expression", so probably a simple string search would be better. commons-lang has StringUtils.countMatches(source, "yourword").
If you don't want to include commons-lang, you can write that manually. Simply use source.indexOf("yourword", x) multiple times, each time supplying a greater value of x (which is the offset), until it gets -1
You should try this.
private int getWordCount(String word,String source){
int count = 0;
{
Pattern p = Pattern.compile(Pattern.quote(word)); // quote the word so it is matched literally, not as a regex
Matcher m = p.matcher(source);
while(m.find()) count++;
}
return count;
}
Pass the word (Not pattern) you want to search in a string.
To find a string in Java you can use the String method indexOf, which returns the index of the first character of the string you searched for. To find all occurrences and count them you can do this (there might be a faster way, but this should work). I would still recommend using the StringUtils countMatches method.
String temp = string; // copy of the string to search in
int count = 0;
String a = "hsw.ads";
int i = 0;
while ((i = temp.indexOf(a, i)) != -1) {
    count++;
    i += a.length(); // continue searching right after this occurrence
}
StringUtils.countMatches(SourceCode, "hsw.ads") ought to work, however sticking with the approach you have above (which is valid), I'd recommend a few things:
1. As John Haager mentioned, removing the opening/closing .* will help, because you're looking for that exact substring
2. You want to escape the '.' because you're searching for a literal '.' and not a wildcard
3. I would make this Pattern a constant and re-use it rather than re-creating it each time.
That said, I'd still suggest using the approaches above, but I thought I'd just point out your current approach isn't conceptually flawed; just a few implementation details missing.
Your code and regular expression is valid. You don't need to include the .* at the beginning and the end of your regex. For example:
String t = "hsw.ads hsw.ads hsw.ads";
int count = 0;
Matcher m = Pattern.compile("hsw\\.ads").matcher(t);
while (m.find()){ count++; }
In this case, count is 3. One more thing: if you use a regex and really want to match a literal '.' between hsw and ads, you need to escape it.
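The effect of the leading and trailing .* on counting can be checked with a short sketch. With the wildcards, each match swallows the rest of its line, so find() reports at most one match per line. Without them, every occurrence is counted. The class and constant names are made up for this example, and both patterns are kept as constants so they are compiled only once.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HswAdsCounter {
    // Same search text, with and without the surrounding wildcards; the dot is escaped in both.
    static final Pattern WITH_WILDCARDS = Pattern.compile(".*(hsw\\.ads).*");
    static final Pattern LITERAL = Pattern.compile("hsw\\.ads");

    /** Counts how many times find() succeeds for the given pattern. */
    static int count(Pattern pattern, String source) {
        Matcher m = pattern.matcher(source);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "hsw.ads hsw.ads hsw.ads";
        System.out.println(count(WITH_WILDCARDS, source)); // 1
        System.out.println(count(LITERAL, source));        // 3
    }
}
```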
