Parsing a string to get particular values

I'm trying to extract values from a StringBuilder. I converted it to a String, but I still can't pick out the particular values I need.
My StringBuilder output is:
SN: 00486 Mode: 1
Temp. 15.4 C
Fat............... 0.0%
SNF............... 0.4%
I then stripped every character that is not a digit or a dot:
String i = stringBuilder.toString();
i = i.replaceAll("[^\\d.]", "");
which gives:
004861.15.4...............0.0...............0.4
I only want the numeric values of Temp, Fat and SNF. How can I keep the decimal points but get rid of the filler dots? Would a loop help? Please help me out, anyone.
Expected output:
15 (two digits) from Temp, 0.0 from Fat, and 0.4 from SNF.

Try this, though it's an ugly piece of code:
StringBuilder sb = new StringBuilder("SN: 00486 Mode: 1\n" +
        "Temp. 15.4 C\n" +
        "Fat............... 0.0%\n" +
        "SNF............... 0.4%");
String str = sb.toString();
// Drop the percent signs and put everything on one line
str = str.replace("%", "").replace("\n", " ");
String[] splt = str.split(" ");
String fullStr = "";
for (String st : splt) {
    try {
        // Keep only the tokens that parse as numbers
        Double num = Double.parseDouble(st);
        fullStr += num.toString() + "\n";
    } catch (NumberFormatException ex) {
        // Not a number (e.g. "SN:" or "Temp."), so skip it
    }
}
System.out.println(fullStr);
This prints every number on its own line, so you can pick out the ones you need.
If you need clarification, please ask.
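If only the Temp, Fat and SNF readings are needed, another option is to match each label together with the number that follows it. The sketch below shows one way this might look; the class name ReadingParser, the method parse and the fixed label list are illustrative assumptions, not something taken from the answers here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; all names are illustrative only.
class ReadingParser {
    // A known label at the start of a line, then filler dots/spaces, then a decimal number.
    private static final Pattern READING =
            Pattern.compile("(?m)^(Temp|Fat|SNF)[.\\s]*(\\d+(?:\\.\\d+)?)");

    // Returns label -> number text, in the order the labels appear.
    static Map<String, String> parse(String text) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = READING.matcher(text);
        while (m.find()) {
            values.put(m.group(1), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        String text = "SN: 00486 Mode: 1\n"
                + "Temp. 15.4 C\n"
                + "Fat............... 0.0%\n"
                + "SNF............... 0.4%";
        Map<String, String> values = parse(text);
        // Keep only the whole degrees for Temp, as the question asks.
        int temp = (int) Double.parseDouble(values.get("Temp"));
        System.out.println(temp + " from Temp, " + values.get("Fat") + " from Fat, "
                + values.get("SNF") + " from SNF.");
    }
}
```

Because the label is part of the match, the SN and Mode numbers on the first line are never picked up, and the filler dots are consumed before the number starts.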

Well, I made a slight change to the code:
i = i.replaceAll("[^\\d]", "");
which gave me the result
00486117000240241040719
I then converted it to an int[] and picked my values out by position. Thanks for your replies, everyone.

It is better to read out the values you are looking for with a pattern:
Pattern snpat = Pattern.compile( "(?<info>[^\\d\\:\\.\\x20\r\n]+)(:?[\\.]+)(:?\\x20)?(?<value>[\\d\\.]+)[^\r\n][\r\n]?", Pattern.MULTILINE );
Matcher m = snpat.matcher( str );
String rslt = "";
while( m.find() ) {
String value = m.group( "value" );
if( m.group( "info" ).equals( "Temp" ) && value.indexOf( '.' ) >= 0 ) // ignore decimal places
value = value.substring( 0, value.indexOf( '.' ) );
rslt = String.join( rslt.isEmpty() ? "" : ", ", rslt, value + " from " + m.group( "info" ) );
}
System.out.println( rslt + '.' );
gives:
15 from Temp, 0.0 from Fat, 0.4 from SNF.
If you would like a blank before the comma, change the delimiter to " , ".

Related

Java match two strings without last character

I have a URL whose path is /mypath/check/10.10/-123.11. I want a comparison to return true if a number in the other URL optionally has 3 digits after the decimal point instead of 2, e.g. /mypath/check/10.101/-123.112 should match. The part before the decimal point must match exactly in both occurrences.
Some examples:
Success
/mypath/check/10.10/-123.11 = /mypath/check/10.101/-123.112
/mypath/check/10.10/-123.11 = /mypath/check/10.101/-123.11
/mypath/check/10.10/-123.11 = /mypath/check/10.10/-123.112
/mypath/check/10.10/123.11 = /mypath/check/10.101/123.112
.. and so forth
Failure :
/mypath/check/10.10/-123.11 != /mypath/check/10.121/-123.152
/mypath/check/10.11/-123.11 != /mypath/check/10.12/-123.11
The part before the decimal point can include a - sign and has 1 to 3 digits.

Try /mypath/check/10\.10/-?123\.11[ ]*=[ ]*/mypath/check/(\d\d)\.\1\d?/

Try this:
url1.equals(url2) || url1.equals(url2.replaceAll("\\d$", ""))

Idea
Regex subpatterns that should match optionally are suffixed with the ? quantifier. In your case this applies to the 3rd digit after a decimal point.
An equality test modulo that optional digit can be implemented by matching each occurrence of the context pattern and replacing the optional part of the match with the empty string. After this normalization, the strings can be tested for equality.
Code
// Initializing test data.
// Will compare Strings in batch1, batch2 at the same array position.
//
String[] batch1 = {
"/mypath/check/10.10/-123.11"
, "/mypath/check/10.10/-123.11"
, "/mypath/check/10.10/-123.11"
, "/mypath/check/10.10/123.11"
, "/mypath/check/10.10/-123.11"
, "/mypath/check/10.11/-123.11"
};
String[] batch2 = {
"/mypath/check/10.101/-123.112"
, "/mypath/check/10.101/-123.11"
, "/mypath/check/10.10/-123.112"
, "/mypath/check/10.101/123.112"
, "/mypath/check/10.121/-123.152"
, "/mypath/check/10.12/-123.11"
};
// Regex pattern used for normalization:
// - Basic pattern: decimal point followed by 2 or 3 digits
// - Optional part: 3rd digit of the basic pattern
// - Additional context: Pattern must match at the end of the string or be followed by a non-digit character.
//
Pattern re_p = Pattern.compile("([.][0-9]{2})[0-9]?(?:$|(?![0-9]))");
// Replacer routine for processing the regex match. Returns capture group #1
Function<MatchResult, String> fnReplacer= (MatchResult m)-> { return m.group(1); };
// Processing each test case
// Expected result
// match
// match
// match
// match
// mismatch
// mismatch
//
for ( int i = 0; i < batch1.length; i++ ) {
String norm1 = re_p.matcher(batch1[i]).replaceAll(fnReplacer);
String norm2 = re_p.matcher(batch2[i]).replaceAll(fnReplacer);
if (norm1.equals(norm2)) {
System.out.println("Url pair #" + Integer.toString(i) + ": match ( '" + norm1 + "' == '" + norm2 + "' )");
} else {
System.out.println("Url pair #" + Integer.toString(i) + ": mismatch ( '" + norm1 + "' != '" + norm2 + "' )");
}
}
Demo available here (ideone.com).

I'm assuming the first URL always has exactly 2 digits after every decimal point. If so, match the 2nd URL to the regex formed by appending an optional digit to the end of each decimal fraction in the first URL.
static boolean matchURL(String url1, String url2)
{
return url2.matches(url1.replaceAll("([.][0-9]{2})", "$1[0-9]?"));
}
Test:
String url1 = "/mypath/check/10.10/-123.11";
List<String> tests = Arrays.asList(
"/mypath/check/10.10/-123.11",
"/mypath/check/10.10/-123.111",
"/mypath/check/10.101/-123.11",
"/mypath/check/10.101/-123.111",
"/mypath/check/10.11/-123.11"
);
for(String url2 : tests)
System.out.format("%s : %s = %b%n", url1, url2, matchURL(url1, url2));
Output:
/mypath/check/10.10/-123.11 : /mypath/check/10.10/-123.11 = true
/mypath/check/10.10/-123.11 : /mypath/check/10.10/-123.111 = true
/mypath/check/10.10/-123.11 : /mypath/check/10.101/-123.11 = true
/mypath/check/10.10/-123.11 : /mypath/check/10.101/-123.111 = true
/mypath/check/10.10/-123.11 : /mypath/check/10.11/-123.11 = false
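One caveat with building a regex directly from url1 is that its literal characters (the dots in particular) are interpreted as regex syntax. The following is a minimal sketch of a variant that quotes the literal parts with Pattern.quote; the class name UrlFractionMatcher is hypothetical, and the assumption that url1 always has exactly two digits after each decimal point is carried over from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of matchURL that quotes the literal parts of url1.
class UrlFractionMatcher {
    // A decimal point followed by exactly two digits (no third digit after them).
    private static final Pattern FRACTION = Pattern.compile("[.][0-9]{2}(?![0-9])");

    static boolean matchURL(String url1, String url2) {
        StringBuilder regex = new StringBuilder();
        Matcher m = FRACTION.matcher(url1);
        int last = 0;
        while (m.find()) {
            // Literal text up to and including the two-digit fraction...
            regex.append(Pattern.quote(url1.substring(last, m.end())));
            // ...followed by one optional extra digit.
            regex.append("[0-9]?");
            last = m.end();
        }
        // Whatever literal text remains after the last fraction.
        regex.append(Pattern.quote(url1.substring(last)));
        return url2.matches(regex.toString());
    }

    public static void main(String[] args) {
        String url1 = "/mypath/check/10.10/-123.11";
        System.out.println(matchURL(url1, "/mypath/check/10.101/-123.112"));
        System.out.println(matchURL(url1, "/mypath/check/10.11/-123.11"));
        // An unquoted '.' would also accept this one; the quoted version does not.
        System.out.println(matchURL(url1, "/mypath/check/10x10/-123.11"));
    }
}
```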

How do I stop regex after finding "Message: "?

I'm splitting the body of a JSON message with the regex ":|\n" and storing the values into an array. I would like to get assistance with stopping my regex expression from splitting the message once it finds "Message: ".
In the JSON body, each section is separated by a new line, so the body looks similar to this:
{"body": "Name: Alfred Alonso\nCompany: null\nEmail: 123#abc.com\nPhone Number: 123-456-9999\nProject Type: Existing\nContact by: Email\nTime Frame: within 1 month\nMessage: Hello,\nThis is my message.\nThank You,\nJohn Doe"}
The code below works perfectly when the user doesn't create a new line within the message, so the entire message gets stored as one array value.
Thank you to anyone that can help me fix this!
String[] messArr = body.split(":|\n");
for (int i = 0; i < messArr.length; i++)
messArr[i] = messArr[i].trim();
if ("xxx".equals(eventSourceARN)) {
name = messArr[1];
String[] temp;
String delimiter = " ";
temp = name.split(delimiter);
name = temp[0];
String lastName = temp[1];
company = messArr[3];
email = messArr[5];
phoneNumber = messArr[7];
projectType = messArr[9];
contactBy = messArr[11];
timeFrame = messArr[13];
message = messArr[15];
I would like
messArr[14] = "Message"
messArr[15] = "Hello, This is my message. Thank you, John Doe"
This is what I get
[..., Message, Hello,, This is my message., Thank You, John Doe].
messArr[14] = "Message"
messArr[15] = "Hello,"
messArr[16] = "This is my message."
messArr[17] = "Thank You,"
messArr[18] = "John Doe"

Instead of using split, you can use a find loop, e.g.
Pattern p = Pattern.compile("([^:\\v]+): |((?<=Message: )(?s:.*)|(?!$).*)\\R?");
List<String> result = new ArrayList<>();
for (Matcher m = p.matcher(input); m.find(); )
result.add(m.start(1) != -1 ? m.group(1) : m.group(2));
Test
String input = "Name: Alfred Alonso\n" +
"Company: null\n" +
"Email: 123#abc.com\n" +
"Phone Number: 123-456-9999\n" +
"Project Type: Existing\n" +
"Contact by: Email\n" +
"Time Frame: within 1 month\n" +
"Message: Hello,\n" +
"This is my message.\n" +
"Thank You,\n" +
"John Doe";
Pattern p = Pattern.compile("([^:\\v]+): |((?<=Message: )(?s:.*)|(?!$).*)\\R?");
List<String> result = new ArrayList<>();
for (Matcher m = p.matcher(input); m.find(); )
result.add(m.start(1) != -1 ? m.group(1) : m.group(2));
for (int i = 0; i < result.size(); i++)
System.out.println("result[" + i + "]: " + result.get(i));
Output
result[0]: Name
result[1]: Alfred Alonso
result[2]: Company
result[3]: null
result[4]: Email
result[5]: 123#abc.com
result[6]: Phone Number
result[7]: 123-456-9999
result[8]: Project Type
result[9]: Existing
result[10]: Contact by
result[11]: Email
result[12]: Time Frame
result[13]: within 1 month
result[14]: Message
result[15]: Hello,
This is my message.
Thank You,
John Doe
Explanation
Match one of:
    (                          Start capture #1
      [^:\v]+                  Match one or more characters that are not a : or a linebreak
    )                          End capture #1
    :                          Match, but don't capture, a : and a space (which SO is hiding here)
  | or:
    (                          Start capture #2
      Match one of:
        (?<=Message: )(?s:.*)  Rest of input, i.e. all text including linebreaks, if the text is immediately preceded by "Message: "
      | or:
        (?!$)                  Don't match if we're already at end-of-input
        .*                     Match 0 or more characters up to end-of-line, excluding the EOL
    )                          End capture #2
    \R?                        Match, but don't capture, an optional linebreak. This doesn't apply to Message text, and is optional in case there is no Message text and no linebreak after last value

If you want to, you could do exactly what you are doing and then put things together later. As you are trimming, notice where it says Message, then know that the Message is in the next slot and beyond. Then put it back together.
int messagePosition = -1;
for (int i = 0; i < messArr.length; i++){
messArr[i] = messArr[i].trim();
if (i>0 && messArr[i-1].equals("Message")){
messagePosition =i;
}
}
if (messagePosition > -1){
for (int i=messagePosition+1; i <messArr.length; i++){
messArr[messagePosition]=messArr[messagePosition]+" "+messArr[i];
}
}
One downside is that because arrays are fixed size, you need to act as if there is nothing beyond the messagePosition. So any calculations with length will be misleading. If for some reason you are worried you will look in the slots beyond, you could add messArr[i]=""; to the second for loop after the concatenation step.
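Another way to avoid the fixed-size array problem is to cut the body in two at the line that starts with "Message: " before splitting anything else. The sketch below is one possible shape for that, assuming the field layout shown in the question; the class name MessageFields and the choice of a Map keyed by label are illustrative assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative only.
class MessageFields {
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Split once at a line that starts with "Message: "; the limit of 2
        // keeps everything after the marker in a single piece.
        String[] parts = body.split("(?m)^Message: ", 2);
        for (String line : parts[0].split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip lines that are not "label: value"
            }
            fields.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        if (parts.length == 2) {
            // Join the message lines with single spaces.
            fields.put("Message", parts[1].replace("\n", " ").trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String body = "Name: Alfred Alonso\nCompany: null\nEmail: 123#abc.com\n"
                + "Phone Number: 123-456-9999\nProject Type: Existing\nContact by: Email\n"
                + "Time Frame: within 1 month\nMessage: Hello,\nThis is my message.\n"
                + "Thank You,\nJohn Doe";
        parse(body).forEach((label, value) -> System.out.println(label + " -> " + value));
    }
}
```

Looking fields up by label also removes the dependence on hard-coded positions such as messArr[15].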

Why ws4j online demo values and source code demo values differ, especially the lesk value?

I am trying to find the similarity between two words (for example "home" and "house") using Lesk.
I ran the ws4j demo code to compute the Lesk value, and I also computed it with the online ws4j demo.
The two give different values:
Values from running the demo code shipped with ws4j:
WuPalmer = 0.4
JiangConrath = 0.08467941109843881
LeacockChodorow = 1.1349799328389845
Lin = 0.16528546101187536
Resnik = 1.1692001183611416
Path = 0.1111111111111111
Lesk = 0.0
HirstStOnge = 0.0
Values by online demo:
wup( home#n#8 , house#n#10 ) = 1.0000
jcn( home#n#8 , house#n#10 ) = 12876699.5
lch( home#n#8 , house#n#10 ) = 3.6889
lin( home#n#8 , house#n#10 ) = 1.0000
res( home#v#1 , house#v#2 ) = 9.0735
path( home#n#8 , house#n#10 ) = 1.0000
lesk( home#n#8 , house#n#10 ) = 1571
hso( home#n#8 , house#n#10 ) = 16
Why is there such a huge difference between the two when both use the same ws4j?
Is there a problem with the demo code?

// db is assumed to be the lexical database instance (ILexicalDatabase) from the ws4j demo code
String word1 = "house";
String word2 = "home";
String p1 = "";
String p2 = "";
RelatednessCalculator wup = new WuPalmer(db);
List<POS[]> posPairs = wup.getPOSPairs();
double maxScore = -1D;
// Compare every sense of word1 with every sense of word2 and keep the best score
for (POS[] posPair : posPairs) {
    List<Concept> synsets1 = (List<Concept>) db.getAllConcepts(word1, posPair[0].toString());
    List<Concept> synsets2 = (List<Concept>) db.getAllConcepts(word2, posPair[1].toString());
    for (Concept ss1 : synsets1) {
        for (Concept ss2 : synsets2) {
            Relatedness relatedness = wup.calcRelatednessOfSynset(ss1, ss2);
            double score = relatedness.getScore();
            if (score > maxScore) {
                maxScore = score;
                // Remember the POS of the best-scoring pair
                p1 = ss1.getPos().toString();
                p2 = ss2.getPos().toString();
            }
        }
    }
}
if (maxScore == -1D) {
    maxScore = 0.0;
}
System.out.println("sim('" + word1 + " " + p1 + "', '" + word2 + " " + p2 + "') = " + maxScore);

For one thing, ws4j does show inconsistencies between its online demo and the last stable release (v1.0.1); a related issue has been reported.
In your case, however, the difference arises because the "mfs" flag (which stands for Most Frequent Sense) is set to true by default in the ws4j library. When this flag is true, similarity is computed only on the most frequent sense of each word; when it is false, similarity is computed over all sense combinations. This is essentially the same as @Pranav's answer.
As you would expect, the computational cost increases greatly when mfs is set to false, so I guess that is why the author made true the default.
If you want to set the mfs value to false in your code, simply use:
WS4JConfiguration.getInstance().setMFS(false);
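To make the effect of the flag concrete, here is a toy sketch of the two selection strategies. It is not ws4j code and does not call the library: the class name, the method and the score table are all made up for illustration, and it ignores part-of-speech pairs for simplicity.

```java
// Toy illustration only: this is not ws4j code, and the scores are made up.
class SenseSelectionSketch {
    // scores[i][j] = similarity of sense i of word 1 and sense j of word 2,
    // with senses ordered from most to least frequent.
    static double similarity(double[][] scores, boolean mostFrequentSenseOnly) {
        if (mostFrequentSenseOnly) {
            // Only the first (most frequent) sense of each word is compared.
            return scores[0][0];
        }
        // Otherwise take the best score over all sense combinations.
        double best = 0.0;
        for (double[] row : scores) {
            for (double s : row) {
                best = Math.max(best, s);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] scores = {
            {0.4, 0.2},
            {0.3, 1.0}
        };
        System.out.println(similarity(scores, true));
        System.out.println(similarity(scores, false));
    }
}
```

With a table like this, the first call reports the score of the two most frequent senses while the second reports the highest score in the table, which is the kind of gap seen between the two sets of values in the question.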

Home and house are both in the same synset, so for wup and jcn the value seems right. Which version of the JDK do you use? Try this link:
http://maraca.d.umn.edu/cgi-bin/similarity/similarity.cgi?word1=home&senses1=all&word2=house&senses2=all&measure=wup&rootnode=yes
It will also give you the same result.
Use home#n#1 and house#n#1 in the online version; it will then give the same result as your compiled code.

How can I make the numbers that I print in this code right adjusted?

How can I make the numbers that I print in this code right adjusted?
Since the numbers in the code are variables, do I need to do something different from the usual way of adjusting them?
a = 4 ;
b = 4 ;
c = 1 ;
x = 2 ;
Root = (-1*b + Math.sqrt( Math.pow(b,2) - 4*a*c)/ 2*a );
CoefficientOfXSquared = (-(b*x+c)/(Math.pow(x,2)) );
CoefficientOfX = (-(a*x+c)-c/x );
Constant = (-(a*Math.pow(x,2)+b*x) );
System.out.println("\n\n\t Given that: \n\t CoefficientOfXSquared = " +a );
System.out.println("\n\t CoefficientOfX = " +b );
System.out.println("\n\t Constant = " +c );
System.out.println("\n\t Root = " +x );
System.out.println("\n\n\t x = " + Root );
System.out.println("\n\t a = " + CoefficientOfXSquared );
System.out.println("\n\t b = " + CoefficientOfX );
System.out.println("\n\t c = " + Constant );
System.out.println("\n\n\n" );
I would appreciate it if someone could explain how to make it right adjusted.

Try using the string format methods to print them, like this:
double x=1234.56;
double y=78.678;
System.out.printf("%n\t Root = %12.4f", x);
System.out.printf("%n\t Constant = %12.4f%n", y);
which gives:
Root =    1234.5600
Constant =      78.6780
I am assuming that these are doubles. Look up printf for more info.
Cliff

I think the docs are the best resource for this; they show how to format numeric output using printf(). There is also a great tutorial on formatting numbers in Java.

You can do this to right-justify the text:
String.format("%50s", "Root = " + root);
Try the String.format function; its format arguments provide many options for customizing the output.
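Since the labels in the question have different lengths, a fixed width on the number alone will not line the values up in one column. A minimal sketch that pads the label as well is shown below; the class name AlignedOutput, the helper row and the chosen widths (24 and 12) are assumptions picked for this example.

```java
import java.util.Locale;

// Hypothetical helper; the widths are arbitrary choices for this example.
class AlignedOutput {
    // Left-justify the label in 24 columns and right-justify the number in 12.
    static String row(String label, double value) {
        return String.format(Locale.ROOT, "%-24s = %12.4f", label, value);
    }

    public static void main(String[] args) {
        System.out.println(row("CoefficientOfXSquared", 4));
        System.out.println(row("CoefficientOfX", 4));
        System.out.println(row("Constant", 1));
        System.out.println(row("Root", 2));
    }
}
```

The minus sign in %-24s pads the label on the right, while %12.4f pads the number on the left, so every row has the same length and the decimal points sit in the same column.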

how to extract this using regex

I need to extract the domain from hostnames.
Example:
www.google.com
maps.google.com
maps.maps.google.com
I need to extract google.com from each of these.
How can I do this in Java?

Split on . and pick the last two bits.
String s = "maps.google.com";
String[] arr = s.split("\\.");
//should check the size of arr here
System.out.println(arr[arr.length-2] + '.' + arr[arr.length-1]);
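The size check mentioned in the comment could look something like the sketch below. The class and method names are hypothetical, and returning the input unchanged when there are fewer than two parts is just one possible choice. Note that this simple rule does not know about multi-part suffixes such as co.uk.

```java
// Hypothetical helper; names and the fallback behaviour are assumptions.
class DomainTail {
    static String lastTwoLabels(String host) {
        String[] parts = host.split("\\.");
        if (parts.length < 2) {
            // Nothing to cut off, e.g. "localhost".
            return host;
        }
        return parts[parts.length - 2] + "." + parts[parts.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(lastTwoLabels("www.google.com"));
        System.out.println(lastTwoLabels("maps.maps.google.com"));
        System.out.println(lastTwoLabels("localhost"));
    }
}
```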

Assuming you want the last two labels of the hostname (google.com in your example), you could try this:
Pattern pat = Pattern.compile( ".*\\.([^.]+\\.[^.]+)" ) ;
Matcher mat = pat.matcher( "maps.google.com" ) ;
if( mat.find() ) {
System.out.println( mat.group( 1 ) ) ;
}
If it's the other way round and you want everything except the last two parts of the domain (in your example: www, maps, and maps.maps), just change the first line to:
Pattern pat = Pattern.compile( "(.*)\\.[^.]+\\.[^.]+" ) ;

Extracting a known substring from a string doesn't make much sense ;) Why would you do
String result = address.replaceAll("^.*(google\\.com)$", "$1");
when it is equivalent to:
String result = "google.com";
If you need a test, try:
boolean isGoogle = address.endsWith(".google.com");
If you need the other part of a google address, this may help:
String googleSubDomain = address.replaceAll("\\.google\\.com$", "");
(hint - the first line of code is a solution for your problem!)

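String str = "www.google.com";
// Find the second-to-last dot and keep everything after it.
// With fewer than two dots, lastIndexOf returns -1 and the whole string is kept,
// so no exception handling is needed.
int lastDot = str.lastIndexOf('.');
System.out.println(str.substring(str.lastIndexOf('.', lastDot - 1) + 1));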