Read string format and fetch required irregular data

I have a string like the one below, read from a file with
new String(Files.readAllBytes(Paths.get(data)))
a+2 b+3 c+33 d+88 ......
I want to get the value that follows c+. The position of c is not fixed, but c occurs only once and may appear anywhere in the string. The value I need always comes directly after c+, and its length (33 here) is not fixed either. Can someone help me with optimal code for this? I think collections need to be used here.

You can use this regex to capture the data you want:
c\+(\d+)
Explanation:
c\+ matches a literal c character immediately followed by a literal + character.
(\d+) captures the digit(s) that follow, which is the value you are interested in.
Demo: https://regex101.com/r/jfYUPG/1
Here is Java code demonstrating the same (it needs java.util.regex.Pattern and java.util.regex.Matcher):
public static void main(String args[]) {
String s = "a+2 b+3 c+33 d+88 ";
Pattern p = Pattern.compile("c\\+(\\d+)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println("Data: " + m.group(1));
} else {
System.out.println("Input data doesn't match the regex");
}
}
This gives the following output:
Data: 33

This code extracts the value right after c+, up to the next space or to the end of the string if there is no space:
String str = "a+2 b+3 c+33 d+88 ";
String find = "c+";
int index = str.indexOf(" ", str.indexOf(find) + 2);
if (index == -1)
index = str.length();
String result = str.substring(str.indexOf(find) + 2, index);
System.out.println(result);
prints
33
or in a method:
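public static String getValue(String str, String find) {
int start = str.indexOf(find);
if (start == -1)
return null; // find does not occur in str
int index = start + find.length(); // first character after find
int indexSpace = str.indexOf(" ", index);
if (indexSpace == -1)
indexSpace = str.length();
return str.substring(index, indexSpace);
}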
public static void main(String[] args) {
String str = "a+2 b+3 c+33 d+88 ";
String find = "c+";
System.out.println(getValue(str, find));
}
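If more than one key is needed, one option is to split the input on whitespace and collect the tokens into a map, which also covers the "collections" idea from the question. This is only a sketch: the class name TokenMap and the method parse are made up for illustration, and it assumes every token has the form key+value, with the first + acting as the separator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TokenMap {

    // Hypothetical helper: turns "a+2 b+3 c+33" into {a=2, b=3, c=33}.
    static Map<String, String> parse(String input) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String token : input.trim().split("\\s+")) {
            int plus = token.indexOf('+');
            if (plus <= 0) {
                continue; // skip empty tokens and tokens without a key
            }
            values.put(token.substring(0, plus), token.substring(plus + 1));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> values = parse("a+2 b+3 c+33 d+88 ");
        System.out.println(values.get("c")); // prints 33
    }
}
```

A lookup for a key that is not present simply returns null, so the caller can decide how to handle missing data.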

Related

java: to retrieve a part of a string that begins and ends with a specific string

I have a string strLineText from which I need to extract the words that contain target.
For example, from the string "random string with IWANTTHISABC-123 and more" I need to extract IWANTTHISABC-123. Similarly, from the string "random string with IWANTTHISBBC-001" I need to extract IWANTTHISBBC-001. The prefix is fixed.
I tried substring() (Method1), but the logic does not work for strings that end with the target word, i.e. nothing is output.
I tried split() (Method2) and it works for all four combinations.
Can you help me make substring() (Method1) work for all four combinations?
public static void main(String[] args) throws IOException {
String target = "IWANTTHIS";
//Four possible inputs (enable one at a time)
String strLineText = "random string with IWANTTHISABC-123 and more"; //works
//String strLineText = "IWANTTHISCBC-45601 and more"; //works
//String strLineText = "IWANTTHISEBC-1"; //doesn't work
//String strLineText = "random string with IWANTTHISKBC-55545"; //doesn't work
//Method1
System.out.println("O/P 1:" + strLineText.substring(strLineText.indexOf(target),
strLineText.indexOf(target) + strLineText.substring(strLineText.indexOf(target)).indexOf(" ") + 1).trim());
//Method2
for (String s : strLineText.split(" "))
if (s.contains(target))
System.out.println("O/P 2:" + s.trim());
}

I think it is pretty straightforward: you just need to compute the end index starting from the begin index. Here is a snippet that works for all four inputs.
int begin = strLineText.indexOf(target);
int end = strLineText.indexOf(" ", begin);
if(end == -1) end = strLineText.length();
System.out.println(strLineText.substring(begin, end));
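One case the snippet above does not cover is a line that does not contain target at all: begin is then -1 and substring throws. A possible variant, sketched here with a regex, returns null instead. The names WordFinder and findWord are hypothetical, and the sketch treats any whitespace (not only a space) as the end of the word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordFinder {

    // Hypothetical helper: returns the run of non-whitespace characters that
    // starts at the first occurrence of target, or null when target is absent.
    static String findWord(String line, String target) {
        Matcher m = Pattern.compile(Pattern.quote(target) + "\\S*").matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findWord("random string with IWANTTHISABC-123 and more", "IWANTTHIS"));
        System.out.println(findWord("IWANTTHISEBC-1", "IWANTTHIS"));
        System.out.println(findWord("nothing to see here", "IWANTTHIS")); // prints null
    }
}
```

Pattern.quote keeps the target literal, so a prefix containing regex metacharacters would still be matched as plain text.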

This assumes that your definition of "word" is a sequence of letters, excluding digits, symbols, etc. For other definitions of "word", the regular expression can be adjusted accordingly. If you want to include the part of the word before the target string, you can add a loop that counts backwards from startIndex, examining characters to see whether they are letters.
public class Foo
{
public static void main(String[] args)
{
String target = "IWANTTHIS";
// String candidate = "random string with IWANTTHISABC-123 and more";
String candidate = "IWANTTHISCBC-45601 and more";
// String candidate = "IWANTTHISEBC-1";
// String candidate = "random string with IWANTTHISKBC-55545";
int startIndex = -1;
int endIndex = -1;
if(candidate.contains(target))
{
System.out.println("Target located.");
startIndex = candidate.indexOf(target);
System.out.println("target starts at " + startIndex);
// keep adding characters until first non-alpha char;
// endIndex is exclusive and starts just past the target
endIndex = startIndex + target.length();
boolean wordEnded = false;
while(!wordEnded && (endIndex < candidate.length()))
{
String foo = Character.toString(candidate.charAt(endIndex));
if(foo.matches("[a-zA-Z]"))
{
endIndex++;
}
else
{
wordEnded = true;
}
}
String full = candidate.substring(startIndex, endIndex);
System.out.println("Full string = " + full);
}
else
{
System.out.println("No target located. Exiting.");
}
}
}

strLineText.substring(strLineText.indexOf(target)).indexOf(" ") will be -1 if strLineText contains no spaces after your target string. You could check if strLineText.substring(strLineText.indexOf(target)) contains spaces, and if not, take the substring until the end of strLineText:
//Method1
int beginIndex = strLineText.indexOf(target);
String substring = strLineText.substring(beginIndex);
int endIndex = substring.contains(" ") ? beginIndex + substring.indexOf(" ") : strLineText.length();
System.out.println("O/P 1:" + strLineText.substring(beginIndex, endIndex));

How to display the characters upto a specific index of a String using String function?

I have my string defined as
text1:text2:text3:text4:text5
I want to get output as
text1:text2:text3
using String methods.
I have tried using lastIndexOf, then substring and then again lastIndexOf.
I want to avoid these three steps with calling lastIndexOf two times.
Is there a better way to achieve this?

You can do this by running a loop to iterate over the characters of the string from index = 0 to index = lastIndexOf('3'). Here's the code:
String s = "text1:text2:text3:text4:text5";
for(int i = 0; i <= s.lastIndexOf('3'); i++)
System.out.print(s.charAt(i));
This gives you the required output.
OUTPUT:
text1:text2:text3

A regular expression could be used to identify the correct part of the string:
private static Pattern PATTERN = Pattern.compile("([^:]*:){2}[^:]*(?=:|$)");
public static String find(String input) {
Matcher m = PATTERN.matcher(input);
return m.find() ? m.group() : null;
}
Alternatively do not use substring between every call of lastIndexOf, but use the version of lastIndexOf that restricts the index range:
public static String find(String input, int colonCount) {
int lastIndex = input.length();
while (colonCount > 0) {
lastIndex = input.lastIndexOf(':', lastIndex-1);
colonCount--;
}
return lastIndex >= 0 ? input.substring(0, lastIndex) : null;
}
Note that here colonCount is the number of : that are left out of the string.

You could try:
String text = "text1:text2:text3:text4:text5";
String[] splitted = text.split(":");
String result = "";
for (int i = 0; i < 3; i++) {
result += splitted[i] + ":";
}
result = result.substring(0, result.length() - 1);
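The same split-and-rejoin idea can be written without manual concatenation by using a stream, assuming Java 8 or later. This is just a sketch; the names FirstFields and firstFields are invented for the example.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class FirstFields {

    // Hypothetical helper: keeps the first `count` colon-separated fields.
    // If the input has fewer fields, all of them are returned.
    static String firstFields(String input, int count) {
        return Arrays.stream(input.split(":"))
                .limit(count)
                .collect(Collectors.joining(":"));
    }

    public static void main(String[] args) {
        System.out.println(firstFields("text1:text2:text3:text4:text5", 3));
        // prints text1:text2:text3
    }
}
```

Collectors.joining places the separator only between elements, so there is no trailing colon to strip afterwards.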

You can use the Java split()-method:
String string = "text1:text2:text3:text4:text5";
String[] text = string.split(":");
String text1 = text[0];
String text2 = text[1];
String text3 = text[2];
And then generate the output directly or with a for-loop:
// directly
System.out.println(text1 + ":" + text2 + ":" + text3);
// for-loop. Just enter how many elements you want to display.
for(int i = 0; i < 3; i++){
System.out.print(text[i] + " ");
}
Output:
text1 text2 text3
The advantage of this method is that your input and output can be a bit more complex, because you control the order in which the words are printed.
Example:
Consider Master Yoda.
He has a strange way of talking and often mixes up the sentence structure. When he introduces himself, he says the (incorrect!) sentence: "Master Yoda my name is".
Now you want to create a universal translator that, of course, fixes those mistakes while translating from one species to another.
You take in the input string and "divide" it into its parts:
String string = "Master:Yoda:my:name:is";
String[] text = string.split(":");
String jediTitle = text[0];
String lastName = text[1];
String posessivePronoun = text[2];
String noun = text[3];
String linkingVerb = text[4];
The array "text" now contains the sentence in the order that you put it in. Now your translator can analyze the structure and correct it:
String correctSentenceStructure = posessivePronoun + " " + noun + " " + linkingVerb + " " + jediTitle + " " + lastName;
System.out.println(correctSentenceStructure);
Output:
my name is Master Yoda
A working translator might be another step towards peace in the galaxy.

Maybe try this one-liner: s.substring(0, s.lastIndexOf('3')+1);
Complete example:
package testing.project;
public class Main {
public static void main(String[] args) {
String s = "text1:text2:text3:text4:text5";
System.out.println(s.substring(0, s.lastIndexOf('3')+1));
}
}
Output:
text1:text2:text3

Converting C++ std::string's find_*_of() methods to Java

When converting code from C++ to Java, what is an easy way to convert the std::string methods like find_last_of(), find_last_not_of, etc?
These C++ methods find an index of any of a set of characters.
Java's String class provides indexOf() and lastIndexOf(), but these find an index of a character or a string, not any of a set of characters.
For example, the code below finds the last character that is not ASCII whitespace.
size_t pos = myString.find_last_not_of( " \t\n\r" );

One option is to use Guava's CharMatcher class.
Here are tested conversions for each of the single-argument find_*_of() methods.
public int findFirstOf( String sequence, String str ) {
return CharMatcher.anyOf( str ).indexIn( sequence );
}
public int findFirstNotOf( String sequence, String str ) {
return CharMatcher.anyOf( str ).negate().indexIn( sequence );
}
public int findLastOf( String sequence, String str ) {
return CharMatcher.anyOf( str ).lastIndexIn( sequence );
}
public int findLastNotOf( String sequence, String str ) {
return CharMatcher.anyOf( str ).negate().lastIndexIn( sequence );
}
Other answers welcomed. [I couldn't find anything for find_last_not_of() in Java when searching on stackoverflow and elsewhere. And I missed CharMatcher the first time I searched through Guava for corresponding functionality. I'd like to document this easy conversion for future use.]

If you like regex, you can give the equivalents below a shot. This might not be the most efficient method, but it is worth considering if you don't want to use any third-party library (given that there are no equivalent methods in Java's String class).
P.S.: If you are comfortable with a third-party library, I wouldn't suggest using regex for this task, as it might soon become difficult to extend as requirements change.
So, this is just another option:
public int findFirstOf( String sequence, String str ) {
String regex = "^[^" + Pattern.quote(str) + "]*";
int index = sequence.length() - sequence.replaceAll(regex, "").length();
return index == sequence.length() ? -1 : index;
}
public int findFirstNotOf( String sequence, String str ) {
String regex = "^[" + Pattern.quote(str) + "]*";
int index = sequence.length() - sequence.replaceAll(regex, "").length();
return index == sequence.length() ? -1 : index;
}
public int findLastOf( String sequence, String str ) {
String regex = "[^" + Pattern.quote(str) + "]*$";
return sequence.replaceAll(regex, "").length() - 1;
}
public int findLastNotOf( String sequence, String str ) {
String regex = "[" + Pattern.quote(str) + "]*$";
return sequence.replaceAll(regex, "").length() - 1;
}
I haven't tested the above methods. You can test them and compare the results with the corresponding methods you already have to see whether they work. Please get back to me if they don't.
As far as third-party libraries are concerned, you also have the StringUtils class in Apache Commons, with the following methods:
StringUtils#indexOfAny()
StringUtils#indexOfAnyBut()
StringUtils#lastIndexOfAny()
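If neither a library nor regex is wanted, the four single-argument methods can also be sketched with plain loops. The class name FindOf is made up for this example; the methods return -1 where C++ returns std::string::npos, and they compare UTF-16 char values, so characters outside the Basic Multilingual Plane are not handled specially.

```java
final class FindOf {

    // Index of the first char of sequence that occurs in chars, or -1.
    static int findFirstOf(String sequence, String chars) {
        for (int i = 0; i < sequence.length(); i++) {
            if (chars.indexOf(sequence.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    // Index of the first char of sequence that does not occur in chars, or -1.
    static int findFirstNotOf(String sequence, String chars) {
        for (int i = 0; i < sequence.length(); i++) {
            if (chars.indexOf(sequence.charAt(i)) < 0) {
                return i;
            }
        }
        return -1;
    }

    // Index of the last char of sequence that occurs in chars, or -1.
    static int findLastOf(String sequence, String chars) {
        for (int i = sequence.length() - 1; i >= 0; i--) {
            if (chars.indexOf(sequence.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    // Index of the last char of sequence that does not occur in chars, or -1.
    static int findLastNotOf(String sequence, String chars) {
        for (int i = sequence.length() - 1; i >= 0; i--) {
            if (chars.indexOf(sequence.charAt(i)) < 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Mirrors the C++ example from the question.
        System.out.println(findLastNotOf("hello \t\n", " \t\n\r")); // prints 4
    }
}
```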

static int findFirstNotOf(String searchIn, String searchFor, int searchFrom) {
boolean found;
char c;
int i;
for (i = searchFrom; i < searchIn.length(); i++) {
found = true;
c = searchIn.charAt(i);
System.out.printf("s='%s', idx=%d\n",c,searchFor.indexOf(c));
if (searchFor.indexOf(c) == -1) {
found = false;
}
if (!found) {
return i;
}
}
return i;
}
static int findLastNotOf(String searchIn, String searchFor, int searchFrom) {
boolean found;
char c;
int i;
for ( i = searchFrom; i>=0; i--) {
found = true;
c = searchIn.charAt(i);
System.out.printf("s='%s', idx=%d\n",c,searchFor.indexOf(c));
if (searchFor.indexOf(c) == -1)
found = false;
if (!found) return i;
}
return i;
}
public static void main(String[] args){
String str = "look for non-alphabetic characters...";
int found = findFirstNotOf(str,"abcdefghijklmnopqrstuvwxyz ",0);
if (found!=str.length()) {
System.out.print("The first non-alphabetic character is " + str.charAt(found));
System.out.print(" at position " + found + '\n');
}
found = findLastNotOf(str,"abcdefghijklmnopqrstuvwxyz ",str.length()-1);
if (found>=0) {
System.out.print("The last non-alphabetic character is " + str.charAt(found));
System.out.print(" at position " + found + '\n');
}
str = "Please, erase trailing white-spaces \n";
String whitespaces = " \t\f\n\r";
found = findLastNotOf(str,whitespaces,str.length()-1);
if (found>=0)
str = str.substring(0,found+1);
else
str = ""; // str is all whitespace
System.out.printf('['+ str +"]\n");
}

Java get Substring value from String

I have the string
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
I want only the substring album3.
How can I get that substring?
I am currently using substring with fixed indexes.
Is there another way? The album number changes, so fixed indexes will fail for names like album9 or album10.

You can use a regular expression, but it seems like using index is the simplest in this case:
int start = path.lastIndexOf('/') + 1;
int end = path.lastIndexOf('_');
String album = path.substring(start, end);
You might want to throw in some error checking in case the formatting assumptions are violated.
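The error checking mentioned above could look roughly like this. It is a sketch only: AlbumName and albumOf are hypothetical names, and returning an empty Optional for unexpected input is one possible choice among several (throwing an exception would be another).

```java
import java.util.Optional;

final class AlbumName {

    // Hypothetical helper: returns the part of the file name before the last
    // underscore, or empty when the path does not have the expected shape.
    static Optional<String> albumOf(String path) {
        int start = path.lastIndexOf('/') + 1; // 0 when there is no slash
        int end = path.lastIndexOf('_');
        if (end <= start) {
            // no underscore at all, underscore only in a directory name,
            // or nothing between the slash and the underscore
            return Optional.empty();
        }
        return Optional.of(path.substring(start, end));
    }

    public static void main(String[] args) {
        System.out.println(albumOf("/mnt/sdcard/Album/album3_137213136.jpg").orElse("?"));  // prints album3
        System.out.println(albumOf("/mnt/sdcard/Album/album10_137213136.jpg").orElse("?")); // prints album10
        System.out.println(albumOf("/mnt/sdcard/Album/cover.jpg").orElse("?"));             // prints ?
    }
}
```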

Try this
public static void main(String args[]) {
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
String[] subString=path.split("/");
for(String i:subString){
if(i.contains("album")){
System.out.println(i.split("_")[0]);
}
}
}

Obligatory regex solution using String.replaceAll:
String album = path.replaceAll(".*(album\\d+)_.*", "$1");
Use of it:
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
String album = path.replaceAll(".*(album\\d+)_.*", "$1");
System.out.println(album); // prints "album3"
path = "/mnt/sdcard/Album/album21_137213136.jpg";
album = path.replaceAll(".*(album\\d+)_.*", "$1");
System.out.println(album); // prints "album21"

Using Paths:
final String s = Paths.get("/mnt/sdcard/Album/album3_137213136.jpg")
.getFileName().toString();
String album = s.substring(0, s.indexOf('_'));
If you don't have Java 7, you have to resort to File:
final String s = new File("/mnt/sdcard/Album/album3_137213136.jpg").getName();
String album = s.substring(0, s.indexOf('_'));

Use a regex to check whether the path contains such a substring (matches returns only true or false, it does not extract the text):
path.matches(".*album[0-9]+.*")

Try this:
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
path = path.substring(path.lastIndexOf("/")+1,path.indexOf("_"));
System.out.println(path);

How to count substring in String in java
The program below uses a for loop to count the occurrences.
Alternatively, the for loop can be replaced with a while loop such as while(true){. . .}
public class SubString {
public static void main(String[] args) {
int count = 0 ;
String string = "hidaya: swap the Ga of Gates with the hidaya: of Bill to make Bites."
+ " The hidaya: of Bill will then be swapped hidaya: with the Ga of Gates to make Gall."
+ " The new hidaya: printed out would be Gall Bites";
for (int i = 0; i < string.length(); i++)
{
int found = string.indexOf("hidaya:", i);//System.out.println(found);
if (found == -1) break;
int start = found + 5;// start of actual name
int end = string.indexOf(":", start);// System.out.println(end);
String subString = string.substring(start, end); //System.out.println(subString);
if(subString != null)
count++;
i = end + 1; //advance i to start the next iteration
}
System.out.println("In given String hidaya Occurred "+count+" time ");
}
}
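A more compact way to count occurrences is to let indexOf do the searching and jump past each match. This is a sketch under the assumption that overlapping matches should not be counted; OccurrenceCounter and count are names chosen for the example.

```java
final class OccurrenceCounter {

    // Counts non-overlapping occurrences of needle in haystack.
    static int count(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // avoid an endless loop on the empty string
        }
        int count = 0;
        int from = haystack.indexOf(needle);
        while (from != -1) {
            count++;
            // continue searching right after the match just found
            from = haystack.indexOf(needle, from + needle.length());
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "hidaya: one hidaya: two hidaya: three";
        System.out.println(count(text, "hidaya:")); // prints 3
    }
}
```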

Generate fixed length Strings filled with whitespaces

I need to produce fixed-length strings to generate a character-position-based file. Missing characters must be filled with the space character.
As an example, the field CITY has a fixed length of 15 characters. For the inputs "Chicago" and "Rio de Janeiro" the outputs are
"        Chicago"
" Rio de Janeiro".

Since Java 1.5 we can use the method java.lang.String.format(String, Object...) with a printf-like format.
The format string "%1$15s" does the job, where 1$ indicates the argument index, s indicates that the argument is a String, and 15 represents the minimal width of the String.
Putting it all together: "%1$15s".
For a general method we have:
public static String fixedLengthString(String string, int length) {
return String.format("%1$"+length+ "s", string);
}
Maybe someone can suggest another format string to fill the empty spaces with a specific character?

Utilize String.format's padding with spaces and replace them with the desired char.
String toPad = "Apple";
String padded = String.format("%8s", toPad).replace(' ', '0');
System.out.println(padded);
Prints 000Apple.
Update: a more performant version (since it does not rely on String.format) that has no problem with spaces in the input (thanks to Rafael Borja for the hint).
int width = 10;
char fill = '0';
String toPad = "New York";
String padded = new String(new char[width - toPad.length()]).replace('\0', fill) + toPad;
System.out.println(padded);
Prints 00New York.
But a check needs to be added to prevent the attempt of creating a char array with negative length.
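One way to add that check is sketched below. The names PadLeft and padLeft are hypothetical, and returning the input unchanged when it is already long enough is an assumption; truncating or throwing would be equally valid choices depending on the file format.

```java
import java.util.Arrays;

final class PadLeft {

    // Hypothetical helper: left-pads toPad with fill up to width characters.
    // Strings that are already at least width long are returned unchanged.
    static String padLeft(String toPad, int width, char fill) {
        int missing = width - toPad.length();
        if (missing <= 0) {
            return toPad; // nothing to add, and no negative array size
        }
        char[] padding = new char[missing];
        Arrays.fill(padding, fill);
        return new String(padding) + toPad;
    }

    public static void main(String[] args) {
        System.out.println(padLeft("New York", 10, '0')); // prints 00New York
        System.out.println(padLeft("Rio de Janeiro", 10, '0')); // prints Rio de Janeiro
    }
}
```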

This code produces a string with exactly the given number of characters, padded with spaces or truncated on the right side:
private String leftpad(String text, int length) {
return String.format("%" + length + "." + length + "s", text);
}
private String rightpad(String text, int length) {
return String.format("%-" + length + "." + length + "s", text);
}

For right padding you need String.format("%1$-15s", str),
i.e. the - flag pads on the right (left-justifies the text), and omitting it pads on the left.
See my example:
import java.util.Scanner;
public class Solution {
public static void main(String[] args) {
Scanner sc=new Scanner(System.in);
System.out.println("================================");
for(int i=0;i<3;i++)
{
String s1=sc.nextLine();
Scanner line = new Scanner( s1);
line=line.useDelimiter(" ");
String language = line.next();
int mark = line.nextInt();
System.out.printf("%s%03d\n",String.format("%1$-15s", language),mark);
}
System.out.println("================================");
}
}
The input must be a string and a number
example input : Google 1

String.format("%15s",s) // pads left
String.format("%-15s",s) // pads right

import org.apache.commons.lang3.StringUtils;
String stringToPad = "10";
int maxPadLength = 10;
String paddingCharacter = " ";
StringUtils.leftPad(stringToPad, maxPadLength, paddingCharacter);
Way better than Guava imo. Never seen a single enterprise Java project that uses Guava but Apache String Utils is incredibly common.

You can also write a simple method like the one below:
public static String padString(String str, int leng) {
for (int i = str.length(); i < leng; i++)
str += " ";
return str;
}

The Guava Library has Strings.padStart that does exactly what you want, along with many other useful utilities.

Here's a neat trick:
// E.g pad("sss","00000000"); should deliver "00000sss".
public static String pad(String string, String pad) {
/*
* Add the pad to the left of string then take as many characters from the right
* that is the same length as the pad.
* This would normally mean starting my substring at
* pad.length() + string.length() - pad.length() but obviously the pad.length()'s
* cancel.
*
* 00000000sss
* ^ ----- Cut before this character - pos = 8 + 3 - 8 = 3
*/
return (pad + string).substring(string.length());
}
public static void main(String[] args) throws InterruptedException {
try {
System.out.println("Pad 'Hello' with '          ' produces: '"+pad("Hello","          ")+"'");
// Prints: Pad 'Hello' with '          ' produces: '     Hello'
} catch (Exception e) {
e.printStackTrace();
}
}

Here is the code with test cases ;) :
@Test
public void testNullStringShouldReturnStringWithSpaces() throws Exception {
String fixedString = writeAtFixedLength(null, 5);
assertEquals(fixedString, "     ");
}
@Test
public void testEmptyStringReturnStringWithSpaces() throws Exception {
String fixedString = writeAtFixedLength("", 5);
assertEquals(fixedString, "     ");
}
@Test
public void testShortString_ReturnSameStringPlusSpaces() throws Exception {
String fixedString = writeAtFixedLength("aa", 5);
assertEquals(fixedString, "aa   ");
}
@Test
public void testLongStringShouldBeCut() throws Exception {
String fixedString = writeAtFixedLength("aaaaaaaaaa", 5);
assertEquals(fixedString, "aaaaa");
}
private String writeAtFixedLength(String pString, int lenght) {
if (pString != null && !pString.isEmpty()){
return getStringAtFixedLength(pString, lenght);
}else{
return completeWithWhiteSpaces("", lenght);
}
}
private String getStringAtFixedLength(String pString, int lenght) {
if(lenght < pString.length()){
return pString.substring(0, lenght);
}else{
return completeWithWhiteSpaces(pString, lenght - pString.length());
}
}
private String completeWithWhiteSpaces(String pString, int lenght) {
for (int i=0; i<lenght; i++)
pString += " ";
return pString;
}
I like TDD ;)

The StringUtils class from the Apache Commons Lang 3 dependency solves left/right padding.
Apache Commons Lang 3 provides the StringUtils class, whose following method left-pads a string with your preferred character:
StringUtils.leftPad(final String str, final int size, final char padChar);
This is a static method with the following parameters:
str - the string to pad (can be null)
size - the size to pad to
padChar - the character to pad with
The StringUtils class has additional methods as well:
rightPad
repeat
different join methods
Here is the Gradle dependency for your reference:
implementation 'org.apache.commons:commons-lang3:3.12.0'
https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/3.12.0
See all the utility methods of this class at:
https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html
Guava library
This is from jricher's answer: the Guava library has Strings.padStart, which does exactly what you want, along with many other useful utilities.

This code works great.
String ItemNameSpacing = new String(new char[10 - masterPojos.get(i).getName().length()]).replace('\0', ' ');
printData += masterPojos.get(i).getName()+ "" + ItemNameSpacing + ": " + masterPojos.get(i).getItemQty() +" "+ masterPojos.get(i).getItemMeasure() + "\n";
Happy Coding!!

public static String padString(String word, int length) {
String newWord = word;
for(int count = word.length(); count < length; count++) {
newWord = " " + newWord;
}
return newWord;
}

This simple function works for me:
public static String leftPad(String string, int length, String pad) {
return pad.repeat(Math.max(0, length - string.length())) + string; // String.repeat needs Java 11+
}
Invocation:
String s = leftPad(myString, 10, "0");

import java.util.Scanner;
public class Solution {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
for (int i = 0; i < 3; i++) {
String s1 = sc.next();
int x = sc.nextInt();
System.out.printf("%-15s%03d\n", s1, x);
// %-15s -->pads right,%15s-->pads left
}
}
}
Use printf() to simply format output without using any library.
