How to cut out specific pieces of a string - java

So I have a string that has several start markers and end markers. How can I write code that keeps only the segments between the start and end markers?
A good example would be DNA transcription.
So the starting marker would be TAC, and an end marker would be ACT.
I have a string: AGATACACGACTAGCGAGCTACGATACTACC.
I know how to use the substring method, but not well enough so that it cuts the string down to:
TACACGACTTACGATACT.
How can I do this?
EDIT: I have solved this problem by writing this method:
private String spliceString(String n) {
    int counter1 = 0;
    int startloc = 0;
    int endloc = 0;
    String m = "";
    while (n.indexOf("TAC", counter1) != -1) {
        startloc = n.indexOf("TAC", counter1);
        int from = startloc + 3; // look for an end marker only after the start marker
        if (n.indexOf("ACT", from) != -1) {
            endloc = n.indexOf("ACT", from);
        }
        else if (n.indexOf("ATT", from) != -1) {
            endloc = n.indexOf("ATT", from);
        }
        else if (n.indexOf("ATC", from) != -1) {
            endloc = n.indexOf("ATC", from);
        }
        else {
            return "AAAA"; // Returns an error string. This will be caught in another method that is not relevant.
        }
        m = m + n.substring(startloc, endloc + 3);
        counter1 = endloc + 3; // resume after the end marker so segments cannot overlap
    }
    System.out.println(m); // Just prints the result to check that the code worked
    return m;
}

For this, regular expressions are your friend.
One way would be to search for what you want to keep, and collect that in a StringBuilder.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "AGATACACGACTAGCGAGCTACGATACTACC";
StringBuilder buf = new StringBuilder();
// Lazy match: from each TAC to the nearest following ACT
Matcher m = Pattern.compile("TAC.*?ACT").matcher(input);
while (m.find()) {
    buf.append(m.group());
}
String output = buf.toString();
System.out.println(output); // prints: TACACGACTTACGATACT
See IDEONE for running code.
Read the javadoc of Pattern for more information on regex.
Alternatively, delete what you don't want to keep, i.e.
Text before first TAC
Text between ACT and TAC
Text after last ACT
The code is much simpler, but the regex is a bit more complex:
String input = "AGATACACGACTAGCGAGCTACGATACTACC";
String output = input.replaceAll("(?<=^|ACT).*?(?=TAC|$)", "");
System.out.println(output); // prints: TACACGACTTACGATACT
See regex101.com for nice color-coded example.
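The asker's own method also treats ATT and ATC as end markers. As a rough sketch (not part of the answer above), the find-and-collect approach could be extended with an alternation. The class name MarkerSegments and the method keepSegments are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerSegments {
    // Lazy match from the start marker TAC to the nearest of the three end markers.
    private static final Pattern SEGMENT = Pattern.compile("TAC.*?(?:ACT|ATT|ATC)");

    // Concatenates every start-to-end segment found in the input (hypothetical helper).
    static String keepSegments(String input) {
        StringBuilder kept = new StringBuilder();
        Matcher m = SEGMENT.matcher(input);
        while (m.find()) {
            kept.append(m.group());
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepSegments("AGATACACGACTAGCGAGCTACGATACTACC")); // TACACGACTTACGATACT
    }
}
```

Matches never overlap here, so a TAC that begins inside a previous end marker is skipped. Whether that is the desired behavior depends on the data.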

Java - String substring() Method
Description:
This method has two variants, each returning a new string that is a substring of this string. The substring begins with the character at the specified index and extends either to the end of this string or, if a second argument is given, up to index endIndex - 1.
Syntax:
Here is the syntax of this method:
public String substring(int beginIndex)
or
public String substring(int beginIndex, int endIndex)
Parameters:
Here is the detail of parameters:
beginIndex -- the begin index, inclusive.
endIndex -- the end index, exclusive.
Return Value:
The specified substring.
Example:
import java.io.*;
public class Test{
public static void main(String args[]){
String Str = new String("Welcome to Tutorialspoint.com");
System.out.print("Return Value :" );
System.out.println(Str.substring(10) );
System.out.print("Return Value :" );
System.out.println(Str.substring(10, 15) );
}
}
This produces the following result:
Return Value : Tutorialspoint.com
Return Value : Tuto
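To make the inclusive/exclusive rule concrete, here is a small sketch. The helper safeSubstring is hypothetical (it is not part of the String API); it clamps both indexes so that out-of-range values do not throw a StringIndexOutOfBoundsException:

```java
class SubstringBounds {
    // Hypothetical helper, not part of java.lang.String: clamps both indexes
    // into the range [0, s.length()] so that substring never throws.
    static String safeSubstring(String s, int begin, int end) {
        int b = Math.max(0, Math.min(begin, s.length()));
        int e = Math.max(b, Math.min(end, s.length()));
        return s.substring(b, e);
    }

    public static void main(String[] args) {
        String word = "hamburger";
        System.out.println(word.substring(4, 8));         // urge (indexes 4 to 7)
        System.out.println(word.substring(4));            // urger (index 4 to the end)
        System.out.println(word.substring(9).isEmpty());  // true: beginIndex == length is allowed
        System.out.println(safeSubstring(word, -3, 100)); // hamburger
    }
}
```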

Is there a way to find out how many numbers are at the end of a string without knowing the exact index?

I have a method that extracts a certain substring from a string. This substring consists of the numbers in the string. Then this is parsed to an integer.
Method:
protected int startIndex() throws Exception {
String str = getWorkBook().getDefinedName("XYZ");
String sStr = str.substring(10,13);
return Integer.parseInt(sStr) - 1;
}
Example:
String :
'0 DB'!$B$460
subString :
460
Well, I manually entered the index range for the substring. But I would like to automate it.
My approach:
String str = getWorkBook().getDefinedName("XYZ");
int length = str.length();
String sStr = str.substring(length - 3, length);
This works well for this example.
Now there is the problem that the numbers at the end of the string can also be 4 or 5 digits. If that is the case, I naturally get a NullPointerException.
Is there a way or another approach to find out how many numbers are at the end of the string?
You can use the regex (?<=\D)\d+$, which means one or more digits (i.e. \d+) at the end of the string, preceded by a non-digit (i.e. \D).
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
// Test
System.out.println(getNumber("'0 DB'!$B$460"));
}
static String getNumber(String str) {
Matcher matcher = Pattern.compile("(?<=\\D)\\d+$").matcher(str);
if (matcher.find()) {
return matcher.group();
}
// If no match is found, return the string itself
return str;
}
}
In your case I would recommend using a regex with replaceAll, like this:
String sStr = str.replaceAll(".*?([0-9]+)$", "$1");
This extracts all the digits at the end of your string, however many there are.
You also seem to be missing the case where the string does not end in digits. In that case replaceAll finds no match and returns the string unchanged, so check the result before you convert it to an integer:
String sStr = str.replaceAll(".*?([0-9]+)$", "$1");
if (sStr.matches("[0-9]+")) {
    return Integer.parseInt(sStr) - 1;
}
return 0; // or any default value
If you just want to get the last number, you can walk through the string in reverse and find the index where the number starts:
protected static int startIndex() {
    String str = getWorkBook().getDefinedName("XYZ");
    if (!str.isEmpty() && Character.isDigit(str.charAt(str.length() - 1))) {
        for (int i = str.length() - 1; i >= 0; i--) {
            if (!Character.isDigit(str.charAt(i)))
                return i + 1;
        }
        return 0; // the whole string consists of digits
    }
    return -1;
}
and then print it:
public static void main(String[] args) {
int start = startIndex();
if(start != -1)
System.out.println(getWorkBook().getDefinedName("XYZ").substring(start));
else
System.out.println("No Number found");
}
You will have to add the
Simple and fast solution without RegEx:
public class Main
{
    public static int getLastNumber(String str) {
        int index = str.length() - 1;
        // index >= 0 so that a string made up only of digits is handled too
        while (index >= 0 && Character.isDigit(str.charAt(index)))
            index--;
        // throws NumberFormatException if str does not end in a digit
        return Integer.parseInt(str.substring(index + 1));
    }
    public static void main(String[] args) {
        final String text = "'0 DB'!$B$460";
        System.out.println(getLastNumber(text));
    }
}
The output will be:
460
If I were going to do this, I would just search from the end. This is quite efficient. The code below yields -1 if the string does not end in a number. Other return options, such as an OptionalInt, could also be used.
String s = "'0 DB'!$B$460";
int i;
for (i = s.length(); i > 0 && Character.isDigit(s.charAt(i-1)); i--);
int vv = (i < s.length()) ? Integer.valueOf(s.substring(i)) : -1;
System.out.println(vv);
Prints
460
If you know that there will always be a number at the end you can forget the ternary (?:) above and just do the following:
int vv = Integer.valueOf(s.substring(i));
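As a sketch of the OptionalInt variant mentioned above (the class and method names are made up for illustration), the backwards scan can report "no trailing number" without a sentinel value such as -1:

```java
import java.util.OptionalInt;

class TrailingNumber {
    // Returns the number formed by the digits at the end of s, or an empty
    // OptionalInt when s does not end in a digit (hypothetical helper).
    static OptionalInt trailingNumber(String s) {
        int i = s.length();
        // Move left while the character before position i is a digit.
        while (i > 0 && Character.isDigit(s.charAt(i - 1))) {
            i--;
        }
        if (i == s.length()) {
            return OptionalInt.empty(); // no digits at the end
        }
        // Note: a digit run too long for an int makes parseInt throw NumberFormatException.
        return OptionalInt.of(Integer.parseInt(s.substring(i)));
    }

    public static void main(String[] args) {
        System.out.println(trailingNumber("'0 DB'!$B$460").getAsInt()); // 460
        System.out.println(trailingNumber("no digits").isPresent());    // false
    }
}
```

The caller then decides what to do in the empty case, for example with orElse(0).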

Read string format and fetch required irregular data

I have a string in the following format, which is the output of
new String(Files.readAllBytes(Paths.get(data)))
reading from a file:
a+2 b+3 c+33 d+88 ......
I want to get the data that follows c+. The position of c is not fixed, but c occurs only once, and it may occur anywhere. The value I need always comes right after c+, and its length (33 here) is not fixed either. Can someone help me with optimal code, please? I think collections need to be used here.
You can use this regex which will let you capture the data you want,
c\+(\d+)
Explanation:
c\+ matches a literal c character immediately followed by a literal + character (the backslash escapes the +, which would otherwise act as a quantifier)
(\d+) captures the next digit(s) which you are interested in capturing.
Demo, https://regex101.com/r/jfYUPG/1
Here is Java code demonstrating the same:
public static void main(String args[]) {
String s = "a+2 b+3 c+33 d+88 ";
Pattern p = Pattern.compile("c\\+(\\d+)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println("Data: " + m.group(1));
} else {
System.out.println("Input data doesn't match the regex");
}
}
This gives the following output:
Data: 33
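If the key is not always c, the same idea can be wrapped in a method. This is only a sketch: the names ValueAfterKey and valueAfter are invented here, and the lookbehind (?<!\S) is an added assumption that a key must start a whitespace-separated token, so that a key of c does not match inside abc+5:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValueAfterKey {
    // Returns the digits that follow "<key>+" in data, if such a token exists (hypothetical helper).
    static Optional<String> valueAfter(String data, String key) {
        // (?<!\S) requires the key to start a token;
        // Pattern.quote escapes any regex metacharacters in the key.
        Pattern p = Pattern.compile("(?<!\\S)" + Pattern.quote(key) + "\\+(\\d+)");
        Matcher m = p.matcher(data);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(valueAfter("a+2 b+3 c+33 d+88 ", "c").orElse("not found")); // 33
    }
}
```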
This code extracts the value right after c+, up to the next space or to the end of the string if there is no space:
String str = "a+2 b+3 c+33 d+88 ";
String find = "c+";
int start = str.indexOf(find) + find.length(); // assumes find occurs in str
int index = str.indexOf(" ", start);
if (index == -1)
    index = str.length();
String result = str.substring(start, index);
System.out.println(result);
prints
33
or in a method:
public static String getValue(String str, String find) {
    int found = str.indexOf(find);
    if (found == -1)
        return null; // find does not occur in str
    int index = found + find.length();
    int indexSpace = str.indexOf(" ", index);
    if (indexSpace == -1)
        indexSpace = str.length();
    return str.substring(index, indexSpace);
}
public static void main(String[] args) {
String str = "a+2 b+3 c+33 d+88 ";
String find = "c+";
System.out.println(getValue(str, find));
}
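Since the question mentions collections, here is one possible sketch that parses every key+value token into a map, so that any key can be looked up afterwards. The class and method names are made up, and the code assumes tokens are separated by whitespace and that the first + in a token separates key from value:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueTokens {
    // Splits the data on whitespace and stores each "key+value" token in a map
    // that preserves the original order (hypothetical helper).
    static Map<String, String> parse(String data) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String token : data.trim().split("\\s+")) {
            int plus = token.indexOf('+');
            if (plus > 0) { // skip tokens without a key or without a '+'
                values.put(token.substring(0, plus), token.substring(plus + 1));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> values = parse("a+2 b+3 c+33 d+88 ");
        System.out.println(values.get("c")); // 33
    }
}
```

For a single lookup this does more work than indexOf or a regex; it pays off only when several keys are needed from the same string.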

How to get String between last two underscore

I have a string "abcde-abc-db-tada_x12.12_999ZZZ_121121.333"
The result I want should be 999ZZZ
I have tried using:
private static String getValue(String myString) {
Pattern p = Pattern.compile("_(\\d+)_1");
Matcher m = p.matcher(myString);
if (m.matches()) {
System.out.println(m.group(1)); // Should print 999ZZZ
}
else {
System.out.println("not found");
}
}
If you want to continue with a regex based approach, then use the following pattern:
.*_([^_]+)_.*
This will greedily consume everything up to the second-to-last underscore, match that underscore, and then consume and capture 999ZZZ.
Code sample:
String name = "abcde-abc-db-tada_x12.12_999ZZZ_121121.333";
Pattern p = Pattern.compile(".*_([^_]+)_.*");
Matcher m = p.matcher(name);
if (m.matches()) {
System.out.println(m.group(1)); // Should print 999ZZZ
} else {
System.out.println("not found");
}
Using String.split?
String given = "abcde-abc-db-tada_x12.12_999ZZZ_121121.333";
String [] splitted = given.split("_");
String result = splitted[splitted.length-2];
System.out.println(result);
Apart from split you can use substring as well:
String s = "abcde-abc-db-tada_x12.12_999ZZZ_121121.333";
String ss = (s.substring(0,s.lastIndexOf("_"))).substring((s.substring(0,s.lastIndexOf("_"))).lastIndexOf("_")+1);
System.out.println(ss);
OR,
String s = "abcde-abc-db-tada_x12.12_999ZZZ_121121.333";
String arr[] = s.split("_");
System.out.println(arr[arr.length-2]);
To get the text between the last two underscore characters, you first need to find the indexes of the last two underscore characters, which is very easy using lastIndexOf:
String s = "abcde-abc-db-tada_x12.12_999ZZZ_121121.333";
String r = null;
int idx1 = s.lastIndexOf('_');
if (idx1 != -1) {
int idx2 = s.lastIndexOf('_', idx1 - 1);
if (idx2 != -1)
r = s.substring(idx2 + 1, idx1);
}
System.out.println(r); // prints: 999ZZZ
This is faster than any solution using regex, including use of split.
I misread the logic of the code in the question at first, and in the meantime some great answers using regular expressions have appeared, so here is my attempt using only methods of the String class. It introduces a few variables purely for readability and could of course be written more compactly:
String s = "abcde-abc-db-ta__dax12.12_999ZZZ_121121.333";
int indexOfLastUnderscore = s.lastIndexOf("_");
int indexOfOneBeforeLastUnderscore = s.lastIndexOf("_", indexOfLastUnderscore - 1);
if(indexOfLastUnderscore != -1 && indexOfOneBeforeLastUnderscore != -1) {
String sub = s.substring(indexOfOneBeforeLastUnderscore + 1, indexOfLastUnderscore);
System.out.println(sub);
}
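The lastIndexOf approach can be sketched as a reusable method for any delimiter. The names below are made up for illustration, and returning null when fewer than two delimiters exist is just one possible convention:

```java
class LastTwoDelimiters {
    // Returns the text between the last two occurrences of delimiter,
    // or null if s contains fewer than two of them (hypothetical helper).
    static String betweenLastTwo(String s, char delimiter) {
        int last = s.lastIndexOf(delimiter);
        if (last == -1) {
            return null;
        }
        // Search backwards starting just before the last delimiter.
        int previous = s.lastIndexOf(delimiter, last - 1);
        if (previous == -1) {
            return null;
        }
        return s.substring(previous + 1, last);
    }

    public static void main(String[] args) {
        System.out.println(betweenLastTwo("abcde-abc-db-tada_x12.12_999ZZZ_121121.333", '_')); // 999ZZZ
    }
}
```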

Switching last two char in a string

I am learning Java as my first language. I found a solution for this problem on CodingBat, but I don't understand why my own solution doesn't work and would love your help.
Given a string of any length, return a new string where the last 2 chars, if present, are swapped, so "coding" yields "codign".
lastTwo("coding") → "codign"
lastTwo("cat") → "cta"
lastTwo("ab") → "ba"
This is my non-working code:
public String lastTwo(String str) {
int strLength = str.length();
String last = str.substring(strLength-1,strLength);
String bLast = str.substring(strLength-2,strLength-1);
if(strLength<2)
return str;
return str.substring(0, strLength-2)+last+bLast;
}
These are the errors, and I can't figure out why:
lastTwo("a") → "Exception:java.lang.StringIndexOutOfBoundsException: String index out of range: -1 (line number:5)"
lastTwo("") → "Exception:java.lang.StringIndexOutOfBoundsException: String index out of range: -1 (line number:4)"
It seems there is a problem when input is less than 2 chars but I can't figure out why. To me, the if logic looks okay.
You need to move the if condition up in the method:
public static void main(String[] args) {
System.out.println(lastTwo("coding"));
System.out.println(lastTwo("cat"));
System.out.println(lastTwo("ab"));
System.out.println(lastTwo("a"));
}
public static String lastTwo(String str) {
int strLength = str.length();
if(strLength<2)
return str;
String last = str.substring(strLength-1,strLength);
String bLast = str.substring(strLength-2,strLength-1);
return str.substring(0, strLength-2)+last+bLast;
}
This will print:
codign
cta
ba
a
If the length of str is less than 2 (e.g. 1), the method returns str unchanged; otherwise it computes last and bLast and then performs the swap.
When the input is 1 char long, strLength-2 is -1. The substring method throws the exception because no such index exists. (The same applies to an empty string and strLength-1.)
You have to put this check at the top:
if(strLength<2)
return str;
With this code, if the string is "", it tries to get the substrings between positions -1 and 0 and between positions -2 and -1. You can't take a substring that starts at a position below 0.
int strLength = str.length();
String last = str.substring(strLength-1,strLength);
String bLast = str.substring(strLength-2,strLength-1);
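A small sketch that reproduces the failure described here. The class and method names are invented for this illustration; the method runs the same bLast expression as the question and reports whether it throws:

```java
class SubstringFailure {
    // Evaluates the bLast expression from the question and reports
    // whether it throws for the given input (hypothetical helper).
    static boolean throwsForShortInput(String str) {
        int strLength = str.length();
        try {
            str.substring(strLength - 2, strLength - 1);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true; // the begin index was negative
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsForShortInput(""));   // true:  substring(-2, -1)
        System.out.println(throwsForShortInput("a"));  // true:  substring(-1, 0)
        System.out.println(throwsForShortInput("ab")); // false: substring(0, 1)
    }
}
```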
One of the overloads of substring takes only the starting index and runs to the end of the string. So the following gives you the last two chars:
str.substring(java.lang.Math.max(0,str.length()-2))
public String lastTwo(String str) {
if(str != null ) {
int strLength = str.length();
if (strLength < 2)
return str;
String last = str.substring(strLength-1,strLength);
String bLast = str.substring(strLength-2,strLength-1);
return str.substring(0, strLength-2)+last+bLast;
}
return null;
}
The problem in your code is String bLast = str.substring(strLength-2,strLength-1);
When strLength = 1 and you subtract 2, the index becomes -1, hence a StringIndexOutOfBoundsException occurs.
Using the code above solves the problem.
A simpler solution is to take the start of the string, append the last char, and then append the before-last char:
public static String lastTwo(String str) {
if (str.length()<2){
return str;
} else{
return str.substring(0, str.length() - 2) +
str.charAt(str.length() - 1) +
str.charAt(str.length() - 2);
}
}
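Another way to sketch the swap, shown only as an alternative and not taken from any answer above, is to work on a char array. The class name SwapLastTwo is made up, and the null check is an added assumption about how null input should be treated:

```java
class SwapLastTwo {
    // Swaps the last two chars in place on a copy of the string's characters.
    static String lastTwo(String str) {
        if (str == null || str.length() < 2) {
            return str; // nothing to swap
        }
        char[] chars = str.toCharArray();
        int n = chars.length;
        char tmp = chars[n - 1];
        chars[n - 1] = chars[n - 2];
        chars[n - 2] = tmp;
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(lastTwo("coding")); // codign
        System.out.println(lastTwo("a"));      // a
    }
}
```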

Java get Substring value from String

I have the string
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
I want to get only the string album3.
How can I get that substring?
I am currently using substring with fixed indexes.
Is there any other way? The album number changes, so fixed indexes will fail for cases like album9 and album10.
You can use a regular expression, but it seems like using index is the simplest in this case:
int start = path.lastIndexOf('/') + 1;
int end = path.lastIndexOf('_');
String album = path.substring(start, end);
You might want to throw in some error checking in case the formatting assumptions are violated.
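A minimal sketch of what such error checking might look like. The class name, the method name, and the choice of IllegalArgumentException are all assumptions made for illustration:

```java
class AlbumName {
    // Extracts the part of the file name before the last underscore
    // and rejects paths that do not follow the expected format.
    static String albumOf(String path) {
        int start = path.lastIndexOf('/') + 1; // 0 when the path has no slash
        int end = path.lastIndexOf('_');
        if (end < start) {
            // Covers both "no underscore at all" and "underscore only in a directory name".
            throw new IllegalArgumentException("no '_' in file name: " + path);
        }
        return path.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(albumOf("/mnt/sdcard/Album/album3_137213136.jpg"));  // album3
        System.out.println(albumOf("/mnt/sdcard/Album/album10_137213136.jpg")); // album10
    }
}
```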
Try this
public static void main(String args[]) {
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
String[] subString=path.split("/");
for(String i:subString){
if(i.contains("album")){
System.out.println(i.split("_")[0]);
}
}
}
Obligatory regex solution using String.replaceAll:
String album = path.replaceAll(".*(album\\d+)_.*", "$1");
Use of it:
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
String album = path.replaceAll(".*(album\\d+)_.*", "$1");
System.out.println(album); // prints "album3"
path = "/mnt/sdcard/Album/album21_137213136.jpg";
album = path.replaceAll(".*(album\\d+)_.*", "$1");
System.out.println(album); // prints "album21"
Using Paths:
final String s = Paths.get("/mnt/sdcard/Album/album3_137213136.jpg")
        .getFileName().toString();
final String album = s.substring(0, s.indexOf('_'));
If you don't have Java 7, you have to resort to File:
final String s = new File("/mnt/sdcard/Album/album3_137213136.jpg").getName();
final String album = s.substring(0, s.indexOf('_'));
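Use a regex to check whether the path contains such a substring (note that matches only returns a boolean; it does not extract the album name):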
path.matches(".*album[0-9]+.*")
Try this:
String path = "/mnt/sdcard/Album/album3_137213136.jpg";
path = path.substring(path.lastIndexOf("/") + 1, path.indexOf("_"));
System.out.println(path);

How to count substring in String in java

The example below walks through the string with a for loop.
Alternatively, the for loop can be replaced with a while loop such as while(true){. . .}
public class SubString {
public static void main(String[] args) {
int count = 0 ;
String string = "hidaya: swap the Ga of Gates with the hidaya: of Bill to make Bites."
+ " The hidaya: of Bill will then be swapped hidaya: with the Ga of Gates to make Gall."
+ " The new hidaya: printed out would be Gall Bites";
for (int i = 0; i < string.length(); i++)
{
int found = string.indexOf("hidaya:", i);//System.out.println(found);
if (found == -1) break;
int start = found + 5;// start of actual name
int end = string.indexOf(":", start);// System.out.println(end);
String subString = string.substring(start, end); //System.out.println(subString);
if(subString != null)
count++;
i = end + 1; //advance i to start the next iteration
}
System.out.println("In given String hidaya Occurred "+count+" time ");
}
}
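For comparison, here is a shorter sketch of the same counting idea as a reusable method. The names are made up for illustration, and the method counts non-overlapping occurrences only:

```java
class SubstringCounter {
    // Counts non-overlapping occurrences of target in text (hypothetical helper).
    static int countOccurrences(String text, String target) {
        if (target.isEmpty()) {
            return 0; // avoid an endless loop on an empty search string
        }
        int count = 0;
        int from = 0;
        // indexOf returns -1 once there is no further occurrence.
        while ((from = text.indexOf(target, from)) != -1) {
            count++;
            from += target.length(); // continue after the match just found
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("hidaya: one hidaya: two", "hidaya:")); // 2
    }
}
```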
