Simple Java question using while loop, substring, and indexOf

I'm working on an exercise for learning Java where I am supposed to write a method to print to the screen all items that come after the word "category:". This is my attempt at it:
public static void main(String[] args) {
    String str = "We have a large inventory of things in our warehouse falling in "
            + "the category:apperal and the slightly "
            + "more in demand category:makeup along with the category:furniture and _.";
    printCategories(str);
}
public static void printCategories(String passedString) {
    int startOfSubstring = passedString.indexOf(":") + 1;
    int endOfSubstring = passedString.indexOf(" ", startOfSubstring);
    String categories = passedString.substring(startOfSubstring,endOfSubstring);
    while(startOfSubstring > 0) {
        System.out.println(categories);
        startOfSubstring = passedString.indexOf((":") + 1, passedString.indexOf(categories));
        System.out.println(startOfSubstring);
        System.out.println(categories);
    }
}
So the program should print:
apperal
makeup
furniture
My idea is that the program should print the substring that starts right after a ":" and ends at the next " ". Then it does the same thing again, except that instead of starting from the very beginning of the variable str, it starts from the beginning of the last category found.
Once there are no more ":" to be found, the indexOf call (part of startOfSubstring) will return -1 and the loop will terminate. However, after printing the first category it keeps returning -1 and terminating before finding the next category.
The two lines:
System.out.println(startOfSubstring);
System.out.println(categories);
Confirm that it is returning -1 after printing the first category, and the last line confirms that the categories variable is still defined as "apperal". If I comment out the line:
startOfSubstring = passedString.indexOf((":") + 1, passedString.indexOf(categories));
It returns the startOfSubstring as 77. So it is something to do with that line and attempting to change the start of search position in the indexOf method that is causing it to return -1 prematurely, but I cannot figure out why this is happening. I've spent the last few hours trying to figure it out...
Please help :(

There are a couple of issues with the program:
You're searching passedString for (":") + 1 which is the string ":1", probably not what you want.
You should evaluate endOfSubstring and categories inside the loop.
This is probably close to what you want:
public static void printCategories(String passedString) {
    int startOfSubstring = passedString.indexOf(":") + 1;
    while(startOfSubstring > 0) {
        int endOfSubstring = passedString.indexOf(" ", startOfSubstring);
        // If "category:whatever" can appear at the end of the string
        // without a space, adjust endOfSubstring here.
        String categories = passedString.substring(startOfSubstring, endOfSubstring);
        // Do something with categories here, maybe print it?
        // Find next ":" starting with end of category string.
        startOfSubstring = passedString.indexOf(":", endOfSubstring) + 1;
    }
}
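For reference, here is a self-contained sketch of the same loop that collects the categories and also copes with a category at the very end of the string. The names CategoryDemo and findCategories are made up for this example, and treating a missing space as "the category runs to the end of the string" is an assumption about the exercise, not something the question states.

```java
import java.util.ArrayList;
import java.util.List;

class CategoryDemo {
    // Collects every token that directly follows a ':' up to the next space.
    static List<String> findCategories(String text) {
        List<String> found = new ArrayList<>();
        int start = text.indexOf(":") + 1;       // 0 when there is no ':' at all
        while (start > 0) {
            int end = text.indexOf(" ", start);
            if (end == -1) {
                end = text.length();             // assumption: category runs to end of string
            }
            found.add(text.substring(start, end));
            start = text.indexOf(":", end) + 1;  // 0 again once no further ':' exists
        }
        return found;
    }

    public static void main(String[] args) {
        String str = "We have a large inventory of things in our warehouse falling in "
                + "the category:apperal and the slightly "
                + "more in demand category:makeup along with the category:furniture and _.";
        for (String category : findCategories(str)) {
            System.out.println(category);
        }
    }
}
```

Returning a list instead of printing inside the loop is only a convenience so the result can be checked; printing directly in the loop works the same way.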

I have corrected (in a comment) where you set the new value of startOfSubstring:
while(startOfSubstring > 0) { // better if you do startOfSubstring != -1 IMO
    System.out.println(categories);
    // this should be startOfSubstring = passedString.indexOf(":", startOfSubstring +1);
    startOfSubstring = passedString.indexOf((":") + 1, passedString.indexOf(categories));
    System.out.println(startOfSubstring);
    System.out.println(categories);
}

Related

StringIndexOutOfBoundsException when trying to get string from long string

I am trying to extract a substring from a long string, which is a Firebase URL:
"https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296"
Notice that there is an underscore before and after the name Julien in the string above. I am trying to get that name, but I am getting
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
Here is my code:
String s="https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
String newName=s.substring(s.indexOf("_")+1, s.indexOf("_"));
System.out.println(newName);

As said in my comment, when using substring, the first index must not be greater than the second one.
In your case, you are calling substring with x + 1 and x, where x is s.indexOf("_"). Since x + 1 > x, substring fails.
I understand that you are trying to get the index of the second _.
Here is code that would, in your case, yield Julien:
String s = "...";
int start = s.indexOf("_") + 1;
int end = s.indexOf("_", start);
// name will hold the content of s between the first two `_`s, assuming they exist.
String name = s.substring(start, end);
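A slightly more defensive variant of the same idea, as a sketch. The class and method names (UnderscoreDemo, between) are invented for this example, and returning null when the delimiter does not occur twice is just one possible convention.

```java
class UnderscoreDemo {
    // Returns the text between the first two occurrences of delim,
    // or null if delim occurs fewer than two times.
    static String between(String s, String delim) {
        int first = s.indexOf(delim);
        if (first == -1) {
            return null;
        }
        int start = first + delim.length();
        int end = s.indexOf(delim, start);   // search only after the first match
        if (end == -1) {
            return null;
        }
        return s.substring(start, end);      // start <= end is guaranteed here
    }

    public static void main(String[] args) {
        String s = "https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/"
                + "Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
        System.out.println(between(s, "_"));
    }
}
```

Because the second search starts at start, the end index can never be smaller than the begin index, which is exactly the condition the original call violated.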

If the requirements are not clear on which two _ to select, here is a Java 8 Stream way of doing it:
public class Check {
    public static void main(String[] args) {
        String s = "https://firebasestorage.googleapis.com/v0/b/No-manworld-3577.appspot.com/o/Contacts%2F1510361061636_Julien_Vcf?alt=media&token=c0bff20d-d115-4fef-b58c-4c7ffaef4296";
        long count = s.chars().filter(ch -> ch == '_').count();
        if (count == 2) {
            System.out.println(s.substring(s.indexOf('_') + 1, s.lastIndexOf('_')));
        } else {
            System.out.println("Expected exactly 2 underscores, found " + count);
        }
    }
}
Why didn't your code work?
Let's assume s.indexOf("_") returns some positive number, say 10. Then the call below translates to:
String newName=s.substring(s.indexOf("_")+1, s.indexOf("_"));
String newName=s.substring(11, 10);
This throws StringIndexOutOfBoundsException because endIndex < beginIndex in the substring method.

Understanding this algorithm to find permutations of a string (recursion)

I was going through my textbook, and I couldn't really wrap my head around how this generates the permutations of a string recursively:
class PermutationIterator
{
    private String wrd;
    private int current;
    private PermutationIterator Iter;
    private String tl;
    // Constructor
    public PermutationIterator(String s)
    {
        wrd = s;
        current = 0;
        if (wrd.length() > 0)
            Iter = new PermutationIterator(wrd.substring(1));
    }
    public String nextPermutation()
    {
        if(wrd.length() == 0)
        {
            current++;
            return "";
        }
        char c = wrd.charAt(current);
        String nextPermut = Iter.nextPermutation();
        if(!Iter.hasMorePermutations())
        {
            System.out.println("Current value is " + current + " word length is " + wrd.length());
            current++;
            if (current >= wrd.length()) {
                Iter = null;
            }
            else
            {
                if (current + 1 >= wrd.length())
                    tl = wrd.substring(0,current);
                else
                    //System.out.println("Reached");
                    tl = wrd.substring(0,current) + wrd.substring(current + 1, wrd.length());
                Iter = new PermutationIterator(tl);
            }
        }
        return c + nextPermut;
    }
    public boolean hasMorePermutations()
    {
        System.out.println("Inside this method we have current= " + current + " with wrdlength " + wrd.length() +"with the word " + wrd);
        return current < wrd.length();
    }
}
This gets called by
public static void main(String [] args)
{
    PermutationIterator iter = new PermutationIterator("eat");
    while(iter.hasMorePermutations())
    {
        System.out.println(iter.nextPermutation());
    }
}
For eat this will output
eat
eta
aet
ate
tea
tae
My attempt
Before even attempting to understand everything, for the past three days I have been really struggling to figure out how exactly !Iter.hasMorePermutations() is reached. The only way this can be false is if return current < wrd.length(); is not true. i.e wrd.length() <= current.
Now here is where it really starts to lose me. I tried printing out the values of word.length and current inside the !Iter.hasMorePermutations() branch just to see what was going on.
Current value is 0 word length is 1
Current value is 0 word length is 2
eat
Wait.. How is this possible? Isn't our condition for reaching this branch, to have current value bigger than our word length? How did we ever reach this branch?
I have also attached a picture of my attempt to trace the program.
Thanks for reading this!

There are four iterators active at a time, for each of the word lengths. They all have their own values of current, and the call to hasMorePermutations is checking the values of current and length on the next iterator, not itself. So you may want to instead output:
System.out.println("Current value is " + Iter.current + " word length is " + Iter.wrd.length());
To start with, all their current values are 0, so we have for 'eat':
(word = 'eat', current = 0, length = 3)
(word = 'at', current = 0, length = 2)
(word = 't', current = 0, length = 1)
(word = '', current = 0, length = 0)
Each iterator calls nextPermutation on the next, until we get to the last iterator which has its current value incremented because wrd.length() == 0. So we get:
(word = 'eat', current = 0, length = 3)
(word = 'at', current = 0, length = 2)
(word = 't', current = 0, length = 1)
(word = '', current = 1, length = 0)
This is detected in the third iterator's Iter.hasMorePermutations() check, which will then increment its own current value and, since it has no letters left to try, discard the last iterator (Iter = null):
(word = 'eat', current = 0, length = 3)
(word = 'at', current = 0, length = 2)
(word = 't', current = 1, length = 1)
This is similarly detected by the second iterator, which resets the last two iterators:
(word = 'eat', current = 0, length = 3)
(word = 'at', current = 1, length = 2)
(word = 'a', current = 0, length = 1)
(word = '', current = 0, length = 0)
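The first iterator's call to Iter.hasMorePermutations() will then return true (the second iterator has current = 1 < length = 2), so the !Iter.hasMorePermutations() branch is skipped and the first iterator doesn't increment its current value. It returns 'eat', and the next call gives the String 'eta'.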

I guess that what you are missing is that the test isn't this.hasMorePermutations(), but rather Iter.hasMorePermutations(), possibly confused by the fact that Iter looks like a class name rather than a field name...
So, just before the call at your point 3 returns, that object has word length 0 and current 1, that is, no more permutations. When control returns to the parent, that object is the one being tested, so you enter the if-statement.

What you are printing are the values of wrd and current of this object, not of the object Iter on which !Iter.hasMorePermutations() is evaluated.
To understand what is going on you should be printing Iter.wrd.length() and Iter.current.

In your diagram, at number 3, execution falls into this if branch:
if(wrd.length() == 0)
{
    current++;
    return "";
}
Once this branch has executed, wrd.length() is 0 and current is 1, so the very next call to hasMorePermutations on that object returns false.

What's so complicated? The way I read the code and translate into English, the procedure is:
for(current=0 to string_len)
    pick the letter at the position of current
    extract it out from the string (make a string with that letter missing). (this will have a length reduced by one; am I stating the obvious here?)
    recursively generate the permutations of the string so constructed at the prev step (call it tl), taking care to add the extracted letter as the first one for each of the permutation generated on the shortened tl string
Since when current reaches string_len there's no more letters to extract, it means you are done.
It's a case of "turtles all the way down" - a 3-letter iterator will use a 2-letter iterator, which will use a 1-letter iterator which will use an empty-string iterator.
At each level, the iterator will extract the current letter, create a level-1 iterator,'squeeze it dry' and discard it when it doesn't have anything to offer. Once current is at max, this iterator will report to the level+1 one that "i'm dry" and get sacked.
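The procedure above can also be written as plain recursion, which may be easier to trace than the iterator version. This is only a sketch of the same idea, not the textbook's code; PermutationDemo and permutations are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PermutationDemo {
    // For each position: take that letter out, permute the remaining letters,
    // and put the extracted letter in front of every shorter permutation.
    static List<String> permutations(String word) {
        if (word.isEmpty()) {
            return Collections.singletonList("");   // exactly one permutation of ""
        }
        List<String> result = new ArrayList<>();
        for (int current = 0; current < word.length(); current++) {
            char c = word.charAt(current);
            String tl = word.substring(0, current) + word.substring(current + 1);
            for (String rest : permutations(tl)) {
                result.add(c + rest);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String p : permutations("eat")) {
            System.out.println(p);   // eat, eta, aet, ate, tea, tae (one per line)
        }
    }
}
```

The iterator class produces the same sequence lazily: each level keeps its own current where this version keeps a loop variable, and its sub-iterator plays the role of the recursive call.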
--
"Rubber duck debugging" to understand how !Iter.hasMorePermutations() is reached:
Empty string
set current = 0
extract first letter - I cannot, because current == len (both are 0), thus:
return empty string first time we are called (and increment current)
when called, tell the caller 'false==hasMorePermutation()'
See
if(wrd.length() == 0)
{
    current++;
    return "";
}
Word with 1 letter - make it "t"
set current=0
extract the letter at current - first_letter='t'; tl=""
generate the first permutation of tl (empty string) using an empty string permutation iterator (see above)
String nextPermut = Iter.nextPermutation();
only one possible, as it is an empty string. When I, the one-letter-iterator, get to ask my sub-iterator (the empty string one) "do you have more permutations", the answer will come in the negative
if(!Iter.hasMorePermutations())
// this is true immediately for my empty-str subiterator
So, what will I, the one-letter-iterator, do?
I will increment my current, then...
because next time I'm asked I won't be able to generate one more, I'll sack my (zero-length) iterator
if (current >= wrd.length()) {
    Iter = null;
}
and then return the extracted letter (my only one) prepended to the permutation generated by my (already sacked) iterator : thus returning a "t" prepended to an empty string.
What I, the one-letter iterator will answer next time I'm asked hasMorePermutations? Well, false, because I incremented my current to the length of my string (1)
Do you really want me to continue, or are you already seeing the 'iterators using iterators for one-less-letter words, each one of those iterators with its own current and word'?

Java program malfunction

First half of my question: When I try to run my program it loads and loads forever; it never shows the results. Could someone check my code and spot the error? This program is meant to find a start DNA codon ATG and keep looking until finding a stop codon TAA or TAG or TGA, and then print out the gene from start to stop. I'm using BlueJ.
Second half of my question: I'm supposed to write a program in which the following steps are needed to be taken:
To find the first gene, find the start codon ATG.
Next look immediately past ATG for the first occurrence of each of the three stop codons TAG, TGA, and TAA.
If the length of the substring between ATG and any of these three stop codons is a multiple of three, then a candidate for a gene is the start codon through the end of the stop codon.
If there is more than one valid candidate, the smallest such string is the gene. The gene includes the start and stop codon.
If no start codon was found, then you are done.
If a start codon was found, but no gene was found, then start searching for another gene via the next occurrence of a start codon starting immediately after the start codon that didn't yield a gene.
If a gene was found, then start searching for the next gene immediately after this found gene.
Note that according to this algorithm, for the string "ATGCTGACCTGATAG", ATGCTGACCTGATAG could be a gene, but ATGCTGACCTGA would not be, even though it is shorter, because another instance of 'TGA' is found first that is not a multiple of three away from the start codon.
In my assignment I'm asked to produce these methods as well:
Specifically, to implement the algorithm, you should do the following.
Write the method findStopIndex that has two parameters dna and index, where dna is a String of DNA and index is a position in the string. This method finds the first occurrence of each stop codon to the right of index. From those stop codons that are a multiple of three from index, it returns the smallest index position. It should return -1 if no stop codon was found and there is no such position. This method was discussed in one of the videos.
Write the void method printAll that has one parameter dna, a String of DNA. This method should print all the genes it finds in DNA. This method should repeatedly look for a gene, and if it finds one, print it and then look for another gene. This method should call findStopIndex. This method was also discussed in one of the videos.
Write the void method testFinder that will use the two small DNA example strings shown below. For each string, it should print the string, and then print the genes found in the string. Here is sample output that includes the two DNA strings:
Sample output is:
ATGAAATGAAAA
Gene found is:
ATGAAATGA
DNA string is:
ccatgccctaataaatgtctgtaatgtaga
Genes found are:
atgccctaa
atgtctgtaatgtag
DNA string is:
CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA
Genes found are:
ATGTAA
ATGAATGACTGATAG
ATGCTATGA
ATGTGA
I've thought it through and found this bit of code to be close to working order. I just need my output to produce the results asked for in the instructions. Hopefully this isn't too messy; I'm just at a loss as to how to look for a stop codon after the start codon and then how to grab the gene sequence. I'm also hoping to understand how to get the closest gene sequence by finding which of the three stop codons (tag, tga, taa) is closest to atg. I know this is a lot, but hopefully it all makes sense.
import edu.duke.*;
import java.io.*;
public class FindMultiGenes {
    public String findGenes(String dnaOri) {
        String gene = new String();
        String dna = dnaOri.toLowerCase();
        int start = -1;
        while(true){
            start = dna.indexOf("atg", start);
            if (start == -1) {
                break;
            }
            int stop = findStopCodon(dna, start);
            if(stop > start){
                String currGene = dnaOri.substring(start, stop+3);
                System.out.println("From: " + start + " to " + stop + "Gene: "
                        +currGene);}
        }
        return gene;
    }
    private int findStopCodon(String dna, int start){
        for(int i = start + 3; i<dna.length()-3; i += 3){
            String currFrameString = dna.substring(i, i+3);
            if(currFrameString.equals("TAG")){
                return i;
            } else if( currFrameString.equals("TGA")){
                return i;
            } else if( currFrameString.equals("TAA")){
                return i;
            }
        }
        return -1;
    }
    public void testing(){
        FindMultiGenes FMG = new FindMultiGenes();
        String dna =
                "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
        FMG.findGenes(dna);
        System.out.println("DNA string is: " + dna);
    }
}

Change your line start = dna.indexOf("atg", start); to
start = dna.indexOf("atg", start + 1);
What is currently happening is you find the "atg" at index k and in the next run search the string for the next "atg" from k onwards. That finds the next match at the exact same location since the start location is inclusive. Therefore you are going to find the same index k over and over again and will never halt.
By increasing the index by 1 you jump over the currently found index k and start searching for next match from k+1 onwards.
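Note also that, as posted, findStopCodon compares against upper-case codons ("TAG", "TGA", "TAA") while dna has already been lower-cased, so it can never find a stop codon. Below is a rough sketch of how the assignment's algorithm could look once both problems are fixed. It is my reading of the assignment text, not the course's reference solution: GeneDemo and findGenes are made-up names, index is assumed to be the position of the start codon, and returning a list instead of printing is a choice made so the result is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

class GeneDemo {
    private static final String[] STOP_CODONS = {"TAG", "TGA", "TAA"};

    // Looks at the FIRST occurrence of each stop codon after the start codon at
    // index and returns the smallest position that is a multiple of three away
    // from index, or -1 if none qualifies. Expects upper-case dna.
    static int findStopIndex(String dna, int index) {
        int best = -1;
        for (String stop : STOP_CODONS) {
            int pos = dna.indexOf(stop, index + 3);
            boolean inFrame = pos != -1 && (pos - index) % 3 == 0;
            if (inFrame && (best == -1 || pos < best)) {
                best = pos;
            }
        }
        return best;
    }

    // Returns all genes in order, cut from the original (case-preserving) string.
    static List<String> findGenes(String dnaOri) {
        String dna = dnaOri.toUpperCase();
        List<String> genes = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = dna.indexOf("ATG", from);
            if (start == -1) {
                break;                      // no start codon left: done
            }
            int stop = findStopIndex(dna, start);
            if (stop == -1) {
                from = start + 3;           // no gene here: continue after this start codon
            } else {
                genes.add(dnaOri.substring(start, stop + 3));
                from = stop + 3;            // continue right after the gene
            }
        }
        return genes;
    }

    public static void main(String[] args) {
        String dna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
        System.out.println("DNA string is:\n" + dna + "\nGenes found are:");
        for (String gene : findGenes(dna)) {
            System.out.println(gene);
        }
    }
}
```

The search position from always moves forward, either past the gene or past the start codon that yielded nothing, so the loop cannot revisit the same index the way the original did.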

This program is meant to find a start DNA codon ATG and keep looking until finding a stop codon TAA or TAG or TGA, and then print out the gene from start to stop.
Since the first search always starts from 0 you can just set the start index there, then search the stop codon from the result. Here I do it with 1 of the stop codons:
public static void main(String[] args) {
    String dna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
    String sequence = dna.toLowerCase();
    int index = 0;
    int newIndex = 0;
    while (true) {
        index = sequence.indexOf("atg", index);
        if (index == -1)
            return;
        newIndex = sequence.indexOf("tag", index + 3);
        if (newIndex == -1) // Check needed only if a stop codon is not guaranteed for each start codon.
            return;
        System.out.println("From " + (index + 3) + " to " + newIndex + " Gene: " + sequence.substring(index + 3, newIndex));
        index = newIndex + 3;
    }
}
Output:
From 4 to 7 Gene: taa
From 13 to 22 Gene: aatgactga
Also, you can use a regex (Pattern and Matcher from java.util.regex) to do a lot of the work for you:
public static void main(String[] args) {
    String dna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
    Pattern p = Pattern.compile("ATG([ATGC]+?)TAG");
    Matcher m = p.matcher(dna);
    while (m.find())
        System.out.println("From " + m.start(1) + " to " + m.end(1) + " Gene: " + m.group(1));
}
Output:
From 4 to 7 Gene: TAA
From 13 to 22 Gene: AATGACTGA

How would I check if a part of a string equals another string of unknown length?

for (int j = 1; j < fileArray.size(); j++) {
    if (str.contains(fileArray.get(end + j))) {
    }
}
(Assume end is some number such as 30.)
The goal of this part is, given a window length of 30 and a fileArray size > 30, to check whether anything after index 30 matches whatever is inside the window.
Example: "i like to eat piesss aaaabbbbpiesssbbbb"
Starting from the beginning of the string, I add the first 17 characters to an ArrayList called window. Then I check the rest of the string, starting right after the window, to see if anything matches. The space doesn't match, so it is added to the output. I keep checking until I see that "piesss" matches. Then I replace the second "piesss" with wherever the first "piesss" occurs.
Right now I'm using fileArray.get(end+j) to check whether anything matches within my string (str), but this doesn't really work. Is there a way I could fix this code segment?

The replacement part of your question is still unclear, as is the reason for using an ArrayList. I've written some code that does a 5 character window search for a match after splitting the string you provided. Note how with the 30 and 17 values you gave nothing is ever matched (see commented out code). However, with tweaked values some matches can be found.
public static void main(String[] args) {
    //                       1         2         3
    //             012345678901234567890123456789012345678 <- shows the index
    String test = "i like to eat piesss aaaabbbbpiesssbbbb";
    // int first = 17;
    // int end = 30;
    int first = 20;
    int end = 37;
    String firstHalf = test.substring(0, first);
    String secondHalf = test.substring(first, end);
    int matchSize = 5;
    // <= so that the last full-size window of secondHalf is checked too
    for (int i = 0; i + matchSize <= secondHalf.length(); i++)
    {
        String window = secondHalf.substring(i, i + matchSize);
        if ( firstHalf.contains(window) )
        {
            System.out.println(window);
        }
    }
    System.out.println("Done searching.");
}
Displays:
piess
iesss
Done searching.
If this isn't what you meant PLEASE edit your question to make your needs clear.
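If what you are after is the longest run of characters at some position that already occurs inside the window (which is only my guess from your description), a sketch could look like this. WindowDemo and longestMatch are hypothetical names, and the window is assumed to be the first windowEnd characters of the string.

```java
class WindowDemo {
    // Finds the longest prefix of text.substring(pos) that also occurs inside
    // the window text[0, windowEnd). Returns {offsetInWindow, length};
    // the offset is -1 and the length 0 when not even one character matches.
    static int[] longestMatch(String text, int windowEnd, int pos) {
        String window = text.substring(0, windowEnd);
        int bestOffset = -1;
        int bestLength = 0;
        for (int len = 1; pos + len <= text.length(); len++) {
            int at = window.indexOf(text.substring(pos, pos + len));
            if (at == -1) {
                break;   // a longer prefix cannot match if this one does not
            }
            bestOffset = at;
            bestLength = len;
        }
        return new int[] {bestOffset, bestLength};
    }

    public static void main(String[] args) {
        String test = "i like to eat piesss aaaabbbbpiesssbbbb";
        int[] match = longestMatch(test, 20, 29);
        System.out.println("offset " + match[0] + ", length " + match[1]); // offset 14, length 6
    }
}
```

With the window covering the first 20 characters, the "piesss" starting at index 29 is matched in full against the one at index 14; what you then replace it with is up to your output format.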

Sorting string array in Java

There is an example in my textbook for how to sort string arrays, but I am having a hard time understanding the logic of the code. We have the following array:
String[] words = {"so", "in", "very", "every", "do"};
The method itself is as follows:
public static void sortArray(Comparable[] compTab) {
    for (int next=1; next < compTab.length; next++) {
        Comparable value = compTab[next];
        int pos;
        for (pos = next; pos > 0 && value.compareTo(compTab[pos-1]) < 0; pos--) {
            compTab[pos] = compTab[pos-1];
        }
        compTab[pos] = value;
        writeArray(next + " run through: ", compTab);
    }
}
This last writeArray call results in the following text being printed for the first run through: "1. run through: in so very every do"
OK. Like I said, I have some problems with the logic in this code. If we go through the loop for the first time, this is what I see happening:
1. We have: Comparable value = compTab[1]. This means that value = "in".
2. We start the inner loop with pos = next (which == 1). Thus, Java will only go through the inner loop once. It turns out that for this first run value.compareTo(compTab[pos-1]) is indeed less than 0. Thus we have: compTab[1] = compTab[0]. This means that the word that used to be in position [1] is now replaced with the word that used to be in position [0]. Thus, we now have the word "so" in position [1] of the array.
3. The next step in the method is: compTab[pos] = value. This is where I get confused. This tells me that since pos = 1, we here get compTab[1] = value. However, earlier in the method we defined value = "in". This tells me that position [1] in the array yet again assumes the word "in".
The way I see this, the final print out should then be:
"1. run through: so in very every do".
In other words, the way I follow the logic of the code, the final print out of the array is just the same as it was before the method was implemented! Clearly there is some part of my logic here which is not correct. For instance - I don't see how the word that used to be in position [1] is now in position [0]. If anyone can help explain this to me, I would be extremely grateful!

The issue is within the following statement:
The next step in the method is: compTab[pos] = value. This is where I
get confused. This tells me that since pos = 1, we here get
compTab[1] = value. However, earlier in the method we defined value =
"in". This tells me that position [1] in the array yet again assumes
the word "in".
Since the inner loop body ran once (see your statement 2), pos-- was also executed once, and therefore pos == 0 when the loop exits. compTab[pos] = value thus writes "in" into position [0], while "so" has already been copied to position [1].
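To see this in action, here is a small runnable sketch of the same insertion sort for String arrays. It is not the textbook's code: InsertionSortDemo, insertNext and sort are names picked for this example, the printing replaces the textbook's writeArray helper, and the index variable is named pos, since this is a reserved word in Java and cannot be used as a variable name.

```java
class InsertionSortDemo {
    // One pass: inserts a[next] into the already sorted prefix a[0..next-1].
    static void insertNext(String[] a, int next) {
        String value = a[next];
        int pos;
        for (pos = next; pos > 0 && value.compareTo(a[pos - 1]) < 0; pos--) {
            a[pos] = a[pos - 1];   // shift the larger element one slot to the right
        }
        // pos has been decremented once per shift, so it now marks the free slot
        a[pos] = value;
    }

    static void sort(String[] a) {
        for (int next = 1; next < a.length; next++) {
            insertNext(a, next);
            System.out.println(next + ". run through: " + String.join(" ", a));
        }
    }

    public static void main(String[] args) {
        String[] words = {"so", "in", "very", "every", "do"};
        sort(words);   // first line printed: 1. run through: in so very every do
    }
}
```

The key point is that the loop's update expression pos-- runs after every iteration of the body, including the last one, before the condition is checked again.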

public class A {
    static String Array[]={" Hello " , " This " , "is ", "Sorting ", "Example"};
    public static void main(String[] args)
    {
        for(int j=0; j<Array.length;j++)
        {
            for (int i=j+1 ; i<Array.length; i++)
            {
                if(Array[i].trim().compareToIgnoreCase(Array[j].trim())<0)
                {
                    String temp= Array[j];
                    Array[j]= Array[i];
                    Array[i]=temp;
                }
            }
            System.out.print(Array[j]);
        }
    }
}
