I have a constructor that takes in a string as a parameter. I want to throw a runtime exception every time the string that is passed into the constructor contains anything that is not either "A", "C", "G", or "T". Currently this is what my code looks like:
public DNAStrandNovice(String strand) {
passedStrand = strand;
if (passedStrand.contains("a") || passedStrand.contains("c")
|| passedStrand.contains("g") || passedStrand.contains("t")) {
throw new RuntimeException("Illegal DNA strand");
} else if (passedStrand.contains("1") || passedStrand.contains("2")
|| passedStrand.contains("3") || passedStrand.contains("4")
|| passedStrand.contains("5") || passedStrand.contains("6")
|| passedStrand.contains("7") || passedStrand.contains("8")
|| passedStrand.contains("9") || passedStrand.contains("0")) {
throw new RuntimeException("Illegal DNA Strand");
} else if (passedStrand.contains(",") || passedStrand.contains(".")
|| passedStrand.contains("?") || passedStrand.contains("/")
|| passedStrand.contains("<") || passedStrand.contains(">")) {
throw new RuntimeException("Illegal DNA Strand");
}
}
I feel like this could be implemented in a much more concise way, but I don't know how. Right now I'm just checking for every character that is not one of the capital letters "A", "C", "G", or "T" and throwing a runtime exception, but that feels tedious and like bad programming style. Does anyone have any ideas?
Check negatively, instead of positively.
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) != 'A' && str.charAt(i) != 'C'
&& str.charAt(i) != 'G' && str.charAt(i) != 'T') {
throw new IllegalArgumentException("Bad character " + str.charAt(i));
}
}
...or, even shorter,
for (int i = 0; i < str.length(); i++) {
if ("ACGT".indexOf(str.charAt(i)) < 0) { // String.contains does not accept a char
throw new IllegalArgumentException("Bad character " + str.charAt(i));
}
}
You can achieve this using regex (regular expressions):
public DNAStrandNovice(String strand) {
if (!strand.matches("[ACGT]+")) { //or [ACGT] <-- see note below
throw new RuntimeException("Illegal DNA strand");
}
passedStrand = strand;
}
The regular expression [ACGT]+ means the string must have one or more characters, and each of them must be one of A, C, G or T. The ! in front of strand.matches reverses the boolean value returned by matches, essentially meaning if the string does not match the regex, then throw RuntimeException.
Note: If you need the string to have exactly one character, use the regex [ACGT]. If you need to allow spaces, you can use [ACGT ]+ (then trim and check for empty) or [ACGT][ACGT ]* (which ensures the first character is not a space).
You can even do much more complex and powerful regex checks such as patterns that should contain exactly four characters repeated with spaces in between (example ATCG TACG) or even where only certain characters appear in certain places, like only A and C can appear as first two characters, and only G and T can appear following it (example ACTG is correct while AGTC is wrong). I will leave all that as an exercise.
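As a rough sketch of the two exercises mentioned above, the patterns might look something like this. The class, constant, and method names are made up for illustration, and the patterns assume single spaces between groups and at least one G/T after the first two characters.

```java
import java.util.regex.Pattern;

class DnaPatterns {
    // Groups of exactly four bases separated by single spaces, e.g. "ATCG TACG".
    static final Pattern GROUPS_OF_FOUR =
            Pattern.compile("[ACGT]{4}( [ACGT]{4})*");

    // First two characters must be A or C; everything after must be G or T.
    static final Pattern AC_THEN_GT = Pattern.compile("[AC]{2}[GT]+");

    static boolean isGroupsOfFour(String s) {
        return GROUPS_OF_FOUR.matcher(s).matches();
    }

    static boolean isAcThenGt(String s) {
        return AC_THEN_GT.matcher(s).matches();
    }
}
```

Compiling the patterns once as constants avoids recompiling the regex on every call, which String.matches would otherwise do.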
Recommend against using an exception. Define an Enum and pass that.
public enum DnaCode { A, C, G, T }
...
public DNAStrandNovice(List<DnaCode> strand) {
...
}
Or make it a DnaCode[] if you prefer. You can control the input and avoid dealing with interrupted control flow. Exceptions are rather expensive to throw and are not really intended for use as a method of flow control.
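If the raw text still arrives as a String somewhere, the conversion to the enum has to happen at that boundary. The following is only a sketch of one way to do it without throwing; the wrapper class DnaCodes and the parse helper are hypothetical and not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class DnaCodes {
    enum DnaCode { A, C, G, T }

    // Hypothetical helper: returns an empty Optional on bad input instead of throwing.
    static Optional<List<DnaCode>> parse(String text) {
        List<DnaCode> codes = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            switch (text.charAt(i)) {
                case 'A': codes.add(DnaCode.A); break;
                case 'C': codes.add(DnaCode.C); break;
                case 'G': codes.add(DnaCode.G); break;
                case 'T': codes.add(DnaCode.T); break;
                default: return Optional.empty(); // anything else is not a valid base
            }
        }
        return Optional.of(codes);
    }
}
```

A caller can then check isPresent() and pass the resulting list to a constructor that accepts List<DnaCode>.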
You can make the code slightly more efficient by manually looping through the characters and checking for the letters either with ifs or a Set.
But honestly, unless performance is a problem, it's good how it is. Very obvious and easy to maintain.
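A minimal sketch of the Set-based loop, assuming an IllegalArgumentException is acceptable in place of the plain RuntimeException from the question. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DnaSetCheck {
    private static final Set<Character> VALID =
            new HashSet<>(Arrays.asList('A', 'C', 'G', 'T'));

    // Throws on the first character that is not A, C, G or T.
    // Note: an empty string passes this check.
    static void requireValid(String strand) {
        for (int i = 0; i < strand.length(); i++) {
            char c = strand.charAt(i);
            if (!VALID.contains(c)) {
                throw new IllegalArgumentException(
                        "Bad character '" + c + "' at index " + i);
            }
        }
    }
}
```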
I was going to jump in with a possibility...
public boolean validateLetter(String letter){
HashMap<String, String> dna = new HashMap<String, String>();
dna.put("A", "A");
dna.put("C", "C");
dna.put("G", "G");
dna.put("T", "T");
if(dna.get(letter) == null){
System.out.println("fail");
return false;
} else {
return true;
}
}
I would also not put that code in the constructor, rather put it in its own method and call from the constructor.
public DNAStrandNovice(String strand){
if(strand.matches("^[A-Za-z]*[0-9]+[A-Za-z]*$") || strand.matches("^[a-zA-Z]*[^a-zA-Z0-9][a-zA-Z]*$") || strand.matches("^[A-Za-z]*[acgt]+[A-Za-z]*$")){
throw new RuntimeException("Illegal DNA strand");
}
}
// I am trying to solve a problem I got from Codewars.
// The question is as follows:
/*Deoxyribonucleic acid (DNA) is a chemical found in the nucleus of cells and carries the "instructions" for the development and functioning of living organisms.
If you want to know more http://en.wikipedia.org/wiki/DNA
In DNA strings, symbols "A" and "T" are complements of each other, as "C" and "G". You have function with one side of the DNA (string, except for Haskell); you need to get the other complementary side. DNA strand is never empty or there is no DNA at all (again, except for Haskell).
*/
public class DnaStrand {
public static String makeComplement(String dna) {
StringBuilder builder = new StringBuilder();
for(int i=0;i<dna.length();i++){
char c = dna.charAt(i);
if(dna.charAt(i) == 'T'){
builder.append('A');
}
if(dna.charAt(i) == 'A'){
builder.append('T');
}
if(dna.charAt(i) == 'C'){
builder.append('G');
}
if(dna.charAt(i) == 'G'){
builder.append('T');
}
}
return builder.toString();
}
}
// This method seems to work correctly,
// but when I submit it, Codewars reports that it is incorrect for various inputs.
Your code is...
if(dna.charAt(i) == 'G'){
builder.append('T');
}
The complement of 'G' is 'C' (not 'T'). So it should be...
if(dna.charAt(i) == 'G'){
builder.append('C');
}
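For what it's worth, a switch makes each pairing visible on one line, so a slip like this is easier to spot. This is only a sketch: the class name is made up, and throwing on an unexpected character is an assumption (the original code silently skips such characters).

```java
class DnaComplement {
    static String makeComplement(String dna) {
        StringBuilder builder = new StringBuilder(dna.length());
        for (int i = 0; i < dna.length(); i++) {
            char c = dna.charAt(i);
            switch (c) {
                case 'A': builder.append('T'); break;
                case 'T': builder.append('A'); break;
                case 'C': builder.append('G'); break;
                case 'G': builder.append('C'); break;
                default:
                    // Assumption: treat anything else as an error rather than skipping it.
                    throw new IllegalArgumentException("Not a DNA base: " + c);
            }
        }
        return builder.toString();
    }
}
```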
I'm making a basic game of Tic Tac Toe, accepting player input in the form of a string (i.e. a2). The first char is made into an int called row depending on the letter, the same being said for the second char into col (for array grid[row][col]). I have a block of code that throws a custom exception in the event that the first char isn't a, b, or c, and if the second char isn't 1, 2, or 3:
if(input == null) {
throw new NullInputException();
}
else if(input.length() != 2) {
throw new InvalidInputException();
}
else if(!(input.substring(0,1).equalsIgnoreCase("a") &&
input.substring(0,1).equalsIgnoreCase("b") &&
input.substring(0,1).equalsIgnoreCase("c") ||
input.substring(1).equals("1") &&
input.substring(1).equals("2") &&
input.substring(1).equals("3"))) {
throw new InvalidInputException();
}
The problem is, this code throws an error even when the input is valid, and I don't know why. I've tried using .charAt() as opposed to .substring(), as well as messed around with my conditional statements. My question is: How do I fix this so that it accepts valid input?
Other questions that just don't help:
fill two dimensional array with parts of a string;
fill a 2d array with chars of 2 string
Sometimes it is better to write a series of simpler tests which are easier to read and verify
String row = input.substring(0,1).toUpperCase();
String col = input.substring(1);
boolean validRow = (row.equals("A") ||
row.equals("B") ||
row.equals("C"));
boolean validCol =
(col.equals("1") ||
col.equals("2") ||
col.equals("3"));
if(!(validRow && validCol)) {
throw new InvalidInputException();
}
You AND two conditions:
input.substring(0,1).equalsIgnoreCase("a") &&
input.substring(0,1).equalsIgnoreCase("b")
Both cannot be true at the same time. That is why the inner expression is always false, its negation is always true, and the exception is always thrown.
What you really want is:
String first = input.substring(0,1);
String second = input.substring(1);
if (!((first.equalsIgnoreCase("a") ||
first.equalsIgnoreCase("b") ||
first.equalsIgnoreCase("c")) &&
(second.equals("1") ||
second.equals("2") ||
second.equals("3")))) {
throw new InvalidInputException();
}
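Since the input is always two characters from small fixed sets, a single regular expression is another option. The sketch below assumes zero-based row and col indices and uses IllegalArgumentException as a stand-in for the custom exceptions in the question; the class and method names are hypothetical.

```java
class MoveParser {
    // Returns {row, col}, both zero-based, for inputs such as "a2" or "C3".
    static int[] parse(String input) {
        if (input == null || !input.matches("[a-cA-C][1-3]")) {
            // Stand-in for NullInputException / InvalidInputException.
            throw new IllegalArgumentException("Invalid move: " + input);
        }
        int row = Character.toLowerCase(input.charAt(0)) - 'a';
        int col = input.charAt(1) - '1';
        return new int[] { row, col };
    }
}
```

Because matches() requires the whole string to match, the length check is covered by the pattern itself.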
Small edit for Neil...
In this answer I recommended using
s.replaceFirst("\\.0*$|(\\.\\d*?)0+$", "$1");
but two people complained that the result contained the string "null", e.g., 23.null. This could be explained by $1 (i.e., group(1)) being null, which could be transformed via String.valueOf to the string "null". However, I always get the empty string. My testcase covers it and
assertEquals("23", removeTrailingZeros("23.00"));
passes. Is the exact behavior undefined?
The documentation of Matcher class from the reference implementation doesn't specify the behavior of appendReplacement method when a capturing group which doesn't capture anything (null) is specified in the replacement string. While the behavior of group method is clear, nothing is mentioned in appendReplacement method.
Below are 3 exhibits of difference in implementation for the case above:
The reference implementation does not append anything (or we can say append an empty string) for the case above.
GNU Classpath and Android's implementation appends null for the case above.
Some code has been omitted for the sake of brevity, and is indicated by ....
1) Sun/Oracle JDK, OpenJDK (Reference implementation)
For the reference implementation (Sun/Oracle JDK and OpenJDK), the code for appendReplacement doesn't seem to have changed from Java 6, and it will not append anything when a capturing group doesn't capture anything:
} else if (nextChar == '$') {
// Skip past $
cursor++;
// The first number is always a group
int refNum = (int)replacement.charAt(cursor) - '0';
if ((refNum < 0)||(refNum > 9))
throw new IllegalArgumentException(
"Illegal group reference");
cursor++;
// Capture the largest legal group string
...
// Append group
if (start(refNum) != -1 && end(refNum) != -1)
result.append(text, start(refNum), end(refNum));
} else {
Reference
jdk6/98e143b44620
jdk8/687fd7c7986d
2) GNU Classpath
GNU Classpath, which is a complete reimplementation of the Java Class Library, has a different implementation of appendReplacement in the case above. In Classpath, the classes in the java.util.regex package are just wrappers for classes in gnu.java.util.regex.
Matcher.appendReplacement calls RE.getReplacement to process replacement for the matched portion:
public Matcher appendReplacement (StringBuffer sb, String replacement)
throws IllegalStateException
{
assertMatchOp();
sb.append(input.subSequence(appendPosition,
match.getStartIndex()).toString());
sb.append(RE.getReplacement(replacement, match,
RE.REG_REPLACE_USE_BACKSLASHESCAPE));
appendPosition = match.getEndIndex();
return this;
}
RE.getReplacement calls REMatch.substituteInto to get the content of the capturing group and appends its result directly:
case '$':
int i1 = i + 1;
while (i1 < replace.length () &&
Character.isDigit (replace.charAt (i1)))
i1++;
sb.append (m.substituteInto (replace.substring (i, i1)));
i = i1 - 1;
break;
REMatch.substituteInto appends the result of REMatch.toString(int) directly without checking whether the capturing group has captured anything:
if ((input.charAt (pos) == '$')
&& (Character.isDigit (input.charAt (pos + 1))))
{
// Omitted code parses the group number into val
...
if (val < start.length)
{
output.append (toString (val));
}
}
And REMatch.toString(int) returns null when the capturing group doesn't capture (irrelevant code has been omitted).
public String toString (int sub)
{
if ((sub >= start.length) || sub < 0)
throw new IndexOutOfBoundsException ("No group " + sub);
if (start[sub] == -1)
return null;
...
}
So in GNU Classpath's case, null will be appended to the string when a capturing group which fails to capture anything is specified in the replacement string.
3) Android Open Source Project - Java Core Libraries
In Android, Matcher.appendReplacement calls private method appendEvaluated, which in turn directly appends the result of group(int) to the replacement string.
public Matcher appendReplacement(StringBuffer buffer, String replacement) {
buffer.append(input.substring(appendPos, start()));
appendEvaluated(buffer, replacement);
appendPos = end();
return this;
}
private void appendEvaluated(StringBuffer buffer, String s) {
boolean escape = false;
boolean dollar = false;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c == '\\' && !escape) {
escape = true;
} else if (c == '$' && !escape) {
dollar = true;
} else if (c >= '0' && c <= '9' && dollar) {
buffer.append(group(c - '0'));
dollar = false;
} else {
buffer.append(c);
dollar = false;
escape = false;
}
}
// This seemingly stupid piece of code reproduces a JDK bug.
if (escape) {
throw new ArrayIndexOutOfBoundsException(s.length());
}
}
Since Matcher.group(int) returns null for capturing group which fails to capture, Matcher.appendReplacement appends null when the capturing group is referred to in the replacement string.
It is most likely that the 2 people complaining to you are running their code on Android.
Having had a careful look at the Javadoc, I conclude that:
$1 is equivalent to calling group(1), which is specified to return null when the group didn't get captured.
The handling of nulls in the replacement expression is unspecified.
The wording of the relevant parts of the Javadoc is on the whole surprisingly vague (emphasis mine):
Dollar signs may be treated as references to captured subsequences as described above...
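Since group(1) returning null is the specified part and the replacement handling is not, one way to sidestep the difference is to build the result by hand. The following is a sketch under that assumption; the class name and the groupOneIsNull helper are invented for illustration, while the pattern is the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NullGroupDemo {
    static final Pattern TRAILING = Pattern.compile("\\.0*$|(\\.\\d*?)0+$");

    // True when the pattern matches but group 1 took no part in the match.
    static boolean groupOneIsNull(String s) {
        Matcher m = TRAILING.matcher(s);
        return m.find() && m.group(1) == null;
    }

    // Substitutes manually, so an unmatched group becomes "" on any implementation.
    static String removeTrailingZeros(String s) {
        Matcher m = TRAILING.matcher(s);
        if (!m.find()) {
            return s;
        }
        String kept = m.group(1) == null ? "" : m.group(1);
        return s.substring(0, m.start()) + kept + s.substring(m.end());
    }
}
```

For "23.00" the first alternative matches, so group 1 is null; for "23.4500" the second alternative matches and group 1 holds ".45".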
You have two alternatives | or-ed together, but only the second is between ( ) hence if the first alternative is matched, group 1 is null.
In general place the parentheses around all alternatives
In your case you want to replace
"xxx.00000" by "xxx" or else
"xxx.yyy00" by "xxx.yyy"
Better do that in two steps, as that is more readable:
"xxx.y*00" by "xxx.y*" then
"xxx." by "xxx"
This does a bit extra, changing an initial "1." to "1".
So:
.replaceFirst("(\\.\\d*?)0+$", "$1").replaceFirst("\\.$", "");
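Wrapped in a method, the two-step version might look like the sketch below. The method name removeTrailingZeros is borrowed from the test case in the question; the class name is made up. In the first pattern, group 1 always takes part in the match, so $1 can never be null here.

```java
class TrailingZeros {
    static String removeTrailingZeros(String s) {
        return s.replaceFirst("(\\.\\d*?)0+$", "$1") // "xxx.yyy00" -> "xxx.yyy"
                .replaceFirst("\\.$", "");           // "xxx."      -> "xxx"
    }
}
```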
I'm trying to take a string that represents a full algebraic expression, such as x = 15 * 6 / 3, and tokenize it into its individual components. So the first would be x, then =, then 15, then *, 6, / and finally 3.
The problem I am having is actually parsing through the string and looking at the individual characters. I can't think of a way to do this without a massive amount of if statements. Surely there has to be a better way than specifically defining each individual case and testing for it.
For each type of token, you'll want to figure out how to identify:
when you're starting to read a particular token
if you're continuing to read the same token, or if you've started a different one
Let's take your example: x=15*6/3. Let's assume that you cannot rely on there being spaces between tokens. (If you could, it would be trivial: a new token starts whenever you reach a space.)
You can break down the character types into letters, digits, and symbols. Let's call the token types Variable, Operator, and Number.
A letter indicates a Variable token has started. It continues until you read a non-letter.
A symbol indicates the start of an Operator token. I only see single symbols, but you can have groups of symbols correspond to different Operator tokens.
A digit indicates the start of a Number token. (Let's assume integers for now.) The Number token continues until you read a non-digit.
Basically, that's how a simple symbolic parser works. Now, if you add in negative numbers (where the '-' symbol can have multiple meanings), or parentheses, or function names (like sin(x)) then things get more complicated, but it amounts to the same set of rules, now just with more choices.
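The rules above can be sketched roughly as follows. This assumes integers only, single-character operators, and no negative numbers or function names; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleTokenizer {
    static List<String> tokenize(String expr) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < expr.length()) {
            char c = expr.charAt(i);
            if (Character.isWhitespace(c)) { // spaces only separate tokens
                i++;
                continue;
            }
            int start = i;
            if (Character.isLetter(c)) {
                // Variable: continues until a non-letter is read
                while (i < expr.length() && Character.isLetter(expr.charAt(i))) i++;
            } else if (Character.isDigit(c)) {
                // Number: continues until a non-digit is read
                while (i < expr.length() && Character.isDigit(expr.charAt(i))) i++;
            } else {
                // Operator: assumed here to be a single symbol
                i++;
            }
            tokens.add(expr.substring(start, i));
        }
        return tokens;
    }
}
```

Only the first character of a token is inspected to decide its type; after that, each branch just consumes characters of the same kind.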
1. Create a regular expression for each possible element: integer, variable, operator, parentheses.
2. Combine them using the | regular expression operator into one big regular expression with capture groups to identify which one matched.
3. In a loop, match the head of the remaining string and break off the matched part as a token. The type of the token depends on which sub-expression matched, as described in 2.
or
Use a lexer library, such as the one in ANTLR or JavaCC.
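A possible sketch of the regex approach, using named capture groups (Java 7 and later) to tell which alternative matched. The group names, the "TYPE:text" output format, and the operator set are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokenizer {
    // One alternative per token type, each in its own named group.
    private static final Pattern TOKEN = Pattern.compile(
            "\\s*(?:(?<number>\\d+)|(?<variable>[A-Za-z]+)"
            + "|(?<operator>[-+*/=])|(?<paren>[()]))");

    static List<String> tokenize(String expr) {
        List<String> tokens = new ArrayList<>();
        String rest = expr.trim();
        Matcher m = TOKEN.matcher(rest);
        int pos = 0;
        while (pos < rest.length()) {
            m.region(pos, rest.length());
            if (!m.lookingAt()) { // head of the remaining input is not a token
                throw new IllegalArgumentException(
                        "Unexpected input near: " + rest.substring(pos));
            }
            if (m.group("number") != null) {
                tokens.add("NUMBER:" + m.group("number"));
            } else if (m.group("variable") != null) {
                tokens.add("VARIABLE:" + m.group("variable"));
            } else if (m.group("operator") != null) {
                tokens.add("OPERATOR:" + m.group("operator"));
            } else {
                tokens.add("PAREN:" + m.group("paren"));
            }
            pos = m.end(); // break off the matched part
        }
        return tokens;
    }
}
```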
This is from my early expression evaluator that takes an infix expression like yours and turns it into postfix to evaluate. There are methods that help the parser but I think they're pretty self documenting. Mine uses symbol tables to check tokens against. It also allows for user defined symbols and nested assignments and other things you may not need/want. But it shows how I handled your issue without using niceties like regex which would simplify this task tremendously. In addition everything shown is of my own implementation - stack and queue as well - everything. So if anything looks abnormal (unlike Java imps) that's because it is.
This section of code is important not to answer your immediate question but to show the necessary work to determine the type of token you're dealing with. In my case I had three different types of operators and two different types of operands. Based on either the known rules or rules I chose to enforce (when appropriate) it was easy to know when something was a number (starts with a number), variable/user symbol/math function (starts with a letter), or math operator (is: /,*,-,+) . Note that it only takes seeing the first char to know the correct extraction rules. From your example, if all your cases are as simple, you'd only have to handle two types, operator or operand. Nonetheless the same logic will apply.
protected Queue<Token> inToPostParse(String exp) {
// local vars
inputExp = exp;
offset = 0;
strLength = exp.length();
String tempHolder = "";
char c;
// the program runs in a loop so make sure you're dealing
// with an empty queue
q1.reset();
for (int i = offset; tempHolder != null && i < strLength; ++i) {
c = exp.charAt(i);
// Spaces are useless so skip them
if (c == ' ') { continue; }
// If c is a letter
if ((c >= 'A' && c <= 'Z')
|| (c >= 'a' && c <= 'z')) {
// Here we know it must be a user symbol possibly undefined
// at this point or an function like SIN, ABS, etc
// We extract, based on obvious rules, the op
tempHolder = extractPhrase(i); // Used to be append sequence
if (ut.isTrigOp(tempHolder) || ut.isAdditionalOp(tempHolder)) {
s1.push(new Operator(tempHolder, "Function"));
} else {
// If not some math function it is a user defined symbol
q1.insert(new Token(tempHolder, "User"));
}
i += tempHolder.length() - 1;
tempHolder = "";
// if c begins with a number
} else if (c >= '0' && c <= '9') {
try {
// Here we know that it must be a number
// so we extract until we reach a non number
tempHolder = extractNumber(i);
q1.insert(new Token(tempHolder, "Number"));
i += tempHolder.length() - 1;
tempHolder = "";
}
catch (NumberFormatException nfe) {
return null;
}
// if c is in the math symbol table
} else if (ut.isMathOp(String.valueOf(c))) {
String C = String.valueOf(c);
try {
// This is where the magic happens
// Here we determine the "intersection" of the
// current C and the top of the stack
// Based on the intersection we take action
// i.e., in math do you want to * or + first?
// Depending on the state you may have to move
// some tokens to the queue before pushing onto the stack
takeParseAction(C, ut.findIntersection
(C, s1.showTop().getSymbol()));
}
catch (NullPointerException npe) {
s1(C);
}
// it must be an invalid expression
} else {
return null;
}
}
u2();
s1.reset();
return q1;
}
Basically I have a stack (s1) and a queue (q1). All variables or numbers go into the queue. Any operators trig, math, parens, etc.. go on the stack. If the current token is to be put on the stack you have to check the state (top) to determine what parsing action to take (i.e., what to do based on math precedence). Sorry if this seems like useless information. I imagine if you're parsing a math expression it's because at some point you plan to evaluate it. IMHO, postfix is the easiest so I, regardless of input format, change it to post and evaluate with one method. If your O is different - do what you like.
Edit: Implementations
The extract phrase and number methods, which you may be most interested in, are as follows:
protected String extractPhrase(int it) {
String phrase = new String();
char c;
for ( ; it < inputExp.length(); ++it) {
c = inputExp.charAt(it);
if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|| (c >= '0' && c <= '9')) {
phrase += String.valueOf(c);
} else {
break;
}
}
return phrase;
}
protected String extractNumber(int it) throws NumberFormatException {
String number = new String();
int decimals = 0;
char c;
for ( ; it < strLength; ++it) {
c = inputExp.charAt(it);
if (c >= '0' && c <= '9') {
number += String.valueOf(c);
} else if (c == '.') {
++decimals;
if (decimals < 2) {
number += ".";
} else {
throw new NumberFormatException();
}
} else {
break;
}
}
return number;
}
Remember - By the time they enter these methods I've already been able to deduce what type it is. This allows you to avoid the seemingly endless while-if-else chain.
Are components always separated by space character like in your question? if so, use algebricExpression.split(" ") to get a String[] of components.
If no such restrictions can be assumed, a possible solution is to iterate over the input and switch on the Character.getType() of the character at the current index, something like this:
ArrayList<String> getExpressionComponents(String exp) {
ArrayList<String> components = new ArrayList<String>();
String current = "";
int currentSequenceType = Character.UNASSIGNED;
for (int i = 0 ; i < exp.length() ; i++) {
if (currentSequenceType != Character.getType(exp.charAt(i))) {
if (current.length() > 0) components.add(current);
current = "";
currentSequenceType = Character.getType(exp.charAt(i));
}
switch (Character.getType(exp.charAt(i))) {
case Character.DECIMAL_DIGIT_NUMBER:
case Character.MATH_SYMBOL:
case Character.START_PUNCTUATION:
case Character.END_PUNCTUATION:
case Character.LOWERCASE_LETTER:
case Character.UPPERCASE_LETTER:
// add other required types
current = current.concat(new String(new char[] {exp.charAt(i)}));
currentSequenceType = Character.getType(exp.charAt(i));
break;
default:
current = "";
currentSequenceType = Character.UNASSIGNED;
break;
}
}
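// Flush the last component, which the loop never adds on its own
if (current.length() > 0) components.add(current);
return components;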
}
You can easily change the cases to meet with other requirements, such as split non-digit chars to separate components etc.
I keep getting an error when removing a character from within a string. I have tried everything that I could find on this site and nothing has worked. This is NOT a help post. Rather, maybe an answer that explains why this shows up and how to fix it will help in case someone else encounters this issue. Without further ado, here is my code:
public JTextField Clean()
{
String Cleaner = TopField.getText();
Cleaner=Cleaner.toLowerCase();
int Length = Cleaner.length();
StringBuilder Combiner = new StringBuilder(Cleaner);
for (int x=0;x+1<Length;x++)
{
char c = Cleaner.charAt(x);
char c1 = Cleaner.charAt(x+1);
if(c==' ' && c1==' ')
{
Combiner.deleteCharAt(x);
Cleaner=Combiner.toString();
}
if(c!='a' && c=='b' && c!='c' && c!='d' && c!='f' && c!='g' && c!='h' && c!='i' && c!='j' && c!='k' && c!='l' && c!='m' && c!='n' && c!='o' && c!='p' && c!='q' && c!='r' && c!='s' && c!='t' && c!='u' && c!='v' && c!='w' && c!='x' && c!='y' && c!='z' && c!=' ')
{Combiner.deleteCharAt(x);
Cleaner=Combiner.toString();}
}
TopField.setText(Cleaner);
return TopField;
}
I receive an error stating that my value is out of bounds by the length of the string that I input. Please note that this is a method inside a class that I created, and it is meant to remove any character that is not a letter or a space.
Thanks in advance
As you remove characters, Cleaner becomes shorter, so you're likely to reach a point where x is too large.
I would suggest a different approach using regular expressions:
String cleaned = TopField.getText().toLowerCase().replaceAll("[^a-z ]", "");
There are a number of things that pop out at me.
You're basing your loop on a fixed value (Length), but the actual length of the String can decrease...
You are potentially removing 2 characters per loop (there are two deleteCharAt calls)
The loop doesn't take into account the shrinking size of the String. For example, when x == 1 you remove the character at x, then increment x to 2, effectively skipping a character (the character that was at position 2 is now at position 1)
Your if statement is unnecessarily long. In fact, depending on your needs, you could use Character.isDigit or Character.isLetter and Character.isWhitespace
String Cleaner = TopField.getText();
Cleaner = Cleaner.toLowerCase();
StringBuilder Combiner = new StringBuilder(Cleaner);
int x =0;
while (x < Combiner.length()) {
char c = Combiner.charAt(x);
if (!(c >= 'a' && c <= 'z' || c == ' ')) { // delete anything that is not a letter or a space
Combiner.deleteCharAt(x);
} else {
x++;
}
}
From the looks of your code, you appear to want to filter a JTextField so that it only accepts certain characters. It would be much better to use something like a JSpinner, JFormattedTextField or DocumentFilter and ensure the correctness of the data as it's entered...IMHO
I used the isDigit() function and found the output to be incorrect. Here is the code I tested, which shows the problem. Can anyone explain?
public static void main(String[] args) {
// TODO Auto-generated method stub
String temp="you got 211111 out of 211111?";
StringBuilder cleaner=new StringBuilder(temp);
for(int i=0;i<cleaner.length();i++)
{
char c=cleaner.charAt(i);
if(Character.isDigit(c))
{
cleaner.deleteCharAt(i);
}
}
System.out.println(cleaner);
}
I am getting this output: you got 111 out of 111?
It is not removing some of the digits.
Also, I found that StringBuilder has no replaceAll() method; in Java, replaceAll() is defined on String.
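The missing digits come from the same skipping effect described earlier in the thread: after deleteCharAt(i) the next character slides into position i, and the i++ then steps over it. One way around it, sketched here with made-up class and method names, is to walk the StringBuilder backwards so a deletion never shifts characters that have not been visited yet.

```java
class DigitStripper {
    static String stripDigits(String text) {
        StringBuilder cleaner = new StringBuilder(text);
        // Iterate from the end: deleting index i only moves characters after i,
        // all of which have already been checked.
        for (int i = cleaner.length() - 1; i >= 0; i--) {
            if (Character.isDigit(cleaner.charAt(i))) {
                cleaner.deleteCharAt(i);
            }
        }
        return cleaner.toString();
    }
}
```

Keeping the forward loop and simply not incrementing i after a deletion would work as well.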