Using "this" in Java

I have been given some code and need to "fill in the blanks". For one part of the problem I need to write a method that checks that a string contains only letters of the alphabet (no commas, periods, or numbers) and then converts any upper-case letters to lower case.
I think I understand how to write this; however, this is part of the code that is given:
public Message (String m){
message = m;
lengthOfMessage = m.length();
this.makeValid();
The method I have to write is makeValid, but I'm not sure how to use this.makeValid() or how to write the code when the method doesn't take a string as an argument.
Note: I understand now that I can use message and lengthOfMessage, but I am still trying to wrap my head around it.
Would this code make sense and make proper use of this?
public void makeValid(){
for (int i = 0; i < lengthOfMessage ; i++ ) {
char mchar = message.charAt(i);
if (64 <= mchar && mchar<= 90) {
mchar = (char)((mchar + 32));
builder.append(mChar);
}
}
}
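A minimal sketch of how the pieces could fit together, assuming that "making valid" means dropping every character that is not a letter and lower-casing the rest (the assignment may instead want invalid input rejected). The field declarations and the getMessage() accessor are assumptions added for illustration; only the constructor body comes from the given code. Note that in the attempt above 'A' is code 65 rather than 64, mchar and mChar are two different names, and builder is never declared.

```java
class Message {
    private String message;
    private int lengthOfMessage;

    Message(String m) {
        message = m;
        lengthOfMessage = m.length();
        // "this" is the object being constructed; this.makeValid() and
        // makeValid() are the same call here.
        this.makeValid();
    }

    // No parameter is needed: the method reads and updates this object's own fields.
    void makeValid() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < lengthOfMessage; i++) {
            char c = message.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                builder.append((char) (c - 'A' + 'a')); // upper case -> lower case
            } else if (c >= 'a' && c <= 'z') {
                builder.append(c);
            }
            // Assumption: any other character is silently dropped.
        }
        message = builder.toString();
        lengthOfMessage = message.length();
    }

    // Hypothetical accessor, added only so the result can be inspected.
    String getMessage() {
        return message;
    }
}
```

With this sketch, new Message("Hello, World 42!").getMessage() returns "helloworld".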

Related

I fail to understand the rules of using StringBuffer, append, conversions

I was wondering if anyone could help me figure out why my code doesn't do what I expect it to do. The idea was to count consecutive identical letters in a StringBuffer and transform it into something like this: AABBC => 2A2B1C. Now my program doesn't do that, and it probably has to do with my poor usage of these newly learned concepts. Do I have to convert marker into a char for it to be printed? Or is the structure of my code inherently wrong? I'm also not sure what I can and cannot do with StringBuffers.
package package1;
public class Strings {
public static void main(String[]args){
int marker = 1;
StringBuffer s2 = new StringBuffer();
StringBuffer s = new StringBuffer("AAAA");
for(int i = 0; i<=s.length(); i++){
while(s.charAt(i) == s.charAt(i+1)){
marker++;
}
i += marker;
s2.append(marker);
s2.append(s.charAt(i));
marker = 0;
}
System.out.println(s2); // It simply prints out nothing
}
}
You have an off-by-one bug in your code.
You count how many times you've seen a char starting from 1, but string indexes start from zero, and you mix the two up later when you add marker to i: the following charAt(i) is off by one (in this case it tries to get the char at index 4, which is past the end of the string).
An easy way to fix it is to have your count (marker) start at 0 too, and add one only to the value you append to the string:
package package1;
public class Strings {
    public static void main(String[] args) {
        int marker = 0; // changing this to 0
        StringBuffer s2 = new StringBuffer();
        StringBuffer s = new StringBuffer("AAAA");
        for (int i = 0; i < s.length(); i++) { // stop before the end of the buffer
            // count how many following chars repeat s.charAt(i), without reading past the end
            while (i + marker + 1 < s.length() && s.charAt(i + marker) == s.charAt(i + marker + 1)) {
                marker++;
            }
            i += marker;
            s2.append(marker + 1); // print out the count plus one because we are counting from zero
            s2.append(s.charAt(i));
            marker = 0;
        }
        System.out.println(s2); // prints 4A
    }
}
Your while-loop never finishes.
If s.charAt(i) == s.charAt(i+1) is true, marker gets increased. But because i stays the same, the condition of your while-loop stays the same, so it runs forever.
There are some more bugs in your code (for example, i <= s.length() combined with s.charAt(i+1) will lead to an IndexOutOfBoundsException), but you will find them.
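The two points above (the inner loop must advance, and no index may reach s.length()) can be sketched with a second index that scans ahead while the outer index marks the start of each run. This is one possible shape, not the only one; the class and method names are made up for the example, and it uses a plain String and StringBuilder instead of StringBuffer.

```java
class RunLength {
    // Encodes runs of identical chars as <count><char>, e.g. AABBC -> 2A2B1C.
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char current = input.charAt(i);
            int j = i + 1;
            // j moves forward on every iteration, so this loop always terminates,
            // and the bounds check comes before charAt(j).
            while (j < input.length() && input.charAt(j) == current) {
                j++;
            }
            out.append(j - i).append(current); // run length, then the char itself
            i = j; // continue at the first char of the next run
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("AABBC")); // prints 2A2B1C
        System.out.println(encode("AAAA"));  // prints 4A
    }
}
```

Appending an int to a StringBuilder (or StringBuffer) appends its decimal text, so marker does not need to be converted to a char first.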

Java efficiently replace unless matches complex regular expression

I have over a gigabyte of text that I need to go through and surround punctuation with spaces (tokenizing). I have a long regular expression (1818 characters, though that's mostly lists) that defines when punctuation should not be separated. Being long and complicated makes it hard to use groups with it, though I wouldn't leave that out as an option since I could make most groups non-capturing (?:).
Question: How can I efficiently replace certain characters that don't match a particular regular expression?
I've looked into using lookaheads or similar, and I haven't quite figured it out, but it seems to be terribly inefficient anyway. It would likely be better than using placeholders though.
I can't seem to find a good "replace with a bunch of different regular expressions for both finding and replacing in one pass" function.
Should I do this line by line instead of operating on the whole text?
String completeRegex = "[^\\w](("+protectedPrefixes+")|(("+protectedNumericOnly+")\\s*\\p{N}))|"+protectedRegex;
Matcher protectedM = Pattern.compile(completeRegex).matcher(s);
ArrayList<String> protectedStrs = new ArrayList<String>();
//Take note of the protected matches.
while (protectedM.find()) {
protectedStrs.add(protectedM.group());
}
//Replace protected matches.
String replaceStr = "<PROTECTED>";
s = protectedM.replaceAll(replaceStr);
//Now that it's safe, separate punctuation.
s = s.replaceAll("([^\\p{L}\\p{N}\\p{Mn}_\\-<>'])"," $1 ");
// These are for apostrophes. Can these be combined with either the protecting regular expression or the one above?
s = s.replaceAll("([\\p{N}\\p{L}])'(\\p{L})", "$1 '$2");
s = s.replaceAll("([^\\p{L}])'([^\\p{L}])", "$1 ' $2");
Note the two additional replacements for apostrophes. Using placeholders protects against those replacements as well, but I'm not really concerned with apostrophes or single quotes in my protecting regex anyway, so it's not a real concern.
I'm rewriting what I considered very inefficient Perl code with my own in Java, keeping track of speed, and things were going fine until I started replacing the placeholders with the original strings. With that addition it's too slow to be reasonable (I've never seen it get even close to finishing).
//Replace placeholders with original text.
String resultStr = "";
String currentStr = "";
int currentPos = 0;
int[] protectedArray = replaceStr.codePoints().toArray();
int protectedLen = protectedArray.length;
int[] strArray = s.codePoints().toArray();
int protectedCount = 0;
for (int i=0; i<strArray.length; i++) {
int pt = strArray[i];
// System.out.println("pt: "+pt+" symbol: "+String.valueOf(Character.toChars(pt)));
if (protectedArray[currentPos]==pt) {
if (currentPos == protectedLen - 1) {
resultStr += protectedStrs.get(protectedCount);
protectedCount++;
currentPos = 0;
} else {
currentPos++;
}
} else {
if (currentPos > 0) {
resultStr += replaceStr.substring(0, currentPos);
currentPos = 0;
currentStr = "";
}
resultStr += ParseUtils.getSymbol(pt);
}
}
s = resultStr;
This code may not be the most efficient way to return the protected matches. What is a better way? Or better yet, how can I replace punctuation without having to use placeholders?
I don't know exactly how big your in-between strings are, but I suspect that you can do somewhat better than using Matcher.replaceAll, speed-wise.
You're doing 3 passes across the string, each time creating a new Matcher instance, and then creating a new String; and because you're using + to concatenate the strings, you're creating a new string which is the concatenation of the in-between string and the protected group, and then another string when you concatenate this to the current result. You don't really need all of these extra instances.
Firstly, you should accumulate the resultStr in a StringBuilder, rather than via direct string concatenation. Then you can proceed something like:
StringBuilder resultStr = new StringBuilder();
int currIndex = 0;
while (protectedM.find()) {
    protectedStrs.add(protectedM.group());
    // Tokenize the unprotected text between the previous match and this one.
    appendInBetween(resultStr, str, currIndex, protectedM.start());
    resultStr.append(protectedM.group());
    currIndex = protectedM.end();
}
// Tokenize whatever follows the last protected match.
appendInBetween(resultStr, str, currIndex, str.length());
where appendInBetween is a method implementing the equivalent to the replacements, just in a single pass:
void appendInBetween(StringBuilder resultStr, String s, int start, int end) {
// Pass the whole input string and the bounds, rather than taking a substring.
// Allocate roughly enough space up-front.
resultStr.ensureCapacity(resultStr.length() + end - start);
for (int i = start; i < end; ++i) {
char c = s.charAt(i);
// Check if c matches "([^\\p{L}\\p{N}\\p{Mn}_\\-<>'])".
if (!(Character.isLetter(c)
|| Character.isDigit(c)
|| Character.getType(c) == Character.NON_SPACING_MARK
|| "_-<>'".indexOf(c) != -1)) {
resultStr.append(' ');
resultStr.append(c);
resultStr.append(' ');
} else if (c == '\'' && i > 0 && i + 1 < s.length()) {
// We have a quote that's not at the beginning or end.
// Call these 3 characters bcd, where c is the quote.
char b = s.charAt(i - 1);
char d = s.charAt(i + 1);
if ((Character.isDigit(b) || Character.isLetter(b)) && Character.isLetter(d)) {
// If the 3 chars match "([\\p{N}\\p{L}])'(\\p{L})"
resultStr.append(' ');
resultStr.append(c);
} else if (!Character.isLetter(b) && !Character.isLetter(d)) {
// If the 3 chars match "([^\\p{L}])'([^\\p{L}])"
resultStr.append(' ');
resultStr.append(c);
resultStr.append(' ');
} else {
resultStr.append(c);
}
} else {
// Everything else, just append.
resultStr.append(c);
}
}
}
Ideone demo
Obviously, there is a maintenance cost associated with this code - it is undeniably more verbose. But the advantage of doing it explicitly like this (aside from the fact it is just a single pass) is that you can debug the code like any other - rather than it just being the black box that regexes are.
I'd be interested to know if this works any faster for you!
At first I thought that appendReplacement wasn't what I was looking for, but indeed it was. Since it's replacing the placeholders at the end that slowed things down, all I really needed was a way to dynamically replace matches:
StringBuffer replacedBuff = new StringBuffer();
Matcher replaceM = Pattern.compile(replaceStr).matcher(s);
int index = 0;
while (replaceM.find()) {
replaceM.appendReplacement(replacedBuff, "");
replacedBuff.append(protectedStrs.get(index));
index++;
}
replaceM.appendTail(replacedBuff);
s = replacedBuff.toString();
Reference: Second answer at this question.
Another option to consider:
During the first pass through the String, to find the protected Strings, take the start and end indices of each match, replace the punctuation for everything outside of the match, add the matched String, and then keep going. This takes away the need to write a String with placeholders, and requires only one pass through the entire String. It does, however, require many separate small replacement operations. (By the way, be sure to compile the patterns before the loop, as opposed to using String.replaceAll()). A similar alternative is to add the unprotected substrings together, and then replace them all at the same time. However, the protected strings would then have to be added to the replaced string at the end, so I doubt this would save time.
int currIndex = 0;
while (protectedM.find()) {
protectedStrs.add(protectedM.group());
String substr = s.substring(currIndex,protectedM.start());
substr = p1.matcher(substr).replaceAll(" $1 ");
substr = p2.matcher(substr).replaceAll("$1 '$2");
substr = p3.matcher(substr).replaceAll("$1 ' $2");
resultStr += substr+protectedM.group();
currIndex = protectedM.end();
}
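A self-contained sketch of the loop above, assuming p1, p2 and p3 are the three replacement patterns from the question compiled once as constants. It also accumulates into a StringBuilder and handles the text after the last protected match. The class and method names are invented for the example, and the protected pattern in the usage note is a toy stand-in for the real 1818-character expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProtectedTokenizer {
    // Compiled once, outside any loop. Same expressions as in the question.
    private static final Pattern P1 = Pattern.compile("([^\\p{L}\\p{N}\\p{Mn}_\\-<>'])");
    private static final Pattern P2 = Pattern.compile("([\\p{N}\\p{L}])'(\\p{L})");
    private static final Pattern P3 = Pattern.compile("([^\\p{L}])'([^\\p{L}])");

    // Applies the three punctuation replacements to one unprotected chunk.
    static String separate(String text) {
        String out = P1.matcher(text).replaceAll(" $1 ");
        out = P2.matcher(out).replaceAll("$1 '$2");
        return P3.matcher(out).replaceAll("$1 ' $2");
    }

    // Copies protected matches verbatim and tokenizes everything between them.
    static String tokenize(String s, Pattern protectedPattern) {
        Matcher m = protectedPattern.matcher(s);
        StringBuilder result = new StringBuilder(s.length() * 2);
        int currIndex = 0;
        while (m.find()) {
            result.append(separate(s.substring(currIndex, m.start())));
            result.append(m.group());
            currIndex = m.end();
        }
        // Text after the last protected match is unprotected as well.
        result.append(separate(s.substring(currIndex)));
        return result.toString();
    }
}
```

For example, with Pattern.compile("\\d\\.\\d") as the protected pattern, tokenize("a,b3.5c!", ...) returns "a , b3.5c ! ": the decimal point is left alone while the comma and the exclamation mark are separated.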
Speed comparison for 100,000 lines of text:
Original Perl script: 272.960579875 seconds
My first attempt: Too long to finish.
With appendReplacement(): 14.245160866 seconds
Replacing while finding protected: 68.691842962 seconds
Thank you, Java, for not letting me down.

Java, can't figure out how to strip symbols from a String for a Palindrome

I'm in high school and this is an assignment I have. You guys are out of my league, but I'm willing to learn and understand. I looked all over the place, but all I could find was complicated syntax I don't know yet. This is what I have: it takes a String and reverses it. I managed to get it to ignore capitals, but I cannot figure out how to make it ignore symbols. The numbers I have there are from the ANSI character table; there is a list in TextPad, which I'm using. Don't be afraid to be harsh. I'm not good at this and I only want to improve, so have at it.
import java.util.Scanner;
public class PalindromeV2
{
public static void main(String[] args)
{
//declare
Scanner sc = new Scanner(System.in);
String fwd, rev;
String result;
//input
System.out.println("What word would you like to Palindrome test?");
fwd = sc.next();
rev = reverseString(fwd);
result = stripPunctuation(fwd);
if(stripPunctuation(rev).equals(stripPunctuation(fwd)))
{
System.out.println("That is a palindrome");
}
else
System.out.println("That is not a palindrome");
}
public static String reverseString(String fwd)
{
String rev = "";
for(int i = fwd.length()-1; i >= 0; i--)
{
rev += fwd.charAt(i);
}
return rev.toUpperCase();
}
public static String stripPunctuation(String fwd)
{
String result = "";
fwd = fwd.toUpperCase();
for(int i = fwd.length()-1; i >= 0; i--)
{
if((fwd.charAt(i)>=65 && fwd.charAt(i)<=90)||(fwd.charAt(i) >= 48 && fwd.charAt(i) <= 58));
result = result + fwd.charAt(i);
}
return result;
}
}
You can use this as a checking condition:
if (Character.isLetter(fwd.charAt(i))) {
// do something
}
This will check to make sure the character is a letter, so you don't have to worry about case, numbers, or other symbols.
If you want to strip a set of characters out of your string, do something like this:
clearString = targetStringForStripping.replaceAll("[type_characters_for_stripping]", "");
This removes every character you list inside the square brackets.
There is more. If you want to keep only letters (because in palindromes nothing matters except letters; spaces are not important either), you can simply use a negated predefined character class.
To sum up, if you do
clearString = targetStringForStripping.replaceAll("[^\\w]", "");
or
clearString = targetStringForStripping.replaceAll("[^a-zA-Z]", "");
you will get a string containing only word characters (letters, digits, and underscores) in the first example, and only letters in the second one. That is the ideal input for an isPalindrome check.
if((fwd.charAt(i)>=65 && fwd.charAt(i)<=90)||(fwd.charAt(i) >= 48 && fwd.charAt(i) <= 58));
You have a semicolon at the end of that line, so the if statement has an empty body and the line after it runs for every character, whatever the condition says.
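As a sketch of what the loop might look like with the stray semicolon removed: the condition now guards the append. Two further changes are assumptions about intent rather than part of the answer above. The digit range ends at '9' (code 57; code 58 is ':'), and the loop runs forwards so the stripped text keeps its original order. The class name is made up for the example.

```java
class Strip {
    // Keeps only A-Z and 0-9 (after upper-casing), in the original order.
    static String stripPunctuation(String fwd) {
        StringBuilder result = new StringBuilder();
        String upper = fwd.toUpperCase();
        for (int i = 0; i < upper.length(); i++) {
            char c = upper.charAt(i);
            // No semicolon after the condition, so the block below belongs to the if.
            if ((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                result.append(c);
            }
        }
        return result.toString();
    }
}
```

Here stripPunctuation("Madam, I'm Adam!") returns "MADAMIMADAM".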
Since this is a high school assignment, I'll just give some pointers; you'll figure it out on your own.
Think about what you want to include or exclude, then write the code.
Keep in mind that you can compare char variables using the < and > operators, as long as you do not need to handle complex character encodings.
A String is really just a sequence of chars, which you can compare or reorder, include or exclude, one by one.
A method should do only one thing, not a lot of things. Have a look at your reverseString method: it also applies toUpperCase to your string. As your programs get more complex, this way of doing things becomes hard to follow.
Finally, if you want to include, for example, only capital letters in your palindrome check, try some code like this:
char[] toCheck = fwd.toCharArray();
for (char c : toCheck) {
if (c >= 'A' && c <= 'Z') {
result = result + c;
}
}
Depending on your requirements this might do what you want. If you want something different, have a look at the hints I gave above.
Java golf?
public static String stripPunctuation(String stripThis) {
return stripThis.replaceAll("\\W", "");
}
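Putting the suggestions from these answers together, a palindrome check might be sketched as follows. It assumes only ASCII letters and digits should count and that case is ignored; the class and method names are made up for the example.

```java
class PalindromeCheck {
    static boolean isPalindrome(String text) {
        // Drop everything except ASCII letters and digits, then ignore case.
        String cleaned = text.replaceAll("[^A-Za-z0-9]", "").toLowerCase();
        // StringBuilder.reverse() replaces the hand-written reversal loop.
        String reversed = new StringBuilder(cleaned).reverse().toString();
        return cleaned.equals(reversed);
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("A man, a plan, a canal: Panama")); // prints true
        System.out.println(isPalindrome("Hello"));                          // prints false
    }
}
```

To test whole phrases from the keyboard, the input would have to be read with sc.nextLine(), since sc.next() stops at the first whitespace.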

Java characters count in an array

Another problem I'm trying to solve (note: this is not homework, just something that popped into my head). I'm trying to improve my problem-solving skills in Java. I want to display this:
Students         ID    #
Carol McKane     920   11
James Eriol      154   10
Elainee Black    462   12
What I want to do is display, in the third column, the number of characters in the name without counting the spaces. Give me some tips on how to do this, or point me to the relevant Java APIs, because I'm not yet that familiar with Java's string APIs. Thanks.
It sounds like you just want something like:
public static int countNonSpaces(String text) {
int count = 0;
for (int i = 0; i < text.length(); i++) {
if (text.charAt(i) != ' ') {
count++;
}
}
return count;
}
You may want to modify this to use Character.isWhitespace instead of only checking for ' '. Also note that this will count each character outside the Basic Multilingual Plane as two, because such characters are stored as surrogate pairs of two chars. Whether that will be a problem for you or not depends on your use case...
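If both caveats matter, one way to address them is to iterate over code points rather than chars. This is a sketch assuming Java 8 or later (for String.codePoints()); the class and method names are made up for the example.

```java
class NonSpaceCounter {
    // Counts code points that are not whitespace, so a character stored as a
    // surrogate pair is counted once instead of twice.
    static long countNonWhitespace(String text) {
        return text.codePoints()
                   .filter(cp -> !Character.isWhitespace(cp))
                   .count();
    }

    public static void main(String[] args) {
        System.out.println(countNonWhitespace("Carol McKane")); // prints 11
    }
}
```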
Think of solving a problem and presenting the answer as two very different steps. I won't help you with the presentation in a table, but to count the number of characters in a String (without spaces) you can use this:
String name = "Carol McKane";
int numberOfCharacters = name.replaceAll("\\s", "").length();
The regular expression \\s matches every whitespace character in the name string, and replaceAll replaces each match with "", that is, removes it.
Probably the shortest and easiest way:
String[][] students = { { "Carol McKane", "James Eriol", "Elainee Black" }, { "920", "154", "462" } };
for (int i = 0 ; i < students[0].length; i++) {
System.out.println(students[0][i] + "\t" + students[1][i] + "\t" + students[0][i].replace( " ", "" ).length() );
}
replace() substitutes every occurrence of the substring " " with "", which removes it from the returned string. From this temporary string without spaces, you can get the length by calling length() on it.
The String name itself will remain unchanged.
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html
cheers
To learn more about it, you should read the API documentation for String and Character.
Here are some examples of how to do it:
// variation 1
int count1 = 0;
for (char character : text.toCharArray()) {
if (Character.isLetter(character)) {
count1++;
}
}
This uses the short form of the for statement (the enhanced for loop). Here's the long form for better understanding:
// variation 2
int count2 = 0;
for (int i = 0; i < text.length(); i++) {
char character = text.charAt(i);
if (Character.isLetter(character)) {
count2++;
}
}
BTW, removing whitespace via the replace method is not good coding style in my opinion, and it does not help much in understanding how the String class works.

CharacterSet to String and Int to Char casting

New to Java. Trying to familiarize myself with the syntax and overall language structure.
I am trying to mimic this PHP function in Java, which just converts all instances of a number to a particular character:
for($x=10;$x<=20;$x++){
$string = str_replace($x, chr($x+55), $string);
}
So in PHP, if the string was 1090412, it would be converted to something like A904C.
I am trying to do this in Java by using string.replace, but I can't for the life of me figure out how to cast the variables properly. I know that I can convert an integer to a character by casting it with (char), but I'm not sure where to go from there. This gives me a compile error expecting a character set:
string=1090412;
for (x = 10; x <= 35; x++) {
string.replace( x, (char) (x + 55));
}
Well, although not the best way to do this, here is a method that works:
public class T {
    public static void main(String[] args) {
        String s = "1090412";
        for (int i = 10; i <= 20; i++) {
            // replace returns a new String, so the result has to be assigned back to s
            s = s.replace(Integer.toString(i), "" + (char)(i + 55));
        }
        System.out.println(s);
    }
}
The java.lang.String object has a replace method on it. It is overloaded to take either two char parameters or two CharSequence objects. The "" + (char)(i + 55) is just a quick way to create a CharSequence.
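The same idea as a reusable method, as a small sketch: String.valueOf(char) is used instead of the "" + trick to build the replacement, and the result of replace is assigned back because Java strings are immutable. The class and method names are made up for the example, and the 10 to 20 range is taken from the PHP loop.

```java
class NumberToLetter {
    // Replaces the decimal text of 10..20 with the characters 'A'..'K'.
    static String convert(String input) {
        String s = input;
        for (int i = 10; i <= 20; i++) {
            // Both arguments are CharSequences; (char) (i + 55) maps 10 -> 'A', 11 -> 'B', ...
            s = s.replace(Integer.toString(i), String.valueOf((char) (i + 55)));
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(convert("1090412")); // prints A904C
    }
}
```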
