Removing duplicates from a string

Removing duplicates from a string - java

I am trying to remove duplicates from a String in Java. Here is what I have tried:
public void unique(String s)
{
// put your code here
char[]newArray = s.toCharArray();
Set<Character> uniquUsers = new HashSet<Character>();
for (int i = 0; i < newArray.length; i++) {
if (!uniquUsers.add(newArray[i]))
newArray[i] =' ';
}
System.out.println(new String(newArray));
}
The problem with this is that when I find a duplicate, I replace it with a space. I tried replacing the duplicate with '' but that cannot be done, and I can't set the duplicate's position to null. What is the best way to do this?

If you use regex, you only need one line!
public void unique(String s) {
System.out.println(s.replaceAll("(.)(?=.*\\1)", ""));
}
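This removes (by replacing with an empty string) every character that is found again later in the input, using a look-ahead with a back reference to the captured character. Note that this keeps the last occurrence of each character rather than the first, and that . does not match line terminators by default.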
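To make the difference between "keep the last occurrence" and "keep the first occurrence" concrete, here is a minimal sketch. The class name DedupDemo and the methods keepLast and keepFirst are hypothetical names chosen for illustration; keepFirst is an alternative based on LinkedHashSet, not something taken from the answer above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class DedupDemo {
    // The regex from the answer: a character is dropped when the same
    // character occurs again later, so the LAST occurrence survives.
    static String keepLast(String s) {
        return s.replaceAll("(.)(?=.*\\1)", "");
    }

    // Hypothetical alternative: LinkedHashSet remembers insertion order,
    // so the FIRST occurrence of each character survives.
    static String keepFirst(String s) {
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c); // add() ignores characters that are already present
        }
        StringBuilder out = new StringBuilder(seen.size());
        for (char c : seen) {
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepLast("banana"));  // bna
        System.out.println(keepFirst("banana")); // ban
    }
}
```

Which variant is appropriate depends on whether the order of first appearance matters for your output.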

If I understand your question correctly, perhaps you could try something like:
public static String unique(final String string){
final StringBuilder builder = new StringBuilder();
for(final char c : string.toCharArray())
if(builder.indexOf(Character.toString(c)) == -1)
builder.append(c);
return builder.toString();
}

You can use BitSet
public String removeDuplicateChar(String str){
if(str==null || str.equals(""))throw new NullPointerException();
BitSet b = new BitSet(256);
for(int i=0;i<str.length();i++){
b.set(str.charAt(i));
}
StringBuilder s = new StringBuilder();
for(int i=0;i<256;i++){
if(b.isSet(i)){
s.append((char)i);
}
}
return s.toString();
}
The code above calls isSet() on a hand-rolled BitSet (java.util.BitSet offers get() instead). You can roll your own BitSet like below:
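class BitSet {
int[] numbers; // each int stores 32 flags
BitSet(int k){
numbers = new int[(k >> 5) + 1];
}
boolean isSet(int k){
int remainder = k & 0x1F; // bit position inside the int (k % 32)
int divide = k >> 5; // index of the int holding the bit (k / 32)
// compare against 0: the masked value equals 1 only when remainder is 0
return ((numbers[divide] & (1 << remainder)) != 0);
}
void set(int k){
int remainder = k & 0x1F;
int divide = k >> 5;
numbers[divide] = numbers[divide] | (1 << remainder);
}
}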
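The BitSet answer emits the surviving characters in code-point order and only handles values below 256. As a hedged sketch of a variant that keeps the input order, the following uses java.util.BitSet directly; the class name BitSetDedup and the method removeDuplicateChars are hypothetical names for illustration.

```java
import java.util.BitSet;

class BitSetDedup {
    // Keeps the first occurrence of each character, in input order.
    static String removeDuplicateChars(String str) {
        BitSet seen = new BitSet(); // grows on demand, so any char value fits
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (!seen.get(c)) { // java.util.BitSet uses get(), not isSet()
                seen.set(c);
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicateChars("hello world")); // helo wrd
    }
}
```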

This will work for what you are attempting.
public static void unique(String s) {
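// put your code here
char[] newArray = s.toCharArray();
Set<Character> uniqueUsers = new LinkedHashSet<>(); // LinkedHashSet keeps first-occurrence order; a plain HashSet gives no ordering guarantee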
for (int i = 0; i < newArray.length; i++) {
uniqueUsers.add(newArray[i]);
}
newArray = new char[uniqueUsers.size()];
Iterator<Character> iterator = uniqueUsers.iterator();
int i = 0;
while (iterator.hasNext()) {
newArray[i] = iterator.next();
i++;
}
System.out.println(new String(newArray));
}

Without changing almost anything in your code, replace the line
System.out.println(new String(newArray));
with
System.out.println( new String(newArray).replaceAll(" ", ""));
The added replaceAll removes the blanks. Be aware that it also removes any spaces that were part of the original input.

import java.util.*;
class StrDup{
public static void main(String[] args){
String s = "abcdabacdabbbabbbaaaaaaaaaaaaaaaaaaabbbbbbbbbbdddddddddcccccc";
String dup = removeDupl(s);
}
public static String removeDupl(String s){
StringBuilder sb = new StringBuilder(s);
String ch = "";
for(int i = 0; i < sb.length(); i++){
ch = sb.substring(i,i+1);
int j = i+1;
int k = 0;
while(sb.indexOf(ch,j)!=-1){
k = sb.indexOf(ch,j);
sb.deleteCharAt(k);
j = k;
}
}
return sb.toString();
}
}
In the code above, I'm doing the following tasks.
I'm first converting the string to a StringBuilder. Strings in Java are immutable, which means they are like CDs. You can't do anything with them once they are created. The only thing they are vulnerable to is their departure, i.e. the end of their life cycle by the garbage collector, but that's a whole different thing. For example:
String s = "Tanish";
s + "is a good boy";
On its own, this changes nothing (javac even rejects a bare expression like this as "not a statement"). String s is still Tanish. To make the second line of code happen, you will have to assign the result of the operation to some variable, like this:
s = s + "is a good boy";
And, make no mistake! I said strings are immutable, and here I am reassigning s with some new string. But, it's a NEW string. The original string Tanish is still there, somewhere in the pool of strings. Think of it like this: the string that you are creating is immutable. Tanish is immutable, but s is a reference variable. It can refer to anything in the course of its life. So, Tanish and Tanish is a good boy are 2 separate strings, but s now refers to the latter, instead of the former.
StringBuilder is another way of creating strings in Java, and they are mutable. You can change them. So, if Tanish is a StringBuilder, it is vulnerable to every kind of operation (append, insert, delete, etc.).
Now we have the StringBuilder sb, which is same as the String s.
I've used a built-in StringBuilder method, indexOf(). This method finds the index of the character I'm looking for. Once I have the index, I delete the character at that index.
Remember, StringBuilder is mutable. And that's the reason I can delete the characters.
indexOf is overloaded to accept 2 arguments (sb.indexOf(substr, index)). This returns the position of the first occurrence of substr within sb, starting the search from index.
In the example string, sb.indexOf("a", 1) will give me 4. All I'm saying to Java is, "Return the index of 'a', but start looking for it from index 1". This way the very first a at index 0, which I don't want to get rid of, is left alone.
Now all I'm doing inside the for loop is extracting the character at the ith position. j represents the position from which to start looking for the extracted character. This is important, so that we don't lose the one character we need. k represents the result of indexOf("a", j), i.e. the first occurrence of a at or after index j.
That's pretty much it. As long as a duplicate of ch remains in the string (indexOf(...) returns -1 if it can't find the specified string), we obtain its position (k), delete it using deleteCharAt(k), and update j to k, i.e. the next duplicate a (if it exists) will appear at or after k, where the last one was found.
DEMONSTRATION:
In the example I took, let's say we want to get rid of duplicate cs.
So, we will start looking for the first c after the very first c, i.e. index 3.
sb.indexOf("c",3) gives us 7, where a c is lying, so k = 7. We delete it and then set j to k, so now j = 7. After the deletion, the rest of the string shifts left by 1, so position 7 now holds the d that was at 8 before. Next, k = indexOf("c",7), and the entire cycle repeats. Also, remember that indexOf("c",j) starts looking right at j, which means that if c is found at j, it returns j. That's why, after extracting a character, we start looking from the position one past that character's own position.

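import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class Duplicates {
public static void main(String[] args) {
String str = "aabbccddeeff";
// split("") yields one single-character string per character
List<String> list = new ArrayList<>(Arrays.asList(str.split("")));
// distinct() keeps the first occurrence of each element, in encounter order
List<String> newStr = list.stream().distinct().collect(Collectors.toList());
System.out.print(newStr); // prints [a, b, c, d, e, f]
}
}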

Related

Remove a Character From String in Java

I'm trying to concatenate a string with itself and remove all capital letters from the resultant string.
Here is my code:
public String removeCapitals(String A) {
StringBuilder B = new StringBuilder(A+A);
int n = B.length();
for(int i=0; i<n; i++){
if(B.charAt(i)>='A' && B.charAt(i)<='Z'){
B.deleteCharAt(i);
}
}
return B.toString();
}
I'm getting Exception saying:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 6
at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:237)
at java.lang.StringBuilder.charAt(StringBuilder.java:76)
at Solution.removeCapitals(Solution.java:10)
at Main.main(Main.java:190)
Can someone help me to understand the issue.

If at least one removal succeeds, at some point your code will attempt to access an invalid index that exceeds the length of a StringBuilder.
This happens because the variable n remains unchanged. You should bind the loop condition to the current size of the StringBuilder and decrement the index after each removal, or iterate backwards (as shown in another answer).
Also, the condition B.charAt(i)>='A' && B.charAt(i)<='Z' can be replaced with:
Character.isUpperCase(B.charAt(i))
which is more descriptive.
Here is how it might look:
public static String removeCapitals(String a) {
StringBuilder b = new StringBuilder(a + a);
for (int i = 0; i < b.length(); i++) {
if (Character.isUpperCase(b.charAt(i))) {
b.deleteCharAt(i); // the following characters shift left by one
i--; // re-check the same index next iteration; both lines can be combined as b.deleteCharAt(i--), which passes the old value of i and then decrements it
}
}
return b.toString();
}
Method deleteCharAt() runs in linear time, because it shifts all subsequent characters in the underlying array. Each upper-case letter triggers such a shift, so in the worst case the overall time complexity is quadratic, O(n ^ 2).
You can make your method more performant and much more concise without loops or a StringBuilder. This code runs in linear time, O(n); note that String.repeat() requires Java 11 or later.
public static String removeCapitals(String a) {
return a.replaceAll("\\p{Upper}", "").repeat(2);
}
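For Java versions without String.repeat(), or when a regex feels like overkill, a linear-time version can also be sketched by appending only the characters to keep instead of deleting the unwanted ones. This is an illustrative alternative, not the answer's own code; the class name RemoveCapitals is hypothetical.

```java
class RemoveCapitals {
    // Filters the input once, then concatenates the filtered result with itself.
    // Appending is O(1) amortized per character, so the whole method is linear.
    static String removeCapitals(String a) {
        StringBuilder b = new StringBuilder(a.length());
        for (int i = 0; i < a.length(); i++) {
            char c = a.charAt(i);
            if (!Character.isUpperCase(c)) {
                b.append(c); // keep everything that is not an upper-case letter
            }
        }
        String filtered = b.toString();
        return filtered + filtered;
    }

    public static void main(String[] args) {
        System.out.println(removeCapitals("ABcdEF")); // cdcd
    }
}
```

Filtering before doubling gives the same result as doubling before filtering, and it scans only half as many characters.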

When you delete a character you change the length of the StringBuilder. But n still has the original length. So you will eventually exceed the size of the StringBuilder. So start from the end and move backwards. That way, any deletions will come after (based on relative indices) the next position so the index will be within the modified StringBuilder size. In addition, deleting from the end is more efficient since there is less copying to do in the StringBuilder.
public String removeCapitals(String A) {
StringBuilder B = new StringBuilder(A+A);
int n = B.length();
for(int i=n-1; i>=0; i--){
if(B.charAt(i)>='A' && B.charAt(i)<='Z'){
B.deleteCharAt(i);
}
}
return B.toString();
}

If you just need to remove capital letters from a string, an alternative is a separate method built on replaceAll() and a regex:
private static String removeCapitals(String A){
if (!A.isEmpty()) {
String B = A + A;
// [A-Z] matches any ASCII capital letter; each match is replaced with nothing
String newStr = B.replaceAll("[A-Z]", "");
return newStr;
} else {
return null;
}
}

Shorter solution to your task.
String a = "ABcdEF";
String b = "";
for (int i = 0; i < a.length(); i++) {
if(a.toLowerCase().charAt(i) == a.charAt(i))
b+=a.charAt(i);
}
System.out.println(b);
By changing to .toUpperCase() you'll get rid of the lower case ones.

Why am I getting whitespaces in StringBuilder in Java?

I wrote a method that returns a rotated StringBuilder with a given key. However, although it is working fine, it's adding white spaces within the StringBuilder.
public static StringBuilder rotateCipher(String plain, int key) {
int keyTemp = key;
char[] rotatedChar = new char[plain.length()];
StringBuilder builder = new StringBuilder();
for (int i = 0; i < plain.length(); i++) {
rotatedChar[i] = plain.charAt(key);
key++;
if (key == plain.length()) {
builder.append(String.valueOf(rotatedChar));
builder.append(plain.substring(0, keyTemp + 1));
break;
}
}
return builder;
}
Output: nopqrstuvwxyz abcdefghijklmn

The reason for the whitespace is that the array rotatedChar does not have all its elements filled. By default, a char[] contains only (char)0 elements.
When you call this method with the parameters "abcdefghijklmnopqrstuvwxyz", 13 then only the first 13 elements of rotatedChar get filled, then you hit the if-condition and break out of the loop. That means you have the remaining 13 elements left as 0s, which is a nonprintable character, so it appears as whitespace.
It's a bit hard to suggest which part to change here because as Gabe pointed out in the comments, the solution only requires 2 calls to substring.
If you really want to use loops, maybe this can be an approach:
for (int i = 0; i < plain.length(); i++) {
rotatedChar[i] = plain.charAt(key);
key++;
if (key == plain.length()) {
//this "restarts" taking chars from the beginning of the string
key = 0;
}
}
return String.valueOf(rotatedChar);
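The two-substring approach mentioned above can be sketched as follows. The class name Rotate and the method rotate are hypothetical, and returning a String instead of a StringBuilder, as well as reducing the key modulo the length, are assumptions made for this illustration.

```java
class Rotate {
    // Moves the first `key` characters to the end of the string.
    static String rotate(String plain, int key) {
        if (plain.isEmpty()) {
            return plain; // nothing to rotate, and avoids a modulo by zero
        }
        // floorMod keeps the shift in [0, length), even for negative or large keys
        int shift = Math.floorMod(key, plain.length());
        return plain.substring(shift) + plain.substring(0, shift);
    }

    public static void main(String[] args) {
        System.out.println(rotate("abcdefghijklmnopqrstuvwxyz", 13)); // nopqrstuvwxyzabcdefghijklm
    }
}
```

Because no intermediate char[] is involved, there are no unfilled (char)0 slots that could show up as whitespace.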

letter change, what am I doing wrong?

So I'm trying the following challenge:
Using the Java language, have the function LetterChanges(str) take the str parameter being passed and modify it using the following algorithm. Replace every letter in the string with the letter following it in the alphabet (i.e. c becomes d, z becomes a). Then capitalize every vowel in this new string (a, e, i, o, u) and finally return this modified string.
This is my code:
class LetterChange {
public static String LetterChanges(String str) {
String alphabet = "AbcdEfghIjklmnOpqrstUvwxyz";
char currentChar,letter;
int i = 0;
while (i < str.length())
{
currentChar = str.charAt(i);
for(int x = 0; x < alphabet.length(); x++)
{
letter = alphabet.charAt(x);
if (currentChar == letter){
str = str.replace(currentChar,alphabet.charAt(x+1));
i++;
}
}
}
return str;
}
}
When I run it, it returns the last char in the string + 1 letter in the alphabet. For example, if I run "bcd" it returns "EEE". I don't understand why it's replacing all chars with the result of the loop for the last char.

When you go through the loop the first time you get
"bcd"--> "ccd"
Now, str.replace will turn this into "ddd" on next turn, then "EEE".
I.e., replace replaces every occurrence on each turn.
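The cascade can be reproduced in isolation. The sketch below applies, by hand, the same three replace calls that the question's loop ends up making for "bcd"; the class name ReplaceDemo and the method trace are hypothetical.

```java
class ReplaceDemo {
    // Records the string after each of the three replacements.
    static String trace() {
        String str = "bcd";
        StringBuilder log = new StringBuilder();
        str = str.replace('b', 'c'); // every 'b' becomes 'c'
        log.append(str).append(" -> ");
        str = str.replace('c', 'd'); // both c's change, including the one just produced
        log.append(str).append(" -> ");
        str = str.replace('d', 'E'); // all three d's change
        log.append(str);
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // ccd -> ddd -> EEE
    }
}
```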
It is true that debugging it in the IDE will help you in the future!
Also, what if you had a lowercase vowel in your string?
public class Alphabet {
public static String LetterChanges(String str) {
String alphabet = "AbcdEfghIjklmnOpqrstUvwxyz";
char[] string = str.toLowerCase().toCharArray();
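for (int i=0; i < string.length; i++) {
if (string[i] < 'a' || string[i] > 'z') continue; // leave spaces, digits and punctuation unchanged (indexOf would return -1 and map them to 'A')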
char d = alphabet.charAt(((alphabet.toLowerCase().indexOf(string[i]))+1) % 26);
string[i]=d;
}
return new String(string);
}
public static void main(String[] args) {
System.out.println(Alphabet.LetterChanges("aabb"));
}
}
alphabet.charAt(
((alphabet.toLowerCase().indexOf(string[i]))
+1) % 26)
1) Use toLowerCase on the input and on your string map to eliminate case problems.
2) Find the character at index+1 in the string map 'alphabet', treating it as a circular buffer by using a modulus that takes z to a.
Index 25 (z) + 1 == 26 --> 0 (A) because 26 is 0 mod 26, while index 0 (A) + 1 = 1 --> 1 mod 26. The modulus only wraps z to A and leaves the other 25 indices unchanged, which avoids branching with an "if" statement.

Does this solution help?
public static String letterChanges(String str) {
String alphabet = "AbcdEfghIjklmnOpqrstUvwxyz";
String lookup = alphabet.toLowerCase(); // match input letters regardless of case
StringBuilder stringBuilder = new StringBuilder();
for (char letter : str.toCharArray()) {
int index = lookup.indexOf(Character.toLowerCase(letter));
if (index == -1) {
stringBuilder.append(letter); // keep spaces, digits and punctuation as they are
continue;
}
// step to the next letter, wrapping from z back to A
stringBuilder.append(alphabet.charAt((index + 1) % 26));
}
return stringBuilder.toString();
}
The previous solution was hard to follow, so it's difficult to explain why it wasn't working without debugging through it to see where it goes wrong. It was easier to use a for-each loop to go through the str parameter and find matches using Java's provided methods like .indexOf and .charAt.
Also, Java uses lower camel case method naming, letterChanges instead of LetterChanges :)
Let me know if you have any questions.

You are getting that result because on every replacement you are re-setting the input string. I recommend the following:
Use two different variables: leave the input variable unmodified, and work on the output one.
Since strings are immutable (as you already know), declare the working copy as an array of char.
For the sake of optimization, base your algorithm on one single loop that iterates over the characters of the input string. For each character, decide whether it is alphabetic and, if it is, which character it should be replaced with.
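These recommendations can be put together in a minimal sketch: the input stays untouched, the work happens on a char array, and a single loop decides per character. The class name LetterChanges and the choice to also shift upper-case input letters are assumptions made for this illustration.

```java
class LetterChanges {
    static String letterChanges(String input) {
        char[] out = input.toCharArray(); // working copy; input is never modified
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c >= 'a' && c <= 'z') {
                c = (c == 'z') ? 'a' : (char) (c + 1); // next letter, z wraps to a
            } else if (c >= 'A' && c <= 'Z') {
                c = (c == 'Z') ? 'A' : (char) (c + 1); // same for upper case
            } else {
                continue; // not a letter: keep it as it is
            }
            if ("aeiou".indexOf(c) >= 0) {
                c = Character.toUpperCase(c); // capitalize vowels of the new string
            }
            out[i] = c;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(letterChanges("bcd"));     // cdE
        System.out.println(letterChanges("hello*3")); // Ifmmp*3
    }
}
```

Each position is written exactly once, so an earlier replacement can never be picked up and changed again by a later one.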

The Creation of The Encryption Key for a Keyword Cipher in My Code (Java)

I was just wondering if I could get some help with a question involving keyword ciphers.
The following code is supposed to create the encryption key for a keyword cipher (the key that states how your input (let's say "this is a secret message") became ("qabp bp k poynoq hoppkdo")). The key is, therefore, a String or char array. My teacher had us use a char array. For the case above, the key would be {KEYWORDABCFGHIJLMNPQSTUVXZ}, and the basic alphabet would correspond to this {ABCDEFGHIJKLMNOPQRSTUVWXYZ}; therefore A would become K, H would become A, and so on and so forth.
But anyway, back to the problem at hand: when I try to create the key, the first loop you see below works perfectly fine, adding the keyword as the first values of the array. After this, however, adding the rest of the array (the rest of the alphabet, skipping the letters of the keyword) in the second loop doesn't seem to work.
I don't exactly know what I am doing wrong, but I would guess it would have to do with one of the if statements, or the java keyword (continue;) that I use in the loop. Because it appears that when I print the keyword array, it comes out to {KEYWORDABCDEFGHIJKLMONPQRS} leaving out the last seven letters of the alphabet instead of the letters that already occur in the word, KEYWORD.
If you could help fix the code, or get me on the right track, that would be much appreciated. If you have a question about the question, feel free to ask in the comment section below.
Thank you so much for all your help!
public class Crytograph
{
private String in;
private String out;
//private int awayFrom;
private char [] keyword;
private String word;
public Crytograph(String input, String wordLY) // , int fromAway )
{
in = input.toLowerCase();
out = "";
//awayFrom = fromAway;
word = wordLY;
keyword = new char[26];
int counter = 97;
int counter1 = 0;
for (int x = 0; x < word.length(); x++)
{
keyword[x] = word.charAt(x);
}
for (int i = word.length(); i < 26; i++)
{
if ((char)(counter) == keyword[counter1])
{
continue;
}
else
{
keyword[i] = (char)(counter);
//System.out.println(keyword[i]);
}
counter++;
counter1++;
if (counter1 == word.length())
{
counter1 = 0;
}
}
}

You're comparing each new character to a single character at index counter1. But you really need to compare it to all the characters in word. Even if you happen to find a match, continue; will still increment i, meaning your array will be missing one or more characters at the end. Try this instead:
char c = 'a' - 1;
for (int i = word.length(); i < 26; i++) {
// loop until we find a char not contained in `word`
while (word.contains(String.valueOf(++c)));
keyword[i] = c;
}
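As a self-contained illustration of the same idea (check each candidate letter against the whole keyword, not against a single position), here is a hedged sketch based on LinkedHashSet. The class name KeywordCipherKey and the method buildKey are hypothetical, and upper-casing the keyword is an assumption taken from the KEYWORD example in the question.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class KeywordCipherKey {
    // Builds the 26-letter key: the keyword's letters first, then the rest of the alphabet.
    static char[] buildKey(String word) {
        Set<Character> letters = new LinkedHashSet<>(); // keeps insertion order, ignores repeats
        for (char c : word.toUpperCase().toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.add(c); // keyword letters, each one only once
            }
        }
        for (char c = 'A'; c <= 'Z'; c++) {
            letters.add(c); // add() skips letters the keyword already supplied
        }
        char[] key = new char[26];
        int i = 0;
        for (char c : letters) {
            key[i++] = c;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(new String(buildKey("KEYWORD"))); // KEYWORDABCFGHIJLMNPQSTUVXZ
    }
}
```

The set always ends up with exactly 26 entries, so the array is filled completely even when the keyword repeats a letter.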

Having trouble with for loops

So, I need to write a program using loops that takes a string and counts what and how many letters appear in that string. (The string "better butter" would print "b appears 2 times, e appears 3 times, ' ' (space) appears 1 time", and so on.) While I understand the idea and concept behind this assignment, actually pulling it off has been rough.
My nested for loop is where the problems are coming from, I assume. What I've written only loops once (I think) and just shows the first character and says there's only one of that character.
Edit: Preferably without using Map or arrays. I'm fine with using them if it's the only way, but they've not been covered in my class so I'm trying to avoid them. Every other similar question to this (that I've found) uses Map or array.
import java.util.Scanner;
class myString{
String s;
myString() {
s = "";
}
void setMyString(String s) {
this.s = s;
}
String getMyString() {
return s;
}
String countChar(String s){
s = s.toUpperCase();
int cnt = 0;
char c = s.charAt(cnt);
for (int i = 0; i <= s.length(); i++)
for (int j = 0; j <= s.length(); j++) //problem child here
c = s.charAt(cnt);
cnt++;
if (cnt == 1)
System.out.println(c+" appears "+cnt+" time in "+s);
else
System.out.println(c+" appears "+cnt+" times in "+s);
return "for"; //this is here to prevent complaint from the below end bracket.
}
}
public class RepeatedCharacters {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String s;
System.out.println("Enter a sentence: ");
s = in.nextLine();
myString myS = new myString();
// System.out.println(myS.getMyString());
// System.out.println(myS.countChar());
myS.countChar(s);
}
}

First you will need to scan the entire string and store the counts of each character. Later you can just print the counts.
Algorithm 1:
Use a HashMap to store the character as key and its count as value. (If you are new to Java, you might want to read up on HashMaps.)
Every time you read a character in your for loop, check if it is present in the HashMap. If yes, then increment the count by 1. Else add the new character to the map with count 1.
Printing:
Just iterate over your HashMap and print out the characters and their respective counts.
Issue with your code: You are trying to print the count as soon as you read a character. But the character might appear again later in the string. So you need to keep track of the characters you have already read.
Algorithm 2:
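static void countChar(String s) {
String processed = ""; // characters whose counts were already printed
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (processed.indexOf(c) != -1)
continue; // this character has been counted already
int cnt = 1; // the occurrence at position i itself
for (int j = i + 1; j < s.length(); j++)
if (s.charAt(j) == c)
cnt++;
processed += c; // a String stands in for the has_processed array
System.out.println("'" + c + "' appears " + cnt + (cnt == 1 ? " time" : " times"));
}
}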

Use a frequency array to get an answer in linear time.
class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{
String s = "better butter";
int freq[] = new int[26];
int i;
for (i = 0; i < s.length(); i++) {
if (s.charAt(i) >= 'a' && s.charAt(i) <= 'z')
freq[s.charAt(i)-'a']++;
}
for (i = 0; i < freq.length; i++) {
if (freq[i] == 0) continue;
System.out.println((char)(i+'a') + " appears " + freq[i] + " times" );
}
}
}
Note that this can be expanded to include uppercase letters, but for demonstrative purposes, only lowercase letters are handled in the above code.
EDIT: While the OP did ask if it was possible to do this without an array, I would recommend against such. That solution would have terrible time complexity and repeat character counts (unless an array is used to keep track of seen characters, which is counter to the aim). Thus, the above solution is the best way to do it in a reasonable amount of time (linear) with limited space consumption.

I would do the following. Create a HashMap which keeps track of which unique characters are in the string and the count for each character.
You only need to iterate over the string once, and put each character into the HashMap. If the character is already in the map, increment its integer count; otherwise add it to the map with a count of 1. Print out the map with toString() to get the result. The whole thing can be done in about 4 lines of code.
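A minimal sketch of that approach is shown below. The class name CharCount and the method countChars are hypothetical; using LinkedHashMap instead of a plain HashMap is an assumption made here so that the characters print in order of first appearance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CharCount {
    // One pass over the string: each character's count is created or incremented.
    static Map<Character, Integer> countChars(String s) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : s.toCharArray()) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countChars("better butter")); // {b=2, e=3, t=4, r=2,  =1, u=1}
    }
}
```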

The only thing being done in your nested for loop with the following
c = s.charAt(cnt)
is setting the c char to the value of the first letter (i.e. index 0 of the string) over and over and over until you've looped through the string n^2 times. In other words, you're not incrementing your cnt counter within the for loops at all.

Suggestion: try to use meaningful names for your variables; it will help you a lot in your career. Also class names should always start with a capital letter.
Although it is not the quickest solution in terms of performance, the simplest solution should be:
import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
int count = freq.containsKey(word) ? freq.get(word) : 0;
freq.put(word, count + 1);
Source: Most efficient way to increment a Map value in Java
Please next time use the search function before posting a new question.

Here is my version of countChar(String s)
boolean countChar(String s) {
if(s==null) return false;
s = s.toUpperCase();
//view[x] means that the character at position x has already been counted
boolean[] view = new boolean[s.length()];
/*
The main idea is:
for each character c = s.charAt(x) in the string s, I have a boolean value view[x] which says whether I have already examined c.
If c has not been examined yet, I search for other characters equal to c in the rest of the string.
When I find another character equal to c, I mark it as viewed and increment count with count++.
*/
for (int i = 0; i < s.length(); i++) {
if (!view[i]) {
char tmp = s.charAt(i);
int count = 0;
for (int j = i; j < s.length(); j++) {
if (!view[j] && s.charAt(j) == tmp) {
count++;
view[j] = true;
}
}
System.out.println("There were " + count + " " + tmp);
}
}
return true;
}
It should work. Please excuse my English; I'm Italian.
