Compressing a file - java

I want to write a program that takes a text file and makes it smaller. So far it collapses all runs of repeated characters, and now I want to replace "ou" with "1".
I've tried an if statement, but it doesn't work properly.
My method is below:
public String compressIt (String input)
{
int length = input.length(); // length of input
int ix = 0; // actual index in input
char c; // actual read character
int cCounter; // occurrence counter of actual character
String ou = "ou";
StringBuilder output = // the output
new StringBuilder(length);
// loop over every character in input
while(ix < length)
{
// read character at actual index then increments the index
c = input.charAt(ix++);
// we count one occurrence of this character here
cCounter = 1;
// while not reached end of line and next character
// is the same as previously read
while(ix < length && input.charAt(ix) == c)
{
// inc index means skip this character
ix++;
// and inc character occurence counter
cCounter++;
}
if (input.charAt(ix) == 'o' && input.charAt(++ix) == 'u' && ix < length - 1)
{
output.append("1");
}
// if more than one character occurence is counted
if(cCounter > 1)
{
// print the character count
output.append(cCounter);
}
// print the actual character
output.append(c);
}
// return the full compressed output
return output.toString();
}
These are the lines of code I'm referring to:
if (input.charAt(ix) == 'o' && input.charAt(ix + 1) == 'u')
{
output.append("1");
}
What I want to do: replace characters. I have a text file that contains "Alice in Wonderland". When my loop over the characters sees an 'o' followed by a 'u' (as in "You"), I want to replace the pair so the result looks like "Y1".
Regards

Most likely you are looping from ix = 0 up to the length of the string.
First, my guess is that you are looping up to and including string.length(), which doesn't work: charAt is 0-indexed, so
"abc" has a charAt at 0, 1 and 2 but not at 3, which gives the error you describe.
Second, the line you showed uses input.charAt(ix++), which gets the char at position ix (the old value) and only afterwards updates ix to ix + 1. If you want ix to be updated before charAt reads it, you'd have to write input.charAt(++ix).
Third, there is a String.replace method: input.replace("abc", "def") works well for simple replacements. For more complicated replacements, consider using a regex.
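As a minimal sketch of the String.replace suggestion (the class and method names here are illustrative and not taken from the question), the whole "ou" substitution can be done without any index arithmetic:

```java
class ReplaceSketch {
    // Hypothetical helper name; String.replace substitutes every literal occurrence.
    static String replaceOu(String input) {
        return input.replace("ou", "1");
    }

    public static void main(String[] args) {
        System.out.println(replaceOu("You found out")); // prints "Y1 f1nd 1t"
    }
}
```

This only covers the "ou" step; it would still have to be combined with the run-length counting from the question.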

This has nothing to do with the charAt method itself. You need to change your loop condition so that it only runs up to length - 1. It fails in the last iteration because charAt(i + 1) goes past the end of the string.
for(int i=0; i<inputString.length() - 1; i++)
{
char temp = inputString.charAt(i);
char blah = inputString.charAt(i+1);
System.out.println("temp: "+ temp);
System.out.println("blah: "+ blah);
}
This works for me!
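If you want to keep an explicit index loop, one possible sketch is shown here. It assumes the bounds check is evaluated before charAt(ix + 1) is called, and the class and method names are made up for illustration:

```java
class OuLoopSketch {
    // Hypothetical helper: replaces every "ou" pair with "1" using an index loop.
    static String replaceOu(String input) {
        StringBuilder output = new StringBuilder(input.length());
        int ix = 0;
        while (ix < input.length()) {
            // Test the bound first so charAt(ix + 1) is never out of range.
            if (ix + 1 < input.length()
                    && input.charAt(ix) == 'o'
                    && input.charAt(ix + 1) == 'u') {
                output.append('1');
                ix += 2; // skip both characters of the pair
            } else {
                output.append(input.charAt(ix));
                ix++;
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceOu("You could shout")); // prints "Y1 c1ld sh1t"
    }
}
```

Because && short-circuits, the order of the three conditions matters: putting the length test last, as in the question, lets charAt throw before the test is ever reached.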

Related

subLength() Codecademy: occurrences of a character

Write a function subLength() that takes 2 parameters, a string and a single character. The function should search the string for the two occurrences of the character and return the length between them, including the 2 characters. If there are fewer than 2 or more than 2 occurrences of the character, the function should return 0.
// Write function below
const subLength = (str, char) => {
let charCount = 0;
let len = -1;
for (let i=0; i<str.length; i++) {
if (str[i] == char) {
charCount++;
if (charCount > 2) {
return 0;
}
if (len == -1) {
len = i;
} else {
len = i - len + 1
}
}
}
if (charCount < 2) {
return 0;
}
return len;
};
Can someone please explain the len = -1 part and how the length between the characters is computed?
It is used as an initial value. The initial value of 'len' needs to be outside the possible range of positions; otherwise you cannot tell whether the first occurrence of the char has already been recorded.
E.g. if 'len' were initialized with 0, the code could not distinguish "nothing found yet" from "first occurrence at position 0".
In the second line of your function you initialize the len variable to -1. The intent is to use a number that the function could never legitimately produce (you defined 0 as meaning too few or too many occurrences, and every other positive number could be a valid length between the two occurrences).
Once you've found the first occurrence of the character, you need to remember its position so you know where to start counting. This is what the check for the initial value of -1 is for.
When the second occurrence is found, len is something other than -1, so the else branch is executed.
if (len == -1) {
len = i; // found the first occurrence, save its position
} else {
len = i - len + 1; // not the first occurrence, calculate the length
}
The variable len is initialized with -1 because this value can never occur as an index or length while iterating over the string str. Later, when the check finds that len is still -1, len is set to the index at which the char first occurs in str; otherwise this is the second occurrence of the same char (the else branch).
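The same sentinel idea can be sketched in Java. This is an illustrative port, not the Codecademy solution; the names are assumptions, and it uses a separate variable for the first position so that one variable does not serve as both index and length:

```java
class SubLengthSketch {
    // Hypothetical Java port of subLength; -1 marks "first occurrence not seen yet".
    static int subLength(String str, char target) {
        int count = 0;
        int first = -1;  // sentinel: no valid index is negative
        int length = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) != target) {
                continue;
            }
            count++;
            if (count > 2) {
                return 0;                 // more than two occurrences
            }
            if (first == -1) {
                first = i;                // remember where counting starts
            } else {
                length = i - first + 1;   // inclusive distance between the two
            }
        }
        return count == 2 ? length : 0;   // fewer than two occurrences gives 0
    }

    public static void main(String[] args) {
        System.out.println(subLength("Saturday", 'a')); // prints 6
        System.out.println(subLength("summer", 'm'));   // prints 2
        System.out.println(subLength("digitize", 'i')); // prints 0
    }
}
```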

Problem replacing char in char array with a digit

Given a string, I have to replace all vowels with their respective position in the array. However, my code returns some strange symbols instead of numbers. Where's the problem?
String s = "this is my string";
char p = 1;
char[] formatted = s.toCharArray();
for(int i = 0; i < formatted.length; i++) {
if(formatted[i] == 'a' ||formatted[i] == 'e' ||formatted[i] == 'i'
||formatted[i] == 'o' ||formatted[i] == 'u') {
formatted[i] = p;
}
p++;
}
s = String.valueOf(formatted);
System.out.println(s);
P.S: Numbers are bigger than 10
t h i s   i s   m y     s  t  r  i  n  g
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
The position of i in string is 14 but 14 is not a character; it's a numeric string. It means that you need to deal with strings instead of characters. Split s using "" as the delimiter, process the resulting array and finally join the array back into a string using "" as the delimiter.
class Main {
public static void main(String[] args) {
String s = "this is my string";
String[] formatted = s.split("");
for (int i = 0; i < formatted.length; i++) {
if (formatted[i].matches("(?i)[aeiou]")) {
formatted[i] = String.valueOf(i);
}
}
s = String.join("", formatted);
System.out.println(s);
}
}
Output:
th2s 5s my str14ng
The regex (?i)[aeiou] matches any one of the vowels, where (?i) makes the match case-insensitive.
The character '1' has a different value from the number 1.
You can change
char p = 1;
to
char p = '1';
and I think that will give you what you want, as long as you're not trying to insert more than 9 numbers in your string. Otherwise you'll need to cope with inserting extra digits, which you cannot do in a char array, because it has a fixed length.
The root of the problem is already given in the comments:
in Java the type determines both the memory size and the representation of a value.
int x = 1;
and
char y = '1';
do not hold the same value, because characters are stored as numeric codes: the value you have to assign to y to get the digit 1 printed is hex 0x31 or decimal 49.
Take a look at an ASCII table.
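To make both points concrete, here is a small sketch (class and method names are illustrative) that builds the result in a StringBuilder, so multi-digit positions fit, and prints the numeric code of the character '1':

```java
class VowelIndexSketch {
    // Hypothetical helper: replaces each vowel with its index in the string.
    static String replaceVowels(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if ("aeiouAEIOU".indexOf(c) >= 0) {
                out.append(i);   // append(int) writes the decimal digits, e.g. "14"
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceVowels("this is my string")); // prints "th2s 5s my str14ng"
        System.out.println((int) '1');                          // prints 49
    }
}
```

A StringBuilder grows as needed, which sidesteps the fixed length of a char array.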

How to reverse all the words in a string, without changing symbol position?

I have to write a program that reverses every word in a string, but all symbols should stay in their original positions, for example: "a1bcd efg!h" => "d1cba hgf!e". I wrote a simple program that reverses all words including their symbols, but I have no idea how to make it behave like the example.
public void reverseWordInMyString(String str) {
String[] words = str.split(" ");
String reversedString = "";
for (int i = 0; i < words.length; i++) {
String word = words[i];
String reverseWord = "";
for (int j = word.length()-1; j >= 0; j--) {
reverseWord = reverseWord + word.charAt(j);
}
reversedString = reversedString + reverseWord + " ";
}
System.out.println(reversedString);
}
It's a good start. The question is just that tricky.
Your current approach uses a single 'accumulator' which starts at the end of the string and moves back to the start: The j in for (int j =...).
You'll need two accumulators to complete this homework assignment: One going from the front to the back, which steadily increments (so, that'll be for (int i = 0; i < word.length(); i++)), and one which starts at the end and decrements, but not steadily.
The idea is: As you go forward, you inspect the character you find at position i. Then, you use an if, as the question asks you to do different things depending on a condition:
If the character at i is a special character, just add it.
Else, add the last non-special character in the string that we haven't added yet.
The if case is trivial. The else case is not. That's where your second accumulator comes in: it tracks where you are in the string, counting from the end. This is a loop inside a loop. What you'll need to do is
repeat the following algorithm:
If the character at 'j' (which goes from end to start) is a special character, decrement j and restart this algorithm.
Otherwise, that's the 'last non-special character we haven't yet added', so add it, decrement j, and exit this algorithm.
The above can be done with, for example, a while or do/while loop. It'll be inside your for loop.
Good luck!
NB: This isn't the only way to do it. For example, you could also strip all special characters from the input, do a basic reverse on every word, which is a lot simpler than what you have now because StringBuilder has a reverse() method, and then go through your original input character by character and, for each special character you find, insert that character at that position in your output string. That works too. Use whichever strategy you prefer!
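The two-accumulator idea from this answer could be sketched roughly as follows. It assumes "special character" means anything Character.isLetter rejects and that words are separated by single spaces; the class and method names are invented for the example:

```java
class TwoIndexReverse {
    // i walks forward over the word; j walks backward, skipping non-letters.
    static String reverseLetters(String word) {
        StringBuilder out = new StringBuilder(word.length());
        int j = word.length() - 1;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (!Character.isLetter(c)) {
                out.append(c);               // symbols keep their position
            } else {
                while (!Character.isLetter(word.charAt(j))) {
                    j--;                     // skip symbols coming from the end
                }
                out.append(word.charAt(j));  // last letter not yet used
                j--;
            }
        }
        return out.toString();
    }

    static String reverseWords(String str) {
        String[] words = str.split(" ");
        for (int k = 0; k < words.length; k++) {
            words[k] = reverseLetters(words[k]);
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(reverseWords("a1bcd efg!h")); // prints "d1cba hgf!e"
    }
}
```

The inner while loop cannot run past the start of the word: every letter met going forward has a matching unused letter going backward.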
According to www.geeksforgeeks.org:
The Problem:
Given a string that contains special characters together with letters (‘a’ to ‘z’ and ‘A’ to ‘Z’), reverse the string in such a way that the special characters are not affected.
Solution:
Create a temporary character array, say temp[].
Copy the alphabetic characters from the given array to temp[].
Reverse temp[] using a standard string reversal algorithm.
Now traverse the input string and temp in a single loop. Wherever there is an alphabetic character in the input string, replace it with the current character of temp[].
Algorithm:
1) Let input string be 'str[]' and length of string be 'n'
2) l = 0, r = n-1
3) While l is smaller than r, do following
a) If str[l] is not an alphabetic character, do l++
b) Else If str[r] is not an alphabetic character, do r--
c) Else swap str[l] and str[r]
Java Code:
public class ReverseWords {
    public static void main(String[] args) {
        String s = "Thi!s is a sa5mpl?e sentence.";
        System.out.println("Result:" + reverse(s));
        // Prints: Result:sih!T si a el5pma?s ecnetnes. (plus a trailing space)
    }

    public static String reverse(String input) {
        String[] words = input.split("\\s+");
        String last_str = "";
        for (int j = 0; j < words.length; j++) {
            char[] str = words[j].toCharArray();
            // Initialize left and right pointers
            int r = str.length - 1, l = 0;
            // Traverse the word from both ends until 'l' and 'r' meet
            while (l < r) {
                // Ignore special characters
                if (!Character.isAlphabetic(str[l]))
                    l++;
                else if (!Character.isAlphabetic(str[r]))
                    r--;
                // Both str[l] and str[r] are not special
                else {
                    str[l] ^= str[r]; // swap using triple XOR
                    str[r] ^= str[l];
                    str[l] ^= str[r];
                    l++;
                    r--;
                }
            }
            last_str = last_str + new String(str) + " ";
        }
        return last_str;
    }
}
I would approach this as follows:
Keep track of where the current word starts (or, equivalently, where the last non-word character was). This will be updated as we go along.
Scan the input string. Every time a non-word character (special character or space) is found, output the reverse of the current word and then the non-word character. You can output the reverse of the current word by indexing backwards from just before the non-word character to the start of the current word. (Note that the current word might be empty; for example, two special characters in a row).
For the output area, I recommend a StringBuilder rather than a String; this will be more efficient.
In code, it might look something like this (where I've changed the method to return the result rather than print it to the console):
public String reverseWordInMyString(String str) {
StringBuilder output = new StringBuilder(str.length()); // initialize full capacity
int curWordStart = 0;
// scan the input string
for (int i = 0; i < str.length(); i++) {
char curLetter = str.charAt(i);
if (!Character.isLetter(curLetter)) {
// the current word has ended--output it in reverse
for (int j = i-1; j >= curWordStart; j--) {
output.append(str.charAt(j));
}
// output the current letter
output.append(curLetter);
// the next current word starts just after the current letter
curWordStart = i + 1;
}
}
// The last current word (if any) ends with the end of string,
// not a special character, so add it (reversed) as well to output
for (int j = str.length() - 1; j >= curWordStart; j--) {
output.append(str.charAt(j));
}
return output.toString();
}

Create substring using loop

I am given a string from which I have to find the substrings that satisfy one of the following conditions:
All characters in the substring are the same, e.g. aa, bbb, cccc.
All characters except the middle character are the same,
e.g. aba, bbabb, etc.
I've come up with an algorithm like this:
I break the string up using two loops: the outer loop fixes the first character and the inner loop traverses the rest of the string.
Then I pass the substring to vet() to check that it contains at most two distinct characters.
If it does, I check whether it is a palindrome.
public static int reverse(String s)
{
String wrd="";
for(int i = s.length()-1 ;i>=0;i--)
wrd = wrd + s.charAt(i);
if(s.equals(wrd))
return 1;
else
return 0;
}
public static boolean vet(String s)
{
HashSet<Character> hs = new HashSet<>();
for(char c : s.toCharArray())
{
hs.add(c);
}
if(hs.size() <= 2)
return true;
else
return false;
}
static long substrCount(int n, String s) {
List<String> al = new ArrayList<>();
for(int i=0;i<s.length();i++)
{
for(int j=i;j<s.length();j++)
{
if(vet(s.substring(i,j+1)))
{
if(reverse(s.substring(i,j+1)) == 1)
al.add(s.substring(i,j+1));
}
}
}
return al.size();
}
This code works fine for small strings; however, if the string is big, say ten thousand characters, it fails with a time limit exceeded error.
I suspect that the nested loops in substrCount(), which break the string up and create the substrings, are what drives up the time complexity.
Please review this code and suggest a better way to break up the string, or let me know if the complexity comes from some other section.
link : https://www.hackerrank.com/challenges/special-palindrome-again/problem?h_l=interview&playlist_slugs%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=strings
You can collect counts from left side and right side of the string in 2 separate arrays.
Now, we collect counts in the fashion of if previous char equals current char, increase count by 1, else set it to 1.
Example:
a a b a a c a a
1 2 1 1 2 1 1 2 // left to right
2 1 1 2 1 1 2 1 // right to left
For strings in which all characters are equal, we can simply count them while iterating.
For strings in which all characters are equal except the middle one, you can use the table above and collect the strings as below:
Pseudocode:
if(str.charAt(i-1) == str.charAt(i+1)){ // you will add checks for boundaries
int min_count = Math.min(left[i-1],right[i+1]);
for(int j=min_count;j>=1;--j){
set.add(str.substring(i-j,i+j+1));
}
}
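As a quick check of the table in this answer, the two count arrays can be built like this. It is only a sketch with invented class and method names; the recurrence is the one described above (extend the run if the neighbouring character is equal, else restart at 1):

```java
import java.util.Arrays;

class RunCountsSketch {
    // left[i] = length of the run of equal characters ending at i
    static int[] leftCounts(String s) {
        int[] left = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            left[i] = (i > 0 && s.charAt(i) == s.charAt(i - 1)) ? left[i - 1] + 1 : 1;
        }
        return left;
    }

    // right[i] = length of the run of equal characters starting at i
    static int[] rightCounts(String s) {
        int[] right = new int[s.length()];
        for (int i = s.length() - 1; i >= 0; i--) {
            right[i] = (i < s.length() - 1 && s.charAt(i) == s.charAt(i + 1)) ? right[i + 1] + 1 : 1;
        }
        return right;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(leftCounts("aabaacaa")));  // [1, 2, 1, 1, 2, 1, 1, 2]
        System.out.println(Arrays.toString(rightCounts("aabaacaa"))); // [2, 1, 1, 2, 1, 1, 2, 1]
    }
}
```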
Update:
Below is my accepted solution.
static long substrCount(int n, String s) {
long cnt = 0;
int[] left = new int[n];
int[] right = new int[n];
int len = s.length();
for(int i=0;i<len;++i){
left[i] = 1;
if(i > 0 && s.charAt(i) == s.charAt(i-1)) left[i] += left[i-1];
}
for(int i=len-1;i>=0;--i){
right[i] = 1;
if(i < len-1 && s.charAt(i) == s.charAt(i+1)) right[i] += right[i+1];
}
for(int i=len-1;i>=0;--i){
if(i == 0 || i == len-1) cnt += right[i];
else{
if(s.charAt(i-1) == s.charAt(i+1) && s.charAt(i-1) != s.charAt(i)){
cnt += Math.min(left[i-1],right[i+1]) + 1;
}else if(s.charAt(i) == s.charAt(i+1)) cnt += right[i];
else cnt++;
}
}
return cnt;
}
Algorithm:
The algorithm is the same as explained above, with a few additions.
If the character is at the boundary, i.e. at 0 or at len-1, we just look at right[i] to count the strings, because there is nothing to pair it with on the other side.
If a character is inside this boundary, we do the checks as follows:
If the previous character equals the next character, we also check that the previous character does not equal the current character. We do this to avoid counting, at the current iteration, strings that are counted elsewhere (say, for strings like aaaaa where we are at the middle a).
The second condition, s.charAt(i) == s.charAt(i+1), means we again have a string like aaa and we are at its first a. So we just add right[i] to account for strings like a, aa, aaa.
The third branch does cnt++, which counts the single character on its own.
You can make a few optimizations, such as avoiding the right array completely, but I leave that to you as an exercise.
Time complexity: O(n), Space complexity: O(n)
Your current solution's runtime is O(n^4). You can reduce it to O(n^2 log n) by no longer counting the characters of each substring from scratch and by optimising the palindrome check.
To do so, precompute an array, say "counter", where each position holds the number of different characters seen from the start of the string up to that position.
After constructing the array, you can check in O(1) whether a substring has more than two characters by subtracting the counter value at the start position from the value at the end position. If the value is 1, there is only one character in the substring. If the value is 2, you can binary search the counter array between the substring's start and end positions to find the position of the single different character. Once you have that position, it is straightforward to check whether the substring is a palindrome.
UPDATE!
Let me explain with an example:
Suppose the string is "aaabaaa".
So, the counter array would be = [1, 1, 1, 2, 2, 2, 2];
Now, let's assume that at some point the outer for loop's value is i = 1 and the inner for loop's value is j = 5, so the substring is "aabaa".
Now find the number of different characters in the substring with the following code:
noOfDifferentCharacter = counter[j] - counter[i-1] + 1
If noOfDifferentCharacter is 1, there is no need to check for a palindrome. If noOfDifferentCharacter is 2, as in our case, we need to check whether the substring is a palindrome. To do that, perform a binary search in the counter array from index i to j for the position where the value is greater than at its previous index. In our case that position is 3; then you just need to check whether it is the middle position of the substring. Note that the counter array is sorted.
Hope this helps. Let me know if you don't understand any step. Happy coding!

How do I reorder characters in a String without the use of Hashmaps?

My code below is giving the following error and I can't figure out why. I am trying to reorder the entered word ("Polish" for example) in the order of:
(First letter, last letter, second letter, second last letter, third letter... so on) so the output should be "Phosli".
Updated code
public static String encodeTheWord(String word1)
{
int b = 0;
int e = word1.length()-1;
String word2 = "";
for (int i=0; i<e; i++)
{
word2 = word2 + word1.charAt(b) + word1.charAt(e);
b+=1;
e-=1;
}
System.out.println(word2);
return (word2);
}
For a word with an even number of characters (Polish), the order of the character indices becomes 051423, so b goes from 0 to 2 and e goes from 5 down to 3. Thus, your loop should run word1.length() / 2 times, incrementing b and decrementing e each time. Also,
int e = word1.length();
Would need to be:
int e = word1.length() - 1;
For words of odd length (word1.length() % 2 > 0) you need an extra check, or you will repeat the middle character.
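Putting these points together, one way to sketch the method is with two indices that move toward each other, plus a separate step for the middle character of odd-length words. The class name and the use of StringBuilder are choices made for this example, not part of the question:

```java
class EncodeSketch {
    // Interleaves characters from the front and the back of the word.
    static String encodeTheWord(String word) {
        StringBuilder out = new StringBuilder(word.length());
        int b = 0;
        int e = word.length() - 1;
        while (b < e) {
            out.append(word.charAt(b++)).append(word.charAt(e--));
        }
        if (b == e) {
            out.append(word.charAt(b)); // odd length: middle character added once
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeTheWord("Polish")); // prints "Phosli"
        System.out.println(encodeTheWord("Hello"));  // prints "Hoell"
    }
}
```

Looping while b < e avoids computing the iteration count by hand and never reads an index outside 0..length-1.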
Your for loop is wrong: you can only get a char at indices 0 through word1.length()-1...
It must be
for (int i=0; i<word1.length()-1; i++)
The same applies to this...
word1.charAt(e);
because you defined e as word1.length().
