LZW decoding misses the first code entry - Java

I followed the Rosetta Code Java implementation of LZW.
I want to do the LZW coding with my own dictionary instead of the ASCII dictionary used there.
With my own dictionary the decoding goes wrong: the first 'a' of each decoded word is missing.
The result should be 'abraca abrac abra', not 'braca brac bra'.
I think the problem is in the decode() method, at String act = "" + (char)(int)compressed.remove(0); This is what loses every leading 'a'.
But I have no idea how to modify this line.
For example, if I use String act = ""; instead of the line above, the decoding is completely wrong. I don't know how to solve this little problem, or maybe I am looking for the solution in the wrong place.
public class LZW {
public static List<Integer> encode(String uncompressed) {
Map<String,Integer> dictionary = DictionaryInitStringInt();
int dictSize = dictionary.size();
String act = "";
List<Integer> result = new ArrayList<Integer>();
for (char c : uncompressed.toCharArray()) {
String next = act + c;
if (dictionary.containsKey(next))
act = next;
else {
result.add(dictionary.get(act));
// Add next to the dictionary.
dictionary.put(next, dictSize++);
act = "" + c;
}
}
// Output the code for act.
if (!act.equals(""))
result.add(dictionary.get(act));
return result;
}
public static String decode(List<Integer> compressed) {
Map<Integer,String> dictionary = DictionaryInitIntString();
int dictSize = dictionary.size();
String act = "" + (char)(int)compressed.remove(0);
//String act = "";
String result = act;
for (int k : compressed) {
String entry;
if (dictionary.containsKey(k))
entry = dictionary.get(k);
else if (k == dictSize)
entry = act + act.charAt(0);
else
throw new IllegalArgumentException("No such key: " + k);
result += entry;
dictionary.put(dictSize++, act + entry.charAt(0));
act = entry;
}
return result;
}
public static Map<String,Integer> DictionaryInitStringInt()
{
char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
'?','.',','};
int charactersLength = characters.length;
Map<String,Integer> dictionary = new HashMap<String,Integer>();
for (int i = 0; i < charactersLength; i++)
dictionary.put("" + characters[i], i);
return dictionary;
}
public static Map<Integer,String> DictionaryInitIntString()
{
char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
'?','.',','};
int charactersLength = characters.length;
Map<Integer,String> dictionary = new HashMap<Integer,String>();
for (int i = 0; i < charactersLength; i++)
dictionary.put(i,"" + characters[i]);
return dictionary;
}
public static void main(String[] args) {
List<Integer> compressed = encode("abraca abrac abra");
System.out.println(compressed);
String decodeed = decode(compressed);
// decodeed will be 'braca brac bra'
System.out.println(decodeed);
}
}

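The Rosetta Code example uses
"" + (char) (int) compressed.remove(0);
because the first 256 entries of its dictionary map exactly to the char values 0 to 255, so casting the code to a char yields the right string. In your dictionary code 0 stands for 'a', but the cast turns it into the character with value 0 instead.
With a custom dictionary this line should be:
String act = dictionary.get(compressed.remove(0));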
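To see the fix in context, here is a minimal self-contained sketch of the round trip with the question's 31-character alphabet. The class name CustomDictionaryLzw and the ALPHABET constant are made up for this illustration, and the sketch assumes the input contains only characters from that alphabet. Unlike the original, decode() reads the first code with get(0) rather than remove(0), so the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CustomDictionaryLzw {
    // The question's alphabet: code i stands for ALPHABET.charAt(i).
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz !?.,";

    // Encoder as in the question; assumes the input only uses ALPHABET characters.
    static List<Integer> encode(String uncompressed) {
        Map<String, Integer> dictionary = new HashMap<>();
        for (int i = 0; i < ALPHABET.length(); i++) {
            dictionary.put(String.valueOf(ALPHABET.charAt(i)), i);
        }
        int dictSize = dictionary.size();
        String act = "";
        List<Integer> result = new ArrayList<>();
        for (char c : uncompressed.toCharArray()) {
            String next = act + c;
            if (dictionary.containsKey(next)) {
                act = next;
            } else {
                result.add(dictionary.get(act));
                dictionary.put(next, dictSize++);
                act = String.valueOf(c);
            }
        }
        if (!act.isEmpty()) {
            result.add(dictionary.get(act));
        }
        return result;
    }

    static String decode(List<Integer> compressed) {
        if (compressed.isEmpty()) {
            return "";
        }
        Map<Integer, String> dictionary = new HashMap<>();
        for (int i = 0; i < ALPHABET.length(); i++) {
            dictionary.put(i, String.valueOf(ALPHABET.charAt(i)));
        }
        int dictSize = dictionary.size();
        // Look the first code up in the dictionary instead of casting it to a char.
        String act = dictionary.get(compressed.get(0));
        if (act == null) {
            throw new IllegalArgumentException("No such key: " + compressed.get(0));
        }
        StringBuilder result = new StringBuilder(act);
        for (int k : compressed.subList(1, compressed.size())) {
            String entry;
            if (dictionary.containsKey(k)) {
                entry = dictionary.get(k);
            } else if (k == dictSize) {
                // Code not in the dictionary yet: it must be act + first char of act.
                entry = act + act.charAt(0);
            } else {
                throw new IllegalArgumentException("No such key: " + k);
            }
            result.append(entry);
            dictionary.put(dictSize++, act + entry.charAt(0));
            act = entry;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<Integer> compressed = encode("abraca abrac abra");
        System.out.println(decode(compressed)); // prints "abraca abrac abra"
    }
}
```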

Related

Splitting a string with CaesarCipherBreaker

How would I add code to this example to create a CaesarCipherBreaker method that splits the encrypted message so that two keys can be found? So far I have written this much:
import edu.duke.*;
public class TestCaesarCipherTwo {
public int[] countOccurrencesOfLetters(String message) {
//snippet from lecture
String alph = "abcdefghijklmnopqrstuvwxyz";
int[] counts = new int[26];
for (int k=0; k < message.length(); k++) {
char ch = Character.toLowerCase(message.charAt(k));
int dex = alph.indexOf(ch);
if (dex != -1) {
counts[dex] += 1;
}
}
return counts;
}
public int maxIndex(int[] values) {
int maxDex = 0;
for (int k=0; k < values.length; k++) {
if (values[k] > values[maxDex]) {
maxDex = k;
}
}
return maxDex;
}
public String halfOfString(String message, int start) {
StringBuilder halfString = new StringBuilder();
for (int index=start;index < message.length();index += 2) {
halfString.append(message.charAt(index));
}
return halfString.toString();
}
public void simpleTests() {
FileResource fileResource = new FileResource();
String fileAsString = fileResource.asString();
CaesarCipherTwoKeys cctk = new CaesarCipherTwoKeys(17, 3);
String encrypted = cctk.encrypt(fileAsString);
System.out.println("Encrypted string:\n"+encrypted);
String decrypted = cctk.decrypt(encrypted);
System.out.println("Decrypted string:\n"+decrypted);
String blindDecrypted = breakCaesarCipher(encrypted);
System.out.println("Decrypted string using breakCaesarCipher():\n"+blindDecrypted);
}
public String breakCaesarCipher(String input) {
int[] freqs = countOccurrencesOfLetters(input);
int freqDex = maxIndex(freqs);
int dkey = freqDex - 4;
if (freqDex < 4) {
dkey = 26 - (4-freqDex);
}
CaesarCipherTwoKeys cctk = new CaesarCipherTwoKeys(dkey);
return cctk.decrypt(input);
}
}
Note: I also get a constructor error on the line CaesarCipherTwoKeys cctk = new CaesarCipherTwoKeys(dkey); stating: constructor CaesarCipherTwoKeys in class CaesarCipherTwoKeys cannot be applied to given types; required: int,int; found: int.
The breakCaesarCipher method I have now only figures out one key, not two. How should I go about writing a method that splits an encrypted string and figures out the two keys used for decryption?
If I understand your code correctly, you could just call your halfOfString method twice to get the two parts of the ciphertext and then use your usual approach to breaking a Caesar cipher on each part separately.
Your error results from the fact that the two-key cipher expects (unsurprisingly) two keys. You should pass both of them to the constructor.
public String breakCaesarCipher(String input) {
String in_0 = halfOfString(input, 0);
String in_1 = halfOfString(input, 1);
// Find first key
// Determine character frequencies in ciphertext
int[] freqs_0 = countOccurrencesOfLetters(in_0);
// Get the most common character
int freqDex_0 = maxIndex(freqs_0);
// Calculate key such that 'E' would be mapped to the most common ciphertext character
// since 'E' is expected to be the most common plaintext character
int dkey_0 = freqDex_0 - 4;
// Make sure our key is non-negative
if (dkey_0 < 0) {
dkey_0 = dkey_0+26;
}
// Find second key
int[] freqs_1 = countOccurrencesOfLetters(in_1);
int freqDex_1 = maxIndex(freqs_1);
int dkey_1 = freqDex_1 - 4;
if (dkey_1 < 0) {
dkey_1 = dkey_1+26;
}
CaesarCipherTwoKeys cctk = new CaesarCipherTwoKeys(dkey_0, dkey_1);
return cctk.decrypt(input);
}
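Since CaesarCipherTwoKeys and edu.duke are not shown in the question, the following is only a self-contained sketch of the same idea. The encrypt method here is a hypothetical stand-in that applies key1 to even indices and key2 to odd indices, and guessKey assumes that 'e' is the most common plaintext letter in each half, which only holds for sufficiently long English text.

```java
class TwoKeyCaesarBreaker {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Shifts a letter by key positions (negative keys shift back); other characters pass through.
    static char shift(char ch, int key) {
        int idx = ALPHABET.indexOf(Character.toLowerCase(ch));
        if (idx == -1) {
            return ch;
        }
        char shifted = ALPHABET.charAt(Math.floorMod(idx + key, 26));
        return Character.isUpperCase(ch) ? Character.toUpperCase(shifted) : shifted;
    }

    // Hypothetical stand-in for CaesarCipherTwoKeys: key1 on even indices, key2 on odd indices.
    static String encrypt(String message, int key1, int key2) {
        StringBuilder sb = new StringBuilder(message.length());
        for (int i = 0; i < message.length(); i++) {
            sb.append(shift(message.charAt(i), i % 2 == 0 ? key1 : key2));
        }
        return sb.toString();
    }

    // Every second character of message, beginning at index start.
    static String halfOfString(String message, int start) {
        StringBuilder half = new StringBuilder();
        for (int i = start; i < message.length(); i += 2) {
            half.append(message.charAt(i));
        }
        return half.toString();
    }

    // Guesses the key of one half, assuming its most frequent letter stands for 'e'.
    static int guessKey(String half) {
        int[] counts = new int[26];
        for (char ch : half.toCharArray()) {
            int idx = ALPHABET.indexOf(Character.toLowerCase(ch));
            if (idx != -1) {
                counts[idx]++;
            }
        }
        int maxDex = 0;
        for (int k = 1; k < counts.length; k++) {
            if (counts[k] > counts[maxDex]) {
                maxDex = k;
            }
        }
        return Math.floorMod(maxDex - 4, 26); // 'e' has index 4
    }

    static String breakCipher(String encrypted) {
        int key1 = guessKey(halfOfString(encrypted, 0));
        int key2 = guessKey(halfOfString(encrypted, 1));
        // Decrypting is encrypting with the negated keys.
        return encrypt(encrypted, -key1, -key2);
    }

    public static void main(String[] args) {
        String encrypted = encrypt("the eel sees the reef", 17, 3);
        System.out.println(breakCipher(encrypted)); // prints "the eel sees the reef"
    }
}
```

The sample sentence is chosen so that 'e' dominates both halves; on short or unusual texts the frequency guess can easily be wrong.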

Comparing two strings by character in java

I have two strings:
first = "BSNLP"
second = "PBN" (or anything the user enters).
The requirement is that the output should contain only those characters that are in first but not in second.
E.g. in this case the output is "SL".
Another example:
first = "ASDR"
second = "MRT"
output = "ASD"
This is the code I have developed so far:
String one = "pnlm";
String two ="bsnl";
String fin = "";
for(int i =0; i<one.length();i++)
{
for(int j=0;j<two.length();j++)
{
//System.out.print(" "+two.charAt(j));
if(one.charAt(i) == two.charAt(j))
{
fin+=one.charAt(i);
}
}
}
ch=removeDuplicates(fin);
System.out.print(" Ret ::"+fin);
System.out.println("\n Val ::"+ch);
ch gives me the string of characters common to both, but with this logic I can't get the characters that differ.
Can anyone please help?
You can put all the characters of the second string into a Set, so that you can check against it later.
Sample:
String one = "ASDR";
String two ="MRT";
StringBuilder s = new StringBuilder();
Set<Character> set = new HashSet<>();
for(char c : two.toCharArray())
set.add(c); //add all second string character to set
for(char c : one.toCharArray())
{
if(!set.contains(c)) //check if the character is not one of the character of second string
s.append(c); //append the current character to the pool
}
System.out.println(s);
Result:
ASD
I have simply inverted your logic, see:
String one = "pnlm";
String two = "bsnl";
String fin = "";
int cnt;
for (int i = 0; i < one.length(); i++) {
cnt = 0; // zero: no equal character found yet
for (int j = 0; j < two.length(); j++) {
// System.out.print(" "+two.charAt(j));
if (one.charAt(i) == two.charAt(j)) {
cnt = 1; // one: an equal character was found
}
}
if (cnt == 0) {
fin += one.charAt(i);
}
}
System.out.print(" Ret ::" + fin);
Output: Ret ::pm
public static void main(String[] args)
{
String one = "ASDR";
String two ="MRT";
String fin = unique(one, two);
System.out.println(fin);
}
private static String unique(final String one,
final String two)
{
final List<Character> base;
final Set<Character> toRemove;
final StringBuilder remaining;
base = new ArrayList<>(one.length());
toRemove = new HashSet<>();
for(final char c : one.toCharArray())
{
base.add(c);
}
for(final char c : two.toCharArray())
{
toRemove.add(c);
}
base.removeAll(toRemove);
remaining = new StringBuilder(base.size());
for(final char c : base)
{
remaining.append(c);
}
return (remaining.toString());
}
Iterate over the first string
For each character, check if the second string contains it
If it doesn't, add the character to a StringBuilder
Return stringBuilder.toString()
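The four steps above can be sketched as follows. The class and method names are made up for this illustration, and String.indexOf is used for the "contains" check, which is fine for short strings (for long second strings a Set lookup would scale better).

```java
class StringDifference {
    // Returns the characters of first that do not occur in second, in their original order.
    static String charsOnlyInFirst(String first, String second) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < first.length(); i++) {
            char c = first.charAt(i);
            // indexOf returns -1 when second does not contain c.
            if (second.indexOf(c) == -1) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(charsOnlyInFirst("BSNLP", "PBN")); // prints "SL"
        System.out.println(charsOnlyInFirst("ASDR", "MRT"));  // prints "ASD"
    }
}
```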

Invert Concordance Using Java

Today I'm working with a client that creates a concordance from a text file using Java. All I need to do is invert the concordance, essentially recreating the text from start to finish. The issue I'm having is where to start and how to do each step. So far I have tried to create an array of words, iterate through my symbol table, and assign each key to the array, but I end up with just a list of the distinct words in the concordance. This problem makes me feel very stupid because it seems like it should have a simple solution, yet I can't think of any valid idea to get me started on recreating the story. I have included the source here:
public class InvertedConcordance {
public static ST<String, SET<Integer>> createConcordance (String[] words) {
ST<String, SET<Integer>> st = new ST<String, SET<Integer>>();
for (int i = 0; i < words.length; i++) {
String s = words[i];
if (!st.contains(s)) {
st.put(s, new SET<Integer>());
}
SET<Integer> set = st.get(s);
set.add(i);
}
return st;
}
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//This is what I have so far
//Here is what I have that doesnt work
for(String key : st.keys())
{
inv[i++] = key;
}
for(int z = 0; z< inv.length; z++)
{
System.out.println(inv[z]);
}
String[]inv = new String[st.size()];
return inv;
}
private static void saveWords (String fileName, String[] words) {
int MAX_LENGTH = 70;
Out out = new Out (fileName);
int length = 0;
for (String word : words) {
length += word.length ();
if (length > MAX_LENGTH) {
out.println ();
length = word.length ();
}
out.print (word);
out.print (" ");
length++;
}
out.close ();
}
public static void main(String[] args) {
String fileName = "data/tale.txt";
In in = new In (fileName);
String[] words = in.readAll().split("\\s+");
ST<String, SET<Integer>> st = createConcordance (words);
StdOut.println("Finished building concordance");
// write to a file and read back in (to check that serialization works)
//serialize ("data/concordance-tale.txt", st);
//st = deserialize ("data/concordance-tale.txt");
words = invertConcordance (st);
saveWords ("data/reconstructed-tale.txt", words);
}
}
First of all, why are you using custom classes like:
SET
ST
instead of the built-in Java interfaces:
Set
Map
which are all that is needed here?
As for your problem, your code should not compile at all, since you use the variable inv before declaring it (and the index i is never declared):
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//This is what I have so far
//Here is what I have that doesnt work
for(String key : st.keys())
{
inv[i++] = key;
}
for(int z = 0; z< inv.length; z++)
{
System.out.println(inv[z]);
}
String[]inv = new String[st.size()];
return inv;
}
If I understand your idea correctly, the concordance simply maps each word to the set of indices at which it was found. If this interpretation is correct, then the inverse operation would be:
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//First, find the maximum index in the concordance; the document length is that index plus one
int document_length = 0;
for(String key : st.keys()){
for(Integer i : st.get(key)){
if(i>document_length){
document_length=i;
}
}
}
//Create the document
String[] document = new String[document_length+1];
//Reconstruct
for(String key : st.keys()){
for(Integer i : st.get(key)){
document[i] = key;
}
}
return document;
}
I assumed that indices are numbered from 0 to the document's length - 1. If they are actually stored from 1 to the document's length, you should change the line:
String[] document = new String[document_length+1];
to
String[] document = new String[document_length];
and
document[i] = key;
to
document[i-1] = key;
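For comparison, here is a rough sketch of the same round trip using only java.util collections instead of ST and SET. The class name ConcordanceRoundTrip is invented for this example, and it assumes 0-based positions with every position present exactly once. Starting the maximum at -1 also makes an empty concordance produce an empty array.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ConcordanceRoundTrip {
    // Maps each word to the set of 0-based positions at which it occurs.
    static Map<String, Set<Integer>> createConcordance(String[] words) {
        Map<String, Set<Integer>> concordance = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            concordance.computeIfAbsent(words[i], w -> new TreeSet<>()).add(i);
        }
        return concordance;
    }

    // Rebuilds the word array by writing each word back to all of its positions.
    static String[] invertConcordance(Map<String, Set<Integer>> concordance) {
        int maxIndex = -1;
        for (Set<Integer> positions : concordance.values()) {
            for (int position : positions) {
                maxIndex = Math.max(maxIndex, position);
            }
        }
        String[] document = new String[maxIndex + 1];
        for (Map.Entry<String, Set<Integer>> entry : concordance.entrySet()) {
            for (int position : entry.getValue()) {
                document[position] = entry.getKey();
            }
        }
        return document;
    }

    public static void main(String[] args) {
        String[] words = "it was the best of times it was the worst of times".split("\\s+");
        String[] rebuilt = invertConcordance(createConcordance(words));
        System.out.println(String.join(" ", rebuilt));
    }
}
```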

Add to list at certain index

I'm having a problem with some list manipulation. I take the user's input and search through it: if I find an "=" sign, I assume that the string in front of it is the name of a variable, so on the line right above that variable I want to add a new string to the user's input (here it is called "tempVAR", but the name doesn't really matter). I tried to do this with a StringBuilder without any success, so I am currently trying to do it with ArrayLists, but I am stuck at adding new elements to the list. Because of the way list.add(index, string) works, every element to the right of the insertion point has its index increased by 1. Is there a way to always know exactly which index I am looking for, even after an arbitrary number of strings have been added? Here is my code so far. If you run it you will see what I mean: instead of "tempVAR" or "tempVar1" being added above the name of the variable, they end up one or two positions off.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;
public class ToTestStuff {
static List<String> referenceList = new ArrayList<String>();
public static final String SEMICOLUMN = ";";
public static final String BLANK = " ";
public static final String EMPTY = "";
public static final String LEFT_CURLY = "{";
public static final char CARRIAGE_RETURN = '\r';
public static final String CR_STRING = "CARRIAGE_RETURN_AND_NEW_LINE";
public static final char NEW_LINE = '\n';
public static void main(String[] args) {
List<String> test = new ArrayList<String>();
String x = "AGE_X";
String y = "AGE_Y";
String z = "AGE_YEARS";
String t = "P_PERIOD";
String w = "T_VALID";
referenceList.add(x);
referenceList.add(y);
referenceList.add(z);
referenceList.add(t);
referenceList.add(w);
String text2 = " if ( AGE_YEARS > 35 ) {\r\n"
+ " varX = P_PERIOD ;\r\n"
+ " }\r\n"
+ " if ( AGE_YEARS < 35 ) {\r\n"
+ " varY = T_VALID ;\r\n"
+ " varZ = AGE_Y ;\r\n"
+ " varA = AGE_X ;\r\n"
+ " }";
detectEquals(text2);
}
public static String detectEquals(String text) {
String a = null;
// text = text.trim();
// text = TestSplitting.addDelimiters(text);
String[] newString = text.split(" ");
List<String> test = Arrays.asList(newString);
StringBuilder strBuilder = new StringBuilder();
HashMap<String, List<Integer>> signs = new HashMap<String, List<Integer>>();
HashMap<String, List<Integer>> references = new HashMap<String, List<Integer>>();
HashMap<Integer, Integer> indexesOfStringAndList = new HashMap<Integer, Integer>();
List<String> testList = new ArrayList<String>();
List<Integer> lastList = new ArrayList<Integer>();
List<Integer> indexList = new ArrayList<Integer>();
List<Integer> refList = new ArrayList<Integer>();
List<String> keysList = new ArrayList<String>();
List<List> minList = new ArrayList<List>();
String previous = null;
int index = 0;
Object obj = new Object();
List<Integer> referenceValueList = new ArrayList<Integer>();
List<Integer> indexPosition = new ArrayList<Integer>();
String b = null;
int indexOfa = 0;
// System.out.println("a----> " + test);
List<String> anotherList = new ArrayList(test);
for (int i = 0; i < anotherList.size(); i++) {
a = anotherList.get(i).trim();
index = strBuilder.length();// - a.length();
// index = i;
strBuilder.append(a); // "=", 3 - if, 14 - while, 36 , "=", 15
testList.add(a);
if (a.equals("if") || a.equals("=")) {
lastList.add(i);
indexOfa = i;
indexesOfStringAndList.put(index, indexOfa);
refList.add(index);
indexPosition.add(index);
if (signs.containsKey(a)) {
signs.get(a).add(index);
} else {
signs.put(a, refList);
}
refList = new ArrayList<Integer>();
}
if (referenceList.contains(a)) {
indexList.add(index);
if (references.containsKey(a)) {
references.get(a).add(index);
} else {
references.put(a, indexList);
}
indexList = new ArrayList<Integer>();
}
}
for (String k : references.keySet()) {
keysList.add(k);
referenceValueList = references.get(k);
obj = Collections.min(referenceValueList);
int is = (Integer) obj;
ArrayList xx = new ArrayList();
xx.add(new Integer(is));
xx.add(k);
minList.add(xx);
}
for (List q : minList) {
Integer v = (Integer) q.get(0);
String ref = (String) q.get(1);
int x = closest(v, indexPosition);
int lSize = anotherList.size();
int sizeVar = lSize - test.size();
int indexOfPx = 0;
int px = 0;
if (x != 0) {
px = indexesOfStringAndList.get(x) - 1;
} else {
px = indexesOfStringAndList.get(x);
}
if (px == 0) {
System.out.println("previous when x=0 " +anotherList.get(px+sizeVar));
anotherList.add(px, "tempVar1=\r\n");
} else {
previous = anotherList.get(px + sizeVar);
System.out.println("previous is---> " + previous + " at position " + anotherList.indexOf(previous));
anotherList.add(anotherList.indexOf(previous) - 1, "\r\ntempVAR=");
}
}
strBuilder.setLength(0);
for (int j = 0; j < anotherList.size(); j++) {
b = anotherList.get(j);
strBuilder.append(b);
}
String stream = strBuilder.toString();
// stream = stream.replaceAll(CR_STRING, CARRIAGE_RETURN + EMPTY + NEW_LINE);
System.out.println("after ----> " + stream);
return stream;
}
public static int closest(int of, List<Integer> in) {
int min = Integer.MAX_VALUE;
int closest = of;
for (int v : in) {
final int diff = Math.abs(v - of);
if (diff < min) {
min = diff;
closest = v;
}
}
return closest;
}
}
I've mapped the positions of "=" and "if" to their positions in the StringBuilder, but these are remnants from when I was trying to use a StringBuilder to do what I described above.
I have been struggling with this for a few days now and still haven't managed to do what I need; I am not sure where I am going wrong. At the moment I am hell-bent on making this work as it is (with either lists or a StringBuilder); after that, if there is a better way, I will look into it and adapt accordingly.
addDelimiters() is a method I created to avoid writing the string the way you see it in "String text2", but I took it out here because it would only clutter my already chaotic code even more :). I don't think it has any relevance to why my approach is not working.
TL;DR: on the line above every varX, varY or other "var" I would like to add a string to the list, but I think my logic for finding the variable names or for adding to the list is wrong.
I think we both know that your code is messy and needs many more abstractions to make it better. But you could make it work by maintaining an offset variable, let's say "int offset". Each time you insert a string after the initial pass, you increment it, and when you access the list you add it to the index: "list.get(index + offset);". This works as long as you process the insertion points in ascending order. Also read up on abstract syntax trees, which are a great way to parse and manipulate languages.
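A minimal sketch of the offset idea, stripped of the rest of the question's code. The class OffsetInsert, the method insertBefore and the simplified token list are made up for this illustration, and the positions are assumed to be indices into the original list in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OffsetInsert {
    // Inserts marker before each of the given positions.
    // positions are indices into the ORIGINAL list and must be in ascending order.
    static List<String> insertBefore(List<String> tokens, List<Integer> positions, String marker) {
        List<String> result = new ArrayList<>(tokens);
        int offset = 0; // number of elements inserted so far
        for (int position : positions) {
            result.add(position + offset, marker);
            offset++;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "varX", "=", "P_PERIOD", ";", "varY", "=", "T_VALID", ";");
        // First pass: the token in front of each "=" is taken to be a variable name.
        List<Integer> positions = new ArrayList<>();
        for (int i = 1; i < tokens.size(); i++) {
            if (tokens.get(i).equals("=")) {
                positions.add(i - 1);
            }
        }
        // Second pass: insert using the original indices plus the running offset.
        System.out.println(insertBefore(tokens, positions, "tempVAR"));
        // prints [tempVAR, varX, =, P_PERIOD, ;, tempVAR, varY, =, T_VALID, ;]
    }
}
```

An alternative with the same effect is to walk the positions in descending order, since an insertion never moves the elements in front of it; then no offset is needed at all.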

Java String parsing - {k1=v1,k2=v2,...}

I have the following string, which will probably contain ~100 entries:
String foo = "{k1=v1,k2=v2,...}";
and I am looking to write the following function:
String getValue(String key){
// return the value associated with this key
}
I would like to do this without using any parsing library. Any ideas for something speedy?
If you know your string will always look like this, try something like:
Map<String, String> map = new HashMap<String, String>(); // needs java.util.Map, HashMap, StringTokenizer
public void parse(String foo) {
    String foo2 = foo.substring(1, foo.length() - 1); // hack off braces
    StringTokenizer st = new StringTokenizer(foo2, ",");
    while (st.hasMoreTokens()) {
        String thisToken = st.nextToken();
        StringTokenizer st2 = new StringTokenizer(thisToken, "=");
        map.put(st2.nextToken(), st2.nextToken());
    }
}
String getValue(String key) {
    return map.get(key); // null if the key is absent
}
Warning: I didn't actually try this; there might be minor syntax errors but the logic should be sound. Note that I also did exactly zero error checking, so you might want to make what I did more robust.
The speediest, but ugliest answer I can think of is parsing it character by character using a state machine. It's very fast, but very specific and quite complex. The way I see it, you could have several states:
Parsing Key
Parsing Value
Ready
Example:
int length = foo.length();
int state = READY;
for (int i=0; i<length; ++i) {
switch (state) {
case READY:
//Skip commas and brackets
//Transition to the KEY state if you find a letter
break;
case KEY:
//Read until you hit a = then transition to the value state
//append each letter to a StringBuilder and track the name
//Store the name when you transition to the value state
break;
case VALUE:
//Read until you hit a , then transition to the ready state
//Remember to save the built-key and built-value somewhere
break;
}
}
Alternatively, you can write this much more quickly using StringTokenizer (which is fast) or regular expressions (which are slower). But overall, parsing character by character is most likely the fastest approach at run time.
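One possible way to fill in the state machine skeleton is sketched below. The class name StateMachineParser, the enum State and the whitespace trimming are choices made for this illustration rather than anything from the answer, and values are assumed not to contain ',' or '}'.

```java
import java.util.HashMap;
import java.util.Map;

class StateMachineParser {
    private enum State { READY, KEY, VALUE }

    static Map<String, String> parse(String foo) {
        Map<String, String> map = new HashMap<>();
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        State state = State.READY;
        for (int i = 0; i < foo.length(); i++) {
            char c = foo.charAt(i);
            switch (state) {
                case READY:
                    // Skip braces, commas and whitespace; anything else starts a key.
                    if (c != '{' && c != '}' && c != ',' && !Character.isWhitespace(c)) {
                        key.append(c);
                        state = State.KEY;
                    }
                    break;
                case KEY:
                    // Read until '=' then switch to the value.
                    if (c == '=') {
                        state = State.VALUE;
                    } else {
                        key.append(c);
                    }
                    break;
                case VALUE:
                    // Read until ',' or '}' then store the finished pair.
                    if (c == ',' || c == '}') {
                        map.put(key.toString().trim(), value.toString().trim());
                        key.setLength(0);
                        value.setLength(0);
                        state = State.READY;
                    } else {
                        value.append(c);
                    }
                    break;
            }
        }
        // Store a trailing pair if the closing brace is missing.
        if (state == State.VALUE) {
            map.put(key.toString().trim(), value.toString().trim());
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse("{k1=v1,k2=v2}").get("k2")); // prints "v2"
    }
}
```

Parsing once into a map and then answering getValue from the map is likely the better trade-off when many keys are looked up in the same string.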
If the string has many entries you might be better off parsing manually without a StringTokenizer to save some memory (in case you have to parse thousands of these strings, it's worth the extra code):
public static Map parse(String s) {
HashMap map = new HashMap();
s = s.substring(1, s.length() - 1).trim(); //get rid of the brackets
int kpos = 0; //the starting position of the key
int eqpos = s.indexOf('='); //the position of the key/value separator
boolean more = eqpos > 0;
while (more) {
int cmpos = s.indexOf(',', eqpos + 1); //position of the entry separator
String key = s.substring(kpos, eqpos).trim();
if (cmpos > 0) {
map.put(key, s.substring(eqpos + 1, cmpos).trim());
eqpos = s.indexOf('=', cmpos + 1);
more = eqpos > 0;
if (more) {
kpos = cmpos + 1;
}
} else {
map.put(key, s.substring(eqpos + 1).trim());
more = false;
}
}
return map;
}
I tested this code with these strings and it works fine:
{k1=v1}
{k1=v1, k2 = v2, k3= v3,k4 =v4}
{k1= v1,}
Written without testing:
String result = null;
int i = foo.indexOf(key+"=");
if (i != -1 && (foo.charAt(i-1) == '{' || foo.charAt(i-1) == ',')) {
int j = foo.indexOf(',', i);
if (j == -1) j = foo.length() - 1;
result = foo.substring(i+key.length()+1, j);
}
return result;
Yes, it's ugly :-)
Well, assuming there is no '=' or ',' in the values, the simplest (and shabbiest) method is:
int start = foo.indexOf(key+'=') + key.length() + 1;
int end = foo.indexOf(',', start);
if (end == -1) end = foo.indexOf('}', start);
return (start < end) ? foo.substring(start, end) : null;
Yeah, not recommended :)
Adding code to check for the existence of key in foo is left as an exercise to the reader :-)
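String foo = "{k1=v1,k2=v2,...}";
String getValue(String key){
    int offset = foo.indexOf(key+'=') + key.length() + 1; // start of the value
    int end = foo.indexOf(',', offset);
    if (end == -1) end = foo.indexOf('}', offset); // last entry has no trailing comma
    return foo.substring(offset, end);
}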
Here is my solution (Strings.isNullOrEmpty is assumed to be Guava's com.google.common.base.Strings):
public class KeyValueParser {
private final String line;
private final String divToken;
private final String eqToken;
private Map<String, String> map = new HashMap<String, String>();
// user_uid=224620; pass=e10adc3949ba59abbe56e057f20f883e;
public KeyValueParser(String line, String divToken, String eqToken) {
this.line = line;
this.divToken = divToken;
this.eqToken = eqToken;
proccess();
}
public void proccess() {
if (Strings.isNullOrEmpty(line) || Strings.isNullOrEmpty(divToken) || Strings.isNullOrEmpty(eqToken)) {
return;
}
for (String div : line.split(divToken)) {
if (Strings.isNullOrEmpty(div)) {
continue;
}
String[] split = div.split(eqToken);
if (split.length != 2) {
continue;
}
String key = split[0];
String value = split[1];
if (Strings.isNullOrEmpty(key)) {
continue;
}
map.put(key.trim(), value.trim());
}
}
public String getValue(String key) {
return map.get(key);
}
}
Usage:
KeyValueParser line = new KeyValueParser("user_uid=224620; pass=e10adc3949ba59abbe56e057f20f883e;", ";", "=");
String userUID = line.getValue("user_uid");
