Java: Making comparisons within a text file

I am currently writing a program where there is a text file with several million digits in it, and I have to go through it looking for a random string of 6 numbers (entered by the user). There are several constraints to this, which is making it difficult.
Must use BufferedReader
Each character can only be read once (I got it working with a bunch of nested if statements, but the way I did it violated this rule)
Cannot use any methods from the string class (so I can't put the read characters together and compare to the original string with .equals()). I have already broken up the original string into the 6 individual characters.
Not allowed to store read characters into an array of any kind, only into character variables (of which there should be 6)
Once a match has been found, it is to report the location to the user (I just need to keep a count variable that I increment with every character read) and continue on until the end of the file is reached. There can be multiple matches in the file.
Any help with this would be great, I'm at a loss for what to do.

You have a haystack to search, say 98712365478932145697, and a needle to find, say 893.
How about:
use BufferedReader.read() to read from the haystack a character at a time
if the character is the first character in your needle, store it in the first character variable
if the next character is the second character in your needle, store it in the second character variable, else, if it's the first character in your needle, start over and store it in the first character variable
if the next character is the third character in your needle, store it in the third character variable, else, if it's the first character in your needle, start over
etc
if you fill the last character variable, you have found the needle in the haystack, you can stop here or start over and look for another occurrence
I won't write the code as it's fairly trivial and this sounds like homework, but that should give you a nudge.
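For reference, here is a minimal sketch of one way to satisfy the constraints. Note that it is not the restart-on-mismatch procedure described above but a sliding window: the six character variables always hold the six most recently read characters, which also copes with overlapping candidates (for example a needle such as 112112). The class name NeedleSearch, the method countMatches and the 1-based position reporting are assumptions made for illustration; each character is read exactly once, and no String methods or arrays are used inside the method.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class NeedleSearch {
    // Hypothetical helper: prints the 1-based start position of every
    // occurrence of the needle n1..n6 and returns the number of matches.
    static int countMatches(Reader in, char n1, char n2, char n3,
                            char n4, char n5, char n6) {
        BufferedReader reader = new BufferedReader(in);
        // Window of the six most recently read characters, oldest first.
        char c1 = 0, c2 = 0, c3 = 0, c4 = 0, c5 = 0, c6 = 0;
        long count = 0;   // number of characters read so far
        int matches = 0;
        try {
            int next;     // read() returns an int; -1 signals end of file
            while ((next = reader.read()) != -1) {
                // Shift the window left by one and append the new character.
                c1 = c2; c2 = c3; c3 = c4; c4 = c5; c5 = c6;
                c6 = (char) next;
                count++;
                if (count >= 6 && c1 == n1 && c2 == n2 && c3 == n3
                        && c4 == n4 && c5 == n5 && c6 == n6) {
                    matches++;
                    System.out.println("Match starting at position " + (count - 5));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }
}
```

With a file you would pass a FileReader instead of the StringReader used for quick experiments, e.g. NeedleSearch.countMatches(new StringReader("98712365478932145697"), '7', '8', '9', '3', '2', '1').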

Related

Processing text which contains double-lettered 'character'

How would you treat/store the letter 'CH' in Java code for, let's say, frequency analysis? I haven't found any alphabet libraries that will work with the double-lettered 'CH'. Storing it in a char is no longer an option. All the text processing algorithms just scan one character at a time, but now I will need to somehow scan ahead to match the pair. There is no 'CH' char in Unicode either; are there any other coding tables where 'CH' can be found?
Another way would be to replace 'CH' with '1' in the input data files and treat '1' as another regular character. That way I will lose the option of ASCII code arithmetic ('a' - 't' is nonsense as 'ch' is missing in ASCII).
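One possible way to scan ahead, sketched under assumptions: treat each unit as a String rather than a char, and check for the pair before falling back to a single character. The class name DigraphFrequency and the choice to lowercase the text and ignore non-letters are illustrative assumptions, not something the question specifies.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class DigraphFrequency {
    // Counts letter frequencies, treating "ch" as a single unit.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        String lower = text.toLowerCase(Locale.ROOT);
        int i = 0;
        while (i < lower.length()) {
            String unit;
            if (lower.startsWith("ch", i)) {
                // Look-ahead matched the pair: consume two characters.
                unit = "ch";
                i += 2;
            } else {
                unit = String.valueOf(lower.charAt(i));
                i++;
            }
            // Assumption: only letters are counted; spaces and digits are skipped.
            if (Character.isLetter(unit.charAt(0))) {
                counts.merge(unit, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Because the keys are strings, the ordering of the alphabet has to come from somewhere other than char arithmetic, for example an explicit list of units in alphabet order.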

Removing comments from code character by character [Java]

I need to remove comments from code, but in this case I'll have to do it without using
System.out.println(sourceCode.replaceAll("//.*|/\\*((.|\\n)(?!=*/))+\\*/", ""));
The program needs to check the code character by character to look for "/" and then proceed to check if the next character is "/" or "*".
I'm looking for a good way to read through the code and check characters letter by letter
This is a classic problem given to new learners in Java. I would suggest to go for a simple approach as it is intended to help you practice your coding skills
Read the java source code as a file in your program char by char.
Search for comments beginning. In this case, there are 2, /* and //.
Open a string buffer and start writing the read contents into it.
If it's /*, then don't write it to the buffer. Keep moving to the next character until you find */.
Repeat till end of file is reached.
If single line comments need to be removed, then same algorithm can be followed till you get a new line character.
If you need help in reading from file char by char, refer to Java documentation.
When end of file is reached, then write the string buffer back to the file.
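The steps above could be sketched roughly as follows. This is a simplified illustration, not a complete Java lexer: it assumes the source contains no string or character literals with "//" or "/*" inside them, it uses a StringBuilder where the answer says string buffer, and the class name CommentStripper is made up for the example. The newline that ends a single-line comment is kept.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class CommentStripper {
    // Reads the source one character at a time and returns it without comments.
    static String strip(Reader in) {
        StringBuilder out = new StringBuilder();
        try {
            int c = in.read();
            while (c != -1) {
                if (c != '/') {
                    out.append((char) c);
                    c = in.read();
                    continue;
                }
                int next = in.read();
                if (next == '/') {
                    // Single-line comment: skip up to (not including) the newline.
                    c = in.read();
                    while (c != -1 && c != '\n') {
                        c = in.read();
                    }
                } else if (next == '*') {
                    // Block comment: skip until the closing "*/" has been seen.
                    int prev = 0;
                    c = in.read();
                    while (c != -1 && !(prev == '*' && c == '/')) {
                        prev = c;
                        c = in.read();
                    }
                    if (c != -1) {
                        c = in.read(); // step past the closing '/'
                    }
                } else {
                    // A lone '/', e.g. division: keep it and re-examine next.
                    out.append('/');
                    c = next;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

To work on a file, pass a BufferedReader wrapped around a FileReader and write the returned string back out afterwards.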

How to make a scanner in Java that checks if the first letter is a character between A-V and if the second character is a number between 1-20?

How to make a scanner that checks if the first letter is a character between A-V and if the second character is a number between 1-20? Some examples are: '.B4', 'H10.', '**V1', 'L19*', 'M12', or 'N14'.
I'm kind of new to Java. Still learning it in school. I've followed the lessons for about half a year now.
Now I've got an assignment for school. It is about creating a text-based minesweeper. I succeeded in printing the board and placing the mines. But now I'm stuck on getting the right input.
If you use '*' in the scanner like * B4 or B4* it should mark a square.
If you use '.' in the scanner like .B4 or B4. it should unmark a square.
And if you enter B4 it should open.
But I can't get this done in a neat way. I've tried to make sub-strings of it to check if every character is the right one, but after I did that my code was kind of chaotic and it didn't work as it was supposed to.
I've tried it like: "Example 3 : Validating vowels in: Validating input using java.util.Scanner" only I used a variable of the length of my board. So if the board was 10 by 10 it wouldn't go further than J10. But that didn't work either for me.
So I was hoping that you could help me solve this problem.
As this is an assignment, I'll just give you a guideline rather than actual code.
First, you need to get the input into some format. Consider reading the input in from the scanner and storing it into a string.
We can then make use of Java's String functions, a list of which can be found here. Try to find a function that could be useful, perhaps one that lets us get the character at a certain index.
We can then do checks on the string. First we check the first character (the character at index 0), we want to know if that is a letter from A-V. To do this we can do a check on the ASCII numbers. Assuming you just want capital letters, if we convert A to an int, then it will have the value 65. V has the value 86. All the numbers in between correspond to the ASCII values of the letters in between.
Thus we can do a check: convert the first character to an integer, let's call it x. If x >= 65 && x <= 86, then it's a letter we care about.
Next, you need to check the number. For this, take a look at the function Integer.parseInt(String s), which takes a String and converts it to an integer. The number can have one digit (below 10) or two digits (10 or above), so parse everything after the letter and then check that the result is between 1 and 20.
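As a rough illustration of how these checks fit together, here is a small sketch. The class name MoveParser, the returned strings ("MARK", "UNMARK", "OPEN", "INVALID") and the rule that only a single '*' or '.' at the start or the end is accepted are all assumptions for the example; under that rule a doubled marker such as '**V1' is rejected.

```java
class MoveParser {
    // Returns e.g. "MARK B4", "UNMARK H10", "OPEN M12", or "INVALID".
    static String parse(String input) {
        String s = input.trim();
        if (s.isEmpty()) {
            return "INVALID";
        }
        String action = "OPEN";
        char first = s.charAt(0);
        char last = s.charAt(s.length() - 1);
        if (first == '*' || first == '.') {
            action = (first == '*') ? "MARK" : "UNMARK";
            s = s.substring(1).trim();
        } else if (last == '*' || last == '.') {
            action = (last == '*') ? "MARK" : "UNMARK";
            s = s.substring(0, s.length() - 1).trim();
        }
        // What remains must be one letter followed by one or two digits.
        if (s.length() < 2 || s.length() > 3) {
            return "INVALID";
        }
        char column = s.charAt(0);
        if (column < 'A' || column > 'V') {   // same as 65..86 in ASCII
            return "INVALID";
        }
        for (int i = 1; i < s.length(); i++) {
            if (s.charAt(i) < '0' || s.charAt(i) > '9') {
                return "INVALID";
            }
        }
        int row = Integer.parseInt(s.substring(1));
        if (row < 1 || row > 20) {
            return "INVALID";
        }
        return action + " " + column + row;
    }
}
```

For a smaller board, the upper bounds 'V' and 20 could be replaced by values computed from the board size.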

Algorithm and Data Structure for Checking letters in a word with another set of letters

I have a dictionary of 200,000 words and a set of letters. I need an algorithm to check if all the letters of a word are in that set of letters. It's very slow to check the words one by one. Because there is a huge number of words to process, I need a data structure to do this. Any ideas? Thanks!
For example: I have a set of letters {b,g,e,f,t,u,i,t,g,n,c,m,m,w,c,s}, and I wanna check the words "big" and "buff". All letters of "big" are in the original set, so "big" is what I want, while "buff" is not what I want because there is only one "f" in the original set.
This is what i wanna do.
This is for something like Scrabble or Boggle, right? Well, what you do is pre-generate your dictionary by sorting the letters in each word. So, word becomes dorw. Then you shove all these into a Trie data structure. So, in your Trie, the sequence dorw would point to the value word.
[Note that because we sorted the words, they lose their uniqueness, so one sorted word can point to multiple different words. ie your Trie needs to store a list or array at its data nodes]
You can save this structure out if you need to load it quickly later without all the sorting steps.
What you then do is take your input letters and you sort them too. You then start walking through your Trie recursively. If the current letter matches an existing path in the Trie, you follow it. Because you can have unused letters, you also allow the current letter to be dropped.
And it's that simple. Any time you encounter a node in your Trie that has a value, that's a word that you can make out of the letters you used to get there. You just add these words to a list as you find them, and when the recursion is done you have found every possible word.
If you have repeated letters in your input, you may need extra logic to prevent multiple instances of the same word being given (unless you want that). That logic can be invoked during the step that 'leaves out' a letter (you just skip past all the repeated letters) to the next letter.
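A compact sketch of this Trie idea, under some assumptions: the class name AnagramTrie and its methods are invented for the example, children are kept in a HashMap, and results are collected into a TreeSet so that a word reached along more than one path is reported only once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class AnagramTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        final List<String> words = new ArrayList<>(); // words whose sorted letters end here
    }

    private final Node root = new Node();

    private static char[] sortedLetters(String s) {
        char[] letters = s.toCharArray();
        Arrays.sort(letters);
        return letters;
    }

    // Inserts a word under the path given by its sorted letters ("word" -> "dorw").
    void add(String word) {
        Node node = root;
        for (char c : sortedLetters(word)) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.words.add(word);
    }

    // Returns every stored word that can be built from the given letters.
    TreeSet<String> wordsFrom(String letters) {
        TreeSet<String> found = new TreeSet<>();
        collect(root, sortedLetters(letters), 0, found);
        return found;
    }

    private void collect(Node node, char[] letters, int i, TreeSet<String> found) {
        found.addAll(node.words);
        if (i == letters.length) {
            return;
        }
        // Option 1: use letter i if the Trie has a matching path.
        Node child = node.children.get(letters[i]);
        if (child != null) {
            collect(child, letters, i + 1, found);
        }
        // Option 2: drop letter i, skipping past all of its repeats.
        int j = i + 1;
        while (j < letters.length && letters[j] == letters[i]) {
            j++;
        }
        collect(node, letters, j, found);
    }
}
```

Building the Trie once for the whole dictionary is the expensive step; each query afterwards only walks the paths that the available letters can actually reach.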
[edit] You seem to want to do the opposite. My solution above finds all possible words that can be made from a set of letters. But you want to test a specific word to see if it's allowed, given the set of letters you have.
This is simple.
Store your available letters as a histogram. That is, for each letter, you store the number that you have. Then, you walk through each letter in your test word, building a new histogram as you go. As soon as one of your histogram buckets exceeds the value in your available-letters, the word cannot be made. If you get all the way to the end, you can successfully make the word.
You can use an array to represent the letter set. Each element in the array stands for a letter. To convert a letter to its element position, subtract the ASCII code of 'a' for lowercase letters. Then the first element stands for 'a', the second for 'b', and so on. If you also need uppercase letters, give them their own positions, for example letter - 'A' + 26, so that the 27th element is 'A'. The element value is the number of occurrences. For example, the array {2, 0, 1, 0, ...} stands for a set like {a, c, a}. The pseudocode could be:
for each word
    copy the array to a new one
    for each letter in the word
        get the element position of the letter: position = letter - 'a'
        decrease the element value in the new array by one: new_array[position]--
        if the value is negative, return not found: if new_array[position] < 0 {return not found;}
    if no value went negative, return found
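The counting approach could look something like this in Java. It is a minimal sketch that assumes all input consists of lowercase letters a to z only; the class name LetterCounts and its method names are hypothetical.

```java
class LetterCounts {
    // Builds the histogram of available letters once (assumes 'a'..'z' only).
    static int[] histogram(String letters) {
        int[] counts = new int[26];
        for (int i = 0; i < letters.length(); i++) {
            counts[letters.charAt(i) - 'a']++;
        }
        return counts;
    }

    // Checks one word against the available letters without modifying them.
    static boolean canMake(String word, int[] available) {
        int[] remaining = available.clone(); // copy the array to a new one
        for (int i = 0; i < word.length(); i++) {
            int position = word.charAt(i) - 'a';
            remaining[position]--;
            if (remaining[position] < 0) {
                return false; // the word needs more of this letter than we have
            }
        }
        return true;
    }
}
```

Each word costs time proportional to its length plus a 26-element copy, so running this over 200,000 words is still a single linear pass over the dictionary.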
sort the set, then sort each word and do a "merge"-like operation
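A sketch of what that "merge"-like operation might look like, assuming plain char comparison is good enough for the letters involved; the class name SortedMerge is made up for the example. Both sequences are sorted, then walked with two indices the way the merge step of merge sort walks its inputs.

```java
import java.util.Arrays;

class SortedMerge {
    // True if every letter of word can be matched to a distinct letter of available.
    static boolean canMake(String word, String available) {
        char[] w = word.toCharArray();
        char[] a = available.toCharArray();
        Arrays.sort(w);
        Arrays.sort(a);
        int i = 0; // index into the sorted word
        int j = 0; // index into the sorted letter set
        while (i < w.length && j < a.length) {
            if (w[i] == a[j]) {
                i++;   // letter matched: consume it from both sides
                j++;
            } else if (w[i] > a[j]) {
                j++;   // unused letter in the set: skip it
            } else {
                return false; // the set has already moved past the needed letter
            }
        }
        return i == w.length; // every letter of the word was matched
    }
}
```

In practice the letter set would be sorted once up front rather than on every call.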

Need help writing a descrambling method for substitution cipher

I need some help on a Java assignment. We are given a scrambled text file, which was scrambled using a substitution cipher, where every letter in the text is simply swapped out for another letter. My program is almost finished, but I'm having trouble figuring out how to write the final "descramble" method, which takes the scrambled text and replaces each letter with its correct substitute in order to reveal the correct text.
These are the instructions provided in the assignment:
The descrambling is done by using the letter in the scrambled text as the index in the char array. For example, if the scrambled text has a letter B, you replace it with the character at index 2 in the array. All whitespace and punctuation from the original file should also be in the descrambled file, only the letters should have been changed. Additionally, if a letter was capitalized in the original file, it should be capitalized in the descrambled file (similarly, lowercase letters should still be lowercase).
I'm not asking to have the answer given to me, since this is for school. I just can't seem to properly understand these instructions, what exactly is it that I need to do to successfully decode the text? Mostly, I don't understand how I can use a letter as an index for a char array, aren't indexes always integers?
You didn't say what language you're working in, so I'll use C/Java. You'll want to compute an integer index. Assume for the moment that scrambled_char is an upper case letter then it's:
// index into descrambling array:
int index = scrambled_char - 'A' + 1;
This has value 1 for character A, 2 for B, etc. as the problem says. It sounds like you're being given the descrambling array. For example:
char[] descramble = "_ZYX ... ".toCharArray();  // Java; in C: char descramble[] = "_ZYX ... ";
This would cause A to be translated to Z, B to Y, C to X, ...
The descrambled character will be
char descrambled_char = descramble[index];
Now you just need to work out how to handle lower case letters, white space, and punctuation.
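If it helps to see the remaining cases side by side, here is one possible shape for the method. It is only a sketch under assumptions: the class name Descrambler is invented, the table is assumed to hold uppercase letters with an unused placeholder at index 0 (so that A maps to index 1 and B to index 2, as the assignment describes), and the reversed-alphabet table used in the usage note is a hypothetical example, not the one from the assignment.

```java
class Descrambler {
    // table[1] is the replacement for A/a, table[2] for B/b, ... table[26] for Z/z.
    static String descramble(String scrambled, char[] table) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < scrambled.length(); i++) {
            char c = scrambled.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                // Uppercase letter: index 1..26, result stays uppercase.
                out.append(table[c - 'A' + 1]);
            } else if (c >= 'a' && c <= 'z') {
                // Lowercase letter: same index, result converted to lowercase.
                out.append(Character.toLowerCase(table[c - 'a' + 1]));
            } else {
                // Whitespace, digits and punctuation pass through unchanged.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With the hypothetical table "_ZYXWVUTSRQPONMLKJIHGFEDCBA".toCharArray(), the call Descrambler.descramble("Svool, Dliow!", table) yields "Hello, World!".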
