I'm making a small number reversal game. The user gets the 10 unique digits (0-9) in a random order, and has to put them in order by reversing them. However, when the 0 is in front such as:
0123456789 becomes 123456789
0000192837465 becomes 192837465
0(anything after) becomes (anything after)
The 0 disappears automatically. Any way to stop this? Since this number is constantly changing, I wish the 0 to simply stay in the number.
PS: In Java
Don’t represent the digits as a numeric type, use a string. After all your program is manipulating a string of characters, the individual characters happen to be digits but it’s not important that you evaluate it as a number.
When you write a program you need to consider your requirements and think of what data structure best handles them, when you use the first thing that comes to mind you can get yourself into trouble.
Related
My teacher and I were discussing whether or not a recursive permutation function could be written without the use of substrings and/or arrays in Java.
Is there a way to do this?
The answer is yes, this can be done. I'm assuming that "without the use of substrings and/or arrays" refers to the info being passed to the recursion. You have to have some sort of container for the elements that are to be permuted.
In that case it can be done by pulling some hideous tricks with numerically encoding the indices of the elements as digits of a numeric argument. For instance, if there are 3 elements and I use 1 as a sentinel value in the left-most digit (so you can have 0 as the leading index sometimes), 1 means I haven't started, 10 means the first element has been selected, 102 means the first and third, and 1021 means I'm ready to print the permutation since I now have a 4 digit argument and there are 3 elements in the set. I can then deconstruct which elements to print using % 10 and / 10 arithmetic to pick them off.
I implemented this in Ruby rather than Java, and I'm not going to share the actual code because it's too horrible to contemplate. However, it works recursively with only the input array of elements and an integer as arguments, no partial solution substrings or arrays.
Introduction
We store tuples (string,int) in a binary file. The string represents a word (no spaces nor numbers). In order to find a word, we apply binary search algorithm, since we know that all the tuples are sorted with respect to the word.
In order to store this, we use writeUTF for the string and writeInt for the integer. Other than that, let's assume for now there are no ways to distinguish between the start and the end of the tuple unless we know them in advance.
Problem
When we apply binary search, we get a position (i.e. (a+b)/2) in the file, which we can read using methods in Random Access File, i.e. we can read the byte at that place. However, since we can be in the middle of the word, we cannot know where this words starts or finishes.
Solution
Here're two possible solutions we came up with, however, we're trying to decide which one will be more space efficient/faster.
Method 1: Instead of storing the integer as a number, we thought to store it as a string (using eg. writeChars or writeUTF), because in that case, we can insert a null character in the end of the tuple. That is, we can be sure that none of the methods used to serialize the data will use the null character, since the information we store (numbers and digits) have higher ASCII value representations.
Method 2: We keep the same structure, but instead we separate each tuple with 6-8 (or less) bytes of random noise (same across the file). In this case, we assume that words have a low entropy, so it's very unlikely they will have any signs of randomness. Even if the integer may get 4 bytes that are exactly the same as those in the random noise, the additional two bytes that follow will not (with high probability).
Which of these methods would you recommend? Is there a better way to store this kind of information. Note, we cannot serialize the entire file and later de-serialize it into memory, since it's very big (and we are not allowed to).
I assume you're trying to optimize for speed & space (in that order).
I'd use a different layout, built from 2 files:
Interger + Index file
Each "record" is exactly 8 bytes long, the lower 4 are the integer value for the record, and the upper 4 bytes are an integer representing the offset for the record in the other file (the characters file).
Characters file
Contiguous file of characters (UTF-8 encoding or anything you choose). "Records" are not separated, not terminated in any way, simple 1 by 1 characters. For example, the records Good, Hello, Morning will look like GoodHelloMorning.
To iterate the dataset, you iterate the integer/index file with direct access (recordNum * 8 is the byte offset of the record), read the integer and the characters offset, plus the character offset of the next record (which is the 4 byte integer at recordNum * 8 + 12), then read the string from the characters file between the offsets you read from the index file. Done!
it's less than 200MB. Max 20 chars for a word.
So why bother? Unless you work on some severely restricted system, load everything into a Map<String, Integer> and get a few orders of magnitude speed up.
But let's say, I'm overlooking something and let's continue.
Method 1: Instead of storing the integer as a number, we thought to store it as a string (using eg. writeChars or writeUTF), because in that case, we can insert a null character
You don't have to as you said that your word contains no numbers. So you can always parse things like 0124some456word789 uniquely.
The efficiency depends on the distribution. You may win a factor of 4 (single digit numbers) or lose a factor of 2.5 (10-digit numbers). You could save something by using a higher base. But there's the storage for the string and it may dominate.
Method 2: We keep the same structure, but instead we separate each tuple with 6-8 (or less) bytes of random noise (same across the file).
This is too wasteful. Using four zeros between the data byte would do:
Find a sequence of at least four zeros.
Find the last zero.
That's the last separator byte.
Method 3: Using some hacks, you could ensure that the number contains no zero byte (either assuming that it doesn't use the whole range or representing it with five bytes). Then a single zero byte would do.
Method 4: As disk is organized in blocks, you should probably split your data into 4 KiB blocks. Then you can add some time header allowing you quick access to the data (start indexes for the 8th, 16th, etc. piece of data). The range between e.g., the 8th and 16th block should be scanned sequentially as it's both simpler and faster than binary search.
I have been learning suffix arrays creation, & i understand that We first sort all suffixes according to first character, then according to first 2 characters, then first 4 characters and so on while the number of characters to be considered is smaller than 2n.
But my doubt is why don't we choose the first 3 characters, then 9... and so on. Why only 2 characters are taken into account since the strings are a part of same strings and not different random strings?
I haven't analyzed the suffix array construction algorithm thoroughly, but still would like to share my thoughts.
In my humble opinion, your question is similar to the following ones:
Why do computers use binary encoding of information instead of ternary?
Why does binary search bisect the range instead of trisecting it?
Why are there two sexes rather than three?
The reason is that the number 2 is special - it is the smallest plural number. The difference between 1 and 2 is qualitative, whereas the difference between 2 and 3 (as well as any other positive integer) is quantitative and therefore not as drastic.
As a result, binary formulation of many algorithms and data structures turns out to be the simplest one, though some of them may be generalized, with various degrees of added complexity, for an arbitrary base.
Answer is given from the post you linked. And as #Leon answered, the algorithm work because it use a dichotomous approach to solve the sorting problem. if you correctly read the answer, the main purpose is to divide word be small 2 character fragments. So that 4 characters can be easily sort base on the arrangement of the 2 pair of characters, 6 characters with 4-2 or 2-4 or 2-2-2 and so one. Thus have a word of 3 letters in the table is non-sense since word of 3 characters may be seen has 2 characters + the position in the alphabet of the last character.
I think you are considering only the speed of 2^x versus 3^x where you obviously would prefer the latter.
But you have to consider the effort you need for each step.
Since 3^x needs about 1.58 less steps than 2^x you would need to be able to compute a single step for the 3^x growth in less than 1.58 times what you need for a single step in the 2^x growth to perform better.
Generally the problems will get much more complex when you have to handle three elements in each step instead of two.
Also if you could expand it to 3^x you could also do it for a bigger n^x and then with big n your algorithm is suddenly not exponential but effectively linear.
I need to store two extremely large numbers (Strings, since they won't fit in int) in two linked lists, add them and then display the result (again, a String).
I can store the numbers directly into the list.
312312 can be stored as 2->1->3->2->1->3 (actual number will be extremely long)
111119 can be stored as 9->1->1->1->1->1
Then I can add them
11->2->4->3->2->4
Normally I could do 11*10^0 + 2*10^1 +...+ 4*10^5 and get 423431 but all those operations (multiplication, addition and exponentiation) would again be integer operations and since the actual numbers are going to be extremely big, int or long won't support the operations. The final result has to be a string.
So I need a way to convert 11->2->4->3->2->4 into 423431 without using int. Also, I cannot use BigInteger. Can anyone help me?
Well, first thing you need to do is implement carry.
For each digit (that is >= 10), you need to increase the next digit by that digit /10 and set that digit to that digit %10.
So 11->2->... becomes 1->3->....
Then to actually produce the string.
For the most performant option, I suggest StringBuilder.
Just append each digit in the linked-list, then just reverse().toString() (since you started with the smallest number).
Think about how you would do it by hand on paper. If the sum of a pair digits is greater than 9 you write down a carry digit of 1, which you add into the sum of the next pair of digits.
In a computer program you can use a local variable for that: add digits from first and last numbers and the carry from earlier, if sum is greater than.. set carry to 1, else set carry to 0, move on to the next pair...
I’m working on a problem where its demanded to get the last 100 digits of 2^1001. The solution must be in java and without using BigInteger, only with int or Long. For the moment I think to create a class that handle 100 digits. So my question is, if there is another way the handle the overflow using int or long to get a number with 100 digits.
Thanks for all.
EDITED: I was off by a couple powers of 10 in my modulo operator (oops)
The last 100 digits of 2^1001 is the number 2^1001 (mod 10^100).
Note that 2^1001 (mod 10^100) = 2*(2^1000(mod 10^100)) (mod 10^100).
Look into properties of modulo: http://www.math.okstate.edu/~wrightd/crypt/lecnotes/node17.html
This is 99% math problem, 1% programming problem. :)
With this approach, though, you wouldn't be able to use just an integer, as 10^100 won't fit in an int or long.
However, you could use this to find, say, the last 10 digits of 2^1001, then use a separate routine to find the next 10 digits, etc, where each routine uses this feature...
The fastest way to do it (coding wise) is to create an array of 100 integers and then have a function which starts from the ones place and doubles every digit. Make sure to take the modulus of 10 and carry over 1s if needed. For the 100th digit, simply eliminate the carry over. Do it 1001 times and you're done.