Good day,
I need to implement a feature on my site.
When a user visits the website, they enter the number of errors that should be introduced into a string. If they enter 0, the original string is displayed unchanged. If they enter 1, the string is displayed with one error. For instance, the word "programming" could be displayed as "rrogramming" or "prugramming", or with one character added or deleted. Likewise, if the number is 2, the string should contain two errors.
In addition, the result should be the same every time. Someone told me to use a seed with the Random class, but I have no idea how.
I am programming in Java and some JS.
If you have faced the same problem, please give me some ideas or resources to learn from.
A Random created with a seed produces the same sequence of values every time, so create it with the same seed on every call.
The trick is to base the seed on the word's content. Otherwise you would end up applying the same mutilation pattern to all words.
The likelihood of changing a letter can be proportional to the length of the word, with inserting twice as likely as removing and both of them rare.
String mutilateWord(String word, int times) {
    //Random random = new Random(42);
    // Or better (as different words get mixed up differently):
    Random random = new Random(word.hashCode());
    for (int t = 0; t < times; ++t) {
        int chars = word.length();
        // If each letter modification is equally likely:
        int choices = chars + 2 + 1; // change some letter + (twice) add + remove
        // Note: the helpers must cope with an empty word.
        word = switch (random.nextInt(choices)) {
            case 0 -> removeRandomLetter(word, random);
            case 1, 2 -> insertRandomLetter(word, random);
            default -> changeRandomLetter(word, random);
        };
    }
    return word;
}
Since Java 14 (as a preview feature since Java 12) one can use a switch expression as above.
The above will also mutilate the mutilations, so you might prefer to first have a times loop changing random letters (or not), and then a second times loop adding/removing letters.
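The three helper methods called above are not shown in the answer. A minimal sketch of what they might look like, assuming lowercase ASCII letters as the replacement alphabet (the class name, the alphabet and the empty-word handling are my assumptions, not part of the original answer):

```java
import java.util.Random;

class MutationHelpers {
    // Assumed alphabet for inserted or substituted letters.
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Removes one letter at a random position; an empty word is returned unchanged.
    static String removeRandomLetter(String word, Random random) {
        if (word.isEmpty()) return word;
        int i = random.nextInt(word.length());
        return word.substring(0, i) + word.substring(i + 1);
    }

    // Inserts one random letter at a random position (possibly at the end).
    static String insertRandomLetter(String word, Random random) {
        int i = random.nextInt(word.length() + 1);
        char c = ALPHABET.charAt(random.nextInt(ALPHABET.length()));
        return word.substring(0, i) + c + word.substring(i);
    }

    // Replaces one letter with a different one; an empty word is returned unchanged.
    static String changeRandomLetter(String word, Random random) {
        if (word.isEmpty()) return word;
        int i = random.nextInt(word.length());
        char c;
        do {
            c = ALPHABET.charAt(random.nextInt(ALPHABET.length()));
        } while (c == word.charAt(i)); // retry until the letter really changes
        return word.substring(0, i) + c + word.substring(i + 1);
    }
}
```

Because every random choice comes from the Random passed in, two calls with equally seeded generators produce the same result.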
Your friend's idea is correct: seeding the random number generator with the same value each time yields the same results. Alternatively, you can cache each result when you first compute it, for example in a map whose keys are the error counts and whose values are the resulting strings. On each call you check whether the error count is already cached. If it is, you return the cached string without recomputing it; otherwise you compute the result, cache it, and return it.
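The caching idea could be sketched roughly as follows. ErrorCache and the compute function are hypothetical names; the sketch assumes one cache per source string and that the function passed in produces the string for a given error count:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

class ErrorCache {
    // Key: number of errors, value: the string produced for that number.
    private final Map<Integer, String> cache = new HashMap<>();
    // Hypothetical generator that builds the string for a given error count.
    private final IntFunction<String> compute;

    ErrorCache(IntFunction<String> compute) {
        this.compute = compute;
    }

    // Returns the cached string, computing and storing it on first request.
    String get(int errors) {
        return cache.computeIfAbsent(errors, compute::apply);
    }
}
```

Note that such a cache only lives as long as the object does, so after a restart the results are the same only if the generator itself is deterministic.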
I'm trying to add a random or unique code to a payment amount. For example, the user must pay Rp. 10.000; when the user checks out, the system shows the total payment with a unique code in the last three digits.
For example, the user should pay Rp. 10.000 and the system shows Rp. 10.123.
As another example, the user pays 1.000.000 and the system shows a total of 1.000.562, with only the last three digits unique or random.
How can I do this in Java for Android?
UPDATE
I've tried this code:
int someNumber = 10000; // Rp. 10.000
int lastThree = someNumber % 1000;
But when the last three digits are all 0 it returns 0, while for a number like 10.234 it returns 234. How can I get the last three digits when the value is like 10.000?
That's because lastThree is an int, and an int cannot hold a value like 000: leading zeros are not stored. If you need the last three digits, you can use something like this:
int someNumber = 10000;
String temp = Integer.toString(someNumber);
String lastThree = temp.substring(temp.length() - 3); // assumes at least three digits
Now if you pass 10000, lastThree will be "000".
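An alternative to the substring approach is to keep the arithmetic and only pad when formatting. This is a sketch under the assumption that amounts are non-negative int values; UniqueCode and its method names are made up for illustration:

```java
import java.util.Locale;
import java.util.concurrent.ThreadLocalRandom;

class UniqueCode {
    // Picks a random three-digit code in the range 1..999.
    static int randomCode() {
        return ThreadLocalRandom.current().nextInt(1, 1000);
    }

    // Adds the code to a base amount that is assumed to be a multiple of 1000.
    static int withCode(int baseAmount, int code) {
        return baseAmount + code;
    }

    // Returns the last three digits as text, zero-padded (10000 gives "000").
    static String lastThree(int amount) {
        return String.format(Locale.ROOT, "%03d", amount % 1000);
    }
}
```

A random code is not guaranteed to be unique; if two open payments must never share a code, the codes already in use would have to be tracked somewhere.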
There is a storage unit with a capacity of N items. Initially the unit is empty.
The spaces are arranged linearly, i.e. side by side in a line.
Each storage space has a number, increasing from 1 to N.
When someone drops off a package, it is assigned the first available space. Packages can also be picked up, in which case the space becomes vacant.
Example: if the total capacity is 4 and spaces 1 and 2 are full, the third person to come in is assigned space 3. If 1, 2 and 3 are full and space 2 becomes vacant, the next person to come is assigned space 2.
The packages have two properties, assigned for immediate identification: they are color coded based on their content, and they are assigned a unique identification number (UIN).
What we want is to query the system:
When the input is color, show all the UIN associated with this color.
When the input is color, show all the numbers where these packages are placed (storage space number).
Show where an item with a given UIN is placed, i.e. storage space number.
I would like to know which data structures to use in this case so that the system works as efficiently as possible.
I am not told which of these operations is most frequent, which means I will have to optimise for all cases.
Please note: although none of the queries asks by storage space number, an item is removed from the store by its storage space number.
You have mentioned three queries that you want to make. Let's handle them one by one.
I cannot think of a single data structure that helps with all three queries at once. So my answer uses three data structures, and you will have to keep all three in sync for the application to work properly. Consider that the cost of getting respectably fast performance for the desired functionality.
When the input is color, show all the UIN associated with this color.
Use a HashMap that maps each color to a Set of UINs. Whenever an item:
is added - check whether the color is present in the HashMap. If so, add the UIN to its set; otherwise create a new entry with a new set and add the UIN to it.
is removed - Find the set for this color and remove this UIN from the set. If the set is now empty, you may remove this entry altogether.
When the input is color, show all the numbers where these packages are placed.
Maintain a second HashMap that maps each UIN to the number of the space where the package is placed. From the HashMap created for the previous query you get the set of all UINs associated with the given color; then use this HashMap to look up the number for each UIN in that set.
So when a package is added, you add an entry to the previous HashMap under its color and to this HashMap as well. On removal, you remove the entry from both.
Finally,
Show where an item with a given UIN is placed.
If you have done the previous, you already have the HashMap mapping UINs to numbers. This problem is only a sub-problem of the previous one.
The third data structure, as I mentioned at the top, is a min-heap of ints. The heap is initialized with the first N integers. Then, as packages come in, the heap is polled, and the number returned is the storage space where the package is put. If the storage unit is full, the heap is empty. Whenever a package is removed, its number is added back to the heap. Since it is a min-heap, the minimum number bubbles up to the top, satisfying your requirement that when 4 and 2 are empty, the next space to be filled is 2.
Let's do a Big O analysis of this solution for completeness.
Time for initialization: O(N), because we have to build a heap of N integers. The two HashMaps are empty to begin with and therefore incur no time cost.
Time for adding a package: the time to get a number plus the time to make the entries in the HashMaps. Getting a number from the heap takes O(log N) at most. Adding entries to the HashMaps is O(1) on average. Hence an overall worst case of O(log N).
Time for removing a package: also O(log N) at worst, because removing from the HashMaps is O(1) on average, while adding the freed number back to the min-heap is bounded by O(log N).
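The structures described above could be wired together roughly like this. It is only a sketch: StorageUnit and its method names are invented for illustration, and the two reverse maps (uinBySlot, colorByUin) are an assumption added here because the question says packages are removed by storage space number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

class StorageUnit {
    private final Map<String, Set<Long>> uinsByColor = new HashMap<>();
    private final Map<Long, Integer> slotByUin = new HashMap<>();
    private final PriorityQueue<Integer> freeSlots = new PriorityQueue<>(); // min-heap
    // Assumed extras, needed to remove a package by its slot number.
    private final Map<Integer, Long> uinBySlot = new HashMap<>();
    private final Map<Long, String> colorByUin = new HashMap<>();

    StorageUnit(int capacity) {
        for (int slot = 1; slot <= capacity; slot++) freeSlots.add(slot);
    }

    // Returns the assigned slot, or -1 if the unit is full.
    int add(long uin, String color) {
        Integer slot = freeSlots.poll(); // smallest free slot, null if none
        if (slot == null) return -1;
        uinsByColor.computeIfAbsent(color, c -> new HashSet<>()).add(uin);
        slotByUin.put(uin, slot);
        uinBySlot.put(slot, uin);
        colorByUin.put(uin, color);
        return slot;
    }

    // Frees the slot and removes the package from every map.
    void removeAt(int slot) {
        Long uin = uinBySlot.remove(slot);
        if (uin == null) return; // slot was already vacant
        String color = colorByUin.remove(uin);
        Set<Long> uins = uinsByColor.get(color);
        uins.remove(uin);
        if (uins.isEmpty()) uinsByColor.remove(color);
        slotByUin.remove(uin);
        freeSlots.add(slot);
    }

    Set<Long> uinsFor(String color) {
        return uinsByColor.getOrDefault(color, Collections.emptySet());
    }

    List<Integer> slotsFor(String color) {
        List<Integer> slots = new ArrayList<>();
        for (long uin : uinsFor(color)) slots.add(slotByUin.get(uin));
        return slots;
    }

    // Returns the slot of the package, or null if the UIN is unknown.
    Integer slotOf(long uin) {
        return slotByUin.get(uin);
    }
}
```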
This smells of homework or really bad management.
Either way, I have decided to do a version of this where you care most about query speed but don't care about memory or a little extra overhead to inserts and deletes. That's not to say that I think that I'm going to be burning memory like crazy or taking forever to insert and delete, just that I'm focusing most on queries.
TL;DR - to solve your problem, I use a PriorityQueue, an array, a HashMap, and an ArrayListMultimap (from Guava, a common external library), each one to solve a different problem.
The following section is working code that walks through a few simple inserts, queries, and deletes. This next bit isn't a complete Java file, since I chopped out most of the imports, the class declaration, etc. Also, it references another class called 'Packg'. That's just a simple data structure which you should be able to figure out from the calls made to it.
The explanation is below the code.
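import com.google.common.collect.ArrayListMultimap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;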
private PriorityQueue<Integer> openSlots;
private Packg[] currentPackages;
Map<Long, Packg> currentPackageMap;
private ArrayListMultimap<String, Packg> currentColorMap;
private Object $outsideCall;
public CrazyDataStructure(int howManyPackagesPossible) {
$outsideCall = new Object();
this.currentPackages = new Packg[howManyPackagesPossible];
openSlots = new PriorityQueue<>();
IntStream.range(0, howManyPackagesPossible).forEach(i -> openSlots.add(i));//populate the open slots priority queue
currentPackageMap = new HashMap<>();
currentColorMap = ArrayListMultimap.create();
}
/*
* args[0] = integer, maximum # of packages
*/
public static void main(String[] args)
{
int howManyPackagesPossible = Integer.parseInt(args[0]);
CrazyDataStructure cds = new CrazyDataStructure(howManyPackagesPossible);
cds.addPackage(new Packg(12345, "blue"));
cds.addPackage(new Packg(12346, "yellow"));
cds.addPackage(new Packg(12347, "orange"));
cds.addPackage(new Packg(12348, "blue"));
System.out.println(cds.getSlotsForColor("blue"));//should be a list of {0,3}
System.out.println(cds.getSlotForUIN(12346));//should be 1 (0-indexed, remember)
System.out.println(cds.getSlotsForColor("orange"));//should be a list of {2}
System.out.println(cds.removePackage(2));//should be the orange one
cds.addPackage(new Packg(12349, "green"));
System.out.println(cds.getSlotForUIN(12349));//should be 2, since that's open
}
public int addPackage(Packg packg)
{
synchronized($outsideCall)
{
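Integer result = openSlots.poll();//lowest open slot, or null if every slot is taken
if(result == null)
return -1;//no room left; avoids a NullPointerException on unboxing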
packg.setSlot(result);
currentPackages[result] = packg;
currentPackageMap.put(packg.getUIN(), packg);
currentColorMap.put(packg.getColor(), packg);
return result;
}
}
public Packg removePackage(int slot)
{
synchronized($outsideCall)
{
if(currentPackages[slot] == null)
return null;
else
{
Packg packg = currentPackages[slot];
currentColorMap.remove(packg.getColor(), packg);
currentPackageMap.remove(packg.getUIN());
currentPackages[slot] = null;
openSlots.add(slot);//return slot to priority queue
return packg;
}
}
}
public List<Packg> getUINsForColor(String color)
{
synchronized($outsideCall)
{
return currentColorMap.get(color);
}
}
public List<Integer> getSlotsForColor(String color)
{
synchronized($outsideCall)
{
return currentColorMap.get(color).stream().map(packg -> packg.getSlot()).collect(Collectors.toList());
}
}
public int getSlotForUIN(long uin)
{
synchronized($outsideCall)
{
if(currentPackageMap.containsKey(uin))
return currentPackageMap.get(uin).getSlot();
else
return -1;
}
}
I use 4 different data structures in my class.
PriorityQueue I use the priority queue to keep track of all the open slots. It's O(log n) for both inserts and removals (only peeking at the smallest element is constant), so that shouldn't be too bad. Memory-wise, it's not particularly efficient, but it's also linear, so that won't be too bad.
Array I use a regular Array to track by slot #. This is linear for memory, and constant for insert and delete. If you needed more flexibility in the number of slots you could have, you might have to switch this out for an ArrayList or something, but then you'd have to find a better way to keep track of 'empty' slots.
HashMap ah, the HashMap, the golden child of Big O complexity. In return for some memory overhead and an annoying capital letter 'M', it's an awesome data structure. Insertions and queries are constant time on average. I use it to map between the UINs and the slot for a Packg.
ArrayListMultimap the only data structure I use that's not plain Java. This one comes from Guava (Google, basically), and it's just a nice little shortcut to writing your own Map of Lists. Also, it plays nicely with nulls, and that's a bonus to me. This one is probably the least efficient of all the data structures, but it's also the one that handles the hardest task, so... can't blame it. It allows us to grab the list of Packgs by color, in constant time relative to the number of slots and in linear time relative to the number of Packg objects it returns.
When you have this many data structures, it makes inserts and deletes a little cumbersome, but those methods should still be pretty straight-forward. If some parts of the code don't make sense, I'll be happy to explain more (by adding comments in the code), but I think it should be mostly fine as-is.
Query 3: Use a hash map whose key is the UIN and whose value is an object (storage space number, color), plus any other information about the package. Cost is O(1) to query, insert or delete. Space is O(k), where k is the current number of UINs.
Queries 1 and 2: Use a hash map plus multiple linked lists.
Hash map: the key is the color, the value is a pointer (or reference, in Java) to the linked list of UINs for that color.
Each linked list contains UINs.
For query 1: ask the hash map, then return the corresponding linked list. Cost is O(k1), where k1 is the number of UINs for the queried color. Space is O(m+k1), where m is the number of unique colors.
For query 2: do query 1, then apply query 3 to each UIN. Cost is O(k1), where k1 is the number of UINs for the queried color. Space is O(m+k1), where m is the number of unique colors.
To insert: given color, number and UIN, insert an object (number, color) into the hash map of query 3; then hash the color to reach the corresponding linked list and insert the UIN.
To delete: given a UIN, ask query 3 for its color, then delete the UIN from that color's linked list (found via the hash map of query 1). Then delete the UIN from the hash map of query 3.
Bonus: managing the storage space is the same situation as memory management in an OS.
This is very simple to do with a segment tree.
Store each position's own number in its leaf and query the minimum: it is the first vacant place. When you capture a place, overwrite its leaf with a value larger than any position (the implementation below uses int.MaxValue, shown as ∞ here); when a place is freed, write its number back.
Package information can be stored in a separate array.
Initially the leaves hold the following values:
1 2 3 4
After capturing a place they look as follows:
∞ 2 3 4
After capturing one more:
∞ ∞ 3 4
After capturing one more:
∞ ∞ ∞ 4
After freeing place 2:
∞ 2 ∞ 4
After capturing one more:
∞ ∞ ∞ 4
and so on.
With a segment tree that returns the minimum over a range, each operation can be done in O(log N).
Here is my implementation in C#; it is easy to translate to C++ or Java.
public class SegmentTree
{
private int Mid;
private int[] t;
public SegmentTree(int capacity)
{
this.Mid = 1;
while (Mid <= capacity) Mid *= 2;
this.t = new int[Mid + Mid];
for (int i = Mid; i < this.t.Length; i++) this.t[i] = int.MaxValue;
for (int i = 1; i <= capacity; i++) this.t[Mid + i] = i;
for (int i = Mid - 1; i > 0; i--) t[i] = Math.Min(t[i + i], t[i + i + 1]);
}
public int Capture()
{
int answer = this.t[1];
if (answer == int.MaxValue)
{
throw new Exception("Empty space not found.");
}
this.Update(answer, int.MaxValue);
return answer;
}
public void Erase(int index)
{
this.Update(index, index);
}
private void Update(int i, int value)
{
t[i + Mid] = value;
for (i = (i + Mid) >> 1; i >= 1; i = (i >> 1))
t[i] = Math.Min(t[i + i], t[i + i + 1]);
}
}
Here is an example of usage:
int n = 4;
var st = new SegmentTree(n);
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
st.Erase(2);
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
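Since the answer mentions that the class is easy to translate, here is what a rough Java translation might look like. The class name and the lower-case method names are my own choices, and the exception type is an assumption; the logic follows the C# code above.

```java
import java.util.Arrays;

class SlotSegmentTree {
    private final int mid; // index offset of the leaves
    private final int[] t; // t[i] = smallest free position below node i

    SlotSegmentTree(int capacity) {
        int m = 1;
        while (m <= capacity) m *= 2;
        mid = m;
        t = new int[2 * mid];
        Arrays.fill(t, Integer.MAX_VALUE);                   // MAX_VALUE marks "not free"
        for (int i = 1; i <= capacity; i++) t[mid + i] = i;  // positions 1..capacity are free
        for (int i = mid - 1; i > 0; i--) t[i] = Math.min(t[2 * i], t[2 * i + 1]);
    }

    // Takes and returns the smallest free position.
    int capture() {
        int answer = t[1];
        if (answer == Integer.MAX_VALUE) {
            throw new IllegalStateException("Empty space not found.");
        }
        update(answer, Integer.MAX_VALUE);
        return answer;
    }

    // Marks the given position as free again.
    void erase(int index) {
        update(index, index);
    }

    // Writes a leaf and recomputes the minima on the path up to the root.
    private void update(int i, int value) {
        t[i + mid] = value;
        for (i = (i + mid) >> 1; i >= 1; i >>= 1) {
            t[i] = Math.min(t[2 * i], t[2 * i + 1]);
        }
    }
}
```

Used with a capacity of 4 in the same way as the C# example (three captures, erase(2), two more captures), it hands out 1, 2, 3, then 2 and 4.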
To get the storage space number I used a min-heap, a PriorityQueue. Removal and insertion both take O(log n) time.
I used two BiMaps, self-made data structures, to store the mapping between UIN, color and storage space number. Each BiMap internally uses a HashMap and an array of size N.
In the first BiMap (BiMap1), a HashMap<color, Set<StorageSpace>> maps each color to the set of storage spaces holding it, and a String array String[] colorSpace stores the color at each storage space index.
In the second BiMap (BiMap2), a HashMap<UIN, storageSpace> maps each UIN to its storage space, and a String array String[] uinSpace stores the UIN at each storage space index.
Querying is straightforward with this approach:
When the input is color, show all the UIN associated with this color.
Get the set of storage spaces from BiMap1; for these spaces use the array in BiMap2 to get the corresponding UINs.
When the input is color, show all the numbers where these packages are placed (storage space number). Use BiMap1's HashMap to get the set.
Show where an item with a given UIN is placed, i.e. storage space number. Use BiMap2 to get the value from the HashMap.
Now when we are given a storage space to remove, both BiMaps have to be updated. In BiMap1, get the color from the array, get the corresponding Set, and remove the space number from this set. In BiMap2, get the UIN from the array, remove it there, and also remove it from the HashMap.
For both BiMaps the removal and insert operations are O(1). The min-heap works in O(log N), hence the total time complexity is O(log N).
As the title says: for example, I have the value 02 and I want the leading zero to stay there so I can control my PendingIntent.
My program basically has a list, and every item has its own options. The only way I can control the AlarmManager is by knowing the exact unique id so that I can cancel the right notification, but I couldn't come up with a solution other than this:
int FirstListPosition = getArguments().getInt(EXTRA_FirstListPosition);
int InnerListPosition = getArguments().getInt(EXTRA_PASSEDPOSITION);
String Merge = Integer.toString(FirstListPosition) + Integer.toString(InnerListPosition);
int FinalValue = Integer.parseInt(Merge);
So if I click the first item in my main list, FirstListPosition will be 0, and if I click the third item in the list inside the main list, InnerListPosition will be 2.
So the string Merge will be "02". This way the unique id will never be the same, and I can actually recover the two positions if the user wants to cancel a specific PendingIntent (notification).
Hopefully you understand what I mean.
Have a look at NumberFormat. This class is intended for formatting numbers according to a specific pattern.
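As a small illustration of that suggestion, DecimalFormat (a concrete NumberFormat subclass) can pad a number with leading zeros. PaddedNumber and twoDigits are names made up for this sketch, and the two-digit pattern is just an assumed example:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PaddedNumber {
    // Formats a non-negative value with at least two digits, so 2 becomes "02".
    static String twoDigits(int value) {
        // Locale.ROOT keeps the digits ASCII regardless of the device locale.
        DecimalFormat format = new DecimalFormat("00", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(value);
    }
}
```

The leading zero only exists in the String. Parsing "02" back with Integer.parseInt gives 2 again, so an int id cannot carry it.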
If you want to encode the positions in two lists into a single int value (and the number of choices in the second list is, say, less than 1000), you might use:
int list1Position; // position in the first list
int list2Position; // position in the second list, must be < 1000
int encodedListPosition = list1Position * 1000 + list2Position;
...
list1Position = encodedListPosition / 1000;
list2Position = encodedListPosition - 1000 * list1Position;
The scenario is that I have a bulky database of around 500,000 records with address and city fields. There was no standard way of entering the values, so different users entered them differently: for example, some entered their city as bangalore while others entered begaluru or benglore (misspelled).
Also, in the address field, the same user with multiple records has entered address values that are not exactly the same, for example Mountville park Thomas gate and Montlee park thonas gte.
I need to fetch all records that have the same or almost similar (somehow misspelled) address and city values.
Is there any way to get those records with almost similar but non-matching values?
Thank you.
It will be an expensive query, but since this will hopefully be a one-time operation, you might consider looking into a Levenshtein distance function.
To avoid calculating the distance for a cartesian product of your table, you could first narrow the set of cities and addresses to be compared with a quicker sanity check, such as requiring that they begin with the same letter and have a similar length.
You could then start off by only returning records with a very small Levenshtein distance, and then gradually increase the distance until you start to get too many false positives.
Here's an implementation directly in MySQL:
CREATE FUNCTION levenshtein( s1 VARCHAR(255), s2 VARCHAR(255) )
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
DECLARE s1_char CHAR;
-- max strlen=255
DECLARE cv0, cv1 VARBINARY(256);
SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2), cv1 = 0x00, j = 1, i = 1, c = 0;
IF s1 = s2 THEN
RETURN 0;
ELSEIF s1_len = 0 THEN
RETURN s2_len;
ELSEIF s2_len = 0 THEN
RETURN s1_len;
ELSE
WHILE j <= s2_len DO
SET cv1 = CONCAT(cv1, UNHEX(HEX(j))), j = j + 1;
END WHILE;
WHILE i <= s1_len DO
SET s1_char = SUBSTRING(s1, i, 1), c = i, cv0 = UNHEX(HEX(i)), j = 1;
WHILE j <= s2_len DO
SET c = c + 1;
IF s1_char = SUBSTRING(s2, j, 1) THEN
SET cost = 0; ELSE SET cost = 1;
END IF;
SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
IF c > c_temp THEN SET c = c_temp; END IF;
SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10) + 1;
IF c > c_temp THEN
SET c = c_temp;
END IF;
SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;
END WHILE;
SET cv1 = cv0, i = i + 1;
END WHILE;
END IF;
RETURN c;
END;
This function could then be used in a helper function as follows:
CREATE FUNCTION levenshtein_ratio( s1 VARCHAR(255), s2 VARCHAR(255) )
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE s1_len, s2_len, max_len INT;
SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2); -- characters, not bytes, to match levenshtein()
IF s1_len > s2_len THEN
SET max_len = s1_len;
ELSE
SET max_len = s2_len;
END IF;
RETURN ROUND((1 - LEVENSHTEIN(s1, s2) / max_len) * 100);
END;
You could also optimize the levenshtein function by passing in your current max distance... if the function passes that distance, exit without calculating the exact distance.
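The early-exit idea can be illustrated outside the database. The following is a sketch in Java rather than MySQL, with made-up names (BoundedLevenshtein, distance); it assumes the caller only needs to know the exact distance when it is at most max:

```java
class BoundedLevenshtein {
    // Returns the edit distance between a and b, or max + 1 as soon as
    // the distance is known to exceed max.
    static int distance(String a, String b, int max) {
        // The length difference is a lower bound on the distance.
        if (Math.abs(a.length() - b.length()) > max) return max + 1;
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Values never decrease from one row to the next, so once a whole
            // row exceeds max the final distance must exceed it as well.
            if (rowMin > max) return max + 1;
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return Math.min(prev[b.length()], max + 1);
    }
}
```

For the city names from the question, distance("bangalore", "benglore", 3) gives 2, while a pair that differs in many places is abandoned after the first few rows.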
Ouch. This is a tricky one. Regardless of the method you use, you're going to end up with a very expensive query. My recommendation is that you write an application that duplicates the data into a new table after running it through a spell checker. You can also implement the query in Java, reading each record, spell checking the field and comparing it.
Fortunately, some spell-checking software already exists. You can have a look at Jazzy or JOrtho for this purpose.
SOUNDEX() may be of limited use to you, however I know from experience in normalising hotel names across the world (similar problem with mistranslation issues thrown in) that a reliable solution is going to be very difficult to create.
The best option would have involved making a standard city and/or address list. I have no idea whether anything equivalent to the Postcode Address File ( http://www.royalmail.com/marketing-services/address-management-unit/address-data-products/postcode-address-file-paf ), which is available in the UK, exists for your locality. This, however, will be of no real use for normalising your existing data.
Ultimately any option available is going to require significant human input to ensure any normalisations are not falsely matched.
In the first instance I would look to rely on any area codes you have available to you (Google tells me that in India this is a PIN code?). There are most likely database(s) available which can map these codes to areas ( http://www.geopostcodes.com/india_zip_codes ) which will remove the problem of normalising broader areas (assuming their PIN code is correct)
Regarding street-level normalisation, you may have to look at SOUNDEX() or some kind of arbitrary system if you wish to normalise spelling errors or differences in how people write individual street names or places.
It depends on the language that you are using; for example, you can remove vowels before comparing strings.
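In Java, for instance, a vowel-stripping comparison key might be sketched like this (VowelKey and key are hypothetical names, and only the ASCII vowels a, e, i, o, u are assumed to matter):

```java
import java.util.Locale;

class VowelKey {
    // Lowercases the text and removes the ASCII vowels, so that strings
    // which differ only in their vowels produce the same key.
    static String key(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[aeiou]", "");
    }
}
```

With this key "bangalore" and "benglore" both become "bnglr" and therefore match, while "begaluru" becomes "bglr" and does not, so the trick only catches some of the variants.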
You can create a table and use it to aid in the search:
CREATE TABLE `correct_spelling` (
  correctString varchar(100) not null,
  variant varchar(100) not null,
  -- keyed on the variant: one correct spelling can have many variants
  primary key (variant)
);
You would populate the table with the known variants (manually). While this sounds crazy in the short run, it may be your best solution in the long run, and it may be reusable later on.