Google Cloud Storage sort directory by name - java

This feels wrong to me. Given a prefix in GCS, and knowing that my "folders" are consistently named with a long value (e.g. a date in Unix time), I want to get the first entry I would see if I sorted them in descending order. Right now, the only way I see is to iterate through them all and sort the list:
ListOptions.Builder b = new ListOptions.Builder();
b.setRecursive(false);
b.setPrefix(path);
ListResult result = null;
result = gcsService.list(appIdentity.getDefaultGcsBucketName(), b.build()); // pass the options built above, not ListOptions.DEFAULT
List<Long> names = new ArrayList<>();
while (result.hasNext()){
ListItem l = result.next();
String name = l.getName();
logger.info("get top folder" + name);
names.add(Long.valueOf(name));
}
Collections.sort(names, Collections.reverseOrder()); // descending order, so the largest value comes first
long topDay = names.get(0);
Maybe there is a list option I don't see?

If the numbers all have the same length, you are looking for the last element on the last page of the results, because GCS returns names in lexicographic order, which then matches numeric order. There is no parameter which reverses result sorting, unfortunately.
If the numbers are not the same length, that's rough. The best way to solve that would probably be to iterate through the results and keep track of the best one you've seen so far, although sorting them all afterwards also works.
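The "keep track of the best one so far" idea can be sketched without any GCS calls. This is only an illustration: the class name LatestFolder and the method largest are made up here, and the sketch assumes every name parses as a long, as the question's code does. In the real loop you would feed it the names read from the ListResult.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

final class LatestFolder {

    // Returns the numerically largest name, or empty when there are no names.
    // Assumes every name parses as a long, as in the question's code.
    static OptionalLong largest(Iterable<String> names) {
        boolean seenAny = false;
        long best = Long.MIN_VALUE;
        for (String name : names) {
            long value = Long.parseLong(name);
            if (!seenAny || value > best) {
                best = value;
                seenAny = true;
            }
        }
        return seenAny ? OptionalLong.of(best) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Names of different lengths: lexicographic order would put "999" last.
        List<String> names = Arrays.asList("999", "1500000000", "42");
        System.out.println(largest(names).getAsLong());
    }
}
```

This needs a single pass and constant memory, so it avoids building and sorting the whole list.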


Retrieve String from values in HashMap at a specific occurrence of special character

I'm trying to retrieve specific substrings from the values of a HashMap constructed like this:
HashMap<ID, "Home > Recipe > Main Dish > Chicken > Chicken Breasts">
It is passed in from a different method that returns a HashMap.
In the above example, I need to retrieve Chicken.
Thus far, I have:
public static ArrayList<String> generalize() {
HashMap<String, String> items = new HashMap<>();
ArrayList<String> cats = new ArrayList<>();
items = RecSys.readInItemProfile("PATH", 0, 1);
for(String w : items.values()) {
cats.add(w);
}
for(String w : cats) {
int e = w.indexOf('>', 1 + w.indexOf('>', 1 + w.indexOf('>')));
String k = w.substring(e+1);
System.out.print(k);
e = 0;
}
System.out.println("k" + cats);
return cats;
}
Where I try to reset e on each iteration (I know it's redundant, but it was just a test).
In my dataset, the first k-v pair is
3880=Home  >  Recipes  >  Main Dish  >  Pasta,
My output is
Pasta
Which is ok. If there are more than 3x ">", it'll return all following categories. Optimally it wouldn't do that, but it's ok if it does. However, further down the line, it (seemingly) randomly returns
Home > Recipe
Along with the rest of the data...
This happens at the 6th loop, I believe.
Any help is greatly appreciated..
Edit:
To clarify, I have a .csv file containing 3 columns, of which 2 are used in this function (ID and Category). These are passed to this function by a read method in another class.
What I need to do is extract a generalized description of each category, which in all cases is the third instance of category specification (that is, always between the third and fourth ">" in every k-v pair).
My idea was to simply put all values in an arraylist, and for every value extract a string from between the third and fourth ">".
I recommend using the following map:
Map<Integer, List<String>> map = new HashMap<>();
String[] vals = new String[] { "HomeRecipe", "Main Dish", "Chicken",
"Chicken Breasts" };
map.put(1, Arrays.asList(vals));
Then, if you need to find a given value in your original string using an ID, you can simply call List#get() at a certain position. If you don't care at all about order, then a map of integers to sets might make more sense here.
If you can, change your data structure to a HashMap<Integer, List<String>> or HashMap<Integer, String[]>. It's better to store the categories (by cats you mean categories, right?) in a collection instead of a string.
Then you can easily get the third item.
If this is not possible, you need to do some debugging. Start by printing every input and output pair and find out which input caused the unexpected output. Your indexOf method seems to work at first glance.
Alternatively, try this regex method:
String k = w.replaceAll("(?:[^>]+\\s*>\\s*){3}([^>]+).*", "$1").trim(); // w is one category string from the loop; trim() drops the space left before the next '>'
System.out.println(k);
The regex basically looks for a xxx > yyy > zzz > aaa ... pattern and replaces that pattern with aaa (whatever that is in the original string).
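Another option is to split each value on ">" and pick the segment by position. The sketch below is an illustration only: CategoryExtractor and fourthSegment are made-up names, and it assumes the wanted category is always the text between the third and fourth ">", as the question states. It also makes the short-input case explicit. That may matter here, since String.indexOf returns -1 when no further ">" exists, which makes the chained 1 + indexOf(...) calls restart the search from index 0. Whether that is what produced the unexpected output is a guess, because the offending input is not shown.

```java
import java.util.Optional;

final class CategoryExtractor {

    // Returns the segment between the third and fourth '>' (index 3 after
    // splitting), or empty when the value has fewer than four segments.
    static Optional<String> fourthSegment(String path) {
        String[] parts = path.trim().split("\\s*>\\s*");
        return parts.length > 3 ? Optional.of(parts[3]) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fourthSegment("Home > Recipe > Main Dish > Chicken > Chicken Breasts"));
        System.out.println(fourthSegment("Home  >  Recipes  >  Main Dish  >  Pasta"));
        System.out.println(fourthSegment("Home > Recipe"));
    }
}
```

Returning an Optional forces the caller to decide what to do with values that have too few categories instead of silently printing part of the input.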

Get the same element values on multiple arrays

I've been searching SO about this question, and most answers only deal with comparing two arrays using a nested loop. My problem is much the same but on a bigger scale. Suppose I have a hundred or a thousand users on my app, and each user has a list of items they want.
Something like this:
User1 = {apple,orange,guava,melon,durian}
User2 = {apple, melon,banana,lemon,mango}
User3 = {orange,carrots,guava,melon,tomato}
User4 = {mango,carrots,tomato,apple,durian}
.
.
Nuser = ...
I want to see how many times apple or orange was listed across all the users' arrays. So I am basically comparing, but on a bigger scale. The data isn't static either: a user can enter a fruit the developers don't know about, several users may enter that same unknown fruit, and the system should still be able to count how many times it was listed. Keep in mind that this is dynamic. The number of users could reach, for example, 100, depending on how popular the app becomes. I can't afford to do a nested loop here.
PS: this is not the exact problem, but it is the simplest scenario I can think of to explain it.
PS: just to clarify, I don't intend to use a 3rd-party lib like Guava either. I am having a problem with ProGuard with it.
Edit
I just read that the original poster cannot use Java 8, which is a pity, because it would really make this very easy!
Java 7 solution
final Map<String, Integer> occurencesByFruit = new HashMap<>();
for (User user : users) {
String[] fruits = user.getFruits();
for (String fruit : fruits) {
final Integer currentCount = occurencesByFruit.get(fruit);
if (currentCount == null) {
occurencesByFruit.put(fruit, 1);
} else {
occurencesByFruit.put(fruit, currentCount + 1);
}
}
}
Java 8 solution
I'd stream the users, flatMap() to the actual fruit elements, and then use Collectors.groupingBy() with a downstream collector Collectors.counting().
This will give you a Map where the keys are the fruits, and the values are the occurrences of each fruit throughout all your users.
List<User> users = Arrays.asList(/* ... */);
final Map<String, Long> occurencesByFruit = users.stream()
.map(User::getFruits)
.flatMap(Arrays::stream)
.collect(Collectors.groupingBy(f -> f, Collectors.counting()));
This seems like a good fit for a HashMap<Item, Integer> fruits. You could iterate over all Users (you would need to store them in some kind of list, such as ArrayList<User> users) and check the list of items chosen by each User (I suppose User should have a field ArrayList<Item> items to store them). You could achieve it with something like this:
for (User user : users) { // for each User from the users list
    for (Item item : user.items) { // check each item chosen by this user
        if (fruits.containsKey(item)) { // if the item is already in the map, increment its count
            int previousNumberOfItems = fruits.get(item);
            fruits.put(item, previousNumberOfItems + 1);
        } else { // otherwise record the first occurrence of this item
            fruits.put(item, 1);
        }
    }
}
I would either create an ArrayList containing a HashMap with strings and ints or use two ArrayLists (one of type String and one of type Integer). Then you can iterate over every entry in each of the user arrays (this is only a simple nested loop). For every entry in the current user array you check if there is already the same entry in the ArrayList you created additionally. If so, you increment the respective int. If not, you add a string and an int. In the end, you have the number of occurrences of all the fruit strings in the added ArrayLists, which is, if I understood you correctly, just what you wanted.
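For reference, here is a self-contained version of the counting pass that runs on the sample data from the question and uses no Java 8 features. FruitCounter, count and sampleWishLists are illustrative names, and each user is represented simply as a String[] rather than a User class. The loop is nested only in the sense that it visits every item of every user once. No user is compared against another, so the cost grows with the total number of listed items.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FruitCounter {

    // Counts how often each fruit appears across all wish lists in one pass.
    static Map<String, Integer> count(List<String[]> wishLists) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] wishList : wishLists) {
            for (String fruit : wishList) {
                Integer current = counts.get(fruit);
                counts.put(fruit, current == null ? 1 : current + 1);
            }
        }
        return counts;
    }

    // The four users listed in the question.
    static List<String[]> sampleWishLists() {
        return Arrays.<String[]>asList(
                new String[] {"apple", "orange", "guava", "melon", "durian"},
                new String[] {"apple", "melon", "banana", "lemon", "mango"},
                new String[] {"orange", "carrots", "guava", "melon", "tomato"},
                new String[] {"mango", "carrots", "tomato", "apple", "durian"});
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(sampleWishLists());
        System.out.println("apple: " + counts.get("apple"));   // apple: 3
        System.out.println("orange: " + counts.get("orange")); // orange: 2
    }
}
```

Unknown fruits need no special handling: the first time a name appears it is simply added to the map with a count of 1.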

java: check if an object's attribute exists in HashMap values

I have a HashMap with key of type Double and my custom object as value.
It looks like this:
private static Map<Double, Incident> incidentHash = new HashMap<>();
The Incident object has following attributes: String date, String address, String incidentType.
Now I have a String date that I get from the user as input and I want to check if there exists any incident in the HashMap with that user inputted date. There can be many Incidents in the HashMap with the given date but as long as there's at least one Incident with the given date, I can do something.
I can just iterate over all the values in the HashMap and check if a given date exists but I was wondering if there is any better and more efficient way possible without modifying the data structure.
Given your HashMap, no, there is no way of doing so without iterating over that HashMap.
As for changing the structure, you could use a Map<String, List<Incident>>. That way you would have the date as key and a List of incidents for that date, which fits your requirement: There can be many Incidents in the HashMap with the given date.
The lookup would then be O(1) on average:
//considering that the key is added only when you have at least one incident
if (yourHash.get("yourDateStringWhatEverTheFormatIs") != null) { /* do something */ }
You can use the Stream API (from Java 8) as shown in the code below, with inline comments:
String userInput="10-APR-2017";
Optional<Map.Entry<Double, Incident>> matchedEntry =
incidentHash.entrySet().stream().
//filter with the condition to match
filter(element -> element.getValue().getDate().equals(userInput)).findAny();
//if the entry is found, do your logic
matchedEntry.ifPresent(value -> {
//do something here
});
If you are looking for something that works prior to JDK 1.8, you can refer to the code below:
String userInput="10-APR-2017";
Set<Map.Entry<Double, Incident>> entries = incidentHash.entrySet();
Map.Entry<Double, Incident> matchedEntry = null;
for(Iterator<Map.Entry<Double, Incident>> iterator = entries.iterator();
iterator.hasNext();) {
Map.Entry<Double, Incident> temp = iterator.next();
if(temp.getValue().getDate().equals(userInput)) {
matchedEntry = temp;
break;
}
}
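You can use a TreeMap with your custom Comparator. In your Comparator, compare the dates. Note that a TreeMap's Comparator orders the keys, not the values, so this requires the date to be part of the key.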
You would have to iterate through the map until you find a date that matches. Since you only need to know if any occurrences exist, you can simply exit the loop when you find a match instead of iterating over the rest of the map.
You can also keep a second Hash/TreeMap that maps the attribute to the object, so you can check this attribute quickly. But you have to maintain one such map for each attribute you want to access quickly. This makes things a bit more complex and uses more memory, but can be much, much faster.
If this is not an option the stream API referenced in other answers is a nice and tidy way to iterate over all objects to search for an attribute.
private static Map<Double, Incident> incidentHash = new HashMap<>();
private static Map<String, List<Incident>> incidentsPerDayMap = new HashMap<>();
Given that you don't want to iterate the Map, and that iterating is currently the only way to get the required value, I would recommend another Map that contains Date as key and List<Incident> as value. It can be a TreeMap, e.g.:
Map<Date, List<Incident>> incidents = new TreeMap<>();
You can put the entry in this Map whenever an entry is added into the original Map, e.g.:
Incident incident = ...; // the incident being added
Date date = ...; // the date of that incident
incidents.computeIfAbsent(date, t -> new ArrayList<>()).add(incident);
Once the user inputs the Date, you can get all the incidents belonging to this date with incidents.get(date). Although that gives you a list that you still need to iterate over, it will contain far fewer elements, and the get method of TreeMap guarantees O(log n) complexity because the keys are kept sorted. So your search operation will be much more efficient.
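Several answers above suggest keeping a second map keyed by date next to the original one. A minimal sketch of that idea follows. Everything in it is illustrative: IncidentIndex, add, hasIncidentOn and the nested Incident class are stand-ins, not the poster's actual classes, and dates are kept as plain Strings as in the question. It does add a map, so it only applies if the "without modifying the data structure" constraint can be relaxed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IncidentIndex {

    // Hypothetical stand-in for the question's Incident class.
    static final class Incident {
        final String date;
        final String address;
        final String incidentType;

        Incident(String date, String address, String incidentType) {
            this.date = date;
            this.address = address;
            this.incidentType = incidentType;
        }
    }

    private final Map<Double, Incident> incidentHash = new HashMap<>();
    private final Map<String, List<Incident>> incidentsPerDay = new HashMap<>();

    // Updates both maps together; a replaced incident is also removed
    // from the per-day index so the two maps stay consistent.
    void add(double key, Incident incident) {
        Incident previous = incidentHash.put(key, incident);
        if (previous != null) {
            List<Incident> sameDay = incidentsPerDay.get(previous.date);
            sameDay.remove(previous);
            if (sameDay.isEmpty()) {
                incidentsPerDay.remove(previous.date);
            }
        }
        incidentsPerDay.computeIfAbsent(incident.date, d -> new ArrayList<>()).add(incident);
    }

    // Average O(1): a key exists only while at least one incident has that date.
    boolean hasIncidentOn(String date) {
        return incidentsPerDay.containsKey(date);
    }

    public static void main(String[] args) {
        IncidentIndex index = new IncidentIndex();
        index.add(1.0, new Incident("10-APR-2017", "1 Main St", "theft"));
        System.out.println(index.hasIncidentOn("10-APR-2017")); // true
        System.out.println(index.hasIncidentOn("11-APR-2017")); // false
    }
}
```

Routing every insert through one method is the design point: the extra map is only trustworthy if nothing writes to the original map behind its back.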

Convert list to hashmap

The title may give the impression that this is a duplicate question, but I don't think it is.
I am just a few months into Java and a month into MongoDB, Spring Boot and REST.
I have a Mongo collection with 3 fields in a document: _id (default field), appName and appKey. I am using a list to iterate through all the documents and find the one whose appName and appKey match the ones that are passed in. This collection currently has only 4 entries, so it runs smoothly. But I was reading a bit about collections and found that with a larger number of documents, searching a list will be much slower than looking up a HashMap.
As I said, I am quite new to Java and I am having a bit of trouble converting my code to use a HashMap, so I was hoping someone could guide me through this.
I am also attaching my code for reference.
public List<Document> fetchData() {
// Collection that stores appName and appKey
MongoCollection<Document> collection = db.getCollection("info");
List<Document> nameAndKeyList = new ArrayList<Document>();
// Getting the list of appName and appKey from info DB
AggregateIterable<Document> output = collection
.aggregate(Arrays.asList(new BasicDBObject("$group", new BasicDBObject("_id",
new BasicDBObject("_id", "$id").append("appName", "$appName").append("appKey", "$appKey"))
)));
for (Document doc : output) {
nameAndKeyList.add((Document) doc.get("_id"));
}
return nameAndKeyList;
}// End of Method
And then I am calling it in another method of the same class:
List<Document> nameAndKeyList = new ArrayList<>();
//InfoController is the name of the class
InfoController obj1 = new InfoController();
nameAndKeyList = obj1.fetchData();
// Fetching and checking if the appName & appKey pair
// is present in the DB one by one.
// If appName & appKey mismatches, it increments the value
// of 'i' and check them with the other values in DB
for (int i = 0; i < nameAndKeyList.size(); i++) {
"followed by my code"
And if I am not wrong, there will be no need for the above loop either.
Thanks in advance.
You just need a simple find query to get the record you need directly from Mongo DB.
Document document = collection
.find(new Document("appName", someappname).append("appKey", someappkey)).first();
First of all, a list is not much slower or faster than a HashMap. A HashMap is commonly used to store key-value pairs such as "ID", "Name" or something like that. In your case I see you are using an ArrayList without a specified size for the list. It may be better to use a LinkedList when you do not know the size, because an ArrayList holds an array internally and grows it by copying. If you want to generate a HashMap out of the list, or use a HashMap directly, you need to map an ID to the value for each record.
HashMap<String /*type of the identifier*/, String /*type of value*/> map = new HashMap<String,String>();
for (Document doc : output) {
map.put(String.valueOf(doc.get("_id")), String.valueOf(doc.get("_value"))); // doc.get returns Object, so convert to String
}
First, avoid premature optimization (look up the expression if you don’t know what it is). Put a realistic number of items (thousands, say) containing near-realistic data in your list. Try to retrieve an item that isn’t there. This will force your for loop to traverse the entire list. See how long it takes. Try a number of times to get an impression of whether you get impatient. If you don’t, you’re done.
If you find out that you need a speed-up, I agree that HashMap is one of the obvious solutions to try. One of the first things to consider with this is a key type for your HashMap. As I understand, what you need to search for is an item where appName and appKey are both right. A good solution is to write a simple class with these two fields and equals and hashCode methods (I’ll call it DocumentHashMapKey for now, think of a better name). For hashCode(), try Objects.hash(appName, appKey). If it doesn’t give satisfactory performance with the data you have, consider alternatives. Now you are ready to build your HashMap<DocumentHashMapKey, Document>.
If you’re lazy or just want a first impression of how a HashMap performs, you may also build your keys by concatenating appName + "$##" + appKey (where the string in the middle is something that is unlikely to be part of a name or key) and use HashMap<String, Document>.
Everything I said can be refined depending on your needs. This was just to get you started.
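A key class along the lines described above might look like the following. This is a sketch under assumptions, not the answerer's code: AppLookup and AppCredentials are made-up names, the sample values are invented, and a plain String stands in for the Mongo Document so the example runs without the driver.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class AppLookup {

    // Immutable composite key: two keys are equal only if both fields match.
    static final class AppCredentials {
        private final String appName;
        private final String appKey;

        AppCredentials(String appName, String appKey) {
            this.appName = Objects.requireNonNull(appName);
            this.appKey = Objects.requireNonNull(appKey);
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof AppCredentials)) {
                return false;
            }
            AppCredentials that = (AppCredentials) other;
            return appName.equals(that.appName) && appKey.equals(that.appKey);
        }

        @Override
        public int hashCode() {
            return Objects.hash(appName, appKey);
        }
    }

    public static void main(String[] args) {
        // String is a placeholder for the Document value type.
        Map<AppCredentials, String> documents = new HashMap<>();
        documents.put(new AppCredentials("billing", "k-123"), "doc-1");

        System.out.println(documents.get(new AppCredentials("billing", "k-123"))); // doc-1
        System.out.println(documents.get(new AppCredentials("billing", "wrong"))); // null
    }
}
```

Compared with concatenating the two strings, a dedicated key class cannot collide when a name or key happens to contain the separator.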
Thanks everyone for your help, without which I would not have got to a solution.
public HashMap<String, String> fetchData() {
// Collection that stores appName and apiKey
MongoCollection<Document> collection = db.getCollection("info");
HashMap<String, String> appKeys = new HashMap<String, String>();
// Getting the list of appName and appKey from info DB
AggregateIterable<Document> output = collection
.aggregate(Arrays.asList(new BasicDBObject("$group", new BasicDBObject("_id",
new BasicDBObject("_id", "$id").append("appName", "$appName").append("appKey", "$appKey"))
)));
String appName = null;
String appKey = null;
for (Document doc : output) {
Document temp = (Document) doc.get("_id");
appName = (String) temp.get("appName");
appKey = (String) temp.get("appKey");
appKeys.put(appName, appKey);
}
return appKeys;
}// End of Method
Calling the above method from another method of the same class:
InfoController obj = new InfoController();
//Fetching the values of 'appName' & 'appKey' sent from 'info' DB
HashMap<String, String> appKeys = obj.fetchData();
storedAppkey = appKeys.get(appName);
//Handling the case of mismatch
if (storedAppkey == null || !storedAppkey.equals(appKey))
{//Then the response and further processing that I need to do.
Now what HashMap has done is that it has made my code more readable and the 'for' loop that I was using for iterating is gone, although it might not make much difference in the performance as of now.
Thanks once again to everyone for your help and support.

Sorting of 2 or more massive resultsets?

I need to sort multiple intermediate result sets and write them to a file in sorted order. The sort is based on a single column/key value. Each result set record is a list of values (like a record in a table).
The intermediate result sets are obtained by querying entirely different databases.
The intermediate result sets are already sorted on some key (or column). They need to be combined and sorted again on the same key (or column) before being written to a file.
Since these result sets can be massive (on the order of MBs), this cannot be done in memory.
My solution, broadly:
Use a hash map and a random access file. Since the result sets are already sorted, when retrieving them I will store the sorted column values as keys in a HashMap. The value in the HashMap will be an address in the random access file where every record associated with that column value is stored.
Any ideas?
Have a pointer into every set, initially pointing to its first entry.
Then choose the set whose current entry is the lowest.
Write this entry to the file and advance the corresponding pointer. Repeat until every set is exhausted.
This approach has basically no overhead and, for a fixed number of sets, the time is O(n) in the total number of records. (It's merge sort, by the way.)
Edit
To clarify: It's the merge part of merge sort.
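The pointer-per-set steps above generalize to any number of inputs. The sketch below uses a PriorityQueue to find the lowest current entry, which is a swapped-in detail rather than something the answer prescribes, and costs O(n log k) for n records across k inputs. SortedMerge, Cursor and merge are illustrative names, and plain Iterators stand in for the result sets. Only one record per input is held in memory at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

final class SortedMerge {

    // The "pointer" into one input: its current entry plus the remaining entries.
    private static final class Cursor<T> {
        final T head;
        final Iterator<T> rest;

        Cursor(T head, Iterator<T> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    // Merges inputs that are each already sorted by 'order', passing every
    // element to 'sink' in overall sorted order.
    static <T> void merge(List<Iterator<T>> inputs, Comparator<? super T> order,
                          Consumer<? super T> sink) {
        Comparator<Cursor<T>> byHead = (x, y) -> order.compare(x.head, y.head);
        PriorityQueue<Cursor<T>> heap = new PriorityQueue<>(byHead);
        for (Iterator<T> input : inputs) {
            if (input.hasNext()) {
                heap.add(new Cursor<>(input.next(), input));
            }
        }
        while (!heap.isEmpty()) {
            Cursor<T> lowest = heap.poll();
            sink.accept(lowest.head); // in the real case: write the record to the file
            if (lowest.rest.hasNext()) {
                heap.add(new Cursor<>(lowest.rest.next(), lowest.rest));
            }
        }
    }

    public static void main(String[] args) {
        List<Iterator<Integer>> inputs = new ArrayList<>();
        inputs.add(Arrays.asList(1, 4, 9).iterator());
        inputs.add(Arrays.asList(2, 3, 10).iterator());
        List<Integer> merged = new ArrayList<>();
        merge(inputs, Comparator.<Integer>naturalOrder(), merged::add);
        System.out.println(merged); // [1, 2, 3, 4, 9, 10]
    }
}
```

For just two inputs a direct comparison of the two current entries is enough and the heap adds nothing.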
If you've got 2 pre-sorted result sets, you should be able to iterate them concurrently while writing the output file. You just need to compare the current row in each set:
Simple example (not ready for copy-and-paste use!):
ResultSet a,b;
//fetch a and b
a.first();
b.first();
while (!a.isAfterLast() || !b.isAfterLast()) {
if (a.isAfterLast()) {
writeToFile(b);
b.next();
}
else if (b.isAfterLast()) {
writeToFile(a);
a.next();
} else {
int valueA = a.getInt("SORT_PROPERTY");
int valueB = b.getInt("SORT_PROPERTY");
if (valueA < valueB) {
writeToFile(a);
a.next();
} else {
writeToFile(b);
b.next();
}
}
}
Sounds like you are looking for an implementation of the Balance Line algorithm.
