Is there a better way of bulk selections in sql?


Assume I have a database table 'UserInformation' with thousands of entries:
id   user_id  type  value
1    1        A     value_1
2    1        B     value_2
3    2        A     value_3
4    2        B     value_4
5    2        C     value_5
...  ...      ...   ...
For simplicity, my task is to select all entries with a specific user_id and type A or B and assign the values to a UserModel object (in Java).
But I have to select this information for thousands of user ids (e.g. the users list has 10,000 entries).
I know two ways of achieving this:
1.
public List<UserModel> getUsers(List<User> users) {
    List<UserModel> userModels = new ArrayList<>();
    for (User user : users) {
        // Two queries per user: one for type A, one for type B
        String valueA = findValueByUserIdAndType(user.getId(), "A");
        String valueB = findValueByUserIdAndType(user.getId(), "B");
        userModels.add(new UserModel(user.getId(), valueA, valueB));
    }
    return userModels;
}
2.
public List<UserModel> getUsers(List<User> users) {
    List<UserModel> userModels = new ArrayList<>();
    // Collect the ids that go into the IN clause
    List<Long> userIds = users.stream().map(User::getId).collect(Collectors.toList());
    List<UserInformation> entries = findAllByUserIdInAndTypeIn(userIds, List.of("A", "B"));
    Map<Long, List<UserInformation>> entriesMappedByUserId = groupByUserId(entries);
    for (User user : users) {
        List<UserInformation> userInformation = entriesMappedByUserId.get(user.getId());
        userModels.add(new UserModel(user.getId(), userInformation));
    }
    return userModels;
}
I know the first method results in 20,000 database queries. But the second one leads me to two questions:
Is there a better way than transforming the database result into a map and retrieving the values by identifier?
Assuming the second database query uses an IN clause (where user_id in (1,2,3,...,10000)), at what number of ids do I break my database?
And the resulting question: Is there a better way to select a huge number of entries?
Edit: Assume the User table and the UserInformation table can't be joined because the information is stored in two different places.
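One way to keep the second approach while bounding the size of each IN list is to split the ids into fixed-size chunks and merge the rows into a single map. The following is only a minimal sketch under assumptions: the Info class, the fetchChunk function (a stand-in for a repository call such as findAllByUserIdInAndTypeIn) and the chunk size are all hypothetical, and a sensible chunk size depends on your database and driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical row type standing in for UserInformation.
final class Info {
    final long userId;
    final String type;
    final String value;

    Info(long userId, String type, String value) {
        this.userId = userId;
        this.type = type;
        this.value = value;
    }
}

final class ChunkedLookup {

    // Splits ids into consecutive sublists of at most chunkSize elements.
    static <T> List<List<T>> chunks(List<T> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            result.add(new ArrayList<>(ids.subList(start, end)));
        }
        return result;
    }

    // Runs one lookup per chunk (fetchChunk is a placeholder for the real
    // query) and groups all rows as userId -> (type -> value).
    static Map<Long, Map<String, String>> loadValues(
            List<Long> userIds,
            int chunkSize,
            Function<List<Long>, List<Info>> fetchChunk) {
        Map<Long, Map<String, String>> byUser = new HashMap<>();
        for (List<Long> chunk : chunks(userIds, chunkSize)) {
            for (Info row : fetchChunk.apply(chunk)) {
                byUser.computeIfAbsent(row.userId, id -> new HashMap<>())
                      .put(row.type, row.value);
            }
        }
        return byUser;
    }
}
```

With 10,000 ids and a chunk size of 1,000 this issues 10 lookups instead of 20,000, and reading a value from the resulting map is a hash lookup rather than another query.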

Related

How to know the missing items from Spring Data JPA's findAllById method in an efficient way?

Consider this code snippet below:
List<String> usersList = Arrays.asList("john", "jack", "jill", "xxxx", "yyyy");
List<User> userEntities = userRepo.findAllById(usersList);
The User class is a simple entity annotated with @Entity and has an @Id field of type String.
Assume that the db has rows corresponding to "john", "jack" and "jill". Even though I passed 5 items in usersList (including "xxxx" and "yyyy"), findAllById only returns the 3 entities corresponding to "john", "jack" and "jill".
Now, after the call to findAllById, what's the best, easiest and most efficient (better than O(n^2), perhaps) way to find the missing items that findAllById did not return? (In this case, "xxxx" and "yyyy".)

Using Java Sets
You could use a set as the source of filtering:
Set<String> usersSet = new HashSet<>(Arrays.asList("john", "jack", "jill", "xxxx", "yyyy"));
And now you could collect the ids that were actually found:
Set<String> foundIds = userRepo.findAllById(usersSet)
.stream()
.map(User::getId)
.collect(Collectors.toSet());
I assume this step is O(n), since it goes over the entire result once.
Or you could change your repository to return a set of users, ideally using some form of distinct clause:
Set<String> foundIds = userRepo.findDistinctById(usersSet)
.stream()
.map(User::getId)
.collect(Collectors.toSet());
And then you can just apply a set operator:
usersSet.removeAll(foundIds);
And now usersSet contains the ids not found in your result.
A hash set has O(1) expected cost to find an item, so I assume this should be O(sizeOf(usersSet)) to remove them all.
Alternatively, you could iterate over foundIds and gradually remove items from usersSet. Then you could short-circuit the loop as soon as you realize that there are no more usersSet items to remove (i.e. the set is empty).
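The set-difference idea can be sketched as a small self-contained helper. The found ids are hard-coded here in place of the repository call, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MissingIds {

    // Returns the requested ids that do not appear among the found ids,
    // keeping the order in which they were requested.
    static Set<String> missing(Collection<String> requested, Collection<String> found) {
        Set<String> result = new LinkedHashSet<>(requested);
        // Wrapping found in a HashSet keeps every membership check O(1) on
        // average, whichever of the two collections removeAll iterates over.
        result.removeAll(new HashSet<>(found));
        return result;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("john", "jack", "jill", "xxxx", "yyyy");
        // Stand-in for the ids extracted from the findAllById result.
        List<String> found = Arrays.asList("john", "jack", "jill");
        System.out.println(missing(requested, found)); // prints [xxxx, yyyy]
    }
}
```

The whole operation is linear in the number of requested plus found ids, which is comfortably better than O(n^2).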
Filtering Directly from Database
Now, to avoid all this, you can probably define a native query and run it from your JPA repository to retrieve only the ids from your list that do not exist in the database. The query would look something like this one, which I wrote for PostgreSQL:
WITH my_users AS(
SELECT 'john' AS id UNION SELECT 'jack' UNION SELECT 'jill'
)
SELECT id FROM my_users mu
WHERE NOT EXISTS(SELECT 1 FROM users u WHERE u.id = mu.id);
Spring Data: JDBC Example
Since the query is dynamic (i.e. the filtering set could be of different sizes every time), we need to build the query dynamically. And I don't believe JPA has a way to do this, but a native query might do the trick.
You could either pack a JdbcTemplate query directly into your repository or use JPA native queries manually.
@Repository
public class UserRepository {
    private final JdbcTemplate jdbcTemplate;

    public UserRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public Set<String> getUserIdNotFound(Set<String> userIds) {
        // Build one "SELECT ? AS id" per requested id, joined with UNION
        StringBuilder sql = new StringBuilder();
        for (String userId : userIds) {
            if (sql.length() > 0) {
                sql.append(" UNION ");
            }
            sql.append("SELECT ? AS id");
        }
        String query = String.format("WITH my_users AS (%s) ", sql) +
                "SELECT id FROM my_users mu WHERE NOT EXISTS(SELECT 1 FROM users u WHERE u.id = mu.id)";
        // Bind the ids in the same order as the placeholders
        List<String> myUsers = jdbcTemplate.queryForList(query, String.class, userIds.toArray());
        return new HashSet<>(myUsers);
    }
}
Then we just do:
Set<String> usersIds = Set.of("john", "jack", "jill", "xxxx", "yyyy");
Set<String> notFoundIds = userRepo.getUserIdNotFound(usersIds);
There is probably a way to do it with JPA native queries. Let me see if I can do one of those and put it in the answer later on.

You can write your own algorithm that finds missing users. For example:
List<String> missing = new ArrayList<>(usersList);
for (User user : userEntities){
String userId = user.getId();
missing.remove(userId);
}
As a result, you will have a list of the user ids that are missing:
"xxxx" and "yyyy"

You can just add a method to your repo:
findByIdNotIn(Collection<String> ids)
and Spring will generate the query. See here:
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#jpa.query-methods
Note (from the docs):
In and NotIn also take any subclass of Collection as a parameter as well as arrays or varargs.

How can I bulk delete by a List of model classes with JPA

Normally deleting many records from a table at once when you have a List is pretty straightforward. Right now my query looks like this:
Query q = em.createQuery("DELETE from SomeEntity e WHERE e.id in :someEntityIds")
.setParameter("someEntityIds", someEntityIds);
where someEntityIds is a List<Integer> of ids that I want to delete.
However, I need to modify the code to delete based on two different ids. I have a model class, SomeModel, that has two variables in it:
public class SomeModel {
public String someId;
public Long anotherId;
}
Note that this model does not directly correspond with my JPA model/entity. The table will have these two values, plus a bunch more.
I have a list, List<SomeModel>, that I need to use to delete multiple records at once. In other words, I have to delete every single record that matches both someId and anotherId in a list of SomeModels. Is this possible without iterating through the List and running a delete query on each item in the list?

Possibly concat could work, but you'll have to prepare a list of composite ids beforehand:
// List<SomeModel> list = ... // assuming there's a list of some models
List<String> compositeIds = list.stream()
        .map(m -> m.someId + "#" + m.anotherId)
        .collect(Collectors.toList());
Query q = em.createQuery("DELETE from SomeEntity e WHERE concat(e.someId, '#', e.anotherId) in :compositeIds")
        .setParameter("compositeIds", compositeIds);
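If the string concatenation feels fragile (the separator could occur inside someId, and wrapping the columns in a function typically makes it harder for the database to use an index), another option is to generate one (someId, anotherId) condition per pair and bind each value as a named parameter. The sketch below only builds the JPQL string and the parameter map, so it runs without JPA. SomeEntity and the field names come from the question; PairDeleteQuery and the parameter names s0, a0, ... are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Same shape as the model class in the question.
final class SomeModel {
    final String someId;
    final Long anotherId;

    SomeModel(String someId, Long anotherId) {
        this.someId = someId;
        this.anotherId = anotherId;
    }
}

// Hypothetical holder for a generated JPQL string and its named parameters.
final class PairDeleteQuery {
    final String jpql;
    final Map<String, Object> params;

    private PairDeleteQuery(String jpql, Map<String, Object> params) {
        this.jpql = jpql;
        this.params = params;
    }

    // Builds "DELETE ... WHERE (pair 0) OR (pair 1) ..." with one pair of
    // named parameters (:sN, :aN) per model in the list.
    static PairDeleteQuery build(List<SomeModel> models) {
        if (models.isEmpty()) {
            throw new IllegalArgumentException("nothing to delete");
        }
        StringBuilder where = new StringBuilder();
        Map<String, Object> params = new LinkedHashMap<>();
        for (int i = 0; i < models.size(); i++) {
            if (i > 0) {
                where.append(" OR ");
            }
            where.append("(e.someId = :s").append(i)
                 .append(" AND e.anotherId = :a").append(i).append(")");
            params.put("s" + i, models.get(i).someId);
            params.put("a" + i, models.get(i).anotherId);
        }
        return new PairDeleteQuery("DELETE FROM SomeEntity e WHERE " + where, params);
    }
}
```

The resulting string would be passed to em.createQuery, with one setParameter call per map entry. For very long lists it may be worth deleting in batches, since the statement grows with every pair.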

What would be a cleaner way to cluster ArrayList() based on one column data?

Say I have the list below
full_list
ID NAME
1 Apple
1 Banana
2 Carrot
1 Mango
2 Spinach
3 BMW
3 Opal
With this single list, I want to create grouped lists based on the ID column, like below
fruits_list veggies_list cars_list
ID NAME ID NAME ID NAME
1 Apple 2 Carrot 3 BMW
1 Banana 2 Spinach 3 Opal
1 Mango
I was trying to do it with ArrayList<ArrayList<CustomObject>>, but it adds more complexity to the code. Am I doing it wrong? Is there a cleaner way to do this?
PS: The incoming data is not fixed. I cannot define the lists explicitly with conditions (i.e. if(id==1), if(id==2) etc.). This has to be done dynamically based on the incoming data.

As you said, it gets more complex if you do this using only a List. The logic can be simplified by using a Map together with a List. Sample code to achieve this is below:
Map<Integer, List<String>> myMap = new LinkedHashMap<Integer, List<String>>();
try {
Class.forName("com.mysql.cj.jdbc.Driver");
Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/stackoverflow?useSSL=false",
"root", "password");
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery("select * from `58144029`");
while (rs.next()) {
List<String> myList = myMap.get(rs.getInt("id")) == null ? new LinkedList<String>()
: myMap.get(rs.getInt("id"));
myList.add(rs.getString("name"));
myMap.put(rs.getInt("id"), myList);
}
con.close();
} catch (Exception e) {
e.printStackTrace();
}
System.out.println(myMap);
// Getting the List as per id
System.out.println(myMap.get(1));
In the code above, I created the same table in the database as yours. The data is added to the map dynamically: each id in your database becomes a key of the map, and the value for that key is the list of names with that id. The key can be any integer, which you can then map to a particular list.
The query to create the table and insert the data is below:
CREATE TABLE `58144029` (
`id` int(11) NOT NULL,
`name` varchar(64) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `58144029` (`id`, `name`) VALUES
(1, 'Apple'),
(1, 'Banana'),
(2, 'Carrot'),
(1, 'Mango'),
(2, 'Spinach'),
(3, 'BMW'),
(3, 'Opal');

You need to use a Map that has the ID as its key and a List of your objects as its value. Suppose you have the following class:
class Item {
private int id;
private String name;
public Item(int id, String name) {
this.id = id;
this.name = name;
}
public int getId() { return this.id; }
}
Your result should be of type Map<Integer, List<Item>>. One way to achieve this is by using Collectors.groupingBy. For example:
list.stream().collect(Collectors.groupingBy(Item::getId));
This will create a map, that's grouped by the id property of the Item class.
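As a runnable illustration of that call on the data from the question, here is a small sketch. It adds a getName accessor to Item and collects only the names, which is an assumption on my side; GroupById is a made-up class name, and the LinkedHashMap is used so the ids keep their first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class Item {
    private final int id;
    private final String name;

    Item(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

final class GroupById {

    // Groups item names by id; the LinkedHashMap keeps ids in first-seen order.
    static Map<Integer, List<String>> namesById(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(
                Item::getId,
                LinkedHashMap::new,
                Collectors.mapping(Item::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Item> fullList = Arrays.asList(
                new Item(1, "Apple"), new Item(1, "Banana"), new Item(2, "Carrot"),
                new Item(1, "Mango"), new Item(2, "Spinach"),
                new Item(3, "BMW"), new Item(3, "Opal"));
        System.out.println(namesById(fullList));
        // prints {1=[Apple, Banana, Mango], 2=[Carrot, Spinach], 3=[BMW, Opal]}
    }
}
```

No id is hard-coded anywhere, so new ids in the incoming data simply become new keys.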

Another Quick Solution
var arList = [{1:'Apple'}, {1:'Banana'}, {2:'Carrot'}, {1:'Mango'}, {2:'Spinach'}, {3:'BMW'}, {3:'Opal'}]
var sorted =[];
arList.map(function (value, i) {
Object.keys(value).forEach(function(key, index) {
if(typeof sorted[key]==='undefined'){
sorted[key] = [value[key]];
}else{
sorted[key].push(value[key]);
}
});
})
sorted = sorted.filter(entry => /\S/.test(entry));
console.log(sorted);

Thanks all for your answers. As @Mauron advised, I used Collectors.groupingBy and it saved literally 20 lines of sorting code.
After some research I found how to list all keys in a map.
So here is the solution I was looking for:
// Create a grouped map using Collectors.groupingBy
Map<Integer, List<ShipmentItemsModel>> m = itemslist.stream().collect(Collectors.groupingBy(ShipmentItemsModel::getLrid));
// Make an ArrayList of keys
List<Integer> keys = new ArrayList<>(m.keySet());
// Use an enhanced for loop (foreach) to go through each key and list the items it has.
// (In layman's terms: it creates small lists based on the keys)
for (int singlekey : keys) {
    List<ShipmentItemsModel> values = m.get(singlekey);
    for (ShipmentItemsModel sortedmodel : values) {
        System.out.println("Key/ID :" + singlekey + " and value :" + sortedmodel.getProductname());
    }
}
So the final output will look like this:
Key/ID :1 and value :Apple
Key/ID :1 and value :Banana
Key/ID :1 and value :Mango
Key/ID :2 and value :Carrot
Key/ID :2 and value :Spinach
Key/ID :3 and value :BMW
Key/ID :3 and value :Opal
I appreciate all the answers. Thanks all once again.

The best way how to keep select result in array

I have a question about keeping a query result in an array. For example, I execute the query
SELECT * FROM some_table
Then I want to save the result to an array and create records from it. The table contains these columns:
id
user_name
last_name
The resulting array could be:
[[1, "First user name", "First last name"],
[2, "Second user name", "Second last name"]
...
].
Can you recommend which array or data type I should use?

You can do it like this:
Create a bean class for User:
public class User {
    private int id;
    private String firstName;
    private String lastName;
    // getters and setters
    ...
}
Then query all the data from the table and, for each row, create a User object and set its fields:
List<User> users = new ArrayList<>();
while (someValue) {
    ...
    int id = ...
    String firstName = ...
    String lastName = ...
    User user = new User();
    user.setId(id);
    user.setFirstName(firstName);
    user.setLastName(lastName);
    users.add(user);
}
// afterwards, do what you want with the list

I'll extend your question to "the best way to keep a select result" (with or without an array).
It depends on:
how many results there are
how many fields there are
what you want to do with them afterwards
whether you want to modify them and write them back to your database
So, several suggestions:
just arrays: String[] fields1; String[] fields2; ...
an array of arrays: String[][]
better, collections: Vector, List or Set (do you want them sorted? how do you pick elements afterwards?), or Map (if you want to keep index => data)
or objects of a class you create yourself. For this, there are even tools that map objects to the database.
You should take a look at these options and choose based on what you want to do.
Hope it helps.

Hibernate : getting multiple table columns results from multiple tables

I am using Hibernate 4 and Spring 3.
I have 5 tables, and each table is mapped to one entity class. If I have to select columns from one table, I do the following:
String hql = "from Employee E";
Query query = session.createQuery(hql);
List results = query.list();
The values in this result will be of type EmployeeEntity.
Or I can use Criteria as well.
Now my requirement is to get results from all 5 tables, 1-2 columns from each table.
Earlier it was one table, so I was getting one entity; now I am getting results from 5 tables, so how do I map them to an entity?
List results1 = query.list(); // assuming a select that returns 6 columns from different tables
Now, how do I iterate over results1?
I hope you understand my question.

You can use a ResultTransformer on the Query:
Say you have 4 columns from different tables, like tab1col, tab2col, tab3col, tab4col.
Create a POJO as follows:
class MyClass
{
private Integer tab1col;
private Integer tab2col;
private Integer tab3col;
private Integer tab4col;
// getters and setters
}
You can transform your result set in the following way:
List<MyClass> myClassList = query.setResultTransformer(Transformers.aliasToBean(MyClass.class)).list();
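Note: the query should contain an explicit select list (its result set is something like a cursor in Oracle), and each selected column needs an alias that matches a property name of MyClass.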
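For comparison, when no transformer is set, a Hibernate query that selects several scalar columns returns each row as an Object[] in select order, so the untyped list can be iterated and cast by position. The sketch below shows that by-hand mapping on an in-memory list standing in for query.list(); RowView, its two fields and the RowMapping helper are hypothetical examples, not part of the mapping described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for two columns selected from two different tables.
final class RowView {
    final Integer employeeId;
    final String departmentName;

    RowView(Integer employeeId, String departmentName) {
        this.employeeId = employeeId;
        this.departmentName = departmentName;
    }
}

final class RowMapping {

    // Converts rows of the form {employeeId, departmentName} into RowView
    // objects; the positions must match the order of the select list.
    static List<RowView> toViews(List<Object[]> rows) {
        List<RowView> views = new ArrayList<>();
        for (Object[] row : rows) {
            views.add(new RowView((Integer) row[0], (String) row[1]));
        }
        return views;
    }

    public static void main(String[] args) {
        // Stand-in for the untyped result of query.list().
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1, "Sales"});
        rows.add(new Object[] {2, "IT"});
        for (RowView view : toViews(rows)) {
            System.out.println(view.employeeId + " " + view.departmentName);
        }
    }
}
```

The transformer approach avoids these positional casts, which is its main advantage once more than a couple of columns are involved.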
