I want to do a batch insert into Postgres using jOOQ:
List<MyTableRecord> records = new ArrayList<>();

for (Dto dto : dtos) {
    Field<Long> sequenceId = SEQUENCE.nextval();
    Long id = using(ctx).select(sequenceId).fetchOne(sequenceId);
    records.add(mapToRecord(dto, id));
}

using(ctx).batchInsert(records).execute();
The problem is that I am fetching the next sequence value with a separate round trip for each row.
For a simple insert, I can use a Field in the statement like this:
create.insertInto(MY_TABLE, MY_TABLE.ID, MY_TABLE.VALUE)
      .values(SEQUENCE.nextval(), val("William"))
      .execute();
How can I do so with batch insert?
Pre-fetch all the sequence values
You could pre-fetch all the sequence values you need using this:
Field<Long> sequenceId = SEQUENCE.nextval();

List<Long> ids = using(ctx)
    .select(sequenceId)
    .from(generateSeries(1, dtos.size()))
    .fetch(sequenceId);

for (int i = 0; i < dtos.size(); i++)
    records.add(mapToRecord(dtos.get(i), ids.get(i)));

using(ctx).batchInsert(records).execute();
This seems like a useful feature to have out of the box, in an RDBMS agnostic way via using(ctx).nextvals(SEQUENCE, dtos.size()). We'll consider this for a future jOOQ version: https://github.com/jOOQ/jOOQ/issues/10658
Don't use records
An alternative is to batch actual INSERT statements instead of Record.insert() calls via batchInsert(). That way, you can put the SEQUENCE.nextval() expression in the statement. See: https://www.jooq.org/doc/latest/manual/sql-execution/batch-execution/
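For example, a minimal sketch of that approach could look like this (MY_TABLE, its ID and VALUE columns, and dto.getValue() are assumptions standing in for the generated table behind MyTableRecord and your DTO):
List<Query> queries = new ArrayList<>();

for (Dto dto : dtos) {
    // The sequence expression lives inside each statement, so there is no extra round trip per row
    queries.add(using(ctx)
        .insertInto(MY_TABLE, MY_TABLE.ID, MY_TABLE.VALUE)
        .values(SEQUENCE.nextval(), val(dto.getValue())));
}

using(ctx).batch(queries).execute();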
Related
I want to insert multiple records at a time and get the id of each record, which is auto-incremented. I am doing it in the following way, but I am getting the number of updated rows instead of the generated keys (the id in this case).
public int[] addPersons(List<Person> persons)
{
    SqlParameterSource[] records = new BeanPropertySqlParameterSource[persons.size()];
    int i = 0;
    for (Person person : persons)
    {
        records[i] = new BeanPropertySqlParameterSource(person);
        i++;
    }

    SimpleJdbcInsert insertPerson = new SimpleJdbcInsert(dsource)
            .withTableName("PersonTable")
            .usingGeneratedKeyColumns("id");
    int[] ids = insertPerson.executeBatch(records);
    return ids;
}
Here Person is the bean.
So how can I get the auto-generated key (the id) for the records added?
Spring JDBC does not allow you to retrieve the generated keys when you invoke the executeBatch method. This is because internally it invokes the executeBatch() method of java.sql.PreparedStatement, which only returns the count of affected rows. An alternative approach is to execute the insert statement multiple times using the executeAndReturnKey method.
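A minimal sketch of that workaround, reusing the names from the question (it runs one INSERT per record, so it is slower than a real batch, but it does return the generated keys):
public List<Number> addPersons(List<Person> persons)
{
    SimpleJdbcInsert insertPerson = new SimpleJdbcInsert(dsource)
            .withTableName("PersonTable")
            .usingGeneratedKeyColumns("id");

    List<Number> ids = new ArrayList<>();
    for (Person person : persons)
    {
        // executeAndReturnKey returns the generated key for this single insert
        ids.add(insertPerson.executeAndReturnKey(new BeanPropertySqlParameterSource(person)));
    }
    return ids;
}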
I have seen that jOOQ can automatically return a POJO when we use .selectFrom(TABLE) or .fetchInto(POJO.class).
But is it possible to convert the result of a complex query into multiple POJOs?
Example:
This query returns all columns from the SUPPORT and BOX tables. Is it possible to convert the result into a Support and a Box POJO?
Result<Record> results = query.select()
    .from(BOX)
    .join(SUPPORT)
    .on(SUPPORT.ID.equal(BOX.SUPPORT_ID))
    .where(SUPPORT.ID.equal("XXXX"))
    .orderBy(BOX.ID)
    .fetch();
I have tested the method .intoGroups(SUPPORT.ID, Box.class) and it works fine, but I don't get the Support object.
Instantiate to SelectSeekStep1
With aliases it's more convenient:
Box b = BOX.as("b");
Support s = SUPPORT.as("s");

SelectSeekStep1<Record2<Integer, Integer>, Integer> sql = query.select(b.ID, s.ID /* other columns */)
    .from(b)
    .join(s)
    .on(s.ID.eq(b.SUPPORT_ID))
    .where(s.ID.eq("XXXX"))
    .orderBy(b.ID);
Then just fetch what/as you need:
List<BoxRecord> boxes = sql.fetchInto(BOX);
SupportRecord support = sql.limit(1).fetchOneInto(SUPPORT);
For future readers, if you want to achieve the same behaviour with insert methods you should use:
insertInto(BOX)
    .set(BOX.COLUMN1, UInteger.valueOf(1))
    .set(BOX.COLUMN2, "test")
    .returning()
    .fetchOne()
    .into(<POJO_class>.class);
I want to copy data from one HBase table to another using the Java APIs, but I am not able to find one. Is there any Java API to do this?
Thanks.
The following is by far not the most optimized way, but from the tone of the question it seems performance is not the critical factor here.
First, you need to set up your HBaseConfiguration and your input / output tables:
Configuration config = HBaseConfiguration.create();
HTable inputTable = new HTable(config, "input_table");
HTable outputTable = new HTable(config, "output_table");
What you want is a "Scan", which allows a range scan to be performed. You need to define the query parameters, by adding columns to a Scan object.
Scan scan = new Scan(Bytes.toBytes("smith-"));
scan.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("givenName"));
scan.addColumn(Bytes.toBytes("contactinfo"), Bytes.toBytes("email"));
scan.setFilter(new PageFilter(25));
Now you are ready to invoke the scan object and retrieve results:
ResultScanner scanner = inputTable.getScanner(scan);
for (Result result : scanner) {
    putToOutputTable(result);
}
Now, to save to the second table, you will either do Puts within the for loop, or aggregate the results into a List/Array or similar for a bulk put (see the sketch after the method below).
protected void putToOutputTable(Result result) throws IOException {
    // Retrieve the map of families to their most recent qualifiers and values.
    NavigableMap<byte[], NavigableMap<byte[], byte[]>> map = result.getNoVersionMap();

    // Reuse the source row key for the target row.
    Put p = new Put(result.getRow());

    // Iterate through the family/qualifier/value entries for this result.
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family : map.entrySet()) {
        byte[] colFamily = family.getKey();
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            // The column family must already exist in the target table's schema;
            // the qualifier can be anything. Everything is specified as byte arrays,
            // because HBase is all about byte arrays.
            p.add(colFamily, column.getKey(), column.getValue());
        }
    }

    outputTable.put(p);
}
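If you prefer the bulk variant mentioned above, a rough sketch (assuming the same scanner and outputTable as before) is to collect the Puts and write them in a single call:
List<Put> puts = new ArrayList<>();
for (Result result : scanner) {
    Put p = new Put(result.getRow());
    for (Cell cell : result.rawCells()) {
        p.add(cell); // keep the family, qualifier and value from the source cell
    }
    puts.add(p);
}
outputTable.put(puts);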
If instead you want a more scalable version, take a look at how to use MapReduce to read from input HDFS files / write to output HBase tables here: HBase Map/Reduce
I am building a shopping cart using JSP and Hibernate.
I am filtering the content by brand, size and price using checkboxes.
The checked checkboxes are returned to the class where the HQL query lives, so I want a single HQL query that can handle this.
For example, if one of the parameters such as size is empty (meaning the user doesn't use it to filter the content), then an empty string is passed to the HQL query, which should then match any value.
Is there any way to make the where clause match all values when an empty string is passed, or some other alternative, apart from coding different methods for different parameters?
I typically use the Criteria API for things like this: if the user does not specify a size, do not add it to the criteria query.
Criteria criteria = session.createCriteria(MyClass.class);
if (size != null && !size.isEmpty()) {
    criteria.add(Restrictions.eq("size", size));
}
To have multiple restrictions via an OR statement, you use Disjunction. For an AND, you use Conjunction.
Criteria criteria = session.createCriteria(MyClass.class);
Disjunction sizeDisjunction = Restrictions.disjunction();

String[] sizes = { "small", "medium", "large" };
for (int i = 0; i < sizes.length; i++) {
    sizeDisjunction.add(Restrictions.eq("size", sizes[i]));
}
criteria.add(sizeDisjunction);
First, good practice says that instead of passing an empty String to the query, you should pass null instead. That said, this HQL should help you:
from Product p
where p.brand = coalesce(:brand, p.brand)
and p.size = coalesce(:size, p.size)
and p.price = coalesce (:price, p.price)
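A minimal sketch of binding the (possibly null) filters for that query; the Product entity name and the conversion of empty strings to null are assumptions based on the HQL above:
Query<Product> query = session.createQuery(
      "from Product p "
    + "where p.brand = coalesce(:brand, p.brand) "
    + "and p.size = coalesce(:size, p.size) "
    + "and p.price = coalesce(:price, p.price)", Product.class);

// Convert the empty strings coming from unchecked filters into nulls
query.setParameter("brand", brand == null || brand.isEmpty() ? null : brand);
query.setParameter("size", size == null || size.isEmpty() ? null : size);
query.setParameter("price", price); // already null when the user did not filter by price

List<Product> products = query.list();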
Friends!
I am using MongoDB in a Java project via Spring Data. I use Repository interfaces to access data in collections. For some processing I need to iterate over all elements of a collection. I can use the fetchAll method of the repository, but it always returns an ArrayList.
However, one of the collections is expected to be large: up to 1 million records of at least several kilobytes each. I suppose I should not use fetchAll in such cases, but I could find neither a convenient method returning some iterator (which might allow the collection to be fetched partially), nor convenient methods with callbacks.
I've only seen support for retrieving such collections in pages. I wonder whether that is the only way to work with such collections?
Late response, but maybe it will help someone in the future. Spring Data doesn't provide any API to wrap MongoDB cursor capabilities. It uses cursors internally within the find methods, but always returns a completed list of objects. Your options are to use the Mongo API directly or to use the Spring Data Paging API, something like this:
final int pageLimit = 300;
int pageNumber = 0;
Page<T> page = repository.findAll(new PageRequest(pageNumber, pageLimit));
while (page.hasNextPage()) {
    processPageContent(page.getContent());
    page = repository.findAll(new PageRequest(++pageNumber, pageLimit));
}
// process last page
processPageContent(page.getContent());
UPD (!) This method is not sufficient for large data sets (see Shawn Bush's comments). Please use the Mongo API directly for such cases.
Since this question got bumped recently, this answer needs some more love!
If you use Spring Data Repository interfaces, you can declare a custom method that returns a Stream, and it will be implemented by Spring Data using cursors:
import java.util.stream.Stream;

public interface AlarmRepository extends CrudRepository<Alarm, String> {

    Stream<Alarm> findAllBy();
}
So for large amounts of data you can stream them and process the documents one by one without hitting memory limits.
See https://docs.spring.io/spring-data/mongodb/docs/current/reference/html/#mongodb.repositories.queries
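A usage sketch, assuming the AlarmRepository above (process(...) stands in for your own handling); the stream is backed by a cursor, so consume it in a try-with-resources block so that the cursor is closed:
try (Stream<Alarm> alarms = alarmRepository.findAllBy()) {
    alarms.forEach(alarm -> process(alarm)); // one document at a time, no full list in memory
}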
You can still use mongoTemplate to access the collection and simply use a DBCursor:
DBCollection collection = mongoTemplate.getCollection("boundary");
DBCursor cursor = collection.find();

while (cursor.hasNext()) {
    DBObject obj = cursor.next();
    Object object = obj.get("polygons");
    ...
}
Use MongoTemplate::stream() as probably the most appropriate Java wrapper for DBCursor.
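For example (a hedged sketch: the Alarm class and the criteria are illustrative assumptions, and in recent Spring Data MongoDB versions stream() returns a cursor-backed java.util.stream.Stream that should be closed after use):
Query query = new Query(Criteria.where("status").is("OPEN"));
try (Stream<Alarm> alarms = mongoTemplate.stream(query, Alarm.class)) {
    alarms.forEach(alarm -> process(alarm)); // documents are pulled from the cursor lazily
}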
Another way:
int pageNumber = 0;
Page<T> page;
do {
    page = repository.findAll(new PageRequest(pageNumber, pageLimit));
    processPageContent(page.getContent());
    pageNumber++;
} while (!page.isLastPage());
Check the new method for handling results on a per-document basis:
http://docs.spring.io/spring-data/mongodb/docs/current/api/org/springframework/data/mongodb/core/MongoTemplate.html#executeQuery-org.springframework.data.mongodb.core.query.Query-java.lang.String-org.springframework.data.mongodb.core.DocumentCallbackHandler-
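A minimal sketch of that callback-based API; the collection name and the field are assumptions. Each matching document is handed to the handler one at a time, so nothing is accumulated in memory:
mongoTemplate.executeQuery(new Query(), "boundary", new DocumentCallbackHandler() {
    @Override
    public void processDocument(Document document) {
        // process a single document here
        System.out.println(document.get("polygons"));
    }
});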
You may want to try the DBCursor way like this:
DBObject query = new BasicDBObject(); //setup the query criteria
query.put("method", method);
query.put("ctime", (new BasicDBObject("$gte", bTime)).append("$lt", eTime));
logger.debug("query: {}", query);
DBObject fields = new BasicDBObject(); //only get the needed fields.
fields.put("_id", 0);
fields.put("uId", 1);
fields.put("ctime", 1);
DBCursor dbCursor = mongoTemplate.getCollection("collectionName").find(query, fields);
while (dbCursor.hasNext()) {
    DBObject object = dbCursor.next();
    logger.debug("object: {}", object);
    // do something.
}
The best way to iterate over a large collection is to use the Mongo API directly. I used the code below and it worked like a charm for my use case.
I had to iterate over more than 15M records, and the document size was huge for some of them.
The following code is from a Kotlin Spring Boot app (Spring Boot version 2.4.5).
fun getAbcCursor(batchSize: Int, from: Long?, to: Long?): MongoCursor<Document> {
    val collection = xyzMongoTemplate.getCollection("abc")
    val query = Document("field1", "value1")
    if (from != null) {
        val fromDate = Date(from)
        val toDate = if (to != null) { Date(to) } else { Date() }
        query.append(
            "createTime",
            Document("\$gte", fromDate).append("\$lte", toDate)
        )
    }
    return collection.find(query).batchSize(batchSize).iterator()
}
Then, from a service-layer method, you can just keep calling MongoCursor.next() on the returned cursor as long as MongoCursor.hasNext() returns true.
An important observation: please do not miss setting batchSize on the 'FindIterable' (the return type of MongoCollection.find()). If you don't provide the batch size, the cursor will fetch the initial 101 records and will hang after that (it tries to fetch all the remaining records at once).
For my scenario, I used a batch size of 2000, as it gave the best results during testing. The optimal batch size depends on the average size of your records.
Here is the equivalent code in Java (removing createTime from the query, as it is specific to my data model):
MongoCursor<Document> getAbcCursor(int batchSize) {
    MongoCollection<Document> collection = xyzMongoTemplate.getCollection("your_collection_name");
    Document query = new Document("field1", "value1"); // query --> {"field1": "value1"}
    return collection.find(query).batchSize(batchSize).iterator();
}
This answer is based on: https://stackoverflow.com/a/22711715/5622596
That answer needs a bit of an update as PageRequest has changed how it is being constructed.
With that said here is my modified response:
// PageRequest pages are zero-based, so start at 0 to avoid skipping the first page
int pageNumber = 0;
// Change the value to whatever size you want the page to have
int pageLimit = 100;
Page<SomeClass> page;
List<SomeClass> compoundList = new LinkedList<>();

do {
    PageRequest pageRequest = PageRequest.of(pageNumber, pageLimit);
    page = repository.findAll(pageRequest);

    List<SomeClass> listFromPage = page.getContent();
    // Do something with this list, for example:
    compoundList.addAll(listFromPage);

    pageNumber++;
} while (!page.isLast());

// Do something with the compoundList, for example:
return compoundList;