I am having a hard time finding a sample program that passes a batch statement to the execute method of org.springframework.data.cassandra.core.CassandraTemplate.
Basically, I am trying to do multiple inserts as a batch.
CqlTemplate cqltemplate = new CqlTemplate(session);
cqltemplate.execute(Batch arg0);
How does it all come together? Also, batch has issues when inserting multiple records into an arbitrary table (one not linked to an entity class). My project requires a method that does multiple inserts for a given table from a hash map of keys and values (row data), with no equivalent POJO class. Any suggestions on how to achieve this?
I was able to work it out. Thank you for the guidance. Sorry for posting it late:
Insert insert1 = QueryBuilder.insert...
Batch batch = QueryBuilder.batch(insert1);
Insert insert2 = QueryBuilder.insert...
batch.add(insert2);
CassandraOperations cassandraOperations = new CassandraTemplate(session);
WriteOptions options = new WriteOptions();
options.setTtl(60);
options.setConsistencyLevel(ConsistencyLevel.ONE);
options.setRetryPolicy(RetryPolicy.DOWNGRADING_CONSISTENCY);
cassandraOperations.execute(batch.toString(), options);
According to the reference, you need to create an object of class com.datastax.driver.core.querybuilder.Batch.
You can create one with the batch method of com.datastax.driver.core.querybuilder.QueryBuilder. CqlTemplate should not be created in the service code; it should be created and injected in the configuration:
CqlTemplate cqlTemplate = new CqlTemplate();
yourServiceBean.setCQLTemplate(cqlTemplate);
And in your service/dao it would be something like:
Batch batch = QueryBuilder.batch(...);
cqlTemplate.execute(batch);
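For the part of the question about rows that arrive as a map with no POJO, one possible approach is to assemble the CQL text yourself and hand it to execute, much like batch.toString() is passed to execute above. The sketch below uses only the JDK. The class and method names (CqlBatchSketch, insertCql, batchCql, bindValues) are made up for illustration and are not part of Spring Data Cassandra or the DataStax driver. It emits bind markers rather than inlined values, so how the values get bound depends on the template or driver version in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds batch CQL text from a table name and row maps.
class CqlBatchSketch {

    /** Builds "INSERT INTO table (a, b) VALUES (?, ?)" for one row. */
    static String insertCql(String table, Map<String, Object> row) {
        StringJoiner columns = new StringJoiner(", ", "(", ")");
        StringJoiner markers = new StringJoiner(", ", "(", ")");
        for (String column : row.keySet()) {
            columns.add(requireIdentifier(column));
            markers.add("?");
        }
        return "INSERT INTO " + requireIdentifier(table) + " " + columns + " VALUES " + markers;
    }

    /** Wraps one INSERT per row in BEGIN BATCH ... APPLY BATCH. */
    static String batchCql(String table, List<Map<String, Object>> rows) {
        StringBuilder cql = new StringBuilder("BEGIN BATCH\n");
        for (Map<String, Object> row : rows) {
            cql.append("  ").append(insertCql(table, row)).append(";\n");
        }
        return cql.append("APPLY BATCH;").toString();
    }

    /** Collects the values in the same order as the bind markers were emitted. */
    static List<Object> bindValues(List<Map<String, Object>> rows) {
        List<Object> values = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            // keySet() and values() iterate an unmodified map in the same order.
            values.addAll(row.values());
        }
        return values;
    }

    /** Rejects names that are not plain identifiers, since they are concatenated into CQL. */
    static String requireIdentifier(String name) {
        if (!name.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a plain identifier: " + name);
        }
        return name;
    }
}
```

Table and column names are validated because they are concatenated into the statement; the values themselves never touch the string.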
I am working on a monitoring tool developed in Spring Boot using Hibernate as ORM.
I need to check each row in my table (already persisted rows of sent messages) and see whether a MailId (unique) has received feedback (status: OPENED, BOUNCED, DELIVERED...) or not.
I get the feedback by reading CSV files from a network folder. Parsing and reading the files is very fast, but updating my database is very slow. My algorithm is not very efficient because I loop through a list that can hold hundreds of thousands of objects and look each one up in my table.
This is the method that updates my table by updating the "target" object (a row in the database table):
@Override
public void updateTargetObjectFoo() throws CSVProcessingException, FileNotFoundException {
// Here I make a call to performProcessing method which reads files on a folder and parse them to JavaObjects and I map them in a feedBackList of type Foo
List<Foo> feedBackList = performProcessing(env.getProperty("foo_in"), EXPECTED_HEADER_FIELDS_STATUS, Foo.class, ".LETTERS.STATUS.");
for (Foo foo: feedBackList) {
//findByKey does a simple Select in mySql where MailId = foo.getMailId()
Foo persistedFoo = fooDao.findByKey(foo.getMailId());
if (persistedFoo != null) {
persistedFoo.setStatus(foo.getStatus());
persistedFoo.setDnsCode(foo.getDnsCode());
persistedFoo.setReturnDate(foo.getReturnDate());
persistedFoo.setReturnTime(foo.getReturnTime());
//The save account here does an MySql UPDATE on the table
fooDao.saveAccount(foo);
}
}
}
What if I did the selection/comparison and the update on the Java side, and then wrote the whole list back to the database in one go?
Would that be faster?
Thanks to all for your help.
Hibernate is not particularly well-suited for batch processing.
You may be better off using Spring's JdbcTemplate to do jdbc batch processing.
However, if you must do this via Hibernate, this may help: https://docs.jboss.org/hibernate/orm/5.2/userguide/html_single/chapters/batch/Batching.html
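As a rough illustration of the JDBC batch idea, here is a sketch using plain java.sql, which is what JdbcTemplate wraps. It assumes a hypothetical table foo with columns status, dns_code and mail_id, and rows passed as String[] {status, dnsCode, mailId}; none of these names come from the question. It issues the UPDATE directly by mail_id instead of doing a SELECT first, so feedback for an unknown MailId simply updates zero rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: batched UPDATEs instead of one SELECT plus one UPDATE per row.
class FeedbackBatchUpdater {

    // Table and column names are assumptions for illustration only.
    static final String UPDATE_SQL =
            "UPDATE foo SET status = ?, dns_code = ? WHERE mail_id = ?";

    /** Splits a list into consecutive chunks of at most 'size' elements. */
    static <T> List<List<T>> chunks(List<T> items, int size) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            out.add(items.subList(i, Math.min(i + size, items.size())));
        }
        return out;
    }

    /** Sends the updates to the database one chunk at a time. */
    static void updateAll(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
            for (List<String[]> chunk : chunks(rows, batchSize)) {
                for (String[] row : chunk) {
                    ps.setString(1, row[0]); // status
                    ps.setString(2, row[1]); // dnsCode
                    ps.setString(3, row[2]); // mailId
                    ps.addBatch();
                }
                ps.executeBatch(); // one submission per chunk rather than per row
            }
        }
    }
}
```

Whether a batch really travels as a single network round trip depends on the JDBC driver and its settings, so it is worth measuring with a few batch sizes.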
I've written a small program that works as follows:
1. Read an input file.
2. Read the first line and check whether it has the value AAA at a certain position.
3. If it satisfies the condition, call the method insertAAAmethod and load the data into the Oracle table.
4. Read the next line; if it is record BBB, call insertBBBrmethod, which has a different insert query.
The problem is that I have 15 different record types in the input file, so I have 15 different methods like insertAAAmethod, each with a different insert query:
public static void insertAAARecord() throws Exception {
String sqlQuery = "insert into my_table(ColumnA,ColumnB,ColumnC,ColumnD,ColumnE,ColumnF,ColumnG)"
+ "values (?,?,?,?,?,?,?)";
try {
pstmt = conn.prepareStatement(sqlQuery);
pstmt.setString(1, "AAA");
pstmt.setDate(2,
StringtoDate("AAA", CurrentLine.substring(150, 158)));
pstmt.setDate(3,
StringtoDate("AAA", CurrentLine.substring(158, 166)));
pstmt.setString(4, CurrentLine.substring(24, 34));
pstmt.setString(5, CurrentLine.substring(37, 45));
pstmt.setString(6, CurrentLine.substring(147, 150));
pstmt.setDate(7, headerDate);
pstmt.executeUpdate();
} catch (SQLException e) {
e.printStackTrace(System.out);
} finally {
pstmt.close();
}
}
Is it possible to keep the SQL queries out of the Java code (in a properties file, for example)?
Note: the insert query depends on the record type. An 'AAA' record uses one insert query, and a different record type needs a different one.
Let me know how I can optimize my code.
Yes, you should create a properties file with all your queries, load it on startup into a Properties object, and then use that object to look the queries up. You can also use Spring to inject the queries into your configuration objects.
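A minimal sketch of that lookup, assuming the file uses keys of the form insert.<recordType> (a naming convention invented here, as is the QueryCatalog class):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for SQL statements loaded from a properties source.
class QueryCatalog {
    private final Properties queries = new Properties();

    /** Loads all queries once, e.g. from a FileReader opened at startup. */
    QueryCatalog(Reader source) {
        try {
            queries.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the insert statement for a record type such as "AAA". */
    String insertFor(String recordType) {
        String sql = queries.getProperty("insert." + recordType);
        if (sql == null) {
            throw new IllegalArgumentException(
                    "No query configured for record type " + recordType);
        }
        return sql;
    }
}
```

The properties file would then contain one line per record type, for example insert.AAA=insert into my_table(ColumnA,ColumnB,ColumnC,ColumnD,ColumnE,ColumnF,ColumnG) values (?,?,?,?,?,?,?), and the insert method would call conn.prepareStatement(catalog.insertFor("AAA")).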
I have always built a DB object that connects the code to the DB data and keeps the queries there; it is much easier to debug and manage because everything you need is in one place.
Make your life simple: avoid ORMs and tune your SQL queries (or let the DBAs do that; it is what they do and they are good at it). However, if you do not like SQL or do not care how efficient it is, then an ORM like Hibernate may be what you need.
Maybe this will take more time than other approaches, but it would help you with your code optimization.
Every query/process has common attributes:
SQL Query
String pattern
Parameters
and each parameter has
Position
Datatype
etc
If you're already using Spring, you can define these as beans in your config.xml.
If not, you can use XML configuration anyway (instead of properties files).
Then, you will have to create some class to parse these beans and create the custom queries.
Hope it helps.
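To make the "query plus parameter descriptions" idea above concrete, here is one way it could look in plain Java, independent of how the definitions are loaded (Spring beans, XML or properties). RecordSpec, Field and Type are hypothetical names, and the sketch assumes dates appear in the line as eight digits in yyyyMMdd form, which the question does not state.

```java
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of one record type: its SQL and where each parameter comes from.
class RecordSpec {
    enum Type { STRING, DATE }

    /** One bind parameter: its slice of the input line and how to convert it. */
    static final class Field {
        final int start;
        final int end;
        final Type type;

        Field(int start, int end, Type type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    final String sql;
    final List<Field> fields;

    RecordSpec(String sql, List<Field> fields) {
        this.sql = sql;
        this.fields = fields;
    }

    /** Slices the line into values, in parameter order. */
    List<Object> extract(String line) {
        List<Object> values = new ArrayList<>();
        for (Field field : fields) {
            String raw = line.substring(field.start, field.end);
            if (field.type == Type.DATE) {
                // Assumption: dates are encoded as yyyyMMdd.
                values.add(Date.valueOf(LocalDate.parse(raw, DateTimeFormatter.BASIC_ISO_DATE)));
            } else {
                values.add(raw);
            }
        }
        return values;
    }

    /** Binds the extracted values to JDBC parameters 1..n. */
    void bind(PreparedStatement stmt, String line) throws SQLException {
        List<Object> values = extract(line);
        for (int i = 0; i < values.size(); i++) {
            stmt.setObject(i + 1, values.get(i));
        }
    }
}
```

With one RecordSpec per record type kept in a map keyed by "AAA", "BBB" and so on, the 15 near-identical insert methods could collapse into a single method that looks up the spec, prepares spec.sql and calls spec.bind.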
Try using an ORM framework such as Hibernate/JPA or iBATIS.
You want to use JPA. I would recommend walking through a simple JPA tutorial such as the following: http://glassfish.java.net/javaee5/persistence/persistence-example.html
That tutorial will demonstrate the basics of how to create or modify objects and persist those changes in a database. Good luck.
Is it possible to get all process or task variables using TaskService:
processEngine.getTaskService().createTaskQuery().list();
I know there is an opportunity to get variables via
processEngine.getTaskService().getVariable()
or
processEngine.getRuntimeService().getVariable()
but each of the operations above goes to the database. If I have a list of 100 tasks, I'll make 100 queries to the DB. I don't want to use this approach.
Is there any other way to get task or process related variables?
Unfortunately, there is no way to do that via the "official" query API! However, what you could do is write a custom MyBatis query as described here:
https://app.camunda.com/confluence/display/foxUserGuide/Performance+Tuning+with+custom+Queries
(Note: Everything described in the article also works for bare Activiti, you do not need the fox engine for that!)
This way you could write a query which selects tasks along with the variables in one step. At my company we used this solution as we had the exact same performance problem.
A drawback of this solution is that custom queries need to be maintained. For instance, if you upgrade your Activiti version, you will need to ensure that your custom query still fits the database schema (e.g., via integration tests).
If it is not possible to use the API, as elsvene says, you can query the database yourself. Activiti keeps several tables in the database.
You have act_ru_variable, where the currently running processes store their variables. For processes that have already finished, you have act_hi_procvariable. You can probably find a detailed explanation of what each table holds in the Activiti user guide.
So you just need to make queries like
SELECT *
FROM act_ru_variable
WHERE *Something*
The following test sends a value object (Person) to a process that just adds some tracking info for demonstration.
I had the same problem: I needed to get the value object after the service had executed in order to do some validation in my test.
The following piece of code shows the execution and the retrieval of the task variable after the execution has finished.
@Test
public void justATest() {
Map<String, Object> inVariables = new HashMap<String, Object>();
Person person = new Person();
person.setName("Jens");
inVariables.put("person", person);
ProcessInstance processInstance = runtimeService.startProcessInstanceByKey("event01", inVariables);
String processDefinitionId = processInstance.getProcessDefinitionId();
String id = processInstance.getId();
System.out.println("id " + id + " " + processDefinitionId);
List<HistoricVariableInstance> outVariables =
historyService.createHistoricVariableInstanceQuery().processInstanceId(id).list();
for (HistoricVariableInstance historicVariableInstance : outVariables) {
String variableName = historicVariableInstance.getVariableName();
System.out.println(variableName);
Person person1 = (Person) historicVariableInstance.getValue();
System.out.println(person1.toString());
}
}
I am performing a call to a function which is part of a DB package. This package is deployed in two locations. One local and another remote (across the Atlantic).
I am retrieving the data via the Spring JDBC template.
There is one function which returns approximately 1000 rows (not all that much) and this is taking about 1.5 seconds when getting the data locally but it's taking in the region of 12 seconds when getting the data remotely.
In all sample code, names have been changed and code has been simplified a little.
Please see an example of the current Java code:
SimpleJdbcCall simpleJdbcCall = new SimpleJdbcCall(getDataSource())
.withSchemaName(MY_SCHEMA_NAME)
.withCatalogName("REFCURSOR_PKG")
.withFunctionName("GET_DATA")
.returningResultSet("RESULT_SET", new DataEntryMapper());
SqlParameterSource params = new MapSqlParameterSource()
.addValue("the_name", name)
.addValue("the_rev", rev);
Map resultSet = simpleJdbcCall.execute(params);
ArrayList list = (ArrayList) resultSet.get("RESULT_SET");
The RowMapper class looks something like this:
class DataEntryMapper implements RowMapper<DataEntry> {
// Maps one row of the ref cursor to a DataEntry
public DataEntry mapRow(ResultSet resultSet, int rowNum) throws SQLException {
return new DataEntry(resultSet.getString("name"),
Integer.parseInt(resultSet.getString("rev")));
}
}
SQL package spec snippet:
TYPE REF_CURSOR IS REF CURSOR;
SQL function:
FUNCTION GET_ROUTE_DATA(the_name VARCHAR2, the_rev VARCHAR2) RETURN REF_CURSOR AS
RESULT_SET REF_CURSOR;
BEGIN
OPEN RESULT_SET FOR
select *
from table_name tn
where tn.name = the_name
and tn.rev = the_rev;
RETURN RESULT_SET;
CLOSE RESULT_SET;
EXCEPTION WHEN OTHERS THEN
RAISE;
END GET_ROUTE_DATA;
I have tried using regular boiler plate JDBC also (create connection, prepare statement, execute statement, retrieve data from RESULT_SET, etc) and I found that the vast majority of time was spent looping over the RESULT_SET and extracting the data out of it and into some pojos. In the case of the Spring code above, most of the time was spent during the execute() method but this is probably because it creates the objects using the RowMapper at that time.
So, the common area between them is the performing of actions such as:
rs.getString("name")
and I'm guessing that this is where the problem lies but I could be wrong.
As I said, locally the delay is fine but remotely it's taking way too long. Is this because it's going to the DB on every rs.get... ? Is there a better way to do this?
Thanks in advance.
rs.getString("name")
ResultSet.get*(String columnName) can be replaced with ResultSet.get*(int columnNumber), which is slightly faster, but I doubt that is the main problem here.
Is this because it's going to the DB on every rs.get... ?
While it really depends on the driver, I suspect it won't. For a cached result set it might go to the server when you scroll through the cursor, but it would still fetch a bunch of rows in every round trip.
Two more suggestions I have are:
Use a network sniffing utility to see the data being transferred
Check your driver for any option to pre-fetch and such like.
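One way to reason about the remote slowdown is to count round trips. If the driver pulls rows in small chunks (Oracle's JDBC driver uses a default row prefetch of 10, as far as I know), 1000 rows need about 100 round trips, and at an assumed 100 ms of transatlantic latency that alone is roughly 10 seconds. The sketch below shows that arithmetic and how a fetch size can be set with plain JDBC. FetchSizeSketch and its methods are illustrative names, the sql argument stands for a hypothetical plain SELECT, and for the ref-cursor function in the question the fetch size may need to be set on the returned ResultSet instead.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: estimating round trips and raising the JDBC fetch size.
class FetchSizeSketch {

    /** Number of fetches needed to pull 'rows' rows, 'fetchSize' rows at a time. */
    static int roundTrips(int rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    /** Reads the first column of every row, hinting a larger fetch size to the driver. */
    static List<String> readNames(Connection conn, String sql, int fetchSize)
            throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setFetchSize(fetchSize); // a hint; the driver may ignore or cap it
            try (ResultSet rs = stmt.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {
                    names.add(rs.getString(1)); // positional access, as suggested above
                }
                return names;
            }
        }
    }
}
```

With these assumptions, roundTrips(1000, 10) is 100 while roundTrips(1000, 500) is 2, which is the kind of difference that would be invisible locally and very visible across the Atlantic.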
Add this line:
.withoutProcedureColumnMetaDataAccess()
to the following code:
SimpleJdbcCall simpleJdbcCall = new SimpleJdbcCall(getDataSource())
.withSchemaName(MY_SCHEMA_NAME)
.withCatalogName("REFCURSOR_PKG")
.withFunctionName("GET_DATA")
.withoutProcedureColumnMetaDataAccess() // to avoid fetching metadata from the database
I am new to JPA and have been facing this issue for the past two days. Whenever I try to update my object in the database, the merge query executes twice and the data is not updated in the database. Can anyone tell me where I have made a mistake?
Here is the snippet:
Employee emp = em.find(Employee.class,empid);
if (emp != null) {
emp.setDescription("Success");
emp.setDob(new Timestamp(new Date().getTime()));
etxn = em.getTransaction();
etxn.begin();
em.merge(emp);
System.out.println(em.merge(emp));
etxn.commit();
}
That's because you are calling the merge method twice.
Since you are using the same EntityManager and JPA transactions, you do not even need to call merge.
Perhaps enable logging and include the log. Also include the code for your class.