All of my data points are lost when InfluxDB restarts - java

A cron job is being used to fire this script off once a day. When the script runs it seems to work as expected: the code builds a map, iterates over that map, creates points which are added to a batch, and finally writes those batched points to InfluxDB. I can connect to InfluxDB, query my database, and see that the points were added. I am using influxdb-java 2.2.
The issue I am having is that when InfluxDB is restarted, all of my data is removed. The database still exists and the series still exist, however all of the points/rows are gone (each series is empty). My database is not the only one; there are several others, and those databases are restored correctly. My guess is that the transaction is not being finalized. I am not aware of a way to force a flush and ensure that my points are persisted. I tried adding:
influxDB.write(batchPoints);
influxDB.disableBatch(); // calls this.batchProcessor.flush() in InfluxDBImpl.java
This was an attempt to force a flush, but it didn't work as expected. I am using InfluxDB 0.13.x:
InfluxDB influxDB = InfluxDBFactory.connect(host, user, pass);
String dbName = "dataName";
influxDB.createDatabase(dbName);

BatchPoints batchPoints = BatchPoints
        .database(dbName)
        .tag("async", "true")
        .retentionPolicy("default")
        .consistency(ConsistencyLevel.ALL)
        .build();
for (Tags type : Tags.values()) {
    List<LinkedHashMap<String, Object>> myList = this.trendsMap.get(type.getDisplay());
    if (myList != null) {
        for (LinkedHashMap<String, Object> data : myList) {
            long time = (long) data.get("time");
            if (data.get("date").equals(this.sdf.format(new Date()))) {
                time = System.currentTimeMillis();
            }
            Point point = Point.measurement(type.getDisplay())
                    .time(time, TimeUnit.MILLISECONDS)
                    .field("count", data.get("count"))
                    .field("date", data.get("date"))
                    .field("day_of_week", data.get("day_of_week"))
                    .field("day_of_month", data.get("day_of_month"))
                    .build();
            batchPoints.point(point);
        }
    }
}
influxDB.write(batchPoints);

Can you upgrade InfluxDB to a newer release? There have been many important changes since 0.13.x, and it would be best to test against the latest version.
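Independently of the version, it can also help to confirm on the client side that the points really reached the server before the cron JVM exits, so the problem can be narrowed down to the restart itself. A minimal sketch, assuming the same influxDB client and dbName from the question; the measurement name "some_measurement" is only a placeholder:
influxDB.write(batchPoints); // synchronous HTTP call; a rejected write would surface as an exception here

// Hypothetical sanity check: read a count back and log what the server reports,
// so a failed write shows up in the cron job's output rather than only after the next restart.
Query check = new Query("SELECT COUNT(\"count\") FROM \"some_measurement\"", dbName);
QueryResult result = influxDB.query(check);
System.out.println("server-side result after write: " + result);
If the count is correct right after the write but drops to zero after a restart, the data is being lost on the server side rather than in the Java client, which points at the InfluxDB storage or retention configuration rather than at batching in influxdb-java.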

Related

How to test rollback feature in MongoDB transactions

I am a newbie to MongoDB. I implemented a transactional feature in one of my applications; as per my requirements I need to persist data into different collections in the same database. Below is the code snippet for it.
In the Tuple3, the first element is the database, the second element is the collection, and the third element is the data I want to persist, which comes in as a JSON string that I convert to a BSON Document.
ClientSession clientSession = mongoClient.startSession();
try {
    clientSession.startTransaction(transactionOptions);
    for (Tuple3<String, String, String> value : insertValues) {
        MongoCollection<Document> collection = mongoClient
                .getDatabase(value.f0)
                .getCollection(value.f1);
        Document data = Document.parse(value.f2);
        log.info(String.format("Inserting data into database %s and collection %s", value.f0, value.f1));
        collection.insertOne(clientSession, data);
    }
    clientSession.commitTransaction();
} catch (MongoCommandException | MongoWriteException exception) {
    clientSession.abortTransaction();
    log.error(String.format("Exception happened while inserting record into MongoDB, rolling back the transaction; " +
            "cause of exception: %s", exception));
} finally {
    clientSession.close();
}
Below are the transaction options I am using:
TransactionOptions transactionOptions = TransactionOptions.builder().readConcern(ReadConcern.LOCAL).writeConcern(WriteConcern.W1).build();
Below is the MongoClient method with MongoClientOptions; I take the MongoDB connection string as input to this method:
public MongoClient getTransactionConnection(String connectionString) {
    MongoClientOptions.Builder mongoClientOptions = new MongoClientOptions.Builder()
            .readConcern(ReadConcern.LOCAL)
            .writeConcern(WriteConcern.W1)
            .readPreference(ReadPreference.primary())
            .serverSelectionTimeout(120000)
            .maxWaitTime(120000)
            .connectionsPerHost(10)
            .connectTimeout(120000);
    MongoClientURI uri = new MongoClientURI(connectionString, mongoClientOptions);
    return new MongoClient(uri);
}
Up to here it works: it inserts data into three different collections under the specified database. But when I try a negative scenario, throwing an exception inside the try block, I expect the data for that particular client session to be rolled back if any error happens.
I throw the exception using a count variable that is incremented on each iteration; when count equals 1 I throw an exception, which should abort the transaction and roll back any data already written to the database. What I see instead is that it writes to one of the collections, throws the exception, and stops the program, but it does not actually roll back the data that was written. I am trying something like this:
ClientSession clientSession = mongoClient.startSession();
int count = 0;
try {
    clientSession.startTransaction(transactionOptions);
    for (Tuple3<String, String, String> value : insertValues) {
        MongoCollection<Document> collection = mongoClient
                .getDatabase(value.f0)
                .getCollection(value.f1);
        Document data = Document.parse(value.f2);
        log.info(String.format("Inserting data into database %s and collection %s", value.f0, value.f1));
        collection.insertOne(clientSession, data);
        if (count == 1) {
            throw new MongoException("Aborting transaction.....");
        }
        count++;
    }
    clientSession.commitTransaction();
} catch (MongoCommandException | MongoWriteException exception) {
    clientSession.abortTransaction();
    log.error(String.format("Exception happened while inserting record into MongoDB, rolling back the transaction; " +
            "cause of exception: %s", exception));
} finally {
    clientSession.close();
}
I am not sure where I am going wrong. I am using MongoDB version 4.0 deployed using the Azure Cosmos DB API. Please help me resolve this issue; thanks in advance.
Cosmos DB does not have transaction support outside of a single partition (shard) of a single collection. This limitation exists regardless of the API in use (in your case, the MongoDB API), which is why you're not seeing the behavior you expect. Note: this is mentioned in the Cosmos DB MongoDB compatibility docs.
You'll need to come up with your own implementation for managing data consistency within your app.
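If server-side transactions are not available, one common substitute is compensating writes: remember what you inserted and delete it again when a later step fails. A minimal sketch, assuming the same mongoClient, log, and Tuple3 values from the question (and com.mongodb.client.model.Filters plus org.bson.types.ObjectId on the classpath); the inserted list and the ObjectId-based cleanup are illustrative, not part of the original code:
List<Tuple3<String, String, ObjectId>> inserted = new ArrayList<>();
try {
    for (Tuple3<String, String, String> value : insertValues) {
        Document data = Document.parse(value.f2);
        mongoClient.getDatabase(value.f0)
                .getCollection(value.f1)
                .insertOne(data);
        // remember database, collection and generated _id so this insert can be undone later
        inserted.add(new Tuple3<>(value.f0, value.f1, data.getObjectId("_id")));
    }
} catch (MongoException exception) {
    // "roll back" by hand: delete everything written so far, in reverse order
    for (int i = inserted.size() - 1; i >= 0; i--) {
        Tuple3<String, String, ObjectId> entry = inserted.get(i);
        mongoClient.getDatabase(entry.f0)
                .getCollection(entry.f1)
                .deleteOne(Filters.eq("_id", entry.f2));
    }
    log.error("Insert failed, compensating deletes applied", exception);
}
This is not atomic: a crash between an insert and its compensating delete still leaves partial data behind, so it only approximates what a real multi-collection transaction would give you.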

Google Appengine Datastore Timeout Exception

We are fetching the list of namespaces from the Datastore, which runs to about 30k entries.
The cron job that fetches the namespaces runs daily. Some days it works fine, and other days it throws a Datastore timeout exception:
com.google.appengine.api.datastore.DatastoreTimeoutException: The
datastore operation timed out, or the data was temporarily
unavailable.
Related code:
DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
FetchOptions options = FetchOptions.Builder.withChunkSize(150);
Query q = new Query(Entities.NAMESPACE_METADATA_KIND);
for (Entity e : ds.prepare(q).asIterable(options)) {
    // A nonzero numeric id denotes the default namespace;
    // see Namespace Queries, below
    if (e.getKey().getId() != 0) {
        continue;
    } else {
        namespaces.add(e.getKey().getName());
    }
}
What could be the issue?
According to the official documentation:
DatastoreTimeoutException is thrown when a datastore operation times
out. This can happen when you attempt to put, get, or delete too many
entities or an entity with too many properties, or if the datastore is
overloaded or having trouble.
This means the datastore is having trouble with your request. Try to handle that error like this:
import com.google.appengine.api.datastore.DatastoreTimeoutException;
try {
// Code that could result in a timeout
} catch (DatastoreTimeoutException e) {
// Display a timeout-specific error page
}
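Because the exception is transient by nature, a simple mitigation is to retry the namespace scan with a short backoff. A minimal sketch, reusing the ds, q, options, and namespaces variables from the question; the attempt limit and sleep interval are arbitrary choices, not values from the documentation:
int maxAttempts = 3;
for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
        namespaces.clear();
        for (Entity e : ds.prepare(q).asIterable(options)) {
            if (e.getKey().getId() == 0) {   // named (non-default) namespaces only
                namespaces.add(e.getKey().getName());
            }
        }
        break;                               // success, stop retrying
    } catch (DatastoreTimeoutException e) {
        if (attempt == maxAttempts) {
            throw e;                         // give up after the last attempt
        }
        try {
            Thread.sleep(1000L * attempt);   // hypothetical linear backoff
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw e;
        }
    }
}
Another common way to stay under the per-request deadline with roughly 30k entities is to split the scan across several task queue tasks using query cursors, but the retry above is the smallest change to the existing cron handler.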

unexpected multiple execution of mapper intended to run once

I tried to write a very simple job with only 1 mapper and no reducer to write some data to HBase. In the mapper I simply open a connection to HBase, write a few rows of data to a table, and then close the connection. In the job driver I am using JobConf.setNumMapTasks(1); and JobConf.setNumReduceTasks(0); to specify that only 1 mapper and no reducers are to be executed. I am also setting the reducer class to IdentityReducer in the JobConf.
The strange behavior I am observing is that the job successfully writes the data to the HBase table, but after that the logs show it continuously opening and closing a connection to HBase, which goes on for 20-30 minutes, after which the job is declared to have completed with 100% success. At the end, when I check the _success file created by the dummy data I put into OutputCollector.collect(...), I see hundreds of rows of dummy data when there should only be 1.
Following is the code for the job driver:
public int run(String[] arg0) throws Exception {
    Configuration config = HBaseConfiguration.create(getConf());
    ensureRequiredParametersExist(config);
    ensureOptionalParametersExist(config);

    JobConf jobConf = new JobConf(config, getClass());
    jobConf.setJobName(config.get(ETLJobConstants.ETL_JOB_NAME));

    // set map specific configuration
    jobConf.setNumMapTasks(1);
    jobConf.setMaxMapAttempts(1);
    jobConf.setInputFormat(TextInputFormat.class);
    jobConf.setMapperClass(SingletonMapper.class);
    jobConf.setMapOutputKeyClass(LongWritable.class);
    jobConf.setMapOutputValueClass(Text.class);

    // set reducer specific configuration
    jobConf.setReducerClass(IdentityReducer.class);
    jobConf.setOutputKeyClass(LongWritable.class);
    jobConf.setOutputValueClass(Text.class);
    jobConf.setOutputFormat(TextOutputFormat.class);
    jobConf.setNumReduceTasks(0);

    // set job specific configuration details like input file name etc
    FileInputFormat.setInputPaths(jobConf, jobConf.get(ETLJobConstants.ETL_JOB_FILE_INPUT_PATH));
    System.out.println("setting output path to : " + jobConf.get(ETLJobConstants.ETL_JOB_FILE_OUTPUT_PATH));
    FileOutputFormat.setOutputPath(jobConf,
            new Path(jobConf.get(ETLJobConstants.ETL_JOB_FILE_OUTPUT_PATH)));

    JobClient.runJob(jobConf);
    return 0;
}
The driver class extends Configured and implements Tool (I used the sample from the Definitive Guide).
Following is the code in my Mapper's map method, where I simply open the connection to HBase, do a preliminary check to make sure the table exists, then write the rows and close the table:
public void map(LongWritable arg0, Text arg1,
        OutputCollector<LongWritable, Text> arg2, Reporter arg3)
        throws IOException {

    HTable aTable = null;
    HBaseAdmin admin = null;
    try {
        arg3.setStatus("started");
        /*
         * set-up hbase config
         */
        admin = new HBaseAdmin(conf);
        /*
         * open connection to table
         */
        String tableName = conf.get(ETLJobConstants.ETL_JOB_TABLE_NAME);
        HTableDescriptor htd = new HTableDescriptor(toBytes(tableName));
        String colFamilyName = conf.get(ETLJobConstants.ETL_JOB_TABLE_COLUMN_FAMILY_NAME);
        byte[] tablename = htd.getName();
        /* call function to ensure table with 'tablename' exists */

        /*
         * loop and put the file data into the table
         */
        aTable = new HTable(conf, tableName);

        DataRow row = /* logic to generate data */
        while (row != null) {
            byte[] rowKey = toBytes(row.getRowKey());
            Put put = new Put(rowKey);
            for (DataNode node : row.getRowData()) {
                put.add(toBytes(colFamilyName), toBytes(node.getNodeName()),
                        toBytes(node.getNodeValue()));
            }
            aTable.put(put);
            arg3.setStatus("xoxoxoxoxoxoxoxoxoxoxoxo added another data row to hbase");
            row = fileParser.getNextRow();
        }
        aTable.flushCommits();
        arg3.setStatus("xoxoxoxoxoxoxoxoxoxoxoxo Finished adding data to hbase");
    } finally {
        if (aTable != null) {
            aTable.close();
        }
        if (admin != null) {
            admin.close();
        }
    }
    arg2.collect(new LongWritable(10), new Text("something"));
    arg3.setStatus("xoxoxoxoxoxoxoxoxoxoxoxoadded some dummy data to the collector");
}
As you can see near the end, I write some dummy data to the collector (10, 'something'), and I see hundreds of rows of this data in the _success file after the job has terminated.
I can't identify why the mapper code is restarted over and over instead of running just once. Any help would be greatly appreciated.
Using JobConf.setNumMapTasks(1) just tells Hadoop that you would like 1 mapper, if possible, unlike setNumReduceTasks, which actually fixes the number you specified.
That's why more mappers are run and you observe all those extra rows.
For more details, please read this post.
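The number of map tasks is ultimately driven by the number of input splits, so if you really need a single mapper you have to make the input produce a single split. A minimal sketch under that assumption, using the old org.apache.hadoop.mapred API that the driver above already uses; the class name NonSplittableTextInputFormat is mine, not something from the original code:
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

// Each input file becomes exactly one split, so a single input file yields a single map task.
public class NonSplittableTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false;
    }
}
In the driver you would then call jobConf.setInputFormat(NonSplittableTextInputFormat.class); instead of using TextInputFormat directly. Note that with several input files you still get one mapper per file, and speculative execution or task retries can also re-run a map task, so side effects such as HBase writes should be made idempotent.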

Why does app engine bill me less when the below code is wrapped in a transaction?

I have verified this multiple times using Appstats. When the code below is NOT wrapped in a transaction, JDO performs two datastore reads and one write, 3 RPCs, at a cost of 240. Not just the first time, but every time, even though it is accessing the same record each time and hence should be pulling it from the cache. However, when I wrap the code in a transaction, it makes 4 RPCs: begin transaction, get, put, and commit; of these, only the get is billed as a datastore read, so the overall cost is 70.
If it's pulling it from the cache, why would it only bill for a read? It would seem it should bill for a write, not a read. Could App Engine be billing me the same amount for non-transactional cache reads as it does for datastore reads? Why?
This is the code WITH the transaction:
PersistenceManager pm = PMF.getManager();
Transaction tx = pm.currentTransaction();
String responsetext = "";
try {
    tx.begin();
    Key userkey = obtainUserKeyFromCookie();
    User u = pm.getObjectById(User.class, userkey);
    Key mapkey = obtainMapKeyFromQueryString();
    // this is NOT a java.util.Map, just FYI
    Map currentmap = pm.getObjectById(Map.class, mapkey);
    Text mapData = currentmap.getMapData(); // mapData is JSON stored in the entity
    Text newMapData = parseModifyAndReturn(mapData); // transform the map
    currentmap.setMapData(newMapData); // mutate the Map object
    tx.commit();
    responsetext = "OK";
} catch (JDOCanRetryException jdoe) {
    // log jdoe
    responsetext = "RETRY";
} catch (Exception e) {
    // log e
    responsetext = "ERROR";
} finally {
    if (tx.isActive()) {
        tx.rollback();
    }
    pm.close();
}
resp.getWriter().println(responsetext);
This is the code WITHOUT the transaction:
PersistenceManager pm = PMF.getManager();
String responsetext = "";
try {
    Key userkey = obtainUserKeyFromCookie();
    User u = pm.getObjectById(User.class, userkey);
    Key mapkey = obtainMapKeyFromQueryString();
    // this is NOT a java.util.Map, just FYI
    Map currentmap = pm.getObjectById(Map.class, mapkey);
    Text mapData = currentmap.getMapData(); // mapData is JSON stored in the entity
    Text newMapData = parseModifyAndReturn(mapData); // transform the map
    currentmap.setMapData(newMapData); // mutate the Map object
    responsetext = "OK";
} catch (Exception e) {
    // log e
    responsetext = "ERROR";
} finally {
    pm.close();
}
resp.getWriter().println(responsetext);
With the transaction, the PersistenceManager can know that the caches are valid throughout the processing of that code. Without the transaction, it cannot (it doesn't know whether some other action has come in behind its back and changed things) and so must validate the cache's contents against the DB tables. Each time it checks, it needs to create a transaction to do so; that's a feature of the DB interface itself, where any action that's not in a transaction (with a few DB-specific exceptions) will have a transaction automatically added.
In your case, you should have a transaction anyway, because you want to have a consistent view of the database while you do your processing. Without that, the mapData could be modified by another operation while you're in the middle of working on it and those modifications would be silently lost. That Would Be Bad. (Well, probably.) Transactions are the cure.
(You should also look into using AOP for managing the transaction wrapping; that's enormously easier than writing all that transaction management code yourself each time. OTOH, it can add a lot of complexity to deployment until you get things right, so I could understand not following this piece of advice…)
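Short of full AOP, even a small helper that wraps the begin/commit/rollback boilerplate removes most of the repetition. A minimal sketch, assuming JDO and the same PMF class used in the question; the TransactionWork interface and the inTransaction helper are illustrative names, not an existing App Engine API:
import javax.jdo.PersistenceManager;
import javax.jdo.Transaction;

interface TransactionWork<T> {
    T run(PersistenceManager pm);
}

final class Transactions {
    // Runs the given work inside a JDO transaction, committing on success
    // and rolling back if the work throws.
    static <T> T inTransaction(TransactionWork<T> work) {
        PersistenceManager pm = PMF.getManager();
        Transaction tx = pm.currentTransaction();
        try {
            tx.begin();
            T result = work.run(pm);
            tx.commit();
            return result;
        } finally {
            if (tx.isActive()) {
                tx.rollback();
            }
            pm.close();
        }
    }
}
The servlet body then shrinks to a single call to Transactions.inTransaction(...) containing the get/parse/set logic, with only the RETRY/ERROR handling left around it.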

ORMLite with persistent H2 DB - new tables do not get persisted

When I create a new H2 database via ORMLite, the database file gets created, but after I close my application all the data stored in the database is lost:
JdbcConnectionSource connection =
new JdbcConnectionSource("jdbc:h2:file:" + path.getAbsolutePath() + ".h2.db");
TableUtils.createTable(connection, SomeClass.class);
Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
SomeClass sc = new SomeClass(id, ...);
dao.create(sc);
SomeClass retrieved = dao.queryForId(id);
System.out.println("" + retrieved);
This code will produce good results. It will print the object that I stored.
But when I start the application again, this time without creating the table and without storing a new object, I get an exception telling me that the required table does not exist:
JdbcConnectionSource connection =
new JdbcConnectionSource("jdbc:h2:file:" + path.getAbsolutePath() + ".h2.db");
Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
SomeClass retrieved = dao.queryForId(id); // will produce an exception..
System.out.println("" + retrieved);
The following worked fine for me when I ran it once and then a second time with the createTable call turned off. The 2nd insert gave me a primary key violation, of course, but that was expected. It created the file with (as @Thomas mentioned) a ".h2.db.h2.db" suffix.
Some questions:
After you run your application the first time, can you see the path file being created?
Is it on permanent storage and not in some temporary location cleared by the OS?
Any chance some other part of your application is clearing it before the database code begins?
Hope this helps.
@Test
public void testStuff() throws Exception {
    File path = new File("/tmp/x");
    JdbcConnectionSource connection = new JdbcConnectionSource("jdbc:h2:file:"
            + path.getAbsolutePath() + ".h2.db");
    // TableUtils.createTable(connection, SomeClass.class);
    Dao<SomeClass, Integer> dao = DaoManager.createDao(connection,
            SomeClass.class);
    int id = 131233;
    SomeClass sc = new SomeClass(id, "fopewjfew");
    dao.create(sc);
    SomeClass retrieved = dao.queryForId(id);
    System.out.println("" + retrieved);
    connection.close();
}
I can see Russia from my house:
> ls -l /tmp/
...
-rw-r--r-- 1 graywatson wheel 14336 Aug 31 08:47 x.h2.db.h2.db
Did you close the database? It is closed automatically but it's better to close it manually (so recovery is faster).
In many cases the database URL is the problem. Are you sure the same path is used in both cases? Otherwise you end up with two databases. By the way, ".h2.db" is added automatically, you don't need to add it manually.
To better analyze the problem, you could append ;TRACE_LEVEL_FILE=2 to the database URL, and then check in the *.trace.db file what SQL statements were executed against the database.
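Putting both answers together, the most likely fix is to drop the extra ".h2.db" from the JDBC URL (H2 appends it itself, which is why the original URL produces a file named something.h2.db.h2.db) and to close the connection source when you are done. A minimal sketch of what that might look like, reusing SomeClass and the path variable from the question; the constructor arguments are placeholders:
// URL without the ".h2.db" suffix; H2 adds it when it creates the file on disk.
String url = "jdbc:h2:file:" + path.getAbsolutePath();
JdbcConnectionSource connection = new JdbcConnectionSource(url);
try {
    // safe to call on every start-up, unlike createTable
    TableUtils.createTableIfNotExists(connection, SomeClass.class);
    Dao<SomeClass, Integer> dao = DaoManager.createDao(connection, SomeClass.class);
    dao.createIfNotExists(new SomeClass(id, "example"));
    System.out.println("" + dao.queryForId(id));
} finally {
    connection.close(); // close the embedded database cleanly so the next start-up recovers fast
}
With the corrected URL both runs point at the same file, and the second run should find the table and the previously stored row.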
