Delete batch operation in Azure Storage - Java

I have been trying to implement a DAO method for a delete operation on Azure Storage entities. Deleting with a single TableOperation worked fine:
TableOperation deleteEntity = TableOperation.delete(entity);
But when I tried the same thing using a batch operation, it was not supported.
Any suggestions to overcome this issue are highly appreciated.

Regarding "But when I tried it using Batch Operation, it was not supported":
All entities in a single batch must share the same partition key, so you could group your items for deletion by partition key and then execute one TableBatchOperation per group.
Here is a helper class I wrote in C# for this purpose; you could refer to it and port it to Java:
public class TableBatchHelper<T> where T : ITableEntity
{
    const int batchMaxSize = 100; // Azure Table batches are limited to 100 operations

    public static IEnumerable<TableBatchOperation> GetBatchesForDelete(IEnumerable<T> items)
    {
        var list = new List<TableBatchOperation>();
        // A batch may only contain entities with the same partition key.
        var partitionGroups = items.GroupBy(arg => arg.PartitionKey).ToArray();
        foreach (var group in partitionGroups)
        {
            T[] groupList = group.ToArray();
            int offSet = batchMaxSize;
            T[] entities = groupList.Take(offSet).ToArray();
            while (entities.Any())
            {
                var tableBatchOperation = new TableBatchOperation();
                foreach (var entity in entities)
                {
                    tableBatchOperation.Add(TableOperation.Delete(entity));
                }
                list.Add(tableBatchOperation);
                // Move on to the next chunk of at most 100 entities.
                entities = groupList.Skip(offSet).Take(batchMaxSize).ToArray();
                offSet += batchMaxSize;
            }
        }
        return list;
    }

    public static async Task BatchDeleteAsync(CloudTable table, IEnumerable<T> items)
    {
        var batches = GetBatchesForDelete(items);
        await Task.WhenAll(batches.Select(table.ExecuteBatchAsync));
    }
}
Then you could execute the batch delete as follows:
await TableBatchHelper<ClassName>.BatchDeleteAsync(cloudTable, items);
Or:
var batches = TableBatchHelper<ClassName>.GetBatchesForDelete(entities);
Parallel.ForEach(batches, new ParallelOptions()
{
    MaxDegreeOfParallelism = 5
}, (batchOperation) =>
{
    try
    {
        table.ExecuteBatch(batchOperation);
        Console.WriteLine("Deleting {0} records", batchOperation.Count);
    }
    catch (Exception ex)
    {
        Console.WriteLine("ExecuteBatch threw an exception: " + ex.Message);
    }
});

No, that was the code without using a batch operation. The following is the code that uses a batch operation; sorry for not mentioning that.
TableBatchOperation batchOperation = new TableBatchOperation();
List<TableBatchOperation> list = new ArrayList<>();
if (partitionQuery != null) {
    for (AzureLocationData entity : cloudTable.execute(partitionQuery)) {
        batchOperation.add(TableOperation.delete(entity));
        list.add(batchOperation); // exception thrown line
    }
    try {
        cloudTable.execute((TableOperation) batchOperation);
    } catch (StorageException e) {
        e.printStackTrace();
    }
}
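For reference, here is a minimal sketch of what a working batched delete could look like in Java (assuming the classic com.microsoft.azure.storage.table SDK used above: TableBatchOperation has its own delete() method, CloudTable.execute() accepts a TableBatchOperation directly without any cast, and a batch is capped at 100 operations on a single partition key):

TableBatchOperation batch = new TableBatchOperation();
try {
    // All entities returned by a partition query share one partition key,
    // so they may legally share a batch.
    for (AzureLocationData entity : cloudTable.execute(partitionQuery)) {
        batch.delete(entity);
        if (batch.size() == 100) { // flush once the 100-operation limit is hit
            cloudTable.execute(batch);
            batch = new TableBatchOperation();
        }
    }
    if (!batch.isEmpty()) {
        cloudTable.execute(batch); // flush the final partial batch
    }
} catch (StorageException e) {
    e.printStackTrace();
}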

public void deleteLocationsForDevice(String id) {
    logger.info("Going to delete location data for Device [{}]", id);
    // Create a filter condition where the partition key is deviceId.
    String partitionFilter = TableQuery.generateFilterCondition(
            PARTITION_KEY,
            TableQuery.QueryComparisons.EQUAL,
            id);
    // Specify a partition query, using the partition key filter.
    TableQuery<AzureLocationData> partitionQuery =
            TableQuery.from(AzureLocationData.class)
                    .where(partitionFilter);
    if (partitionQuery != null) {
        for (AzureLocationData entity : cloudTable.execute(partitionQuery)) {
            TableOperation deleteEntity = TableOperation.delete(entity);
            try {
                cloudTable.execute(deleteEntity);
                logger.info("Successfully deleted location records with partition key: {}", entity.getPartitionKey());
            } catch (StorageException e) {
                e.printStackTrace();
            }
        }
    } else {
        logger.debug("No records to delete!");
    }
    // throw new UnsupportedOperationException("AzureIotLocationDataDao Delete Operation not supported");
}

Related

SortedSet not adding newly constructed objects from certain SQL query results

The answer to the issue described below may be as simple as me not using SortedSet correctly, but I wouldn't know if that is the case.
void SQLRankGuildsByPoints(final CallbackReturnIntegerStringSortedSet callback)
{
    java.sql.Connection cn = null;
    try {
        cn = DataSource.getConnection();
        if (cn != null)
        {
            PreparedStatement query = cn.prepareStatement("SELECT GuildName, TotalActivityPoints FROM Guilds");
            ResultSet result = query.executeQuery();
            SortedSet<Pair_IntString> GuildsRanking = new TreeSet<>(new ComparatorGuildsRanking());
            while (result.next())
            {
                int resultInt = result.getInt("TotalActivityPoints");
                String resultString = result.getString("GuildName");
                GuildsRanking.add(new Pair_IntString(resultInt, resultString));
            }
            Bukkit.getScheduler().runTask(MainClassAccess, new Runnable() { // Callback to main thread
                @Override
                public void run() {
                    callback.onDone(GuildsRanking);
                }
            });
        }
    } catch (SQLException e) {
        System.err.print(e);
    } finally {
        try {
            cn.close();
        } catch (SQLException e) {
            System.err.print(e);
        }
    }
}
All 8 results from the Guilds table are present in the "result" ResultSet.
GuildsRanking.add() isn't adding the new custom Pair_IntString objects constructed from the query results, specifically for the guilds "test" and "lilo" in the Guilds table.
The SQLRankGuildsByPoints method finishes its execution, calling back the GuildsRanking SortedSet without 2 of the iterated results.
This behaviour is unintended and I can't find an explanation for it.
The comparator used for TreeSet:
public class ComparatorGuildsRanking implements Comparator<Pair_IntString> {
    @Override
    public int compare(Pair_IntString intStr1, Pair_IntString intStr2) {
        return intStr2.integer.compareTo(intStr1.integer);
    }
}
Custom Pair_IntString class:
public class Pair_IntString {
    public Integer integer;
    public String string;

    Pair_IntString(Integer i, String s)
    {
        integer = i;
        string = s;
    }
}
No error messages with the skipped add iterations.
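For context on how TreeSet decides equality: it consults only the comparator, so any two elements for which compare() returns 0 (here, two guilds with the same point total) count as duplicates, and the second add() returns false without inserting anything. A minimal demo using the classes above (the values are illustrative):

import java.util.SortedSet;
import java.util.TreeSet;

public class TreeSetDuplicateDemo {
    public static void main(String[] args) {
        SortedSet<Pair_IntString> set = new TreeSet<>(new ComparatorGuildsRanking());
        set.add(new Pair_IntString(10, "test"));
        // compare() returns 0 against "test", so TreeSet treats this guild
        // as a duplicate: add() returns false and the element is dropped.
        set.add(new Pair_IntString(10, "lilo"));
        System.out.println(set.size()); // prints 1, not 2
    }
}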

Executing batch query with Redis Redisson client hangs forever

I have an inventory list of millions of records that I want to insert/merge in batches into Redis using the Redisson batch command.
Below is the code:
public void upsertInventoryInBatches(final List<ItemInventory> itemInventory) throws ExecutionException, InterruptedException {
    RBatch batch = redissonClient.createBatch(BatchOptions.defaults().responseTimeout(300, TimeUnit.SECONDS));
    RMapAsync<String, ItemInventory> map = batch.getMap(IMSConstant.REDIS_INVENTORY_MAP);
    try {
        for (ItemInventory item : itemInventory) {
            map.mergeAsync(item.getKey(), item, (existing, newValue) -> {
                if (existing == null) {
                    return newValue;
                } else {
                    if (existing.getQuantity() == newValue.getQuantity()
                            && existing.getMinMRP() == newValue.getMinMRP()) {
                        return existing;
                    }
                    existing.setQuantity(item.getQuantity());
                    existing.setMinMRP(item.getMinMRP());
                    existing.setEarliestExpiryDate(item.getEarliestExpiryDate());
                    existing.setVersion(item.getVersion());
                    return existing;
                }
            });
        }
        var res = batch.execute(); // Hangs with no result and no error
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }
}
The batch.execute() statement just hangs with no result and no error.
Looking for guidance on what I am doing wrong.
batch.getMap(IMSConstant.REDIS_INVENTORY_MAP).putAsync(item.getKey(), item) works fine, but I want to merge the values. If it's not possible with Redisson, is it possible via any other Redis Java client?
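One workaround to consider while the mergeAsync hang is unresolved is a client-side read-merge-write. This is a sketch only: it is not atomic under concurrent writers, it reuses the ItemInventory getters from the question, and it relies on RMap.getAll and RMapAsync.putAllAsync, which are standard Redisson APIs:

Map<String, ItemInventory> incoming = itemInventory.stream()
        .collect(Collectors.toMap(ItemInventory::getKey, item -> item, (a, b) -> b));
RMap<String, ItemInventory> liveMap = redissonClient.getMap(IMSConstant.REDIS_INVENTORY_MAP);
// One bulk read for all keys in this chunk.
Map<String, ItemInventory> existing = liveMap.getAll(incoming.keySet());

Map<String, ItemInventory> changed = new HashMap<>();
incoming.forEach((key, item) -> {
    ItemInventory old = existing.get(key);
    // Mirrors the == checks in the question; use equals() if these
    // getters return boxed types rather than primitives.
    if (old == null || old.getQuantity() != item.getQuantity()
            || old.getMinMRP() != item.getMinMRP()) {
        changed.put(key, item);
    }
});

RBatch batch = redissonClient.createBatch(BatchOptions.defaults());
batch.getMap(IMSConstant.REDIS_INVENTORY_MAP).putAllAsync(changed);
batch.execute(); // one round trip writing only the changed entries

For millions of records you would run this per chunk (say, a few thousand keys at a time) rather than queueing everything in a single giant batch.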

TarArchiveInputStream with Java Reactor

Currently I'm dealing with a TarArchiveInputStream like this:
private Mono<Employee> createEmployeeFromArchiveFile() {
    return Mono.fromCallable(() -> {
        return new Employee();
    })
    .flatMap(employee -> {
        try {
            TarArchiveInputStream tar =
                    new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream(new File("/tmp/myarchive.tar.gz"))));
            TarArchiveEntry entry;
            tar.read();
            while ((entry = tar.getNextTarEntry()) != null) {
                if (entry.getName().equals("data1.txt")) {
                    // process data
                    String data1 = IOUtils.toString(tar, String.valueOf(StandardCharsets.UTF_8));
                    if (data1.contains("age")) {
                        employee.setAge(4);
                    } else {
                        return Mono.error(new Exception("Missing age"));
                    }
                }
                if (entry.getName().equals("data2.txt")) {
                    // a lot more processing => put that in another function for clarity purposes
                    String data2 = IOUtils.toString(tar, String.valueOf(StandardCharsets.UTF_8));
                    employee = muchProcessing(employee, data2);
                }
            }
            tar.close();
        } catch (Exception e) {
            return Mono.error(new Exception("Error while streaming archive"));
        }
        return Mono.just(employee);
    });
}

private Employee muchProcessing(Employee employee, String data2) {
    if (data2.contains("name")) {
        employee.setName(4);
    } else {
        // return an error ?
    }
    return employee;
}
Firstly, is this a correct way to process the archive file with Reactor? It works fine, but it seems like synchronous business inside a flatMap; I haven't found a better way.
Secondly, I don't know how to handle the muchProcessing(employee, data2) function. If that function triggers errors, how should it return them so they can be dealt with appropriately as a Mono.error? I want this function to return an employee.
Thanks!
You can handle the task inside the flatMap as a CompletableFuture and convert it to a Mono. Here's a link on how to do that:
How to create a Mono from a completableFuture
Then, you can abstract it out as:
.flatMap(this::processEmployee).doOnError(this::logError).onErrorResume(getFallbackEmployee())
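As a rough sketch of that shape (processEmployee is an illustrative name, not from the question): an exception thrown inside the future completes the Mono with an error, which doOnError/onErrorResume can then handle, so muchProcessing can simply throw instead of returning Mono.error.

// Hypothetical processEmployee: runs the blocking tar parsing off the
// calling thread and converts failures into Mono error signals.
private Mono<Employee> processEmployee(Employee employee) {
    return Mono.fromFuture(CompletableFuture.supplyAsync(() -> {
        // ... blocking TarArchiveInputStream work goes here; any thrown
        // RuntimeException completes the future, and thus the Mono, with an error
        return employee;
    }));
}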

Unexpected output in concurrent file writing in a Java web app

I wrote a very simple Java web application with just some basic functions: register, sign in, changing the password, and a few others.
I don't use a database. I just create a file in the app to record the users' information and do the database-style work.
I used JMeter to stress the web application, especially the register interface.
JMeter shows that the results of the 1000-thread run are right,
but when I look into information.txt, which stores the users' information, it is wrong: it stores only 700+ records
when it should contain 1000, so something must be going wrong somewhere.
I use a singleton class to do the write/read work, and I added the synchronized keyword to it. The insert() function, which is used by register to record the registration information, is shown below (in part):
public class Database {
    private static Database database = null;
    private static File file = null;

    public synchronized static Database getInstance() {
        if (database == null) {
            database = new Database();
        }
        return database;
    }

    private Database() {
        String path = this.getClass().getClassLoader().getResource("/")
                .getPath() + "information.txt";
        file = new File(path);
        if (!file.exists()) {
            try {
                file.createNewFile();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }

    public void insert(String account, String password, String username) {
        RandomAccessFile infoFile = null;
        try {
            infoFile = new RandomAccessFile(file, "rw");
            String record;
            long offset = 0;
            while ((record = infoFile.readLine()) != null) {
                offset += record.getBytes().length + 2;
            }
            infoFile.seek(offset);
            record = account + "|" + password + "|" + username + "\r\n";
            infoFile.write(record.getBytes());
            infoFile.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (infoFile != null) {
                try {
                    infoFile.close();
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }
        }
    }
}
The question is: why does this happen? synchronized is supposed to make this thread safe, so why did I lose so much data, and why were blank lines inserted? What can I do to correct it?
You are synchronizing the getInstance() method, but not the insert() method. This makes the retrieval of the Database instance thread-safe, but not the write operation: concurrent registrations can read the same end-of-file offset and then overwrite each other's records.
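A minimal sketch of the fix (on top of marking insert() synchronized, seeking straight to file.length() avoids re-reading every line just to find the append offset):

// synchronized serializes writers on the singleton Database instance,
// so records can no longer interleave or overwrite each other.
public synchronized void insert(String account, String password, String username) {
    try (RandomAccessFile infoFile = new RandomAccessFile(file, "rw")) {
        infoFile.seek(infoFile.length()); // append after the last record
        String record = account + "|" + password + "|" + username + "\r\n";
        infoFile.write(record.getBytes());
    } catch (IOException e) {
        e.printStackTrace();
    }
}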

How to implement Jedis pipelining that takes a list of commands as input?

I am using Redis with Java, with Jedis as the Redis client. I have a class
public class RedisDBManager
which has methods that call Jedis to execute commands on Redis.
A sample method inside RedisDBManager:
public Set<NTuple> zrangeWithScores(String key, int min, int max)
{
    JedisSentinelPool localPool = redisSentinelPool;
    Jedis redis = null;
    try
    {
        redis = getResource();
        Set<Tuple> tupleSet = redis.zrangeWithScores(key, min, max);
        Set<NTuple> tuples = new LinkedHashSet<>();
        for (Tuple tuple : tupleSet) {
            tuples.add(new NTuple(tuple.getElement(), tuple.getScore()));
        }
        return tuples;
    }
    catch (JedisConnectionException jex) {
        logger.error("Creating new connection since it encountered Jedis Connection Exception: ", jex);
        createNewConnectionPool(localPool);
        try {
            redis = getResource();
            Set<Tuple> tupleSet = redis.zrangeWithScores(key, min, max);
            Set<NTuple> tuples = new LinkedHashSet<>();
            for (Tuple tuple : tupleSet) {
                tuples.add(new NTuple(tuple.getElement(), tuple.getScore()));
            }
            return tuples;
        }
        catch (Exception e) {
            logger.error("Exception: ", e);
            return null;
        }
    }
    catch (Exception ex) {
        logger.error("Exception: ", ex);
        return null;
    }
    finally
    {
        if (redis != null)
        {
            returnResource(redis);
        }
    }
}
Here the method getResource returns a resource from JedisSentinelPool.
I want a pipelining method inside this class that takes a list of commands to execute and returns the responses as a list. No Jedis construct should be used outside of RedisDBManager; outside methods should call the pipelining method, which takes care of all responsibilities.
This question is similar to this question. It differs in that I want to use different Redis commands as well and get their responses.
My current (incomplete) approach is to modify every method in RedisDBManager to accept a flag saying whether to add the command to a thread-local Pipeline object, and then have a pipelining method that syncs that pipeline and returns the responses.
Something like:
public Set<NTuple> zrangeWithScores(String key, int min, int max, boolean pipelined) {
    ...
    try
    {
        if (pipelined) {
            pipeline = getExistingThreadLocalPipelineObject();
            pipeline.zrangeWithScores(key, min, max);
        } else {
            redis = getResource();
            ...
            return tuples;
        }
    }
    catch (JedisConnectionException jex) {
        ...
        if (pipelined) {
            pipeline = getExistingThreadLocalPipelineObject();
            pipeline.zrangeWithScores(key, min, max);
        } else {
            redis = getResource();
            ...
            return tuples;
        }
    }
    ...
}

public List<MyResponse> syncPipeline() {
    pipeline = getExistingThreadLocalPipelineObject();
    pipeline.sync();
    // process all responses and send
}
Is there any better or simpler approach? Thanks.
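One simpler shape worth considering (a sketch, not an established Jedis pattern): let callers pass a list of lambdas that queue commands on the Pipeline, and return the raw replies with Pipeline.syncAndReturnAll(), a real Jedis method that returns List<Object> in command order. Note the Pipeline type still appears in the lambda signature, so to hide Jedis completely you would wrap each command in your own command object instead.

// Sketch: execute a caller-supplied list of commands in one pipeline
// and return the replies in the same order as the commands.
public List<Object> executePipelined(List<Consumer<Pipeline>> commands) {
    Jedis redis = null;
    try {
        redis = getResource();
        Pipeline pipeline = redis.pipelined();
        for (Consumer<Pipeline> command : commands) {
            command.accept(pipeline); // queue the command; no round trip yet
        }
        return pipeline.syncAndReturnAll(); // single round trip, replies in order
    } finally {
        if (redis != null) {
            returnResource(redis);
        }
    }
}

A caller would then look something like:

List<Object> replies = redisDBManager.executePipelined(List.of(
        p -> p.zrangeWithScores("rankings", 0, -1),
        p -> p.get("someKey")));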
