I am learning Elasticsearch and have started a new project. Now I wonder where I should add the initial code for creating the mappings etc. Would you create an external script that holds the various cURL commands and run that, or would you rather have, for example, a dedicated package in the Java project containing the configuration code, to be run when needed? Which approach is more appropriate, and why?
The mapping I want to try with XContentBuilder:
{
"tweet" : {
"properties" : {
"message" : {
"type" : "string",
"store" : "yes",
"index" : "analyzed",
"null_value" : "na"
}
}
}
}
I would like to have it in Java:
public void putMappingFromString(String index, String type, String mapping) {
IndicesAdminClient iac = getClient().admin().indices();
PutMappingRequestBuilder pmrb = new PutMappingRequestBuilder(iac);
pmrb.setIndices(index);
pmrb.setType(type);
pmrb.setSource(mapping);
ListenableActionFuture<PutMappingResponse> laf = pmrb.execute();
PutMappingResponse pmr = laf.actionGet();
if (!pmr.getAcknowledged()) {
    // the cluster did not acknowledge the new mapping; handle that case here
}
}
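Since the question mentions XContentBuilder: a hedged sketch of building the same mapping programmatically and passing it to the method above (the index name "twitter" is illustrative):
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

// Build the mapping from the question with XContentBuilder instead of a raw JSON string.
XContentBuilder builder = XContentFactory.jsonBuilder()
    .startObject()
        .startObject("tweet")
            .startObject("properties")
                .startObject("message")
                    .field("type", "string")
                    .field("store", "yes")
                    .field("index", "analyzed")
                    .field("null_value", "na")
                .endObject()
            .endObject()
        .endObject()
    .endObject();
putMappingFromString("twitter", "tweet", builder.string());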
You can also get the mapping for an index from the cluster state (indirectly):
public String getMapping(String index, String type) throws EsuException {
ClusterState cs = getClient().admin().cluster().prepareState().setFilterIndices(index).execute().actionGet().getState();
IndexMetaData imd = cs.getMetaData().index(index);
if (imd == null) {
throw new EsuIndexDoesNotExistException(index);
}
MappingMetaData mmd = imd.mapping(type);
if (mmd == null) {
throw new EsuTypeDoesNotExistException(index, type);
}
String mapping = "";
try {
mapping = mmd.source().string();
} catch (IOException e) {
mapping = "{ \"error\" : \"" + e.toString() + "\" }"; // wrap the error message in valid JSON
}
return mapping;
}
If you store your mappings as resources on your classpath, this allows you to version them along with your source code.
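A minimal sketch of that approach (the resource path and the index/type names are illustrative):
import java.io.InputStream;
import java.util.Scanner;

// Load a mapping stored as a classpath resource, e.g. src/main/resources/mappings/tweet.json,
// and apply it with the putMappingFromString() method shown above.
InputStream is = getClass().getResourceAsStream("/mappings/tweet.json");
String mapping = new Scanner(is, "UTF-8").useDelimiter("\\A").next();
putMappingFromString("twitter", "tweet", mapping);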
I have written code which fetches S3 objects from AWS S3 using the S3 SDK and stores them in our DB; the only problem is that the task is repeated for three different services, and the only thing that changes is the instance of the service class.
I have copied and pasted the code into each service layer, changing only the instance.
The task is repeated for the service classes VehicleImageService, MsilLayoutService and NonMsilLayoutService, and every layer has its own repository.
I am trying to find a way to keep that snippet in one place and, at runtime, pass the correct instance and invoke the method, for example via the Reflection API, but I want to achieve this using industry best practices and patterns. I.e. I want to refactor it into generic methods for the other services, so the instance can be passed at runtime (see the sketch after the code below).
So kindly assist me with this.
public void persistImageDetails() {
log.info("MsilVehicleLayoutServiceImpl::persistImageDetails::START");
String bucketKey = null; //common param
String modelCode = null;//common param
List<S3Object> objList = new ArrayList<>(); //common param
String bucketName = s3BucketDetails.getBucketName();//common param
String bucketPath = s3BucketDetails.getBucketPrefix();//common param
try {
//the layoutRepository object can be MSILRepository, NonMSILRepository or VehicleImageRepository
List<ModelCode> modelCodes = layoutRepository.findDistinctAllBy(); // this is the line that needs to vary per service
List<String> modelCodePresent = modelCodes.stream().map(ModelCode::getModelCode)
.collect(Collectors.toList());
List<CommonPrefix> allKeysInDesiredBucket = listAllKeysInsideBucket(bucketName, bucketPath);//common param
synchDB(modelCodePresent, allKeysInDesiredBucket);
if (null != allKeysInDesiredBucket && !allKeysInDesiredBucket.isEmpty()) {
for (CommonPrefix commonPrefix : allKeysInDesiredBucket) {
bucketKey = commonPrefix.prefix();
modelCode = new File(bucketKey).getName();
if (modelCodePresent.contains(modelCode)) {
log.info("skipping iteration for {} model code", modelCode);
continue;
}
objList = s3Service.getBucketObjects(bucketName, bucketKey);
if (null != objList && !objList.isEmpty()) {
for (S3Object object : AppUtil.skipFirst(objList)) {
saveLayout(bucketName, modelCode, object);
}
}
}
}
log.info("MSIL Vehicle Layout entries have been successfully saved");
} catch (Exception e) {
log.error("Error occurred", e); // log.error already records the stack trace
}
log.info("MsilVehicleLayoutServiceImpl::persistImageDetails::END");
}
private void saveLayout(String bucketName, String modelCode, S3Object object) {
log.info("Inside saveLayout::Start preparing entity to persist");
String resourceUri = null;
MsilVehicleLayout vehicleLayout = new MsilVehicleLayout(); // this can be MsilVehicleLayout, NonMsilVehicleLayout or VehicleImage
vehicleLayout.setFileName(FilenameUtils.removeExtension(FilenameUtils.getName(object.key())));
vehicleLayout.setModelCode(modelCode);
vehicleLayout.setS3BucketKey(object.key());
resourceUri = getS3ObjectURI(bucketName, object.key());
vehicleLayout.setS3ObjectUri(resourceUri);
vehicleLayout.setS3PresignedUri(null);
vehicleLayout.setS3PresignedExpDate(null);
layoutRepository.save(vehicleLayout); //the layoutRepository object can be MSILRepository, NonMSILRepository or VehicleImageRepository
log.info("Exiting saveLayout::End entity saved");
}
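A hedged sketch of the refactoring asked for above, using the template-method pattern rather than reflection (ModelCode, S3Object and the entity classes come from the code above; every other name is illustrative):
import java.util.List;

// The shared S3-sync flow lives in one abstract base class; each of the three
// services only supplies its own repository call and entity mapping.
public abstract class AbstractLayoutSyncService<E> {

    // per-service hooks
    protected abstract List<ModelCode> findDistinctModelCodes();

    protected abstract E buildEntity(String bucketName, String modelCode, S3Object object);

    protected abstract void save(E entity);

    public void persistImageDetails(String bucketName, String bucketPath) {
        List<ModelCode> modelCodes = findDistinctModelCodes();
        // reuse the listing/skipping loop from the original persistImageDetails()
        // here, ending each iteration with:
        // save(buildEntity(bucketName, modelCode, object));
    }
}
Each of MsilLayoutService, NonMsilLayoutService and VehicleImageService then extends this base class, implements the three hooks against its own repository and entity, and inherits the loop unchanged; no reflection is needed.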
First, I want to say thanks to everyone that took their time to help me figure this out because I was searching for more than a week for a solution to my problem. Here it is:
My goal is to start a custom workflow in Alfresco Community 5.2 and to set some custom properties in the first task through a web script, using only the public Java API. My class extends AbstractWebScript. Currently I have success with starting the workflow and setting properties like bpm:workflowDescription, but I'm not able to set my custom properties in the tasks.
Here is the code:
public class StartWorkflow extends AbstractWebScript {
/**
* The Alfresco Service Registry that gives access to all public content services in Alfresco.
*/
private ServiceRegistry serviceRegistry;
public void setServiceRegistry(ServiceRegistry serviceRegistry) {
this.serviceRegistry = serviceRegistry;
}
@Override
public void execute(WebScriptRequest req, WebScriptResponse res) throws IOException {
// Create JSON object for the response
JSONObject obj = new JSONObject();
try {
// Check if parameter defName is present in the request
String wfDefFromReq = req.getParameter("defName");
if (wfDefFromReq == null) {
obj.put("resultCode", "1 (Error)");
obj.put("errorMessage", "Parameter defName not found.");
return;
}
// Get the WFL Service
WorkflowService workflowService = serviceRegistry.getWorkflowService();
// Build WFL Definition name
String wfDefName = "activiti$" + wfDefFromReq;
// Get WorkflowDefinition object
WorkflowDefinition wfDef = workflowService.getDefinitionByName(wfDefName);
// Check if such WorkflowDefinition exists
if (wfDef == null) {
obj.put("resultCode", "1 (Error)");
obj.put("errorMessage", "No workflow definition found for defName = " + wfDefName);
return;
}
// Get parameters from the request
Content reqContent = req.getContent();
if (reqContent == null) {
throw new WebScriptException(Status.STATUS_BAD_REQUEST, "Missing request body.");
}
String content;
content = reqContent.getContent();
if (content.isEmpty()) {
throw new WebScriptException(Status.STATUS_BAD_REQUEST, "Content is empty");
}
JSONTokener jsonTokener = new JSONTokener(content);
JSONObject json = new JSONObject(jsonTokener);
// Set the workflow description
Map<QName, Serializable> params = new HashMap<>();
params.put(WorkflowModel.PROP_WORKFLOW_DESCRIPTION, "Workflow started from JAVA API");
// Start the workflow
WorkflowPath wfPath = workflowService.startWorkflow(wfDef.getId(), params);
// Get params from the POST request
Map<QName, Serializable> reqParams = new HashMap<>();
Iterator<String> i = json.keys();
while (i.hasNext()) {
String paramName = i.next();
QName qName = QName.createQName(paramName);
String value = json.getString(qName.getLocalName());
reqParams.put(qName, value);
}
// Try to update the task properties
// Get the next active task which contains the properties to update
WorkflowTask wfTask = workflowService.getTasksForWorkflowPath(wfPath.getId()).get(0);
// Update properties
WorkflowTask updatedTask = workflowService.updateTask(wfTask.getId(), reqParams, null, null);
obj.put("resultCode", "0 (Success)");
obj.put("workflowId", wfPath.getId());
} catch (JSONException e) {
throw new WebScriptException(Status.STATUS_BAD_REQUEST,
e.getLocalizedMessage());
} catch (IOException ioe) {
throw new WebScriptException(Status.STATUS_BAD_REQUEST,
"Error when parsing the request.",
ioe);
} finally {
// build a JSON string and send it back
String jsonString = obj.toString();
res.getWriter().write(jsonString);
}
}
}
Here is how I call the webscript:
curl -v -uadmin:admin -X POST -d @postParams.json "localhost:8080/alfresco/s/workflow/startJava?defName=nameOfTheWFLDefinition" -H "Content-Type:application/json"
In the postParams.json file I have the required property/value pairs which I need to update:
{
"cmprop:propOne" : "Value 1",
"cmprop:propTwo" : "Value 2",
"cmprop:propThree" : "Value 3"
}
The workflow is started and bpm:workflowDescription is set correctly, but the custom properties are not visible on the task.
I made a JS script which I call when the workflow is started:
execution.setVariable('bpm_workflowDescription', 'Some String ' + execution.getVariable('cmprop:propOne'));
And the value for cmprop:propOne is actually used and the description is properly updated, which means that those properties are set somewhere (at the execution level maybe?), but I cannot figure out why they are not visible when I open the task.
I had success with starting the workflow and updating the properties using the JavaScript API with:
if (wfdef) {
// Get the params
wfparams = {};
if (jsonRequest) {
for ( var prop in jsonRequest) {
wfparams[prop] = jsonRequest[prop];
}
}
wfpackage = workflow.createPackage();
wfpath = wfdef.startWorkflow(wfpackage, wfparams);
The problem is that I want to use only the public Java API. Please help.
Thanks!
Do you set your variables locally in your tasks? From what I see, it seems that you define your variables at the execution level, but not at the task level. If you take a look at the out-of-the-box adhoc.bpmn20.xml file (https://github.com/Activiti/Activiti-Designer/blob/master/org.activiti.designer.eclipse/src/main/resources/templates/adhoc.bpmn20.xml), you can notice an event listener that sets the variable locally:
<extensionElements>
<activiti:taskListener event="create" class="org.alfresco.repo.workflow.activiti.tasklistener.ScriptTaskListener">
<activiti:field name="script">
<activiti:string>
if (typeof bpm_workflowDueDate != 'undefined') task.setVariableLocal('bpm_dueDate', bpm_workflowDueDate);
if (typeof bpm_workflowPriority != 'undefined') task.priority = bpm_workflowPriority;
</activiti:string>
</activiti:field>
</activiti:taskListener>
</extensionElements>
Usually, I just import all the task variables matching my custom model prefix. So for you, it should look like this:
import java.util.Set;
import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.DelegateTask;
import org.apache.log4j.Logger;
public class ImportVariables extends AbstractTaskListener {
private Logger logger = Logger.getLogger(ImportVariables.class);
@Override
public void notify(DelegateTask task) {
logger.debug("Inside ImportVariables.notify()");
logger.debug("Task ID:" + task.getId());
logger.debug("Task name:" + task.getName());
logger.debug("Task proc ID:" + task.getProcessInstanceId());
logger.debug("Task def key:" + task.getTaskDefinitionKey());
DelegateExecution execution = task.getExecution();
Set<String> executionVariables = execution.getVariableNamesLocal();
for (String variableName : executionVariables) {
// If the variable name starts with "cmprop_"
if (variableName.startsWith("cmprop_")) {
// Publish it at the task level
task.setVariableLocal(variableName, execution.getVariableLocal(variableName));
}
}
}
}
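Register this listener on your user task's create event in the process definition, in the same way the ScriptTaskListener is declared inside the extensionElements block above.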
I am struggling with importing data into MongoDB from a JSON file.
I can do this on the command line by using the mongoimport command.
I have explored and tried a lot, but I am not able to import from a JSON file using Java.
sample.json
{ "test_id" : 1245362, "name" : "ganesh", "age" : "28", "Job" :
{"company name" : "company1", "designation" : "SSE" }
}
{ "test_id" : 254152, "name" : "Alex", "age" : "26", "Job" :
{"company name" : "company2", "designation" : "ML" }
}
Thanks for your time.
~Ganesh~
Suppose you can read each JSON document as a string. For example, say you read the first JSON text
{ "test_id" : 1245362, "name" : "ganesh", "age" : "28", "Job" :
{"company name" : "company1", "designation" : "SSE" }
}
and assign it to a variable (String json1). The next step is to parse it:
DBObject dbo = (DBObject) com.mongodb.util.JSON.parse(json1);
put all the DBObjects into a list,
List<DBObject> list = new ArrayList<>();
list.add(dbo);
then save them into the database:
new MongoClient().getDB("test").getCollection("collection").insert(list);
EDIT:
In the newest MongoDB versions you have to use Document instead of DBObject, and the methods for adding objects look different now. Here's an updated example:
Imports are:
import com.mongodb.MongoClient;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
The code would look like this (referring to the text above the EDIT):
Document doc = Document.parse(json1);
new MongoClient().getDatabase("db").getCollection("collection").insertOne(doc);
You can also do it the way with the list, but then you need:
new MongoClient().getDatabase("db").getCollection("collection").insertMany(list);
But I think there is a problem with this solution. When you type:
db.collection.find()
in the mongo shell to get all objects in the collection, the result looks like the following:
{ "_id" : ObjectId("56a0d2ddbc7c512984be5d97"),
"test_id" : 1245362, "name" : "ganesh", "age" : "28", "Job" :
{ "company name" : "company1", "designation" : "SSE"
}
}
which is not exactly the same as before.
Had a similar "problem" myself and ended up using Jackson with POJO databinding, and Morphia.
While this sounds a bit like cracking a nut with a sledgehammer, it is actually very easy to use, robust, quite performant, and easy to maintain code-wise.
Small caveat: You need to map your test_id field to MongoDB's _id if you want to reuse it.
Step 1: Create an annotated bean
You need to hint to Jackson how to map the data from a JSON file to a POJO. I shortened the class a bit for the sake of readability:
@JsonRootName(value="person")
@Entity
public class Person {
    @JsonProperty(value="test_id")
    @Id
Integer id;
String name;
public Integer getId() {
return id;
}
public void setId(Integer id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
}
As for the embedded document Job, please have a look at the POJO data binding examples linked.
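For instance, a hedged sketch of such a bean might look like this (assuming the org.mongodb.morphia annotations; the explicit property mapping is needed because "company name" contains a space):
import com.fasterxml.jackson.annotation.JsonProperty;
import org.mongodb.morphia.annotations.Embedded;

// Sketch of the embedded document from the sample JSON;
// getters and setters omitted for brevity, as in Person above.
@Embedded
public class Job {
    @JsonProperty(value="company name")
    String companyName;
    String designation;
}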
Step 2: Map the POJO and create a datastore
Somewhere during your application initialization, you need to map the annotated POJO. Since you should already have a MongoClient, I am going to reuse that ;)
Morphia morphia = new Morphia();
morphia.map(Person.class);
/* You can reuse this datastore */
Datastore datastore = morphia.createDatastore(mongoClient, "myDatabase");
/*
* Jackson's ObjectMapper, which is reusable, too,
* does all the magic.
*/
ObjectMapper mapper = new ObjectMapper();
Step 3: Do the actual importing
Now importing a given JSON file becomes as easy as
public Boolean importJson(Datastore ds, ObjectMapper mapper, String filename) {
try {
JsonParser parser = new JsonFactory().createParser(new FileReader(filename));
Iterator<Person> it = mapper.readValues(parser, Person.class);
while(it.hasNext()) {
ds.save(it.next());
}
return Boolean.TRUE;
} catch (JsonParseException e) {
/* Json was invalid, deal with it here */
} catch (JsonMappingException e) {
/* Jackson was not able to map
* the JSON values to the bean properties,
* possibly because of
* insufficient mapping information.
*/
} catch (IOException e) {
/* Most likely, the file was not readable.
 * This should rather be thrown, but it was
 * caught here for the sake of showing what can happen
 */
}
return Boolean.FALSE;
}
With a bit of refactoring, this can be converted into a generic importer for Jackson-annotated beans.
Obviously, I left out some special cases, but that would be out of the scope of this answer.
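A hedged sketch of that generic variant, keeping the flow of importJson() above but taking the bean class as a parameter:
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.mongodb.morphia.Datastore;

// Same flow as importJson() above, parameterized over the bean class.
public <T> Boolean importJson(Datastore ds, ObjectMapper mapper,
        String filename, Class<T> beanClass) {
    try {
        JsonParser parser = new JsonFactory().createParser(new FileReader(filename));
        Iterator<T> it = mapper.readValues(parser, beanClass);
        while (it.hasNext()) {
            ds.save(it.next());
        }
        return Boolean.TRUE;
    } catch (IOException e) {
        // JsonParseException and JsonMappingException both extend IOException,
        // so the three catch blocks above collapse into this one handler
    }
    return Boolean.FALSE;
}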
With the 3.2 driver, if you have a Mongo collection and a collection of JSON documents, e.g.:
MongoCollection<Document> collection = ...
List<String> jsons = ...
You can insert individually:
jsons.stream().map(Document::parse).forEach(collection::insertOne);
or bulk:
collection.insertMany(
jsons.stream().map(Document::parse).collect(Collectors.toList())
);
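Both variants assume that each string in jsons holds exactly one JSON document; Document.parse() will throw an exception on input it cannot parse.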
I just faced this issue today and solved it in a different way, since none of the answers here satisfied me, so enjoy my extra contribution. Performance is sufficient to export 30k documents and import them into my Spring Boot app for integration test cases (it takes a few seconds).
First, the way you export your data in the first place matters.
I wanted a file where each line contains one document that I can parse in my Java app.
mongo db --eval 'db.data.find({}).limit(30000).forEach(function(f){print(tojson(f, "", true))})' --quiet > dataset.json
Then I get the file from my resources folder, parse it, extract the lines, and process them with mongoTemplate. A buffer could be used.
@Autowired
private MongoTemplate mongoTemplate;
public void createDataSet(){
mongoTemplate.dropCollection("data");
try {
InputStream inputStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(DATASET_JSON);
List<Document> documents = new ArrayList<>();
String line;
InputStreamReader isr = new InputStreamReader(inputStream, Charset.forName("UTF-8"));
BufferedReader br = new BufferedReader(isr);
while ((line = br.readLine()) != null) {
documents.add(Document.parse(line));
}
mongoTemplate.insert(documents,"data");
} catch (Exception e) {
throw new RuntimeException(e);
}
}
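Another option is to parse a whole JSON array with net.sf.json and convert each element to a Document; here json is assumed to hold the JSON array string and collection a MongoCollection<Document> as in the earlier answers: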
List<Document> jsonList = new ArrayList<Document>();
net.sf.json.JSONArray array = net.sf.json.JSONArray.fromObject(json);
for (Object object : array) {
net.sf.json.JSONObject jsonStr = (net.sf.json.JSONObject)JSONSerializer.toJSON(object);
Document jsnObject = Document.parse(jsonStr.toString());
jsonList.add(jsnObject);
}
collection.insertMany(jsonList);
Runtime r = Runtime.getRuntime();
Process p = null;
//dir is the path to where your mongoimport is.
File dir=new File("C:/Program Files/MongoDB/Server/3.2/bin");
//this line will run the command in the given dir; the import command is exactly the same as the mongoimport command you would use at the command prompt
p = r.exec("c:/windows/system32/cmd.exe /c mongoimport --db mydb --collection student --type csv --file student.csv --headerline", null, dir);
public static void importCSV(String path) {
try {
List<Document> list = new ArrayList<>();
MongoDatabase db = DbConnection.getDbConnection();
db.createCollection("newCollection");
MongoCollection<Document> collection = db.getCollection("newCollection");
BufferedReader reader = new BufferedReader(new FileReader(path));
String line;
while ((line = reader.readLine()) != null) {
String[] item = line.split(","); // csv file is "," separated
String id = item[0]; // get each value from the csv and assign it to a field
String first_name = item[1];
String last_name = item[2];
String address = item[3];
String gender = item[4];
String dob = item[5];
Document document = new Document(); // create a document
document.put("id", id); // data into the database
document.put("first_name", first_name);
document.put("last_name", last_name);
document.put("address", address);
document.put("gender", gender);
document.put("dob", dob);
list.add(document);
}
collection.insertMany(list);
}catch (Exception e){
System.out.println(e);
}
}
I am new to Java and I want to call my saved pipeline using the GATE Java API through Eclipse.
I am not sure how I could do this, although I know how to create new documents etc.:
FeatureMap params = Factory.newFeatureMap();
params.put(Document.DOCUMENT_URL_PARAMETER_NAME, new URL("http://www.gate.ac.uk"));
params.put(Document.DOCUMENT_ENCODING_PARAMETER_NAME, "UTF-8");
// document features
FeatureMap feats = Factory.newFeatureMap();
feats.put("date", new Date());
Factory.createResource("gate.corpora.DocumentImpl", params, feats, "This is home");
//End Solution 2
// obtain a map of all named annotation sets
Document doc = Factory.newDocument("Document text");
Map <String, AnnotationSet> namedASes = doc.getNamedAnnotationSets();
System.out.println("No. of named Annotation Sets:" + namedASes.size());
// no of annotations each set contains
for (String setName : namedASes.keySet()) {
// annotation set
AnnotationSet aSet = namedASes.get(setName);
// no of annotations
System.out.println("No. of Annotations for " + setName + ":" + aSet.size());
}
There is a good example of GATE usage from Java that probably does exactly what you want: BatchProcessApp.java.
In particular:
loading the pipeline is done with the lines
// load the saved application
CorpusController application =
(CorpusController)PersistenceManager.loadObjectFromFile(gappFile);
the pipeline is executed with
// run the application
application.execute();
The code is informative, clear, and can easily be changed for your particular needs. The oxygen of an open source project :)
Something like this could be used (do not forget to init GATE first: set the GATE home etc.; see the sketch after the code below):
private void getProcessedText(String textToProcess) {
Document gateDocument = null;
try {
// you can use your method from above to build document
gateDocument = createGATEDocument(textToProcess);
corpusController.getCorpus().add(gateDocument);
corpusController.execute();
// put here your annotations processing
} catch (Throwable ex) {
ex.printStackTrace();
} finally {
if (corpusController.getCorpus() != null) {
corpusController.getCorpus().remove(gateDocument);
}
if (gateDocument != null) {
Factory.deleteResource(gateDocument);
}
}
}
private CorpusController initPersistentGateResources() {
try {
Corpus corpus = Factory.newCorpus("New Corpus");
corpusController = (CorpusController) PersistenceManager.loadObjectFromFile(new File("PATH-TO-YOUR-GAPP-FILE"));
corpusController.setCorpus(corpus);
} catch (Exception ex) {
ex.printStackTrace();
}
return corpusController;
}
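The initialization mentioned above could look like this minimal sketch (the GATE home path is illustrative):
import java.io.File;
import gate.Gate;

// must run once, before any resources are loaded; Gate.init() throws GateException
Gate.setGateHome(new File("/opt/gate"));
Gate.init();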
We need to add some code to be executed on application load in order to validate that all the messages.properties entries are well defined for all languages.
Is this possible?
Steps: on application load, dynamically read all the Spring message codes from the JSP or Java classes, then go through all the message resource properties files and validate that nothing is missing from them.
We ended up doing this manually, without using any library.
Steps:
Have all the keys used in Java classes or JSPs defined in a constants file.
Read them using reflection on the class's fields:
Field[] fields = Constants.class.getFields();
for (Field field : fields) {
    String key = (String) field.get(null); // static constants, so no instance is needed
}
Read all messageResources.properties file names from the project using:
String pathToThisClass = MessageResourcesValidator.class.getProtectionDomain().getCodeSource().getLocation().getPath();
File filePath = new File(pathToThisClass);
String[] list = filePath.list(new DirFilter("(messages).*\\.(properties)"));
DirFilter is a normal class implementing Java's FilenameFilter.
Create a class that reads the properties from a file using its file name:
public class PropertiesFile{
private Properties prop;
public PropertiesFile(String fileName) throws Exception
{
init(fileName);
}
private void init(String fileName) throws Exception
{
prop = new Properties();
try (InputStream input = getClass().getClassLoader().getResourceAsStream(fileName);)
{
if(input == null)
{
throw new Exception("Unable to load properties file " + fileName);
}
prop.load(input);
}
catch(IOException e)
{
throw new Exception("Error loading properties file " + fileName);
}
}
public List<String> getPropertiesKeysList()
{
List<String> result = new ArrayList<>();
Enumeration<?> e = prop.propertyNames();
while(e.hasMoreElements())
{
result.add((String) e.nextElement());
// String value = prop.getProperty(key);
}
return result;
}
}
The part that does the comparison should be something like the following code, which calls the above methods:
List<String> msgResourcesFiles = getMessageResourcesFileNames();
List<String> codeKeys = getListOfCodeMessageResources();
PropertiesFile file = null;
List<String> propKeys = null;
for(String fileName : msgResourcesFiles)
{
file = new PropertiesFile(fileName);
propKeys = file.getPropertiesKeysList();
for(String key : codeKeys)
{
if(!propKeys.contains(key))
{
throw new Exception("Missing key " + key);
}
}
}
Note: another workaround would be to compare all the message resource files to a default one; this way we minimize the code needed in the above explanation (see the sketch below).
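A minimal sketch of that workaround, reusing the PropertiesFile class above and assuming messages.properties is the default bundle:
// compare every other bundle's keys against the default bundle
PropertiesFile defaultFile = new PropertiesFile("messages.properties");
List<String> defaultKeys = defaultFile.getPropertiesKeysList();
for (String fileName : msgResourcesFiles)
{
    if (fileName.equals("messages.properties"))
    {
        continue;
    }
    List<String> propKeys = new PropertiesFile(fileName).getPropertiesKeysList();
    for (String key : defaultKeys)
    {
        if (!propKeys.contains(key))
        {
            throw new Exception("Missing key " + key + " in file " + fileName);
        }
    }
}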