Apache Spark with elasticsearch V5.X - java

I am new to Java and am looking for examples of the connector between Elasticsearch 5.x and Spark, so that I can see some use cases.
Here is my code so far:
package Spark;
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
public class EsSpark {
    public EsSpark() {
        // "local[*]" runs Spark in-process; "localhost" is not a valid master URL
        SparkConf conf = new SparkConf().setAppName("MyApp1").setMaster("local[*]");
        conf.set("es.index.auto.create", "true");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        // java.util.Map (not scala.collection.immutable.Map), so no casts are needed
        Map<String, ?> numbers = ImmutableMap.of("one", 1, "two", 2);
        Map<String, ?> airports = ImmutableMap.of("OTP", "Otopeni", "SFO", "San Fran");
        JavaRDD<Map<String, ?>> javaRDD = jsc.parallelize(ImmutableList.of(numbers, airports));
        // Index both documents under the "spark/docs" index/type
        JavaEsSpark.saveToEs(javaRDD, "spark/docs");
    }
}
Thanks.

Unless you are working with a local instance of Elasticsearch, you need to provide some important settings, notably es.nodes.
You can set it with:
conf.set("es.nodes", "eshost:9200");
You can list several nodes (preferably master nodes); you do not need to list every node in the cluster.
Please refer to the official documentation.
People on the Elastic discussion forum often publish code that you can use as an example.
Make sure you hand several documents at a time to EsSpark or EsSparkStreaming. Do not send one document per call; prefer batches of documents.
EsSpark and EsSparkStreaming connect to the nodes you provide, discover the cluster topology (number and types of nodes), and then send data directly to the data nodes and to the correct shard, avoiding extra hops.
It is possible to prevent data from being pushed directly to the data nodes (using the settings specified in this section of the documentation), but doing so introduces bottlenecks.
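As a rough illustration of the settings discussed above, the sketch below collects connector options in a plain map before they are copied onto the SparkConf. The class name, host names, and batch size are placeholders chosen for this example. es.nodes, es.batch.size.entries, and es.nodes.wan.only are option names from the elasticsearch-hadoop configuration, but please check the documentation for your connector version before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EsSettingsSketch {

    // Builds a map of connector settings. The hosts passed in are placeholders.
    public static Map<String, String> esSettings(String... hosts) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Several nodes can be listed, separated by commas
        settings.put("es.nodes", String.join(",", hosts));
        settings.put("es.index.auto.create", "true");
        // Number of documents sent per bulk request (arbitrary value for this sketch)
        settings.put("es.batch.size.entries", "1000");
        // Keep node discovery on so data can go straight to the data nodes
        settings.put("es.nodes.wan.only", "false");
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = esSettings("eshost1:9200", "eshost2:9200");
        // With Spark on the classpath you could apply them with:
        // settings.forEach(conf::set);
        settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Keeping the settings in one place makes it easier to compare a local run (no es.nodes needed) with a run against a remote cluster.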

How to get .har file or network request using selenium4

One of the features added in the new version of Selenium (4.0.0-alpha-2) is a very nice interface to the Chrome DevTools API in Java. The DevTools API offers great capabilities for controlling the browser and the web traffic.
According to the documentation, with the latest version of Selenium we can capture the network requests of a session.
Previously I used BrowserMob to get the network requests, but unfortunately it has not been updated for a couple of years.
I am looking for someone who has used the Selenium 4 DevTools API to get all the internal requests.
Can anyone suggest how I can start capturing all the requests? Thanks in advance.
You can find @adiohana's example in the selenium-chrome-devtools-examples repo on GitHub.
I think you'd find this test example helpful:
public class ChromeDevToolsTest {
    private static ChromeDriver chromeDriver;
    private static DevTools chromeDevTools;
    @BeforeClass
    public static void initDriverAndDevTools() {
        chromeDriver = new ChromeDriver();
        // dev-tools handler
        chromeDevTools = chromeDriver.getDevTools();
        chromeDevTools.createSession();
    }
    @Test
    public void interceptRequestAndContinue() {
        // enable Network
        chromeDevTools.send(Network.enable(Optional.empty(), Optional.empty(), Optional.empty()));
        // add listener to intercept request and continue
        chromeDevTools.addListener(Network.requestIntercepted(),
                requestIntercepted -> chromeDevTools.send(
                        Network.continueInterceptedRequest(requestIntercepted.getInterceptionId(),
                                Optional.empty(),
                                Optional.empty(),
                                Optional.empty(), Optional.empty(),
                                Optional.empty(),
                                Optional.empty(), Optional.empty())));
        // set request interception only for css requests
        RequestPattern requestPattern = new RequestPattern("*.css", ResourceType.Stylesheet, InterceptionStage.HeadersReceived);
        chromeDevTools.send(Network.setRequestInterception(ImmutableList.of(requestPattern)));
        chromeDriver.get("https://apache.org");
    }
}
You need to add the following imports:
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import org.junit.AfterClass;
import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.devtools.Command;
import org.openqa.selenium.devtools.Console;
import org.openqa.selenium.devtools.DevTools;
import org.openqa.selenium.devtools.network.Network;
import org.openqa.selenium.devtools.network.model.BlockedReason;
import org.openqa.selenium.devtools.network.model.InterceptionStage;
import org.openqa.selenium.devtools.network.model.RequestPattern;
import org.openqa.selenium.devtools.network.model.ResourceType;
import org.openqa.selenium.devtools.security.Security;
import java.util.Optional;

Glassfish & MongoDB connection error : NoClassDefFoundError

I am running a Glassfish server that is trying to connect to MongoDB. At first I created separate projects for the server and MongoDB. Now I am trying to merge those projects, but whatever I try results in a failure.
The current error I am getting is:
2018-07-05T19:54:36.249+0200|Severe: java.lang.NoClassDefFoundError: org/bson/conversions/Bson
I am well aware that the error happens at runtime and that the likely cause is my classpath.
So far I have copied all of my code from one project to the other and added the Maven dependencies, and the following happens:
If I create a separate .java file for my MongoDB code and run it in the same folder as the Glassfish server, it works perfectly fine.
If I run the server and try to call methods from the other (slightly modified) class, the error above appears.
Simplified code example without the error:
import org.bson.Document;
import org.bson.conversions.Bson;
import com.mongodb.BasicDBObject;
import com.mongodb.MongoClient;
import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.MongoIterable;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
public class MyClass{
public static void main(String[]args){
String ip = "127.0.0.1";
int port = 27017;
MongoClient mongoClient = new MongoClient(ip,port);
/* Remaining code */
}
}
With error:
import org.bson.Document;
import org.bson.conversions.Bson;
import com.mongodb.BasicDBObject;
import com.mongodb.MongoClient;
import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.MongoIterable;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
public class MyClass{
private MongoClient mongoClient;
public MyClass(String ip, int port){
mongoClient = new MongoClient(ip, port); // Error called here
}
/* Remaining code */
}
Called from the server.java file:
MyClass mc = new MyClass("127.0.0.1",27017);
I also tried to download all of the bson jar files separately and add them to the project but that had no effect...
The working solution for me was to delete the whole project and create it once more. Apparently there was a problem with Eclipse or I made a mistake before and forgot about it.
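For anyone who hits the same NoClassDefFoundError and wants to check the classpath before recreating the project, here is a minimal diagnostic sketch. It only uses the JDK; the class name ClasspathProbe and its methods are made up for this example. It reports whether a class such as org.bson.conversions.Bson is visible to the class loader that runs the code, and where it was loaded from. Inside Glassfish it would have to be called from the deployed application (for example from a servlet), because the server's class loader differs from the one used by a standalone main method.

```java
import java.security.CodeSource;

public class ClasspathProbe {

    // Returns true if the class can be found by this class's class loader.
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the jar or directory the class was loaded from, if known.
    public static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null ? "(JDK or bootstrap class path)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        String name = "org.bson.conversions.Bson";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println(name + " location: " + locationOf(name));
    }
}
```

If the class is reported as not found inside the server but is found in a standalone run, the jar is probably missing from the deployed archive rather than from the project's build path.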

Where is log declared?

I'm trying to propose a patch to deeplearning4j, but first I need to be able to build the project. I'm able to build it from maven using the manual instructions, but IntelliJ (2016.3.6) is finding errors, and when I look at the source code, I don't blame it.
The source file I'm specifically stumped by is https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j-nlp-parent/deeplearning4j-nlp/src/main/java/org/deeplearning4j/models/word2vec/StaticWord2Vec.java, which has a couple references to a variable log that's not declared in this file.
package org.deeplearning4j.models.word2vec;
import lombok.extern.slf4j.Slf4j;
import org.deeplearning4j.models.embeddings.WeightLookupTable;
import org.deeplearning4j.models.embeddings.reader.ModelUtils;
import org.deeplearning4j.models.embeddings.wordvectors.WordVectors;
import org.deeplearning4j.models.word2vec.wordstore.VocabCache;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.compression.AbstractStorage;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
/**
* This is special limited Word2Vec implementation, suited for serving as lookup table in concurrent multi-gpu environment
* This implementation DOES NOT load all vectors onto any of gpus, instead of that it holds vectors in, optionally, compressed state in host memory.
* This implementation DOES NOT provide some of original Word2Vec methods, such as wordsNearest or wordsNearestSum.
*
* @author raver119@gmail.com
*/
@Slf4j
public class StaticWord2Vec implements WordVectors {
private List<Map<Integer, INDArray>> cacheWrtDevice = new ArrayList<>();
private AbstractStorage<Integer> storage;
private long cachePerDevice = 0L;
private VocabCache<VocabWord> vocabCache;
private String unk = null;
... snipped
The class implements an interface but does not explicitly extend a parent class. Inspecting the class file generated by Maven with javap, I see:
Compiled from "StaticWord2Vec.java"
public class org.deeplearning4j.models.word2vec.StaticWord2Vec
implements org.deeplearning4j.models.embeddings.wordvectors.WordVectors {
private static final org.slf4j.Logger log;
... snipped
I finally noticed the @Slf4j annotation and, tracing the import statement, discovered that I needed to add the Lombok plugin to IntelliJ to be able to build this project.
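To make the javap output less mysterious: Lombok adds the log field at compile time, so it never appears in the source file. The sketch below is a hand-written stand-in for that generated field. It is an illustration, not Lombok's output: it uses java.util.logging so that it runs without extra dependencies, whereas @Slf4j generates an org.slf4j.Logger obtained from org.slf4j.LoggerFactory. The class and method names are made up for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.logging.Logger;

public class HandWrittenLog {

    // Stand-in for the field Lombok generates (the real one is an org.slf4j.Logger)
    private static final Logger log = Logger.getLogger(HandWrittenLog.class.getName());

    // Describes the field via reflection, similar to what javap prints
    public static String describeLogField() {
        try {
            Field field = HandWrittenLog.class.getDeclaredField("log");
            return Modifier.toString(field.getModifiers()) + " "
                    + field.getType().getName() + " " + field.getName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        log.info("log is an ordinary static field");
        // prints "private static final java.util.logging.Logger log"
        System.out.println(describeLogField());
    }
}
```

Without the Lombok plugin (and annotation processing enabled), the IDE only sees the source, where no such field exists, which is why it flags log as undeclared even though the Maven build succeeds.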

Mongodb database is not created

I am creating a MongoDB database and trying to insert records into it, but the database is not created.
My database name is "myMongoDB" and the collection name is "chanel". When I run the code, it throws
the error below, even though the build reports BUILD SUCCESSFUL.
package databaseconnection;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;
import java.net.UnknownHostException;
public class InsertDriver {
public static void main(String args[])throws UnknownHostException
{
DB db=(new MongoClient("localhost",8080)).getDB("myMongoDB");
DBCollection dbcollection=db.getCollection("chanel");
BasicDBObject basicDBObject=new BasicDBObject();
basicDBObject.put("name", "dhiraj");
basicDBObject.put("subscription", 4100);
dbcollection.insert(basicDBObject);
}
}
Exception in thread "main" java.lang.NoSuchMethodError: com.mongodb.ReadPreference.primary()Lcom/mongodb/ReadPreference;
at com.mongodb.MongoClientOptions$Builder.<init>(MongoClientOptions.java:52)
at com.mongodb.MongoClient.<init>(MongoClient.java:128)
at com.mongodb.MongoClient.<init>(MongoClient.java:117)
at databaseconnection.InsertDriver.main(InsertDriver.java:21)
It looks like you are mixing several different versions of the MongoDB Java client library.
If you take a look at this version of ReadPreference, for instance, http://grepcode.com/file/repo1.maven.org/maven2/org.mongodb/mongo-java-driver/2.7.3/com/mongodb/ReadPreference.java you'll see that there is no "primary" method there. But in a different version it is there: http://grepcode.com/file/repo1.maven.org/maven2/org.mongodb/mongo-java-driver/2.9.1/com/mongodb/ReadPreference.java#ReadPreference.primary%28%29
Could you list all the jars on your classpath so that we can help in more detail? It could be that classes from an old MongoDB client were bundled into some other jar.
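One way to check for mixed versions without listing every jar by hand is to ask the class loader for all locations that provide a given class. The following is a minimal sketch using only the JDK; the class name ClasspathDuplicates and its methods are made up for this example. If it prints more than one location for com.mongodb.ReadPreference, two copies of the driver are probably on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathDuplicates {

    // Converts "com.mongodb.ReadPreference" to "com/mongodb/ReadPreference.class"
    public static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Lists every classpath location that contains the given class file
    public static List<URL> providersOf(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(toResourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (URL location : providersOf("com.mongodb.ReadPreference")) {
            System.out.println(location);
        }
    }
}
```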
One thing I want to make clear: MongoDB creates a database only when you insert data into one of its collections.
First of all, check that MongoDB is running on your machine (by default it listens on port 27017).
Try to insert some sample data from the mongo shell.
Sample commands:
use testDB
db.testCollection.insert({"name":"dev"});
This will insert the data into testCollection in the testDB database. You can find it using:
db.testCollection.find()
If all of this works fine, then proceed with the Java driver.
Your code looks good apart from the port 8080 (I am assuming you manually changed the port from 27017 to 8080). Also make sure MongoDB is running.
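If you want to check from Java whether anything is listening on a port before involving the driver, a small sketch along these lines may help. It uses only the JDK; the class name PortCheck and the timeout value are choices made for this example. Note that it only tells you that some process accepts TCP connections on that port, not that the process is MongoDB.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 27017 is MongoDB's default port; 8080 is the port used in the question
        System.out.println("27017 open: " + isListening("localhost", 27017, 1000));
        System.out.println("8080 open:  " + isListening("localhost", 8080, 1000));
    }
}
```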
Actually, I don't know what was wrong with my previous code, but I uninstalled
MongoDB completely, reinstalled it, and tried the following code, which worked fine for me.
package mongod;
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.net.UnknownHostException;
import java.util.List;
public class Mongod {
//private static Object mongo;
Mongod mongo;
public static void main(String[] args) throws UnknownHostException {
MongoClient mongoClient = new MongoClient("localhost", 27017);
DB db = mongoClient.getDB("testDB1");
DBCollection dbcollection=db.getCollection("testCollection");
BasicDBObject basicDBObject=new BasicDBObject();
basicDBObject.put("name", "dhiraj");
basicDBObject.put("subscription", 4100);
dbcollection.insert(basicDBObject);
//boolean auth = db.authenticate("admin", "admin123".toCharArray());
//System.out.println(auth);
List<String> dbs = mongoClient.getDatabaseNames();
for (String dbss : dbs) {
System.out.println(dbss);
}
}
}

Read the contents of the import Package

I am working on creating computer-controlled bots for a game using Java. I got an example bot program and am currently trying to understand it.
I cannot work out what @JProp means in the code below. Can anyone help me with this? Also, how do I view the contents of the imported files listed at the start of the program?
package com.mycompany.mavenproject1;
import cz.cuni.amis.introspection.java.JProp;
import cz.cuni.amis.pogamut.base.agent.impl.AgentId;
import cz.cuni.amis.pogamut.base.agent.module.comm.PogamutJVMComm;
import cz.cuni.amis.pogamut.base.agent.navigation.IPathExecutorState;
import cz.cuni.amis.pogamut.base.communication.worldview.listener.annotation.EventListener;
import cz.cuni.amis.pogamut.base.utils.guice.AgentScoped;
import cz.cuni.amis.pogamut.base3d.worldview.object.ILocated;
import cz.cuni.amis.pogamut.base3d.worldview.object.Location;
import cz.cuni.amis.pogamut.unreal.communication.messages.UnrealId;
import cz.cuni.amis.pogamut.ut2004.agent.module.utils.TabooSet;
import cz.cuni.amis.pogamut.ut2004.agent.navigation.UT2004PathAutoFixer;
import cz.cuni.amis.pogamut.ut2004.agent.navigation.stuckdetector.UT2004DistanceStuckDetector;
import cz.cuni.amis.pogamut.ut2004.agent.navigation.stuckdetector.UT2004PositionStuckDetector;
import cz.cuni.amis.pogamut.ut2004.agent.navigation.stuckdetector.UT2004TimeStuckDetector;
import cz.cuni.amis.pogamut.ut2004.bot.impl.UT2004Bot;
import cz.cuni.amis.pogamut.ut2004.bot.impl.UT2004BotModuleController;
import cz.cuni.amis.pogamut.ut2004.bot.params.UT2004BotParameters;
import cz.cuni.amis.pogamut.ut2004.communication.messages.UT2004ItemType;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbcommands.Initialize;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.BotKilled;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.ConfigChange;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.FlagInfo;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.GameInfo;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.InitedMessage;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.Item;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.NavPoint;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.Player;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.PlayerKilled;
import cz.cuni.amis.pogamut.ut2004.communication.messages.gbinfomessages.Self;
import cz.cuni.amis.pogamut.ut2004.utils.UT2004BotRunner;
import cz.cuni.amis.utils.Heatup;
import cz.cuni.amis.utils.exception.PogamutException;
import cz.cuni.amis.utils.flag.FlagListener;
/**
* Example of Simple Pogamut bot, that randomly walks around the map searching
* for preys shooting at everything that is in its way.
*
* @author Rudolf Kadlec aka ik
* @author Jimmy
*/
@AgentScoped
public class CTFBot extends UT2004BotModuleController<UT2004Bot> {
/** boolean switch to activate engage behavior */
@JProp
public boolean shouldEngage = true;
/** boolean switch to activate pursue behavior */
It seems this JProp annotation is used for introspection purposes (allowing the contents of the variable which is decorated to be easily inspected from within your IDE).
Quoting this manual:
Introspection is designed to ease the bot's parameterization. It is
often needed to adjust multiple behavior parameters at runtime and you
will probably end up creating your own GUI (graphical user interface)
for this purpose. In introspection, you just annotate desired
variables with @JProp annotation and they will be accessible via the
Netbeans GUI.
Let's look how logging and introspection works in EmptyBot example.
First start the bot (F6), then have a look on it's source code. In the
initial section several variables annotated with the @JProp are
defined.
@JProp
public String stringProp = "Hello bot example";
@JProp
public boolean boolProp = true;
@JProp
public int intProp = 2;
@JProp
public double doubleProp = 1.0;
Now expand bot's node under the UT server node (in Services tab), you
will see two new nodes - Logs and Introspection. After selecting the
Introspection node the annotated variables will be shown in the
Properties (Ctrl + Shift + 7) window. Note that the intProp variable
is being continuously updated. New values of variables can be also set
in this window.
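To give an idea of how a tool can find fields marked with an annotation such as @JProp, here is a minimal sketch based on plain Java reflection. It is not Pogamut's implementation: the @Tunable annotation, the DemoBot class, and the readTunables method are all hypothetical names invented for this example, and the real introspection layer may work differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

public class IntrospectionSketch {

    // Hypothetical marker annotation, kept at run time so reflection can see it
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Tunable {}

    // Hypothetical bot with two exposed fields and one hidden field
    public static class DemoBot {
        @Tunable public boolean shouldEngage = true;
        @Tunable public int intProp = 2;
        public String notExposed = "hidden";
    }

    // Collects the current values of all public fields marked with @Tunable
    public static Map<String, Object> readTunables(Object target) {
        Map<String, Object> values = new TreeMap<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(Tunable.class)) {
                try {
                    values.put(field.getName(), field.get(target));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // prints {intProp=2, shouldEngage=true}
        System.out.println(readTunables(new DemoBot()));
    }
}
```

A GUI could call a method like readTunables periodically to display the values, and use Field.set to write back a value edited by the user.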
