How to bulk index HTML files with Solr Cell?

I have a directory that contains 200 million HTML files (don't look at me, I didn't create this mess, I just have to deal with it). I need to index every HTML file in that directory into Solr. I've been reading guides on getting the job done, and I've got something going right now. After about an hour, I've got about 100k indexed, meaning this is going to take roughly 85 days.
I'm indexing the files to a standalone Solr server, running on a c4.8xlarge AWS EC2 instance. Here's the output from free -m with the Solr server running, and the indexer I wrote running as well:
                    total       used       free     shared    buffers     cached
Mem:                60387      12981      47405          0         19       4732
-/+ buffers/cache:              8229      52157
Swap:                   0          0          0
As you can see, I'm doing pretty good on resources. I increased the number of maxWarmingSearchers to 200 in my Solr config, because I was getting the error:
Exceeded limit of maxWarmingSearchers=2, try again later
Alright, but I don't think increasing that limit was really the right approach. I think the issue is that for each file, I am doing a commit, and I should be doing this in bulk (say 50k files / commit), but I'm not entirely sure how to adapt this code for that, and every example I see does a single file at a time. I really need to do everything I can to make this run as fast as possible, since I don't really have 85 days to wait on getting the data in Solr.
Here's my code:
Index.java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
public class Index {
public static void main(String[] args) {
String directory = "/opt/html";
String solrUrl = "URL";
final int QUEUE_SIZE = 250000;
final int MAX_THREADS = 300;
BlockingQueue<String> queue = new LinkedBlockingQueue<>(QUEUE_SIZE);
SolrProducer producer = new SolrProducer(queue, directory);
new Thread(producer).start();
for (int i = 1; i <= MAX_THREADS; i++)
new Thread(new SolrConsumer(queue, solrUrl)).start();
}
}
Producer.java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.BlockingQueue;
public class SolrProducer implements Runnable {
private BlockingQueue<String> queue;
private String directory;
public SolrProducer(BlockingQueue<String> queue, String directory) {
this.queue = queue;
this.directory = directory;
}
@Override
public void run() {
try {
Path path = Paths.get(directory);
Files.walkFileTree(path, new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
if (!attrs.isDirectory()) {
try {
queue.put(file.toString());
} catch (InterruptedException e) {
// restore the interrupt status so the walk can be stopped cleanly
Thread.currentThread().interrupt();
}
}
return FileVisitResult.CONTINUE;
}
});
} catch (IOException e) {
e.printStackTrace();
}
}
}
Consumer.java
import co.talentiq.common.net.SolrManager;
import org.apache.solr.client.solrj.SolrServerException;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
public class SolrConsumer implements Runnable {
private BlockingQueue<String> queue;
private static SolrManager sm;
public SolrConsumer(BlockingQueue<String> queue, String url) {
this.queue = queue;
if (sm == null)
this.sm = new SolrManager(url);
}
@Override
public void run() {
try {
while (true) {
String file = queue.take();
sm.indexFile(file);
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (SolrServerException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
SolrManager.java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import java.io.File;
import java.io.IOException;
import java.util.UUID;
public class SolrManager {
private static String urlString;
private static SolrClient solr;
public SolrManager(String url) {
urlString = url;
if (solr == null)
solr = new HttpSolrClient(url);
}
public void indexFile(String fileName) throws IOException, SolrServerException {
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
String solrId = UUID.randomUUID().toString();
up.addFile(new File(fileName), solrId);
up.setParam("literal.id", solrId);
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
solr.request(up);
}
}

You can use up.setCommitWithin(10000); to make Solr just commit automagically at least every ten seconds. Increase the value to make Solr commit each minute (60000) or each ten minutes (600000). Remove the explicit commit (setAction(..)).
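For example, inside the SolrManager from the question, indexFile() could look roughly like this (just a sketch; the "text/html" content type is an assumption about your files):

public void indexFile(String fileName) throws IOException, SolrServerException {
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    String solrId = UUID.randomUUID().toString();
    up.addFile(new File(fileName), "text/html"); // second argument is the content type
    up.setParam("literal.id", solrId);
    up.setCommitWithin(60000);                   // let Solr commit on its own within 60 seconds
    solr.request(up);                            // no explicit commit per document
}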
Another option is to configure autoCommit in your configuration file.
You might also be able to index quicker by moving the HTML extraction process out from Solr (and just submitting the text to be indexed), or expanding the amount of servers you're posting to (more nodes in the cluster).
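If you do move the extraction out of Solr, a minimal sketch with Jsoup could look like this (the field names "id", "title" and "content" are assumptions; use whatever your schema actually defines):

import java.io.File;
import java.io.IOException;
import java.util.UUID;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrInputDocument;
import org.jsoup.Jsoup;

public class PlainTextIndexer {
    public static void indexFile(SolrClient solr, String fileName) throws IOException, SolrServerException {
        // parse the HTML locally so Solr never has to run Tika
        org.jsoup.nodes.Document html = Jsoup.parse(new File(fileName), "UTF-8");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", UUID.randomUUID().toString());
        doc.addField("title", html.title());
        doc.addField("content", html.text());
        solr.add(doc, 60000); // commitWithin 60 seconds
    }
}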

I'm guessing you won't be searching the index in parallel while documents are being indexed. So here are the things that you could do.
You can configure the autoCommit option in your solrconfig.xml. It can be based on the number of documents or on a time interval; for you, the number-of-documents option would make more sense.
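For reference, a hard autoCommit in solrconfig.xml looks roughly like this (it goes inside <updateHandler>; the numbers are only examples to tune):

<autoCommit>
  <maxDocs>50000</maxDocs>      <!-- commit after 50k documents -->
  <maxTime>60000</maxTime>      <!-- or after 60 seconds, whichever comes first -->
  <openSearcher>false</openSearcher>
</autoCommit>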
Remove the call to the setAction() method on the ContentStreamUpdateRequest object. You can maintain a count of the number of calls made to the indexFile() method. Say, if it reaches 25,000/10,000 (you can limit the count based on your heap), then for that indexing call alone you can perform the commit using the SolrClient object, like solr.commit(), so that the commit is made once per the specified count.
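A rough sketch of that counting idea, reusing the shared SolrManager from the question (the 25,000 threshold is just an example):

private static final AtomicLong indexed = new AtomicLong(); // java.util.concurrent.atomic.AtomicLong

public void indexFile(String fileName) throws IOException, SolrServerException {
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    String solrId = UUID.randomUUID().toString();
    up.addFile(new File(fileName), "text/html");
    up.setParam("literal.id", solrId);
    solr.request(up);                             // no setAction(), so no commit per document
    if (indexed.incrementAndGet() % 25000 == 0) {
        solr.commit();                            // one commit per 25k documents
    }
}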
Let me know the results. Good Luck!

Related

Guaranteed to run function before AWS lambda exits

Is there a way in the JVM to guarantee that some function will run before an AWS Lambda function exits? I would like to flush an internal buffer to stdout as a last action in a Lambda function even if some exception is thrown.
As far as I understand, you want to execute some code before your Lambda function is stopped, regardless of what your execution state is (running/waiting/exception handling/etc).
This is not possible out of the box with Lambda, i.e. there is no event fired or anything similar that can be identified as a shutdown hook. The JVM will be frozen as soon as you hit the timeout. However, you can observe the remaining execution time by using the method getRemainingTimeInMillis() from the Context object. From the docs:
Returns the number of milliseconds left before the execution times out.
So, when initializing your function you can schedule a task which is regularly checking how much time is left until your Lambda function reaches the timeout. Then, if only less than X (milli-)seconds are left, you do Y.
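A minimal sketch of that idea (the handler class, the 2-second threshold and flushBuffers() are all placeholders, not part of the AWS API):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class FlushingHandler implements RequestHandler<Object, String> {
    private final ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();

    @Override
    public String handleRequest(Object input, Context context) {
        // poll the remaining time; when we get close to the timeout, flush our buffers
        ScheduledFuture<?> check = watchdog.scheduleAtFixedRate(() -> {
            if (context.getRemainingTimeInMillis() < 2000) {
                flushBuffers();
            }
        }, 0, 500, TimeUnit.MILLISECONDS);
        try {
            // ... the actual work of the function ...
            return "done";
        } finally {
            check.cancel(false); // stop the watchdog once the invocation finishes normally
        }
    }

    private void flushBuffers() {
        System.out.flush(); // placeholder for the real cleanup
    }
}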
The aws-samples repository shows how to do it here, using a JVM shutdown hook:
package helloworld;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
/**
* Handler for requests to Lambda function.
*/
public class App implements RequestHandler<Object, Object> {
static {
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
System.out.println("[runtime] ShutdownHook triggered");
System.out.println("[runtime] Cleaning up");
// perform actual clean up work here.
try {
Thread.sleep(200);
} catch (Exception e) {
System.out.println(e);
}
System.out.println("[runtime] exiting");
System.exit(0);
}
});
}
public Object handleRequest(final Object input, final Context context) {
Map<String, String> headers = new HashMap<>();
headers.put("Content-Type", "application/json");
headers.put("X-Custom-Header", "application/json");
try {
final String pageContents = this.getPageContents("https://checkip.amazonaws.com");
String output = String.format("{ \"message\": \"hello world\", \"location\": \"%s\" }", pageContents);
return new GatewayResponse(output, headers, 200);
} catch (IOException e) {
return new GatewayResponse("{}", headers, 500);
}
}
private String getPageContents(String address) throws IOException {
URL url = new URL(address);
try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
return br.lines().collect(Collectors.joining(System.lineSeparator()));
}
}
}

How to load Apache Ignite Cache when reading from a text file

I created a file helloworld.txt. Now I'm reading from the file and then I want to load the contents of the file into the cache, and whenever the cache is updated, it should write to the file as well.
This is my code so far:
Please tell me what to do to load the cache and then write from the cache to the file, as the instructions in the Apache Ignite documentation are not clear.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.IgniteException;
import org.apache.ignite.Ignition;
import org.apache.ignite.examples.ExampleNodeStartup;
import org.apache.ignite.examples.ExamplesUtils;
public class FileRead {
/** Cache name. */
private static final String CACHE_NAME = "FileCache";
/** Heap size required to run this example. */
public static final int MIN_MEMORY = 512 * 1024 * 1024;
/**
* Executes example.
*
* @param args Command line arguments, none required.
* @throws IgniteException If example execution failed.
*/
public static void main(String[] args) throws IgniteException {
ExamplesUtils.checkMinMemory(MIN_MEMORY);
try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
System.out.println();
try (IgniteCache<Integer, String> cache = ignite.getOrCreateCache(CACHE_NAME)) {
long start = System.currentTimeMillis();
try (IgniteDataStreamer<Integer, String> stmr = ignite.dataStreamer(CACHE_NAME)) {
// Configure loader.
stmr.perNodeBufferSize(1024);
stmr.perNodeParallelOperations(8);
///FileReads();
try {
BufferedReader in = new BufferedReader
(new FileReader("/Users/akritibahal/Desktop/helloworld.txt"));
String str;
int i=0;
while ((str = in.readLine()) != null) {
System.out.println(str);
stmr.addData(i,str);
i++;
}
System.out.println("Loaded " + i + " keys.");
}
catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
}
For information on how to load the cache from a persistence store please refer to this page: https://apacheignite.readme.io/docs/data-loading
You have two options:
Start a client node, create IgniteDataStreamer and use it to load the data. Simply call addData() for each line in the file.
Implement CacheStore.loadCache() method, provide the implementation in the cache configuration and call IgniteCache.loadCache().
The second approach will require the file to be present on all server nodes, but there will be no communication between nodes, so it will most likely be faster (sketched below).
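A rough sketch of the second option, reusing the file path and cache name from the question (treat it as an outline, not a complete write-through implementation):

import java.io.BufferedReader;
import java.io.FileReader;
import javax.cache.Cache;
import javax.cache.configuration.FactoryBuilder;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.lang.IgniteBiInClosure;

public class FileCacheStore extends CacheStoreAdapter<Integer, String> {
    private static final String FILE = "/Users/akritibahal/Desktop/helloworld.txt";

    @Override
    public void loadCache(IgniteBiInClosure<Integer, String> clo, Object... args) {
        try (BufferedReader in = new BufferedReader(new FileReader(FILE))) {
            String line;
            int key = 0;
            while ((line = in.readLine()) != null)
                clo.apply(key++, line); // push each line into the cache
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public String load(Integer key) { return null; } // not needed for bulk loading

    @Override
    public void write(Cache.Entry<? extends Integer, ? extends String> entry) {
        // append entry.getValue() to the file here to get write-through behaviour
    }

    @Override
    public void delete(Object key) { /* remove the corresponding line if needed */ }
}

class FileCacheExample {
    public static void main(String[] args) {
        CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("FileCache");
        cfg.setCacheStoreFactory(FactoryBuilder.factoryOf(FileCacheStore.class));
        cfg.setReadThrough(true);
        cfg.setWriteThrough(true);
        try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);
            cache.loadCache(null); // invokes FileCacheStore.loadCache() on all server nodes
        }
    }
}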

With Elastic Beanstalk, can I determine programmatically if I'm on the leader node?

I have some housekeeping tasks within an Elastic Beanstalk Java application running on Tomcat, and I need to run them every so often. I want these tasks run only on the leader node (or, more correctly, on a single node, but the leader seems like an obvious choice).
I was looking at running cron jobs within Elastic Beanstalk, but it feels like this should be more straightforward than what I've come up with. Ideally, I'd like one of these two options within my web app:
Some way of testing within the current JRE whether or not this server is the leader node
Some some way to hit a specific URL (wget?) to trigger the task, but also restrict that URL to requests from localhost.
Suggestions?
It is not possible, by design (leaders are only assigned during deployment and are not needed in other contexts). However, you can tweak and use the EC2 metadata for this exact purpose.
Here's a working example of how to achieve this result (original source). Once you call getLeader, it will find (or assign) an instance to be set as the leader:
package br.com.ingenieux.resource;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import org.apache.commons.io.IOUtils;
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.model.CreateTagsRequest;
import com.amazonaws.services.ec2.model.DeleteTagsRequest;
import com.amazonaws.services.ec2.model.DescribeInstancesRequest;
import com.amazonaws.services.ec2.model.Filter;
import com.amazonaws.services.ec2.model.Instance;
import com.amazonaws.services.ec2.model.Reservation;
import com.amazonaws.services.ec2.model.Tag;
import com.amazonaws.services.elasticbeanstalk.AWSElasticBeanstalk;
import com.amazonaws.services.elasticbeanstalk.model.DescribeEnvironmentsRequest;
#Path("/admin/leader")
public class LeaderResource extends BaseResource {
@Inject
AmazonEC2 amazonEC2;
@Inject
AWSElasticBeanstalk elasticBeanstalk;
@GET
public String getLeader() throws Exception {
/*
* Avoid running if we're not in AWS after all
*/
try {
IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id")
.openStream());
} catch (Exception exc) {
return "i-FFFFFFFF/localhost";
}
String environmentName = getMyEnvironmentName();
List<Instance> environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName,
"tag:leader", "true");
if (environmentInstances.isEmpty()) {
environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName);
Collections.shuffle(environmentInstances);
if (environmentInstances.size() > 1)
environmentInstances.removeAll(environmentInstances.subList(1,
environmentInstances.size()));
amazonEC2.createTags(new CreateTagsRequest().withResources(
environmentInstances.get(0).getInstanceId()).withTags(
new Tag("leader", "true")));
} else if (environmentInstances.size() > 1) {
DeleteTagsRequest deleteTagsRequest = new DeleteTagsRequest().withTags(new Tag().withKey("leader").withValue("true"));
for (Instance i : environmentInstances.subList(1,
environmentInstances.size())) {
deleteTagsRequest.getResources().add(i.getInstanceId());
}
amazonEC2.deleteTags(deleteTagsRequest);
}
return environmentInstances.get(0).getInstanceId() + "/" + environmentInstances.get(0).getPublicIpAddress();
}
@GET
@Produces("text/plain")
@Path("am-i-a-leader")
public boolean isLeader() {
/*
* Avoid running if we're not in AWS after all
*/
String myInstanceId = null;
String environmentName = null;
try {
myInstanceId = IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id")
.openStream());
environmentName = getMyEnvironmentName();
} catch (Exception exc) {
return false;
}
List<Instance> environmentInstances = getInstances(
"tag:elasticbeanstalk:environment-name", environmentName,
"tag:leader", "true", "instance-id", myInstanceId);
return (1 == environmentInstances.size());
}
protected String getMyEnvironmentHost(String environmentName) {
return elasticBeanstalk
.describeEnvironments(
new DescribeEnvironmentsRequest()
.withEnvironmentNames(environmentName))
.getEnvironments().get(0).getCNAME();
}
private String getMyEnvironmentName() throws IOException,
MalformedURLException {
String instanceId = IOUtils.toString(new URL(
"http://169.254.169.254/latest/meta-data/instance-id"));
/*
* Grab the current environment name
*/
DescribeInstancesRequest request = new DescribeInstancesRequest()
.withInstanceIds(instanceId)
.withFilters(
new Filter("instance-state-name").withValues("running"));
for (Reservation r : amazonEC2.describeInstances(request)
.getReservations()) {
for (Instance i : r.getInstances()) {
for (Tag t : i.getTags()) {
if ("elasticbeanstalk:environment-name".equals(t.getKey())) {
return t.getValue();
}
}
}
}
return null;
}
public List<Instance> getInstances(String... args) {
Collection<Filter> filters = new ArrayList<Filter>();
filters.add(new Filter("instance-state-name").withValues("running"));
for (int i = 0; i < args.length; i += 2) {
String key = args[i];
String value = args[1 + i];
filters.add(new Filter(key).withValues(value));
}
DescribeInstancesRequest req = new DescribeInstancesRequest()
.withFilters(filters);
List<Instance> result = new ArrayList<Instance>();
for (Reservation r : amazonEC2.describeInstances(req).getReservations())
result.addAll(r.getInstances());
return result;
}
}
You can keep a secret URL (a long URL is un-guessable, almost as safe as a password) and hit this URL from somewhere; on that hit you can execute the task.
One problem, however, is that if the task takes too long, your server capacity will be limited during that time. Another approach would be for the URL hit to post a message to AWS SQS. Then another EC2 instance can have code which waits on SQS and executes the task. You can also look into http://aws.amazon.com/swf/
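A sketch of that SQS hand-off (the queue URL and the task method are placeholders): the URL hit sends a message, and a separate worker instance polls the queue and runs the task.

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class HousekeepingWorker {
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/housekeeping"; // placeholder

    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        // the web node side would simply do: sqs.sendMessage(QUEUE_URL, "run-housekeeping");
        while (true) {
            ReceiveMessageRequest req = new ReceiveMessageRequest(QUEUE_URL).withWaitTimeSeconds(20);
            for (Message m : sqs.receiveMessage(req).getMessages()) {
                runHousekeepingTask(m.getBody());                   // do the actual work
                sqs.deleteMessage(QUEUE_URL, m.getReceiptHandle()); // acknowledge
            }
        }
    }

    private static void runHousekeepingTask(String body) {
        // placeholder for the actual housekeeping logic
    }
}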
Another approach if you're running on the Linux-type EC2 instance:
Write a shell script that does (or triggers) your periodic task
Leveraging the .ebextensions feature to customize your Elastic Beanstalk instance, create a container command that specifies the parameter leader_only: true -- this command will only run on an instance that is designated the leader in your Auto Scaling group
Have your container command copy your shell script into /etc/cron.hourly (or daily or whatever).
The result will be that your "leader" EC2 instance will have a cron job running hourly (or daily or whatever) to do your periodic task and the other instances in your Auto Scaling group will not.
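A minimal .ebextensions sketch of such a container command (the file and script names are illustrative):

# .ebextensions/housekeeping-cron.config
container_commands:
  01_install_housekeeping_cron:
    command: "cp .ebextensions/housekeeping.sh /etc/cron.hourly/housekeeping && chmod +x /etc/cron.hourly/housekeeping"
    leader_only: true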

Creating graph with Neo4j graph database takes too long

I use the following code to create a graph with Neo4j Graph Database:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.index.IndexHits;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.index.lucene.unsafe.batchinsert.LuceneBatchInserterIndexProvider;
import org.neo4j.unsafe.batchinsert.BatchInserter;
import org.neo4j.unsafe.batchinsert.BatchInserterIndex;
import org.neo4j.unsafe.batchinsert.BatchInserterIndexProvider;
import org.neo4j.unsafe.batchinsert.BatchInserters;
public class Neo4jMassiveInsertion implements Insertion {
private BatchInserter inserter = null;
private BatchInserterIndexProvider indexProvider = null;
private BatchInserterIndex nodes = null;
private static enum RelTypes implements RelationshipType {
SIMILAR
}
public static void main(String args[]) {
Neo4jMassiveInsertion test = new Neo4jMassiveInsertion();
test.startup("data/neo4j");
test.createGraph("data/enronEdges.txt");
test.shutdown();
}
/**
* Start neo4j database and configure for massive insertion
* @param neo4jDBDir
*/
public void startup(String neo4jDBDir) {
System.out.println("The Neo4j database is now starting . . . .");
Map<String, String> config = new HashMap<String, String>();
inserter = BatchInserters.inserter(neo4jDBDir, config);
indexProvider = new LuceneBatchInserterIndexProvider(inserter);
nodes = indexProvider.nodeIndex("nodes", MapUtil.stringMap("type", "exact"));
}
public void shutdown() {
System.out.println("The Neo4j database is now shuting down . . . .");
if(inserter != null) {
indexProvider.shutdown();
inserter.shutdown();
indexProvider = null;
inserter = null;
}
}
public void createGraph(String datasetDir) {
System.out.println("Creating the Neo4j database . . . .");
try {
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(datasetDir)));
String line;
int lineCounter = 1;
Map<String, Object> properties;
IndexHits<Long> cache;
long srcNode, dstNode;
while((line = reader.readLine()) != null) {
if(lineCounter > 4) {
String[] parts = line.split("\t");
cache = nodes.get("nodeId", parts[0]);
if(cache.hasNext()) {
srcNode = cache.next();
}
else {
properties = MapUtil.map("nodeId", parts[0]);
srcNode = inserter.createNode(properties);
nodes.add(srcNode, properties);
nodes.flush();
}
cache = nodes.get("nodeId", parts[1]);
if(cache.hasNext()) {
dstNode = cache.next();
}
else {
properties = MapUtil.map("nodeId", parts[1]);
dstNode = inserter.createNode(properties);
nodes.add(dstNode, properties);
nodes.flush();
}
inserter.createRelationship(srcNode, dstNode, RelTypes.SIMILAR, null);
}
lineCounter++;
}
reader.close();
}
catch (IOException e) {
e.printStackTrace();
}
}
}
Compared with other graph database technologies (Titan, OrientDB), it takes too much time, so maybe I am doing something wrong. Is there a way to speed up the procedure?
I use Neo4j 1.9.5; my machine has a 2.3 GHz CPU (i5), 4GB RAM and a 320GB disk, and I am running Mac OS X Mavericks (10.9). My heap size is set to 2GB.
Usually I can import about 1M nodes and 200k relationships per second on my macbook.
Flush & Search
Please don't flush & search on every insert, that totally kills performance.
Keep your nodeIds in a HashMap from your data to node-id, and only write to lucene during the import.
(If you care about memory usage you can also go with something like gnu-trove)
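A sketch of what that change looks like inside the class from the question (reusing its inserter, nodes and RelTypes fields):

// node ids are cached in memory, so lucene is only written to (and never flushed) during the import
private final Map<String, Long> idCache = new HashMap<>();

private long getOrCreateNode(String nodeId) {
    Long id = idCache.get(nodeId);
    if (id == null) {
        Map<String, Object> properties = MapUtil.map("nodeId", nodeId);
        id = inserter.createNode(properties);
        nodes.add(id, properties); // index write only, no flush here
        idCache.put(nodeId, id);
    }
    return id;
}

// and inside the while loop of createGraph():
String[] parts = line.split("\t");
inserter.createRelationship(getOrCreateNode(parts[0]), getOrCreateNode(parts[1]),
        RelTypes.SIMILAR, null);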
RAM
Memory Mapping
You also use too little RAM (I usually use heaps between 4 and 60GB depending on the data set size) and you don't have any config set.
As a sensible config, please check something like the following; depending on your data volume I'd raise these numbers.
cache_type=none
use_memory_mapped_buffers=true
neostore.nodestore.db.mapped_memory=200M
neostore.relationshipstore.db.mapped_memory=1000M
neostore.propertystore.db.mapped_memory=250M
neostore.propertystore.db.strings.mapped_memory=250M
Heap
And make sure to give it enough heap. You might also have a disk that is not the fastest. Try to increase your heap to at least 3GB. Also make sure to have the latest JDK; 1.7.._b25 had a memory allocation issue (it allocated only a tiny bit of memory for the memory mapping).

NoClassDefFoundError and Netty

First off, I'm a n00b in Java. I can understand most concepts, but in my situation I want somebody to help me. I'm using JBoss Netty to handle a simple HTTP request and using MemCachedClient to check for the existence of the client IP in memcached.
import org.jboss.netty.channel.ChannelHandler;
import static org.jboss.netty.handler.codec.http.HttpHeaders.*;
import static org.jboss.netty.handler.codec.http.HttpHeaders.Names.*;
import static org.jboss.netty.handler.codec.http.HttpResponseStatus.*;
import static org.jboss.netty.handler.codec.http.HttpVersion.*;
import com.danga.MemCached.*;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.buffer.ChannelBuffers;
import org.jboss.netty.channel.ChannelFuture;
import org.jboss.netty.channel.ChannelFutureListener;
import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.channel.ExceptionEvent;
import org.jboss.netty.channel.MessageEvent;
import org.jboss.netty.channel.SimpleChannelUpstreamHandler;
import org.jboss.netty.handler.codec.http.Cookie;
import org.jboss.netty.handler.codec.http.CookieDecoder;
import org.jboss.netty.handler.codec.http.CookieEncoder;
import org.jboss.netty.handler.codec.http.DefaultHttpResponse;
import org.jboss.netty.handler.codec.http.HttpChunk;
import org.jboss.netty.handler.codec.http.HttpChunkTrailer;
import org.jboss.netty.handler.codec.http.HttpRequest;
import org.jboss.netty.handler.codec.http.HttpResponse;
import org.jboss.netty.handler.codec.http.HttpResponseStatus;
import org.jboss.netty.handler.codec.http.QueryStringDecoder;
import org.jboss.netty.util.CharsetUtil;
/**
* @author The Netty Project
* @author Andy Taylor (andy.taylor@jboss.org)
* @author Trustin Lee
*
* @version $Rev: 2368 $, $Date: 2010-10-18 17:19:03 +0900 (Mon, 18 Oct 2010) $
*/
@SuppressWarnings({"ALL"})
public class HttpRequestHandler extends SimpleChannelUpstreamHandler {
private HttpRequest request;
private boolean readingChunks;
/** Buffer that stores the response content */
private final StringBuilder buf = new StringBuilder();
protected MemCachedClient mcc = new MemCachedClient();
private static SockIOPool poolInstance = null;
static {
// server list and weights
String[] servers =
{
"lcalhost:11211"
};
//Integer[] weights = { 3, 3, 2 };
Integer[] weights = {1};
// grab an instance of our connection pool
SockIOPool pool = SockIOPool.getInstance();
// set the servers and the weights
pool.setServers(servers);
pool.setWeights(weights);
// set some basic pool settings
// 5 initial, 5 min, and 250 max conns
// and set the max idle time for a conn
// to 6 hours
pool.setInitConn(5);
pool.setMinConn(5);
pool.setMaxConn(250);
pool.setMaxIdle(21600000); //1000 * 60 * 60 * 6
// set the sleep for the maint thread
// it will wake up every x seconds and
// maintain the pool size
pool.setMaintSleep(30);
// set some TCP settings
// disable nagle
// set the read timeout to 3 secs
// and don't set a connect timeout
pool.setNagle(false);
pool.setSocketTO(3000);
pool.setSocketConnectTO(0);
// initialize the connection pool
pool.initialize();
// lets set some compression on for the client
// compress anything larger than 64k
//mcc.setCompressEnable(true);
//mcc.setCompressThreshold(64 * 1024);
}
@Override
public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) throws Exception {
HttpRequest request = this.request = (HttpRequest) e.getMessage();
if(mcc.get(request.getHeader("X-Real-Ip")) != null)
{
HttpResponse response = new DefaultHttpResponse(HTTP_1_1, OK);
response.setHeader("X-Accel-Redirect", request.getUri());
ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE);
}
else {
sendError(ctx, NOT_FOUND);
}
}
private void writeResponse(MessageEvent e) {
// Decide whether to close the connection or not.
boolean keepAlive = isKeepAlive(request);
// Build the response object.
HttpResponse response = new DefaultHttpResponse(HTTP_1_1, OK);
response.setContent(ChannelBuffers.copiedBuffer(buf.toString(), CharsetUtil.UTF_8));
response.setHeader(CONTENT_TYPE, "text/plain; charset=UTF-8");
if (keepAlive) {
// Add 'Content-Length' header only for a keep-alive connection.
response.setHeader(CONTENT_LENGTH, response.getContent().readableBytes());
}
// Encode the cookie.
String cookieString = request.getHeader(COOKIE);
if (cookieString != null) {
CookieDecoder cookieDecoder = new CookieDecoder();
Set<Cookie> cookies = cookieDecoder.decode(cookieString);
if(!cookies.isEmpty()) {
// Reset the cookies if necessary.
CookieEncoder cookieEncoder = new CookieEncoder(true);
for (Cookie cookie : cookies) {
cookieEncoder.addCookie(cookie);
}
response.addHeader(SET_COOKIE, cookieEncoder.encode());
}
}
// Write the response.
ChannelFuture future = e.getChannel().write(response);
// Close the non-keep-alive connection after the write operation is done.
if (!keepAlive) {
future.addListener(ChannelFutureListener.CLOSE);
}
}
@Override
public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e)
throws Exception {
e.getCause().printStackTrace();
e.getChannel().close();
}
private void sendError(ChannelHandlerContext ctx, HttpResponseStatus status) {
HttpResponse response = new DefaultHttpResponse(HTTP_1_1, status);
response.setHeader(CONTENT_TYPE, "text/plain; charset=UTF-8");
response.setContent(ChannelBuffers.copiedBuffer(
"Failure: " + status.toString() + "\r\n",
CharsetUtil.UTF_8));
// Close the connection as soon as the error message is sent.
ctx.getChannel().write(response).addListener(ChannelFutureListener.CLOSE);
}
}
When I try to send a request like http://127.0.0.1:8090/1/2/3
I'm getting
java.lang.NoClassDefFoundError: com/danga/MemCached/MemCachedClient
at httpClientValidator.server.HttpRequestHandler.<clinit>(HttpRequestHandler.java:66)
I believe it's not related to the classpath. Maybe it's related to the context in which mcc doesn't exist.
Any help appreciated
EDIT:
Original code http://docs.jboss.org/netty/3.2/xref/org/jboss/netty/example/http/snoop/package-summary.html
I've modified some parts to fit my needs.
Why do you think this is not classpath related? That's the kind of error you get when the jar you need is not available. How do you start your app?
EDIT
Sorry - I loaded and tried the java_memcached-release_2.5.2 bundle in Eclipse and found no issue so far. Debugging the class loading revealed nothing unusual. I can't help beyond some more hints to double-check:
make sure your download is correct; download and unpack again (are the com.schooner.* classes available?)
make sure you use Java > 1.5
make sure your classpath is correct and complete. The example you have shown does not include Netty. Where is it?
I'm not familiar with interactions stemming from adding a classpath to the manifest. Maybe revert to plain style: add all the jars needed (memcached, netty, yours) to the classpath and reference the main class to start, not a startable jar file (for example, something like java -cp java_memcached-release_2.5.2.jar:netty.jar:yourapp.jar your.main.Class, adjusting the jar names to your actual downloads).
