Netty UDP Performance Issue - java

I've implemented three small UDP servers: one with a plain Java DatagramSocket (threaded), one with Netty, and one with Netty plus threaded message handling (because Netty doesn't support multiple threads with UDP).
After some measurements I got the following results for requests per second:
DatagramSocket ~30.000 requests/second
Netty ~1.500 requests/second
Netty (threaded): ~8.000 requests/second
The real application I have to implement must handle > 25.000 requests/second. So my question is: am I doing something wrong with Netty, or is Netty not designed to handle that many requests per second?
Here are the implementations
DatagramSocket Main
public static void main(String... args) throws Exception {
    final int port = Integer.parseInt(args[0]);
    final int threads = Integer.parseInt(args[1]);
    final int work = Integer.parseInt(args[2]);
    DATAGRAM_SOCKET = new DatagramSocket(port);
    for (int i = 0; i < threads; i++) {
        new Thread(new Handler(work)).start();
    }
}
DatagramSocket Handler
private static final class Handler implements Runnable {
    private final int work;
    public Handler(int work) throws SocketException {
        this.work = work;
    }
    public void run() {
        try {
            while (!DATAGRAM_SOCKET.isClosed()) {
                final DatagramPacket receivePacket = new DatagramPacket(new byte[1024], 1024);
                DATAGRAM_SOCKET.receive(receivePacket); // blocks until a datagram arrives
                final InetAddress ip = receivePacket.getAddress();
                final int port = receivePacket.getPort();
                // (the simulated work that uses the 'work' parameter is not shown in the post)
                final byte[] sendData = "Hey there".getBytes();
                final DatagramPacket sendPacket = new DatagramPacket(sendData, sendData.length, ip, port);
                DATAGRAM_SOCKET.send(sendPacket);
            }
        } catch (Exception e) {
            System.out.println("ERROR: " + e.getMessage());
        }
    }
}
Netty Main
public static void main(String[] args) throws Exception {
    final int port = Integer.parseInt(args[0]);
    final int sleep = Integer.parseInt(args[1]);
    final Bootstrap bootstrap = new Bootstrap();
    bootstrap.group(new NioEventLoopGroup());
    // (the channel type and options are not shown in the post)
    bootstrap.handler(new MyNettyUdpHandler(sleep));
    // (the bind call is not shown in the post)
}
Netty Handler (threaded)
public class MyNettyUdpHandler extends MessageToMessageDecoder<DatagramPacket> {
    private final Random random = new Random(System.currentTimeMillis());
    private final int sleep;
    public MyNettyUdpHandler(int sleep) {
        this.sleep = sleep;
    }
    protected void decode(ChannelHandlerContext channelHandlerContext, DatagramPacket datagramPacket, List list) throws Exception {
        new Thread(() -> {
            try {
                Thread.sleep(random.nextInt(sleep));
            } catch (InterruptedException e) {
                System.out.println("ERROR while sleeping");
            }
            final ByteBuf buffer = Unpooled.buffer(64);
            buffer.writeBytes("Hey there".getBytes());
            channelHandlerContext.channel().writeAndFlush(new DatagramPacket(buffer, datagramPacket.sender()));
        }).start();
    }
}
The non threaded Netty Handler is the same but without the thread.

You can change your Netty decode() method like so to make it equivalent to the DatagramSocket code:
protected void decode(ChannelHandlerContext channelHandlerContext, DatagramPacket datagramPacket, List list) throws Exception {
    final Channel channel = channelHandlerContext.channel();
    channel.eventLoop().schedule(() -> {
        final ByteBuf buffer = Unpooled.buffer(64);
        buffer.writeBytes("Hey there".getBytes());
        channel.writeAndFlush(new DatagramPacket(buffer, datagramPacket.sender()));
    }, random.nextInt(sleep), TimeUnit.MILLISECONDS);
}
But I'm guessing the sleep() code is simulating business code you will later execute.
If that is the case make sure you don't run blocking code inside the handler.
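To see why scheduling differs from sleeping, here is a minimal JDK-only sketch (not Netty code). A single-threaded ScheduledExecutorService stands in for the event loop, and the class and method names are made up for illustration. Twenty replies, each delayed by 100 ms, should finish in roughly 100 ms on one thread because schedule() never blocks it; with Thread.sleep() on that same thread they would take about 2 seconds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only analogue of an event loop: one thread, delays expressed as scheduled tasks.
class DelayedReplyDemo {

    /** Schedules count replies, each due after delayMs, on ONE thread; returns elapsed ms. */
    static long runScheduled(int count, long delayMs, AtomicInteger replies) {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(count);
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // schedule() returns immediately; the loop thread stays free until the delay expires
            loop.schedule(() -> {
                replies.incrementAndGet(); // stands in for writeAndFlush(...)
                done.countDown();
            }, delayMs, TimeUnit.MILLISECONDS);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            loop.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        AtomicInteger replies = new AtomicInteger();
        long elapsed = runScheduled(20, 100, replies);
        System.out.println(replies.get() + " replies in " + elapsed + " ms");
    }
}
```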
To answer your question below:
You got a bit confused with the channels. You create a pipeline in the bootstrap and bind to a port. The returned channel is the server channel. The channel in the handler's method (your decode method in this case) is like the socket you get when you accept() in traditional socket programming. Note the port you extracted from the incoming DatagramPacket: it's roughly the same idea. So you send data back to the client on this channel.
The code I wrote that schedules the response simply does the same as your DatagramSocket code and the threaded Netty code you wrote.
I wasn't sure why you did that, and simply assumed you have a business requirement to delay the response.
If this isn't the case, you can remove the schedule call, and your code will run much faster.
If your business logic is non-blocking, and runs in a few millis, you're done. If it's blocking, you need to try to find a non-blocking alternative, or run it in an executor, i.e. not on the event loop.
Hope this helps, even though this wasn't part of your original question. Netty is awesome, and I hate seeing bad examples and bad vibes about it so it's worth my time I guess ;)

Creating a thread in every decode() call is inefficient.
You can submit the task to channel.eventLoop(), as Eran said, if the task is simple and won't block. (In fact, decode() in a MessageToMessageDecoder is already executed by the channel's EventLoop, so you need not submit it manually unless you want to schedule it.)
Or you can submit the task to a ThreadPoolExecutor or an EventExecutorGroup.
The latter is better because you can add listeners to the Future returned by EventExecutorGroup.submit(), so you don't have to wait for the task to complete.
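As a rough illustration of the "submit and attach a listener" idea, here is a JDK-only sketch. It uses CompletableFuture and a fixed thread pool instead of Netty's EventExecutorGroup and Future, and OffloadDemo, slowLookup and handle are hypothetical names. The caller gets control back at once, and the reply callback fires when the slow work finishes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

class OffloadDemo {
    // One small pool reused for every request, instead of a new Thread per decode() call.
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    /** Hypothetical stand-in for slow, blocking business logic. */
    static String slowLookup(String request) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Hey there, " + request;
    }

    /** Returns immediately; onReply is invoked once the result is ready. */
    CompletableFuture<Void> handle(String request, Consumer<String> onReply) {
        return CompletableFuture
                .supplyAsync(() -> slowLookup(request), workers) // runs on the pool
                .thenAccept(onReply);                            // the "listener"
    }

    void shutdown() {
        workers.shutdown();
    }

    public static void main(String[] args) {
        OffloadDemo demo = new OffloadDemo();
        CompletableFuture<Void> pending = demo.handle("client-1", System.out::println);
        System.out.println("handler returned without waiting");
        pending.join(); // only so the demo does not exit before the reply is printed
        demo.shutdown();
    }
}
```

In a Netty handler the callback would be the place to call writeAndFlush(); the handler method itself returns right after submitting.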
My English is poor; I hope this helps.
You can write it as follows, executing just the simple logic in the EventLoop (i.e. the I/O thread):
protected void decode(ChannelHandlerContext channelHandlerContext, DatagramPacket datagramPacket, List list) throws Exception {
    // do something simple with datagramPacket
    final ByteBuf buffer = Unpooled.buffer(64);
    buffer.writeBytes("Hey there".getBytes());
    channelHandlerContext.channel().writeAndFlush(new DatagramPacket(buffer, datagramPacket.sender()));
}


Event driven and asynchronous serial port communication simultaneously

I'm completely new to serial port communication and need some help grasping it.
I need to communicate with a control board. This board can sometimes send events that I need to react to, and I need to send events to the board and await a response.
We have established a protocol where each event is always 12 bytes and the first 2 bytes determine the event type.
I know that when I send a specific message, I need to await a message with specific signifying bytes. At the same time I want it to be possible to react to events that are sent from the board. For instance the board might say that it is overheating, and at the same time I'm asking it to perform some command and reply.
My question is: if I write to the port and block for a second while awaiting the expected response, how do I ensure I don't "steal" the data my listener expects? In other words, does a serial port work like a stream, where once I've read some data I've advanced past the point where it can be re-read?
I've done some implementation of this using jSerialComm, hopefully this can shed some light on my question.
First a listener that is registered using the addDataListener method. I want this to trigger when an event is present on the port that starts with "T".
private static LockerSerialPort getLockerSerialPort(final DeviceClient client) {
return MySerialPort.create(COM_PORT)
private static EventHandler createLocalEventHandler() {
    return new EventHandler() {
        public void execute(final byte[] event) {
            System.out.println(new String(event));
        }
        public byte[] getEventIdentifier() {
            // I want this listener to be executed when events that start with T are sent to the port
            return "T".getBytes();
        }
        public String getName() {
            return "T handler";
        }
    };
}
Next, I want to be able to write to the port and immediately get the response because it is needed to know if the command was successful or not.
private byte[] waitForResponse(final byte[] bytes) throws LockerException {
    write(bytes);
    return blockingRead();
}

private void write(final byte[] bytes) throws LockerException {
    try (var out = serialPort.getOutputStream()) {
        out.write(bytes);
    } catch (final IOException e) {
        throw Exception.from(e, "Failed to write to serial port %s", getComPort());
    }
}

public byte[] blockingRead() {
    return blockingRead(DEFAULT_READ_TIMEOUT);
}

private byte[] blockingRead(final int readTimeout) {
    serialPort.setComPortTimeouts(SerialPort.TIMEOUT_READ_SEMI_BLOCKING, readTimeout, 0);
    try {
        byte[] readBuffer = new byte[PACKET_SIZE];
        final int bytesRead = serialPort.readBytes(readBuffer, readBuffer.length);
        if (bytesRead != PACKET_SIZE) {
            throw RuntimeException.from(null, "Expected %d bytes in packet, got %d", PACKET_SIZE, bytesRead);
        }
        return readBuffer;
    } catch (final Exception e) {
        throw RuntimeException.from(e, "Failed to read packet within specified time (%d ms)", readTimeout);
    }
}
When I call waitForResponse("command"), how do I know my blocking read doesn't steal data from my listener?
Are these two patterns incompatible? How would one usually handle a scenario like this?
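A serial port does typically behave like a stream: bytes returned by one read are consumed and are not delivered again, so a blocking read and a listener on the same port compete for the same data. One common way around this is to let a single reader own the port, cut the byte stream into 12-byte packets, and route each packet either to a pending request or to a registered event handler. The sketch below shows only that routing logic under those assumptions. It does not use jSerialComm, and FrameDispatcher, onEvent, expect and onBytes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

class FrameDispatcher {
    static final int PACKET_SIZE = 12;

    private final byte[] frame = new byte[PACKET_SIZE];
    private int filled = 0;
    private final List<byte[]> eventPrefixes = new ArrayList<>();
    private final List<Consumer<byte[]>> eventHandlers = new ArrayList<>();
    private byte[] awaitedPrefix;
    private CompletableFuture<byte[]> awaited;

    /** Registers a handler for unsolicited packets that start with prefix. */
    synchronized void onEvent(byte[] prefix, Consumer<byte[]> handler) {
        eventPrefixes.add(prefix.clone());
        eventHandlers.add(handler);
    }

    /** Call before writing a command; completes with the first packet that starts with prefix. */
    synchronized CompletableFuture<byte[]> expect(byte[] prefix) {
        awaitedPrefix = prefix.clone();
        awaited = new CompletableFuture<>();
        return awaited;
    }

    /** Called only by the single reader, with whatever bytes just arrived. */
    synchronized void onBytes(byte[] chunk) {
        for (byte b : chunk) {
            frame[filled++] = b;
            if (filled == PACKET_SIZE) { // a full packet has been assembled
                filled = 0;
                dispatch(frame.clone());
            }
        }
    }

    private void dispatch(byte[] packet) {
        if (awaited != null && startsWith(packet, awaitedPrefix)) {
            CompletableFuture<byte[]> pending = awaited;
            awaited = null;
            pending.complete(packet); // wakes up whoever is waiting for the response
            return;
        }
        for (int i = 0; i < eventPrefixes.size(); i++) {
            if (startsWith(packet, eventPrefixes.get(i))) {
                eventHandlers.get(i).accept(packet);
            }
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (prefix.length > data.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With something like this, waitForResponse would call expect(...) first, then write the command, then wait on the returned future with a timeout, while the port's data listener simply forwards every received chunk to onBytes(...). The blocking read on the port itself goes away.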

Netty Nio read the upcoming messages from ChannelFuture in Java

I am trying to use the following code, which is an implementation of web sockets in Netty NIO. I have implemented a JavaFX GUI, and from the GUI I want to read the messages that are received from the server or from other clients. The NettyClient code looks like this:
public static ChannelFuture callBack () throws Exception{
String host = "localhost";
int port = 8080;
try {
Bootstrap b = new Bootstrap(); // (group and channel configuration not shown in the post)
b.option(ChannelOption.SO_KEEPALIVE, true);
b.handler(new ChannelInitializer<SocketChannel>() {
public void initChannel(SocketChannel ch) throws Exception {
ch.pipeline().addLast(new RequestDataEncoder(), new ResponseDataDecoder(),
new ClientHandler(i -> {
synchronized (lock) {
connectedClients = i;
ChannelFuture f = b.connect(host, port).sync();
return f;
finally {
public static void main(String[] args) throws Exception {
ChannelFuture ret;
ClientHandler obj = new ClientHandler(i -> {
synchronized (lock) {
connectedClients = i;
ret = callBack();
int connected = connectedClients;
if (connected != 2) {
System.out.println("The number if the connected clients is not two before locking");
synchronized (lock) {
while (true) {
connected = connectedClients;
if (connected == 2)
System.out.println("The number if the connected clients is not two");
System.out.println("The number if the connected clients is two: " + connected );; // can I use that from other parts of the code in order to read the incoming messages?
How can I use the ChannelFuture returned from callBack in other parts of my code to read the incoming messages? Do I need to call callBack again, or how can I receive the channel's updated messages? Could I possibly use something like this from my code (inside a button event), so as to take the last message?
Reading that code, NettyClient is used to create the connection (with ClientHandler). Once the connection is established, Netty calls ClientHandler.channelActive; if you want to send data to the server, put that code there. When the connection receives a message from the server, Netty calls ClientHandler.channelRead; put your message-handling code there.
You also need to read the documentation to understand how Netty encoders and decoders work.
How can I use the returned channelFuture from the callBack from other parts of my code in order to read the incoming messages?
Share the ClientHandler instances created by NettyClient (line 29).
Do I need to call again callBack, or how can I received the updated message of the channel?
When a server message arrives, ClientHandler.channelRead is called.
Could I possible use from my code (inside a button event) something like (so as to take the last message)?
Yes, you could, but that is not the Netty way. With Netty you write callbacks (for when a message arrives, when a message has been sent, and so on) and wait for Netty to call your code. In other words, Netty is the driver, not you.
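To make the callback idea concrete, here is a small JDK-only sketch. MessageHub is a hypothetical helper and not part of Netty or of the code above: the network handler would call publish() from channelRead(), and the GUI either registers a listener or polls the last message from a button handler.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical hub shared between the network handler and the GUI. */
class MessageHub {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    private final AtomicReference<String> last = new AtomicReference<>();

    /** The GUI registers interest once, instead of asking the channel for data. */
    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Called by the network side, e.g. from a handler's channelRead(). */
    void publish(String message) {
        last.set(message);
        for (Consumer<String> listener : listeners) {
            listener.accept(message);
        }
    }

    /** Polling alternative, e.g. inside a button event; null until a message arrives. */
    String lastMessage() {
        return last.get();
    }

    public static void main(String[] args) {
        MessageHub hub = new MessageHub();
        hub.addListener(m -> System.out.println("received: " + m));
        hub.publish("hello"); // in the real client this call would sit in channelRead()
        System.out.println("last: " + hub.lastMessage());
    }
}
```

Since listeners run on the thread that calls publish() (Netty's I/O thread in the real client), a JavaFX listener would normally wrap its UI updates in Platform.runLater(...).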
Lastly, do you really need such a heavy library for networking? If not, try this code; it is simple and easy to understand.

Netty 4.0.23 multiple hosts single client

My question is about creating multiple TCP clients to multiple hosts using the same event loop group in Netty 4.0.23 Final. I must admit that I don't quite understand Netty 4's client threading model, especially with the many confusing references to Netty 3.x implementations I came across while researching on the internet.
With the following code, I establish a connection with a single server and send random commands using a command queue:
public class TCPsocket {
private static final CircularFifoQueue CommandQueue = new CircularFifoQueue(20);
private final EventLoopGroup workerGroup;
private final TcpClientInitializer tcpHandlerInit; // all handlers sharable
public TCPsocket() {
workerGroup = new NioEventLoopGroup();
tcpHandlerInit = new TcpClientInitializer();
public void connect(String host, int port) throws InterruptedException {
try {
Bootstrap b = new Bootstrap(); // (group, channel and handler configuration not shown in the post)
b.remoteAddress(host, port);
Channel ch = b.connect().sync().channel();
ChannelFuture writeCommand = null;
for (;;) {
if (!CommandQueue.isEmpty()) {
writeCommand = ch.writeAndFlush(CommandExecute()); // commandExecute() fetches a command form the commandQueue and encodes it into a byte array
if (CommandQueue.isFull()) { // this will never happen ... or should never happen
if (writeCommand != null) {
} finally {
public static void main(String args[]) throws InterruptedException {
TCPsocket socket = new TCPsocket();
socket.connect("", 2101);
In addition to executing commands from the command queue, this client keeps receiving periodic responses from the server, in response to an initial command that is sent as soon as the channel becomes active. In one of the registered handlers (in the TCPClientInitializer implementation), I have:
public void channelActive(ChannelHandlerContext ctx) {
System.out.println("sent first message\n");
which activates a feature in the connected server, triggering a periodic packet that the server sends throughout the life span of my application.
The problem comes when I try to use this same setup to connect to multiple servers by looping through a string array of known server IPs:
public static void main(String args[]) throws InterruptedException {
String[] hosts = new String[]{"", "", ""};
TCPsocket socket = new TCPsocket();
for (String host : hosts) {
socket.connect(host, 2101);
Once the first connection is established and the server starts sending the designated periodic packets, no other connection is attempted. I think this is the result of the main thread waiting for the connection to die, and hence never running the second iteration of the for loop. The discussion in this question leads me to think that the connection process is started in a separate thread, allowing the main thread to continue executing, but that's not what I see here. So what is actually happening? And how would I go about implementing connections to multiple hosts using the same client in Netty 4.0.23 Final?
Thanks in advance

Netty slower than Tomcat

We just finished building a server to store data to disk and fronted it with Netty. During load testing we were seeing Netty scaling to about 8,000 messages per second. Given our systems, this looked really low. For a benchmark, we wrote a Tomcat front-end and run the same load tests. With these tests we were getting roughly 25,000 messages per second.
Here are the specs for our load testing machine:
Macbook Pro Quad core
16GB of RAM
Java 1.6
Here is the load test setup for Netty:
10 threads
100,000 messages per thread
Netty server code (pretty standard) - our Netty pipeline on the server is two handlers: a FrameDecoder and a SimpleChannelHandler that handles the request and response.
Client side JIO using Commons Pool to pool and reuse connections (the pool was sized the same as the # of threads)
Here is the load test setup for Tomcat:
10 threads
100,000 messages per thread
Tomcat 7.0.16 with default configuration using a Servlet to call the server code
Client side using URLConnection without any pooling
My main question is: why such a huge difference in performance? Is there something obvious with respect to Netty that can get it to run faster than Tomcat?
Edit: Here is the main Netty server code:
NioServerSocketChannelFactory factory = new NioServerSocketChannelFactory();
ServerBootstrap server = new ServerBootstrap(factory);
server.setPipelineFactory(new ChannelPipelineFactory() {
    public ChannelPipeline getPipeline() {
        RequestDecoder decoder = injector.getInstance(RequestDecoder.class);
        ContentStoreChannelHandler handler = injector.getInstance(ContentStoreChannelHandler.class);
        return Channels.pipeline(decoder, handler);
    }
});
server.setOption("child.tcpNoDelay", true);
server.setOption("child.keepAlive", true);
Channel channel = server.bind(new InetSocketAddress(port));
Our handlers look like this:
public class RequestDecoder extends FrameDecoder {
    protected ChannelBuffer decode(ChannelHandlerContext ctx, Channel channel, ChannelBuffer buffer) {
        if (buffer.readableBytes() < 4) {
            return null;
        }
        int length = buffer.readInt();
        if (buffer.readableBytes() < length) {
            return null;
        }
        return buffer;
    }
}
public class ContentStoreChannelHandler extends SimpleChannelHandler {
    private final RequestHandler handler;

    public ContentStoreChannelHandler(RequestHandler handler) {
        this.handler = handler;
    }

    public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) {
        ChannelBuffer in = (ChannelBuffer) e.getMessage();
        ChannelBuffer out = ChannelBuffers.dynamicBuffer(512);
        out.writerIndex(8); // Skip the length and status code
        boolean success = handler.handle(new ChannelBufferInputStream(in), new ChannelBufferOutputStream(out), new NettyErrorStream(out));
        if (success) {
            out.setInt(0, out.writerIndex() - 8); // length
            out.setInt(4, 0); // Status
        }
        Channels.write(e.getChannel(), out, e.getRemoteAddress());
    }

    public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e) {
        Throwable throwable = e.getCause();
        ChannelBuffer out = ChannelBuffers.dynamicBuffer(8);
        out.writeInt(0); // Length
        out.writeInt(Errors.generalException.getCode()); // status
        Channels.write(ctx, e.getFuture(), out);
    }

    public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent e) {
        // (body not shown in the post)
    }
}
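For a length-prefixed decoder of this kind, the step that matters most is restoring the read position when the body has not fully arrived, so that the length field is read again on the next attempt. The sketch below shows that step with the JDK's ByteBuffer rather than Netty's ChannelBuffer. LengthPrefixedFrames and tryDecode are hypothetical names, and the 4-byte big-endian length matches the readInt() call above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class LengthPrefixedFrames {

    /**
     * Tries to cut one frame (4-byte length followed by the body) from buf,
     * which must be in read mode. Returns the body, or null if more bytes are
     * needed; in that case the buffer position is left where it was.
     */
    static byte[] tryDecode(ByteBuffer buf) {
        if (buf.remaining() < 4) {
            return null; // length field not complete yet
        }
        buf.mark();
        int length = buf.getInt();
        if (length < 0) {
            throw new IllegalArgumentException("negative frame length: " + length);
        }
        if (buf.remaining() < length) {
            buf.reset(); // rewind over the length field so it is re-read next time
            return null;
        }
        byte[] body = new byte[length];
        buf.get(body); // consume exactly one frame, leaving any following frame in place
        return body;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.putInt(3).put("abc".getBytes(StandardCharsets.US_ASCII));
        buf.flip();
        System.out.println(new String(tryDecode(buf), StandardCharsets.US_ASCII));
    }
}
```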
I've managed to get my Netty solution to within 4,000/second. A few weeks back I was testing a client side PING in my connection pool as a safe guard against idle sockets but I forgot to remove that code before I started load testing. This code effectively PINGed the server every time a Socket was checked out from the pool (using Commons Pool). I commented that code out and I'm now getting 21,000/second with Netty and 25,000/second with Tomcat.
Although, this is great news on the Netty side, I'm still getting 4,000/second less with Netty than Tomcat. I can post my client side (which I thought I had ruled out but apparently not) if anyone is interested in seeing that.
The method messageReceived is executed using a worker thread that is possibly getting blocked by RequestHandler#handle which may be busy doing some I/O work.
You could try adding an OrderedMemoryAwareThreadPoolExecutor (wrapped in an ExecutionHandler) to the channel pipeline (recommended) for executing the handlers, or alternatively try dispatching your handler work to a new ThreadPoolExecutor and passing a reference to the socket channel for later writing the response back to the client. Ex.:
public void messageReceived(ChannelHandlerContext ctx, final MessageEvent e) {
    executor.submit(new Runnable() {
        public void run() {
            processHandlerAndRespond(e);
        }
    });
}

private void processHandlerAndRespond(MessageEvent e) {
    ChannelBuffer in = (ChannelBuffer) e.getMessage();
    ChannelBuffer out = ChannelBuffers.dynamicBuffer(512);
    out.writerIndex(8); // Skip the length and status code
    boolean success = handler.handle(new ChannelBufferInputStream(in), new ChannelBufferOutputStream(out), new NettyErrorStream(out));
    if (success) {
        out.setInt(0, out.writerIndex() - 8); // length
        out.setInt(4, 0); // Status
    }
    Channels.write(e.getChannel(), out, e.getRemoteAddress());
}
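The point of an ordered executor is that events belonging to one channel are still handled in order, while different channels proceed in parallel. Here is a JDK-only sketch of just that ordering idea. It is not Netty's OrderedMemoryAwareThreadPoolExecutor and has none of its memory limiting; KeyOrderedExecutor is a hypothetical name, and entries are never removed from the map, which a real implementation would have to handle.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class KeyOrderedExecutor {
    private final ExecutorService pool;
    // Last submitted task per key; each new task is chained behind it.
    private final ConcurrentHashMap<Object, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    KeyOrderedExecutor(ExecutorService pool) {
        this.pool = pool;
    }

    /** Tasks with the same key run one after another in submission order; other keys may run in parallel. */
    CompletableFuture<Void> submit(Object key, Runnable task) {
        return tails.compute(key, (k, tail) ->
                (tail == null ? CompletableFuture.<Void>completedFuture(null) : tail)
                        .handle((result, error) -> null) // keep the chain going even if a task failed
                        .thenRunAsync(task, pool));
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        KeyOrderedExecutor ordered = new KeyOrderedExecutor(pool);
        CompletableFuture<Void> last = null;
        for (int i = 0; i < 5; i++) {
            final int n = i;
            // "channel-1" plays the role of the channel the events belong to
            last = ordered.submit("channel-1", () -> System.out.println("event " + n));
        }
        last.join();
        pool.shutdown();
    }
}
```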

How can I implement a threaded UDP based server in Java?

How can I implement a threaded UDP based server in Java ?
Basically, what I want is to connect multiple clients to the server and let each client have its own thread. The only problem is that I don't know how to check whether a client is trying to connect to the server and spawn a new thread for it.
boolean listening = true;
System.out.println("Server started.");
while (listening)
new ServerThread().start();
In this case the server will spawn new threads until it runs out of memory.
Here's the code for the ServerThread (I think I need a mechanism here that stalls the creation of the ServerThread until a client tries to connect):
public ServerThread(String name) throws IOException
socket = new DatagramSocket();
So fathers of Java programming please help.
The design for this to a certain extent depends on whether each complete UDP "dialog" just requires a single request and immediate response, whether it's a single request or response with retransmissions, or whether there'll be a need to process lots of packets for each client.
The RADIUS server I wrote had the single request + retransmit model and spawned a thread for each incoming packet.
As each DatagramPacket was received it was passed to a new thread, and then that thread was responsible for sending back the response. This was because the computation and database accesses involved in generating each response could take a relatively long time and it's easier to spawn a thread than to have some other mechanism to handle new packets that arrive whilst old packets are still being processed.
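public class Server implements Runnable {
    public void run() {
        while (true) {
            try {
                DatagramPacket packet = new DatagramPacket(new byte[1024], 1024); // buffer size is an assumption
                socket.receive(packet); // receive() fills the packet passed to it
                new Thread(new Responder(socket, packet)).start();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
public class Responder implements Runnable {
    DatagramSocket socket = null;
    DatagramPacket packet = null;
    public Responder(DatagramSocket socket, DatagramPacket packet) {
        this.socket = socket;
        this.packet = packet;
    }
    public void run() {
        byte[] data = makeResponse(); // code not shown
        DatagramPacket response = new DatagramPacket(data, data.length, packet.getAddress(), packet.getPort());
        try {
            socket.send(response);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}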
Since UDP is a connectionless protocol, why do you need to spawn a new thread for each connection? When you receive a UDP packet maybe you should spawn a new thread to take care of dealing with the message received.
UDP connections are not like TCP connections. They do not remain active and such is the design of UDP.
The handlePacket() method of this next code block can do whatever it wants with the data received. And many clients can send multiple packets to the same UDP listener. Maybe it will help you.
public void run() {
    DatagramSocket wSocket = null;
    DatagramPacket wPacket = null;
    byte[] wBuffer = null;
    try {
        wSocket = new DatagramSocket( listenPort );
        wBuffer = new byte[ 2048 ];
        wPacket = new DatagramPacket( wBuffer, wBuffer.length );
    } catch ( SocketException e ) {
        log.fatal( "Could not open the socket: \n" + e.getMessage() );
        System.exit( 1 );
    }
    while ( isRunning ) {
        try {
            wSocket.receive( wPacket );
            handlePacket( wPacket, wBuffer );
        } catch ( Exception e ) {
            log.error( e.getMessage() );
        }
    }
}
Have you looked at the Apache MINA project? I believe one of its examples even takes you through how to set up a UDP-based server with it. If this is for a real product, I would not recommend trying to come up with your own implementation from scratch. You will want to use a library for this so that you are not using one thread per connection, but rather a thread pool.
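For comparison, here is what the thread-pool approach can look like with nothing but the JDK. This is a minimal sketch, not MINA code: one receive loop hands each packet to a fixed pool instead of starting a thread per packet. PooledUdpServer, the pool size of 4, the 1024-byte buffer and the "echo: " reply are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledUdpServer implements Runnable {
    private final DatagramSocket socket;
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    PooledUdpServer(int port) {
        try {
            // Bound to loopback for the demo; port 0 lets the OS pick a free port.
            socket = new DatagramSocket(port, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new IllegalStateException("could not bind", e);
        }
    }

    int port() {
        return socket.getLocalPort();
    }

    @Override
    public void run() {
        while (!socket.isClosed()) {
            try {
                // A fresh buffer per packet, because the packet is handed to another thread.
                DatagramPacket request = new DatagramPacket(new byte[1024], 1024);
                socket.receive(request);
                pool.execute(() -> respond(request));
            } catch (IOException e) {
                if (!socket.isClosed()) {
                    System.err.println("receive failed: " + e.getMessage());
                }
            }
        }
    }

    private void respond(DatagramPacket request) {
        String text = new String(request.getData(), 0, request.getLength(), StandardCharsets.UTF_8);
        byte[] data = ("echo: " + text).getBytes(StandardCharsets.UTF_8);
        try {
            socket.send(new DatagramPacket(data, data.length, request.getAddress(), request.getPort()));
        } catch (IOException e) {
            System.err.println("send failed: " + e.getMessage());
        }
    }

    void close() {
        socket.close(); // makes the blocked receive() fail, which ends the loop
        pool.shutdown();
    }

    /** Tiny client used only to try the server out. */
    static String roundTrip(int port, String message) {
        try (DatagramSocket client = new DatagramSocket()) {
            client.setSoTimeout(2000);
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length, InetAddress.getLoopbackAddress(), port));
            DatagramPacket in = new DatagramPacket(new byte[1024], 1024);
            client.receive(in);
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        PooledUdpServer server = new PooledUdpServer(0);
        new Thread(server).start();
        System.out.println(roundTrip(server.port(), "ping"));
        server.close();
    }
}
```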
I don't really see the need.
It's a school thing, right?
If you need to keep track of the clients, you should have a local representation of each client (a Client object on your server). It can take care of whatever client-specific things you need to do.
In that case you need to be able to find out which client a message was sent from (using information from the message). You can keep the clients in a map.
The most effective way is probably to do all handling in the main thread, unless whatever needs to be done can "block" waiting for external events (or if some of the things that are supposed to happen might take a long time and some a very short time).
public class Client {
public void handleMessage(Message m) {
// do stuff here.
The client object can perhaps start a new thread in handleMessage() if necessary.
You shouldn't start multiple server threads.
The server thread can do:
while (running) {
    socket.receive(p); // p is a DatagramPacket
    client = figureOutClient(p);
}
If there are no client-specific things to care about, just read the messages and handle them as they arrive, in one thread.
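A minimal sketch of the "clients in a map" idea, assuming the sender's socket address (what DatagramPacket.getSocketAddress() returns) is enough to identify a client. ClientRegistry and dispatch are hypothetical names, and the Client here only counts messages.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ClientRegistry {

    /** Hypothetical per-client state; meant to be touched by the single receive thread only. */
    static final class Client {
        private int messages;

        void handleMessage(String message) {
            messages++; // client-specific work would go here
        }

        int messageCount() {
            return messages;
        }
    }

    private final Map<SocketAddress, Client> clients = new ConcurrentHashMap<>();

    /** Finds (or creates) the Client for the sender and passes the message on. */
    Client dispatch(SocketAddress sender, String message) {
        Client client = clients.computeIfAbsent(sender, address -> new Client());
        client.handleMessage(message);
        return client;
    }

    int clientCount() {
        return clients.size();
    }

    public static void main(String[] args) {
        ClientRegistry registry = new ClientRegistry();
        SocketAddress a = new InetSocketAddress("127.0.0.1", 5000);
        SocketAddress b = new InetSocketAddress("127.0.0.1", 5001);
        registry.dispatch(a, "hello");
        registry.dispatch(b, "hi");
        registry.dispatch(a, "again");
        System.out.println(registry.clientCount() + " clients");
    }
}
```

In the receive loop above, figureOutClient(p) would then amount to a lookup by p.getSocketAddress().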
