What are the best practices for unit testing networked asynchronous code? I am trying to do/learn TDD.
I am currently planning this part of a library, but in principle:
This is an SSH client library, and I want it to be asynchronous. The SSH connection process is actually quite complex. My connect method would set some connection state variables atomically, then use some kind of task executor to schedule a connection task. The connection requires connecting to the server, introducing itself by exchanging SSH protocol versions and so on, and then completing a key exchange process, which itself splits into several cases, because there are several key exchange algorithms, each requiring different packets to be exchanged.
I have heard that I should test the public API, and test private methods through the public methods that use them, but in this case that seems difficult: the task is quite complex, and it is probably easier to fake only parts of the negotiation rather than the whole connection/negotiation, just to check each possible result of the connect method, including the results of every key exchange algorithm.
Is that a good enough justification for splitting the larger connect task into smaller ones, even though they are not available to the user, and testing each connection stage separately instead of the whole connect method all at once? Does that break best practices somehow, or is there a different way to do it? For example, is it testing implementation details?
What are the best practices for unit testing networked asynchronous code? I am trying to do/learn TDD.
The reference you need to read is Growing Object-Oriented Software, Guided by Tests, by Freeman and Pryce. That text contains a long walkthrough of how to use tests to develop an asynchronous networked auction client.
The process, as described by the authors, front-loads a lot of the work: you get an initial end-to-end test up and running first, before beginning to fill in the other details.
It's not the only way to do it, of course.
I have heard that I should test the public API, and test private methods through the public methods that use them
Yes, and...
in this case that seems difficult: the task is quite complex, and it is probably easier to fake only parts of the negotiation rather than the whole connection/negotiation, just to check each possible result of the connect method, including the results of every key exchange algorithm.
What often happens is that a complex solution can be broken down into modules, each of which has its own "public API" -- see On the Criteria To Be Used in Decomposing Systems into Modules, by Parnas. You can then test the modules individually.
It will often turn out, for instance, that your code can be organized into two piles: an internal functional core, and an imperative shell that interacts with the boundary of your system.
As a rule, the functional core is much easier to test than the imperative shell, so strive for a shell that is "so simple that there are obviously no deficiencies."
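As a rough illustration of that split, here is a minimal sketch (all names are hypothetical, not taken from any real SSH library): the core is a pure function from received bytes to bytes-to-send, and the shell only moves bytes over the wire. Tests for the core then need no sockets at all.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    // Functional core: pure protocol logic, no I/O, trivially unit-testable.
    final class KexCore {
        /** Decides what to send next, given what was received. Hypothetical signature. */
        static byte[] nextMessage(byte[] received, String negotiatedAlgorithm) {
            // ... pure key-exchange state logic would go here ...
            return new byte[0];
        }
    }

    // Imperative shell: "so simple that there are obviously no deficiencies".
    final class KexShell {
        void step(InputStream in, OutputStream out) throws IOException {
            byte[] received = in.readNBytes(1024);   // just moves bytes
            out.write(KexCore.nextMessage(received, "curve25519-sha256"));
        }
    }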
so what is the definition of a public API?
Roughly: the affordances that are accessible outside of the scope of the implementation.
Put another way, they are the parts of the module that can't be changed without rewriting the code that calls the module.
In this case I would probably split the connection process into subtasks, like connection, SSH introduction, and key exchange, and test the individual subtasks in isolation. I would also test key exchange support in isolation from the specific key exchange algorithm implementations. The requirements for testing each of those parts are different, and only the first requires mocking a socket.
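A sketch of that decomposition might look like this (the interface names are invented for illustration): each stage is a separate unit, and the public connect() merely composes them, so only the transport-facing stage needs a mocked socket.

    import java.io.IOException;

    // Hypothetical stage interfaces; each can be implemented and tested separately.
    interface Transport {
        byte[] read() throws IOException;
        void write(byte[] data) throws IOException;
    }

    interface VersionExchange {
        String exchangeVersions(Transport transport) throws IOException;
    }

    interface KeyExchange {
        byte[] negotiateKeys(Transport transport, String algorithm) throws IOException;
    }

    // connect() just wires the stages together, so its own test can use fakes
    // for all three interfaces rather than a full fake SSH server.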
You might also want to look at Cory Benfield's talk Building Protocol Libraries the Right Way.
Not sure if that is okay or not.
The TDD police are not going to come kicking in your door if you don't do it "right". At worst, they'll write you a nasty note.
Related
I have the following situation:
I have two JVM processes (really two Java processes running separately, not two threads) running on a local machine. Let's call them ProcessA and ProcessB.
I want them to communicate (exchange data) with one another (e.g. ProcessA sends a message to ProcessB to do something).
Right now I work around this by writing to a temporary file, and these processes periodically scan the file for messages. I don't think this solution is very good.
What would be a better alternative to achieve what I want?
Multiple options for IPC:
Socket-Based (Bare-Bones) Networking
not necessarily hard, but:
might be verbose for not much,
might offer more surface for bugs, as you write more code.
you could rely on existing frameworks, like Netty
RMI
Technically, that's also network communication, but that's transparent for you.
Fully-fledged Message Passing Architectures
usually built on either RMI or network communications as well, but with support for complicated conversations and workflows
might be too heavy-weight for something simple
frameworks like ActiveMQ or JBoss Messaging
Java Management Extensions (JMX)
more meant for JVM management and monitoring, but could help implement what you want if you mostly want one process to query another for data, or to send it requests for actions, provided they aren't too complex
also works over RMI (amongst other possible protocols)
not so simple to wrap your head around at first, but actually rather simple to use
File-sharing / File-locking
that's what you're doing right now
it's doable, but comes with a lot of problems to handle
Signals
You can simply send signals to your other process
However, it's fairly limited and requires you to implement a translation layer (it is doable, though it's more of a crazy idea to toy with than anything serious).
Without more details, a bare-bones network-based IPC approach seems the best (see the sketch after this list), as it's the:
most extensible (in terms of adding new features and workflows to your programs)
most lightweight (in terms of memory footprint for your app)
most simple (in terms of design)
most educative (in terms of learning how to implement IPC; you mentioned "socket is hard" in a comment, but it really is not, and it is something worth working on)
That being said, based on your example (simply requesting the other process to do an action), JMX could also be good enough for you.
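For reference, a bare-bones socket version of "ProcessA asks ProcessB to do something" can be as small as this (a minimal sketch; the port number and the one-line text protocol are arbitrary choices for illustration):

    import java.io.*;
    import java.net.ServerSocket;
    import java.net.Socket;

    // ProcessB: listens and performs requested actions.
    class ProcessB {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9000)) {
                while (true) {
                    try (Socket client = server.accept();
                         BufferedReader in = new BufferedReader(
                                 new InputStreamReader(client.getInputStream()));
                         PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                        String request = in.readLine();   // one request per line
                        out.println("done: " + request);  // reply with an ack
                    }
                }
            }
        }
    }

    // ProcessA: connects and sends a request.
    class ProcessA {
        public static void main(String[] args) throws IOException {
            try (Socket socket = new Socket("localhost", 9000);
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println("doSomething");
                System.out.println(in.readLine());
            }
        }
    }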
I've added a library on GitHub called Mappedbus (http://github.com/caplogic/mappedbus) which enables two (or many more) Java processes/JVMs to communicate by exchanging messages. The library uses a memory-mapped file and makes use of fetch-and-add and volatile reads/writes to synchronize the different readers and writers. I've measured the throughput between two processes using this library at 40 million messages/s, with an average latency of 25 ns for reading/writing a single message.
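The underlying idea is a file that both processes map into their address space. A heavily simplified sketch of that mechanism follows (this is not Mappedbus's actual code; it deliberately omits the fetch-and-add and volatile-access machinery that makes the real library safe for concurrent readers and writers):

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Single writer bumps a counter in a shared mapped file; a single reader
    // in another JVM polls it. Real IPC needs proper memory ordering
    // (volatile/VarHandle access), which plain ByteBuffer puts do not give you.
    class SharedCounter {
        public static void main(String[] args) throws Exception {
            try (RandomAccessFile file = new RandomAccessFile("/tmp/ipc-demo", "rw");
                 FileChannel channel = file.getChannel()) {
                MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, 8);
                if (args.length > 0 && args[0].equals("writer")) {
                    for (int i = 0; ; i++) buf.putInt(0, i);          // publish values
                } else {
                    while (true) System.out.println(buf.getInt(0));   // poll values
                }
            }
        }
    }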
What you are looking for is inter-process communication. Java provides a simple IPC framework in the form of Java RMI API. There are several other mechanisms for inter-process communication such as pipes, sockets, message queues (these are all concepts, obviously, so there are frameworks that implement these).
I think in your case Java RMI or a simple custom socket implementation should suffice.
Sockets with DataInputStream/DataOutputStream (or ObjectInputStream/ObjectOutputStream, to send whole Java objects back and forth). This is easier than using a disk file, and much easier than Netty.
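Strictly speaking, DataInputStream/DataOutputStream carry primitives and strings; to send whole Java objects you wrap the socket streams in object streams instead. A minimal sketch of the sending side (the message class is invented for illustration):

    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.net.Socket;

    // Any message type just needs to be Serializable.
    record TaskRequest(String action, String payload) implements Serializable {}

    class ObjectSender {
        public static void main(String[] args) throws Exception {
            try (Socket socket = new Socket("localhost", 9000);
                 ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
                out.writeObject(new TaskRequest("doSomething", "hello"));
            }
            // The receiving process reads it back with:
            //   TaskRequest req = (TaskRequest) new ObjectInputStream(in).readObject();
        }
    }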
I tend to use JGroups to form local clusters between processes. It works for nodes (a.k.a. processes) on the same machine, within the same JVM, or even across different servers.
Once you understand the basics it is easy to work with, and having the option to actually run two or more processes in the same JVM makes it easy to test those processes.
The overhead and latency are minimal if both are on the same machine (usually just a TCP round trip of upwards of 100 ns per action).
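A minimal sketch, assuming the JGroups 4.x-style API (the cluster name and message content are arbitrary): each process creates a channel, joins the same cluster, and exchanges messages.

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.ReceiverAdapter;

    class Node {
        public static void main(String[] args) throws Exception {
            JChannel channel = new JChannel();          // default UDP protocol stack
            channel.setReceiver(new ReceiverAdapter() {
                @Override
                public void receive(Message msg) {
                    System.out.println("received: " + msg.getObject());
                }
            });
            channel.connect("demo-cluster");            // join (or form) the cluster
            channel.send(new Message(null, "hello from " + channel.getAddress()));
        }
    }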
A socket may be the better choice, I think.
Back in 2004 I implemented code which did the job with sockets. Since then, I have searched many times for a better solution, because the socket approach triggers firewalls and my clients worry. There is no better solution so far. The client must serialize your data and send it, and the server must receive and deserialize it.
It is easy.
What is the best way to simulate a network in Java?
I'm in the early stages of a networked peer-to-peer project, and to determine some of the required characteristics of the clients I'd like to be able to simulate 100+ instances concurrently on my PC.
Ideally I'd like to create a "simulation" version of the sockets, with their own input and output streams. Eventually, I'm going to use these streams for data transfer instead of just moving data around between Java objects, so what I want to simulate are the kinds of latency, data loss, and other errors you might get in an actual network.
Ideally these simulation methods would be very close to the actual stream standards of java.net.*, so I wouldn't need to do much of a rewrite in order to move from simulation to the actual client.
Can anyone point me in the right direction?
You can use Akka to create millions of actors on a single machine, and then organize communication between them similarly to a 'real' network.
Here's an example project:
https://github.com/adelbertc/scalanet
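For instance, with Akka's classic Java API, each simulated peer can be an actor, and the "network" between them is just message passing, where you can inject delay or loss before delivering (a rough sketch; the names are illustrative):

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;

    // One simulated peer; the receive handler stands in for the socket read loop.
    class Peer extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .match(String.class, msg -> {
                        // latency/loss simulation could go here before replying
                        getSender().tell("ack: " + msg, getSelf());
                    })
                    .build();
        }
    }

    class Simulation {
        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("simnet");
            ActorRef peer1 = system.actorOf(Props.create(Peer.class), "peer1");
            peer1.tell("hello", ActorRef.noSender());  // spawn 100+ peers the same way
        }
    }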
Well, you don't really need any special tools so much as a better design.
You need interfaces for the underlying communication framework.
All you need is to mock/substitute the real implementation with a dummy one once you have coded against the interfaces. This dummy implementation can introduce features like latency, dummy data, etc.
You could go with a Spring container. You could write some dummy server sockets in the container to simulate conversations between multiple instances, or better, use a web container to take that headache away from you.
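A sketch of that idea (the Transport interface is hypothetical): code against the interface everywhere, and swap in a simulated implementation for tests that injects latency and drops data.

    import java.io.IOException;
    import java.util.ArrayDeque;
    import java.util.Queue;
    import java.util.Random;

    interface Transport {
        void send(byte[] data) throws IOException;
        byte[] receive() throws IOException;
    }

    // Dummy implementation: in-memory queue plus configurable latency and loss.
    class SimulatedTransport implements Transport {
        private final Queue<byte[]> queue = new ArrayDeque<>();
        private final long latencyMillis;
        private final double lossRate;
        private final Random random = new Random();

        SimulatedTransport(long latencyMillis, double lossRate) {
            this.latencyMillis = latencyMillis;
            this.lossRate = lossRate;
        }

        @Override
        public void send(byte[] data) {
            if (random.nextDouble() >= lossRate) {   // sometimes "lose" the packet
                queue.add(data);
            }
        }

        @Override
        public byte[] receive() throws IOException {
            try {
                Thread.sleep(latencyMillis);         // simulate network delay
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException(e);
            }
            return queue.poll();                     // null if nothing arrived
        }
    }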
For simulation purposes you may want to check out OMNeT++. It's great for large-scale simulations, with built-in data analysis/statistics tools.
Code is written in a style similar to C++; see the tutorials, it's pretty straightforward.
[Figure: an example six-host network, taken from the OMNeT++ tutorial]
I am basically practicing Java socket programming by building a client and server (not necessarily an HTTP server). In brief, the clients send requests through sockets to the server, and the server adds the requests to a task queue. The thread pool initially has a certain number of threads, and each free one is assigned a runnable task from the task queue. My server also has simple storage that stores and retrieves data from a file on disk. In this project, I have to take care of several concurrency issues.
Basically, I have to build the client, server, thread pool, handler, and storage. However, I want to test thoroughly in a good, systematic way (unit tests, integration tests, etc.). I don't have much experience in testing, so I am looking for pointers, methodologies, frameworks, or tutorials. (I use Ant to automate the build, and am initially considering JUnit and EasyMock for testing.)
Before testing, I'd start by coding some rough-and-ready prototype code, just to see it working and to get a feel for the APIs I will be using.
Then introduce some unit tests with JUnit (there are other frameworks but JUnit is ubiquitous, and you'll find plenty of tutorials to get you started).
If your object needs to interact with some other objects to complete its tasks, then use mocks (EasyMock or whatever) to provide the interaction -- this will probably lead to a bit of refactoring.
Once you are happy, you can start to look at testing how your objects interact; you can write new (integration) tests that replace the mocks with the real thing. Greater interaction results in greater complexity.
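Since you mention JUnit and EasyMock: a test for, say, your handler with a mocked storage might look like this (Storage, RequestHandler, and Request are hypothetical names standing in for your own types):

    import static org.easymock.EasyMock.createMock;
    import static org.easymock.EasyMock.replay;
    import static org.easymock.EasyMock.verify;

    import org.junit.Test;

    public class RequestHandlerTest {

        @Test
        public void putRequestIsWrittenToStorage() {
            Storage storage = createMock(Storage.class);  // the collaborator, mocked
            storage.put("key", "value");                  // record the expected call
            replay(storage);

            new RequestHandler(storage).handle(new Request("PUT", "key", "value"));

            verify(storage);                              // fails if put() never happened
        }
    }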
Some things to remember
trivial methods aren't worth testing (e.g. simple accessors)
100% coverage is a waste of time
any test is better than none
Unit tests are easier to achieve than integration tests
Not all tests are functional
Testing multi-threaded applications is hard
There is a book on how Google does testing (How Google Tests Software, by Whittaker, Arbon, and Carollo). Basically, they don't write tests until something looks viable. They have engineers who advise on how to structure code for testing. The point is:
Runnable code is the goal
Tests add to that goal, but do not replace it
Writing code that can be tested is a learnt skill
I'm currently developing two Java networking applications for school projects, one over TCP and the other over UDP. In both I have to implement a simple custom protocol.
Even though I'm trying pretty hard, I can't find a way to properly test these kinds of apps, or better, to develop them test-first.
If I have a client and I want a real test without stubbing everything out, I have to implement a server with simulated behaviour, which in the case of simple apps like these is almost the whole project. I understand that for something big, writing a few lines of a Perl script to test it could really help.
Right now I'm developing the server and client simultaneously, so that I can at least test by hand, but this doesn't seem like a clean way to develop. The only thing that helps is tunneling the connection through a logger, so that I can see all the data that goes through (using the TunneliJ plugin for IDEA).
What is the best way to TDD a networking application with custom protocol? Should I just stub everything and be fine with it?
Separate the protocol from the network layer. Testing the logic of the protocol becomes much easier once you can feed it your own data without going through the network stack. Even though you are not using Python, I'd suggest looking at the structure of the Twisted framework. It's a nice example of how to unit-test networking applications.
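In Java the same idea looks like this: make the protocol a plain class that maps bytes in to bytes out, so tests can exercise it directly (a sketch with an invented, trivial protocol):

    import static org.junit.Assert.assertArrayEquals;

    import java.nio.charset.StandardCharsets;

    import org.junit.Test;

    // Protocol logic with no socket dependency: bytes in, bytes out.
    class EchoProtocol {
        byte[] onDataReceived(byte[] input) {
            String request = new String(input, StandardCharsets.US_ASCII);
            return ("ECHO " + request).getBytes(StandardCharsets.US_ASCII);
        }
    }

    public class EchoProtocolTest {
        @Test
        public void echoesWhatItReceives() {
            byte[] reply = new EchoProtocol().onDataReceived(
                    "hello".getBytes(StandardCharsets.US_ASCII));
            assertArrayEquals("ECHO hello".getBytes(StandardCharsets.US_ASCII), reply);
        }
    }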
We wound up with the same problem a while ago. We decided it was simpler to put two developers on the task: one to write the server and one to write the client. We started working in the same office so that we could code, test, modify, and repeat a little more easily.
All in all, I think it was the best solution for us. It gave us the ability to actually test the program in conditions that were not ideal. For instance, our Internet went out a couple of times and our program crashed, so we fixed it. It worked rather well for us, but if you are a sole developer, it may not be the solution for you.
Whatever you do, when writing a custom protocol, I would check out Wireshark for monitoring your network traffic to make sure all of the packets are correct.
In my app I have code such as this
    m_socket.receive(packet);
    doSomething(packet);
I mock up the receive and hence can exercise everything that doSomething() needs to do.
Where does this break down for you? Here you are truly unit-testing that your code behaves correctly. You can also mock the socket send, and set expectations for what you think should be sent according to your protocol.
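One way to set this up without a mocking framework is a hand-rolled fake behind a small interface (all names here are hypothetical): queue up the packets receive() should "deliver", run the code under test, then assert on what was sent.

    import java.io.IOException;
    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Queue;

    // Hypothetical seam around the real socket.
    interface PacketChannel {
        byte[] receive() throws IOException;
        void send(byte[] packet) throws IOException;
    }

    // Fake for tests: scripted input, recorded output.
    class FakePacketChannel implements PacketChannel {
        private final Queue<byte[]> incoming = new ArrayDeque<>();
        final List<byte[]> sent = new ArrayList<>();

        void willReceive(byte[] packet) { incoming.add(packet); }

        @Override public byte[] receive() { return incoming.poll(); }
        @Override public void send(byte[] packet) { sent.add(packet); }
    }

    // In a test: fake.willReceive(requestBytes); handler.run(fake);
    // then assert on fake.sent according to your protocol.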
We are of course not actually testing that the other end of the protocol is happy. That's integration testing. I always hanker after getting to IT as soon as possible. It's when you interact with the "other end" that you find the interesting stuff.
You are in the lucky position of being in control of both ends; in that position I would probably spend some time instrumenting both to create suitable, controllable test harnesses.
I'm asking for opinions about implementing a framework that emulates ADO.NET in Java (data tables, data sets, etc.). The idea is to avoid writing domain objects, since so far they are just used to transport data from server to client and back, with no particular business methods inside them. The main goal is to speed up development time.
Could I benefit from writing this kind of framework? If it has been done before, please provide a link.
Instead of emulating ADO.NET, I would use an ORM tool such as Hibernate. That way you have a library that handles all of your SQL and persistence needs on its own, and you don't have to worry about dealing with a pseudo-table-like structure.
I find working with strongly-typed domain objects to be far, far easier and quicker to develop (and test) than working with SQL, resultsets, etc.
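For comparison, the strongly-typed approach looks roughly like this with JPA/Hibernate annotations (a minimal sketch; the entity and its fields are invented):

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;

    @Entity
    public class Customer {
        @Id
        @GeneratedValue
        private Long id;

        private String name;

        protected Customer() {}                 // JPA requires a no-arg constructor
        public Customer(String name) { this.name = name; }

        public Long getId() { return id; }
        public String getName() { return name; }
    }

    // Persist and query with an EntityManager instead of hand-written SQL:
    //   em.persist(new Customer("Alice"));
    //   List<Customer> all = em.createQuery("from Customer", Customer.class).getResultList();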
Whether it's ADO.NET or anything else, I've learned that my first step should be to not write any libraries or frameworks. Instead, I get on with the job.
The second time I'm tempted to write a library or framework, I typically think about it for a while, decide not to write it, and get on with the job.
The third time I'm tempted, I'll usually give in to the temptation. By that time, I will have written a bunch of code that worked without the library or framework. I will refactor that code (that real, used, tested code). I'll do this iteratively, running my automated unit tests constantly and writing new tests as necessary. When I'm interrupted in this task (as will happen, since writing frameworks may not be part of my job), I can be certain that I'm leaving running code behind, even though it's only half finished.
When I get back to it, I'll continue refactoring up to a point, remembering that I should get some experience using what I've refactored before going too much further in the direction I think it should go. With real code using it, I won't have to guess which direction to carry the refactoring.