Sharing thread between processes

I suppose this is not possible, but I am looking for the best way to separate the different layers of my service while still being able to access them quickly, without the overhead of IPC/RMI.
The main programming language I am using is Java, but I can use C++ if required.
What we have right now is a server that hosts the database and access control, and consumers use RMI to request data. This is slow and doesn't scale very well.
We need performance and scalability, which we don't have at the moment.
What we are thinking of is a layered architecture with the database at the base and access control on top of it, along with a notification bus to notify clients of changes in the database.
The main problem is the communication overhead, which we want to avoid or minimize.
Is there any magic thread that can run in two contexts (switching between them) and share information that way? I know the short answer would be no, but what are the options?
Update
We are currently using Java RMI.
Our base layer will provide an API that can be used to create plugins that run on top of it, so we don't have a fixed set of collectors and consumers. We can have 5-6 collectors running and the same number of consumers.
We can have up to 1000 consumers.

My first suggestion is that you should buy a book (or find an online tutorial) on building scalable applications, because you seem to be pretty lost.
Sharing a thread between processes doesn't make sense at any level - it is meaningless, but you can share the data that the thread accesses, which is probably what you want.
The fastest method will be C-based IPC (e.g., shared memory, semaphores, etc.; see shmget). You say you want to avoid the overhead of IPC, but really, it isn't going to get any faster than that.
But why do you want multiple processes? If you are worried about the overhead of communicating between processes, just run your threads in one process. There is no reason your different layers have to be in different processes.
But anyway, I am not convinced that your original statement that RMI is slow and doesn't scale is completely correct. If it is not scaling, you are probably not using the right framework. Maybe the issue is that you only have one RMI endpoint on the server. Have you considered a J2EE system with stateless session beans?
Without knowing about your requirements, it is hard to say.
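To illustrate the "threads in one process" suggestion, here is a minimal sketch in which two layers run as threads in the same JVM and hand data to each other through an in-memory queue. The class and method names (InProcessLayers, sumThroughQueue) are made up for the example, and the integers stand in for whatever change records the real system would pass around.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names: a sketch of two layers as threads in one JVM.
final class InProcessLayers {
    // A "lower layer" thread produces the values 1..count and an "upper layer"
    // thread consumes and sums them. The hand-off is a bounded in-memory queue,
    // so no IPC or RMI is involved.
    static int sumThroughQueue(int count) {
        BlockingQueue<Integer> changes = new ArrayBlockingQueue<>(16);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            pool.submit(() -> {
                for (int i = 1; i <= count; i++) {
                    changes.put(i); // blocks while the queue is full
                }
                return null;
            });
            Future<Integer> total = pool.submit(() -> {
                int sum = 0;
                for (int i = 0; i < count; i++) {
                    sum += changes.take(); // blocks while the queue is empty
                }
                return sum;
            });
            return total.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Calling InProcessLayers.sumThroughQueue(100) pushes 100 items from one thread to the other. The bounded queue gives back-pressure for free: a fast producer simply blocks until the consumer catches up.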

In general it is not possible to share a thread between two processes, due to OS design. The problem of sharing data between two or more processes is usually solved by sharing files, sharing a database, exchanging messages (which in turn can be synchronous or asynchronous), having the processes communicate via pipes (on Linux, say), or even sharing memory. Your scenario description is not very precise; you need to describe all the processes, how information is supposed to flow between them, what triggers that flow, etc.
Most likely you need a high-performance messaging library; https://github.com/real-logic/Aeron/ is one. But to get a precise answer you would need to describe better exactly what overhead you want to minimize.

If your goal is to notify users, you should consider publish/subscribe messaging (pub/sub). There are many middleware vendors out there that provide this architecture though most are expensive in production scenarios. For open source, check out http://redis.io/topics/pubsub. (No affiliation.)

Related

2 programs that send messages to each other in Java [duplicate]

I have the following situation:
I have 2 JVM processes (really 2 Java processes running separately, not 2 threads) running on a local machine. Let's call them ProcessA and ProcessB.
I want them to communicate (exchange data) with one another (e.g. ProcessA sends a message to ProcessB to do something).
Right now I work around this by writing a temporary file that both processes periodically scan for messages. I don't think this solution is very good.
What would be a better alternative to achieve what I want?

There are multiple options for IPC:
Socket-Based (Bare-Bones) Networking
- not necessarily hard, but:
  - might be verbose for not much,
  - might offer more surface for bugs, as you write more code.
- you could rely on existing frameworks, like Netty
RMI
- Technically, that's also network communication, but that's transparent for you.
Fully-fledged Message Passing Architectures
- usually built on either RMI or network communications as well, but with support for complicated conversations and workflows
- might be too heavy-weight for something simple
- frameworks like ActiveMQ or JBoss Messaging
Java Management Extensions (JMX)
- meant more for JVM management and monitoring, but could help to implement what you want if you mostly want one process to query another for data, or to send it requests for actions, as long as they aren't too complex
- also works over RMI (amongst other possible protocols)
- not so simple to wrap your head around at first, but actually rather simple to use
File-sharing / File-locking
- that's what you're doing right now
- it's doable, but comes with a lot of problems to handle
Signals
- You can simply send signals to your other process
- However, it's fairly limited and requires you to implement a translation layer (it is doable, but it is more a crazy idea to toy with than anything serious).
Without more details, a bare-bones network-based IPC approach seems the best, as it's the:
- most extensible (in terms of adding new features and workflows to your app)
- most lightweight (in terms of memory footprint for your app)
- simplest (in terms of design)
- most educational (in terms of learning how to implement IPC). You mentioned in a comment that "socket is hard"; it really is not, and it is something you should work on.
That being said, based on your example (simply requesting the other process to do an action), JMX could also be good enough for you.

I've added a library on GitHub called Mappedbus (http://github.com/caplogic/mappedbus) which enables two (or many more) Java processes/JVMs to communicate by exchanging messages. The library uses a memory-mapped file and makes use of fetch-and-add and volatile reads/writes to synchronize the different readers and writers. I've measured the throughput between two processes using this library at 40 million messages/s, with an average latency of 25 ns for reading/writing a single message.
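For readers unfamiliar with the underlying mechanism, the following is a rough sketch of a memory-mapped file used as a shared region, written with plain java.nio. It is not the Mappedbus API, and the names (MappedFileSketch, roundTrip) are invented for the example. It maps the same temporary file twice inside one JVM as a stand-in for two processes, and it leaves out all the synchronization (fetch-and-add, volatile access) that a real message bus needs. How quickly changes propagate between mappings is ultimately up to the operating system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical names: illustrates the mechanism only, not the Mappedbus API.
final class MappedFileSketch {
    // Maps the first `size` bytes of the file for reading and writing.
    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer map(Path file, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // Writes a long through one mapping and reads it back through a second,
    // independent mapping of the same file. In a real setup the two mappings
    // would live in two different processes.
    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("ipc-sketch", ".bin");
            file.toFile().deleteOnExit();
            MappedByteBuffer writer = map(file, Long.BYTES);
            MappedByteBuffer reader = map(file, Long.BYTES);
            writer.putLong(0, value);
            return reader.getLong(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With two real processes, each would call map on an agreed file path, and they would need some protocol (sequence numbers, a commit flag per record, and so on) to tell when a record is completely written.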

What you are looking for is inter-process communication. Java provides a simple IPC framework in the form of the Java RMI API. There are several other mechanisms for inter-process communication, such as pipes, sockets and message queues (these are all concepts, obviously, so there are frameworks that implement them).
I think in your case Java RMI or a simple custom socket implementation should suffice.

Use sockets with DataInputStream/DataOutputStream to send data back and forth (or ObjectInputStream/ObjectOutputStream for serializable Java objects). This is easier than using a disk file, and much easier than Netty.
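A minimal sketch of that approach, assuming a simple one-request, one-reply exchange over the loopback interface. The names (SocketIpcSketch, request, demo) and the "done: ..." reply format are invented for the example. The receiving side runs on a thread here only so the sketch fits in one file; in practice it would be ProcessB.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical names: one request, one reply over a loopback TCP socket.
final class SocketIpcSketch {
    // "ProcessB" side: accepts one connection, reads one command, replies.
    static void serveOnce(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 DataInputStream in = new DataInputStream(s.getInputStream());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                String command = in.readUTF();
                out.writeUTF("done: " + command);
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // "ProcessA" side: connects, sends a command, waits for the reply.
    static String request(int port, String command) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             DataOutputStream out = new DataOutputStream(s.getOutputStream());
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            out.writeUTF(command);
            out.flush();
            return in.readUTF();
        }
    }

    // Wires both sides together on an OS-assigned free port.
    static String demo(String command) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            serveOnce(server);
            return request(server.getLocalPort(), command);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Binding to the loopback address keeps the socket off the external network interfaces, which may also help with the firewall concerns that come up with socket-based IPC.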

I tend to use JGroups to form local clusters between processes. It works for nodes (aka processes) on the same machine, within the same JVM or even across different servers.
Once you understand the basics it is easy to work with, and the option to run two or more processes in the same JVM makes those processes easy to test.
The overhead and latency are minimal if both are on the same machine (usually only a TCP roundtrip of roughly 100 ns or more per action).

Sockets may be a better choice, I think.

Back in 2004 I implemented code that does this job with sockets. Since then I have searched many times for a better solution, because the socket approach triggers firewalls and worries my clients. I have not found a better one so far. The client must serialize the data and send it; the server must receive and deserialize it.
It is easy.

What are good use cases for an in-process events system vs microservices with a broker?

I've seen recently that there are different frameworks out there that allow the use of a messaging architecture but implemented in process, both using same and different threads. The ones I know about are Spring, Guava EventBus and Reactor.
My question is what good use cases there are for using them instead of sending messages to a full-fledged broker. I understand that they allow better decoupling of the business logic, but in a microservices architecture you would normally publish events to be consumed by other microservices. The advantage of that is the fault tolerance you get from a cluster of brokers, where a message that failed because of a fault in one instance can be retried by another. Implementing logic that is decomposed and executed by sending messages that are later consumed by the same system, especially when the subscribers run in different threads, seems to me to make it difficult to put the data back into a consistent state afterwards.

The advantage of microservices over an in-process approach is not really in how messages are consumed.
Microservices allow you to execute portions of your code on specific nodes within a cluster, so you can put the heavy calculations on powerful machines and the secondary or light work on less powerful ones. Overall this lets you balance performance better and scale resources for the portions of code that need it.
Also, whenever you update the code of a microservice you do not impact the other services, so your changes (and errors) are isolated. If everything runs within the same process, any bad update might render the entire solution unusable.
Finally, moving the communication out of your process (to a third-party broker) allows you to share it with more people, agents, processes, etc. Otherwise they have to become part of your process (as a module?), which is really not efficient.
Honestly, the only good reason for intra-process communication within your monolith is speed (in-memory communication rather than on-the-wire communication).
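To make "in-memory communication" concrete, here is a bare-bones in-process publish/subscribe sketch. It is not the API of Spring, Guava EventBus or Reactor; TinyEventBus and its methods are invented for illustration, and it delivers synchronously on the publisher's thread.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical names: a bare-bones in-process pub/sub, not a real library's API.
final class TinyEventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // Registers a handler for a topic.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // Delivers the message to every handler of the topic on the caller's
    // thread and returns how many handlers were invoked.
    int publish(String topic, String message) {
        List<Consumer<String>> handlers =
                subscribers.getOrDefault(topic, Collections.emptyList());
        for (Consumer<String> handler : handlers) {
            handler.accept(message);
        }
        return handlers.size();
    }
}
```

A publish here is just a method call per subscriber, which is where the speed comes from. It is also why there is no retry, persistence or isolation: if a handler throws, the exception propagates straight back to the publisher.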

Multiplayer card game on server using RPC

Basically I want a Java, Python, or C++ script running on a server, listening for player instances to: join, call, bet, fold, draw cards, etc and also have a timeout for when players leave or get disconnected.
Basically I want each of these actions to be a small request, so that players could either be processes on same machine talking to a game server, or machines across network.
Security of messaging is not an issue, this is for learning/research/fun.
My priorities:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Speed. I'm going to be playing millions of these hands as fast as I can.
Run on a shared server instance (I may have limited access to ports or things that need root)
My questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
Any good frameworks to work off of?
Message types? I'm thinking JSON or Protocol Buffers.
How to make it FAST?
Thanks guys - just looking for some pointers and suggestions. I think it is a cool problem with a lot of neat things to learn doing it.

As far as frameworks go, Ginkgo looks promising for building a network service (which is what you're doing). The Python is very straightforward, and the asynchronicity enabled by gevent lets you do asynchronous things without generally having to worry about callbacks. The gevent core also gives you access to a lot of building blocks.
Rather than having lots of services communicating over ports, you might look into either 1) a good message queue, like RabbitMQ or 0mq, or 2) a distributed coordination server, like Zookeeper.
That being said, what you aim to do is difficult, especially if you're not familiar with the basics. It's a worthwhile endeavor to learn about those basics.
Don't worry about speed at first. Get it working, then make it scale. Of course, there are directions you can go that will make it easier to scale in the future. Zookeeper in particular gives you easy-to-implement primitives for scaling horizontally (i.e. multiple workers sharing the load). In particular, see the Zookeeper recipe book and the corresponding Python implementations (courtesy of kazoo, a gevent-based client library).
Don't forget that "fast" also means optimizing your own development time, for quicker iterations and less time cursing your development environment. So use Python, which will let you get up and running quickly now, and optimize later if you really truly start to bind on CPU time or memory use. (With this particular application, you're far more likely to bind on network IO.)

Anything else? Maybe a cup of coffee to go with your question :-)
Answering your question from the ground up would require several books worth of text with topics ranging from basic TCP/IP networking to scalable architectures, but I'll try to give you some direction nevertheless.
Questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
I would venture that if you're not clear on the definition of each of these, maybe designing and implementing a service that will "be playing millions of these hands as fast as I can" is a bit, hmm, over-reaching? But don't let that stop you; as they say, "ignorance is bliss."
Any good frameworks to work off of?
I think your project is a good candidate for Node.js. The main reason is that Node.js is relatively scalable and is good at hiding the complexity required for that scalability. There are downsides to Node.js; just search Google for 'Node.js scalability criticism'.
The main point against Node.js, as opposed to a more general-purpose framework, is that scalability is difficult, there is no way around it, and Node.js, being so high-level and specific, provides fewer options for solving tough problems.
The other drawback is that Node.js is JavaScript, not Java or Python as you prefer.
Message types? I'm thinking JSON or Protocol Buffers.
I don't think there's going to be a lot of traffic between client and server, so it doesn't really matter. I'd go with JSON just because it is more prevalent.
How to make it FAST?
The real question is how to make it scalable. Running human vs human card games is not computationally intensive, so you're probably going to run out of I/O capacity before you reach any computational limit.
Overcoming these limitations is done by spreading the load across machines. The common way to do this in multiplayer games is to have a list server that provides links to identical game servers, with each server having a predefined number of slots available for players.
This is a variation of a broker-workers architecture, where the broker machine assigns a worker machine to clients based on how busy the workers are. In gaming, users want to be able to select their server so they can play with their friends.
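As a rough sketch of the broker side (in Java, since the question allows it), the class below tracks free slots per game server and hands each new player to the server with the most room. GameServerBroker, register and assign are hypothetical names, and ties are broken by server name purely so the behaviour is predictable.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical names: a broker that assigns players to the least busy server.
final class GameServerBroker {
    // Server name -> free player slots. TreeMap keeps a stable iteration order.
    private final Map<String, Integer> freeSlots = new TreeMap<>();

    // Announces a game server with a predefined number of player slots.
    synchronized void register(String serverName, int slots) {
        freeSlots.put(serverName, slots);
    }

    // Picks the server with the most free slots (first by name on a tie),
    // reserves one slot on it, and returns its name. Empty if all are full.
    synchronized Optional<String> assign() {
        String best = null;
        int bestFree = 0;
        for (Map.Entry<String, Integer> e : freeSlots.entrySet()) {
            if (e.getValue() > bestFree) {
                best = e.getKey();
                bestFree = e.getValue();
            }
        }
        if (best == null) {
            return Optional.empty();
        }
        freeSlots.put(best, bestFree - 1);
        return Optional.of(best);
    }
}
```

A list server that lets players choose their own server would expose the freeSlots view instead of calling assign, but the bookkeeping stays the same.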
Related:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Since this is on human time scales (seconds as opposed to milliseconds), the client should send keepalives, say, every 10 seconds, with, say, a 30-second session timeout.
The keepalives would be JSON messages in your application protocol, not in HTTP, which is lower level and handled by the framework.
The framework itself should provide you with HTTP 1.1 connection management/pooling, which allows several HTTP sessions (request/response) to go through the same connection but does not require the client to be always connected. This is a good compromise between reliability and speed and should be good enough for turn-based card games.
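The server-side bookkeeping for that keepalive scheme could look roughly like this (sketched in Java rather than Node.js). KeepaliveTracker and its methods are made-up names; the timeout is a constructor parameter and the current time is passed in explicitly so the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names: tracks the last keepalive per player and expires stale sessions.
final class KeepaliveTracker {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    KeepaliveTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Call whenever a keepalive (or any other message) arrives from a player.
    void onKeepalive(String playerId, long nowMillis) {
        lastSeen.put(playerId, nowMillis);
    }

    // Call periodically. Removes and returns every player whose last
    // keepalive is older than the timeout.
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            boolean stale = nowMillis - e.getValue() > timeoutMillis;
            // remove(key, value) does nothing if a fresh keepalive raced in.
            if (stale && lastSeen.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

With a 30-second timeout and keepalives every 10 seconds, a player has to miss about three keepalives in a row before being dropped, which leaves room for ordinary network latency.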

Honestly, I'd start with classic LAMP. Take a stock Apache server and a MySQL database, and put your Python scripts in the cgi-bin directory. The fact that they're sending and receiving JSON instead of HTML doesn't make much difference.
This is obviously not going to be the most flexible or scalable solution, of course, but it forces you to confront the actual problems as early as possible.
The first problem you're going to run into is game state. You claim there is no shared state, but that's not right—the cards in the deck, the bets on the table, whose turn it is—that's all state, shared between multiple players, managed on the server. How else could any of those commands work? So, you need some way to share state between separate instances of the CGI script. The classic solution is to store the state in the database.
Of course you also need to deal with user sessions in the first place. The details depend on which session-management scheme you pick, but the big problem is how to propagate a disconnect/timeout from the lower level up to the application level. What happens if someone puts $20 on the table and then disconnects? You have to think through all of the possible use cases.
Next, you need to think about scalability. You want millions of games? Well, if there's a single database with all the game state, you can have as many web servers in front of it as you want: John Doe may be on server1 while Joe Schmoe is on server2, but they can be in the same game. On the other hand, you can have a separate database for each server, as long as you have some way to force people in the same game to meet on the same server. Which one makes more sense? Either way, how do you load-balance between the servers? (You not only want to keep them all busy, you want to avoid the situation where 4 players are all ready to go, but they're on 3 different servers, so they can't play each other…)
The end result of this process is going to be a huge mess of a server that runs at 1% of the capacity you hoped for, that you have no idea how to maintain. But you'll have thought through your problem space in more detail, and you'll also have learned the basics of server development, both of which are probably more important in the long run.
If you've got the time, I'd next throw the whole thing out and rewrite everything from scratch by designing a custom TCP protocol, implementing a server for it in something like Twisted, keeping game state in memory, and writing a simple custom broker instead of a standard load balancer.

How to parallelize execution on remote systems

What's a good method for assigning work to a set of remote machines? Consider an example where the task is very CPU and RAM intensive, but doesn't actually process a large dataset. The language of choice would be Java. I was thinking Hadoop would be a good option, but the dataset passed between remote machines is fairly small, and Hadoop seems to focus mainly on the distribution of data rather than distribution of work.
What are some good technologies that can help?
EDIT: I'm mainly interested in load balancing. There will be a series of jobs with a small (< 3MB) dataset, but significant processing and memory needs.

MPI would probably be a good choice; there's even a Java implementation.

MPI may be part of your answer, but looking at the question, I'm not sure if it addresses the portion of the problem you care about.
MPI provides a communication layer between processing components. It is low level requiring you to do a fair amount of work, but from what I saw in an introduction presentation, it also comes with some common matrix data manipulation functions.
In your question, you seem to be more interested in the load balancing/job processing aspects of the problem. If that really is your focus, maybe a small program hosted in a Servlet or an RMI server might be sufficient. Let each program go to the server for their next unit of work and then submit the results back (you might even be able to use a database/file share, but pay attention to locking issues). In other words, a pull mechanism versus a push mechanism.
This approach is fairly simple to implement and gives you the advantage of scaling up by just running more distributed clients. Load balancing isn't too important if you intend to allow your process to take full control of the machine. You can experiment with running multiple clients on a machine that has multiple cores to see if you can improve overall throughput for the node. A multi-threaded client would be more efficient, but can increase complexity depending on the structure of the code you are using to solve the problem.
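The pull mechanism can be sketched as follows. Everything here is hypothetical: PullJobServer stands in for the Servlet or RMI endpoint, worker threads stand in for the remote clients, and squaring a number stands in for the CPU-heavy job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical names: a pull-based job server with threads standing in for remote clients.
final class PullJobServer {
    private final Queue<Integer> pending = new ConcurrentLinkedQueue<>();
    private final Map<Integer, Long> results = new ConcurrentHashMap<>();

    void addJob(int input) { pending.add(input); }

    // A worker asks for its next unit of work; empty means nothing is left.
    Optional<Integer> nextJob() { return Optional.ofNullable(pending.poll()); }

    // A worker hands its result back.
    void submitResult(int input, long result) { results.put(input, result); }

    // Stand-in for a remote client: keeps pulling until the server runs dry.
    static Runnable worker(PullJobServer server) {
        return () -> {
            Optional<Integer> job;
            while ((job = server.nextJob()).isPresent()) {
                int n = job.get();
                server.submitResult(n, (long) n * n); // placeholder for heavy work
            }
        };
    }

    // Queues the jobs 1..jobs, runs the given number of workers, returns the results.
    static Map<Integer, Long> runDemo(int jobs, int workers) {
        PullJobServer server = new PullJobServer();
        for (int i = 1; i <= jobs; i++) server.addJob(i);
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(worker(server));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return server.results;
    }
}
```

Because each worker only asks for more work when it is idle, faster machines naturally take more jobs, which is most of the load balancing the question asks about. A production version would also need to re-queue jobs whose worker disappears before submitting a result.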

High availability and scalable platform for Java/C++ on Solaris

I have an application that's a mix of Java and C++ on Solaris. The Java aspects of the code run the web UI and establish state on the devices that we're talking to, and the C++ code does the real-time crunching of data coming back from the devices. Shared memory is used to pass device state and context information from the Java code through to the C++ code. The Java code uses a PostgreSQL database to persist its state.
We're running into some pretty severe performance bottlenecks, and right now the only way we can scale is to increase memory and CPU counts. We're stuck on the one physical box due to the shared memory design.
The really big hit here is being taken by the C++ code. The web interface is fairly lightly used to configure the devices; where we're really struggling is to handle the data volumes that the devices deliver once configured.
Every piece of data we get back from the device has an identifier in it which points back to the device context, and we need to look that up. Right now there's a series of shared memory objects that are maintained by the Java/UI code and referred to by the C++ code, and that's the bottleneck. Because of that architecture we cannot move the C++ data handling off to another machine. We need to be able to scale out so that various subsets of devices can be handled by different machines, but then we lose the ability to do that context lookup, and that's the problem I'm trying to resolve: how to offload the real-time data processing to other boxes while still being able to refer to the device context.
I should note we have no control over the protocol used by the devices themselves, and there is no possible chance that situation will change.
We know we need to move away from this to be able to scale out by adding more machines to the cluster, and I'm in the early stages of working out exactly how we'll do this.
Right now I'm looking at Terracotta as a way of scaling out the Java code, but I haven't got as far as working out how to scale out the C++ to match.
As well as scaling for performance we need to consider high availability as well. The application needs to be available pretty much the whole time -- not absolutely 100%, which isn't cost effective, but we need to do a reasonable job of surviving a machine outage.
If you had to undertake the task I've been given, what would you do?
EDIT: Based on the data provided by @john channing, I'm looking at both GigaSpaces and Gemstone. Oracle Coherence and IBM ObjectGrid appear to be Java-only.

The first thing I would do is construct a model of the system to map the data flow and try to understand precisely where the bottleneck lies. If you can model your system as a pipeline, then you should be able to use the theory of constraints (most of the literature is about optimising business processes but it applies equally to software) to continuously improve performance and eliminate the bottleneck.
Next I would collect some hard empirical data that accurately characterises the performance of your system. It is something of a cliché that you cannot manage what you cannot measure, but I have seen many people attempt to optimise a software system based on hunches and fail miserably.
Then I would use the Pareto Principle (80/20 rule) to choose the small number of things that will produce the biggest gains and focus only on those.
To scale a Java application horizontally, I have used Oracle Coherence extensively. Although some dismiss it as a very expensive distributed hashtable, the functionality is much richer than that and you can, for example, directly access data in the cache from C++ code.
Other alternatives for horizontally scaling your Java code would be GigaSpaces, IBM ObjectGrid or Gemstone GemFire.
If your C++ code is stateless and is used purely for number crunching, you could look at distributing the process using IceGrid, which has bindings for all of the languages you are using.

You need to scale sideways and out. Maybe something like a message queue could be the backend between the frontend and the crunching.

Andrew, in addition to modeling the system as a pipeline etc., measuring things is important. Have you run a profiler over the code and got metrics on where most of the time is spent?
For the database code, how often does the data change? Are you looking at caching at the moment? I assume you have looked at indexes etc. over the data to speed up the DB?
What levels of traffic do you have on the front end? Are you caching web pages? It isn't too hard to use, say, a JMS-type API to communicate between components. You can then put the web page component on one machine (or more) and the integration code (C++) on another; many JMS products have native C++ APIs (ActiveMQ comes to mind). But it really helps to know how much of the time is spent in web (JSP?), C++ and database operations.
Is the database storing business data, or is it also being used to pass data between Java and C++? You say you are using shared memory, not JNI? What level of multi-threading currently exists in the app? Would you describe the code as synchronous or asynchronous in nature?
Is there a physical relationship between the Solaris code and the devices that must be maintained (i.e. do all the devices register with the C++ code, or can that be specified)? For example, if you were to put a web load balancer on the front end and just put up 2 machines today, is the relationship of which devices are managed by which box initialized up front?
What are the HA requirements? Just state info? Can HA be done just in the web tier by clustering session data?
Is the DB running on another machine?
How big is the DB? Have you optimized your queries? For example, using explicit inner/outer joins instead of nested subqueries sometimes helps (again, look at the SQL stats).
