I read the paper "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing". Alas, the SDK does not yet expose the accumulating & retracting triggering mode (section 2.3).
I was wondering if there was a workaround for getting similar semantics?
I have been reading the source and have figured out that StateTag or StateNamespace may be the way I can store the "last emitted value of the window", which could then be used to calculate the retraction message further down the pipeline. Is this the correct path, or are there other classes or approaches I should look at?
The upcoming state API is indeed your best bet for emulating retractions. The classes you mentioned are part of the state API, but everything in the com.google.cloud.dataflow.sdk.util package is for internal use only; we make no guarantees that those APIs won't change drastically, or that they will ever be released in their current form. That said, releasing that API is on our roadmap, and I'm hopeful we'll get it released relatively soon.
One thing to keep in mind: all the code downstream of your custom retractions will need to be able to differentiate them from normal records. This is something we'll do automatically for you once bona fide retraction support is ready, but in the meantime, you'll need to make sure that all the code you write that might receive a retraction knows how to recognize and handle it as such.
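To make the idea concrete, here is a minimal sketch in plain Java that does not use the Dataflow SDK at all. It assumes a simple in-memory map as a stand-in for per-window state, and the names Update, RetractingEmitter, and RunningSum are hypothetical, invented only for this illustration. It shows the two halves of the workaround: remembering the last emitted value so a retraction can be produced, and a downstream consumer that checks a flag to tell retractions from normal records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type: a value plus a flag telling downstream code
// whether this is a retraction of a previously emitted value.
final class Update {
    final String window;
    final long value;
    final boolean retraction;

    Update(String window, long value, boolean retraction) {
        this.window = window;
        this.value = value;
        this.retraction = retraction;
    }
}

// Remembers the last value emitted per window (a stand-in for per-window
// state) and pairs every new value with a retraction of the old one.
final class RetractingEmitter {
    private final Map<String, Long> lastEmitted = new HashMap<>();

    List<Update> emit(String window, long newValue) {
        List<Update> out = new ArrayList<>();
        Long previous = lastEmitted.put(window, newValue);
        if (previous != null) {
            out.add(new Update(window, previous, true)); // retract the old value
        }
        out.add(new Update(window, newValue, false));    // then emit the new one
        return out;
    }
}

// Example downstream consumer: a running sum that subtracts retractions.
final class RunningSum {
    private long total;

    void accept(Update u) {
        total += u.retraction ? -u.value : u.value;
    }

    long total() {
        return total;
    }
}
```

Every consumer that can see an Update has to branch on the retraction flag in the same way RunningSum does; forgetting that branch in even one place would double-count the old value.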
If I have many queues and each has a unique ID, would a Hashtable of Queues be the way to go? I know it sounds strange that I'm asking this, but I'm wondering whether there might be a more efficient way.
Sorry for the lack of information. I'm basically storing queues of messages which are identified by the client id.
The client will request to get messages from the server.
In the case when the ack does not reach the server, the message still remains in the queue until the client makes another attempt to get the oldest message.
The idea is to retain all the messages if the client fails to ack and to retrieve all messages in a FIFO manner.
The question doesn't provide any detail on what you want to do with this. And this is very important, because the usage pattern is critical in determining which data structure is going to be most efficient for your use case.
But I'd say that in the absence of other details, a HashTable of Queues sounds like a sensible choice, with the HashTable using the ID as a key and the corresponding Queue as a value.
Then the following operations will both be O(1) with very low overhead:
Add an item to the queue with a given ID
Pull the first item from the Queue with a given ID
This is probably the usage pattern you are going to need in most cases.
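A minimal sketch of that structure, assuming String client IDs and String messages (the class name ClientMailboxes and its method names are hypothetical). It uses HashMap and ArrayDeque rather than the legacy Hashtable, and it separates "look at the oldest message" from "remove it", so that a message stays queued until the client's ack arrives:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical per-client mailbox: one FIFO queue per client ID.
final class ClientMailboxes {
    private final Map<String, Queue<String>> queues = new HashMap<>();

    // Append a message to the queue for the given client (amortized O(1)).
    void enqueue(String clientId, String message) {
        queues.computeIfAbsent(clientId, id -> new ArrayDeque<>()).add(message);
    }

    // Return the oldest message without removing it, or null if none.
    // The message stays queued until the client acknowledges it.
    String peekOldest(String clientId) {
        Queue<String> q = queues.get(clientId);
        return q == null ? null : q.peek();
    }

    // Drop the oldest message once the client's ack has arrived.
    void ack(String clientId) {
        Queue<String> q = queues.get(clientId);
        if (q != null) {
            q.poll();
            if (q.isEmpty()) {
                queues.remove(clientId); // avoid keeping empty queues around
            }
        }
    }
}
```

This sketch is not thread-safe. If several threads serve clients at once, some form of synchronization (for example a ConcurrentHashMap with a concurrent queue per client) would be needed on top of it.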
Since Java is statically typed, I would definitely use a hashtable... (that is, if we are talking about optimization)
I'm trying to work out the best approach to communication for the game I'm writing. The scenario is simple: TCP sockets carrying authentication requests, map updates, chat updates, etc. My idea was to use a set of classes such as User, Map, and Creature, plus a Message class holding an enum of message types and an Object that stores one of those classes. I would then convert the Message to JSON with GSON, and on the other side decode it according to the message type indicated by the enum value. The problem is that I will sometimes send a lot of unnecessary data, which bothers me, and adding new message types will not be easy either for me or for anyone else who might use it. In the previous version I used my own XML protocol, which I wasn't happy with either.
So I'm asking for advice on a better way to handle the communication, or for improvements to my idea.
Thanks in advance,
Serhiy.
XML and JSON are intended to make application integration simple while still being human-readable.
If you want a protocol tuned to your needs, I suggest you start by determining what information you want to send and how it would look. Document this before you even start implementing it. That way the data sent will suit your needs. (This is more work BTW which is why it is not done more often)
I have an existing protocol that I'd like to write a Java client for. The protocol consists of messages that have a header containing the message type and message length, followed by the announced number of bytes, which is the payload.
I'm having some trouble modeling it. Creating a class for each message type seems a bit excessive to me (that would turn out to be 20+ classes just to represent the messages that go over the wire), so I was thinking about alternative models, but I can't come up with one that works.
I don't want anything fancy to work on the messages aside from notifying via publish subscribe when a message comes in and in some instances reply back.
Any pointers as to where to look?
A class for each message type is the natural OO way to model this. The fact that there are 20 classes should not put you off. (Depending on the relationship between the messages, you can probably implement common features in superclasses.)
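As a rough sketch of what that could look like, the example below puts the shared header handling in a superclass and leaves only the payload to each subclass. The question does not give the real wire format, so the layout used here (a 1-byte type followed by a 4-byte big-endian length) is an assumption, and the PingMessage and TextMessage types are hypothetical placeholders for the real message set.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Common superclass: shared framing logic lives here, so each concrete
// message type only has to describe its own payload.
abstract class Message {
    abstract byte type();
    abstract byte[] payload();

    // Frame layout assumed for this sketch: [type:1][length:4][payload].
    final ByteBuffer encode() {
        byte[] body = payload();
        ByteBuffer buf = ByteBuffer.allocate(5 + body.length);
        buf.put(type()).putInt(body.length).put(body);
        buf.flip();
        return buf;
    }

    // Reads one complete frame from the buffer and picks the class by type.
    static Message decode(ByteBuffer buf) {
        byte type = buf.get();
        int length = buf.getInt();
        byte[] body = new byte[length];
        buf.get(body);
        switch (type) {
            case PingMessage.TYPE:
                return new PingMessage();
            case TextMessage.TYPE:
                return new TextMessage(new String(body, StandardCharsets.UTF_8));
            default:
                throw new IllegalArgumentException("unknown message type " + type);
        }
    }
}

// A message with no payload at all.
final class PingMessage extends Message {
    static final byte TYPE = 1;

    @Override byte type() { return TYPE; }
    @Override byte[] payload() { return new byte[0]; }
}

// A message whose payload is a UTF-8 string.
final class TextMessage extends Message {
    static final byte TYPE = 2;
    final String text;

    TextMessage(String text) { this.text = text; }

    @Override byte type() { return TYPE; }
    @Override byte[] payload() { return text.getBytes(StandardCharsets.UTF_8); }
}
```

With this shape, each of the 20+ message classes is only a few lines long, and the publish/subscribe side can dispatch on the concrete class of the decoded Message.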
My advice is to not worry too much about efficiency to start with. Just focus on getting clean APIs that provide the required functionality. Once you've got things working, profile the code and see if the protocol classes are a significant bottleneck. If they are ... then think about how to make the code more efficient.
Here is a generic question. I'm not in search of the best answer, I'd just like you to express your favourite practices.
I want to implement a network protocol in Java (but this is a rather general question; I faced the same issues in C++). This is not the first time, as I have done this before, but I think I am missing a good way to implement it. Usually it's all about exchanging text messages and some byte buffers between hosts, storing the status, and waiting until the next message comes. The problem is that I usually end up with a bunch of switch and more or less complex if statements that react to different statuses and messages. The whole thing usually gets complicated and hard to maintain. Not to mention that what comes out sometimes has a "blind spot", by which I mean statuses of the protocol that have not been covered and that behave in an unpredictable way. I tried to write some state machine classes that take care of checking the start and end statuses for each action in more or less smart ways. This makes programming the protocol very complicated, as I have to write lines and lines of code to cover every possible situation.
What I'd like is something like a good pattern or best practice for programming complex protocols that is easy to maintain and extend, and very readable.
What are your suggestions?
Read up on the State design pattern to learn how to avoid lots of switch statements.
"sometimes what comes out has some "blind spot", I mean statuses of the protocol that have not been covered..."
State can help avoid gaps. It can't guarantee a good design, you still have to do that.
"...as I have to write lines and lines of code to cover every possible situation."
This should not be considered a burden or a problem: You must write lines of code to cover every possible situation.
State can help because you get to leverage inheritance. It can't guarantee a good design, you still have to do that.
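Here is a minimal sketch of the State pattern applied to a made-up protocol, assuming three states and text messages HELLO, DATA, and BYE (all of the class names and messages are hypothetical). The base class supplies the fallback for any message a state does not handle, which is one way inheritance helps close the "blind spots": an unexpected message always leads to a defined outcome.

```java
// Hypothetical protocol with three states and a handful of text messages.
abstract class ProtocolState {
    // Default behaviour for anything a state does not explicitly handle,
    // so no message/state combination is left undefined.
    ProtocolState onMessage(String msg) {
        return new ClosedState("unexpected '" + msg + "' in " + getClass().getSimpleName());
    }
}

final class IdleState extends ProtocolState {
    @Override
    ProtocolState onMessage(String msg) {
        return msg.equals("HELLO") ? new ReadyState() : super.onMessage(msg);
    }
}

final class ReadyState extends ProtocolState {
    @Override
    ProtocolState onMessage(String msg) {
        if (msg.startsWith("DATA ")) {
            return this; // stay ready for more data
        }
        if (msg.equals("BYE")) {
            return new ClosedState("peer said BYE");
        }
        return super.onMessage(msg);
    }
}

final class ClosedState extends ProtocolState {
    final String reason;

    ClosedState(String reason) {
        this.reason = reason;
    }

    @Override
    ProtocolState onMessage(String msg) {
        return this; // ignore everything once closed
    }
}

// The connection just delegates to whatever the current state is.
final class Connection {
    private ProtocolState state = new IdleState();

    void receive(String msg) {
        state = state.onMessage(msg);
    }

    ProtocolState state() {
        return state;
    }
}
```

There is no switch on the current state anywhere; adding a new state means adding a class, and each class only lists the messages it cares about.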
Designing a protocol is usually all about the application space you are working within. For instance, HTTP is all about handling web pages, graphics, and posts, while FTP is all about transferring files.
So in short, to start, you should decide what application space you are in, then define the actions that need to be taken. Finally, before you start designing your actual protocol, you should seriously, seriously hunt for another protocol stack that does what you want to do and avoid implementing a protocol stack altogether. Only after you have determined that something pre-built absolutely won't work for you should you start building your own protocol stack.
In C++ you can use the Boost.Spirit library to parse your protocol messages easily. The only "difficulty" is defining the grammar of your message protocol. Take a look at the Gnutella source code to see how they solve this problem. The Gnutella protocol specification is here: http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf
Finite State Machine is what you want
FSM
So you define a whole bunch of states that you can be in as a receiver or sender (idle, connecting_phase1, connecting_phase2, packet expected,...)
Then define all the possible events (packet1 arrives, net closes, ...)
Finally, you have a table that says 'when in state x and event n happens, do func y and transition to state q' for every state and event (many entries will be null or duplicates).
Edit - how to make a FSM (rough sketch)
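struct Context;                     // per-connection data, defined elsewhere
typedef void (*FSMFunc)(Context*);  // signature shared by all action functions

struct FSMNode
{
    int m_nextState;   // state to move to after the action has run
    FSMFunc m_func;    // action to run for this (state, event) pair
};

// bang, wiz, fertang, noop and ole are action functions declared elsewhere
FSMNode states[NUMSTATES][NUMEVENTS] =
{
    { // state 0
        {3, bang},     // event 0
        {2, wiz},
        {1, fertang}
    },
    { // state 1
        {1, noop},     // event 0
        {1, noop},
        {3, ole}
    },
    // .......
};

// dispatch: look up the entry, run its action, then change state
FSMNode node = states[mystate][event];
node.m_func(context);
mystate = node.m_nextState;
This is only a rough sketch with placeholder names, but I hope you get the drift.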
Why not use XML as your protocol? You can encapsulate and categorize all your pieces of data inside XML nodes
Can't give you an example myself, but how about looking at how other (competent) people are doing it?
Like this one?
http://anonsvn.jboss.org/repos/netty/trunk/src/main/java/org/jboss/netty/handler/codec/http/
P.S. For that matter, I actually recommend using Netty as your network framework and building your protocol on top of it. It should be very easy, and you'll probably get rid of a bunch of headaches...
If you are using Java, consider looking at Apache MINA; its documentation and samples should point you in the right direction.
Right-click the network connection icon in the System Tray.
Click Troubleshoot problems.
The troubleshooter may find and fix the problem, in which case you can get back to work right away.
If the troubleshooter can't fix the Winsock problem, you may get an error like:
"One or more network protocols are missing on this computer"