Problem with TCP nodes

tmp · October 2, 2015, 6:03pm

Hi there!

Today I encountered a problem with the TCP nodes (see patch below).

When sending big amounts of data I expect the TCP nodes to handle everything so that in the end I get what I have sent. But as you can see in the patch the length of received data sometimes differs. Is it a bug or did I misunderstand something?

Thanks ;)

TCP-Problem.v4p (9.1 kB)

velcrome · October 3, 2015, 5:54pm

Can confirm that TCP (Network Server) behaves unreliably in 33.7 and the latest alpha (both x64) with packages > 2048 bytes; the bigger the package, the worse.

It seems to omit fragments of the package, usually at the beginning.

In addition, I found that when it receives multiple packages during the same frame, it does not output them as a spread of Raw but concatenated, most often faulty and impossible to separate again.

Elias · October 6, 2015, 7:34pm

@tmp
The TCP nodes do not handle anything for you. The data sent will usually be received as chunks of data across several frames. So some logic has to be added on the receiver side. Have a look at the attached patch for an example, and for more complex scenarios have a look at the patches linked by @robotanton at https://discourse.vvvv.org/t/12766. So no, it’s not a bug ;)
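A minimal Python sketch of the receiver-side logic described here, not a translation of the attached patch. It assumes the receiver knows the message size in advance, and it uses `socket.socketpair()` as a local stand-in for a TCP connection (both are plain byte streams). The names `recv_exactly` and `demo` are made up for illustration.

```python
import socket
import threading


def recv_exactly(sock, size):
    """Keep reading until exactly `size` bytes have arrived.

    Returns the data and the size of every chunk that recv() delivered.
    """
    buffer = bytearray()
    chunk_sizes = []
    while len(buffer) < size:
        # recv() may return fewer bytes than requested, so loop.
        chunk = sock.recv(min(65536, size - len(buffer)))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buffer += chunk
        chunk_sizes.append(len(chunk))
    return bytes(buffer), chunk_sizes


def demo(size=80_000):
    # Assumption: a local stream socket pair stands in for TCP here.
    sender, receiver = socket.socketpair()
    payload = bytes(i % 256 for i in range(size))
    # Send from a thread so a full socket buffer cannot block the reader.
    worker = threading.Thread(target=sender.sendall, args=(payload,), daemon=True)
    worker.start()
    received, chunk_sizes = recv_exactly(receiver, size)
    worker.join()
    sender.close()
    receiver.close()
    return payload, received, chunk_sizes
```

Calling `demo()` returns the sent payload, the reassembled payload and the list of chunk sizes. How many chunks arrive depends on the system, but the reassembled bytes should match what was sent.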

@velcrome
The behaviour is indeed different due to internal changes to some queueing parameters. But what you describe (omitted fragments of a package, faulty, impossible to separate etc.) sounds like a new bug report with (hopefully) an attached patch to reproduce it.

TCP-Receiver-Sketch.v4p (78.8 kB)

velcrome · October 6, 2015, 10:24pm

i was under the impression that both segmentation and reordering of tcp segments can be kept completely transparent to the end user, because the tcp stack must be able to reliably state, if the entire multi-segment message has been received successfully?

the other issue is pretty much related to that, I guess. The latter half of one full tcp message (which should never exist imo) and the one right behind it will be concatenated in this unfortunate way.

i am wondering why it fails so often to get the entire package in time, even on gigabit lan and loosely synced 60Hz vvvv instances or as in the patch with plain localhost. might this be an issue of too small a buffer?

would be nice to see a proper redo in vl of this :)

Elias · October 7, 2015, 9:40am

Well yes TCP takes care of reordering and segmentation - the above posted example patch doesn’t have to deal with any of those things. Remember the actual TCP packages which go over the wire are much smaller than a user defined data package of say 80k like in the example patch. But what it does not take care of is how your stream of data should be segmented on an application level. This is your job or that of a library using some kind of protocol. Indeed something which we could and should very much improve so the whole process of receiving data gets as simple as implied in the initial post.

In VL my hope is to build much of the networking stack using the IObservable interface - starting from very low level notifications of byte arrays to high level application defined packages marshaled over to the main thread as late as possible.

catweasel · October 7, 2015, 9:46am

This explains why it broke when I tried using TCP to send data between instances. I was under the same impression! UDP seems to work much better in these cases; is the packet size bigger with UDP?

velcrome · October 8, 2015, 3:42pm

mmh, not satisfied with the replies tagged as solution.

The initial post clearly states a problem when sending a message A, which can be corrupted (i.e. shortened) when received (message being synonymous with a single slice of Raw, as is the input of the nodes in question).

I added that parts of message A will be lost on the wire, and that sometimes two distinct messages A and B will even be concatenated into a corrupted one, which seemed related to the fault discovered by the thread starter.

Elias, on the other hand, only pointed out that when sending messages A, B and C, those might or might not arrive in order and might or might not arrive in the same frame.

He is addressing the Application Level, where the receiving end might have to reorder and queue stuff, while the original report clearly addresses the Network Level, where the end user is expecting to receive Message A the same way she sent it, which is the cornerstone of TCP being called “reliable”.

Please check tmp’s patch again and play around with it a little. It might take a couple bangs till you see the fault.

joreg · October 8, 2015, 4:37pm

no, he said:

and provided a patch that demonstrates exactly that:

everything arrives in order
all network-level-segments are put together again without any loss
only the app may receive different chunks which it has to queue and interpret correctly. and that is what you’d use a simple protocol for (as anton did in the photoshop nodes linked above).

still not?

velcrome · October 8, 2015, 5:46pm

still not.

individual slices of the input in TCP sender will sometimes be corrupted within, when arriving at the output.

no overlaying of app logic can fix this.

TCP is supposed to be reliable, in the sense that a transmission (i.e. segmentation into many small datagrams at the sender, handshaking all datagrams and reorganizing them at the receiver) will yield an exact copy of the individual tcp message at the receiver.

The tcp implementation in vvvv fails to do so sometimes, hence it is NOT reliable and therefore buggy.

this can be reproduced with the patch in first post, you just have to bang a little, maybe increase the message length. But eventually you will see that the length of a message turns out different for sender and receiver, which should never happen, even according to you.

joreg · October 8, 2015, 6:03pm

the original patch only shows what Elias mentioned above:

but that is fine because then he provides a patch that does essentially the same thing (sending large chunks of data) as the original patch and shows how you can build a receiver that receives everything correctly.

so i’d argue that the original patch and the patch provided by elias do exactly the same thing, only the one by elias works. or are you saying it does not work for you?

velcrome · October 8, 2015, 6:07pm

aye. I added a picture of the black swan, falsifying your statement and proving the bug :)
and make no mistake, it is a bug because it is against the specification of TCP as a reliable protocol.

anton’s workaround will circumvent the bug, but only for messages with fixed length. a single message with a different length will break the whole thing.

I guess it boils down to this: the TCP node should only spit out completely transmitted messages and buffer them itself until transmission is completed successfully. Otherwise it should not be called TCP at all.

joreg · October 8, 2015, 6:45pm

again. the image shows what you assume to be a problem, when the patch of elias shows that it is not.

I just found you this fine read:
One of the most common beginner mistakes for people designing protocols for TCP/IP is that they assume that message boundaries are preserved. For example, they assume a single “Send” will result in a single “Receive”.
http://blog.stephencleary.com/2009/04/message-framing.html

where he talks about “length prefixing”, which is what is used in the photoshop-thing linked above.

still?

velcrome · October 8, 2015, 7:58pm

ok, got it. sorry for being so slow :)

you are right and I was wrong, the whole tcp thing is quite counterintuitive.

that is so, because it has all this internal packet based, checksum powered, ack-handshaking, more-data-coming, window framing going on. that mechanism has much more information about the data inside, but in the end, we look at just one dull stream per host. without any mark of a package, and this is indeed done by specification!

this is quite a big thing really, because I assume lots of vvvv users never knew, and just swapped UDP for TCP when packets got too big (admittedly that’s what i tried). since the output is a single stream, everybody using TCP must add some way to split that stream back into the original Raw data slices, otherwise it might be of no use.

http for example has a header containing the total length (Content-Length), and protobuf messages are usually written to a stream with a length prefix
xml and json have the property of being encapsulated by (unescaped) delimiters like <> and {}
or make sure to ONLY send data of a fixed length
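To illustrate the second option, here is a small Python sketch that splits a stream of concatenated JSON values back into messages, relying on the fact that each JSON object is self-delimiting. It uses `json.JSONDecoder.raw_decode` from the standard library. The class name `JsonStreamSplitter` is made up, and the sketch assumes the incoming chunks have already been decoded to text and that every message is a JSON object or array.

```python
import json


class JsonStreamSplitter:
    """Collects text chunks and returns every complete JSON value found."""

    def __init__(self):
        self._buffer = ""
        self._decoder = json.JSONDecoder()

    def feed(self, text):
        self._buffer += text
        objects = []
        while True:
            # raw_decode does not skip leading whitespace itself.
            pending = self._buffer.lstrip()
            if not pending:
                self._buffer = ""
                break
            try:
                obj, end = self._decoder.raw_decode(pending)
            except json.JSONDecodeError:
                # Incomplete so far: keep it and wait for more data.
                # (Assumption: the sender never sends malformed JSON,
                # otherwise this buffer would grow forever.)
                self._buffer = pending
                break
            objects.append(obj)
            self._buffer = pending[end:]
        return objects
```

For example, feeding `'{"a": 1}{"b"'` yields only the first object, and feeding the rest `': [2, 3]}'` afterwards yields the second.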

thanks for the patience to enlighten me

joreg · October 8, 2015, 8:09pm

np. still thinking that the above mentioned
\addonpack\lib\nodes\modules\IO\Photoshop\CollectChunks.v4p
could probably be turned into a general purpose CollectChunks (Raw) module for most tcp-needs. together with a Chunkify (Raw) that just adds 4 bytes of length to the message. shouldn’t that do?
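A rough Python sketch of that pair, to show the idea rather than the actual vvvv modules. `chunkify` and `CollectChunks` here are hypothetical Python analogues, and the big-endian byte order of the 4-byte length is an assumption; sender and receiver just have to agree on it.

```python
import struct

# 4-byte unsigned length header. Big-endian is an assumption.
HEADER = struct.Struct(">I")


def chunkify(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class CollectChunks:
    """Buffers incoming stream chunks and returns only complete messages."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self._buffer += chunk
        messages = []
        # A single chunk may complete zero, one or several messages.
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

Because every message carries its own length, this also works for messages of varying size, and it does not matter whether a message arrives in many small chunks or several messages arrive glued together in one.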

velcrome · November 16, 2015, 4:08pm

hey @joreg, just as a follow up:
https://github.com/velcrome/vvvv-ZeroMQ

ZeroMQ has the nice property of using TCP and still outputting the packages the way you sent them, properly packaged and ordered. Or in the case of the vvvv-ZeroMQ wrapper: with all the proper binsizes at the Receive end.

It’s got shitloads of tried-and-true networking stuff (socket negotiations, handshakes, queueing) under the hood; you’d be hard-pressed to roll your own at even a fraction of the quality.

Did I mention it is async and fast? And that it supports not only tcp but also InProc (some kind of sharedmem for a single vvvv instance) and PGM (for scalable publish/subscribe)? Or that there are about 40+ other languages with ZeroMQ support?

mediadog · November 17, 2015, 12:59am

Thanks @velcrome, that looks wonderful - good to know about in general. And I love How It Began! http://zguide.zeromq.org/page:all#How-It-Began