
About the async-loop pattern in the new TVA.Communication

Developer
Nov 7, 2012 at 6:17 PM

I have noticed that the new version of TVA.Communication has adopted the "async-loop" pattern in all socket components. I don't know whether these changes were inspired by the "enhanced socket components" e-mail I sent to you, because I haven't heard any feedback since then. If they weren't, I can only say that great minds think alike.

However, I also noticed that your implementation is based on "ConcurrentQueue + Interlocked". If you did receive my e-mail and reviewed my source code, you may have noticed that I pointed out a potential race condition in this implementation in the comments.

Although both ConcurrentQueue and Interlocked are individually atomic, there is no guarantee that their combination is atomic.

Imagine a scenario where TcpClient is used to send infrequent but important commands. The commands are not sent very often, but they need to be sent immediately. What happens if the send of an old command completes after a new command is enqueued, and no further commands follow? That new command will remain in the send queue forever and will never be sent out: an infinite delay!

Do you have any thoughts on how to avoid this rare but possible situation? I would much appreciate it if you could share your ideas with me.

Best regards.

Developer
Nov 7, 2012 at 6:31 PM

Following up on my question.

I am afraid that I didn't describe the race condition clearly. The key point is always this: although both ConcurrentQueue and Interlocked are individually atomic, there is no guarantee that their combination is atomic.

Take TcpClient for example. In its "ProcessSend" method:

if (m_sendQueue.TryDequeue(out payload))
    ThreadPool.QueueUserWorkItem(state => SendPayload((TcpClientPayload)state), payload);
else
    // <<< What happens if "SendDataAsync" is invoked right here? Since "m_sending" has not
    // been reset, "SendDataAsync" will not launch a new send operation, and "ProcessSend"
    // has already decided not to launch one based on the old state of the ConcurrentQueue.
    // That new piece of data will remain in the send queue forever if no more data follows.
    Interlocked.Exchange(ref m_sending, 0);
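
To make the interleaving easier to see, here is a minimal, self-contained sketch of the pattern as I understand it. The member names follow the snippet above, but the payload type and the actual socket send are simplified placeholders, so this is only an illustration, not the real TcpClient code.

using System.Collections.Concurrent;
using System.Threading;

// Minimal sketch of the "ConcurrentQueue + Interlocked" async-loop, reduced to the two
// methods involved in the race. Member names follow the snippet above; everything else
// is simplified.
public class AsyncSendLoopSketch
{
    private readonly ConcurrentQueue<byte[]> m_sendQueue = new ConcurrentQueue<byte[]>();
    private int m_sending;  // 0 = no send loop running, 1 = a send loop is running

    public void SendDataAsync(byte[] data)
    {
        m_sendQueue.Enqueue(data);

        // Start the loop only if it is not already running.
        if (Interlocked.CompareExchange(ref m_sending, 1, 0) == 0)
            ProcessSend();
    }

    // Also called when an asynchronous send completes.
    private void ProcessSend()
    {
        byte[] payload;

        if (m_sendQueue.TryDequeue(out payload))
        {
            ThreadPool.QueueUserWorkItem(state => SendPayload((byte[])state), payload);
        }
        else
        {
            // RACE WINDOW: a concurrent SendDataAsync that enqueues right here still sees
            // m_sending == 1 and does not restart the loop, yet this thread has already
            // decided to stop; the newly enqueued item is stranded until the next send.
            Interlocked.Exchange(ref m_sending, 0);
        }
    }

    private void SendPayload(byte[] payload)
    {
        // A real implementation would hand "payload" to the socket here and call
        // ProcessSend() from the send-completion callback; this stand-in just continues.
        ProcessSend();
    }
}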

Coordinator
Nov 7, 2012 at 6:41 PM

Hello cheka,

Rest assured that these changes were, indeed, inspired by the code you sent us. We were planning on sending out a thank you note, but it seems you beat us to it.

Also, we are aware of the potential race condition. I had personally been hoping that usage patterns would permit us to take no immediate action to fix this problem, but I have been considering solutions over the past several days. It seems to me that the simplest and possibly most effective solution would be to lock before checking the interlocked variable and concurrent queue; however, that completely defeats the purpose. My next idea was to implement some sort of retry based on whether we were previously sending from the current thread and whether items are in the queue; however, the idea is not well formed.
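
For what it's worth, one shape that retry might take, reusing the member names from your snippet, is roughly the following. This is only a sketch of the idea, not tested or checked-in code: release the flag first, then re-check the queue and try to take the flag back.

// Sketch of the retry idea: release the sending flag, then re-check the queue; if
// something slipped into the race window and the flag can be re-acquired, go around again.
private void ProcessSend()
{
    byte[] payload;

    while (true)
    {
        if (m_sendQueue.TryDequeue(out payload))
        {
            ThreadPool.QueueUserWorkItem(state => SendPayload((byte[])state), payload);
            return;   // the loop continues from the send-completion callback
        }

        // The queue looked empty, so release the flag first...
        Interlocked.Exchange(ref m_sending, 0);

        // ...then re-check: if nothing was enqueued in the meantime, or another caller has
        // already taken the flag and restarted the loop, it is safe to stop.
        if (m_sendQueue.IsEmpty || Interlocked.CompareExchange(ref m_sending, 1, 0) != 0)
            return;
    }
}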

Also, for your information, we recently implemented a similar pattern in TVA.Collections.ProcessQueue, which simply checks the queue periodically to ensure that nothing is left in it. That solution is, of course, not ideal.

Any insight you can provide would be more than welcome.

Thanks,
Stephen

Developer
Nov 7, 2012 at 7:32 PM
Edited Nov 7, 2012 at 7:34 PM

Hello Stephen,

Thank you very much for the quick reply. I haven't received any e-mail from GPA for a long time; I just checked my spam folder, and there is no e-mail from GPA there either. Weird.

I am glad that you have noticed this race condition. I am also glad that both of us have noticed that ConcurrentQueue's TryDequeue may still return false even after an Enqueue, which will cause, as you said in the log, "send operations on sockets to stop indefinitely."

Back to the topic of the race condition. Your intuition is absolutely correct: "the simplest and possibly most effective solution" is adding a lock. If you look at my source code, that is exactly what it does: it implements this "async-loop" pattern based on "ordinary Queue + lock". Believe me, in my tests the lock-free implementation (ConcurrentQueue + Interlocked) does not behave as well as we imagine; it has no performance advantage over the "Queue + lock" combination. By the way, "SpinLock + Queue" has the worst performance.

In my own code, I have switched entirely to "Queue + lock". Not only is it simple, reasonably fast, and free of the race condition, it also has another advantage: "Queue + lock" allows us to aggregate all of the data already in the queue before sending. This way, it always sends as much as possible, reducing the chances of competing for the lock.
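
Roughly, the idea looks like this (a simplified sketch in the spirit of the approach, not the actual code; the final socket send is only indicated by a comment):

using System.Collections.Generic;

// Simplified sketch of "Queue + lock" with aggregation: under the lock we either mark the
// loop as idle or drain everything currently queued into one block and send it all at once.
public class AggregateSendSketch
{
    private readonly Queue<byte[]> m_sendQueue = new Queue<byte[]>();
    private readonly object m_sendLock = new object();
    private bool m_sending;

    public void SendDataAsync(byte[] data)
    {
        lock (m_sendLock)
        {
            m_sendQueue.Enqueue(data);

            if (m_sending)
                return;        // a send loop is already running; it will pick this item up

            m_sending = true;  // no loop running; this caller starts one
        }

        SendNextBlock();
    }

    // Also called from the send-completion callback.
    private void SendNextBlock()
    {
        byte[][] block;

        lock (m_sendLock)
        {
            if (m_sendQueue.Count == 0)
            {
                m_sending = false;   // nothing left; the next SendDataAsync restarts the loop
                return;
            }

            block = m_sendQueue.ToArray();   // aggregate everything queued so far
            m_sendQueue.Clear();
        }

        // A real implementation would concatenate "block" and hand it to the socket in a
        // single send, calling SendNextBlock() again from the completion callback.
    }
}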

I have already sent you the source code for this "aggregate sending" pattern; hopefully it will be helpful to you.

Also, would you please do me a favor? I want to figure out why I cannot receive e-mail from GPA. Would you please send a test e-mail to both of the addresses listed below:

cheka008@gmail.com

stasimfs@yahoo.com.

Keeping in contact with your team is very important to me, so thank you very much in advance.

Coordinator
Nov 8, 2012 at 6:29 PM

We have just checked in a change using a SpinLock, which we believe should solve the race condition. We also believe that, since the conditions for the race are rare, these locks should rarely ever contend. I'm not sure how you've measured the performance of these implementations, but it seems to me that the "Queue + lock" mechanism holds the lock for much longer than it needs to when dequeuing, so that it can aggregate all the items in the queue. My intuition tells me that this would increase lock contention, because it increases the chance of a context switch while the lock is held. Was your "SpinLock + Queue" implementation aggregating as well?
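
For reference, the shape of the guarded check is roughly as follows. This is only a sketch to illustrate the idea, not the actual checked-in code, and it assumes SendDataAsync takes the same SpinLock around its own enqueue-and-check:

// Sketch of a SpinLock-guarded ProcessSend: the dequeue and the decision to stop the loop
// are now atomic with respect to SendDataAsync, so an enqueue can no longer fall into the
// gap between them. (Not the actual checked-in code.)
private SpinLock m_sendLock = new SpinLock();   // must not be readonly; SpinLock is a struct

private void ProcessSend()
{
    byte[] payload = null;
    bool lockTaken = false;

    try
    {
        m_sendLock.Enter(ref lockTaken);

        if (!m_sendQueue.TryDequeue(out payload))
            m_sending = 0;   // safe without Interlocked here, since both sides hold the SpinLock
    }
    finally
    {
        if (lockTaken)
            m_sendLock.Exit();
    }

    if (payload != null)
        ThreadPool.QueueUserWorkItem(state => SendPayload((byte[])state), payload);
}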

We have not compared the performance of this latest implementation with the previous version or with your "Queue + lock", so we cannot vouch for its performance. Perhaps you would like to weigh in with your thoughts on this change?

Thanks,
Stephen

Developer
Nov 8, 2012 at 8:00 PM

I didn't implement aggregation with "SpinLock + Queue".

It is just a compromise: with aggregation, the lock may be held for a longer time while the queue is being drained. However, after aggregating, since there is a lot of data to send, the sender won't compete for that lock again for a longer period of time.

Different implementations may behave differently under different circumstances. That's the reason I abstracted the sending logic into an interface called "ISendController" and made it configurable, so it can adapt to different communication situations.
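
Its shape is roughly like this (an illustrative sketch only; the actual members of ISendController may differ):

// Illustrative shape only; the real ISendController members may differ.
public interface ISendController
{
    // Queue outgoing data and, if no send loop is currently running, start one.
    void Send(byte[] data);

    // Called by the socket layer when an asynchronous send completes, so the
    // controller can decide whether and how to continue the loop.
    void OnSendCompleted();
}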

I will go back and re-run my tests, and I will let you know as soon as I get some results.

Thanks.

Developer
Nov 20, 2012 at 9:21 PM

I recently tested the performance of the different sending strategies (lock-free, lock, SpinLock, and aggregate) again, and I would like to share my results with you.

My test method is very simple, following the classic "ping-pong" approach: one server and one client. After the client connects to the server, the client first sends a piece of data to the server, and then both the server and the client echo back whatever they receive. A timer is started on both the client and the server side to print out the throughput (the number of bytes received) during the last second.
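
A stripped-down version of that setup might look roughly like this (an illustration of the method only, not the actual test program; the port number and the 8192-byte buffer are arbitrary choices):

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Stripped-down illustration of the ping-pong measurement: each side echoes whatever it
// receives, and a timer prints the number of bytes received during the last second.
class PingPongSketch
{
    static long s_bytesReceived;

    static void Echo(Socket socket, byte[] seed)
    {
        byte[] buffer = new byte[8192];

        if (seed != null)
            socket.Send(seed);   // the client sends the first piece of data

        while (true)
        {
            int read = socket.Receive(buffer);
            if (read == 0)
                break;           // connection closed
            Interlocked.Add(ref s_bytesReceived, read);
            socket.Send(buffer, 0, read, SocketFlags.None);   // echo it back
        }
    }

    static void Main(string[] args)
    {
        // Print the throughput once per second; the using block keeps the timer alive.
        using (new Timer(_ => Console.WriteLine("{0:N0} bytes/s",
            Interlocked.Exchange(ref s_bytesReceived, 0)), null, 1000, 1000))
        {
            if (args.Length > 0 && args[0] == "server")
            {
                TcpListener listener = new TcpListener(IPAddress.Any, 8888);
                listener.Start();
                Echo(listener.AcceptSocket(), null);
            }
            else
            {
                Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
                client.Connect("localhost", 8888);
                Echo(client, new byte[8192]);
            }
        }
    }
}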

During my test, the throughput was as follows:

lock:       106 M/s
lock-free:  101 M/s
aggregate:  101 M/s
spinlock:   98~100 M/s

I must mention that the above is just the result on my desktop. The throughput depends on a lot of factors, including the hardware, the buffer size, and the number of clients. If the same code performs differently on your computer, that is absolutely normal.

Hopefully my test results are helpful to you.

Best regards.