Reliable TCP reconnect made easy
When I started working on Syslog, one of the most disturbing texts I came across was Rainer’s observation “On the (un)reliability of plain tcp syslog…”. The problem is that a sendmsg() system call is nearly always successful: it only reports local errors (such as a full send queue), not network errors. So even after the other side has initiated a connection shutdown, one can happily write into the local buffer and will only get an error on the second write.
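To make this behaviour concrete, here is a minimal sketch in Python that reproduces it on the loopback interface. It is not taken from any syslog implementation: the function name count_successful_writes, the test message, and the pause lengths are my own assumptions for illustration. The peer accepts the connection and closes it immediately, and the client then counts how many writes still succeed before the kernel finally reports an error.

```python
import socket
import time


def count_successful_writes(max_writes=5, pause=0.2):
    """Return how many writes succeed after the peer has already closed.

    Hypothetical helper for illustration; the pause values are guesses
    that should be long enough for loopback traffic.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _addr = listener.accept()
    peer.close()       # the "other side" shuts the connection down
    listener.close()
    time.sleep(pause)  # give the FIN time to reach the client

    successful = 0
    try:
        for _attempt in range(max_writes):
            try:
                # This only copies data into the local send buffer.
                client.send(b"<14>test message\n")
            except OSError:
                # Typically EPIPE or ECONNRESET, once the peer's RST arrived.
                break
            successful += 1
            time.sleep(pause)  # let the peer's reply come back
    finally:
        client.close()
    return successful


if __name__ == "__main__":
    print("writes accepted after peer close:", count_successful_writes())
```

On a typical Linux system I would expect at least the first write to be accepted even though the connection is already gone, which means that message is silently lost. The exact count depends on timing and on the platform's TCP stack, so treat the number as an observation, not a guarantee.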