Overlapped Completion Routines

Written by Evan "Eibro" Brooks
The most up-to-date version of this article can be found at http://users.hfx.eastlink.ca/~ebrooks/articles/. Example code can be downloaded here.

In this article I'll explain how to use OVERLAPPED completion routines for fast transfer of data to and from a host. Completion routines are one of the fastest methods you can use to send and receive data using Winsock. Their speed is overshadowed only by completion ports, which I will not cover here. I'll assume the reader is fluent in C/C++ and at least somewhat familiar with Win32 and Winsock.


The following code is pretty simple. We initialize and terminate the Winsock library. Both WSAStartup and WSACleanup return 0 on success. The #pragma simply tells the linker to link with ws2_32.lib. This is VC++ specific, so if you're using another compiler you're going to have to link with ws2_32.lib manually. Note that I'll be using assert for all error checking throughout my code. Be aware that assert compiles to nothing when NDEBUG is defined (a typical release build), so the calls wrapped in it would not run at all; you'll want to replace it with something more robust in production code.

#include <cassert>
#include <winsock2.h>
// VC++ specific, link with ws2_32.lib or similar if you're using another compiler
#pragma comment( lib, "ws2_32.lib" ) 

SOCKET sock;

int main( int argc, char** argv ) {
        WSADATA wsa;
        assert( WSAStartup( WINSOCK_VERSION, &wsa ) == 0 ); // requested version first, then the WSADATA to fill in

        // Code here

        assert( WSACleanup() == 0 );
        return 0;
}
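As a rough idea of what "something more robust" could look like, here is a minimal sketch of a check that is always evaluated, whether or not NDEBUG is defined. The name Require is my own invention for illustration; it is not part of Winsock or the standard library, and throwing an exception is only one of several reasonable ways to report the failure.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Winsock: unlike assert, the condition passed in
// is always evaluated, even when NDEBUG is defined.
inline void Require( bool ok, const char* what ) {
        if ( !ok )
                throw std::runtime_error( std::string( what ) + " failed" );
}
```

With this in place, a call such as Require( WSAStartup( WINSOCK_VERSION, &wsa ) == 0, "WSAStartup" ); would still run WSAStartup in a release build.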

The next step is to initialize our socket and connect to a remote host. The example program I'm going to build will connect to a website and download index.html. Here we'll use www.google.ca as the remote host. If you've worked with Winsock in the past, the following code should be nothing new to you.

sock = socket( AF_INET, SOCK_STREAM, IPPROTO_TCP );
assert( sock != INVALID_SOCKET );

sockaddr_in addr = {0};
HOSTENT* lpHost = gethostbyname( "www.google.ca" );
assert( lpHost != NULL ); // NULL means the name could not be resolved

// Use the first address returned for the host
addr.sin_addr.S_un.S_addr = *(ULONG*)lpHost->h_addr_list[0];
addr.sin_port = htons( 80 );
addr.sin_family = AF_INET;

assert( connect( sock, (sockaddr*)&addr, sizeof( addr ) ) == 0 );
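The code above resolves the host with gethostbyname, which is what this article uses throughout. As an aside, the same lookup can be done with getaddrinfo, which newer code generally uses in its place. The following is only a sketch under that assumption: ResolveIPv4 is a hypothetical helper name, it returns only the first IPv4 result, and on Windows it must be called after WSAStartup. The non-Windows includes are there only so the sketch can be compiled elsewhere.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#endif
#include <cstring>

// Hypothetical helper: resolve host/port to the first IPv4 TCP address found.
// Returns true and fills *out on success, false if resolution fails.
inline bool ResolveIPv4( const char* host, const char* port, sockaddr_in* out ) {
        addrinfo hints;
        std::memset( &hints, 0, sizeof( hints ) );
        hints.ai_family = AF_INET;       // IPv4 only, to match the sockaddr_in used above
        hints.ai_socktype = SOCK_STREAM;
        hints.ai_protocol = IPPROTO_TCP;

        addrinfo* results = NULL;
        if ( getaddrinfo( host, port, &hints, &results ) != 0 || results == NULL )
                return false;

        // For AF_INET results, ai_addr points at a sockaddr_in
        std::memcpy( out, results->ai_addr, sizeof( *out ) );
        freeaddrinfo( results );
        return true;
}
```

The filled-in sockaddr_in already carries the port in network byte order, so it could be passed to connect as is.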

If all goes well, we're now connected to www.google.ca. Now we need to initialize an OVERLAPPED structure, set up a receive buffer to read data into, and declare our completion routines. A completion routine is simply a function that is invoked when an OVERLAPPED operation completes (in this case, when data has been either sent or received). I'll explain what an OVERLAPPED structure is in a little bit; don't worry about that for now. A completion routine has the following signature: void CALLBACK CompletionRoutine( DWORD, DWORD, LPWSAOVERLAPPED, DWORD ); The four parameters are, in order, the error code, the number of bytes transferred, the OVERLAPPED structure the operation was started with, and the flags. We're going to need to declare two of these: one for send completion, and one for receive completion. Add the following prototypes to your program.

void CALLBACK RecvComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );
void CALLBACK SendComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );

Also, we're going to need a good-sized buffer to receive data into. 1024 bytes should be good for this example. The size of this buffer really depends on what you're expecting to receive from the host. It is important to note that recvBuffer must stay valid (it must not go out of scope or be freed) while an OVERLAPPED operation is pending, which is why it is declared as a global here. With our added prototypes, what we should have up to this point is:

#include <cassert>
#include <winsock2.h>
// VC++ specific, link with ws2_32.lib or similar if you're using another compiler
#pragma comment( lib, "ws2_32.lib" ) 
static const u_long RECV_MAX = 1024;

void CALLBACK RecvComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );
void CALLBACK SendComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );

SOCKET sock;
char recvBuffer[RECV_MAX];

int main( int argc, char** argv ) {
        WSADATA wsa;
        assert( WSAStartup( WINSOCK_VERSION, &wsa ) == 0 ); // requested version first, then the WSADATA to fill in

        sock = socket( AF_INET, SOCK_STREAM, IPPROTO_TCP );
        sockaddr_in addr = {0};
        HOSTENT* lpHost = gethostbyname( "www.google.ca" );
        assert( lpHost != NULL ); // NULL means the name could not be resolved

        addr.sin_addr.S_un.S_addr = *(ULONG*)lpHost->h_addr_list[0];
        addr.sin_port = htons( 80 );
        addr.sin_family = AF_INET;

        assert( connect( sock, (sockaddr*)&addr, sizeof( sockaddr ) ) == 0 );

        assert( WSACleanup() == 0 );
        return 0;
}

Next is the OVERLAPPED structure. The function of the OVERLAPPED structure is to bridge the gap between the initiation of an OVERLAPPED operation and its subsequent completion. The way this works is very similar to how a WNDPROC works when you're writing a message callback. Think of the OVERLAPPED structure like a window handle (hWnd). The OVERLAPPED structure you initially construct and pass along to your WSASend and WSARecv calls will be passed along to the completion routine. This allows you to have multiple OVERLAPPED operations pending on the same completion routine and still be able to tell which operation has just completed. Just as we needed two completion routines, we're going to need two OVERLAPPED structures, one for each pending operation.

// Global variables
OVERLAPPED sendOv;
OVERLAPPED recvOv;

// In main
ZeroMemory( &sendOv, sizeof( OVERLAPPED ) );
ZeroMemory( &recvOv, sizeof( OVERLAPPED ) );
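To make the window-handle analogy a bit more concrete, here is a sketch of one common way to tell operations apart: wrap the OVERLAPPED structure in a larger record and cast the pointer back inside the completion routine. The example program in this article does not need this, since it uses a separate routine for each direction. IoContext, OpKind and ContextFromOverlapped are hypothetical names of my own, and the stand-in struct exists only so the sketch compiles off Windows.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#else
// Stand-in so the sketch also compiles off Windows; the real type comes from winsock2.h
struct WSAOVERLAPPED { void* hEvent; };
#endif

enum OpKind { OP_SEND, OP_RECV };

// Hypothetical per-operation record. The OVERLAPPED member must come first so that
// a pointer to it is also a pointer to the whole record.
struct IoContext {
        WSAOVERLAPPED ov;
        OpKind kind;
};

// Recover the record from the pointer a completion routine receives
inline IoContext* ContextFromOverlapped( WSAOVERLAPPED* lpOverlapped ) {
        return reinterpret_cast<IoContext*>( lpOverlapped );
}
```

You would pass &ctx.ov to WSASend or WSARecv, and a single shared completion routine could then call ContextFromOverlapped( lpOverlapped ) to find out whether a send or a receive has just finished. As with a plain OVERLAPPED, the record must stay alive until the operation completes.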

Now we will write our completion routines. Receive completion is a little more complicated than send completion. When we receive data, we need to stick it into our receive buffer directly after all other data we've already received. This is done by keeping a tally of the n bytes we've received so far, and receiving into recvBuffer + n. Send completion is simple: we tell the remote host we're not sending any more data.

void CALLBACK RecvComplete( DWORD dwError, DWORD dwTransferred, LPWSAOVERLAPPED lpOverlapped, DWORD dwFlags ) {

        assert( dwError == 0 );

        // Keep a tally of how many bytes we've received overall
        static DWORD dwTotalTransferred;        

        if ( dwTransferred == 0 ) {
                // If we received 0 bytes, the remote side has closed the connection
                bQuit = true; // bQuit is a global bool, declared in the complete program below
                return;
        }
        
        dwTotalTransferred += dwTransferred;
        // WSARecv here ...
        
}


void CALLBACK SendComplete( DWORD dwError, DWORD dwTransferred, LPWSAOVERLAPPED lpOverlapped, DWORD dwFlags ) {

        assert( dwError == 0 );
        shutdown( sock, SD_SEND );
}
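The "receive into recvBuffer + n" bookkeeping is easy to get wrong at the edges, so here is a small sketch of just the arithmetic, kept free of any Winsock calls. RecvWindow and NextRecvWindow are hypothetical names. The sketch assumes you want to keep one byte free for a terminating '\0' (handy if the buffer is later treated as a C string), and that a length of zero means the buffer is full and no further receive should be issued.

```cpp
#include <cstddef>

// Where the next receive should land: an offset into the buffer and a maximum length
struct RecvWindow {
        std::size_t offset;
        std::size_t length;
};

// capacity: total size of the receive buffer
// total:    number of bytes already stored in it
inline RecvWindow NextRecvWindow( std::size_t capacity, std::size_t total ) {
        // Keep the last byte free for a terminating '\0'
        const std::size_t usable = capacity > 0 ? capacity - 1 : 0;
        RecvWindow w;
        if ( total >= usable ) {
                // Buffer is full: nothing more can be received
                w.offset = usable;
                w.length = 0;
        } else {
                w.offset = total;
                w.length = usable - total;
        }
        return w;
}
```

With the 1024-byte buffer used here, NextRecvWindow( 1024, 0 ) gives offset 0 and length 1023, and after 1000 bytes have arrived it gives offset 1000 and length 23.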

Alright. Now we need to actually initiate sending and receiving data. This is done with WSASend/WSARecv. These functions take similar parameters: a socket, buffers, an OVERLAPPED structure and a completion routine. They return 0 if the operation completed immediately, or SOCKET_ERROR otherwise. It is important to note that if the function returns SOCKET_ERROR and WSAGetLastError() returns WSA_IO_PENDING, no error has occurred. It simply means the OVERLAPPED operation is pending and will be completed at a later time. Like any buffers passed along to WSASend/WSARecv, the OVERLAPPED structure must not go out of scope while an operation is pending. To initiate receiving data, we call WSARecv:

DWORD dwBytes, dwFlags = 0;
WSABUF recvBuf;
recvBuf.buf = recvBuffer;
recvBuf.len = RECV_MAX - 1; // leave the last byte as '\0' so the buffer stays a valid C string

assert( WSARecv( sock, &recvBuf, 1, &dwBytes, &dwFlags, &recvOv, RecvComplete ) == 0 || WSAGetLastError() == WSA_IO_PENDING );

The parameters dwBytes and dwFlags can be safely ignored here. dwBytes receives the number of bytes received, and dwFlags the operation flags, if the operation completes immediately (that is, WSARecv returns 0). Note that dwFlags is also an input, so it must be initialized (here to 0) before the call. I've never found a use for being notified in this manner, as the completion routine is queued to run anyway. I suppose you could optimize by handling the data right away when you receive a return value of 0, but the queued completion routine will still run, so you would have to guard against processing the same data twice. Sending is initiated in a similar manner.

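DWORD dwBytes;
// WSABUF::buf is a non-const char*, so a string literal cannot be assigned to it in
// standard C++. The buffer must also outlive the pending operation, hence the static array.
// HTTP/1.1 requires a Host header; "Connection: close" asks the server to close when done.
static char szRequest[] = "GET /index.html HTTP/1.1\r\nHost: www.google.ca\r\nConnection: close\r\n\r\n"; // HTTP GET request, see HTTP RFC
WSABUF sendBuf;
sendBuf.buf = szRequest;
sendBuf.len = (ULONG)strlen( szRequest );

// Note: the send uses its own OVERLAPPED structure (sendOv), not recvOv
assert( WSASend( sock, &sendBuf, 1, &dwBytes, 0, &sendOv, SendComplete ) == 0 || WSAGetLastError() == WSA_IO_PENDING );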
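Both calls above fold the "returned 0, or returned SOCKET_ERROR with WSA_IO_PENDING" test into a single assert. If you would rather name the three possible outcomes explicitly, something along these lines may help. This is only a sketch: IoStart and ClassifyIoStart are hypothetical names, and the two constants defined in the #else branch are stand-ins (with the same values as in the Windows headers) so that the sketch also compiles off Windows.

```cpp
#ifdef _WIN32
#include <winsock2.h>
#else
// Stand-ins so the sketch compiles off Windows; on Windows these come from winsock2.h
const int SOCKET_ERROR = -1;
const int WSA_IO_PENDING = 997;
#endif

enum IoStart {
        IO_COMPLETED_NOW, // call returned 0; the completion routine is still queued
        IO_PENDING,       // operation is in flight and will complete later
        IO_FAILED         // a real error
};

// rc:        the value returned by WSASend or WSARecv
// lastError: the value of WSAGetLastError() read right after the call
inline IoStart ClassifyIoStart( int rc, int lastError ) {
        if ( rc == 0 )
                return IO_COMPLETED_NOW;
        if ( rc == SOCKET_ERROR && lastError == WSA_IO_PENDING )
                return IO_PENDING;
        return IO_FAILED;
}
```

Only IO_FAILED needs error handling; in the other two cases the completion routine will be called once the thread enters an alertable wait.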

That's just about it! We now need to write our application loop, and we're finished. At some point in our loop, we must put the thread into an alertable wait state. More or less, what this means is that our thread is sleeping, but can be resumed at any time. When our thread is in an alertable wait state, the operating system will have a chance to call our completion routines. If you're interested in how this works, see the QueueUserAPC() API.

while ( !bQuit ) 
        SleepEx( INFINITE, TRUE ); // Sleep until the OS wakes us (calls an APC)

Simple, eh? If you're trying to incorporate completion routines into a GUI application, this is slightly more complicated. It usually involves calling MsgWaitForMultipleObjectsEx() or similar to put the thread into an alertable wait state. For completeness, I'll paste the complete program below.

#include <cassert>
#include <cstring> // strlen
#include <winsock2.h>
// VC++ specific, link with ws2_32.lib or similar if you're using another compiler
#pragma comment( lib, "ws2_32.lib" ) 
static const u_long RECV_MAX = 1024;

void CALLBACK RecvComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );
void CALLBACK SendComplete( DWORD, DWORD, LPWSAOVERLAPPED, DWORD );

bool bQuit = false;
SOCKET sock = INVALID_SOCKET;
char recvBuffer[RECV_MAX];
OVERLAPPED sendOv;
OVERLAPPED recvOv;

int main( int argc, char** argv ) {
        WSADATA wsa;
        assert( WSAStartup( WINSOCK_VERSION, &wsa ) == 0 ); // requested version first, then the WSADATA to fill in

        // Initialize our socket
        sock = socket( AF_INET, SOCK_STREAM, IPPROTO_TCP );

        // Resolve host and connect to it
        sockaddr_in addr = { 0 };
        HOSTENT* lpHost = gethostbyname( "www.google.ca" );
        assert( lpHost != NULL ); // NULL means the name could not be resolved

        addr.sin_addr.S_un.S_addr = *(ULONG*)lpHost->h_addr_list[0];
        addr.sin_port = htons( 80 );
        addr.sin_family = AF_INET;

        // Initialize OVERLAPPED structures
        ZeroMemory( &sendOv, sizeof( OVERLAPPED ) );
        ZeroMemory( &recvOv, sizeof( OVERLAPPED ) );

        assert( connect( sock, (sockaddr*)&addr, sizeof( sockaddr ) ) == 0 );

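        // Send HTTP request
        DWORD dwBytes, dwFlags = 0;
        // Writable array (WSABUF::buf is a non-const char*); HTTP/1.1 requires a Host header
        char szRequest[] = "GET /index.html HTTP/1.1\r\nHost: www.google.ca\r\nConnection: close\r\n\r\n";
        WSABUF buf;
        buf.buf = szRequest;
        buf.len = (ULONG)strlen( szRequest );

        // The send gets its own OVERLAPPED structure; sharing recvOv with the receive would be a bug
        assert( WSASend( sock, &buf, 1, &dwBytes, dwFlags, &sendOv, SendComplete ) == 0 || WSAGetLastError() == WSA_IO_PENDING );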


        // Start receiving response
        buf.buf = recvBuffer;
        buf.len = RECV_MAX - 1; // leave the last byte as '\0' so the buffer stays a valid C string
        
        assert( WSARecv( sock, &buf, 1, &dwBytes, &dwFlags, &recvOv, RecvComplete ) == 0 || WSAGetLastError() == WSA_IO_PENDING );
                
        while ( !bQuit ) 
                SleepEx( INFINITE, TRUE ); // Sleep until the OS wakes us (calls an APC)

        // Display a message box with the data received as its text
        MessageBoxA( NULL, recvBuffer, "Response", MB_ICONINFORMATION );

        closesocket( sock );
        assert( WSACleanup() == 0 );
        return 0;
}

void CALLBACK RecvComplete( DWORD dwError, DWORD dwTransferred, LPWSAOVERLAPPED lpOverlapped, DWORD dwFlags ) {

        assert( dwError == 0 );

        // Keep a tally of how many bytes we've received overall
        static DWORD dwTotalTransferred;        

        if ( dwTransferred == 0 ) {
                // If we received 0 bytes, the remote side has closed the connection
                bQuit = true;
                return;
        }
        
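        // Continue receiving data, directly after what we already have
        dwTotalTransferred += dwTransferred;
        if ( dwTotalTransferred >= RECV_MAX - 1 ) {
                // Buffer is full (one byte is kept for the terminating '\0'), stop here
                bQuit = true;
                return;
        }
        WSABUF buf;
        buf.buf = &recvBuffer[dwTotalTransferred];
        buf.len = RECV_MAX - 1 - dwTotalTransferred;
        dwFlags = 0; // dwFlags is an in/out parameter, reset it before reusing it
        assert( WSARecv( sock, &buf, 1, &dwTransferred, &dwFlags, &recvOv, RecvComplete ) == 0 || WSAGetLastError() == WSA_IO_PENDING );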
        
}

void CALLBACK SendComplete( DWORD dwError, DWORD dwTransferred, LPWSAOVERLAPPED lpOverlapped, DWORD dwFlags ) {

        assert( dwError == 0 );
        shutdown( sock, SD_SEND );
}