
Thread optimized Ring Buffer
Compared to Queues published in DDJ Jan 2009 by Herb Sutter
Updated: 31-Aug-2009
Dr Dobbs has published several articles on thread safe queues and queue performance:
The following surface plot shows the results of running several Queue and Ring buffer implementations. The buffers are exercised with various data object sizes (1, 10, 100, 200, 500, 1000) and driven by varying number of Producer and Consumer threads. For simplicity, I am plotting the total number of threads (#Producers threads + #Consumer threads). The plot values represent the summation of the byte rate (transaction/second * dataSize) per test. The plot values are then divided by the byte rate of the baseline test object (RingBuffer). The baseline RingBuffer has no mutex locks and only supports a single Producer and single Consumer.
The test was run on a Windows XP32 box (HP8600 with 2 XEON quad core).
RingBufferLock4 is a fixed size ring buffer (100 elements) and stores the data object by value avoiding calls to Malloc. This buffer works well for small data objects.
RingBufferPtrLock4 is similar to RingBufferLock4 but stored the data objects by pointer and manages the memory allocation. This buffer works well for large data objects.
LowLockQueue4 is the thread safe queue published by Herb in DDJ Jan 2009.
The results show that the Ring buffers perform well and have several advantages over a queue:
Here is the code for the base line RingBuffer template:
//
// Simple fixed size ring buffer.
// Manage objects by value.
// Thread safe for single Producer and single Consumer.
//
#pragma once
template
class RingBuffer
{
public:
RingBuffer(size_t size = RingSize)
: m_size(size), m_buffer(new T[size]), m_rIndex(0), m_wIndex(0)
{ assert(size > 1 && m_buffer != NULL); }
~RingBuffer()
{ delete [] m_buffer; };
size_t Next(size_t n) const
{ return (n+1)%m_size; }
bool Empty() const
{ return (m_rIndex == m_wIndex); }
bool Full() const
{ return (Next(m_wIndex) == m_rIndex); }
bool Put(const T& value)
{
if (Full())
return false;
m_buffer[m_wIndex] = value;
m_wIndex = Next(m_wIndex);
return true;
}
bool Get(T& value)
{
if (Empty())
return false;
value = m_buffer[m_rIndex];
m_rIndex = Next(m_rIndex);
return true;
}
private:
T* m_buffer;
size_t m_size;
// volatile is only used to keep compiler from placing values in registers.
// volatile does NOT make the index thread safe.
volatile size_t m_rIndex;
volatile size_t m_wIndex;
};
DDJ_Thread_Queue_v1.zip (Linux and Windows)