Home


Dennis Lang
lang.dennis @ comcast.net

Thread optimized Ring Buffer
Compared to Queues published in DDJ Jan 2009 by Herb Sutter
Updated: 31-Aug-2009

Dr Dobbs has published several articles on thread safe queues and queue performance:

  • "Lock-Free Queues", by P. Marginean, DDJ July 2008
  • Writing Lock-Free code: A Corrected Queue, by Herb Sutter, DDJ September 2008
  • Maximize Locality, Minimize Contention, by Herb Sutter, May 2008
  • "Lock-Free Code: A False Sense of Security", by Herb Sutter, DDJ October 2008
  • "Measuring Parallel Performance: Optimizing a Concurrent Queue", by Herb Sutter, DDJ Jan 2009
  • The following surface plot shows the results of running several Queue and Ring buffer implementations. The buffers are exercised with various data object sizes (1, 10, 100, 200, 500, 1000) and driven by varying number of Producer and Consumer threads. For simplicity, I am plotting the total number of threads (#Producers threads + #Consumer threads). The plot values represent the summation of the byte rate (transaction/second * dataSize) per test. The plot values are then divided by the byte rate of the baseline test object (RingBuffer). The baseline RingBuffer has no mutex locks and only supports a single Producer and single Consumer.

    The test was run on a Windows XP32 box (HP8600 with 2 XEON quad core).

    RingBufferLock4 is a fixed size ring buffer (100 elements) and stores the data object by value avoiding calls to Malloc. This buffer works well for small data objects.

    RingBufferPtrLock4 is similar to RingBufferLock4 but stored the data objects by pointer and manages the memory allocation. This buffer works well for large data objects.

    LowLockQueue4 is the thread safe queue published by Herb in DDJ Jan 2009.

    The results show that the Ring buffers perform well and have several advantages over a queue:

  • Easy to control memory utilization by setting ring buffer size.
  • Load balancing - prevents Producer from consuming excessive resources if Consumer is slow.
  • Easy to instrument and measure Full and Empty queue states

  • Here is the code for the base line RingBuffer template:

    //
    //  Simple fixed size ring buffer.
    //  Manage objects by value.
    //  Thread safe for single Producer and single Consumer.
    //
    
    
    #pragma once
    
    template 
    class RingBuffer
    {
    public:
        RingBuffer(size_t size = RingSize) 
            : m_size(size), m_buffer(new T[size]), m_rIndex(0), m_wIndex(0) 
            { assert(size > 1 && m_buffer != NULL); }
    
        ~RingBuffer() 
            { delete [] m_buffer; };
    
        size_t Next(size_t n) const 
            { return (n+1)%m_size; }
        bool Empty() const 
            { return (m_rIndex == m_wIndex); }
        bool Full() const
            { return (Next(m_wIndex) == m_rIndex); }
    
        bool Put(const T& value)
        {
            if (Full()) 
                return false;
            m_buffer[m_wIndex] = value;
            m_wIndex = Next(m_wIndex);
            return true;
        }
    
        bool Get(T& value)
        {
            if (Empty())
                return false;
            value = m_buffer[m_rIndex];
            m_rIndex = Next(m_rIndex);
            return true;
        }
    
    private:
        T*              m_buffer;
        size_t          m_size;
    
        // volatile is only used to keep compiler from placing values in registers.
        // volatile does NOT make the index thread safe.
        volatile size_t m_rIndex;   
        volatile size_t m_wIndex;
    };
    
    


  • Download source code(multiple Queues and Ring buffers plus test engine): Dennis Lang's  
      DDJ_Thread_Queue_v1.zip (Linux and Windows)

  • Top
    Home