Template string_t<> with STL interface and generic implementation

While the STL's std::basic_string is a fine example of flexibility, letting you specify both the memory allocation and the character traits, it does not let you change the specifics of its implementation. This is acceptable in most cases, yet a fixed implementation also means that std::basic_string is not as generic as it could be.

Here is a notorious example: the Windows API string type BSTR and its wrappers (CComBSTR and _bstr_t) are implemented completely differently from std::basic_string. A BSTR carries its length in a prefix, while std::basic_string (in the Visual C++ implementation) keeps a reference counter in the prefix. Neither the ATL nor the STL-for-Windows developers thought of each other (or, as a matter of fact, of the rest of us). These strings just don't connect. One has to either copy the BSTR into a std::wstring or run standard algorithms on BSTR iterators (well, OLECHAR pointers) to get the advantages of the standard library. I don't want to discuss Windows here, but the STL clearly fails to be generic enough to cover the BSTR design, whether that design is good or bad.
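To make the two options concrete, here is a small portable sketch. It assumes a plain wchar_t buffer plus an explicit length as a stand-in for a BSTR and for the value SysStringLen would report, so it compiles without the Windows headers; the helper names copy_to_wstring and count_char are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Option 1: copy the characters into a std::wstring and work on the copy.
// 'p' and 'len' stand in for a BSTR and its stored length (an assumption).
inline std::wstring copy_to_wstring(const wchar_t* p, std::size_t len)
{
    assert(p != 0 || len == 0);
    return std::wstring(p, len);
}

// Option 2: treat the raw pointers as iterators and run a standard
// algorithm directly on the buffer, without copying it first.
inline std::size_t count_char(const wchar_t* p, std::size_t len, wchar_t c)
{
    assert(p != 0 || len == 0);
    return static_cast<std::size_t>(std::count(p, p + len, c));
}
```

The second option avoids the copy, but everything that std::wstring offers beyond plain iteration (find, replace, and so on) is out of reach, which is the gap string_t is meant to close.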

The file string_t.h provides the string_t template, which has the interface of std::basic_string yet makes it easy to change the specifics of the string's implementation:

template<
            typename E,
            class Traits = std::char_traits<E>,
            class Impl = BSTR_Impl<E, SysString_allocator<E> > 
        >
   class string_t;
The first two template arguments are the same as the STL ones: the string's character type and the character traits. All of the string's internal implementation specifics (class members, memory allocation, string access, iterators, and so on) are moved out into the template argument Impl. The Impl class also carries the allocator (its own second template argument), for the sake of compatibility with the STL.

What exactly does Impl<> have to implement? Not too much. string_t<> needs the following methods:

template<typename E, class Allocator>
class Impl
{
protected:   // Since string_t is publicly derived from Impl all methods are protected 

    typedef typename Allocator::size_type size_type;
    typedef Allocator allocator_type;
         
    //
    // iterator and const_iterator. In simple case it might be enough:
    //       typedef E* iterator; 
    //       typedef const E* const_iterator; 
    // 
    class iterator;
    class const_iterator;
    
    //
    // constructors/destructor. The destructor is not virtual for the sake of STL compatibility, but you may 
    // decide to make it virtual to be able to override it and thus safely derive from string_t
    //
    Impl(const Impl& s);
    Impl(const allocator_type&);
    Impl(const E* s, size_type len, const allocator_type&);
    Impl(size_type len, E c, const allocator_type&);
    ~Impl();		
    
    //
    // iterators, accessors, size-resize, and swap
    //
    iterator        begin();
    iterator        end();
    const_iterator  begin()     const;
    const_iterator  end()       const;

    size_type       size()      const;
    size_type       max_size()  const;
    size_type       capacity()  const;

    const E*        c_str()     const;
    const E*        data()      const;		// it might differ from c_str()
    
    void            swap(Impl& rhs);
    void            reserve(size_type resSize);
    void            resize(size_type resSize, E c);
    
    //
    // allocator access
    //
    allocator_type  get_allocator() const;
};
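As an illustration of how small such an Impl can be, here is a minimal sketch that is not taken from string_t.h. It assumes a std::vector as storage and implements only part of the list above; vector_Impl and mini_string are hypothetical names, and mini_string merely imitates the way string_t builds its interface on top of Impl.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical Impl: keeps the characters in a std::vector together with a
// terminating zero, so c_str() is always valid.
template<typename E, class Allocator = std::allocator<E> >
class vector_Impl
{
    std::vector<E, Allocator> m_buf;    // characters followed by E()

protected:
    typedef std::size_t size_type;
    typedef Allocator   allocator_type;
    typedef E*          iterator;
    typedef const E*    const_iterator;

    explicit vector_Impl(const allocator_type& a)
        : m_buf(size_type(1), E(), a) {}
    vector_Impl(const E* s, size_type len, const allocator_type& a)
        : m_buf(s, s + len, a)
    {
        assert(s != 0 || len == 0);
        m_buf.push_back(E());
    }
    ~vector_Impl() {}

    iterator        begin()         { return &m_buf[0]; }
    iterator        end()           { return &m_buf[0] + size(); }
    const_iterator  begin() const   { return &m_buf[0]; }
    const_iterator  end()   const   { return &m_buf[0] + size(); }

    size_type       size()  const   { return m_buf.size() - 1; }
    const E*        c_str() const   { return &m_buf[0]; }

    void resize(size_type n, E c)
    {
        m_buf.pop_back();       // drop the terminator
        m_buf.resize(n, c);     // shrink, or grow filling with c
        m_buf.push_back(E());   // put the terminator back
    }
    void swap(vector_Impl& rhs)     { m_buf.swap(rhs.m_buf); }
};

// Hypothetical facade in the spirit of string_t: it derives publicly from
// Impl and never touches the storage directly.
template<typename E, class Impl = vector_Impl<E> >
class mini_string : public Impl
{
public:
    typedef typename Impl::size_type      size_type;
    typedef typename Impl::allocator_type allocator_type;

    mini_string(const E* s, size_type len) : Impl(s, len, allocator_type()) {}

    // Publish only the parts of Impl that users should see.
    using Impl::begin;
    using Impl::end;
    using Impl::size;
    using Impl::c_str;

    bool empty() const { return this->size() == 0; }

    mini_string& append(size_type n, E c)
    {
        this->resize(this->size() + n, c);
        return *this;
    }
};
```

Swapping vector_Impl for another class with the same protected members changes the storage without touching mini_string, which is the point of moving the specifics into Impl.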
I feel I owe you an explanation of why I made the Impl template a public base class of string_t. Normally an Impl (and its counterpart, Pimpl) should never be a public base class; it is good style to hide the Impl by making it either a private base class or a private member. Personally, I prefer using the Impl as a private member of a class. In this case though, after much hesitation, I publicly derived string_t from Impl for the following reasons:
  1. Impl and string_t are tightly coupled.
  2. They are not part of any library, just my own code. (As a matter of fact, I myself prefer using the STL basic_string as much as possible because it is well tuned, portable, scales well, and works seamlessly with other parts of the STL, e.g. streams; in a word, because it is, well, a standard.)
  3. Most importantly, for the sake of the extensibility of string_t. Extensibility was the main reason for the public derivation; it also makes it possible to derive from string_t by making the ~Impl destructor virtual.

The one drawback of making Impl a public base class is that you have to be a little extra careful with your Impl: protect all the aforementioned methods and leave public only what you really want to expose.

The following two items are examples of Impl for string_t:

  1. A re-implementation of ATL's CComBSTR; public derivation allowed me to extend string_t with the member functions one is used to in CComBSTR (e.g. Attach, Detach, and Empty).
  2. A std::basic_string wrapper as a skeleton for extending the STL string.
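The way public derivation surfaces extra members can be sketched as follows. This is only an illustration under stated assumptions: owning_Impl and mini_bstring are hypothetical names, and the buffer is allocated with new[] so the code stays portable, whereas a real BSTR implementation would go through the ::SysXXXString functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical Impl that owns a zero-terminated wchar_t buffer created with
// new[]. Copying is left out of the sketch (declared private, not defined).
class owning_Impl
{
    owning_Impl(const owning_Impl&);
    owning_Impl& operator=(const owning_Impl&);

protected:
    // The part of the contract the facade relies on stays protected.
    owning_Impl() : m_str(0) {}
    ~owning_Impl() { delete[] m_str; }
    std::size_t size() const { return m_str ? std::wcslen(m_str) : 0; }

public:
    // The extras are public, so they become members of the derived string.
    wchar_t* m_str;

    // Takes ownership of p; the previously owned buffer is released.
    void Attach(wchar_t* p)
    {
        assert(p == 0 || p != m_str);
        delete[] m_str;
        m_str = p;
    }
    // Gives up ownership; the caller must delete[] the returned buffer.
    wchar_t* Detach()
    {
        wchar_t* p = m_str;
        m_str = 0;
        return p;
    }
    // Releases the owned buffer.
    void Empty()
    {
        delete[] m_str;
        m_str = 0;
    }
};

// Hypothetical facade: publishes size() and inherits Attach, Detach, Empty
// and m_str unchanged through the public base.
class mini_bstring : public owning_Impl
{
public:
    mini_bstring() {}
    using owning_Impl::size;
};
```

With a private base or a private member, each of these extras would need a forwarding function in the facade; with the public base they come for free, at the cost of having to guard everything else with protected.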

Example of using string_t<> to wrap Windows BSTR implementation

As an example, two templates, BSTR_Impl<> and the SysString_allocator<> allocator, make use of the Windows API ::SysXXXString functions for memory allocation and manipulation. Along with string_t<>, these templates let you wrap the Windows API BSTR string in STL fashion (see string_t.h for the exact code):

template<typename E=OLECHAR, class Allocator = SysString_allocator<E> >
     class BSTR_Impl;
//
// ComBSTRing definition
//
typedef string_t<
                     OLECHAR, 
                     std::char_traits<OLECHAR>, 
                     BSTR_Impl<OLECHAR, SysString_allocator<OLECHAR> > 
		> 
    ComBSTRing;
ComBSTRing combines the functionality of std::basic_string and CComBSTR. Just like CComBSTR, it has the familiar m_str, Attach, Detach, and Empty. Unlike CComBSTR, however, the class does not overload BSTR* operator&(), and in this respect it behaves exactly like std::basic_string; that is, you don't need any adapters (e.g. CAdapt<>) just to put a ComBSTRing into an STL container. You can still use m_str for direct access to the underlying BSTR.
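Why the overloaded operator& matters can be shown with a small portable sketch. CAdapt exists to wrap classes whose operator& returns something other than the address of the object; the two holder types and the address_is_plain trait below are hypothetical stand-ins written for this illustration (C++11 is assumed for decltype and std::declval), not code from ATL or string_t.h.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical wrapper that, like CComBSTR, overloads unary operator&
// to hand out the address of the wrapped pointer.
struct raw_holder
{
    wchar_t* m_str;
    raw_holder() : m_str(0) {}
    wchar_t** operator&() { return &m_str; }
};

// Hypothetical wrapper without the overload, the choice the article
// describes for ComBSTRing.
struct plain_holder
{
    wchar_t* m_str;
    plain_holder() : m_str(0) {}
};

// True when '&obj' has the ordinary type T*, that is, when operator& is not
// overloaded to return something else.
template<class T>
struct address_is_plain
    : std::is_same<decltype(&std::declval<T&>()), T*> {};

// Elements of this type can be stored and addressed without an adapter.
inline std::size_t store_two()
{
    std::vector<plain_holder> v(2);
    assert(&v[0] + 1 == &v[1]);     // plain '&' yields real element addresses
    return v.size();
}
```

Any code that writes &element and expects a pointer to the element gets a wchar_t** from raw_holder instead, which is the situation an adapter has to paper over.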
// 
// Example of using ComBSTRing instead of CComBSTR. Replaces the substring "\Debug." with "\Release." in a file name
// 
STDMETHODIMP CSomeClass::get_ReleaseName(/* [in, out] */ BSTR* file)
{
    if (file == NULL)
        return E_POINTER;

    ComBSTRing bstrFile; 		// CComBSTR bstrFile;
    bstrFile.Attach(*file);
    static const ComBSTRing bstrToReplace(L"\\Debug.");
    // rfind locates the whole substring; find_last_of would match any single character of it
    ComBSTRing::size_type pos = bstrFile.rfind(bstrToReplace);
    if (pos == ComBSTRing::npos)
        bstrFile.assign(L"C:\\Temp\\Release.xml");
    else
        bstrFile.replace(pos, bstrToReplace.size(), L"\\Release.");

    *file = bstrFile.Detach();
    return S_OK;
}
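The same search-and-replace logic can be tried without COM on a plain std::wstring. The sketch below uses rfind, which looks for the whole substring, whereas find_last_of matches any single character of its argument; to_release_name is a hypothetical helper name introduced for this illustration.

```cpp
#include <cassert>
#include <string>

// Replaces the last "\Debug." in a file name with "\Release.".
// Falls back to a fixed name when the marker is absent, as in the
// example above.
inline std::wstring to_release_name(std::wstring file)
{
    static const std::wstring marker(L"\\Debug.");
    const std::wstring::size_type pos = file.rfind(marker);
    if (pos == std::wstring::npos)
    {
        file.assign(L"C:\\Temp\\Release.xml");
    }
    else
    {
        assert(pos + marker.size() <= file.size());   // sanity check on the match
        file.replace(pos, marker.size(), L"\\Release.");
    }
    return file;
}
```

Because string_t keeps the std::basic_string interface, the body of this function should carry over to a ComBSTRing with only the type names changed.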

Reusing std::basic_string

The following is an Impl for string_t<> that wraps std::basic_string. It may look like a little joke: making a string with the STL interface out of the STL string. But besides testing (that is how I tested string_t), it can also be useful, because std::basic_string is (obviously) a standard, has already been meticulously tested (including in multithreaded environments), and implements copy-on-write through reference counting (at least in the Visual C++ implementation of the string). You can add more methods and members to it to make your very own class with the STL string's interface and functionality embedded in it.
#include <string>

template<class E, class Traits=std::char_traits<E>, class A = std::allocator<E> >
class std_string_Impl
{
    typedef std::basic_string<E, Traits, A>  Impl_t;
    std_string_Impl& operator=(const std_string_Impl&);   // declared private, not implemented
    Impl_t  m_str;                                        // the wrapped STL string
    
protected:
    typedef typename Impl_t::size_type        size_type;
    typedef typename Impl_t::iterator         iterator;        // 'typename' is required for dependent types
    typedef typename Impl_t::const_iterator   const_iterator;
    typedef A                                 allocator_type;
    
    std_string_Impl(const std_string_Impl& s)             : m_str(s.m_str) {}
    std_string_Impl(const E* s, size_type len, const A& a): m_str(s, len, a) {}
    std_string_Impl(size_type len, E c, const A& a)       : m_str(len, c, a) {}
    std_string_Impl(const A& a)                           : m_str(a) {}
    /* virtual */ ~std_string_Impl() {}    // you may want to make it virtual to be able to derive from the class
    
    iterator        begin()                         { return m_str.begin(); }
    const_iterator  begin()     const               { return m_str.begin(); }
    iterator        end()                           { return m_str.end(); }
    const_iterator  end()       const               { return m_str.end(); }

    size_type       size()      const               { return m_str.size(); }
    size_type       max_size()  const               { return m_str.max_size(); }
    size_type       capacity()  const               { return m_str.capacity(); }

    const E*        c_str()     const               { return m_str.c_str(); }
    const E*        data()      const               { return m_str.data(); }

    void            swap(std_string_Impl& rhs)      { m_str.swap(rhs.m_str); }

    void            reserve(size_type resSize)      { m_str.reserve(resSize); }
    void            resize(size_type resSize, E c)  { m_str.resize(resSize, c); }
    allocator_type  get_allocator() const           { return m_str.get_allocator(); }
};

//
// typedef for std_string and std_wstring
//
typedef string_t<char, std::char_traits<char>, std_string_Impl<char, std::char_traits<char> > > 
     std_string;

typedef string_t<wchar_t, std::char_traits<wchar_t>, std_string_Impl<wchar_t, std::char_traits<wchar_t> > > 
     std_wstring;

The file string_t.cpp is a simple console test for ComBSTRing.


About me: http://GeorgeSalnikov.home.comcast.net
Downloads:
  1. string_t.h
  2. string_t.cpp console test for the ComBSTRing.