Policy Based Design Using C++ Typemaps
Policies provide effective method of customizing class behavior. The standard way of providing policies is simply passing them to the class during instantiation or using typedef constructions. Both methods require knowledge about each policy position in template parameter list. Besides, if the required policy is not the first on the template parameter list, then the programmer is required to supply all the previous parameters. Moreover, the designer of class can't easily modify policy list without modifying the code where the class is used.
The proposed is method of using typemaps for providing policies. Typemaps are built using C++ meta-programming techniques. Each element of typemap contains the key and value parts. The key contains the policy category, and the value represents the policy itself. The programmer has to build the policy meta-container and pass it to every used class. The programmer, normally, would take the default policy list and replace only the necessary ones. Then the class accepts the whole list and chooses only the needed policies. Typemaps make policy based design easier and more convenient.
Although the idea of policies has existed long ago (STL is an example of policy based design for some degree), it is just recently become essential programming technique. As compilers become more standards compliant, policies become ample part of any class design. I would refer to [Alexandrescu] for discussion about policies. Here I will just walk through the policies mainly for introducing the sample code and demonstrating the benefits of proposed methods.
Suppose we are designing some ImageFile class. The class reads from a file, decodes the picture, and provides some convenient way for accessing its pixels. There is a place for several policies for such class.
The essential policy of almost any class is ErrorPolicy. Error policy can define whether ImageFile should throw an exception or just return a humble false. The programmer might want ImageFile to keep the text describing the error or just an error code or provide some custom error handler, which will log the error, send e-mail, connect through pipe, etc.
ThreadingPolicy is also important in today's multithreaded world. There is no reason to lock/unlock some mutex, if the class is used only from a single thread even in a multithreaded program. From the other side it is very convenient to use the same class from the different threads and not to worry about coordinating the access to that class manually. Once you start making mutexes around somebody's classes, then get ready to many nasty things like deadlocks, thread leeks, etc. The best solution is to tell the class what you think about the threading model and let it do the whole job.
Among other policies would be MemoryAllocatorPolicy, just like STL's allocator. StringPolicy could also help, to choose whether the class should use std::string or something custom made. On the edge of 32-64 bit computing LargeFilePolicy could be handy.
ImageFile class should be able to support many formats and have possibility for adding or removing them. This also can be done through the proposed policy mechanism. In this case the policy itself is a container of all supported image formats - ImageFormatContainer.
ImageFile class reads the image from file, but nobody says that it should read only from a local file. It is very convenient to be able to read the data from various ftp, http, and other *tp servers. FileMediaContainer could be the policy containing all the supported data sources.
The definition of ImageFile class would look like this.
template < class ErrorPT = DefErrorP, class ThreadPT = DefThreadP, class AllocPT = DefAllocP, class LargePT = DefLargeP, class FormatCT = DefFormatC, class SourceCT = DefSourceC > class ImageFile { ... } ;
In the simplest case with all default parameters ImageFile class can be used like this:
ImageFile<> nice_file ;
If you want to change the error policy, then just write
ImageFile<MyErrorHandler> ok_file ;
The first inconvenience comes when you need to change a policy, which is not the first on the template parameter list. In this case you have to provide the default values for all the previous arguments.
ImageFile < DefErrorP, DefThreadP, DefLargeP, MyMemoryAllocator > ugly_file ;
It is often suggested to use the following construction:
typedef ImageFile < DefaultErrorHandler, DefaultThreadingModel, MyMemoryAllocator > ugly_file_type ; ugly_file_type better_file ;
In this case you have to type an extra line of code for typedef. Usually the program contains many classes, so you have to typedef every class. Another inconvenience with typedefs is that you usually don't know how much deep into blocks you want to bury them. If typedefs are too deep in your nested blocks, then you have to typedef too often for the same class for different blocks. If you make them too visible, then you won't be able to use nice short names, since there will be several string_type's, value_type's, and connection_type's all around your code.
Inconvenience of providing policies is just a fraction of the bigger problem. If you will decide to add, remove or replace some policy, then you will have to change each use of this class in each application. If the project is big enough, then that kind of modifications become simply impossible. I don't even mention the case when the class is popular and widely used.
C++ Meta-programming
Imagine those days of machine level programming, when you had to give the physical address of every used variable. Adding, removing variable, or changing its size required substantial code rewriting. Similarly C++ meta-programming seems in that early stage. Many things have to be done manually, the code becomes very verbose, similar code has to be written several times, little change of design requires rewriting of large amount of code, etc. I would expect the future languages to move towards extending meta-programming capabilities.
Think about programming language that supports only switch statement. The program code would consist of selection statements and recursive calls. Imagine how awkward those programs would look like. C++ meta-programming code looks not much less awkward.
The equivalent of switch statement in meta-program is partial specialization. The recursive type definition serves as equivalent of recursive calls in procedural language. These two constructions are enough to implement virtually any logic, but of course at high cost.
In purpose to develop a better policy based designed, I'm going to describe an associative container - typemap. Look at typemap as meta-std::map, where the key and the value parts refer to C++ types. There are several designs already available, for example [Boost] MPL library. You can use mpl::map for the same purposes as the proposed typemap. Still sometimes it is useful to reimplement the existing design to adjust it better for the particular use. Typemap may be less generic, but designed as one module, located in one file. The shown implementation does not use C macro preprocessor at all. Reimplementation may take less time, than digging somebody else's meta-code, if something goes wrong. In many cases the ability to use meta-programming techniques is more important than actually having ready meta-libraries. Just like years ago, before C/C++ libraries become standardized, every programmer had own libraries of primitive routines.
In C++ sense typemap is not a class, but since we are involved in meta-programming, let's name it meta-class or classoid. We can emulate class functionality by using namespace constructs of C++ language. First we define namespace typemap, which acts as actual class. Then each class defined in that namespace is regarded as member function and each typedef acts as member variable. We move all private members into typemap_private namespace. Of coarse typemap_private is not private (in C++ sense) at all. The programmer can still access these members intentionally, but not accidentally. Actually, by this way we can simulate even better functionality of member access mechanism, than exists in modern C++. Instead of revoking the access to private members, we simply make them invisible.
namespace typemap { ... // public members namespace typemap_private { ... // private members } }
As for any associative container every element of typemap contains a key and value parts. The common way of designing meta-container is to have each element referring to the next one. The last element refers to a type that is never used anywhere else (Listing 1).
NullType contains Key member to prevent failures, when the compiler tries to look at the Key component of the last element. I use struct keyword instead of class just to skip writing annoying public statements. It really doesn't matter whether you use class or struct in this case. One may suggest using class, just to point that this is a C++ code, but this is not strictly a C++ class. In our meta-programming sense this is just a declaration of meta-member.
The typemap container is simply a reference to the first element. Typemap is defined as separate class, because in the future we may need to add something without touching the item class. I would suggest moving the implementation into private for all other members. More complicated constructions would require many parameters, and doing so will protect against accidental passing of unwanted template parameters.
Note the X type definition, which serves as a return value. Some libraries use type word, some try to use more descriptive names. I would argue for one-letter name for making code shorter and capital to indicate a type in C++ sense.
As we have described the container and its item, we may want to jump directly to the insertion of elements. Having procedural programming language it is relatively easy to implement the insertion into map class. Basically you have to search for the key and then insert or update the element depending on whether the key was found or not. The description of an insertion algorithm takes one sentence and the actual code will not be much longer. Having meta-language such simple algorithm requires the use of several techniques of meta-programming and significant size of code. One thing that makes the design of insert easier, is that we don't have the relational operation among C++ types. It is hard to invent a logic by how you should sort File and Widget classes. Our typemap meta-class just keeps the keys unique.
The insertion algorithm uses almost all the other techniques of typemap meta-class. As soon as we finish with insert member, we will be through almost all other typemap techniques.
The public part of insert member just checks for emptiness of the container and calls the private implementation (Listing 2). If the container is empty, then the specialization works and provided key-value pair becomes the first element of map. If the container is not empty, then the algorithm should search for the given key. This is a place for another member of typemap meta-class - find.
The search algorithm uses C++ partial specialization construction. The heart of the search algorithm is the match class (Listing 3). The first specialization works when element is found. As the K parameter becomes equal to KK, the compiler picks it up and returns the item as X. The second specialization is used, if the compiler reaches the end of the list without finding the element.
If insert algorithm can't find the given element, then it just appends the new one. This is easy to do using push_front member. Note, that having push_front is useful only internally and for debugging purposes. The user of the class should never use it, but always insert instead. Therefore push_front and push_back should be moved into private area. The definition of push_front is simple and straightforward (Listing 4). The definition of erase_front is similar.
The hardest part comes, when we have to replace an existing element. It is interesting to compare with procedural language case, where this is the easiest case. It doesn't matter however, whether we remove the existing element and insert the new one or just change the value part of the element. None of those change the complexity of the algorithm. The algorithm still looks ugly. While being ugly in regular programming is usually unacceptable, in meta-programming world it could be perfectly useful. Anyway, if reader can find a better solution than presented below, I will appreciate much for the information. The problem with the present solution is that it requires reversing the container after finding the element, erasing, adding, and then reversing again. The better solution would be a way of changing some element without restructuring the container itself. Another solution is to mark the existing element and skip using it later. The disadvantage of this method is the growth of the container with unwanted elements. While the described algorithm in no way harms the application performance, it degrades the compilation time, which is also a concern.
The described insertion algorithm reverses the part of the container for manipulating its boundary elements. It is hard to define relational operator for types and in our current purpose doesn't make any sense at all. In this case, when the elements are not ordered but just grouped, the reverse operation makes sense. Unfortunately reverse algorithm requires total restructuring of the container, so with large containers, the compilation time might degrade significantly. As shown on Listing 5, the reverse algorithm starts with NullType element and accumulates other elements through Item component. As soon as the end of container has reached (the compiler picks the specialization), then the current collected element (I parameter) becomes the resulting reversed container.
Erase member uses the matching technique to find the element. When the element is found, the algorithm reverses all the previous elements, to bring the found item to front. After using erase_front method, the algorithm reverses part of the container back and concatenates it with the remaining part. The insert method uses the same technique to remove the found element and then add the new one.
The final insert algorithm is described in Listing 7. It encapsulates all the previous defined techniques. I have repeated the code, to make it more readable and maybe compilable slightly faster.
Another functionality that we should add to typemap class is iteration through its elements. The iteration through the meta-container often plays a role of a bridge between meta- and real-code. The simplest case of for_each method is shown in Listing 6. The iteration is done by recursive inheritance. Each inherited class calls a member of its base. The last class calls the member of empty base, which does nothing.
For_each member accepts the container as first parameter and a functor as second. The functor has to accept a template parameter, which refers to the current element. For_each instantiates the functor and calls its member for each element of the container.
I don't see a good way of implementing generic for_each member, some sort of generic code of second degree. With current C++ capabilities I would prepare for_each member for the simple and most common case, and reimplement for_each for each more complicated and specific use. The examples coming with this article have implementation of for_each member with support of returning values and passing additional parameters.
Typemaps and Policies
As we have created the typemap meta-class, the policy design becomes simple and straightforward - just pass the typemap as template parameter. The usual programming library may have some default policies, suitable for most of the applications (Listing 8). The programmer then picks the default container, and modifies it to fit the specific needs.
typedef typemap::insert<KError, MyErr, typemap::insert<KMemory, MyMem, DefaultPolicy >::X >::X MyPolicy ;
Then any class that accepts policy as typemap can be used like this.
ImageFile<MyPolicy> image_file ; Socket<MyPolicy> sock ; Database<MyPolicy> db ; Widget<MyPolicy> w ;
The class accepts the policy as template parameter and then searches the used policies by their keys (Listing 9). The class only chooses the needed policies, ignoring others. If the library designer decided to add some other policies, then it is just enough to add it to DefaultPolicy container.
We can go further to submit policies that are containers themselves. For ImageFile this is the container of all supported image formats.
typedef typemap::insert<KPgm, PgmFmt, typemap::insert<KJpeg, JpegFmt, typemap::insert<KPng, PngFmt >::X >::X >::X SupportedFormats ;
More likely the library would have some set of supported formats, and programmer can just simply adjust the default list.
typedef typemap::erase< KGif,DefaultFormats >::X MyFormats ;
Then the new policy containing the list supported formats would look like this.
typedef typemap::insert< KImageFmts, MyFormats >::X FmtsPolicy ;
Then the class, that accepts the KImageFmts component has to iterate through the container to perform the necessary action (Listing 9).
A variable in any programming language is an element of some sort of associative container, where the variable name is the key. The programming languages provide more natural and less verbose ways of accessing variables, instead of building maps and explicitly searching for an element for accessing its value. Unfortunately C++ meta-programming doesn't have the machinery for manipulating with types as with regular variables. Maybe future languages may allow doing this, thus bringing more abstraction level to the code, where the same code syntax would apply equally to the types and variables. In the particular case of policies it would be very convenient to change the member types of a class at compile time similar to changing variables at runtime.
[Abrahams, Gurtovoy] David Abrahams, Alexey Gurtovoy: C++ Template Metaprogramming, Addison-Wesley, 2004. p. 86
[Alexandrescu] Andrei Alexandrescu: Modern C++ Design, Addison Wesley, 2001, p. 3
[Boost] Boost Library: http://www.boost.org, MPL library
Copyright (C) 2005,2006,2013 Andre Mirzoyan