General daily update: Implementing an Object Factory

Say you write a simple drawing application, allowing editing of simple vectorized drawings consisting of lines, circles, polygons, and so on.^[1] In a classic object-oriented manner, you define an abstract Shape class from which all your figures will derive:

^[1] This "Hello, world" of design is a good basis for C++ interview questions. Although many candidates manage to conceive such a design, few of them know how to implement the loading of files, which is a rather important operation.

class Shape
{
public:
   virtual void Draw() const = 0;
   virtual void Rotate(double angle) = 0;
   virtual void Zoom(double zoomFactor) = 0;
   ...
};

You might then define a class Drawing that contains a complex drawing. A Drawing essentially holds a collection of pointers to Shape—such as a list, a vector, or a hierarchical structure—and provides operations to manipulate the drawing as a whole. Two typical operations you might want to do are saving a drawing as a file and loading a drawing from a previously saved file.

Saving shapes is easy: just provide a pure virtual function such as Shape::Save(std::ostream&). Then the Drawing::Save operation might look like this:

class Drawing
{
public:
   void Save(std::ofstream& outFile);
   void Load(std::ifstream& inFile);
   ...
};

void Drawing::Save(std::ofstream& outFile)
{
   write drawing header
   for (each element in the drawing)
   {
       (current element)->Save(outFile);
   }
}
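The pseudocode above can be made concrete as a minimal sketch, assuming C++11. Several details here are assumptions made for illustration and do not come from the original design: the vector of std::unique_ptr used as the container, the Add member, the SaveToString helper, the text format of the header and of a saved Line, and the use of std::ostream& rather than std::ofstream& so that the output can go to a string. The other Shape members (Draw, Rotate, Zoom) are left out to keep the sketch short.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Reduced Shape interface: only the Save function discussed in the text.
class Shape
{
public:
   virtual ~Shape() {}
   virtual void Save(std::ostream& out) const = 0;
};

// Hypothetical concrete shape; the text format it writes is an assumption.
class Line : public Shape
{
public:
   Line(double x1, double y1, double x2, double y2)
      : x1_(x1), y1_(y1), x2_(x2), y2_(y2) {}
   void Save(std::ostream& out) const override
   {
      out << "Line " << x1_ << ' ' << y1_ << ' '
          << x2_ << ' ' << y2_ << '\n';
   }
private:
   double x1_, y1_, x2_, y2_;
};

class Drawing
{
public:
   // Hypothetical helper for filling the drawing.
   void Add(std::unique_ptr<Shape> shape)
   {
      assert(shape.get() != nullptr);
      shapes_.push_back(std::move(shape));
   }
   void Save(std::ostream& out) const
   {
      // Assumed header: the word "Drawing" and the element count.
      out << "Drawing " << shapes_.size() << '\n';
      for (const auto& shape : shapes_)
         shape->Save(out);   // virtual dispatch selects the right Save
   }
private:
   std::vector<std::unique_ptr<Shape>> shapes_;
};

// Convenience wrapper that saves a drawing into a string.
std::string SaveToString(const Drawing& drawing)
{
   std::ostringstream out;
   drawing.Save(out);
   return out.str();
}
```

Drawing::Save never needs to know which concrete shapes exist, which is why saving causes no trouble for the design.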

The Shape-Drawing example just described is often encountered in C++ books, including Bjarne Stroustrup's classic (Stroustrup 1997). However, most introductory C++ books stop when it comes to loading graphics from a file, exactly because the nice model of having separate drawing objects breaks. Explaining the gory details of reading objects makes for a big parenthesis, which understandably is often avoided. On the other hand, this is exactly what we want to implement, so we have to bite the bullet. A straightforward implementation is to require each Shape-derived object to save an integral identifier at the very beginning. Each object should have its own unique ID. Then reading the file would look like this:

// a unique ID for each drawing object type
namespace DrawingType
{
const int
   LINE = 1,
   POLYGON = 2,
   CIRCLE = 3;
}

void Drawing::Load(std::ifstream& inFile)
{
   // error handling omitted for simplicity
   while (inFile)
   {
      // read object type
      int drawingType;
      inFile >> drawingType;

      // create a new empty object
      Shape* pCurrentObject;
      switch (drawingType)
      {
         using namespace DrawingType;
      case LINE:
         pCurrentObject = new Line;
         break;
      case POLYGON:
         pCurrentObject = new Polygon;
         break;
      case CIRCLE:
         pCurrentObject = new Circle;
         break;
      default:
         handle error—unknown object type
      }
      // read the object's contents by invoking a virtual fn
      pCurrentObject->Read(inFile);
      add the object to the container
   }
}

This is indeed an object factory. It reads a type identifier from the file, creates an object of the appropriate type based on that identifier, and invokes a virtual function that loads that object from the file. The only problem is that it breaks the most important rules of object orientation:

It performs a switch based on a type tag, with the associated drawbacks, which is exactly what object-oriented programs try to eliminate.
It collects in a single source file knowledge about all Shape-derived classes in the program, which again you must strive to avoid. For one thing, the implementation file of Drawing::Load must include all headers of all possible shapes, which makes it a bottleneck of compile dependencies and maintenance.
It is hard to extend. Imagine adding a new shape, such as Ellipse, to the system. In addition to creating the class itself, you must add a distinct integral constant to the namespace DrawingType, you must write that constant when saving an Ellipse object, and you must add a label to the switch statement in Drawing::Load. This is an awful lot more than what the architecture promised (total insulation between classes), and all for the sake of a single function!

We'd like to create an object factory that does the job without having these disadvantages. One practical goal worth pursuing is to break the switch statement apart—so that we can put the Line creation statement in the file implementation for Line—and do the same for Polygon and Circle.

A common way to keep together and manipulate pieces of code is to work with pointers to functions, as discussed at length in Chapter 5. The unit of customizable code here (each of the entries in the switch statement) can be abstracted in a function with the signature

Shape* CreateConcreteShape();

The factory keeps a collection of pointers to functions with this signature. In addition, there has to be a correspondence between the IDs and the pointer to the function that creates the appropriate object. Thus, what we need is an associative collection—a map. A map offers access to the appropriate function given the type identifier, which is precisely what the switch statement offers. In addition, the map offers the scalability that the switch statement, with its fixed compile-time structure, cannot provide. The map can grow at runtime—you can add entries (tuples of IDs and pointers to functions) dynamically, which is exactly what we need. We can start with an empty map and have each Shape-derived object add an entry to it.

Why not use a vector? IDs are integral numbers, so we can keep a vector and have the ID be the index in the vector. This would be simpler and faster, but a map is better here. The map doesn't require its indices to be adjacent, plus it's more general—vectors work only with integral indices, whereas maps accept any ordered type as an index. This point will become important when we generalize our example.
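The trade-off can be sketched as follows. All names here (Callback, One, Two, the Build functions, CallById) are hypothetical and exist only for illustration. The sketch shows two IDs that are far apart stored first in a vector indexed by ID, then in a std::map, and finally a map keyed by strings, which a vector cannot express at all.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef int (*Callback)();   // hypothetical callback type

int One() { return 1; }
int Two() { return 2; }

// Vector indexed by ID: it must hold (largest ID + 1) slots,
// and every unused slot is a null pointer.
std::vector<Callback> BuildVectorTable()
{
   std::vector<Callback> table(1001);   // slots start out null
   table[7] = One;
   table[1000] = Two;
   return table;
}

// Map keyed by ID: it stores only the entries actually registered.
std::map<int, Callback> BuildMapTable()
{
   std::map<int, Callback> table;
   table[7] = One;
   table[1000] = Two;
   return table;
}

// Map keyed by a non-integral ordered type.
std::map<std::string, Callback> BuildNamedTable()
{
   std::map<std::string, Callback> table;
   table["line"] = One;
   table["circle"] = Two;
   return table;
}

// Looks up an ID and invokes the callback stored for it.
int CallById(const std::map<int, Callback>& table, int id)
{
   std::map<int, Callback>::const_iterator i = table.find(id);
   assert(i != table.end());   // caller must pass a registered ID
   return (i->second)();
}
```

With only two entries, the vector version allocates 1001 slots while the map holds exactly two elements.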

We can start designing a ShapeFactory class, which has the responsibility of managing the creation of all Shape-derived objects. In implementing ShapeFactory, we will use the map implementation found in the standard library, std::map:

class ShapeFactory
{
public:
   typedef Shape* (*CreateShapeCallback)();
private:
   typedef std::map<int, CreateShapeCallback> CallbackMap;
public:
   // Returns 'true' if registration was successful
   bool RegisterShape(int shapeId,
      CreateShapeCallback createFn);
   // Returns 'true' if the shapeId was registered before
   bool UnregisterShape(int shapeId);
   Shape* CreateShape(int shapeId);
private:
   CallbackMap callbacks_;
};

This is a basic design of a scalable factory. The factory is scalable because you don't have to modify its code each time you add a new Shape-derived class to the system. ShapeFactory divides responsibility: Each new shape has to register itself with the factory by calling RegisterShape and passing it its integral identifier and a pointer to a function that creates an object. Typically, the function has a single line and looks like this:

Shape* CreateLine()
{
   return new Line;
}

The implementation of Line must also register this function with the ShapeFactory that the application uses, which is typically a globally accessible object.^[2] The registration is usually performed by startup code. The whole connection of Line with the ShapeFactory is as follows:

^[2] This brings us to the link between object factories and singletons. Indeed, more often than not, factories are singletons. Later in this chapter is a discussion of how to use factories with the singletons implemented in Chapter 6.

// Implementation module for class Line
// Create an anonymous namespace
//  to make the function invisible from other modules
namespace
{
   Shape* CreateLine()
   {
      return new Line;
   }
   // The ID of class Line
   const int LINE = 1;
   // Assume TheShapeFactory is a singleton factory
   // (see Chapter 6)
   const bool registered =
      TheShapeFactory::Instance().RegisterShape(
         LINE, CreateLine);
}
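For readers without the Chapter 6 singleton at hand, here is a minimal sketch of how the registration above could be wired up. It assumes a simple stand-in for TheShapeFactory built on a function-local static, which is not the singleton implementation the text refers to. IsRegistered and CheckLineRegistered are hypothetical helpers added only so the effect of the startup code can be observed.

```cpp
#include <cassert>
#include <map>

class Shape { public: virtual ~Shape() {} };
class Line : public Shape {};

class ShapeFactory
{
public:
   typedef Shape* (*CreateShapeCallback)();
   bool RegisterShape(int shapeId, CreateShapeCallback createFn)
   {
      return callbacks_.insert(
         CallbackMap::value_type(shapeId, createFn)).second;
   }
   // Hypothetical query, not part of the design in the text.
   bool IsRegistered(int shapeId) const
   {
      return callbacks_.count(shapeId) == 1;
   }
private:
   typedef std::map<int, CreateShapeCallback> CallbackMap;
   CallbackMap callbacks_;
};

// Assumed stand-in for the singleton: the factory is constructed the
// first time Instance() is called, so it already exists when the
// registration code in any module runs.
struct TheShapeFactory
{
   static ShapeFactory& Instance()
   {
      static ShapeFactory factory;
      return factory;
   }
};

namespace
{
   Shape* CreateLine() { return new Line; }
   const int LINE = 1;
   // Initialized during program startup, before main() begins.
   const bool registered =
      TheShapeFactory::Instance().RegisterShape(LINE, CreateLine);
}

// Hypothetical sanity check that can be called from main().
void CheckLineRegistered()
{
   assert(registered);
   assert(TheShapeFactory::Instance().IsRegistered(LINE));
}
```

Creating the factory on first use avoids depending on the order in which global objects in different modules are initialized, which a plain global ShapeFactory object would not guarantee.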

Implementing the ShapeFactory is easy, given the amenities std::map has to offer. Basically, the ShapeFactory member functions simply forward to the callbacks_ member:

bool ShapeFactory::RegisterShape(int shapeId,
   CreateShapeCallback createFn)
{
   return callbacks_.insert(
      CallbackMap::value_type(shapeId, createFn)).second;
}

bool ShapeFactory::UnregisterShape(int shapeId)
{
   return callbacks_.erase(shapeId) == 1;
}

If you're not very familiar with the std::map class template, the previous code might need a bit of explanation:

std::map holds pairs of keys and data. In our case, keys are integral shape IDs, and the data consists of a pointer to function. The type of our pair is std::pair<const int, CreateShapeCallback>. You must pass an object of this type when you call insert. Because that's a lot to write, it's better to use the typedef found inside std::map, which provides a handy name, value_type, for that pair type. Alternatively, you can use std::make_pair.
The insert member function we called returns another pair, this time containing an iterator (which refers to the element just inserted, or to the existing element with the same key) and a bool that is true if the key didn't exist before, and false otherwise. The .second field access after the call to insert selects this bool and returns it in a single stroke, without our having to create a named temporary.
erase returns the number of elements erased.
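The three points above can be checked with a small sketch. It uses a hypothetical NameMap (shape IDs mapped to strings) in place of CallbackMap so that the stored values are easy to inspect. The function names are illustrative only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::map<int, std::string> NameMap;   // stand-in for CallbackMap

// Spells out the pair returned by insert instead of using .second inline.
bool Register(NameMap& names, int id, const std::string& name)
{
   std::pair<NameMap::iterator, bool> result =
      names.insert(NameMap::value_type(id, name));
   return result.second;   // true only if a new element was inserted
}

// Same operation, building the pair with std::make_pair.
bool RegisterWithMakePair(NameMap& names, int id, const std::string& name)
{
   return names.insert(std::make_pair(id, name)).second;
}

// erase(key) returns the number of elements removed: 0 or 1 for std::map.
bool Unregister(NameMap& names, int id)
{
   return names.erase(id) == 1;
}

void CheckReturnValues()
{
   NameMap names;
   assert(Register(names, 1, "line"));        // new key: inserted
   assert(!Register(names, 1, "segment"));    // duplicate key: rejected
   assert(names[1] == "line");                // original value is kept
   assert(RegisterWithMakePair(names, 2, "polygon"));
   assert(Unregister(names, 1));              // one element erased
   assert(!Unregister(names, 1));             // nothing left to erase
}
```

Note that a failed insert leaves the existing entry untouched, so registering the same ID twice keeps the first callback.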

The CreateShape member function simply fetches the appropriate pointer to a function for the ID passed in, and calls it. In the case of an error, it throws an exception. Here it is:

Shape* ShapeFactory::CreateShape(int shapeId)
{
   CallbackMap::const_iterator i = callbacks_.find(shapeId);
   if (i == callbacks_.end())
   {
      // not found
      throw std::runtime_error("Unknown Shape ID");
   }
   // Invoke the creation function
   return (i->second)();
}
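To see how the pieces fit together, here is a sketch of a loading loop that uses the factory instead of the switch statement. It assumes C++11. The free function LoadShapes, the reduced Shape interface, the data members of Line and Circle, and the whitespace-separated text format are all assumptions made for illustration. The factory is passed in as a parameter to keep the sketch free of global state.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <utility>
#include <vector>

class Shape
{
public:
   virtual ~Shape() {}
   virtual void Read(std::istream& in) = 0;
};

class ShapeFactory
{
public:
   typedef Shape* (*CreateShapeCallback)();
   bool RegisterShape(int shapeId, CreateShapeCallback createFn)
   {
      return callbacks_.insert(
         CallbackMap::value_type(shapeId, createFn)).second;
   }
   Shape* CreateShape(int shapeId)
   {
      CallbackMap::const_iterator i = callbacks_.find(shapeId);
      if (i == callbacks_.end())
         throw std::runtime_error("Unknown Shape ID");
      return (i->second)();
   }
private:
   typedef std::map<int, CreateShapeCallback> CallbackMap;
   CallbackMap callbacks_;
};

// Hypothetical shapes; their fields and file layout are assumptions.
class Line : public Shape
{
public:
   void Read(std::istream& in) override { in >> x1_ >> y1_ >> x2_ >> y2_; }
private:
   double x1_ = 0, y1_ = 0, x2_ = 0, y2_ = 0;
};

class Circle : public Shape
{
public:
   void Read(std::istream& in) override { in >> x_ >> y_ >> radius_; }
private:
   double x_ = 0, y_ = 0, radius_ = 0;
};

Shape* CreateLine() { return new Line; }
Shape* CreateCircle() { return new Circle; }

// Reads "<id> <contents>" records until the stream runs out.
// No concrete shape type is named here; the factory picks it.
std::vector<std::unique_ptr<Shape>> LoadShapes(std::istream& in,
                                               ShapeFactory& factory)
{
   std::vector<std::unique_ptr<Shape>> shapes;
   int shapeId;
   while (in >> shapeId)
   {
      // CreateShape throws std::runtime_error for an unknown ID.
      std::unique_ptr<Shape> shape(factory.CreateShape(shapeId));
      assert(shape.get() != nullptr);
      shape->Read(in);
      shapes.push_back(std::move(shape));
   }
   return shapes;
}
```

Adding a new shape now means registering one more callback; LoadShapes itself does not change.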

Saturday, 13 August 2011
