[ Viewing Hints ] [ Exercise Solutions ] [ Volume 2 ] [ Free Newsletter ]
[ Seminars ] [ Seminars on CD ROM ] [ Consulting ]

Thinking in C++, 2nd ed. Volume 1

©2000 by Bruce Eckel

[ Previous Chapter ] [ Table of Contents ] [ Index ] [ Next Chapter ]

13: Dynamic Object Creation

Sometimes you know the exact quantity, type, and lifetime of the objects in your program. But not always.

How many planes will an air-traffic system need to handle? How many shapes will a CAD system use? How many nodes will there be in a network?

To solve the general programming problem, it’s essential that you be able to create and destroy objects at runtime. Of course, C has always provided the dynamic memory allocation functions malloc( ) and free( ) (along with variants of malloc( )) that allocate storage from the heap (also called the free store) at runtime.

However, this simply won’t work in C++. The constructor doesn’t allow you to hand it the address of the memory to initialize, and for good reason. If you could do that, you might :

Forget. Then guaranteed initialization of objects in C++ wouldn’t be guaranteed.
Accidentally do something to the object before you initialize it, expecting the right thing to happen.
Hand it the wrong-sized object.

And of course, even if you did everything correctly, anyone who modifies your program is prone to the same errors. Improper initialization is responsible for a large portion of programming problems, so it’s especially important to guarantee constructor calls for objects created on the heap.

So how does C++ guarantee proper initialization and cleanup, but allow you to create objects dynamically on the heap?

The answer is by bringing dynamic object creation into the core of the language. malloc( ) and free( ) are library functions, and thus outside the control of the compiler. However, if you have an operator to perform the combined act of dynamic storage allocation and initialization and another operator to perform the combined act of cleanup and releasing storage, the compiler can still guarantee that constructors and destructors will be called for all objects.

In this chapter, you’ll learn how C++’s new and delete elegantly solve this problem by safely creating objects on the heap.

Object creation

When a C++ object is created, two events occur:

Storage is allocated for the object.
The constructor is called to initialize that storage.

By now you should believe that step two always happens. C++ enforces it because uninitialized objects are a major source of program bugs. It doesn’t matter where or how the object is created – the constructor is always called.

Step one, however, can occur in several ways, or at alternate times :

Storage can be allocated before the program begins, in the static storage area. This storage exists for the life of the program.
Storage can be created on the stack whenever a particular execution point is reached (an opening brace). That storage is released automatically at the complementary execution point (the closing brace). These stack-allocation operations are built into the instruction set of the processor and are very efficient. However, you have to know exactly how many variables you need when you’re writing the program so the compiler can generate the right code.
Storage can be allocated from a pool of memory called the heap (also known as the free store). This is called dynamic memory allocation. To allocate this memory, a function is called at runtime; this means you can decide at any time that you want some memory and how much you need. You are also responsible for determining when to release the memory, which means the lifetime of that memory can be as long as you choose – it isn’t determined by scope.

Often these three regions are placed in a single contiguous piece of physical memory: the static area, the stack, and the heap (in an order determined by the compiler writer). However, there are no rules. The stack may be in a special place, and the heap may be implemented by making calls for chunks of memory from the operating system. As a programmer, these things are normally shielded from you, so all you need to think about is that the memory is there when you call for it.

C’s approach to the heap

To allocate memory dynamically at runtime, C provides functions in its standard library: malloc( ) and its variants calloc( ) and realloc( ) to produce memory from the heap , and free( ) to release the memory back to the heap. These functions are pragmatic but primitive and require understanding and care on the part of the programmer. To create an instance of a class on the heap using C’s dynamic memory functions, you’d have to do something like this:

//: C13:MallocClass.cpp
// Malloc with class objects
// What you'd have to do if not for "new"
#include "../require.h"
#include <cstdlib> // malloc() & free()
#include <cstring> // memset()
#include <iostream>
using namespace std;

class Obj {
  int i, j, k;
  enum { sz = 100 };
  char buf[sz];
public:
  void initialize() { // Can't use constructor
    cout << "initializing Obj" << endl;
    i = j = k = 0;
    memset(buf, 0, sz);
  }
  void destroy() const { // Can't use destructor
    cout << "destroying Obj" << endl;
  }
};

int main() {
  Obj* obj = (Obj*)malloc(sizeof(Obj));
  require(obj != 0);
  obj->initialize();
  // ... sometime later:
  obj->destroy();
  free(obj);
} ///:~

You can see the use of malloc( ) to create storage for the object in the line:

Obj* obj = (Obj*)malloc(sizeof(Obj));

Here, the user must determine the size of the object (one place for an error). malloc( ) returns a void* because it just produces a patch of memory, not an object. C++ doesn’t allow a void* to be assigned to any other pointer, so it must be cast.

Because malloc( ) may fail to find any memory (in which case it returns zero), you must check the returned pointer to make sure it was successful.

But the worst problem is this line:

Obj->initialize();

If users make it this far correctly, they must remember to initialize the object before it is used. Notice that a constructor was not used because the constructor cannot be called explicitly[50] – it’s called for you by the compiler when an object is created. The problem here is that the user now has the option to forget to perform the initialization before the object is used, thus reintroducing a major source of bugs.

It also turns out that many programmers seem to find C’s dynamic memory functions too confusing and complicated; it’s not uncommon to find C programmers who use virtual memory machines allocating huge arrays of variables in the static storage area to avoid thinking about dynamic memory allocation. Because C++ is attempting to make library use safe and effortless for the casual programmer, C’s approach to dynamic memory is unacceptable.

operator new

The solution in C++ is to combine all the actions necessary to create an object into a single operator called new . When you create an object with new (using a new-expression ), it allocates enough storage on the heap to hold the object and calls the constructor for that storage. Thus, if you say

MyType *fp = new MyType(1,2);

at runtime, the equivalent of malloc(sizeof(MyType)) is called (often, it is literally a call to malloc( )), and the constructor for MyType is called with the resulting address as the this pointer, using (1,2) as the argument list. By the time the pointer is assigned to fp, it’s a live, initialized object – you can’t even get your hands on it before then. It’s also automatically the proper MyType type so no cast is necessary.

The default new checks to make sure the memory allocation was successful before passing the address to the constructor, so you don’t have to explicitly determine if the call was successful. Later in the chapter you’ll find out what happens if there’s no memory left.

You can create a new-expression using any constructor available for the class. If the constructor has no arguments, you write the new-expression without the constructor argument list:

MyType *fp = new MyType;

Notice how simple the process of creating objects on the heap becomes – a single expression, with all the sizing, conversions, and safety checks built in. It’s as easy to create an object on the heap as it is on the stack.

operator delete

The complement to the new-expression is the delete-expression, which first calls the destructor and then releases the memory (often with a call to free( )). Just as a new-expression returns a pointer to the object, a delete-expression requires the address of an object.

delete fp;

This destructs and then releases the storage for the dynamically allocated MyType object created earlier.

delete can be called only for an object created by new. If you malloc( ) (or calloc( ) or realloc( )) an object and then delete it, the behavior is undefined. Because most default implementations of new and delete use malloc( ) and free( ), you’d probably end up releasing the memory without calling the destructor.

If the pointer you’re deleting is zero, nothing will happen. For this reason, people often recommend setting a pointer to zero immediately after you delete it, to prevent deleting it twice. Deleting an object more than once is definitely a bad thing to do, and will cause problems.

A simple example

This example shows that initialization takes place:

//: C13:Tree.h
#ifndef TREE_H
#define TREE_H
#include <iostream>

class Tree {
  int height;
public:
  Tree(int treeHeight) : height(treeHeight) {}
  ~Tree() { std::cout << "*"; }
  friend std::ostream&
  operator<<(std::ostream& os, const Tree* t) {
    return os << "Tree height is: "
              << t->height << std::endl;
  }
}; 
#endif // TREE_H ///:~

//: C13:NewAndDelete.cpp
// Simple demo of new & delete
#include "Tree.h"
using namespace std;

int main() {
  Tree* t = new Tree(40);
  cout << t;
  delete t;
} ///:~

We can prove that the constructor is called by printing out the value of the Tree. Here, it’s done by overloading the operator<< to use with an ostream and a Tree*. Note, however, that even though the function is declared as a friend, it is defined as an inline! This is a mere convenience – defining a friend function as an inline to a class doesn’t change the friend status or the fact that it’s a global function and not a class member function. Also notice that the return value is the result of the entire output expression, which is an ostream& (which it must be, to satisfy the return value type of the function).

Memory manager overhead

When you create automatic objects on the stack, the size of the objects and their lifetime is built right into the generated code, because the compiler knows the exact type, quantity, and scope. Creating objects on the heap involves additional overhead, both in time and in space. Here’s a typical scenario. (You can replace malloc( ) with calloc( ) or realloc( ).)

You call malloc( ), which requests a block of memory from the pool. (This code may actually be part of malloc( ).)

The pool is searched for a block of memory large enough to satisfy the request. This is done by checking a map or directory of some sort that shows which blocks are currently in use and which are available. It’s a quick process, but it may take several tries so it might not be deterministic – that is, you can’t necessarily count on malloc( ) always taking exactly the same amount of time.

Before a pointer to that block is returned, the size and location of the block must be recorded so further calls to malloc( ) won’t use it, and so that when you call free( ), the system knows how much memory to release.

The way all this is implemented can vary widely. For example, there’s nothing to prevent primitives for memory allocation being implemented in the processor. If you’re curious, you can write test programs to try to guess the way your malloc( ) is implemented. You can also read the library source code, if you have it (the GNU C sources are always available).

Early examples redesigned

Using new and delete, the Stash example introduced previously in this book can be rewritten using all the features discussed in the book so far. Examining the new code will also give you a useful review of the topics.

At this point in the book, neither the Stash nor Stack classes will “own” the objects they point to; that is, when the Stash or Stack object goes out of scope, it will not call delete for all the objects it points to. The reason this is not possible is because, in an attempt to be generic, they hold void pointers. If you delete a void pointer, the only thing that happens is the memory gets released, because there’s no type information and no way for the compiler to know what destructor to call.

delete void* is probably a bug

It’s worth making a point that if you call delete for a void*, it’s almost certainly going to be a bug in your program unless the destination of that pointer is very simple; in particular, it should not have a destructor. Here’s an example to show you what happens:

//: C13:BadVoidPointerDeletion.cpp
// Deleting void pointers can cause memory leaks
#include <iostream>
using namespace std;

class Object {
  void* data; // Some storage
  const int size;
  const char id;
public:
  Object(int sz, char c) : size(sz), id(c) {
    data = new char[size];
    cout << "Constructing object " << id 
         << ", size = " << size << endl;
  }
  ~Object() { 
    cout << "Destructing object " << id << endl;
    delete []data; // OK, just releases storage,
    // no destructor calls are necessary
  }
};

int main() {
  Object* a = new Object(40, 'a');
  delete a;
  void* b = new Object(40, 'b');
  delete b;
} ///:~

The class Object contains a void* that is initialized to “raw” data (it doesn’t point to objects that have destructors). In the Object destructor, delete is called for this void* with no ill effects, since the only thing we need to happen is for the storage to be released.

However, in main( ) you can see that it’s very necessary that delete know what type of object it’s working with. Here’s the output:

Constructing object a, size = 40
Destructing object a
Constructing object b, size = 40

Because delete a knows that a points to an Object, the destructor is called and thus the storage allocated for data is released. However, if you manipulate an object through a void* as in the case of delete b, the only thing that happens is that the storage for the Object is released – but the destructor is not called so there is no release of the memory that data points to. When this program compiles, you probably won’t see any warning messages; the compiler assumes you know what you’re doing. So you get a very quiet memory leak.

If you have a memory leak in your program, search through all the delete statements and check the type of pointer being deleted. If it’s a void* then you’ve probably found one source of your memory leak (C++ provides ample other opportunities for memory leaks, however).

Cleanup responsibility with pointers

To make the Stash and Stack containers flexible (able to hold any type of object), they will hold void pointers. This means that when a pointer is returned from the Stash or Stack object, you must cast it to the proper type before using it; as seen above, you must also cast it to the proper type before deleting it or you’ll get a memory leak.

The other memory leak issue has to do with making sure that delete is actually called for each object pointer held in the container. The container cannot “own” the pointer because it holds it as a void* and thus cannot perform the proper cleanup. The user must be responsible for cleaning up the objects. This produces a serious problem if you add pointers to objects created on the stack and objects created on the heap to the same container because a delete-expression is unsafe for a pointer that hasn’t been allocated on the heap. (And when you fetch a pointer back from the container, how will you know where its object has been allocated?) Thus, you must be sure that objects stored in the following versions of Stash and Stack are made only on the heap, either through careful programming or by creating classes that can only be built on the heap.

It’s also important to make sure that the client programmer takes responsibility for cleaning up all the pointers in the container. You’ve seen in previous examples how the Stack class checks in its destructor that all the Link objects have been popped. For a Stash of pointers, however, another approach is needed.

Stash for pointers

This new version of the Stash class, called PStash, holds pointers to objects that exist by themselves on the heap, whereas the old Stash in earlier chapters copied the objects by value into the Stash container. Using new and delete, it’s easy and safe to hold pointers to objects that have been created on the heap.

Here’s the header file for the “pointer Stash”:

//: C13:PStash.h
// Holds pointers instead of objects
#ifndef PSTASH_H
#define PSTASH_H

class PStash {
  int quantity; // Number of storage spaces
  int next; // Next empty space
   // Pointer storage:
  void** storage;
  void inflate(int increase);
public:
  PStash() : quantity(0), storage(0), next(0) {}
  ~PStash();
  int add(void* element);
  void* operator[](int index) const; // Fetch
  // Remove the reference from this PStash:
  void* remove(int index);
  // Number of elements in Stash:
  int count() const { return next; }
};
#endif // PSTASH_H ///:~

The underlying data elements are fairly similar, but now storage is an array of void pointers, and the allocation of storage for that array is performed with new instead of malloc( ). In the expression

void** st = new void*[quantity + increase];

the type of object allocated is a void*, so the expression allocates an array of void pointers.

The destructor deletes the storage where the void pointers are held rather than attempting to delete what they point at (which, as previously noted, will release their storage and not call the destructors because a void pointer has no type information).

The other change is the replacement of the fetch( ) function with operator[ ], which makes more sense syntactically. Again, however, a void* is returned, so the user must remember what types are stored in the container and cast the pointers when fetching them out (a problem that will be repaired in future chapters).

Here are the member function definitions:

//: C13:PStash.cpp {O}
// Pointer Stash definitions
#include "PStash.h"
#include "../require.h"
#include <iostream>
#include <cstring> // 'mem' functions
using namespace std;

int PStash::add(void* element) {
  const int inflateSize = 10;
  if(next >= quantity)
    inflate(inflateSize);
  storage[next++] = element;
  return(next - 1); // Index number
}

// No ownership:
PStash::~PStash() {
  for(int i = 0; i < next; i++)
    require(storage[i] == 0, 
      "PStash not cleaned up");
  delete []storage; 
}

// Operator overloading replacement for fetch
void* PStash::operator[](int index) const {
  require(index >= 0,
    "PStash::operator[] index negative");
  if(index >= next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return storage[index];
}

void* PStash::remove(int index) {
  void* v = operator[](index);
  // "Remove" the pointer:
  if(v != 0) storage[index] = 0;
  return v;
}

void PStash::inflate(int increase) {
  const int psz = sizeof(void*);
  void** st = new void*[quantity + increase];
  memset(st, 0, (quantity + increase) * psz);
  memcpy(st, storage, quantity * psz);
  quantity += increase;
  delete []storage; // Old storage
  storage = st; // Point to new memory
} ///:~

The add( ) function is effectively the same as before, except that a pointer is stored instead of a copy of the whole object.

The inflate( ) code is modified to handle the allocation of an array of void* instead of the previous design, which was only working with raw bytes. Here, instead of using the prior approach of copying by array indexing, the Standard C library function memset( ) is first used to set all the new memory to zero (this is not strictly necessary, since the PStash is presumably managing all the memory correctly – but it usually doesn’t hurt to throw in a bit of extra care). Then memcpy( ) moves the existing data from the old location to the new. Often, functions like memset( ) and memcpy( ) have been optimized over time, so they may be faster than the loops shown previously. But with a function like inflate( ) that will probably not be used that often you may not see a performance difference. However, the fact that the function calls are more concise than the loops may help prevent coding errors.

To put the responsibility of object cleanup squarely on the shoulders of the client programmer, there are two ways to access the pointers in the PStash: the operator[], which simply returns the pointer but leaves it as a member of the container, and a second member function remove( ), which returns the pointer but also removes it from the container by assigning that position to zero. When the destructor for PStash is called, it checks to make sure that all object pointers have been removed; if not, you’re notified so you can prevent a memory leak (more elegant solutions will be forthcoming in later chapters).

A test

Here’s the old test program for Stash rewritten for the PStash:

//: C13:PStashTest.cpp
//{L} PStash
// Test of pointer Stash
#include "PStash.h"
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {
  PStash intStash;
  // 'new' works with built-in types, too. Note
  // the "pseudo-constructor" syntax:
  for(int i = 0; i < 25; i++)
    intStash.add(new int(i));
  for(int j = 0; j < intStash.count(); j++)
    cout << "intStash[" << j << "] = "
         << *(int*)intStash[j] << endl;
  // Clean up:
  for(int k = 0; k < intStash.count(); k++)
    delete intStash.remove(k);
  ifstream in ("PStashTest.cpp");
  assure(in, "PStashTest.cpp");
  PStash stringStash;
  string line;
  while(getline(in, line))
    stringStash.add(new string(line));
  // Print out the strings:
  for(int u = 0; stringStash[u]; u++)
    cout << "stringStash[" << u << "] = "
         << *(string*)stringStash[u] << endl;
  // Clean up:
  for(int v = 0; v < stringStash.count(); v++)
    delete (string*)stringStash.remove(v);
} ///:~

As before, Stashes are created and filled with information, but this time the information is the pointers resulting from new-expressions. In the first case, note the line:

intStash.add(new int(i));

The expression new int(i) uses the pseudo-constructor form , so storage for a new int object is created on the heap, and the int is initialized to the value i.

During printing, the value returned by PStash::operator[ ] must be cast to the proper type; this is repeated for the rest of the PStash objects in the program. It’s an undesirable effect of using void pointers as the underlying representation and will be fixed in later chapters.

The second test opens the source code file and reads it one line at a time into another PStash. Each line is read into a string using getline( ), then a new string is created from line to make an independent copy of that line. If we just passed in the address of line each time, we’d get a whole bunch of pointers pointing to line, which would only contain the last line that was read from the file.

When fetching the pointers, you see the expression:

*(string*)stringStash[v]

The pointer returned from operator[ ] must be cast to a string* to give it the proper type. Then the string* is dereferenced so the expression evaluates to an object, at which point the compiler sees a string object to send to cout.

The objects created on the heap must be destroyed through the use of the remove( ) statement or else you’ll get a message at runtime telling you that you haven’t completely cleaned up the objects in the PStash. Notice that in the case of the int pointers, no cast is necessary because there’s no destructor for an int and all we need is memory release:

delete intStash.remove(k);

However, for the string pointers, if you forget to do the cast you’ll have another (quiet) memory leak, so the cast is essential:

delete (string*)stringStash.remove(k);

Some of these issues (but not all) can be removed using templates (which you’ll learn about in Chapter 16).

new & delete for arrays

In C++, you can create arrays of objects on the stack or on the heap with equal ease, and (of course) the constructor is called for each object in the array. There’s one constraint, however: There must be a default constructor , except for aggregate initialization on the stack (see Chapter 6), because a constructor with no arguments must be called for every object.

When creating arrays of objects on the heap using new, there’s something else you must do. An example of such an array is

MyType* fp = new MyType[100];

This allocates enough storage on the heap for 100 MyType objects and calls the constructor for each one. Now, however, you simply have a MyType*, which is exactly the same as you’d get if you said

MyType* fp2 = new MyType;

to create a single object. Because you wrote the code, you know that fp is actually the starting address of an array, so it makes sense to select array elements using an expression like fp[3]. But what happens when you destroy the array? The statements

delete fp2; // OK
delete fp;  // Not the desired effect

look exactly the same, and their effect will be the same. The destructor will be called for the MyType object pointed to by the given address, and then the storage will be released. For fp2 this is fine, but for fp this means that the other 99 destructor calls won’t be made. The proper amount of storage will still be released, however, because it is allocated in one big chunk, and the size of the whole chunk is stashed somewhere by the allocation routine.

The solution requires you to give the compiler the information that this is actually the starting address of an array. This is accomplished with the following syntax:

delete []fp;

The empty brackets tell the compiler to generate code that fetches the number of objects in the array, stored somewhere when the array is created, and calls the destructor for that many array objects. This is actually an improved syntax from the earlier form, which you may still occasionally see in old code:

delete [100]fp;

which forced the programmer to include the number of objects in the array and introduced the possibility that the programmer would get it wrong. The additional overhead of letting the compiler handle it was very low, and it was considered better to specify the number of objects in one place instead of two.

Making a pointer more like an array

As an aside, the fp defined above can be changed to point to anything, which doesn’t make sense for the starting address of an array. It makes more sense to define it as a constant, so any attempt to modify the pointer will be flagged as an error. To get this effect, you might try

int const* q = new int[10];

const int* q = new int[10];

but in both cases the const will bind to the int, that is, what is being pointed to, rather than the quality of the pointer itself. Instead, you must say

int* const q = new int[10];

Now the array elements in q can be modified, but any change to q (like q++) is illegal, as it is with an ordinary array identifier.

Running out of storage

What happens when the operator new( ) cannot find a contiguous block of storage large enough to hold the desired object? A special function called the new-handler is called. Or rather, a pointer to a function is checked, and if the pointer is nonzero, then the function it points to is called.

The default behavior for the new-handler is to throw an exception, a subject covered in Volume 2. However, if you’re using heap allocation in your program, it’s wise to at least replace the new-handler with a message that says you’ve run out of memory and then aborts the program. That way, during debugging, you’ll have a clue about what happened. For the final program you’ll want to use more robust recovery.

You replace the new-handler by including new.h and then calling set_new_handler( ) with the address of the function you want installed:

//: C13:NewHandler.cpp
// Changing the new-handler
#include <iostream>
#include <cstdlib>
#include <new>
using namespace std;

int count = 0;

void out_of_memory() {
  cerr << "memory exhausted after " << count 
    << " allocations!" << endl;
  exit(1);
}

int main() {
  set_new_handler(out_of_memory);
  while(1) {
    count++;
    new int[1000]; // Exhausts memory
  }
} ///:~

The new-handler function must take no arguments and have a void return value. The while loop will keep allocating int objects (and throwing away their return addresses) until the free store is exhausted. At the very next call to new, no storage can be allocated, so the new-handler will be called.

The behavior of the new-handler is tied to operator new( ), so if you overload operator new( ) (covered in the next section) the new-handler will not be called by default. If you still want the new-handler to be called you’ll have to write the code to do so inside your overloaded operator new( ).

Of course, you can write more sophisticated new-handlers, even one to try to reclaim memory (commonly known as a garbage collector). This is not a job for the novice programmer.

Overloading new & delete

When you create a new-expression, two things occur. First, storage is allocated using the operator new( ), then the constructor is called. In a delete-expression, the destructor is called, then storage is deallocated using the operator delete( ). The constructor and destructor calls are never under your control (otherwise you might accidentally subvert them), but you can change the storage allocation functions operator new( ) and operator delete( ).

The memory allocation system used by new and delete is designed for general-purpose use. In special situations, however, it doesn’t serve your needs. The most common reason to change the allocator is efficiency: You might be creating and destroying so many objects of a particular class that it has become a speed bottleneck. C++ allows you to overload new and delete to implement your own storage allocation scheme, so you can handle problems like this.

Another issue is heap fragmentation. By allocating objects of different sizes it’s possible to break up the heap so that you effectively run out of storage. That is, the storage might be available, but because of fragmentation no piece is big enough to satisfy your needs. By creating your own allocator for a particular class, you can ensure this never happens.

In embedded and real-time systems, a program may have to run for a very long time with restricted resources. Such a system may also require that memory allocation always take the same amount of time, and there’s no allowance for heap exhaustion or fragmentation. A custom memory allocator is the solution; otherwise, programmers will avoid using new and delete altogether in such cases and miss out on a valuable C++ asset.

When you overload operator new( ) and operator delete( ), it’s important to remember that you’re changing only the way raw storage is allocated. The compiler will simply call your new instead of the default version to allocate storage, then call the constructor for that storage. So, although the compiler allocates storage and calls the constructor when it sees new, all you can change when you overload new is the storage allocation portion. (delete has a similar limitation.)

When you overload operator new( ), you also replace the behavior when it runs out of memory, so you must decide what to do in your operator new( ): return zero, write a loop to call the new-handler and retry allocation, or (typically) throw a bad_alloc exception (discussed in Volume 2, available at www.BruceEckel.com).

Overloading new and delete is like overloading any other operator. However, you have a choice of overloading the global allocator or using a different allocator for a particular class.

Overloading global new & delete

This is the drastic approach, when the global versions of new and delete are unsatisfactory for the whole system. If you overload the global versions, you make the defaults completely inaccessible – you can’t even call them from inside your redefinitions.

The overloaded new must take an argument of size_t (the Standard C standard type for sizes). This argument is generated and passed to you by the compiler and is the size of the object you’re responsible for allocating. You must return a pointer either to an object of that size (or bigger, if you have some reason to do so), or to zero if you can’t find the memory (in which case the constructor is not called!). However, if you can’t find the memory, you should probably do something more informative than just returning zero, like calling the new-handler or throwing an exception, to signal that there’s a problem.

The return value of operator new( ) is a void*, not a pointer to any particular type. All you’ve done is produce memory, not a finished object – that doesn’t happen until the constructor is called, an act the compiler guarantees and which is out of your control.

The operator delete( ) takes a void* to memory that was allocated by operator new. It’s a void* because operator delete only gets the pointer after the destructor is called, which removes the object-ness from the piece of storage. The return type is void.

Here’s a simple example showing how to overload the global new and delete:

//: C13:GlobalOperatorNew.cpp
// Overload global new/delete
#include <cstdio>
#include <cstdlib>
using namespace std;

void* operator new(size_t sz) {
  printf("operator new: %d Bytes\n", sz);
  void* m = malloc(sz);
  if(!m) puts("out of memory");
  return m;
}

void operator delete(void* m) {
  puts("operator delete");
  free(m);
}

class S {
  int i[100];
public:
  S() { puts("S::S()"); }
  ~S() { puts("S::~S()"); }
};

int main() {
  puts("creating & destroying an int");
  int* p = new int(47);
  delete p;
  puts("creating & destroying an s");
  S* s = new S;
  delete s;
  puts("creating & destroying S[3]");
  S* sa = new S[3];
  delete []sa;
} ///:~

Here you can see the general form for overloading new and delete. These use the Standard C library functions malloc( ) and free( ) for the allocators (which is probably what the default new and delete use as well!). However, they also print messages about what they are doing. Notice that printf( ) and puts( ) are used rather than iostreams. This is because when an iostream object is created (like the global cin, cout, and cerr), it calls new to allocate memory. With printf( ), you don’t get into a deadlock because it doesn’t call new to initialize itself.

In main( ), objects of built-in types are created to prove that the overloaded new and delete are also called in that case. Then a single object of type S is created, followed by an array of S. For the array, you’ll see from the number of bytes requested that extra memory is allocated to store information (inside the array) about the number of objects it holds. In all cases, the global overloaded versions of new and delete are used.

Overloading new & delete for a class

Although you don’t have to explicitly say static, when you overload new and delete for a class, you’re creating static member functions. As before, the syntax is the same as overloading any other operator. When the compiler sees you use new to create an object of your class, it chooses the member operator new( ) over the global version. However, the global versions of new and delete are used for all other types of objects (unless they have their own new and delete).

In the following example, a primitive storage allocation system is created for the class Framis. A chunk of memory is set aside in the static data area at program start-up, and that memory is used to allocate space for objects of type Framis. To determine which blocks have been allocated, a simple array of bytes is used, one byte for each block:

//: C13:Framis.cpp
// Local overloaded new & delete
#include <cstddef> // Size_t
#include <fstream>
#include <iostream>
#include <new>
using namespace std;
ofstream out("Framis.out");

class Framis {
  enum { sz = 10 };
  char c[sz]; // To take up space, not used
  static unsigned char pool[];
  static bool alloc_map[];
public:
  enum { psize = 100 };  // frami allowed
  Framis() { out << "Framis()\n"; }
  ~Framis() { out << "~Framis() ... "; }
  void* operator new(size_t) throw(bad_alloc);
  void operator delete(void*);
};
unsigned char Framis::pool[psize * sizeof(Framis)];
bool Framis::alloc_map[psize] = {false};

// Size is ignored -- assume a Framis object
void* 
Framis::operator new(size_t) throw(bad_alloc) {
  for(int i = 0; i < psize; i++)
    if(!alloc_map[i]) {
      out << "using block " << i << " ... ";
      alloc_map[i] = true; // Mark it used
      return pool + (i * sizeof(Framis));
    }
  out << "out of memory" << endl;
  throw bad_alloc();
}

void Framis::operator delete(void* m) {
  if(!m) return; // Check for null pointer
  // Assume it was created in the pool
  // Calculate which block number it is:
  unsigned long block = (unsigned long)m
    - (unsigned long)pool;
  block /= sizeof(Framis);
  out << "freeing block " << block << endl;
  // Mark it free:
  alloc_map[block] = false;
}

int main() {
  Framis* f[Framis::psize];
  try {
    for(int i = 0; i < Framis::psize; i++)
      f[i] = new Framis;
    new Framis; // Out of memory
  } catch(bad_alloc) {
    cerr << "Out of memory!" << endl;
  }
  delete f[10];
  f[10] = 0;
  // Use released memory:
  Framis* x = new Framis;
  delete x;
  for(int j = 0; j < Framis::psize; j++)
    delete f[j]; // Delete f[10] OK
} ///:~

The pool of memory for the Framis heap is created by allocating an array of bytes large enough to hold psize Framis objects. The allocation map is psize elements long, so there’s one bool for every block. All the values in the allocation map are initialized to false using the aggregate initialization trick of setting the first element so the compiler automatically initializes all the rest to their normal default value (which is false, in the case of bool).

The local operator new( ) has the same syntax as the global one. All it does is search through the allocation map looking for a false value, then sets that location to true to indicate it’s been allocated and returns the address of the corresponding memory block. If it can’t find any memory, it issues a message to the trace file and throws a bad_alloc exception.

This is the first example of exceptions that you’ve seen in this book. Since detailed discussion of exceptions is delayed until Volume 2, this is a very simple use of them. In operator new( ) there are two artifacts of exception handling. First, the function argument list is followed by throw(bad_alloc), which tells the compiler and the reader that this function may throw an exception of type bad_alloc. Second, if there’s no more memory the function actually does throw the exception in the statement throw bad_alloc. When an exception is thrown, the function stops executing and control is passed to an exception handler, which is expressed as a catch clause.

In main( ), you see the other part of the picture, which is the try-catch clause. The try block is surrounded by braces and contains all the code that may throw exceptions – in this case, any call to new that involves Framis objects. Immediately following the try block is one or more catch clauses, each one specifying the type of exception that they catch. In this case, catch(bad_alloc) says that that bad_alloc exceptions will be caught here. This particular catch clause is only executed when a bad_alloc exception is thrown, and execution continues after the end of the last catch clause in the group (there’s only one here, but there could be more).

In this example, it’s OK to use iostreams because the global operator new( ) and delete( ) are untouched.

The operator delete( ) assumes the Framis address was created in the pool. This is a fair assumption, because the local operator new( ) will be called whenever you create a single Framis object on the heap – but not an array of them: global new is used for arrays. So the user might accidentally have called operator delete( ) without using the empty bracket syntax to indicate array destruction. This would cause a problem. Also, the user might be deleting a pointer to an object created on the stack. If you think these things could occur, you might want to add a line to make sure the address is within the pool and on a correct boundary (you may also begin to see the potential of overloaded new and delete for finding memory leaks).

operator delete( ) calculates the block in the pool that this pointer represents, and then sets the allocation map’s flag for that block to false to indicate the block has been released.

In main( ), enough Framis objects are dynamically allocated to run out of memory; this checks the out-of-memory behavior. Then one of the objects is freed, and another one is created to show that the released memory is reused.

Because this allocation scheme is specific to Framis objects, it’s probably much faster than the general-purpose memory allocation scheme used for the default new and delete. However, you should note that it doesn’t automatically work if inheritance is used (inheritance is covered in Chapter 14).

Overloading new & delete for arrays

If you overload operator new and delete for a class, those operators are called whenever you create an object of that class. However, if you create an array of those class objects, the global operator new( ) is called to allocate enough storage for the array all at once, and the global operator delete( ) is called to release that storage. You can control the allocation of arrays of objects by overloading the special array versions of operator new[ ] and operator delete[ ] for the class. Here’s an example that shows when the two different versions are called:

//: C13:ArrayOperatorNew.cpp
// Operator new for arrays
#include <new> // Size_t definition
#include <fstream>
using namespace std;
ofstream trace("ArrayOperatorNew.out");

class Widget {
  enum { sz = 10 };
  int i[sz];
public:
  Widget() { trace << "*"; }
  ~Widget() { trace << "~"; }
  void* operator new(size_t sz) {
    trace << "Widget::new: "
         << sz << " bytes" << endl;
    return ::new char[sz];
  }
  void operator delete(void* p) {
    trace << "Widget::delete" << endl;
    ::delete []p;
  }
  void* operator new[](size_t sz) {
    trace << "Widget::new[]: "
         << sz << " bytes" << endl;
    return ::new char[sz];
  }
  void operator delete[](void* p) {
    trace << "Widget::delete[]" << endl;
    ::delete []p;
  }
};

int main() {
  trace << "new Widget" << endl;
  Widget* w = new Widget;
  trace << "\ndelete Widget" << endl;
  delete w;
  trace << "\nnew Widget[25]" << endl;
  Widget* wa = new Widget[25];
  trace << "\ndelete []Widget" << endl;
  delete []wa;
} ///:~

Here, the global versions of new and delete are called so the effect is the same as having no overloaded versions of new and delete except that trace information is added. Of course, you can use any memory allocation scheme you want in the overloaded new and delete.

You can see that the syntax of array new and delete is the same as for the individual object versions except for the addition of the brackets. In both cases you’re handed the size of the memory you must allocate. The size handed to the array version will be the size of the entire array. It’s worth keeping in mind that the only thing the overloaded operator new( ) is required to do is hand back a pointer to a large enough memory block. Although you may perform initialization on that memory, normally that’s the job of the constructor that will automatically be called for your memory by the compiler.

The constructor and destructor simply print out characters so you can see when they’ve been called. Here’s what the trace file looks like for one compiler:

new Widget
Widget::new: 40 bytes
*
delete Widget
~Widget::delete

new Widget[25]
Widget::new[]: 1004 bytes
*************************
delete []Widget
~~~~~~~~~~~~~~~~~~~~~~~~~Widget::delete[]

Creating an individual object requires 40 bytes, as you might expect. (This machine uses four bytes for an int.) The operator new( ) is called, then the constructor (indicated by the *). In a complementary fashion, calling delete causes the destructor to be called, then the operator delete( ).

When an array of Widget objects is created, the array version of operator new( ) is used, as promised. But notice that the size requested is four more bytes than expected. This extra four bytes is where the system keeps information about the array, in particular, the number of objects in the array. That way, when you say

delete []Widget;

the brackets tell the compiler it’s an array of objects, so the compiler generates code to look for the number of objects in the array and to call the destructor that many times. You can see that, even though the array operator new( ) and operator delete( ) are only called once for the entire array chunk, the default constructor and destructor are called for each object in the array.

Constructor calls

Considering that

MyType* f = new MyType;

calls new to allocate a MyType-sized piece of storage, then invokes the MyType constructor on that storage, what happens if the storage allocation in new fails? The constructor is not called in that case, so although you still have an unsuccessfully created object, at least you haven’t invoked the constructor and handed it a zero this pointer. Here’s an example to prove it:

//: C13:NoMemory.cpp
// Constructor isn't called if new fails
#include <iostream>
#include <new> // bad_alloc definition
using namespace std;

class NoMemory {
public:
  NoMemory() {
    cout << "NoMemory::NoMemory()" << endl;
  }
  void* operator new(size_t sz) throw(bad_alloc){
    cout << "NoMemory::operator new" << endl;
    throw bad_alloc(); // "Out of memory"
  }
};

int main() {
  NoMemory* nm = 0;
  try {
    nm = new NoMemory;
  } catch(bad_alloc) {
    cerr << "Out of memory exception" << endl;
  }
  cout << "nm = " << nm << endl;
} ///:~

When the program runs, it does not print the constructor message, only the message from operator new( ) and the message in the exception handler. Because new never returns, the constructor is never called so its message is not printed.

It’s important that nm be initialized to zero because the new expression never completes, and the pointer should be zero to make sure you don’t misuse it. However, you should actually do more in the exception handler than just print out a message and continue on as if the object had been successfully created. Ideally, you will do something that will cause the program to recover from the problem, or at the least exit after logging an error.

In earlier versions of C++ it was standard practice to return zero from new if storage allocation failed. That would prevent construction from occurring. However, if you try to return zero from new with a Standard-conforming compiler, it should tell you that you ought to throw bad_alloc instead.

placement new & delete

There are two other, less common, uses for overloading operator new( ).

You may want to place an object in a specific location in memory. This is especially important with hardware-oriented embedded systems where an object may be synonymous with a particular piece of hardware.
You may want to be able to choose from different allocators when calling new.

Both of these situations are solved with the same mechanism: The overloaded operator new( ) can take more than one argument. As you’ve seen before, the first argument is always the size of the object, which is secretly calculated and passed by the compiler. But the other arguments can be anything you want – the address you want the object placed at, a reference to a memory allocation function or object, or anything else that is convenient for you.

The way that you pass the extra arguments to operator new( ) during a call may seem slightly curious at first. You put the argument list (without the size_t argument, which is handled by the compiler) after the keyword new and before the class name of the object you’re creating. For example,

X* xp = new(a) X;

will pass a as the second argument to operator new( ). Of course, this can work only if such an operator new( ) has been declared.

Here’s an example showing how you can place an object at a particular location:

//: C13:PlacementOperatorNew.cpp
// Placement with operator new()
#include <cstddef> // Size_t
#include <iostream>
using namespace std;

class X {
  int i;
public:
  X(int ii = 0) : i(ii) {
    cout << "this = " << this << endl;
  }
  ~X() {
    cout << "X::~X(): " << this << endl;
  }
  void* operator new(size_t, void* loc) {
    return loc;
  }
};

int main() {
  int l[10];
  cout << "l = " << l << endl;
  X* xp = new(l) X(47); // X at location l
  xp->X::~X(); // Explicit destructor call
  // ONLY use with placement!
} ///:~

Notice that operator new only returns the pointer that’s passed to it. Thus, the caller decides where the object is going to sit, and the constructor is called for that memory as part of the new-expression.

Although this example shows only one additional argument, there’s nothing to prevent you from adding more if you need them for other purposes.

A dilemma occurs when you want to destroy the object. There’s only one version of operator delete, so there’s no way to say, “Use my special deallocator for this object.” You want to call the destructor, but you don’t want the memory to be released by the dynamic memory mechanism because it wasn’t allocated on the heap.

The answer is a very special syntax. You can explicitly call the destructor, as in

xp->X::~X(); // Explicit destructor call

A stern warning is in order here. Some people see this as a way to destroy objects at some time before the end of the scope, rather than either adjusting the scope or (more correctly) using dynamic object creation if they want the object’s lifetime to be determined at runtime. You will have serious problems if you call the destructor this way for an ordinary object created on the stack because the destructor will be called again at the end of the scope. If you call the destructor this way for an object that was created on the heap, the destructor will execute, but the memory won’t be released, which probably isn’t what you want. The only reason that the destructor can be called explicitly this way is to support the placement syntax for operator new.

There’s also a placement operator delete that is only called if a constructor for a placement new expression throws an exception (so that the memory is automatically cleaned up during the exception). The placement operator delete has an argument list that corresponds to the placement operator new that is called before the constructor throws the exception. This topic will be explored in the exception handling chapter in Volume 2.

Summary

It’s convenient and optimally efficient to create automatic objects on the stack, but to solve the general programming problem you must be able to create and destroy objects at any time during a program’s execution, particularly to respond to information from outside the program. Although C’s dynamic memory allocation will get storage from the heap, it doesn’t provide the ease of use and guaranteed construction necessary in C++. By bringing dynamic object creation into the core of the language with new and delete, you can create objects on the heap as easily as making them on the stack. In addition, you get a great deal of flexibility. You can change the behavior of new and delete if they don’t suit your needs, particularly if they aren’t efficient enough. Also, you can modify what happens when the heap runs out of storage.

Exercises

Solutions to selected exercises can be found in the electronic document The Thinking in C++ Annotated Solution Guide, available for a small fee from www.BruceEckel.com.

Create a class Counted that contains an int id and a static int count. The default constructor should begin:
Counted( ) : id(count++) {. It should also print its id and that it’s being created. The destructor should print that it’s being destroyed and its id. Test your class.
Prove to yourself that new and delete always call the constructors and destructors by creating an object of class Counted (from Exercise 1) with new and destroying it with delete. Also create and destroy an array of these objects on the heap.
Create a PStash object and fill it with new objects from Exercise 1. Observe what happens when this PStash object goes out of scope and its destructor is called.
Create a vector< Counted*> and fill it with pointers to new Counted objects (from Exercise 1). Move through the vector and print the Counted objects, then move through again and delete each one.
Repeat Exercise 4, but add a member function f( ) to Counted that prints a message. Move through the vector and call f( ) for each object.
Repeat Exercise 5 using a PStash.
Repeat Exercise 5 using Stack4.h from Chapter 9.
Dynamically create an array of objects of class Counted (from Exercise 1). Call delete for the resulting pointer, without the square brackets. Explain the results.
Create an object of class Counted (from Exercise 1) using new, cast the resulting pointer to a void*, and delete that. Explain the results.
Execute NewHandler.cpp on your machine to see the resulting count. Calculate the amount of free store available for your program.
Create a class with an overloaded operator new and delete, both the single-object versions and the array versions. Demonstrate that both versions work.
Devise a test for Framis.cpp to show yourself approximately how much faster the custom new and delete run than the global new and delete.
Modify NoMemory.cpp so that it contains an array of int and so that it actually allocates memory instead of throwing bad_alloc. In main( ), set up a while loop like the one in NewHandler.cpp to run out of memory and see what happens if your operator new does not test to see if the memory is successfully allocated. Then add the check to your operator new and throw bad_alloc.
Create a class with a placement new with a second argument of type string. The class should contain a static vector<string> where the second new argument is stored. The placement new should allocate storage as normal. In main( ), make calls to your placement new with string arguments that describe the calls (you may want to use the preprocessor’s __FILE__ and __LINE__ macros).
Modify ArrayOperatorNew.cpp by adding a static vector<Widget*> that adds each Widget address that is allocated in operator new( ) and removes it when it is released via operator delete( ). (You may need to look up information about vector in your Standard C++ Library documentation or in the 2^nd volume of this book, available at the Web site.) Create a second class called MemoryChecker that has a destructor that prints out the number of Widget pointers in your vector. Create a program with a single global instance of MemoryChecker and in main( ), dynamically allocate and destroy several objects and arrays of Widget. Show that MemoryChecker reveals memory leaks.

[50] There is a special syntax called placement new that allows you to call a constructor for a pre-allocated piece of memory. This is introduced later in the chapter.

[ Previous Chapter ] [ Table of Contents ] [ Index ] [ Next Chapter ]
Last Update:09/27/2001