What is a copy constructor?
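A copy constructor is a special constructor for a class/struct that is
used to make a copy of an existing instance. The copy constructor for
the class MyClass normally has the following signature:

    MyClass( const MyClass& other );

According to the C++ standard, a constructor whose first parameter is
MyClass&, volatile MyClass&, or const volatile MyClass& (with any
further parameters defaulted) is also a copy constructor:

    MyClass( MyClass& other );

but this form cannot copy from a const object or from a temporary, so
the const reference form is the one to prefer.
Note that neither of the following constructors, despite the fact that
they could do the same thing as a copy constructor, is a copy
constructor:

    MyClass( MyClass* other );
    MyClass( const MyClass* other );

nor is my personal favorite, which would be a way to create infinite
recursion in C++ if the compiler accepted it:

    MyClass( MyClass other );

Passing other by value would itself require a call to the copy
constructor, so the standard makes this declaration ill-formed.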
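As a rough illustration of when a copy constructor runs, the sketch below counts copy constructions. The class Tracked, its static counter copies, and the helper functions are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Hypothetical class used only for illustration: it counts how many
// times its copy constructor has run.
class Tracked {
public:
    static int copies;                        // copy constructions so far

    Tracked() {}
    Tracked( const Tracked& ) { ++copies; }   // the copy constructor
    Tracked( const Tracked* ) {}              // takes a pointer: not a copy constructor
};

int Tracked::copies = 0;

// A by-value parameter is initialized from the caller's argument.
void byValue( Tracked ) {}

void demonstrateCopies() {
    Tracked::copies = 0;
    Tracked a;
    Tracked b( a );      // direct-initialization from a named object: copy 1
    Tracked c = a;       // copy-initialization from a named object: copy 2
    byValue( a );        // initializing a by-value parameter: copy 3
    Tracked d( &a );     // pointer constructor: not counted as a copy
    (void)b; (void)c; (void)d;
    assert( Tracked::copies == 3 );
}
```

Calling demonstrateCopies() from main() exercises all four initializations; only the three that go through the const reference constructor are counted.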
When do I need to write a copy constructor?
First, you should understand that if you do not declare a copy
constructor, the compiler gives you one implicitly. The implicit
copy constructor does a member-wise copy of the source object.
For example, given the class:

    #include <string>

    class MyClass {
        int x;
        char c;
        std::string s;
    };

the compiler-provided copy constructor is exactly equivalent to:

    MyClass::MyClass( const MyClass& other ) :
        x( other.x ), c( other.c ), s( other.s )
    {}

In many cases, this is sufficient. However, there are certain
circumstances where the member-wise copy version is not good enough.
By far, the most common reason the default copy constructor is not
sufficient is that the object contains raw pointers and you need
to take a "deep" copy of what they point to. That is, you don't want
to copy the pointer itself; rather, you want to copy what the pointer
points to. Why do you need to take "deep" copies? This is
typically because the instance owns the pointer; that is, the
instance is responsible for calling delete on the pointer at some
point (probably in the destructor). If two objects end up calling
delete on the same non-NULL pointer, the behavior is undefined, and
heap corruption is the usual result.
Rarely, you will come across a class that does not contain raw
pointers yet for which the default copy constructor is not sufficient.
An example of this is a reference-counted object, whose copy
constructor must also update the reference count.
boost::shared_ptr<> is an example.
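To make the deep-copy idea concrete, here is a minimal sketch of a class that owns a heap-allocated buffer through a raw pointer and supplies its own copy constructor. The class OwnedBuffer and its members are hypothetical names chosen for this example, and assignment is deliberately disabled because it is outside the scope of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class for illustration: owns a heap-allocated character buffer.
class OwnedBuffer {
    std::size_t size;   // bytes allocated, including the terminating '\0'
    char*       data;   // owned: the destructor calls delete [] on it

    // Declared private and left undefined so that the implicit
    // (shallow) assignment operator cannot be used by accident.
    OwnedBuffer& operator=( const OwnedBuffer& );

public:
    explicit OwnedBuffer( const char* text )
        : size( std::strlen( text ) + 1 ), data( new char[ size ] )
    {
        std::memcpy( data, text, size );
    }

    // Deep copy: allocate a separate buffer and copy the contents,
    // rather than copying the pointer value itself.
    OwnedBuffer( const OwnedBuffer& other )
        : size( other.size ), data( new char[ other.size ] )
    {
        std::memcpy( data, other.data, size );
    }

    ~OwnedBuffer() { delete [] data; }

    const char* c_str() const { return data; }
    void setFirst( char ch ) { data[ 0 ] = ch; }
};

void demonstrateDeepCopy() {
    OwnedBuffer original( "hello" );
    OwnedBuffer copy( original );
    copy.setFirst( 'j' );   // modifies only the copy's buffer

    assert( std::strcmp( original.c_str(), "hello" ) == 0 );
    assert( std::strcmp( copy.c_str(), "jello" ) == 0 );
    assert( original.c_str() != copy.c_str() );   // two distinct buffers
}
```

Because each instance deletes only the buffer it allocated, both destructors can run safely. With the implicit member-wise copy constructor, both objects would hold the same pointer and delete it twice.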
What is an assignment operator?
The assignment operator for a class is what allows you to use
= to assign one instance to another. For example:

    MyClass c1, c2;
    c1 = c2; // assigns c2 to c1

There are actually several different signatures that an
assignment operator can have:
(1) MyClass& operator=( const MyClass& rhs );
(2) MyClass& operator=( MyClass& rhs );
(3) MyClass& operator=( MyClass rhs );
(4) const MyClass& operator=( const MyClass& rhs );
(5) const MyClass& operator=( MyClass& rhs );
(6) const MyClass& operator=( MyClass rhs );
(7) MyClass operator=( const MyClass& rhs );
(8) MyClass operator=( MyClass& rhs );
(9) MyClass operator=( MyClass rhs );
These signatures permute both the return type and the parameter
type. While the return type may not be too important, choice
of the parameter type is critical.
(2), (5), and (8) pass the right-hand side by non-const reference,
and are not recommended. The problem with these signatures is that
the following code would not compile:

    MyClass c1;
    c1 = MyClass( 5, 'a', "Hello World" ); // assuming this constructor exists

This is because the right-hand side of this assignment expression is
a temporary (unnamed) object, and the C++ standard does not allow a
temporary object to bind to a non-const reference parameter.
This leaves us with passing the right-hand side either by value or
by const reference. Although it would seem that passing by const
reference is more efficient than passing by value, we will see later
that for reasons of exception safety, making a temporary copy of the
source object is unavoidable, and therefore passing by value allows
us to write fewer lines of code.
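The effect of the parameter type can be seen in a small sketch. The two classes below, ByConstRef and ByNonConstRef, are hypothetical and differ only in how operator= takes its right-hand side; the line that would fail to compile is left commented out.

```cpp
#include <cassert>

// Hypothetical class: assignment takes its right-hand side by const reference.
class ByConstRef {
    int value;
public:
    explicit ByConstRef( int v = 0 ) : value( v ) {}
    ByConstRef& operator=( const ByConstRef& rhs ) {
        value = rhs.value;
        return *this;
    }
    int get() const { return value; }
};

// Hypothetical class: assignment takes its right-hand side by non-const reference.
class ByNonConstRef {
    int value;
public:
    explicit ByNonConstRef( int v = 0 ) : value( v ) {}
    ByNonConstRef& operator=( ByNonConstRef& rhs ) {
        value = rhs.value;
        return *this;
    }
    int get() const { return value; }
};

void demonstrateAssignment() {
    ByConstRef a;
    a = ByConstRef( 5 );         // OK: a temporary binds to a const reference
    assert( a.get() == 5 );

    ByNonConstRef b, named( 7 );
    b = named;                   // OK: 'named' is an lvalue
    assert( b.get() == 7 );
    // b = ByNonConstRef( 9 );   // error: a temporary cannot bind to ByNonConstRef&
}
```

Uncommenting the last assignment should produce a compile error, since no operator= of ByNonConstRef can accept a temporary.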
When do I need to write an assignment operator?
First, you should understand that if you do not declare an
assignment operator, the compiler gives you one implicitly. The
implicit assignment operator does member-wise assignment of
each data member from the source object. For example, using
the class above, the compiler-provided assignment operator is
exactly equivalent to:

    MyClass& MyClass::operator=( const MyClass& rhs ) {
        x = rhs.x;
        c = rhs.c;
        s = rhs.s;
        return *this;
    }

In general, any time you need to write your own custom copy
constructor, you also need to write a custom assignment operator.
What is meant by Exception Safe code?
A short interlude on exception safety is in order, because programmers
often confuse exception handling with exception safety.
A function that modifies some "global" state (for example, a function
that writes through a reference parameter, or a member function that
modifies the data members of its instance) is said to be exception safe
if it leaves that state well-defined when an exception is thrown at any
point during the function.
What does this really mean? Well, let's take a rather contrived
(and trite) example. This class wraps an array of some user-specified
type. It has two data members: a pointer to the array and the number of
elements in the array.

    #include <cstddef>

    template< typename T >
    class MyArray {
        std::size_t numElements;
        T* pElements;
    public:
        std::size_t count() const { return numElements; }
        MyArray& operator=( const MyArray& rhs );
    };

Now, assignment of one MyArray to another is easy, right?

    template< typename T >
    MyArray<T>& MyArray<T>::operator=( const MyArray& rhs ) {
        if( this != &rhs ) {
            delete [] pElements;                    // release the old array
            pElements = new T[ rhs.numElements ];   // allocate room for rhs's elements
            for( std::size_t i = 0; i < rhs.numElements; ++i )
                pElements[ i ] = rhs.pElements[ i ];   // copy element by element
            numElements = rhs.numElements;
        }
        return *this;
    }

Well, not so fast. The problem is, the line

    pElements[ i ] = rhs.pElements[ i ];

could throw an exception. This line invokes operator= for type T, which
could be some user-defined type whose assignment operator might throw an
exception, perhaps an out-of-memory (std::bad_alloc) exception or some
other exception that the programmer of the user-defined type created.
What would happen if it did throw, say on copying the 3rd element of 10
total? Well, the stack is unwound until an appropriate handler is found.
Meanwhile, what is the state of our object? Well, we've reallocated our
array to hold 10 T's, but we've copied only 2 of them successfully. The
third one failed midway, and no attempt was ever made to copy the
remaining seven. Furthermore, we haven't even changed numElements, so
whatever it held before, it still holds. This instance will lie about the
number of elements it contains if we call count() at this point.
It was surely never the intent of MyArray's programmer to have count()
give a wrong answer. Worse yet, there could be other member functions that
rely more heavily (even to the point of crashing) on numElements being correct.
Yikes -- this instance is clearly a time bomb waiting to go off.
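The failure scenario described above can be reproduced with a small sketch. The element type Fragile, its countdown assignmentsUntilThrow, the MyArray constructor and destructor, and the function countAfterFailedAssignment() are all assumptions added for this example; only operator= follows the unsafe version from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical element type: its assignment operator throws when a
// preset countdown of assignments reaches zero.
struct Fragile {
    static int assignmentsUntilThrow;   // a negative value never throws
    int value;

    Fragile() : value( 0 ) {}
    Fragile& operator=( const Fragile& rhs ) {
        if( --assignmentsUntilThrow == 0 )
            throw std::runtime_error( "assignment failed" );
        value = rhs.value;
        return *this;
    }
};
int Fragile::assignmentsUntilThrow = -1;

template< typename T >
class MyArray {
    std::size_t numElements;
    T*          pElements;
    MyArray( const MyArray& );   // copy construction is not needed in this sketch
public:
    // Constructor and destructor are assumed here so the class is usable.
    explicit MyArray( std::size_t n ) : numElements( n ), pElements( new T[ n ] ) {}
    ~MyArray() { delete [] pElements; }
    std::size_t count() const { return numElements; }
    MyArray& operator=( const MyArray& rhs );
};

// The exception-unsafe assignment operator from the text.
template< typename T >
MyArray<T>& MyArray<T>::operator=( const MyArray& rhs ) {
    if( this != &rhs ) {
        delete [] pElements;
        pElements = new T[ rhs.numElements ];
        for( std::size_t i = 0; i < rhs.numElements; ++i )
            pElements[ i ] = rhs.pElements[ i ];
        numElements = rhs.numElements;   // skipped if a copy above throws
    }
    return *this;
}

// Returns what count() reports after a failed assignment from a
// 10-element array into a 2-element array.
std::size_t countAfterFailedAssignment() {
    MyArray<Fragile> target( 2 );
    MyArray<Fragile> source( 10 );
    Fragile::assignmentsUntilThrow = 3;   // the third element assignment throws
    try {
        target = source;
    } catch( const std::runtime_error& ) {
        // The exception is handled, but target has been left half-updated.
    }
    Fragile::assignmentsUntilThrow = -1;  // restore the non-throwing default
    return target.count();
}

void demonstrateBrokenState() {
    // The buffer now has room for 10 elements, yet count() still says 2.
    assert( countAfterFailedAssignment() == 2 );
}
```

In this direction the mismatch happens to be harmless at destruction time. If the target had started with 10 elements and the source with 2, count() would report more elements than the buffer holds, and any member function that trusted it could read past the end of the array.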
This implementation of operator= is not exception safe: if an exception is
thrown during execution of the function, there is no telling what the state
of the object is; we can only assume that it is in such a bad state (i.e.,
it violates some of its own invariants) as to be unusable. If the object is
in a bad state, it might not even be possible to destroy the object without
crashing the program or causing MyArray's destructor to throw another
exception. And we know that destructors run while the stack is being unwound
on the way to a handler. If a destructor throws while the stack is being
unwound, std::terminate is called and the program ends.