Tuesday, July 5, 2011

Copy Constructors and Assignment Operators: The Basic Rules

Copy constructors and assignment operators are easy to understand once you realize that they are always there, even if you don't write them, and that their default behavior is one you probably already understand. Every struct and class has a default copy constructor and a default assignment operator. Here is an example.

Start with a struct called SShape with a few fields:

struct SShape {
        int ntop;
        int nleft;
        int nbottom;
        int nright;
};

Yes, even a struct as simple as this has a copy constructor and assignment operator. Now, look at this code:

1: SShape s1 = { 0, 0, 100, 200 };
2: SShape s2( s1 );
3: SShape s3;
4: s3 = s1;
Line 2 invokes the default copy constructor for s2, copying into it the members from s1. Line 4 does something similar, but invokes the default assignment operator of s3, copying into it the members from s1. The difference between the two is that the copy constructor of the target is invoked when the source object is passed in at the time the target is constructed, as on line 2. The assignment operator is invoked when the target object already exists, as on line 4.
To see what the default implementation does, consider what line 4 amounts to:

s3.ntop    = s1.ntop;
s3.nleft   = s1.nleft;
s3.nbottom = s1.nbottom;
s3.nright  = s1.nright;
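
The memberwise behavior described above can be checked with a small sketch. The helper names sameMembers and checkDefaultCopies are made up for this illustration; only SShape comes from the article.

```cpp
#include <cassert>

// Same layout as the SShape struct in the article.
struct SShape {
    int ntop;
    int nleft;
    int nbottom;
    int nright;
};

// Hypothetical helper: true when every member of a equals the matching member of b.
bool sameMembers( const SShape& a, const SShape& b ) {
    return a.ntop == b.ntop && a.nleft == b.nleft &&
           a.nbottom == b.nbottom && a.nright == b.nright;
}

// Hypothetical helper: exercises both compiler-generated operations.
void checkDefaultCopies() {
    SShape s1 = { 0, 0, 100, 200 };
    SShape s2( s1 );              // copy constructor: s2 is being created
    SShape s3 = { 1, 1, 1, 1 };
    s3 = s1;                      // assignment operator: s3 already exists

    assert( sameMembers( s2, s1 ) );
    assert( sameMembers( s3, s1 ) );

    s2.ntop = 42;                 // the copies are independent objects
    assert( s1.ntop == 0 );
}
```

Because every member is a plain int, copying member by member is all that is needed here.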

So, if the default copy constructor and assignment operator do all this for you, why would anyone implement their own? The problem with the default implementations is that a simple copy of the members may not be the right way to clone an object. For instance, what if one of the members is a pointer to memory that the class allocates? Simply copying the pointer isn't enough, because you then have two objects holding the same pointer value, and both will try to free the memory associated with that pointer when they are destroyed. Look at an example class:

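#include <cstring>   // strlen, strcpy

class CContact {
        char *cname;   // owned by this object; released in the destructor
        int nage;
public:
        CContact( const char *cName, int nAge ) {
                cname = new char[strlen( cName ) + 1];
                strcpy( cname, cName );
                nage = nAge;
        }
        ~CContact() {
                delete[] cname;
        }
};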

Now, look at some code that uses this class:

CContact oc1 ("Saurav", 40);
CContact oc2 = oc1;
The problem is that oc1 and oc2 will have the same pointer value in the "cname" field. When oc2 goes out of scope, its destructor is called and deletes the memory that was allocated when oc1 was constructed (because the cname field of both objects holds the same pointer value). Then, when oc1 is destroyed, it attempts to delete the same pointer value again, and a "double free" occurs. At best, the heap will catch the problem and report an error. At worst, the same pointer value may by then have been handed out to another object, the delete will free the wrong memory, and the result is a difficult-to-find bug.
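
The aliasing can be observed safely with a stripped-down sketch. SShallowContact and copySharesBuffer are hypothetical names for this illustration; the struct deliberately has no destructor, so the shared buffer is freed exactly once by hand instead of twice.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for CContact WITHOUT a destructor, so the shared
// pointer can be inspected without triggering a double delete.
struct SShallowContact {
    char* cname;
    int nage;
};

// Returns true if a write through the copy is visible through the original.
bool copySharesBuffer() {
    SShallowContact a;
    a.cname = new char[7];              // "Saurav" plus the terminating NUL
    std::strcpy( a.cname, "Saurav" );
    a.nage = 40;

    SShallowContact b = a;              // default copy: copies the pointer, not the text
    assert( a.cname == b.cname );       // both objects point at one buffer

    b.cname[0] = 'L';                   // writing through b ...
    bool visibleThroughA = ( a.cname[0] == 'L' );   // ... changes what a sees

    delete[] a.cname;                   // free exactly once; b.cname now dangles
    return visibleThroughA;
}
```

With the destructor from CContact in place, both a and b would run delete[] on that one buffer.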
The fix is to add an explicit copy constructor and an explicit assignment operator to the class, like so:

CContact( const CContact& rhs ) {
        cname = new char[strlen( rhs.cname ) + 1];
        strcpy( cname, rhs.cname );
        nage = rhs.nage;
}
CContact& operator=( const CContact& rhs ) {
        char* tempName = new char[strlen( rhs.cname ) + 1];
        // Copy before releasing the old buffer, so that self-assignment
        // (where rhs.cname is this object's own cname) still reads valid memory.
        strcpy( tempName, rhs.cname );
        delete[] cname;
        cname = tempName;
        nage = rhs.nage;
        return *this;
}
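
Putting the pieces together, a complete version of the class might look like the sketch below. The accessors name() and age() and the function deepCopyIsIndependent are assumptions added only so the behavior can be checked; they are not part of the original class. The sketch copies into the temporary buffer before releasing the old one, which also keeps self-assignment safe.

```cpp
#include <cassert>
#include <cstring>

class CContact {
    char* cname;   // owned by this object
    int nage;
public:
    CContact( const char* cName, int nAge ) : nage( nAge ) {
        cname = new char[std::strlen( cName ) + 1];
        std::strcpy( cname, cName );
    }
    CContact( const CContact& rhs ) : nage( rhs.nage ) {
        cname = new char[std::strlen( rhs.cname ) + 1];
        std::strcpy( cname, rhs.cname );
    }
    CContact& operator=( const CContact& rhs ) {
        char* tempName = new char[std::strlen( rhs.cname ) + 1];
        std::strcpy( tempName, rhs.cname );   // copy while the source is still valid
        delete[] cname;
        cname = tempName;
        nage = rhs.nage;
        return *this;
    }
    ~CContact() { delete[] cname; }

    // Hypothetical accessors, added for the check below.
    const char* name() const { return cname; }
    int age() const { return nage; }
};

// Hypothetical check: each copy must own its own buffer with the same text.
bool deepCopyIsIndependent() {
    CContact oc1( "Saurav", 40 );
    CContact oc2 = oc1;                 // copy constructor
    CContact oc3( "placeholder", 0 );
    oc3 = oc1;                          // assignment operator
    CContact& alias = oc3;
    oc3 = alias;                        // self-assignment must leave oc3 intact

    assert( oc1.name() != oc2.name() ); // different buffers
    return oc1.name() != oc3.name()
        && std::strcmp( oc2.name(), "Saurav" ) == 0
        && std::strcmp( oc3.name(), "Saurav" ) == 0
        && oc3.age() == 40;
}
```

When the three objects go out of scope, each destructor releases a different buffer, so no double free occurs.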

Now, the code that uses the class will function properly. Note that the difference between the copy constructor and assignment operator above is that the copy constructor can assume that fields of the object have not been set yet (because the object is just being constructed). However, the assignment operator must handle the case when the fields already have valid values. The assignment operator deletes the contents of the existing string before assigning the new string. You might ask why the tempName local variable is used, and why the code isn't written as follows instead:
delete[] cname;
cname = new char[strlen( rhs.cname ) + 1];
strcpy( cname, rhs.cname );
nage = rhs.nage;

The problem with this code is that if the new operator throws an exception, the object is left in a bad state, because the cname field has already been freed by the previous statement. By performing all the operations that could fail first, and replacing the fields only once there is no longer any chance of an exception, the code becomes exception safe.
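
The effect of that ordering can be demonstrated by forcing an allocation failure. Everything in this sketch is hypothetical scaffolding: the g_failNextAlloc flag, the allocName helper, and the CSafeName class exist only to simulate new throwing at a chosen moment.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical test hook: when true, the next call to allocName() throws.
static bool g_failNextAlloc = false;

// Hypothetical allocator wrapper that can simulate an out-of-memory failure.
static char* allocName( std::size_t n ) {
    if ( g_failNextAlloc ) {
        g_failNextAlloc = false;
        throw std::bad_alloc();
    }
    return new char[n];
}

class CSafeName {
    char* cname;
public:
    explicit CSafeName( const char* s )
        : cname( allocName( std::strlen( s ) + 1 ) ) {
        std::strcpy( cname, s );
    }
    CSafeName( const CSafeName& rhs )
        : cname( allocName( std::strlen( rhs.cname ) + 1 ) ) {
        std::strcpy( cname, rhs.cname );
    }
    CSafeName& operator=( const CSafeName& rhs ) {
        // The only step that can throw comes first; nothing has changed yet.
        char* tempName = allocName( std::strlen( rhs.cname ) + 1 );
        std::strcpy( tempName, rhs.cname );
        delete[] cname;      // from here on, nothing throws
        cname = tempName;
        return *this;
    }
    ~CSafeName() { delete[] cname; }
    const char* name() const { return cname; }
};

// Returns true if a failed assignment leaves the target unchanged.
bool survivesFailedAssignment() {
    CSafeName a( "first" );
    CSafeName b( "second" );
    g_failNextAlloc = true;
    bool threw = false;
    try {
        a = b;               // allocation fails inside operator=
    } catch ( const std::bad_alloc& ) {
        threw = true;
    }
    assert( threw );
    return std::strcmp( a.name(), "first" ) == 0;
}
```

Had the operator deleted cname before allocating, a would be holding a freed pointer after the exception, and its destructor would free it a second time.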
Note: The reason the assignment operator returns a reference to the object is so that code such as the following will work:

oc1 = oc2 = oc3;

One might think that an explicit copy constructor and assignment operator are necessary whenever a class or struct contains pointer fields. This is not the case. In the example above, the explicit methods were needed because the data pointed to by the field is owned by the object. If the pointer is a "back" (or weak) pointer, or a reference to some other object that the class is not responsible for releasing, it may be perfectly valid for more than one object to share the value in a pointer field.
There are also situations where a class field refers to some entity that cannot be copied, or that it does not make sense to copy. For instance, what if the field is a handle to a file that the object created? Copying the object might then require creating another file with its own handle. But it is also possible that more than one file cannot be created for the given object. In that case, there cannot be a valid copy constructor or assignment operator for the class. As you have seen earlier, simply not implementing them does not mean that they won't exist, because the compiler supplies the default versions when explicit versions aren't specified. The solution is to declare a copy constructor and an assignment operator in the class and mark them as private. As long as no code tries to copy the object, everything works fine, but as soon as code is introduced that attempts to copy the object, the compiler reports an error saying that the copy constructor or assignment operator cannot be accessed.
To create a private copy constructor and assignment operator, you do not need to supply implementations for the methods. Simply declaring them in the class definition is enough.


private:
        CContact( const CContact& rhs );
        CContact& operator=( const CContact& rhs );

This disables the default copy semantics that C++ would otherwise supply for the class.
Some people wish that C++ did not provide an implicit copy constructor and assignment operator when the programmer doesn't supply them. To get that behavior, these programmers always declare a private copy constructor and assignment operator when they define a new class, and thus the above three lines are a common pattern. When used, this pattern prevents anyone from copying their objects unless they explicitly support such an operation.
This is good practice: Unless you explicitly need to support deep copying of the instances, disable copying using the above technique.
(Another advantage of disabling copying is that auto_ptr can be used to manage a data member's lifetime, but that's probably another article.)
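
The private-declaration pattern can also be checked at compile time. This is a sketch under some assumptions: CFileOwner and its nhandle field are hypothetical, and the type traits used require a C++11 or later compiler, which the article itself does not assume.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class that owns something which must not be duplicated.
class CFileOwner {
public:
    CFileOwner() : nhandle( -1 ) {}
    int handle() const { return nhandle; }
private:
    CFileOwner( const CFileOwner& rhs );             // declared, never defined
    CFileOwner& operator=( const CFileOwner& rhs );  // declared, never defined
    int nhandle;   // placeholder for a real file handle
};

// Both checks happen at compile time: outside code cannot reach the
// private copy operations, so the traits report false.
static_assert( !std::is_copy_constructible<CFileOwner>::value,
               "CFileOwner must not be copy constructible" );
static_assert( !std::is_copy_assignable<CFileOwner>::value,
               "CFileOwner must not be copy assignable" );

// Runtime view of the same facts.
bool canBeCopied() {
    return std::is_copy_constructible<CFileOwner>::value
        || std::is_copy_assignable<CFileOwner>::value;
}
```

Writing `CFileOwner b( a );` or `b = a;` outside the class fails to compile with an access error. In C++11 and later, the same intent can be stated directly by marking the two functions `= delete`.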
