Friday, May 8, 2015
Thursday, April 9, 2015
The fundamental building block of OO software.
A class defines a data type, much like a struct would be in C. In a computer science sense, a type consists of both a set of states and a set of operations which transition between those states. Thus int is a type because it has both a set of states and it has operations like i + j or i++, etc. In exactly the same way, a class provides a set of (usually public) operations, and a set of (usually non-public) data bits representing the abstract values that instances of the type can have.
You can imagine that int is a class that has member functions called operator++, etc. (int isn’t really a class, but the basic analogy is this: a class is a type, much like int is a type.)
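To make the analogy concrete, here is a minimal sketch of a class that, like int, has a set of states and a set of operations that move between them. The name Counter and all of its members are hypothetical, chosen only for illustration.

```cpp
// Hypothetical class for illustration: a small counter type.
class Counter {
public:
    Counter() : value_(0) {}

    // The (public) operations: each one moves the object to a new state.
    Counter& operator++() { ++value_; return *this; }   // prefix ++
    Counter operator++(int) {                            // postfix ++
        Counter old(*this);
        ++value_;
        return old;
    }

    int value() const { return value_; }

private:
    int value_;   // The (non-public) data bits holding the abstract value.
};
```

A user writes ++c or c++ on a Counter much as they would on an int, without touching value_ directly.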
Note: a C programmer can think of a class as a C struct whose members default to private. But if that’s all you think of a class, then you probably need to experience a personal paradigm shift.
A region of storage with associated semantics.
After the declaration int i; we say that “i is an object of type int.” In OO/C++, “object” usually means “an instance of a class.” Thus a class defines the behavior of possibly many objects (instances).
When it provides a simplified view of a chunk of software, and it is expressed in the vocabulary of a user (where a “chunk” is normally a class or a tight group of classes, and a “user” is another developer rather than the ultimate customer).
- The “simplified view” means unnecessary details are intentionally hidden. This reduces the user’s defect-rate.
- The “vocabulary of users” means users don’t need to learn a new set of words and concepts. This reduces the user’s learning curve.
Preventing unauthorized access to some piece of information or functionality.
The key money-saving insight is to separate the volatile part of some chunk of software from the stable part. Encapsulation puts a firewall around the chunk, which prevents other chunks from accessing the volatile parts; other chunks can only access the stable parts. This prevents the other chunks from breaking if (when!) the volatile parts are changed. In the context of OO software, a “chunk” is normally a class or a tight group of classes.
The “volatile parts” are the implementation details. If the chunk is a single class, the volatile part is normally encapsulated using the private and/or protected keywords. If the chunk is a tight group of classes, encapsulation can be used to deny access to entire classes in that group. Inheritance can also be used as a form of encapsulation.
The “stable parts” are the interfaces. A good interface provides a simplified view in the vocabulary of a user, and is designed from the outside-in (here a “user” means another developer, not the end-user who buys the completed application). If the chunk is a single class, the interface is simply the class’s public member functions and friend functions. If the chunk is a tight group of classes, the interface can include several of the classes in the chunk.
Designing a clean interface and separating that interface from its implementation merely allows users to use the interface. But encapsulating (putting “in a capsule”) the implementation forces users to use the interface.
In C, encapsulation was accomplished by making things static in a compilation unit or module. This prevented another module from accessing the static stuff. (By the way, C++98 deprecated static data at file-scope in favor of unnamed namespaces; C++11 removed that deprecation, but in C++ an unnamed namespace is still the usual way to do this.)
Unfortunately this approach doesn’t support multiple instances of the data, since there is no direct support for making multiple instances of a module’s static data. If multiple instances were needed in C, programmers typically used a struct. But unfortunately C structs don’t support encapsulation. This exacerbates the tradeoff between safety (information hiding) and usability (multiple instances).
In C++, you can have both multiple instances and encapsulation via a class. The public part of a class contains the class’s interface, which normally consists of the class’s public member functions and its friend functions. The private and/or protected parts of a class contain the class’s implementation, which is typically where the data lives.
The end result is like an “encapsulated struct.” This reduces the tradeoff between safety (information hiding) and usability (multiple instances).
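As a rough illustration of getting both at once, the sketch below shows a small class whose data is private yet which can be instantiated as many times as needed. The class IntStack, its capacity of 16, and its member names are all hypothetical.

```cpp
#include <cstddef>

// Hypothetical example: a fixed-capacity stack of ints.
class IntStack {
public:
    IntStack() : size_(0) {}

    // Interface: the only way other code can change the stack.
    bool push(int x) {
        if (size_ == kCapacity) return false;   // full
        data_[size_++] = x;
        return true;
    }
    bool pop(int& out) {
        if (size_ == 0) return false;           // empty
        out = data_[--size_];
        return true;
    }
    std::size_t size() const { return size_; }

private:
    // Implementation: free to change without breaking users.
    static const std::size_t kCapacity = 16;
    int data_[kCapacity];
    std::size_t size_;
};
```

Each IntStack object carries its own data_ and size_, so two stacks never interfere, and no outside code can reach into either one.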
How can I prevent other programmers from violating encapsulation by seeing the private parts of my class?
Not worth the effort — encapsulation is for code, not people.
It doesn’t violate encapsulation for a programmer to see the private and/or protected parts of your class, so long as they don’t write code that somehow depends on what they saw. In other words, encapsulation doesn’t prevent people from knowing about the inside of a class; it prevents the code they write from becoming dependent on the insides of the class. Your company doesn’t have to pay a “maintenance cost” to maintain the gray matter between your ears; but it does have to pay a maintenance cost to maintain the code that comes out of your fingertips. What you know as a person doesn’t increase maintenance cost, provided the code you write depends on the interface rather than the implementation.
Besides, this is rarely if ever a problem. I don’t know any programmers who have intentionally tried to access the private parts of a class. “My recommendation in such cases would be to change the programmer, not the code” [James Kanze; used with permission].
The name this is not special. Access is granted or denied based on the class of the reference/pointer/object, not based on the name of the reference/pointer/object. (See below for the fine print.)
The fact that C++ allows a class’s methods and friends to access the non-public parts of all its objects, not just the this object, seems at first to weaken encapsulation. However the opposite is true: this rule preserves encapsulation. Here’s why.
Without this rule, most non-public members would need a public get method, because many classes have at least one method or friend that takes an explicit argument (i.e., an argument not called this) of its own class.
Huh? (you ask). Let’s kill the mumbo jumbo and work out an example:
Consider assignment operator Foo::operator=(const Foo& x). This assignment operator will probably change the data members in the left-hand argument, *this, based on the data members in the right-hand argument, x. Without the C++ rule being discussed here, the only way for that assignment operator to access the non-public members of x would be for class Foo to provide a public get method for every non-public datum. That would suck bigtime. (NB: “suck bigtime” is a precise, sophisticated, technical term; and I am writing this on April 1.)
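Here is a minimal sketch of such an assignment operator. The data members a_ and b_ and the helper sum() are hypothetical; only the signature Foo::operator=(const Foo& x) comes from the discussion above.

```cpp
// Hypothetical class Foo with two private data members.
class Foo {
public:
    Foo(int a, int b) : a_(a), b_(b) {}

    Foo& operator=(const Foo& x) {
        // x is a different object than *this, yet its private members are
        // accessible here: access is decided per class, not per object.
        a_ = x.a_;
        b_ = x.b_;
        return *this;
    }

    int sum() const { return a_ + b_; }

private:
    int a_;
    int b_;
};
```

Note that Foo needs no public get methods for a_ or b_ in order to implement assignment.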
The assignment operator isn’t the only one that would weaken encapsulation were it not for this rule. Here is a partial(!) list of others:
- Copy constructor.
- Comparison operators: ==, !=, <=, <, >=, >.
- Binary arithmetic operators: x+y, x-y, x*y, x/y, x%y.
- Binary bitwise operators: x^y, x|y, x&y.
- Static methods that accept an instance of the class as a parameter.
- Static methods that create/manipulate an instance of the class.
Conclusion: encapsulation would be shredded without this beneficial rule: most non-public members of most classes would end up having a public get method.
The Fine Print: There is another rule that is related to the above: methods and friends of a derived class can access the protected base class members of any of its own objects (any objects of its class or any derived class of its class), but not others. Since that is hopelessly opaque, here’s an example: suppose classes D1 and D2 inherit directly from class B, and base class B has protected member x. The compiler will let D1’s members and friends directly access the x member of any object it knows to be at least a D1, such as via a D1* pointer, a D1& reference, a D1 object, etc. However the compiler will give a compile-time error if a D1 member or friend tries to directly access the x member of anything it does not know is at least a D1, such as via a B* pointer, a B& reference, a B object, a D2* pointer, a D2& reference, a D2 object, etc. By way of (imperfect!!) analogy, you are allowed to pick your own pockets, but you are not allowed to pick your father’s pockets nor your brother’s pockets.
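The fine print can be sketched in code as follows. The class names B, D1, D2 and the member x come from the example above; the member functions readOwn, setFrom, readBase, and readSibling are hypothetical names added for illustration.

```cpp
class B {
public:
    B() : x(0) {}
protected:
    int x;
};

class D2 : public B {};

class D1 : public B {
public:
    // OK: 'other' is known to be at least a D1.
    static int readOwn(const D1& other) { return other.x; }

    // OK: both *this and 'other' are at least a D1.
    void setFrom(const D1& other) { x = other.x + 1; }

    // Each line below is a compile-time error if uncommented, because the
    // object is not known to be at least a D1:
    // static int readBase(const B& b)     { return b.x; }
    // static int readSibling(const D2& d) { return d.x; }
};
```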
The members and base classes of a struct are public by default, while in class, they default to private. Note: you should make your base classes explicitly public, private, or protected, rather than relying on the defaults.
struct and class are otherwise functionally equivalent.
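A small sketch of the two defaults side by side. All names here (Base, S, C, n, m, hidden, get) are hypothetical.

```cpp
struct Base {
    int n;
    Base() : n(1) {}
};

// struct: members and base classes are public by default.
struct S : Base {
    int m;
    S() : m(2) {}
};

// class: members are private by default. The base is marked public
// explicitly instead of relying on the (private) default.
class C : public Base {
    int hidden;   // private
public:
    C() : hidden(3) {}
    int get() const { return hidden; }
};
```

Outside code can read s.m and s.n on an S directly, but on a C it can reach hidden only through get().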
Enough of that squeaky clean techno talk. Emotionally, most developers make a strong distinction between a class and a struct. A struct simply feels like an open pile of bits with very little in the way of encapsulation or functionality. A class feels like a living and responsible member of society with intelligent services, a strong encapsulation barrier, and a well defined interface. Since that’s the connotation most people already have, you should probably use the struct keyword if you have a class that has very few methods and has public data (such things do exist in well designed systems!), but otherwise you should probably use the class keyword.
If you want a constant that you can use in a compile time constant expression, say as an array bound, use constexpr if your compiler supports that C++11 feature, otherwise you have two other choices:
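A minimal sketch of all three options, assuming a C++11 (or later) compiler so that the constexpr line compiles. The class name X and the member names c1, c2, c3, v1, v2, v3 are hypothetical.

```cpp
class X {
public:
    static constexpr int c1 = 42;   // preferred where C++11 is available
    static const int c2 = 7;        // choice 1: in-class initialized static const integer
    enum { c3 = 19 };               // choice 2: an enumerator

    // Each constant is usable as an array bound.
    char v1[c1];
    char v2[c2];
    char v3[c3];
};
```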
You have more flexibility if the constant isn’t needed for use in a compile time constant expression:
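A minimal sketch of that flexibility, using hypothetical names (Z, p, i, get, greeting): a static member initialized in its out-of-class definition, and a non-static const member initialized in the constructor.

```cpp
class Z {
public:
    explicit Z(int ii) : i(ii) {}   // const member: initialized in the constructor

    int get() const { return i; }
    static const char* greeting() { return p; }

private:
    static const char* p;           // static member: initialized in its definition
    const int i;
};

// Out-of-class definition; it belongs in exactly one source file.
const char* Z::p = "hello, there";
```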
You can take the address of a static member if (and only if) it has an out-of-class definition:
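A minimal sketch, with hypothetical names AE, c6, c7, and addressOfC7. Taking the address of the member that lacks a definition is left commented out; on typical toolchains it compiles but fails at link time.

```cpp
class AE {
public:
    static const int c6 = 7;    // in-class declaration only
    static const int c7 = 31;   // in-class declaration, defined below
};

const int AE::c7;               // out-of-class definition (initializer not repeated)

// OK: c7 has a definition, so it has an address.
inline const int* addressOfC7() { return &AE::c7; }

// Not OK: c6 has no out-of-class definition.
// const int* bad = &AE::c6;
```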
You don’t. If you don’t want data in an interface, don’t put it in the class that defines the interface. Put it in derived classes instead. See “Why do my compiles take so long?”
Sometimes, you do want to have representation data in a class. Consider class complex.
This type is designed to be used much as a built-in type and the representation is needed in the declaration to make it possible to create genuinely local objects (i.e. objects that are allocated on the stack and not on a heap) and to ensure proper inlining of simple operations. Genuinely local objects and inlining are necessary to get the performance of complex close to what is provided in languages with a built-in complex type.
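As a rough sketch of what such a class might look like (not the standard library’s std::complex), the representation sits in the class itself so that the compiler can place objects on the stack and inline the simple operations. The namespace demo and the members re, im, real, and imag are hypothetical.

```cpp
namespace demo {

template<class Scalar>
class complex {
public:
    complex() : re(0), im(0) {}
    complex(Scalar r) : re(r), im(0) {}
    complex(Scalar r, Scalar i) : re(r), im(i) {}

    // Simple enough to be inlined, because the compiler sees re and im.
    complex& operator+=(const complex& a) {
        re += a.re;
        im += a.im;
        return *this;
    }

    Scalar real() const { return re; }
    Scalar imag() const { return im; }

private:
    Scalar re, im;   // representation data, part of the class declaration
};

}  // namespace demo
```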
Like C, C++ doesn’t define layouts, just semantic constraints that must be met. Therefore different implementations do things differently. One good explanation is in a book that is otherwise outdated and doesn’t describe any current C++ implementation: The Annotated C++ Reference Manual (usually called the ARM). It has diagrams of key layout examples. There is a very brief explanation in Chapter 2 of TC++PL3.
Basically, C++ constructs objects simply by concatenating sub objects. Thus a struct A with two int members is represented by two ints next to each other, and a struct derived from A that adds one int member is represented by an A followed by an int; that is, by three ints next to each other.
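The concatenation can be observed with sizeof. This is a sketch under the assumption of a typical implementation; as noted above, the standard does not define the layout, so the sizes are what mainstream compilers produce rather than a guarantee. The names B, a, b, c, intsInA, and intsInB are hypothetical.

```cpp
#include <cstddef>

struct A { int a, b; };        // typically two ints next to each other
struct B : A { int c; };       // typically an A followed by an int

// Object sizes measured in units of int.
inline std::size_t intsInA() { return sizeof(A) / sizeof(int); }
inline std::size_t intsInB() { return sizeof(B) / sizeof(int); }
```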
Virtual functions are typically implemented by adding a pointer (the “vptr”) to each object of a class with virtual functions. This pointer points to the appropriate table of functions (the “vtbl”). Each class has its own vtbl shared by all objects of that class.
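The effect of the vptr can usually be seen in the object size, and the effect of the vtbl in which function a call through a base reference reaches. The sketch below assumes a typical implementation, and every name in it (Plain, Poly, Derived, f, callThroughBase) is hypothetical.

```cpp
struct Plain {
    int x;
};

struct Poly {
    int x;
    virtual ~Poly() {}
    virtual int f() const { return 1; }
};

struct Derived : Poly {
    int f() const { return 2; }   // overrides Poly::f
};

// The call is dispatched at run time, typically through the object's vptr
// into the vtbl of its actual class.
inline int callThroughBase(const Poly& p) { return p.f(); }
```

On common implementations sizeof(Poly) is larger than sizeof(Plain) because each Poly object also stores the vptr.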