Thursday, April 9, 2015

What is a class?

The fundamental building block of OO software.

A class defines a data type, much like a struct would be in C. In a computer science sense, a type consists of both a set of states and a set of operations which transition between those states. Thus int is a type because it has both a set of states and it has operations like i + j or i++, etc. In exactly the same way, a class provides a set of (usually public) operations, and a set of (usually non-public) data bits representing the abstract values that instances of the type can have.

You can imagine that int is a class that has member functions called operator++, etc. (int isn’t really a class, but the basic analogy is this: a class is a type, much like int is a type.)

Note: a C programmer can think of a class as a C struct whose members default to private. But if that’s all you think of a class, then you probably need to experience a personal paradigm shift.

What is an object?

A region of storage with associated semantics.

After the declaration int i; we say that “i is an object of type int.” In OO/C++, “object” usually means “an instance of a class.” Thus a class defines the behavior of possibly many objects (instances).

When is an interface “good”?

When it provides a simplified view of a chunk of software, and it is expressed in the vocabulary of a user (where a “chunk” is normally a class or a tight group of classes, and a “user” is another developer rather than the ultimate customer).

  • The “simplified view” means unnecessary details are intentionally hidden. This reduces the user’s defect-rate.
  • The “vocabulary of users” means users don’t need to learn a new set of words and concepts. This reduces the user’s learning curve.

What is encapsulation?

Preventing unauthorized access to some piece of information or functionality.

The key money-saving insight is to separate the volatile part of some chunk of software from the stable part. Encapsulation puts a firewall around the chunk, which prevents other chunks from accessing the volatile parts; other chunks can only access the stable parts. This prevents the other chunks from breaking if (when!) the volatile parts are changed. In context of OO software, a “chunk” is normally a class or a tight group of classes.

The “volatile parts” are the implementation details. If the chunk is a single class, the volatile part is normally encapsulated using the private and/or protected keywords. If the chunk is a tight group of classes, encapsulation can be used to deny access to entire classes in that group. Inheritance can also be used as a form of encapsulation.

The “stable parts” are the interfaces. A good interface provides a simplified view in the vocabulary of a user, and is designed from the outside-in (here a “user” means another developer, not the end-user who buys the completed application). If the chunk is a single class, the interface is simply the class’s public member functions and friend functions. If the chunk is a tight group of classes, the interface can include several of the classes in the chunk.

Designing a clean interface and separating that interface from its implementation merely allows users to use the interface. But encapsulating (putting “in a capsule”) the implementation forces users to use the interface.

How does C++ help with the tradeoff of safety vs. usability?

In C, encapsulation was accomplished by making things static in a compilation unit or module. This prevented another module from accessing the static stuff. (By the way, C++98 deprecated static for file-scope data in favor of unnamed namespaces; C++11 removed that deprecation, but an unnamed namespace is still the usual C++ way to get the same effect.)

Unfortunately this approach doesn’t support multiple instances of the data, since there is no direct support for making multiple instances of a module’s static data. If multiple instances were needed in C, programmers typically used a struct. But unfortunately C structs don’t support encapsulation. This exacerbates the tradeoff between safety (information hiding) and usability (multiple instances).
In C++, you can have both multiple instances and encapsulation via a class. The public part of a class contains the class’s interface, which normally consists of the class’s public member functions and its friend functions. The private and/or protected parts of a class contain the class’s implementation, which is typically where the data lives.

The end result is like an “encapsulated struct.” This reduces the tradeoff between safety (information hiding) and usability (multiple instances).
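
As a minimal sketch of such an “encapsulated struct” (the class name Stack and all of its members are hypothetical, chosen only for illustration), the following class can be instantiated as many times as you like, while its data can be reached only through the public member functions:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: a stack of ints whose representation is hidden.
class Stack {
public:
    // Interface (the stable part).
    void push(int value) { data_.push_back(value); }
    int pop() {                       // precondition: !empty()
        int top = data_.back();
        data_.pop_back();
        return top;
    }
    bool empty() const { return data_.empty(); }
    std::size_t size() const { return data_.size(); }

private:
    // Implementation (the volatile part). It could become an array or a
    // linked list without breaking code that uses only the interface.
    std::vector<int> data_;
};
```

Two Stack objects hold independent data, which is the “multiple instances” half of the tradeoff; the private section is the “information hiding” half.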

How can I prevent other programmers from violating encapsulation by seeing the private parts of my class?

Not worth the effort — encapsulation is for code, not people.

It doesn’t violate encapsulation for a programmer to see the private and/or protected parts of your class, so long as they don’t write code that somehow depends on what they saw. In other words, encapsulation doesn’t prevent people from knowing about the inside of a class; it prevents the code they write from becoming dependent on the insides of the class. Your company doesn’t have to pay a “maintenance cost” to maintain the gray matter between your ears; but it does have to pay a maintenance cost to maintain the code that comes out of your finger tips. What you know as a person doesn’t increase maintenance cost, provided the code you write depends on the interface rather than the implementation.

Besides, this is rarely if ever a problem. I don’t know any programmers who have intentionally tried to access the private parts of a class. “My recommendation in such cases would be to change the programmer, not the code” [James Kanze; used with permission].

Can a method directly access the non-public members of another instance of its class?

Yes.

The name this is not special. Access is granted or denied based on the class of the reference/pointer/object, not based on the name of the reference/pointer/object. (See below for the fine print.)

The fact that C++ allows a class’ methods and friends to access the non-public parts of all its objects, not just the this object, seems at first to weaken encapsulation. However the opposite is true: this rule preserves encapsulation. Here’s why.

Without this rule, most non-public members would need a public get method, because many classes have at least one method or friend that takes an explicit argument (i.e., an argument not called this) of its own class.

Huh? (you ask). Let’s kill the mumbo jumbo and work out an example:

Consider the assignment operator Foo::operator=(const Foo& x). This assignment operator will probably change the data members in the left-hand argument, *this, based on the data members in the right-hand argument, x. Without the C++ rule being discussed here, the only way for that assignment operator to access the non-public members of x would be for class Foo to provide a public get method for every non-public datum. That would suck bigtime. (NB: “suck bigtime” is a precise, sophisticated, technical term; and I am writing this on April 1.)
The assignment operator isn’t the only one that would weaken encapsulation were it not for this rule. Here is a partial(!) list of others:
  • Copy constructor.
  • Comparison operators: ==, !=, <=, <, >=, >.
  • Binary arithmetic operators: x+y, x-y, x*y, x/y, x%y.
  • Binary bitwise operators: x^y, x&y, x|y.
  • Static methods that accept an instance of the class as a parameter.
  • Static methods that create/manipulate an instance of the class.
  • etc.
Conclusion: encapsulation would be shredded without this beneficial rule: most non-public members of most classes would end up having a public get method.
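
The assignment-operator case can be sketched like this. The data member value_ and the friend operator== are assumptions made up for the illustration; only the name Foo and the operator= signature come from the discussion above.

```cpp
// Hypothetical class used only to illustrate the access rule.
class Foo {
public:
    explicit Foo(int value) : value_(value) { }

    // x is a different object, yet its private member is accessible
    // here because x is of class Foo.
    Foo& operator=(const Foo& x) {
        value_ = x.value_;
        return *this;
    }

    // A friend gets the same access to both of its operands.
    friend bool operator==(const Foo& a, const Foo& b) {
        return a.value_ == b.value_;
    }

private:
    int value_;   // no public get method is needed
};
```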

The Fine Print: There is another rule that is related to the above: methods and friends of a derived class can access the protected base class members of any of its own objects (any objects of its class or any derived class of its class), but not others. Since that is hopelessly opaque, here’s an example: suppose classes D1 and D2 inherit directly from class B, and base class B has protected member x. The compiler will let D1’s members and friends directly access the x member of any object it knows to be at least a D1, such as via a D1* pointer, a D1& reference, a D1 object, etc. However the compiler will give a compile-time error if a D1 member or friend tries to directly access the x member of anything it does not know is at least a D1, such as via a B* pointer, a B& reference, a B object, a D2* pointer, a D2& reference, a D2 object, etc. By way of (imperfect!!) analogy, you are allowed to pick your own pockets, but you are not allowed to pick your father’s pockets nor your brother’s pockets.
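
Here is a small sketch of that fine print, using the names B, D1, D2, and x from the paragraph above; the member functions set_own and get are hypothetical. The lines that the compiler would reject are left as comments so the rest compiles.

```cpp
class B {
protected:
    int x = 0;
};

class D2 : public B { };

class D1 : public B {
public:
    // OK: d is known to be at least a D1.
    static void set_own(D1& d, int v) { d.x = v; }

    // OK: *this is a D1.
    int get() const { return x; }

    // Each of these would be a compile-time error if uncommented,
    // because the object is not known to be at least a D1:
    // static void set_base(B& b, int v)     { b.x = v; }
    // static void set_sibling(D2& d, int v) { d.x = v; }
};
```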

Is encapsulation a security device?

No.

Encapsulation != security.

Encapsulation prevents mistakes, not espionage.

What’s the difference between the keywords struct and class?

The members and base classes of a struct are public by default, while in class, they default to private. Note: you should make your base classes explicitly public, private, or protected, rather than relying on the defaults.

struct and class are otherwise functionally equivalent.
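
A short sketch of the two defaults (all names here are hypothetical). It deliberately relies on the default base-class access to show what the default is, which is exactly what the note above advises against in real code.

```cpp
struct S {
    int x;                  // public by default
};

class C {
    int x;                  // private by default
public:
    C() : x(0) { }
    int get() const { return x; }
};

struct Base { int id = 7; };

struct DS : Base { };       // public inheritance by default

class DC : Base {           // private inheritance by default
public:
    int base_id() const { return id; }   // still accessible inside DC
};
```

Outside the classes, ds.id compiles for a DS object but dc.id would not compile for a DC object.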

Enough of that squeaky clean techno talk. Emotionally, most developers make a strong distinction between a class and a struct. A struct simply feels like an open pile of bits with very little in the way of encapsulation or functionality. A class feels like a living and responsible member of society with intelligent services, a strong encapsulation barrier, and a well defined interface. Since that’s the connotation most people already have, you should probably use the struct keyword if you have a class that has very few methods and has public data (such things do exist in well designed systems!), but otherwise you should probably use the class keyword.

How do I define an in-class constant?

If you want a constant that you can use in a compile time constant expression, say as an array bound, use constexpr if your compiler supports that C++11 feature, otherwise you have two other choices:
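
A sketch of the three options, assuming the two older choices meant here are a static const integral member with an in-class initializer and an enumerator; the class name X and the member names are hypothetical:

```cpp
class X {
public:
    static constexpr int c1 = 42;   // C++11 and later
    static const int c2 = 7;        // in-class initializer; integral types only
    enum { c3 = 19 };               // the "enum hack"; works with old compilers

    // All three can be used as array bounds.
    char v1[c1];
    char v2[c2];
    char v3[c3];
};
```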

You have more flexibility if the constant isn’t needed for use in a compile time constant expression:
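
For instance (a sketch with hypothetical names), a static member can be initialized in its out-of-class definition with a value that is not a constant expression, and a non-static const member can get a different value in each object through the constructor:

```cpp
#include <string>

class Z {
public:
    explicit Z(int ii) : i(ii) { }      // const member set per object
    int id() const { return i; }

    static const std::string name;      // declared here, defined below

private:
    const int i;                        // initialized in the constructor
};

// Out-of-class definition; the initializer need not be a constant expression.
const std::string Z::name = "zed";
```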

You can take the address of a static member if (and only if) it has an out-of-class definition:
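
A sketch of the difference, with hypothetical names. Note that this describes static const members; since C++17 a static constexpr or inline static member is already a definition, so the rule is less strict there.

```cpp
class AE {
public:
    static const int c6 = 7;    // declaration only
    static const int c7 = 31;   // declaration; defined below
};

const int AE::c7;               // out-of-class definition (initializer not repeated)

inline const int* address_of_c7() {
    return &AE::c7;             // OK: c7 has a definition
}

// const int* p = &AE::c6;      // needs a definition of c6; typically a link error
```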

Why do I have to put the data in my class declarations?

You don’t. If you don’t want data in an interface, don’t put it in the class that defines the interface. Put it in derived classes instead. See also “Why do my compiles take so long?”

Sometimes, you do want to have representation data in a class. Consider class complex:
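
A minimal sketch of such a class follows. It is not std::complex; the name Complex, the accessors real() and imag(), and the choice of members are assumptions made for illustration.

```cpp
template<class Scalar>
class Complex {
public:
    Complex() : re(0), im(0) { }
    Complex(Scalar r) : re(r), im(0) { }
    Complex(Scalar r, Scalar i) : re(r), im(i) { }

    // Simple operation defined in the class so it can be inlined.
    Complex& operator+=(const Complex& a) {
        re += a.re;
        im += a.im;
        return *this;
    }

    Scalar real() const { return re; }
    Scalar imag() const { return im; }

private:
    Scalar re, im;    // representation, visible in the class declaration
};
```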

This type is designed to be used much as a built-in type and the representation is needed in the declaration to make it possible to create genuinely local objects (i.e. objects that are allocated on the stack and not on a heap) and to ensure proper inlining of simple operations. Genuinely local objects and inlining are necessary to get the performance of complex close to what is provided in languages with a built-in complex type.

How are C++ objects laid out in memory?

Like C, C++ doesn’t define layouts, just semantic constraints that must be met. Therefore different implementations do things differently. One good explanation is in a book that is otherwise outdated and doesn’t describe any current C++ implementation: The Annotated C++ Reference Manual (usually called the ARM). It has diagrams of key layout examples. There is a very brief explanation in Chapter 2 of TC++PL3.
Basically, C++ constructs objects simply by concatenating sub objects. Thus a struct A with two int members is represented by two ints next to each other, and a struct derived from A that adds one int member is represented by an A followed by an int; that is, by three ints next to each other.
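
A small sketch of that layout, assuming the struct names A and B and the hypothetical member names a, b, and c. The sizes it checks are what mainstream compilers typically produce, not something the standard promises.

```cpp
struct A { int a, b; };         // two ints next to each other
struct B : A { int c; };        // an A followed by an int

// Typical on mainstream compilers; not guaranteed by the standard.
inline bool a_is_two_ints()   { return sizeof(A) == 2 * sizeof(int); }
inline bool b_is_three_ints() { return sizeof(B) == 3 * sizeof(int); }
```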

Virtual functions are typically implemented by adding a pointer (the “vptr”) to each object of a class with virtual functions. This pointer points to the appropriate table of functions (the “vtbl”). Each class has its own vtbl shared by all objects of that class.

Why is the size of an empty class not zero?

To ensure that the addresses of two different objects will be different. For the same reason, new always returns pointers to distinct objects. Consider:
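
A sketch of both cases, with hypothetical function names; each function should return true on any conforming compiler:

```cpp
class Empty { };

// Two distinct local objects must have different addresses.
inline bool distinct_on_stack() {
    Empty a, b;
    return &a != &b;
}

// Two live objects created by new must have different addresses.
inline bool distinct_on_heap() {
    Empty* p1 = new Empty;
    Empty* p2 = new Empty;
    bool distinct = (p1 != p2);
    delete p1;
    delete p2;
    return distinct;
}
```
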
There is an interesting rule that says that an empty base class need not be represented by a separate byte:
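
For example (a sketch; the names X, a, and base_shares_address are hypothetical):

```cpp
class Empty { };

struct X : Empty {
    int a;
};

// True when the Empty base occupies no storage of its own, so the
// object and its first data member start at the same address.
inline bool base_shares_address(X* p) {
    void* p1 = p;
    void* p2 = &p->a;
    return p1 == p2;
}
```
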
This optimization is safe and can be most useful. It allows a programmer to use empty classes to represent very simple concepts without overhead. Some current compilers provide this “empty base class optimization”.

Moreover, “empty base class optimization” is no longer an optional optimization but a mandatory requirement on class layout as of C++11. Go beat up on your compiler vendor if it does not implement it properly.

What is C++?

C++ is a general-purpose programming language with a bias towards systems programming that
  • is a better C
  • supports data abstraction (e.g., classes)
  • supports object-oriented programming (e.g., inheritance)
  • supports generic programming (e.g., reusable generic containers and algorithms)
  • supports functional programming (e.g., template metaprogramming, lambda functions, constexpr)
It is defined by an ISO standard, offers stability over decades, and has a large and lively user community. See also The C++ Programming Language and Evolving a language in and for the real world: C++ 1991-2006.

See also when and why C++ was invented.

Is C++ a practical language?

Yes.

C++ is a practical tool. It’s not perfect, but it’s useful.

In the world of industrial software, C++ is viewed as a solid, mature, mainstream tool. It has widespread industry support which makes it “good” from an overall business perspective.

Is C++ a perfect language?

No.

C++ wasn’t designed to demonstrate what a perfect language looks like. It was designed to be a practical tool for solving real world problems. It has a few warts, as do all practical programming tools, but the only place where it’s appropriate to keep fiddling with something until it’s perfect is in a pure academic setting. That wasn’t C++’s goal.

What’s so great about classes?

Classes are there to help you organize your code and to reason about your programs. You could roughly equivalently say that classes are there to help you avoid making mistakes and to help you find bugs after you do make a mistake. In this way, classes significantly help maintenance.

A class is the representation of an idea, a concept, in the code. An object of a class represents a particular example of the idea in the code. Without classes, a reader of the code would have to guess about the relationships among data items and functions – classes make such relationships explicit and “understood” by compilers. With classes, more of the high-level structure of your program is reflected in the code, not just in the comments.

A well-designed class presents a clean and simple interface to its users, hiding its representation and saving its users from having to know about that representation. If the representation shouldn’t be hidden – say, because users should be able to change any data member any way they like – you can think of that class as “just a plain old data structure”; for example:
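
A sketch of such a plain data structure (the name Pair and its members are assumptions for illustration):

```cpp
#include <string>

struct Pair {
    Pair(const std::string& n, const std::string& v) : name(n), value(v) { }
    std::string name, value;    // public: users may change them freely
};
```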

Note that even data structures can benefit from auxiliary functions, such as constructors. When designing a class, it is often useful to consider what’s true for every object of the class and at all times. Such a property is called an invariant. For example, the invariant of a vector could be that (a) its representation consists of a pointer to a number of elements and (b) that number of elements is stored in an integer. It is the job of every constructor to establish the class invariant, so that every member function can rely on it. Every member function must leave the invariant valid upon exit. This kind of thinking is particularly useful for classes that manage resources such as locks, sockets, and files. For example, a file handle class will have the invariant that it holds a pointer to an open file. The file handle constructor opens the file. Destructors free resources acquired by constructors. For example, the destructor for a file handle closes the file opened by the constructor:
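
A minimal sketch of such a file handle, assuming a class name File_handle, a wrapper around the C stdio functions, and std::runtime_error as the failure report (none of these details are fixed by the text above):

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

class File_handle {
public:
    // The constructor establishes the invariant or throws.
    File_handle(const char* name, const char* mode)
        : f_(std::fopen(name, mode))
    {
        if (!f_) throw std::runtime_error(std::string("cannot open ") + name);
    }

    // The destructor releases what the constructor acquired.
    ~File_handle() { std::fclose(f_); }

    // Non-copyable: exactly one handle owns the open file.
    File_handle(const File_handle&) = delete;
    File_handle& operator=(const File_handle&) = delete;

    std::FILE* get() const { return f_; }

private:
    std::FILE* f_;   // invariant: points to an open file
};
```

Because a failed open throws, no File_handle object ever exists without an open file, so member functions never need to re-check the invariant.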

If you haven’t programmed with classes, you will find parts of this explanation obscure and you’ll underestimate the usefulness of classes. Look for examples. Like all good textbooks, TC++PL has lots of examples; for example, see A Tour of the Standard Library. Most modern C++ libraries consist (among other things) of classes and a library tutorial is one of the best places to look for examples of useful classes.

What’s the big deal with OO?

Object-oriented techniques using classes and virtual functions are an important way to develop large, complex software applications and systems. So are generic programming techniques using templates. Both are important ways to express polymorphism – at run time and at compile time, respectively. And they work great together in C++.
There are lots of definitions of “object oriented”, “object-oriented programming”, and “object-oriented programming languages”. For a longish explanation of what Stroustrup thinks of as “object oriented”, read Why C++ isn’t just an object-oriented programming language. That said, object-oriented programming is a style of programming originating with Simula (about 40 years ago!) relying on encapsulation, inheritance, and polymorphism. In the context of C++ (and of many other languages with their roots in Simula), it means programming using class hierarchies and virtual functions to allow manipulation of objects of a variety of types through well-defined interfaces and to allow a program to be extended incrementally through derivation.
See “What’s so great about classes?” for an idea about what’s great about “plain classes”. The point about arranging classes into a class hierarchy is to express hierarchical relationships among classes and to use those relationships to simplify code.
To really understand OOP, look for some examples. For example, you might have two (or more) device drivers with a common interface, Driver.

This Driver is simply an interface. It is defined with no data members and a set of pure virtual functions. A Driver can be used through this interface, and many different kinds of drivers, say Driver1 and Driver2, can implement this interface.

Note that these drivers hold data (state) and objects of them can be created. They implement the functions defined in Driver. We can imagine a driver being used by a function f() that is passed a Driver.

The key point here is that f() doesn’t need to know which kind of driver it uses; all it needs to know is that it is passed a Driver; that is, an interface to many different kinds of drivers. We could invoke f() with either a Driver1 object d1 or a Driver2 object d2, picked according to an input value.

Note that when f() uses a Driver the right kind of operations are implicitly chosen at run time. For example, when f() is passed d1, it uses Driver1::read(), whereas when f() is passed d2, it uses Driver2::read(). This is sometimes called run-time dispatch or dynamic dispatch. In this case there is no way that f() could know the kind of device it is called with because we choose it based on an input.
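
The whole driver example can be sketched as follows. The names Driver, Driver1, Driver2, read(), f(), d1, and d2 come from the discussion above; reset(), the reads_ counter, the function g(), its dev parameter, and what each read() writes are assumptions made up so that the sketch is complete.

```cpp
#include <cstddef>
#include <cstring>

// Interface: no data members, only pure virtual functions.
class Driver {
public:
    virtual ~Driver() { }
    virtual int read(char* p, int n) = 0;   // fill p with up to n chars
    virtual bool reset() = 0;
};

class Driver1 : public Driver {
public:
    int read(char* p, int n) override {
        std::memset(p, '1', static_cast<std::size_t>(n));
        ++reads_;
        return n;
    }
    bool reset() override { reads_ = 0; return true; }
private:
    int reads_ = 0;                         // state held by this driver
};

class Driver2 : public Driver {
public:
    int read(char* p, int n) override {
        std::memset(p, '2', static_cast<std::size_t>(n));
        ++reads_;
        return n;
    }
    bool reset() override { reads_ = 0; return true; }
private:
    int reads_ = 0;
};

// f() knows only the interface; which read() runs is decided at run time.
char f(Driver& d) {
    char buf[4];
    d.reset();
    d.read(buf, 4);
    return buf[0];
}

// dev stands in for a value read from input.
char g(int dev) {
    Driver1 d1;
    Driver2 d2;
    return dev == 1 ? f(d1) : f(d2);
}
```
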
Please note that object-oriented programming is not a panacea. “OOP” does not simply mean “good” – if there are no inherent hierarchical relationships among the fundamental concepts in your problem then no amount of hierarchy and virtual functions will improve your code. The strength of OOP is that there are many problems that can be usefully expressed using class hierarchies – the main weakness of OOP is that too many people try to force too many problems into a hierarchical mold. Not every program should be object-oriented. As alternatives, consider plain classes, generic programming, and free-standing functions (as in math, C, and Fortran).
If you’re still wondering “why OO?”, consider also business reasons:
The software industry is succeeding at automating many of life’s functions that used to be manual. In addition, software is improving the flexibility of devices that were previously automated, for example, transforming the internal implementation of many previously existing devices from mechanical to software (clocks, automobile ignition systems, etc.) or from being controlled by electrical circuitry to software (TVs, kitchen appliances, etc.). And, of course, software is integrated into every aspect of our daily business lives — originally software was limited to Accounting and Finance, but it is now embedded in Operations, Marketing, Sales, and Management — software is nearly everywhere.
This incredible success has constantly stressed the ability of the software development organizations to keep up. As an industry, software development has continuously failed to meet the demands for large, complex software systems. Yes, this failure is actually due to the success of software’s ability to bring perceived value — it is actually caused because demand is greater than our ability to satisfy that demand. And while it is possible for us software people to sit around and pat ourselves on the back for that demand, innovators and thought leaders in this and every other discipline are marked by one undeniable characteristic: they/we are not satisfied. As an industry, we must do better. A lot better. Uber better.
Our past successes have propelled users to ask for more. We created a market hunger that Structured Analysis, Design and Programming techniques have not been able to satisfy. This required us to create a better paradigm. Several, in fact.
C++ supports OO programming. C++ can also be used as a traditional, imperative programming language (“as a better C”) or using the generic programming approach. Naturally each of these approaches has its pros and cons; don’t expect the benefits of one technique while using another. (Most common case of misunderstanding: don’t expect to get the benefits of object-oriented programming if you’re using C++ as a better C.)
C++ also supports the generic programming approach. And most recently C++ is starting to support (as opposed to merely allow) the functional programming approach. The best programmers are able to decide which approach fits best in which situation, rather than trying to shove a single approach (“my favorite approach”) at every problem everywhere in every industry irrespective of the business context or the sponsor’s goals.

Most importantly, sometimes the best solution is achieved by using a combination of features from Object-Oriented, Generic and Functional programming styles, whereas trying to restrict oneself to one particular approach may lead to a suboptimal solution.

What’s the big deal with generic programming?

Generic programming techniques using templates are an important way to develop large, complex software applications and systems. So are object oriented techniques. Both are important ways to express polymorphism – at compile time and at run time, respectively. And they work great together in C++.
C++ supports generic programming. Generic programming is a way of developing software that maximizes code reuse in a way that does not sacrifice performance. (The “performance” part isn’t strictly necessary, but it is highly desirable.)

Generic programming is programming based on parameterization: You can parameterize a type with another (such as a vector with its element types) and an algorithm with another (such as a sort function with a comparison function). The aim of generic programming is to generalize a useful algorithm or data structure to its most general and useful form. For example, a vector of integers is fine and so is a function that finds the largest value in a vector of integers. However, a better generic find function will be able to find an element in a vector of any type or better still in any sequence of elements described with a pair of iterators:

These examples are from the STL (the containers and algorithms part of the ISO C++ standard library); for a brief introduction, see A Tour of the Standard Library from TC++PL.

Generic programming is in some ways more flexible than object-oriented programming. In particular, it does not depend on hierarchies. For example, there is no hierarchical relationship between an int and a string. Generic programming is generally more structured than OOP; in fact, a common term used to describe generic programming is “parametric polymorphism”, with “ad hoc polymorphism” being the corresponding term for object-oriented programming. In the context of C++, generic programming resolves all names at compile time; it does not involve dynamic (run-time) dispatch. This has led generic programming to become dominant in areas where run-time performance is important.

Please note that generic programming is not a panacea. There are many parts of a program that need no parameterization and many examples where run-time dispatch (OOP) is more appropriate.
Generic components are pretty easy to use, at least if they’re designed well, and they tend to hide a lot of complexity. The other interesting feature is that they tend to make your code faster, particularly if you use them more. This creates a pleasant non-tradeoff: when you use the components to do the nasty work for you, your code gets smaller and simpler, you have less chance of introducing errors, and your code will often run faster.

Most developers are not cut out to create these generic components, but most can use them. Fortunately generic components are, um, generic, so your organization does not often need to create a lot of them. There are many off-the-shelf libraries of generic components. STL is one such library. Boost has a bunch more. 

What is multiparadigm programming?

In short: The same as just “programming,” using different features (notably OO and generic styles) in combination as needed.

Back when having OO and generic programming in the same language was still new, “multiparadigm programming” was originally a fancy way of saying “programming using more than one programming style, each to its best effect.” For example, using object-oriented programming when run-time resolution between different object types is required and generic programming when static type safety and run-time performance is at a premium. Naturally, the main strength of multiparadigm programming is in programs where more than one paradigm (programming style) is used, so that it would be hard to get the same effect by composing a system out of parts written in languages supporting different paradigms. The most compelling cases for multiparadigm programming are found where techniques from different paradigms are used in close collaboration to write code that is more elegant and more maintainable than would be possible within a single paradigm. A simple example is the traversal of a statically typed container of objects of a polymorphic type:
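
A sketch of such a traversal. The function name draw_all, the draw() member, and the concrete class Circle (which only counts how often it was drawn) are assumptions for illustration.

```cpp
#include <algorithm>
#include <vector>

class Shape {
public:
    virtual ~Shape() { }
    virtual void draw() = 0;
};

// Hypothetical concrete shape that just counts its draw() calls.
class Circle : public Shape {
public:
    void draw() override { ++draws; }
    int draws = 0;
};

// Statically typed container, dynamically dispatched call.
void draw_all(std::vector<Shape*>& vs) {
    std::for_each(vs.begin(), vs.end(), [](Shape* p) { p->draw(); });
}
```
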
Here, Shape will be an abstract base class defining the interface to a hierarchy of geometric shapes. This example easily generalizes to any standard library container:
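
The generalized version might look like this (again a sketch: draw_all, draw(), and the concrete class Square are hypothetical names):

```cpp
#include <algorithm>
#include <iterator>

class Shape {
public:
    virtual ~Shape() { }
    virtual void draw() = 0;
};

// Hypothetical concrete shape that just counts its draw() calls.
class Square : public Shape {
public:
    void draw() override { ++draws; }
    int draws = 0;
};

// Works for any container (or built-in array) of Shape pointers.
template<class C>
void draw_all(C& cs) {
    std::for_each(std::begin(cs), std::end(cs), [](Shape* p) { p->draw(); });
}
```
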
Is this OOP, GP, functional, or conventional structured programming? All of the above: It’s a function template (GP) with a procedural body (conventional structured) that uses a generic algorithm (GP again) and a lambda (functional) that takes a pointer to a base class and invokes a virtual function (OO). The key point is that this is all just “programming.”

So today instead of “multiparadigm programming” we should simply say “programming.” It’s all programming, just using the right language features together in combination as usual.

Is C++ better than Java? (or C#, C, Objective-C, JavaScript, Ruby, Perl, PHP, Haskell, FORTRAN, Pascal, Ada, Smalltalk, or any other language?)

Stop. This question generates much much more heat than light. Please read the following before posting some variant of this question.

In 99% of the cases, programming language selection is dominated by business considerations, not by technical considerations. Things that really end up mattering are things like availability of a programming environment for the development machine, availability of runtime environment(s) for the deployment machine(s), licensing/legal issues of the runtime and/or development environments, availability of trained developers, availability of consulting services, and corporate culture/politics. These business considerations generally play a much greater role than compile time performance, runtime performance, static vs. dynamic typing, static vs. dynamic binding, etc.

Those who ignore the (dominant!) business criteria when evaluating programming language tradeoffs expose themselves to criticism for having poor judgment. Be technical, but don’t be a techie weenie. Business issues really do dominate technical issues, and those who don’t realize that are destined to make decisions that have terrible business consequences; they are dangerous to their employer.

The most widely circulated comparisons tend to be those written by proponents of some language, Z, to prove that Z is better than other languages. Given its wide use, C++ is often top of the list of languages that the proponents of Z want to prove inferior. Often, such papers are “published” or distributed by a company that sells Z as part of a marketing campaign. Surprisingly, many seem to take seriously an unreviewed paper, written by people working for a company selling Z, “proving” that Z is best. One problem is that there are always grains of truth in such comparisons. After all, no language is better than every other in all possible ways. C++ certainly isn’t perfect, but selective truth can be most seductive and occasionally completely misleading. When looking at a language comparison consider who wrote it, consider carefully if the descriptions are factual and fair, and also if the comparison criteria are themselves fair for all languages considered. This is not easy.
Stroustrup refuses to compare C++ to other languages for these reasons given in The Design and Evolution of C++:
“Several reviewers asked me to compare C++ to other languages. This I have decided against doing. Thereby, I have reaffirmed a long-standing and strongly held view: Language comparisons are rarely meaningful and even less often fair. A good comparison of major programming languages requires more effort than most people are willing to spend, experience in a wide range of application areas, a rigid maintenance of a detached and impartial point of view, and a sense of fairness. I do not have the time, and as the designer of C++, my impartiality would never be fully credible.
I also worry about a phenomenon I have repeatedly observed in honest attempts at language comparisons. The authors try hard to be impartial, but are hopelessly biased by focusing on a single application, a single style of programming, or a single culture among programmers. Worse, when one language is significantly better known than others, a subtle shift in perspective occurs: Flaws in the well-known language are deemed minor and simple workarounds are presented, whereas similar flaws in other languages are deemed fundamental. Often, the workarounds commonly used in the less-well-known languages are simply unknown to the people doing the comparison or deemed unsatisfactory because they would be unworkable in the more familiar language.
Similarly, information about the well-known language tends to be completely up-to-date, whereas for the less-known language, the authors rely on several-year-old information. For languages that are worth comparing, a comparison of language X as defined three years ago vs. language Y as it appears in the latest experimental implementation is neither fair nor informative. Thus, I restrict my comments about languages other than C++ to generalities and to very specific comments.”

That said, C++ is considered to be the best choice in programming language for a wide variety of people and applications.

Why is C++ so big?

C++ is not a tiny language designed to be a minimal language for teaching, but neither are the languages people most often compare it to, such as C, Java, and C#. They too are huge compared to, say, Pascal as Dr. Wirth originally defined it – for good reasons. The programming world is far more complex today than it was 30 years ago, and modern programming languages reflect that.

C++ isn’t as big as some people imagine. By word count, the size of the language specifications (excluding standard libraries) for C++, C#, and Java are currently within a few percentage points of each other. This reflects that they are general-purpose mainstream languages that have grown similar features – auto/var type deduction, range for loops, lambda functions, various levels of support for generic programming, and so on. It also reflects what design theorists call “essential complexity in the problem domain” – the complexity in the real world that a serious language has to expose, everything from fundamental OS differences to calling C++ libraries.

In some cases C++ directly supports (i.e., in the language) what some other languages support through libraries, so the language part will be relatively larger. On the other hand, if you want to write a “typical modern application”, you need to consider operating system interfaces, GUI, databases, web interfaces, etc. The sum of language features, libraries, and programming conventions and standards that you must become familiar with dwarfs the programming language. Here, C++’s size can be an advantage as far as it better supports good libraries.

Finally, the days when a novice programmer could know all of a language are gone, at least for the languages in widespread industrial use. Few people know “all of C” or “all of Java” either, and none of those are novices. It follows that nobody should have to apologize for the fact that novices do not know all of C++. What you must do – in any language – is to pick a subset, get to work writing code, and gradually learn more of the language, its libraries, and its tools. For my suggestion on how beginners can approach C++, see Programming: Principles and Practice using C++.

Who uses C++?

Lots and lots of companies and government sites. Lots. And if you’re using a compiler or runtime of another language, such as Java, chances are good that it too is implemented in C++.

There are too many C++ users to effectively count them, but the number is in the millions. C++ is supported by all major vendors. The large number of developers (and therefore the large amount of available support infrastructure including vendors, tools, training, etc.) is one of several critical features of C++.

During 1980-1991, the number of users doubled every seven and a half months (see The Design and Evolution of C++). The current growth rate is steady and positive. IDC’s 2001 estimate of the number of C++ programmers was “about 3 million”; their 2004 number was “more than 3 million.” That seems plausible and indicates continued growth. Especially since about 2010 there has been renewed growth in C++ as both mobile and datacenter applications value “performance per Watt” as a new mainstream metric.

How long does it take to learn C++?

That depends on what you mean by “learning.” If you are a C programmer you can learn enough C++ to make you more effective at C-style programming in a day.

The book Programming: Principles and Practice using C++ has been used to get thousands of freshmen (1st year students) through the fundamentals of C++ and the programming techniques it supports (notably object-oriented programming and generic programming) in a semester.
On the other hand, if you want to be fully comfortable with all the major C++ language constructs, with data abstraction, Object-Oriented programming, generic programming, Object-Oriented design, etc., you can easily spend a year or two – if you aren’t already acquainted with those techniques (say, from Java or C#).

Is that then the time it takes to learn C++? Maybe, but then again, that is the timescale we have to consider to become better designers and programmers. If a dramatic change of the way we work and think about building systems isn’t our aim, then why bother to learn a new language? Compared to the time required to learn to play the piano well or to become fluent in a foreign (natural) language, learning a new and different programming language and programming style is easy.

For more observations about learning C++ see D&E or a note Bjarne Stroustrup wrote some time ago.

Companies successfully teach standard industry “short courses,” where a university semester course is compressed into one 40-hour work week. But regardless of where you get your training, make sure the courses have a hands-on element, since most people learn best when they have projects to help the concepts “gel.” But even if they have the best training, they’re not ready yet.

It takes 6-12 months to become broadly proficient in C++, especially if you haven’t done OO or generic programming before. It takes less time for developers who have easy access to a “local” body of experts, more if there isn’t a “good” general purpose C++ class library available. To become one of these experts who can mentor others takes around 3 years.

Some people never make it. You don’t have a chance unless you are teachable and have personal drive. As a bare minimum on “teachability,” you have to be able to admit when you’ve been wrong. As a bare minimum on “drive,” you must be willing to put in some extra hours. Remember: it’s a lot easier to learn some new facts than it is to change your paradigm, i.e., to change the way you think; to change your notion of goodness; to change your mental models.
Two things you should do:
  • Get your people two books: one to tell them what is legal, another to tell them what is moral
  • Consider bringing in a “mentor”
Two things you should not do:

  • You should not bother having your people trained in C as a stepping-stone to learning OO/C++
  • You should not bother having your people trained in Objective-C as a stepping-stone to learning OO/C++

What’s the best way to improve my C++ programs?

That depends on how you use it. Most people underestimate abstract classes and templates. Conversely, most people seriously overuse casts and macros. Have a look at one of Stroustrup’s papers or books for ideas. One way of thinking of abstract classes and templates is as interfaces that allow a cleaner and more logical presentation of services than is easy to provide through functions or single-rooted class hierarchies. See other sections of this FAQ for some specific examples and ideas.
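As a small illustration of the point about macros and templates, here is a minimal sketch. The names SQUARE, square, next_value, and call_count are hypothetical and are not taken from any of the books mentioned. It contrasts a function-like macro, which pastes its argument text twice, with a function template, which evaluates its argument once and is type-checked.

```cpp
// Hypothetical illustration: a function-like macro versus a function template.
#define SQUARE(x) ((x) * (x))   // textual substitution: x appears twice

template <typename T>
constexpr T square(T x)         // x is evaluated exactly once
{
    return x * x;
}

int call_count = 0;             // counts calls to next_value()

int next_value()                // returns 1, 2, 3, ... on successive calls
{
    return ++call_count;
}

int via_macro()
{
    call_count = 0;
    return SQUARE(next_value()); // expands to two calls: 1 * 2
}

int via_template()
{
    call_count = 0;
    return square(next_value()); // a single call: 1 * 1
}
```

With these definitions, via_macro() calls next_value() twice while via_template() calls it once, which is the kind of surprise that makes macros worth avoiding.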

Does it matter which programming language I use?

Yes, but don’t expect miracles. Some people seem to believe that a programming language can or at least should solve most of their problems with system building. They are condemned to search forever for the perfect programming language and become repeatedly disappointed. Others dismiss programming languages as unimportant “implementation details” and put their money into development processes and design methods. They are condemned to program in COBOL, C, and proprietary design languages forever. A good language – such as C++ – can do a lot for a designer and a programmer, as long as its strengths and limitations are clearly understood and respected.

What are some features of C++ from a business perspective?

Here are a few features of OO/C++ from a business perspective:

  • C++ has a huge installed base, which means you’ll have multi-vendor support for tools, environments, consulting services, etc., plus you’ll have a very valuable line-item on your resumé.
  • C++ lets developers provide simplified interfaces to software chunks, which improves the defect-rate when those chunks are (re)used.
  • C++ lets you exploit developers’ intuition through operator overloading, which reduces the learning curve for (re)users.
  • C++ localizes access to a software chunk, which reduces the cost of changes.
  • C++ reduces the safety-vs.-usability tradeoff, which reduces the cost of (re)using a chunk of software.
  • C++ reduces the safety-vs.-speed tradeoff, which improves defect rates without degrading performance.
  • C++ gives you inheritance and dynamic binding which let old code call new code, making it possible to quickly extend/adapt your software to hit narrow market windows.

Are virtual functions (dynamic binding) central to OO/C++?

Yes and no! OO-style dynamic polymorphism, which you get by calling virtual functions, is one of the two major ways C++ offers to achieve polymorphism, and the one you should use for things that can’t be known at compile time. The other is generic-programming-style static polymorphism, which you get by using templates, and you should often use for things that are known at compile time. They’re two great tastes that taste great together.
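The two kinds of polymorphism can be sketched side by side. This is only an illustrative sketch; Shape, Circle, Square, Triangle, describe, and describe_static are hypothetical names invented for the example.

```cpp
#include <string>

// Dynamic polymorphism: the concrete type is chosen at run time.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

// Works with any Shape, including classes written after this function.
std::string describe(const Shape& s)
{
    return "a " + s.name();
}

// Static polymorphism: the type is fixed at compile time.
// T only needs a name() member; it need not derive from Shape.
template <typename T>
std::string describe_static(const T& t)
{
    return "a " + t.name();
}

struct Triangle {                       // unrelated to Shape
    std::string name() const { return "triangle"; }
};
```

Here describe() dispatches through the virtual function at run time, while describe_static() is instantiated separately for each type it is used with, so it also accepts Triangle, which shares no base class with the others.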

Without virtual functions, C++ wouldn’t be object-oriented. Operator overloading and non-virtual member functions are great, but they are, after all, just syntactic sugar for the more typical C notion of passing a pointer to a struct to a function. The standard library contains numerous templates that illustrate “generic programming” techniques, which are also great, but virtual functions are still at the heart of object-oriented programming using C++.

From a business perspective, there is very little reason to switch from straight C to C++ without virtual functions (for now we’ll ignore generic programming and the standard library). Technical people often think that there is a large difference between C and non-OO C++, but without OO, the difference usually isn’t enough to justify the cost of training developers, new tools, etc. In other words, if I were to advise a manager regarding whether to switch from C to non-OO C++ (i.e., to switch languages but not paradigms), I’d probably discourage him or her unless there were compelling tool-oriented reasons. From a business perspective, OO can help make systems extensible and adaptable, but just the syntax of C++ classes without OO may not even reduce the maintenance cost, and it surely adds to the training cost significantly.

Bottom line: C++ without virtual is not OO. Programming with classes but without dynamic binding is called “object based,” but not “object oriented.” Throwing out virtual functions is the same as throwing out OO. All you have left is object-based programming, similar to the original Ada language (the updated Ada language, by the way, supports true OO rather than just object-based programming).

Note: you don’t need virtual functions for generic programming. Among other things, this means you can’t tell which paradigm you’ve used simply by counting the number of virtual functions you have.

I’m from Missouri. Can you give me a simple reason why virtual functions (dynamic binding, dynamic polymorphism) and templates (static polymorphism) make a big difference?

They can improve reuse by letting old code call new code provided at run time (virtual functions) or compile time (templates).

Before OO and generic programming came along, reuse was accomplished by having new code call old code. For example, a programmer might write some code that called some reusable code such as printf().

With OO and generic programming, reuse can also be accomplished by having old code call new code. For example, a programmer might write some code that is called by a framework that was written by their great, great grandfather. There’s no need to change great-great-grandpa’s code. In fact, for dynamic binding with virtual functions, it doesn’t even need to be recompiled. Even if all you have left is the object file and the source code that great-great-grandpa wrote was lost 25 years ago, that ancient object file will call the new extension without anything falling apart.

That is extensibility, and that is OO and generic programming for powerful reusable abstraction.
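The “old code calls new code” idea can be sketched as follows. The framework here is purely hypothetical (Handler, run_all, and Shouter are names invented for the example); the point is only that run_all never changes when new handlers are added.

```cpp
#include <string>
#include <vector>

// "Old" framework code: written once, never edited again.
class Handler {
public:
    virtual ~Handler() = default;
    virtual std::string handle(const std::string& input) = 0;
};

// The framework only knows about Handler, yet it ends up calling
// whatever derived classes are registered later.
std::vector<std::string> run_all(const std::vector<Handler*>& handlers,
                                 const std::string& input)
{
    std::vector<std::string> results;
    for (Handler* h : handlers)
        results.push_back(h->handle(input));   // old code calls new code
    return results;
}

// "New" code: added long after the framework was written.
class Shouter : public Handler {
public:
    std::string handle(const std::string& input) override
    {
        return input + "!";
    }
};
```

Passing a Shouter to run_all makes the unchanged framework function call Shouter::handle through the virtual function.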

Is C++ backward compatible with ANSI/ISO C?

C++ is as close as possible to compatible with C, but no closer. In practice, the major difference is that C++ requires prototypes, and that f() declares a function that takes no parameters (in C, a function declared using f() can be passed an arbitrary number of parameters of arbitrary types).

There are some very subtle differences as well, like sizeof('x') is equal to sizeof(char) in C++ but is equal to sizeof(int) in C. Also, C++ puts structure “tags” in the same namespace as other names, whereas C requires an explicit struct (e.g., the typedef struct Fred Fred; technique still works, but is redundant in C++).
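A few of these differences can be checked directly by a C++ compiler. The following is a minimal sketch meant to be compiled as C++; make_fred is a hypothetical helper, and the comments describe what classic C would do instead.

```cpp
// Compile this as C++. The static_asserts document the C++ rules;
// the comments note where classic C behaves differently.

struct Fred { int value; };

// In C++ the tag alone names the type, so no typedef is required.
// (In C you would write "struct Fred" or add: typedef struct Fred Fred;)
Fred make_fred(int v)
{
    Fred f;
    f.value = v;
    return f;
}

// In C++ a character literal has type char.
// (In C it has type int, so sizeof('x') equals sizeof(int) there.)
static_assert(sizeof('x') == sizeof(char), "character literals are char in C++");
static_assert(sizeof('x') == 1, "sizeof(char) is 1 by definition");

// In C++ an empty parameter list means "takes no arguments".
int f() { return 42; }
// f(1, 2);   // error in C++; classic C accepts such a call when f was
//            // only declared as "int f();"
```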

Why is C++ (almost) compatible with C?

When Stroustrup invented C++, he wanted C++ to be compatible with a complete language with sufficient performance and flexibility for even the most demanding systems programming. He “had a perfect dread of producing yet-another pretty language with unintentional limitations.” See Section 2.7 of The Design and Evolution of C++ for historical details.

At the time, Stroustrup considered C the best systems programming language available. That was not as obvious then (1979) as it later became, but Stroustrup had experts such as Dennis Ritchie, Steve Johnson, Sandy Fraser, Greg Chesson, Doug McIlroy, and Brian Kernighan down the corridor from whom he could learn and get feedback. Without their help and advice, and without C, C++ would have been stillborn.

Contrary to repeated rumors, Stroustrup was never told that he had to use C; nor was he ever told not to use C. In fact, the first C++ manual grew from troff source of the C manual contributed by Dennis Ritchie. Many new languages were designed at Bell Labs; in “Research” at least, there were no rules enforcing language bigotry.

Why was C++ invented?

Stroustrup wanted to write efficient systems programs in the styles encouraged by Simula67. To do that, he added facilities for better type checking, data abstraction, and object-oriented programming to C. The more general aim was to design a language in which developers could write programs that were both efficient and elegant. Many languages force you to choose between those two alternatives.

The specific tasks that caused Stroustrup to start designing and implementing C++ (initially called “C with Classes”) had to do with distributing operating system facilities across a network.

Where did the name C++ come from?

In Chapter 3 of D&E, Stroustrup wrote:
I picked C++ because it was short, had nice interpretations, and wasn’t of the form “adjective C.”
In C, ++ can, depending on context, be read as “next,” “successor,” or “increment,” though it is always pronounced “plus plus.” The name C++ and its runner up ++C are fertile sources for jokes and puns – almost all of which were known and appreciated before the name was chosen. The name C++ was suggested by Rick Mascitti. It was first used in December of 1983 when it was edited into the final copies of [Stroustrup,1984] and [Stroustrup,1984c].
In chapter 1 of TC++PL, Stroustrup wrote:
The name C++ (pronounced “see plus plus”) was coined by Rick Mascitti in the summer of 1983. The name signifies the evolutionary nature of the changes from C; “++” is the C increment operator. The slightly shorter name “C+” is a syntax error; it has also been used as the name of an unrelated language. Connoisseurs of C semantics find C++ inferior to ++C. The language is not called D, because it is an extension of C, and it does not attempt to remedy problems by removing features. For yet another interpretation of the name C++, see the appendix of [Orwell,1949].

The “C” in C++ has a long history. Naturally, it is the name of the language Dennis Ritchie designed. C’s immediate ancestor was an interpreted descendant of BCPL called B designed by Ken Thompson. BCPL was designed and implemented by Martin Richards from Cambridge University while visiting MIT in the other Cambridge. BCPL in turn was Basic CPL, where CPL is the name of a rather large (for its time) and elegant programming language developed jointly by the universities of Cambridge and London. Before the London people joined the project “C” stood for Cambridge. Later, “C” officially stood for Combined. Unofficially, “C” stood for Christopher because Christopher Strachey was the main power behind CPL.

Why does C++ allow unsafe code?

That is, why does C++ support operations that can be used to violate the rules of static (compile-time) type safety?
  • to access hardware directly (e.g. to treat an integer as a pointer to (address of) a device register)
  • to achieve optimal run-time and space performance (e.g. unchecked access to elements of an array and unchecked access to an object through a pointer)
  • to be compatible with C
That said, it is a good idea to avoid unsafe code like the plague whenever you don’t actually need one of those three features:
  • don’t use casts
  • keep C-style [] arrays out of interfaces (hide them in the innards of high-performance functions and classes where they are needed and write the rest of the program using proper strings, vectors, etc.)
  • avoid void* (keep them inside low-level functions and data structures if you really need them and present type safe interfaces, usually templates, to your users)
  • avoid unions
  • if you have any doubts about the validity of a pointer, use a smart pointer instead
  • don’t use “naked” new and delete (use containers, resource handles, etc., instead)
  • don’t use ...-style variadic functions (“printf style”)
  • avoid macros except for #include guards

Almost all C++ code can follow these simple rules. Please don’t be confused by the fact that you cannot follow these rules if you write C code or C-style code in C++.
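A few of the rules above can be sketched in code. This is only an illustration under assumed names (total, make_label, and first_of are hypothetical), and it uses std::make_unique, which requires C++14.

```cpp
#include <memory>
#include <numeric>
#include <string>
#include <vector>

// A vector in the interface instead of a C-style array plus a length.
int total(const std::vector<int>& values)
{
    return std::accumulate(values.begin(), values.end(), 0);
}

// A resource handle instead of "naked" new and delete: the string is
// destroyed automatically when the unique_ptr goes out of scope.
std::unique_ptr<std::string> make_label(const std::string& text)
{
    return std::make_unique<std::string>("[" + text + "]");
}

// A template instead of void*: the element type is checked at compile time.
template <typename T>
const T& first_of(const std::vector<T>& values)
{
    return values.front();   // precondition: values is not empty
}
```

None of these functions needs a cast, a raw array, or an explicit delete, yet each does the job a C-style version would do with pointers and lengths.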

Why are some things left undefined in C++?

Because machines differ and because C left many things undefined. For details, including definitions of the terms “undefined”, “unspecified”, “implementation defined”, and “well-formed”, see the ISO C++ standard. Note that the meanings of those terms differ from their definitions in the ISO C standard and from some common usage. You can get wonderfully confused discussions when people don’t realize that not everybody shares definitions.

This is a correct, if unsatisfactory, answer. Like C, C++ is meant to exploit hardware directly and efficiently. This implies that C++ must deal with hardware entities such as bits, bytes, words, addresses, integer computations, and floating-point computations the way they are on a given machine, rather than how we might like them to be. Note that many “things” that people refer to as “undefined” are in fact “implementation defined”, so that we can write perfectly specified code as long as we know which machine we are running on. Sizes of integers and the rounding behavior of floating-point computations fall into that category.

Consider what is probably the best known and most infamous example of undefined behavior: an out-of-range access to an array, whether written directly as a[100] or through a pointer as p[100].

The C++ (and C) notions of array and pointer are direct representations of a machine’s notion of memory and addresses, provided with no overhead. The primitive operations on pointers map directly onto machine instructions. In particular, no range checking is done. Doing range checking would impose a cost in terms of run time and code size. C was designed to outcompete assembly code for operating systems tasks, so that was a necessary decision. Also, C – unlike C++ – has no reasonable way of reporting a violation had a compiler decided to generate code to detect it: There are no exceptions in C. C++ followed C for reasons of compatibility and because C++ also competes directly with assembler (in OS, embedded systems, and some numeric computation areas). If you want range checking, use a suitable checked class (vector, smart pointer, string, etc.). A good compiler could catch the range error for a[100] at compile time; catching the one for p[100] is far more difficult, and in general it is impossible to catch every range error at compile time.
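As a sketch of what “use a suitable checked class” can look like, the standard vector offers at(), which checks the index and throws std::out_of_range, alongside the unchecked operator[]. The helper can_read below is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if index i can be read from v without a range error.
// vector::at() checks the index and throws std::out_of_range,
// whereas v[i] (like a[i] on a built-in array) performs no check.
bool can_read(const std::vector<int>& v, std::size_t i)
{
    try {
        (void)v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

For a vector of ten elements, can_read reports index 9 as readable and index 100 as a range error, instead of silently reading whatever memory happens to be there.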
Other examples of undefined behavior stem from the compilation model. A compiler cannot detect an inconsistent definition of an object or a function in separately compiled translation units. For example, suppose two source files, file1.c and file2.c, each define S differently.

Compiling file1.c and file2.c and linking the results into the same program is illegal in both C and C++. A linker could catch the inconsistent definition of S, but is not obliged to do so (and most don’t). In many cases, it can be quite difficult to catch inconsistencies between separately compiled translation units. Consistent use of header files helps minimize such problems and there are some signs that linkers are improving. Note that C++ linkers do catch almost all errors related to inconsistently declared functions.
Finally, we have the apparently unnecessary and rather annoying undefined behavior of individual expressions. For example, consider initializing a variable j with the expression ++i+i++.

The value of j is unspecified to allow compilers to produce optimal code. It is claimed that the difference between what can be produced giving the compiler this freedom and requiring “ordinary left-to-right evaluation” can be significant. Leading experts are unconvinced, but with innumerable compilers “out there” taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. It is disappointing that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
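The unspecified order of argument evaluation can be made visible, and avoided, with a small sketch. All names here (trace, first, second, add, sum_unordered, sum_ordered) are hypothetical and exist only for the illustration.

```cpp
#include <string>

std::string trace;                       // records the order of calls

int first()  { trace += '1'; return 1; }
int second() { trace += '2'; return 2; }

int add(int a, int b) { return a + b; }

// The two arguments may be evaluated in either order, so trace ends up
// as "12" or "21" depending on the compiler.
int sum_unordered()
{
    trace.clear();
    return add(first(), second());
}

// Separate statements are evaluated in order, so trace is always "12".
int sum_ordered()
{
    trace.clear();
    int a = first();
    int b = second();
    return add(a, b);
}
```

Both functions return the same sum; only the side effects differ. When the order of side effects matters, splitting the calls into separate statements as in sum_ordered removes the dependence on the compiler.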

There is a sentiment that too many “things” are left undefined, unspecified, implementation-defined, etc. To address this, the ISO C++ committee has created Study Group 12 to review and recommend wide-ranging tightening-up to reduce undefined, unspecified, and implementation-defined behavior.

Why is portability considered so important?

Successful software is long-lived; life-spans of decades are not uncommon. A good application/program often outlives the hardware it was designed for, the operating system it was written for, the database system it initially used, etc. Often, a good piece of software outlives the companies that supplied the basic technologies used to build it.

Often a successful application/program has customers/users who prefer a variety of platforms. The set of desirable platforms changes as the user population changes. Being tied to a single platform or single vendor limits the application/program’s potential use.

Obviously, complete platform independence is incompatible with the ability to use all platform specific facilities. However, you can often approximate platform independence for an application by accessing platform facilities through a “thin interface” representing the application’s view of its environment as a library.
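A “thin interface” of this kind might look like the following minimal sketch. The namespace platform and the functions path_separator and join_path are hypothetical names; the only platform-specific detail assumed is the _WIN32 macro that Windows compilers predefine.

```cpp
#include <string>

// Hypothetical "thin interface": the application's view of one
// platform facility (how path components are separated).
namespace platform {

// The only place that mentions a platform-specific macro.
inline char path_separator()
{
#ifdef _WIN32
    return '\\';
#else
    return '/';
#endif
}

} // namespace platform

// Application code is written against the thin interface only.
std::string join_path(const std::string& dir, const std::string& file)
{
    return dir + platform::path_separator() + file;
}
```

Porting to a new platform then means revisiting the small platform namespace rather than every place in the application that builds a path.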

Is C++ standardized?


The C++ standard was finalized and adopted by ISO (International Organization for Standardization) as well as several national standards organizations such as INCITS (the U.S. National Committee for Information Technology Standards), BSI (the British Standards Institution), and DIN (the German national standards organization). The ISO standard was finalized and adopted by unanimous vote in November 1997, with minor updates in 2003 and significant and valuable updates in 2011. Another set of updates was published in 2014.

The U.S. C++ committee is called “PL22.16”. The ISO C++ standards group is called “WG21”. The major players in the C++ standards process have included just about everyone: representatives from Australia, Canada, Denmark, Finland, France, Germany, Ireland, Japan, the Netherlands, New Zealand, Sweden, the UK, and the USA, along with representatives from about a hundred companies and many interested individuals. Major players have included AT&T, Ericsson, Digital, Borland, Hewlett Packard, IBM, Intel, Mentor Graphics, Microsoft, NVidia, Silicon Graphics, Sun Microsystems, and Siemens.

Who is on the standardization committee?

The committee consists of a large number of people (about 200) out of whom about 100 turn up at the week-long meetings two or three times a year. In addition there are national standards groups and meetings in several countries. Most members contribute either by attending meetings, by taking part in email discussions, or by submitting papers for committee consideration. Most members have friends and colleagues who help them. From day #1, the committee has had members from many countries and at every meeting people from half a dozen to a dozen countries attend. The final votes are done by about 20 national standards bodies. Thus, the ISO C++ standardization is a fairly massive effort, not a small coherent group of people working to create a perfect language for “people just like themselves.” The standard is what this group of volunteers can agree on as being the best they can produce that all can live with.

Naturally, many (but not all) of these volunteers have day jobs focused on C++: They include compiler writers, tool builders, library writers, application builders, researchers, book authors, consultants, test-suite builders, and more.

Here is a very-partial list of some major organizations involved: Adobe, Apple, Boost, Bloomberg, EDG, Google, HP, IBM, Intel, Microsoft, Oracle, Red Hat.

Here is a short list of names of members who you may have encountered in the literature or on the web: Dave Abrahams, Matt Austern, Pete Becker, Hans Boehm, Steve Clamage, Lawrence Crowl, Beman Dawes, Francis Glassborow, Doug Gregor, Pablo Halpern, Howard Hinnant, Jaakko Jarvi, John Lakos, Alisdair Meredith, Jens Maurer, Jason Merrill, Sean Parent, P.J. Plauger, Tom Plum, Gabriel Dos Reis, Bjarne Stroustrup, Herb Sutter, David Vandevoorde, Michael Wong. Apologies to the 200+ current and past members that we couldn’t list. Also, please note the author lists on the various papers: a standard is written by (many) individuals, not by an anonymous committee.

You can get a better impression of the breadth and depth of expertise involved by examining the authors listed in the WG21 papers archive, but please remember there are major contributors to the standards effort who do not write a lot.

What is the difference between C++98 and C++03?

From a programmer’s view there is none. The C++03 revision of the standard was a bug fix release for implementers to ensure greater consistency and portability. In particular, tutorial and reference material describing C++98 and C++03 can be used interchangeably by all except compiler writers and standards gurus.

What is the difference between C++98 and C++11?

This will be covered in detail in a separate section.

Note that the C++ language will remain stable because compatibility is always a major concern. The committee tries hard not to break your (standard conforming) code. Except for some corner cases you’re unlikely to notice, all valid C++98 code is valid C++11 and C++14 code.

What is the difference between C++11 and C++14?

This will be covered in a separate section.

Note that the C++ language will remain stable because compatibility is always a major concern. The committee tries hard not to break your (standard conforming) code. Except for some corner cases you’re unlikely to notice, all valid C++98 code is valid C++14 code.

What are some “interview questions” I could ask that would let me know if candidates really know their stuff?

This answer is primarily for non-technical managers and HR folks who are trying to do a good job at interviewing C++ candidates. If you’re a C++ programmer about to be interviewed, and if you’re lurking in this FAQ hoping to know the questions they’ll ask you ahead of time so you can avoid having to really learn C++, shame on you: spend your time becoming technically competent and you won’t have to try to “cheat” your way through life!

Back to the non-technical manager / HR person: obviously you are eminently qualified to judge whether a candidate is a good “fit” with your company’s culture. However there are enough charlatans, wannabes, and posers out there that you really need to team up with someone who is technically competent in order to make sure the candidate has the right level of technical skill. A lot of companies have been burned by hiring nice but incompetent duds — people who were incompetent in spite of the fact that they knew the answers to a few obscure questions. The only way to smoke out the posers and wannabes is to get someone in with you who can ask penetrating technical questions. You have no hope whatsoever of doing that yourself. Even if I gave you a bunch of “tricky questions,” they wouldn’t smoke out the bad guys.

Your technical sidekick might not be (and often isn’t) qualified to judge the candidate on personality or soft skills, so please don’t abdicate your role as the final arbiter in the decision making process. But please don’t think you can ask a half dozen C++ questions and have the slightest clue if the candidate really knows what they’re talking about from a technical perspective.

Having said all that, if you’re technical enough to read the C++ FAQ, you can dig up a lot of good interview questions here. The FAQ has a lot of goodies that will separate the wheat from the chaff. The FAQ focuses on what programmers should do, as opposed to merely what the compiler will let them do. There are things that can be done in C++ but shouldn’t be done. The FAQ helps people separate those two.

What does the FAQ mean by “such and such is evil”?

It means such and such is something you should avoid most of the time, but not something you should avoid all the time. For example, you will end up using these “evil” things whenever they are “the least evil of the evil alternatives.” It’s a joke, okay? Don’t take it too seriously.

The real purpose of the term (“Ah ha,” I hear you saying, “there really is a hidden motive!”; you’re right: there is) is to shake new C++ programmers free from some of their old thinking. For example, C programmers who are new to C++ often use pointers, arrays and/or #define more than they should. The FAQ lists those as “evil” to give new C++ programmers a vigorous (and droll!) shove in the right direction. The goal of farcical things like “pointers are evil” is to convince new C++ programmers that C++ really isn’t “just like C except for those silly // comments.”

Now let’s get real here. I’m not suggesting macros or arrays or pointers are right up there with murder or kidnapping. Well, maybe pointers. (Just kidding!) So don’t get all hyper about the word “evil”: it’s supposed to sound a little outrageous. And don’t look for a technically precise definition of exactly when something is or isn’t “evil”: there isn’t one.

Items labeled as “evil” (macros, arrays, pointers, etc.) aren’t always bad in all situations. When they are the “least bad” of the alternatives, use them!

Will I sometimes use any so-called “evil” constructs?

Of course you will!
One size does not fit all. Stop. Right now, take out a fine-point marker and write on the inside of your glasses: Software Development Is Decision Making. “Think” is not a four-letter word. There are very few “never…” and “always…” rules in software — rules that you can apply without thinking — rules that always work in all situations in all markets — one-size-fits-all rules.
In plain English, you will have to make decisions, and the quality of your decisions will affect the business value of your software. Software development is not mostly about slavishly following rules; it is a matter of thinking and making tradeoffs and choosing. And sometimes you will have to choose between a bunch of bad options. When that happens, the best you can hope for is to choose the least bad of the alternatives, the lesser of the “evils.”

You will occasionally use approaches and techniques labeled as “evil.” If that makes you uncomfortable, mentally change the word “evil” to “frequently undesirable.”

Is it important to know the technical definition of “good OO”? Of “good class design”?

You might not like this, but the short answer is, “No.” (With the caveat that this answer is directed to practitioners, not theoreticians.)
Mature software designers evaluate situations based on business criteria (time, money and risk) in addition to technical criteria like whether something is or is not “good OO” or “good class design.” This is a lot harder since it involves business issues (schedule, skill of the people, finding out where the company wants to go so we know where to design flexibility into the software, willingness to factor in the likelihood of future changes, meaning changes that are likely rather than merely theoretically possible, etc.) in addition to technical issues. However, it results in decisions that are a lot more likely to bring good business results.
As a developer, you have a fiduciary responsibility to your employer to invest only in ways that have a reasonable expectation for a return on that investment. If you don’t ask the business questions in addition to the technical questions, you will make decisions that have random and unpredictable business consequences.
Like it or not, what that means in practice is that you’re probably better off leaving terms like “good class design” and “good OO” undefined. In fact, I believe precise, pure-technical definitions of those terms can be dangerous and can cost companies money, ultimately perhaps even costing people their jobs. That sounds bizarre, but there’s a really good reason: if these terms are defined in precise, pure-technical terms, well-meaning developers tend to ignore business considerations in their desire to fulfill these pure-technical definitions of “good.”
Any purely technical definition of “good,” such as “good OO” or “good design” or anything else that can be evaluated without regard to schedule, business objectives (so we know where to invest), expected future changes, corporate culture with respect to a willingness to invest in the future, skill levels of the team that will be doing the maintenance, etc., is dangerous. It is dangerous because it deceives programmers into thinking they are making “right” decisions when in reality they might be making decisions that have terrible consequences. Or those decisions might not have terrible business consequences, but that’s the point: when you ignore business considerations while making decisions, the business consequences will be random and somewhat unpredictable. That’s bad.

It is a simple fact that business issues dominate technical issues, and any definition of “good” that fails to acknowledge that fact is bad.