Friday, January 24, 2014

c++ stl vector operations

c++ stl vector operations : Standard Template Library Tutorial, ccplusplus.com
A vector models a dynamic array. Thus, it is an abstraction that manages its elements with a dynamic array (figure below). However, note that the standard does not specify that the implementation use a dynamic array. Rather, it follows from the constraints and specification of the complexity of its operation.
Figure: Structure of a Vector

To use a vector, you must `include the header file <vector>:


There, the type is defined as a template class inside namespace std:


The elements of a vector may have any type T that is assignable and copyable. The optional second template parameter defines the memory model. The default memory model is the model allocator, which is provided by the C++ standard library.

Abilities of Vectors


Vectors copy their elements into their internal dynamic array. The elements always have a certain order. Thus, vectors are a kind of ordered collection. Vectors provide random access. Thus, you can access every element directly in constant time, provided you know its position. The iterators are random access iterators, so you can use any algorithm of the STL.

Vectors provide good performance if you append or delete elements at the end. If you insert or delete in the middle or at the beginning, performance gets worse. This is because every element behind has to be moved to another position. In fact, the assignment operator would be called for every following element.
Size and Capacity
Part of the way in which vectors give good performance is by allocating more memory than they need to contain all their elements. To use vectors effectively and correctly you should understand how size and capacity cooperate in a vector.

Vectors provide the usual size operations size(), empty(), and max_size(). An additional "size" operation is the capacity() function. capacity() returns the number of characters a vector could contain in its actual memory. If you exceed the capacity(), the vector has to reallocate its internal memory.

The capacity of a vector is important for two reasons:
  1. Reallocation invalidates all references, pointers, and iterators for elements of the vector.
  2. Reallocation takes time.
Thus, if a program manages pointers, references, or iterators into a vector, or if speed is a goal, it is important to take the capacity into account.

To avoid reallocation, you can use reserve() to ensure a certain capacity before you really need it. In this way, you can ensure that references remain valid as long as the capacity is not exceeded:


Another way to avoid reallocation is to initialize a vector with enough elements by passing additional arguments to the constructor. For example, if you pass a numeric value as parameter, it is taken as the starting size of the vector:


Of course, the type of the elements must provide a default constructor for this ability. But note that for complex types, even if a default constructor is provided, the initialization takes time. If the only reason for initialization is to reserve memory, you should use reserve().

The concept of capacity for vectors is similar to that for strings, with one big difference: Unlike strings, it is not possible to call reserve() for vectors to shrink the capacity. Calling reserve() with an argument that is less than the current capacity is a no-op. Furthermore, how to reach an optimal performance regarding speed and memory usage is implementation defined. Thus, implementations might increase capacity in larger steps. In fact, to avoid internal fragmentation, many implementations allocate a whole block of memory (such as 2K) the first time you insert anything if you don't call reserve() first yourself. This can waste Jots of memory if you have many vectors with only a few small elements.

Because the capacity of vectors never shrinks, it is guaranteed that references, pointers, and iterators remain valid even when elements are deleted or changed, provided they refer to a position before the manipulated elements. However, insertions may invalidate references, pointers, and iterators.

There is a way to shrink the capacity indirectly: Swapping the contents with another vector swaps the capacity. The following function shrinks the capacity while preserving the elements:


You can even shrink the capacity without calling this function by calling the following statement


However, note that after swap(), all references, pointers, and iterators swap their containers. They still refer to the elements to which they referred on entry. Thus, shrinkCapacity() invalidates all references, pointers, and iterators.

Vector Operations

Create, Copy, and Destroy Operations
Table below lists the constructors and destructors for vectors. You can create vectors with and without elements for initialization. If you pass only the size, the elements are created with their default constructor. Note that an explicit call of the default constructor also initializes fundamental types such as int with zero for some remarks about possible initialization sources.

Table:Constructors and Destructors of Vectors
Operation
Effect
vector<Elem> c
Creates an empty vector without any elements
vector<Elem> c1(c2)
Creates a copy of another vector of the same type (all elements are copied)
vector<Elem> c(n)
Creates a vector with n elements that are created by the default constructor
vector<Elem> c(n,elem)
Creates a vector initialized with n copies of element elem
vector<Elem> c(beg,end)
Creates a vector initialized with the elements of the range [beg,end)
c.~vector<Elem>()
Destroys all elements and frees the memory

Nonmodifying Operations

Table below lists all nonmodifying operations of vectors.

Table: Nonmodifying Operations of Vectors
Operation
Effect
c.size()
Returns the actual number of elements
c.empty()
Returns whether the container is empty (equivalent to size()==0, but might be faster)
c.max_size()
Returns the maximum number of elements possible
capacity()
Returns the maximum possible number of elements without reallocation
reserve()
Enlarges capacity, if not enough yet
c1 == c2
Returns whether c1 is equal to c2
c1 != c2
Returns whether c1 is not equal to c2 (equivalent to ! (c1==c2))
c1 < c2
Returns whether c1 is less than c2
c1 > c2
Returns whether c1 is greater than c2 (equivalent to c2<c1)
c1 <= c2
Returns whether c1 is less than or equal to c2 (equivalent to ! (c2<c1))
c1 >= c2
Returns whether c1 is greater than or equal to c2 (equivalent to ! (c1<c2))

Assignments

Table: Assignment Operations of Vectors
Operation
Effect
c1 = c2
Assigns all elements of c2 to c1
c.assign(n,elem)
Assigns n copies of element elem
c.assign(beg,end)
Assigns the elements of the range [beg,end)
c1.swap(c2)
Swaps the data of c1 and c2
swap(c1,c2)
Same (as global function)

Table above lists the ways to assign new elements while removing all ordinary elements. The set of assign() functions matches the set of constructors. You can use different sources for assignments (containers, arrays, standard input). All assignment operations call the default constructor, copy constructor, assignment operator, and/or destructor of the element type, depending on how the number of elements changes. For example:



Element Access


Table below shows all vector operations for direct element access. As usual in C and C++, the first element has index 0 and the last element has index size()-1. Thus, the nth element has index n-1. For nonconstant vectors, these operations return a reference to the element. Thus you could modify an element by using one of these operations (provided it is not forbidden for other reasons).

Table: Direct Element Access of Vectors
Operation
Effect
c.at(idx)
Returns the element with index idx (throws range error exception if idx is out of range)
c[idx]
Returns the element with index idx (no range checking)
c.front()
Returns the first element (no check whether a first element exists)
c.back()
Returns the last element (no check whether a last element exists)

The most important issue for the caller is whether these operations perform range checking. Only at() performs range checking. If the index is out of range, it throws an out_of_range exception. All other functions do not check. A range error results in undefined behavior. Calling operator [], front(), and back() for an empty container always results in undefined behavior:


So, you must ensure that the index for operator [] is valid and the container is not empty when either front() or back() is called:


Iterator Functions

Vectors provide the usual operators to get iterators (table below). Vector iterators are random access iterators. Thus, in principle you could use all algorithms of the STL.

Table: Iterator Operations of Vectors
Operation
Effect
c.begin()
Returns a random access iterator for the first element
c.end()
Returns a random access iterator for the position after the last element
c.rbegin()
Returns a reverse iterator for the first element of a reverse iteration
c.rend()
Returns a reverse iterator for the position after the last element of a reverse iteration

The exact type of these iterators is implementation defined. However, for vectors they are often ordinary pointers. An ordinary pointer is a random access iterator, and because the internal structure of a vector is usually an array, it has the correct behavior. However, you can't count on it. For example, if a safe version of the STL that checks range errors and other potential problems is used, the iterator type is usually an auxiliary class.

Iterators remain valid until an element with a smaller index gets inserted or removed, or reallocation occurs and capacity changes.
Inserting and Removing Elements
Table below shows the operations provided for vectors to insert or to remove elements. As usual by using the STL, you must ensure that the arguments are valid. Iterators must refer to valid positions, the beginning of a range must have a position that is not behind the end, and you must not try to remove an element from an empty container.

Regarding performance, you should consider that inserting and removing happens faster when
  • Elements are inserted or removed at the end
  • The capacity is large enough on entry
  • Multiple elements are inserted by a single call rather than by multiple calls
Inserting or removing elements invalidates references, pointers, and iterators that refer to the following elements. If an insertion causes reallocation, it invalidates all references, iterators, and pointers.

Table: Insert and Remove Operations of Vectors
Operation
Effect
c.insert(pos,elem)
Inserts at iterator position pos a copy of elem and returns the position of the new element
c.insert(pos,n,elem)
Inserts at iterator position pos n copies of elem (returns nothing)
c.insert(pos,beg,end)
Inserts at iterator position pos a copy of all elements of the range [beg,end) (returns nothing)
c.push_back(elem)
Appends a copy of elem at the end
c.pop_back()
Removes the last element (does not return it)
c.erase(pos)
Removes the element at iterator position pos and returns the position of the next element
c.erase(beg,end)
Removes all elements of the range [beg,end) and returns the position of the next element
c.resize(num)
Changes the number of elements to num (if size() grows, new elements are created by their default constructor)
c.resize(num,elem)
Changes the number of elements to num (if size() grows, new elements are copies of elem)
c.clear()
Removes all elements (makes the container empty)

Vectors provide no operation to remove elements directly that have a certain value. You must use an algorithm to do this. For example, the following statement removes all elements that have the value val:


To remove only the first element that has a certain value, you must use the following statements:


Using Vectors as Ordinary Arrays

The C++ standard library does not state clearly whether the elements of a vector are required to be in contiguous memory. However, it is the intention that this is guaranteed and it will be fixed due to a defect report. Thus, you can expect that for any valid index i in vector v, the following yields true:


This guarantee has some important consequences. It simply means that you can use a vector in all cases in which you could use a dynamic array. For example, you can use a vector to hold data of ordinary C-strings of type char* or const char*


Of course, you have to be careful when you use a vector in this way (like you always have to be careful when using dynamic arrays). For example, you have to ensure that the size of the vector is big enough to copy some data into it and that you have an '\0' element at the end if you use the contents as a C-string. However, this example shows that whenever you need an array of type T for any reason (such as for an existing C library) you can use a vector<T> and pass the address of the first element.

Note that you must not pass an iterator as the address of the first element. Iterators of vectors have an implementation-specific type, which may be totally different from an ordinary pointer:


Exception Handling

Vectors provide only minimal support for logical error checking. The only member function for which the standard requires that it may throw an exception is at(), which is the safe version of the subscript operator. In addition, the standard requires that only the usual standard exceptions may occur, such as bad_alloc for a lack of memory or exceptions of user-defined operations.

If functions called by a vector (functions for the element type or functions that are user supplied) throw exceptions, the C++ standard library guarantees the following:
  1. If an element gets inserted with push_back() and an exception occurs, this function has no effect.
  2. insert() either succeeds or has no effect if the copy operations (copy constructor and assignment operator) of the elements do not throw.
  3. pop_back() does not throw any exceptions.
  4. erase() and clear do not throw if the copy operations (copy constructor and assignment operator) of the elements do not throw.
  5. swap() does not throw.
  6. If elements are used that never throw exceptions on copy operations (copy constructor and assignment operator), every operation is either successful or has no effect. Such elements might be "plain old data" (POD). POD describes types that use no special C++ feature. For example, every ordinary C structure is POD.
All these guarantees are based on the requirements that destructors don't throw.

Examples of Using Vectors

The following example shows a simple usage of vectors:


The output of the program might look like this:


Note my use of the word "might." The values of max_size() and capacity() are implementation defined. Here, for example, you can see that the implementation doubles the capacity if the capacity no longer fits.

Class vector<bool>

For Boolean elements of a vector, the C++ standard library provides a specialization of vector. The goal is to have a version that is optimized to use less size than a usual implementation of vector for type bool. Such a usual implementation would reserve at least 1 byte for each element. The vector<bool> specialization usually uses internally only 1 bit for an element, so it is typically eight times smaller. Note that such an optimization also has a snag: In C++, the smallest addressable value must have a size of at least 1 byte. Thus, such a specialization of a vector needs special handling for references and iterators.

As a result, a vector<bool> does not meet all requirements of other vectors (for example, a vector<bool>::reference is not a true lvalue and vector<bool>::iterator is not a random access iterator). Therefore, template code might work for vectors of any type except bool. In addition, vector<bool> might perform slower than normal implementations because element operations have to be transformed into bit operations. However, how vector<bool> is implemented is implementation specific. Thus, the performance (speed and memory) might differ.

Note that class vector<bool> is more than a specialization of vector<> for bool. It also provides some special bit operations. You can handle bits or flags in a more convenient way.

vector<bool> has a dynamic size, so you can consider it a bitfield with dynamic size. Thus, you can add and remove bits. If you need a bitfield with static size, you should use bitset rather than a vector<bool>. 

Table: Special Operations of vector<bool>
Operation
Effect
c.flip()
Negates all Boolean elements (complement of all bits)
m[idx].flip()
Negates the Boolean element with index idx (complement of a single bit)
m[idx] = val
Assigns val to the Boolean element with index idx (assignment to a single bit)
m[idx1] = m[idx2]
Assigns the value of the element with index idx2 to the element with index idx1

The additional operations of vector<bool> are shown in Table: The operation flip(), which processes the complement, can be called for all bits and a single bit of the vector. Note that you can call flip() for a single Boolean element. This is surprising, because you might expect that the subscript operator returns bool and that calling flip() for such a fundamental type is not possible. Here the class vector<bool> uses a common trick, called a proxy[8] : For vector<bool>, the return type of the subscript operator (and other operators that return an element) is an auxiliary class. If you need the return value to be bool, an automatic type conversion is used. For other operations, the member functions are provided. The relevant part of the declaration of vector<bool> looks like this,


As you can see, all member functions for element access return type reference. Thus, you could also use the following statement:


As usual, to avoid undefined behavior, the caller must ensure that the first, last, and sixth elements exist.

The internal type reference is only used for nonconstant containers of type vector<bool>. The constant member functions for element access return ordinary values of type bool.








-----------------------------------------------------------------
See Also:
-----------------------------------------------------------------

-----------------------------------------------------------------

No comments:

Post a Comment