Monday, January 20, 2014

iterators in c++

STL iterators in c++ : Standard Template Library Tutorial, ccplusplus.com
An iterator is an object that can "iterate" (navigate) over elements. These elements may be all or part of the elements of a STL container. An iterator represents a certain position in a container. The following fundamental operations define the behavior of an iterator:
  • Operator *
    Returns the element of the actual position. If the elements have members, you can use operator -> to access those members directly from the iterator (In some older environments, operator -> might not work yet for iterators.)
  • Operator ++
    Lets the iterator step forward to the next element. Most iterators also allow stepping backward by using operator - -.
  • Operators == and !=
    Return whether two iterators represent the same position.
  • Operator =
    Assigns an iterator (the position of the element to which it refers).
These operations are exactly the interface of ordinary pointers in C and C++ when they are used to iterate over the elements of an array. The difference is that iterators may be smart pointers — pointers that iterate over more complicated data structures of containers. The internal behavior of iterators depends on the data structure over which they iterate. Hence, each container type supplies its own kind of iterator. In fact, each container class defines its iterator type as a nested class. As a result, iterators share the same interface but have different types. This leads directly to the concept of generic programming: Operations use the same interface but different types, so you can use templates to formulate generic operations that work with arbitrary types that satisfy the interface.
All container classes provide the same basic member functions that enable them to use iterators to navigate over their elements. The most important of these functions are as follows:
  • begin()
    Returns an iterator that represents the beginning of the elements in the container. The beginning is the position of the first element (if any).
  • end()
    Returns an iterator that represents the end of the elements in the container. The end is the position behind the last element. Such an iterator is also called a past-the-end iterator.

Thus, begin() and end() define a half-open range that includes the first element but excludes the last (see figure below). A half-open range has two advantages:
Figure : begin() and end() for Containers

  1. You have a simple end criterion for loops that iterate over the elements: They simply continue as long as end() is not reached.
  2. It avoids special handling for empty ranges. For empty ranges, begin() is equal to end().
Here is an example demonstrating the use of iterators. It prints all elements of a list container (it is the promised enhanced version of the first list example).



After the list is created and filled with the characters 'a' through 'z', all elements are printed within a for loop:


The iterator pos is declared just before the loop. Its type is the iterator type for constant element access of its container class:


Every container defines two iterator types:
  1. container::iterator
    is provided to iterate over elements in read/write mode.
  2. container: : const_iterator
    is provided to iterate over elements in read-only mode.
For example, in class list the definitions might look like the following:


The exact type of iterator and const_iterator is implementation defined. Inside the for loop, the iterator pos first gets initialized with the position of the first element:


The loop continues as long as pos has not reached the end of the container elements:


Here, pos is compared with the past-the-end iterator. While the loop runs the increment operator, ++pos navigates the iterator pos to the next element.

All in all, pos iterates from the first element, element-by-element, until it reaches the end (figure below). If the container has no elements, the loop does not run because coll.begin() would equal coll.end().

Figure : Iterator pos Iterating Over Elements of a List



In the body of the loop, the expression *pos represents the actual element. In this example, it is written followed by a space character. You can't modify the elements because a const_iterator is used. Thus, from the iterator's point of view the elements are constant. However, if you use a nonconstant iterator and the type of the elements is nonconstant, you can change the values. For example:


Note that the preincrement operator (prefix ++) is used here. This is because it might have better performance than the postincrement operator. The latter involves a temporary object because it must return the old position of the iterator. For this reason, it generally is best to prefer ++pos over pos++. Thus, you should avoid the following version:


For this reason, I recommend using the preincrement and pre-decrement operators in general.

Examples of Using Associative Containers

The iterator loop in the previous example could be used for any container. You only have to adjust the iterator type. Now you can print elements of associative containers. The following are some examples of the use of associative containers.
Examples of Using Sets and Multisets
The first example shows how to insert elements into a set and to use iterators to print them:


As usual, the include directive


defines all necessary types and operations of sets.
The type of the container is used in several places, so first a shorter type name gets defined:


This statement defines type IntSet as a set for elements of type int. This type uses the default sorting criterion, which sorts the elements by using operator <. This means the elements are sorted in ascending order. To sort in descending order or use a completely different sorting criterion, you can pass it as a second template parameter. For example, the following statement defines a set type that sorts the elements in descending order (Note that you have to put a space between the two ">" characters. ">>" would be parsed as shift operator, which would result in a syntax error.)


greater<> is a predefined function object. For a sorting criterion that uses only a part of the data of an object (such as the ID). 

All associative containers provide an insert() member function to insert a new element:


The new element receives the correct position automatically according to the sorting criterion. You can't use the push_back() or push_front() functions provided for sequence containers. They make no sense here because you can't specify the position of the new element.


After all values are inserted in any order, the state of the container is as shown in figure below. The elements are sorted into the internal tree structure of the container so that the value of the left child of an element is always less (with respect to the actual sorting criterion) and the value of the right child of an element is always greater. Duplicates are not allowed in a set, so the container contains the value 1 only once.

Figure : A Set that Has Six Elements

To print the elements of the container, you use the same loop as in the previous list example. An iterator iterates over all elements and prints them:


Again, because the iterator is defined by the container, it does the right thing, even if the internal structure of the container is more complicated. For example, if the iterator refers to the third element, operator ++ moves to the fourth element at the top. After the next call of operator ++ the iterator refers to the fifth element at the bottom (figure below).
Figure : Iterator pos Iterating over Elements of a Set

The output of the program is as follows:


If you want to use a multiset rather than a set, you need only change the type of the container (the header file remains the same):


A multiset allows duplicates, so it would contain two elements that have value 1. Thus, the output of the program would change to the following:


Examples of Using Maps and Multimaps
The elements of maps and multimaps are key/value pairs. Thus, the declaration, the insertion, and the access to elements are a bit different. Here is an example of using a multimap:


The program may have the following output:


However, because "this" and "is" have the same key, their order might be the other way around. When you compare this example with the set example on page 87, you can see the following two differences:
  1. The elements are key/value pairs, so you must create such a pair to insert it into the collection. The auxiliary function make_pair() is provided for this purpose. See page 203 for more details and other possible ways to insert a value.
  2. The iterators refer to key/value pairs. Therefore, you can't just print them as a whole. Instead, you must access the members of the pair structure, which are called first and second. Thus, the expression


    yields the second part of the key/value pair, which is the value of the multimap element. As with ordinary pointers, the expression is defined as an abbreviation for (In some older environments, operator -> might not work yet for iterators. In this case, you must use the second version.)


    Similarly, the expression


    yields the first part of the key/value pair, which is the key of the multimap element.
Multimaps can also be used as dictionaries. See page 209 for an example.
Maps as Associative Arrays
In the previous example, if you replace type multimap with map you would get the output without duplicate keys (the values might still be the same). However, a collection of key/value pairs with unique keys could also be thought of as an associative array. Consider the following example:


The declaration of the container type must specify both the type of the key and the type of the value:


Maps enable you to insert elements by using the subscript operator [ ]:


Here, the index is used as the key and may have any type. This is the interface of an associative array. An associative array is an array in which the index may be of an arbitrary type.

Note that the subscript operator behaves differently than the usual subscript operator for arrays: Not having an element for an index is not an error. A new index (or key) is taken as a reason to create and to insert a new element of the map that has the index as the key. Thus, you can't have a wrong index. Therefore, in this example in the statement


the expression


creates a new element that has the key "Null". The assignment operator assigns 0 (which gets converted into float) as the value.

You can't use the subscript operator for multimaps. Multimaps allow multiple elements that have the same key, so the subscript operator makes no sense because it can handle only one value. As shown on page 90, you must create key/value pairs to insert elements into a multimap. You can do the same with maps. See page 202 for details.

Similar to multimaps, for maps to access the key and the value of an element you have to use the first and second members of the pair structure. The output of the program is as follows:


Iterator Categories

Iterators can have capabilities in addition to their fundamental operations. The additional abilities depend on the internal structure of the container type. As usual, the STL provides only those operations that have good performance. For example, if containers have random access (such as vectors or deques) their iterators are also able to perform random access operations (for example, positioning the iterator directly at the fifth element).
Iterators are subdivided into different categories that are based on their general abilities. The iterators of the predefined container classes belong to one of the following two categories:
  1. Bidirectional iterator
    As the name indicates, bidirectional iterators are able to iterate in two directions: forward, by using the increment operator, and backward, by using the decrement operator. The iterators of the container classes list, set, multiset, map, and multimap are bidirectional iterators.
  2. Random access iterator
    Random access iterators have all the properties of bidirectional iterators. In addition, they can perform random access. In particular, they provide operators for "iterator arithmetic" (in accordance with "pointer arithmetic" of an ordinary pointer). You can add and subtract offsets, process differences, and compare iterators by using relational operators such as < and >. The iterators of the container classes vector and deque, and iterators of strings are random access iterators.
To write generic code that is as independent of the container type as possible, you should not use special operations for random access iterators. For example, the following loop works with any container:


However, the following does not work with all containers:


The only difference is the use of operator < instead of operator != in the condition of the loop. Operator < is only provided for random access iterators, so this loop does not work with lists, sets, and maps. To write generic code for arbitrary containers, you should use operator != rather than operator <. However, doing so might lead to code that is less safe. This is because you may not recognize that pos gets a position behind end(). It's up to you to decide which version to use. It might be a question of the context, or it might even be a question of taste.


To avoid misunderstanding, note that I am talking about "categories" and not "classes." A category only defines the abilities of iterators. The type doesn't matter. The generic concept of the STL works with pure abstraction; that is, anything that behaves like a bidirectional iterator is a bidirectional iterator.








-----------------------------------------------------------------
See Also:
-----------------------------------------------------------------

-----------------------------------------------------------------

No comments:

Post a Comment