There are several ways in C++ to create an array of objects of some
type T
. Three common methods are:
#define N 10 // array size N is known at compile time T static_array[N]; int n = 20; // array size n is calculated at runtime T* dynamic_array = new T[n]; std::vector<T> vector_array; // array size can be changed at runtime
Of course, you can still use the calloc()
and malloc()
functions and your
program will compile and run, but it’s not a good idea to mix C and C++
unless you have to because you’re relying on legacy C libraries. However you
allocate the array, you can access an element in it using an unsigned
integer index:
const T& element_of_static_array = static_array[index]; const T& element_of_dynamic_array = dynamic_array[index]; const T& element_of_vector_array = vector_array[index];
Let’s deal with dynamic arrays and vectors first, and return to the static array later in this chapter.
What would happen if we provide an index
value that is
larger than or equal to the array size? In all three of the preceding
examples, the code will silently return garbage. (The exception to this
rule for Microsoft Visual Studio 2010 is discussed later.) The situation
is even worse if you decide to use the operator []
in the left-hand side
of an assignment:
some_array[index] = x;
Depending on your luck (or lack of thereof) you might overwrite some
other unrelated variable, an element of another array, or even a program
instruction, and in the latter case your program will most likely crash.
Each of these errors also provides opportunities for malicious intruders
to take over your program and turn it to bad ends. However, the std::vector
provides an
at(index)
function, which
does bounds checking by throwing an out_of_range
exception.
The problem with this is that if you want to do this sanity check, you
have to rigorously use the at()
function everywhere
for accessing an array element. And naturally, this slows your code down,
so once you are done testing, you’ll want to replace it everywhere with
the []
operator, which is
faster. But doing that replacement requires massive editing of your code,
which is a lot of work, followed by a need to retest it, because during
that tedious process you could accidentally mistype something.
So instead of the at()
function, I suggest
the following. Although a dynamic array leaves the []
operator totally out
of your control, the STL vector implements it as a C++ function that we
can rewrite according to our bug-hunting goals. And that’s what we’ll do
here. In the file scpp_vector.hpp we
redefine the []
operators as
follows:
T& operator [] (size_type index) { SCPP_TEST_ASSERT(index < std::vector<T>::size(), "Index " << index << " must be less than " << std::vector<T>::size()); return std::vector<T>:: operator[](index); } const T& operator [] (size_type index) const { SCPP_TEST_ASSERT(index < std::vector<T>::size(), "Index " << index << " must be less than " << std::vector<T>::size()); return std::vector<T>::operator[](index); }
Let’s see how this works. Here is an example of how to use it (including—intentionally—how not to use it):
#include <iostream> #include "scpp_vector.hpp" using namespace std; int main() { scpp::vector<int> vect; for(int i=0; i<3; ++i) vect.push_back(i); cout << "My vector = " << vect << endl; for(int i=0; i<=vect.size(); ++i) cout << "Value of vector at " << i << " is " << vect[i] << endl; return 0; }
First, note that instead of writing std::vector<int>
or
just vector<int>
we
wrote scpp
::vector<int>
. This
is to distinguish our vector from the STL’s vector. By using our
scpp::vector
we replace
the standard implementation—in this case, the implementation of operator []
—by our own safe implementation, and
you will see the same approach to preventing other bugs later in this
book. scpp::vector
also gives
you a <<
operator for
free, so you can print your vector as long as it is not too big, and as
long as the type T
defines the
<<
operator.
The next thing to notice is that in the second loop, instead of
writing i<vect.size()
we wrote
i<=vect.size()
. This is a very
common programming error, and we did it just to see what happens when the
index is out of bounds. Indeed, the program produces the following
output:
My vector = 0 1 2 Value of vector at 0 is 0 Value of vector at 1 is 1 Value of vector at 2 is 2 Index 3 must be less than 3 in file scpp_vector.hpp #17
This sanity check works as long as the symbol SCPP_TEST_ASSERT_ON
is
defined, and is easy to switch on and off at compile time when necessary.
The problem with this approach is that the vector’s []
operator is very often
used inside loops, so this sanity check is used a lot and therefore slows
the program down significantly just as using at()
would. If you feel
that this is becoming a problem in your program, feel free to define a new
macro, such as SCPP_TEST_ASSERT_INDEX_OUT_OF_BOUNDS
, which
would work exactly the same way as SCPP_TEST_ASSERT
but
would be used only inside scpp::vector::operator[]
.
SCPP_TEST_ASSERT_INDEX_OUT_OF_BOUNDS
should
differ from SCPP_TEST_ASSERT
only by
the fact that it can be switched on and off independently of the
SCPP_TEST_ASSERT
macro,
so you can deactivate it after you are sure that your code does not have
this bug while keeping the rest of your sanity checks active.
In addition to allowing you to catch this index-out-of-bounds error, the template vector has one advantage over statically and dynamically allocated arrays: its size grows as needed (as long as you don’t run out of memory). However, this advantage comes at a cost. The vector, if not told in advance how much memory will be needed, allocates some default amount (called its “capacity”). When the actual size reaches this capacity, the vector will allocate a bigger chunk of memory, copy old data into the new memory area, and release the old chunk of memory. So from time to time, adding a new element to a template vector could suddenly become slow. Therefore, if you know in advance what number of elements you will need, as with both static and dynamically allocated arrays, tell the vector up front, for instance, in the constructor:
scpp::vector<int> vect(n);
This creates a vector with a specified number of elements in it. You could also write:
scpp::vector<int> vect(n, 0);
which would also initialize all elements to a specified value (in this case zero, but any other value will work too).
An alternative is to create a vector with zero elements in it but to specify the desired capacity:
scpp::vector<int> vect; vect.reserve(n);
The difference between this example and the previous one is that in
this case the vector is empty (i.e., vect.size()
returns 0), but when you start
adding elements to it, you will not run into the incrementing capacity
procedure with the corresponding slowdown until you reach the size of
n.
Now, as promised, let’s deal with the static array:
#define N 10 // array size N is known at compile time T static_array[N];
Here, the size is known at compile time and will not change. Of course, to use the safe array with its boundary check, you can use a template vector with the size specified in a constructor:
scpp::vector vect(N);
This will work exactly the same as the static array, but the
problem here is efficiency. While the static array allocates its memory on
stack, the template vector allocates memory inside the constructor using
the new
operator, and this is
a relatively slow operation. If runtime efficiency is important in your
case, it’s better to use a template array, defined as follows in the
scpp_array.hpp file:
namespace scpp { // Fixed-size array template <typename T, unsigned N> class array { public: typedef unsigned size_type; // Most commonly used constructors: array() {} explicit array(const T& initial_value) { for(size_type i=0; i<size(); ++i) data_[i] = initial_value; } size_type size() const { return N; } // Note: we do not provide a copy constructor and assignment operator. // We rely on the default versions of these methods generated by the compiler. T& operator[] (size_type index) { SCPP_TEST_ASSERT(index < N, "Index " << index << " must be less than " << N); return data_[index]; } const T& operator [] (size_type index) const { SCPP_TEST_ASSERT(index < N, "Index " << index << " must be less than " << N); return data_[index]; } // Accessors emulating iterators: T* begin() { return &data_[0]; } const T* begin()const { return &data_[0]; } // Returns an iterator PAST the end of the array. T* end() { return &data_[N]; } const T* end()const { return &data_[N]; } private: T data_[N]; }; } // namespace scpp
This array behaves exactly like a static C array. However, when
compiled with the sanity check macro SCPP_TEST_ASSERT
active,
it provides an index-out-of-bounds check. The begin()
and end()
methods are
provided to simulate iterators, so that you can use this array in some of
the situations where you would have used the template vectors—for example,
to sort numbers. The following code demonstrates how to sort this array
using STL’s sort algorithm:
#include <algorithm> scpp::array<int, 5> a(0); a[0] = 7; a[1] = 2; a[2] = 3; a[3] = 9; a[4] = 0; cout << "Array before sort: " << a << endl; sort(a.begin(), a.end()); cout << "Array after sort: " << a << endl;
This produces the following output:
Array before sort: 7 2 3 9 0 Array after sort: 0 2 3 7 9
As a side benefit, you also get a <<
operator, which
allows you to stream an array as shown in the previous example, assuming
it is not too large and the template type T
has a <<
operator. Of
course, the use of this fixed-sized array must be limited to cases where
the array size N
is not too large.
Otherwise, you’ll be spending your stack memory, a limited resource, on
this array.
So the advice in this section is not to use static or dynamically
allocated arrays, but a template vector or array instead. This solves
another problem described in Chapter 1: when you use
the new
operator with
brackets, you need to use the delete
operator with
brackets as well. If you cross-use these operators (new
with brackets and
delete
without or vice
versa) you will corrupt the memory heap, which generally leads to bad
consequences. Once we decide not to use dynamically allocated arrays,
which are created through the new
operator with
brackets, we’ve killed two birds with one stone: the problem of an index
out of bounds, and the problem of mixing operators with and without
brackets. In short, do not use the new
operator (and the
corresponding delete
operator) with
brackets. Your life will be easier.
Note
At the time of this writing, the newest version of
Microsoft Visual Studio 2010 Ultimate
diagnoses the index-out-of-bounds error in
std::
vector
when compiled in a Debug mode, and
pops up a dialog box (Figure 4-1).
This dialog offers you the choice to Ignore, Abort, or Retry (in which case you can debug the application). While “Ignore” seems appropriate only if you are extremely adventurous, one can hope that the rest of the compilers working under Unix, Linux, and Mac OS will catch up to the trend.
Now that we’ve settled on the use of a template vector or array as an implementation of a linear array, let’s consider what to do if you need a two-dimensional matrix, a three-dimensional array, or generally speaking, an n-dimensional array. Because all the issues in the general case of n-dimensional arrays can be illustrated using a two-dimensional matrix, we will limit our discussion to this case and call it simply a matrix, with the understanding that the same principles apply to three or more dimensions.
If the size of the matrix is known at compile time, you can easily implement it as an array of arrays, and be done with it. Therefore, we’ll concentrate on the more complex case of a matrix whose size is calculated at run time. Such a matrix can easily be created as a vector of vectors, and in fact this approach is the only one possible if different rows must be of different lengths. However, most of the time all rows should be of the same length (e.g., the matrix is rectangular or even quadratic), and in this case the approach of using a vector of vectors is inefficient: it requires multiple memory allocations, which is a relatively slow operation (and the same can be said about deallocation). Because our goal in using C++ is efficiency, we’ll try a different approach and create a rectangular matrix using only one memory allocation, as shown in the class matrix in the scpp_matrix.hpp file:
// Two-dimensional rectangular matrix. template <typename T> class matrix { public: typedef unsigned size_type; matrix(size_type num_rows, size_type num_cols) : rows_(num_rows), cols_(num_cols), data_(num_rows * num_cols) { SCPP_TEST_ASSERT(num_rows > 0, "Number of rows in a matrix must be positive"); SCPP_TEST_ASSERT(num_cols > 0, "Number of columns in a matrix must be positive"); } matrix(size_type num_rows, size_type num_cols, const T& init_value) : rows_(num_rows), cols_(num_cols), data_(num_rows * num_cols, init_value) { SCPP_TEST_ASSERT(num_rows > 0, "Number of rows in a matrix must be positive"); SCPP_TEST_ASSERT(num_cols > 0, "Number of columns in a matrix must be positive"); } size_type num_rows() const { return rows_; } size_type num_cols() const { return cols_; } // Accessors: return element by row and column. T& operator() (size_type row, size_type col) { return data_[ index(row, col) ]; } const T& operator() (size_type row, size_type col) const { return data_[ index(row, col) ]; } private: size_type rows_, cols_; std::vector<T> data_; size_type index(size_type row, size_type col) const { SCPP_TEST_ASSERT(row < rows_, "Row " << row << " must be less than " << rows_); SCPP_TEST_ASSERT(col < cols_, "Column " << col << " must be less than " << cols_); return cols_ * row + col; } };
First of all, there are two constructors. The first allows you to
create a matrix with a specified number of rows and columns. The second,
with the additional init_value
argument,
allows you also to initialize each element to a specified value (e.g., to
set each element of a matrix<double>
to 0.0). Note that access
to elements is provided via the ()
operator, not
[]
. This is because the
[]
operator in C++ takes
only one argument, never two or more. So to access a multidimensional
array, we either need to use multiple []
operators, such as
my_matrix[i][j]
, or a single ()
operator, such as
my_matrix(i,j)
.
The first approach could be achieved if we had the []
operator return a
T*
pointer to the zeroth
element of the i-th row. However, this denies us the
diagnosis of a column index out of bounds, which defeats the purpose of
catching bugs at runtime. We could, of course, create some template class
that would include a smart reference to a row, return an instance of it
using the first operator ([i]
), and
then use the bounds check in the second operator ([j]
). To some degree, it is a matter of taste. I
did not see the value of resorting to this complex design just to preserve
the my_matrix[i][j]
syntax, and the
()
operator with multiple
arguments seems just fine.
The checks for an index out of bounds are performed inside the
index(row, col)
function, separately
for row and column numbers, and in the case of a runtime error lead to
calls to an error handler that are familiar by now. Finally, at the end of
the same file, you will discover a <<
operator for a
template matrix<T>
. They are
provided so you can output your matrix like this:
cout << "my matrix =\n" << my_matrix << endl;
as long as the matrix is not too large and the type T
defines the <<
operator.
Rules for this chapter to avoid “index out of bounds” errors:
Do not use static or dynamically allocated arrays; use a template array or vector instead.
Do not use
new
anddelete
operators with brackets—leave it up to the template vector to allocate multiple elements.Use
scpp:vector
instead ofstd::vector
andscpp::array
consistently instead of a static array, and switch the sanity checks on.For a multidimensional array, use
scpp::matrix
and access elements through the()
operator to provide index-out-of-bounds checks.
Get Safe C++ now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.