Basic Pointer Concepts
A pointer is a variable which can contain the address (storage location) of another variable or object.
Here are some declarations of pointer variables:
    int* p_int;   //p_int is a "pointer to an int".
                  //p_int can contain the storage location of an int.

    char* p_char; //p_char is a "pointer to a char".
                  //p_char can contain the storage location of a char.

    struct ClubMember
    {
        string name;
        int age;
        double balance;
    };
    ClubMember* p_cm; //p_cm is a "pointer to a ClubMember".
                      //p_cm can contain the storage location of a ClubMember.
Notes on the declaration syntax illustrated above:
    int* p;  //Option 1 ... seems to be the most frequently used
    int * p; //Option 2 ... seems to be the least frequently used
    int *p;  //Option 3 ... somewhere in between in terms of usage
But you need to be careful, since in all cases the * symbol "binds" to the p and not to the int. This means, for example, that if you give declarations like
    int* p1, p2;
it is natural to think that you have declared two "pointer-to-int" variables. But you have not. The first, p1, is a pointer-to-int, but the second, p2, is just an ordinary int variable. This is just one more argument for having only one declaration per line, since you do not have this potential misunderstanding with these declarations:
    int* p1;
    int* p2;
Those who use Option 3 above as their declaration convention are somewhat less likely to have a problem, since they would be inclined to use the following (correct) syntax when placing two declarations on one line:
int *p1, *p2;
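The binding rule described above can be checked by the compiler itself. The following is a minimal sketch, assuming a C++11 (or later) compiler; the names q1 and q2 are introduced here purely for illustration and do not appear elsewhere in this discussion.

```cpp
#include <type_traits>

// Option 1 style with two names on one line: only p1 is a pointer.
int* p1, p2;

// Option 3 style with the * repeated: both q1 and q2 are pointers.
// (q1 and q2 are hypothetical names used only in this sketch.)
int *q1, *q2;

// Compile-time checks of the types that were actually declared.
static_assert(std::is_same<decltype(p1), int*>::value, "p1 is a pointer-to-int");
static_assert(std::is_same<decltype(p2), int>::value,  "p2 is an ordinary int");
static_assert(std::is_same<decltype(q1), int*>::value, "q1 is a pointer-to-int");
static_assert(std::is_same<decltype(q2), int*>::value, "q2 is a pointer-to-int");
```

If any of these checks were wrong, the file would fail to compile, so the sketch doubles as a quick way to confirm what a declaration really means.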
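There are essentially two times we can make a pointer variable "point" at something: when the pointer is declared (by initializing it), or at some later time (by assigning to it).
There are also essentially two kinds of things a pointer can be made to point at: something that already exists (such as a named variable), or a newly acquired ("dynamic") memory location.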
Here are some relevant examples in which a pointer is made to point at something that already exists:
    int i = 6;
    int* p_int = &i;

    int i = 6;
    int* p_int;
    p_int = &i;

    ClubMember cm = { "John", 32, 100.00 };
    ClubMember* p_cm = &cm;
And here are some relevant examples in which a pointer is made to point at a newly acquired memory location (i.e., a piece of memory acquired "dynamically" by the program when it executes the given statement).
    int* p_int = new int;

    int* p_int;
    p_int = new int;

    ClubMember* p_cm = new ClubMember;
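The two "times" and two "kinds" described above can be combined in one small function. This is only a sketch: the function name and the variable names p_existing and p_dynamic are assumptions made for illustration, and the delete statement is included only so that the sketch does not leak memory.

```cpp
// Hypothetical helper: returns true if both pointers were set up as intended.
bool demo_pointing()
{
    int i = 6;

    int* p_existing = &i;   // set at declaration, points at an existing variable

    int* p_dynamic;         // declared first, with no value yet ...
    p_dynamic = new int;    // ... then assigned a newly acquired memory location

    // p_existing holds the address of i; p_dynamic holds a different address.
    bool ok = (p_existing == &i) && (p_dynamic != p_existing);

    delete p_dynamic;       // hand the dynamic memory back when finished
    return ok;
}
```

Calling demo_pointing() should return true, since the new location is distinct from the location of i.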
There is an important "universal" pointer value called nullptr (a keyword introduced in C++11). A pointer of any type may be assigned this value to indicate explicitly that the pointer does not point at anything. Note that this is not the same as having a pointer variable whose value is "undefined". Thus it is always true that
    AnyType* p = nullptr;
makes sense, provided that AnyType itself is defined. A pointer variable that has been given no value, not even nullptr, and therefore holds a garbage value, is sometimes called a "dangling pointer", although that term more precisely describes a pointer to memory that has already been released; an uninitialized pointer is also called a "wild pointer".
Note that you will continue to see the predefined constant NULL and the value 0 used to indicate that a pointer does not point to anything. Legacy code (older code that has not been updated) will contain these values for this purpose, but any new code that you write should use nullptr.
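A short sketch of how nullptr is typically tested, and of how the legacy spellings relate to it, follows. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>   // defines the legacy NULL macro

// True if p points at something, false if it holds the null pointer value.
bool points_at_something(int* p)
{
    return p != nullptr;
}

// The legacy spellings produce the same null pointer value as nullptr.
void check_legacy_nulls()
{
    int* from_null = NULL;   // legacy macro
    int* from_zero = 0;      // legacy literal
    assert(from_null == nullptr);
    assert(from_zero == nullptr);
}
```

For example, given int x = 1; the call points_at_something(&x) is true, while points_at_something(nullptr) is false.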
If we have a pointer variable, say p, which "points at" or "points to" something, a natural question is, "How do we use the pointer variable p to gain access to the thing to which it points?" The answer is that the expression *p refers to the thing to which p points. If the thing to which p points is a memory location on the heap, then that memory location may have no other name, and may therefore be accessible only in this way (via the pointer variable p). That is why such memory locations on the heap are sometimes referred to as anonymous variables, as well as dynamic variables (because they are obtained "dynamically", as the program runs).
Here is a simple example, in which we first obtain a new memory location (on the heap) that is capable of holding an integer, then we assign the value 17 to that location (we write 17 to the location, in effect), then we alter the value in that location (we alter the contents of the location in exactly the same way we would alter the contents of the location of an "ordinary" variable), and finally we output the value at that location (we read from the location, in effect):
    int* p = new int; //Obtain a new location
    *p = 17;          //Put the value 17 into that location
    *p = *p + 5;      //Modify the value in that location
    cout << *p;       //Display the value in that location
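The same steps can be wrapped in a function that returns the final value instead of printing it, which makes the result easy to check. This is a sketch only; both function names are invented for illustration, and the second function simply shows that *p behaves the same way when p points at an ordinary named variable.

```cpp
#include <cassert>

// Mirrors the steps above, returning the final value instead of printing it.
int write_modify_read()
{
    int* p = new int;   // obtain a new (anonymous) location on the heap
    *p = 17;            // write 17 into that location
    *p = *p + 5;        // modify the value stored there
    int result = *p;    // read the value back
    delete p;           // return the location to the heap
    return result;
}

// *p works the same way when p points at a named variable.
int modify_named_variable()
{
    int i = 6;
    int* p = &i;
    *p = *p * 2;        // changes i itself
    assert(*p == i);    // *p and i refer to the same location
    return i;
}
```

Here write_modify_read() returns 22 and modify_named_variable() returns 12.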
When a pointer variable p points to a variable (or object) of a structured type, such as a struct type (or a class type), the situation is somewhat more complex. For example, if we have
ClubMember* p = new ClubMember;
then, as before, *p refers to the entire ClubMember object. But what if we want to refer to just one of the fields of that object, say the "name" field? A good guess would be
*p.name
since if cm is an "ordinary" ClubMember variable, cm.name would be the correct syntax. However, this does not work because the . operator has a higher precedence than the * operator, and so we would have to use the following syntax instead:
(*p).name
Because this is at best awkward and inconvenient, C++ provides an alternate syntax using the "arrow operator", which, for the above example, would look like this:
p->name
This notation should be preferred when dealing with data or function members of struct or class variables, as in the following example.
    ClubMember* p = new ClubMember;
    p->name = "John";
    p->age = 32;
    p->balance = 100.00;
    cout << p->name << " owes $" << p->balance << endl;
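The arrow operator is especially common in functions that receive a pointer to a struct. The sketch below assumes the ClubMember struct shown earlier (with std::string spelled out in full); the helper functions add_charge and describe are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

struct ClubMember
{
    std::string name;
    int age;
    double balance;
};

// Hypothetical helper: adds a charge to the member that p points at.
void add_charge(ClubMember* p, double amount)
{
    assert(p != nullptr);     // this sketch requires a valid pointer
    p->balance += amount;     // same effect as (*p).balance += amount
}

// Hypothetical helper: builds a short description using two fields.
std::string describe(ClubMember* p)
{
    return p->name + " is " + std::to_string(p->age);
}
```

With p set up as in the example above, add_charge(p, 25.0) would raise the balance to 125.00, and describe(p) would produce "John is 32".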
There is an "intimate" connection between arrays and pointers in C++ (and C), which permits pointers to be used as an alternate and sometimes more convenient way of gaining access to array elements and traversing some or all of the elements of an array.
First, the name of an array can be used as a pointer to the first element of that array. That is, in most expressions the name of an array is converted to a (const) value which is equal to the address of the first element of that array. This means, for example, if we have the array
int a[] = { 1, 2, 3, 4, 5 };
then the two statements shown below are equivalent.
    int* p_int = a;
    int* p_int = &a[0];
In the context of arrays, C++ and C programmers often perform "pointer arithmetic". Given below are some examples of the typical kinds of expressions one often sees, and which you may find useful to employ yourself. Note in particular the use of the increment/decrement operators.
    int* p_int = &a[2];          //p_int points at the third element of a
    cout << *(p_int+2) << endl;  //Display the fifth element of a
    cout << *(p_int-2) << endl;  //Display the first element of a
    ++p_int;                     //p_int now points at the fourth element of a
    p_int++;                     //p_int now points at the fifth element of a
    --p_int;                     //p_int now points at the fourth element of a
    p_int--;                     //p_int now points at the third element of a
It's important to realize that pointer arithmetic in the context of arrays is "smart", in the sense that when the pointer is incremented (or decremented), the resulting pointer value is the address of the next (or previous) element of the array, whether the element type is int, or double, or ClubMember. An analogous statement holds if an integer value is added to or subtracted from a pointer value.
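A typical use of pointer arithmetic is to walk through an array from its first element to one position past its last. The following sketch is one way this might be written; the function names are assumptions made for illustration.

```cpp
#include <cstddef>

// Sums n ints starting at first, stepping through the array with a pointer.
int sum_with_pointer(int* first, std::size_t n)
{
    int total = 0;
    for (int* p = first; p != first + n; ++p)
        total += *p;          // *p is the element p currently points at
    return total;
}

// Shows that ++p moves to the next element, whatever the element type is.
bool next_element_is_reached()
{
    double d[3] = { 1.5, 2.5, 3.5 };
    double* p = d;            // points at d[0]
    ++p;                      // advances by one double, not by one byte
    return p == &d[1] && *p == 2.5;
}
```

For the array a = { 1, 2, 3, 4, 5 } used above, sum_with_pointer(a, 5) returns 15.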
The above discussion involved pointers being used in the context of static arrays, but of course we can also have dynamic arrays. A dynamic int array of size 5 would be obtained with the statement
    int* p_int = new int[5];
after which, the third element of the array may be referred to by either of the following expressions:
    p_int[2]
    *(p_int+2)
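Unlike a static array, a dynamic array can have its size chosen while the program runs. The sketch below illustrates this with a hypothetical function, make_squares, which is not part of the original discussion; the caller is assumed to be responsible for releasing the array with delete [] when finished.

```cpp
#include <cassert>
#include <cstddef>

// Creates a dynamic int array holding 0, 1, 4, 9, ... (n values in all).
// The caller owns the array and must release it with delete [].
int* make_squares(std::size_t n)
{
    assert(n > 0);                       // this sketch assumes a non-empty array
    int* p_int = new int[n];
    for (std::size_t k = 0; k < n; ++k)
        p_int[k] = static_cast<int>(k * k);
    return p_int;
}
```

After int* sq = make_squares(5); both sq[2] and *(sq+2) refer to the third element, whose value is 4.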
When the memory acquired from the heap during the running of a program, and "pointed to" by a pointer variable in the program, is no longer required by the program, the program should "return the memory to the heap". Doing so simply means that your program is behaving like a "good citizen", since otherwise it might cause a potentially serious "memory leak". If this happened, it would mean that your program had laid claim to some memory that it no longer was using, and perhaps could no longer even access, and other software running on the computer could not access it either, so long as your program continued to run. A situation like this could actually cause your computer to "run out of memory".
The syntax for returning memory to the heap differs, depending on whether the memory being returned contains a simple variable or an array. The following examples illustrate the difference.
    int* p_int = new int; //Declare and initialize p_int
    *p_int = 17;          //Assign value of 17 to the dynamic memory
    delete p_int;         //Return the dynamic memory to the heap
In this case the "deleted" memory may well still contain the value 17, but the program must no longer access it. Note that the syntax is potentially misleading. It would appear that we are "deleting p_int", but in fact p_int has gone nowhere. It's the memory that p_int points at that we are "deleting" (i.e., returning to the heap), not p_int itself. Note that here we are dealing with a pointer that points at a simple variable. If the pointer points at an array, the syntax is given in the next example.
    int* p_int = new int[5]; //Declare the dynamic array
    ....                     //Work with the dynamic array
    delete [] p_int;         //Return the dynamic array's memory to the heap
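Both forms of delete can be seen side by side in the sketch below. The function and variable names are invented for illustration. Resetting each pointer to nullptr after the delete is a common defensive habit rather than something the discussion above requires; it simply records that the pointer no longer points at usable memory.

```cpp
#include <cassert>

// Hypothetical example: uses one dynamic int and one dynamic array,
// then returns both pieces of memory to the heap.
int use_single_and_array()
{
    int* p_one = new int;          // a single dynamic int
    *p_one = 17;

    int* p_many = new int[5];      // a dynamic array of 5 ints
    for (int k = 0; k < 5; ++k)
        p_many[k] = k;

    int result = *p_one + p_many[4];   // 17 + 4

    delete p_one;                  // single variable: plain delete
    delete [] p_many;              // array: delete []

    p_one = nullptr;               // the pointers themselves still exist,
    p_many = nullptr;              // so mark them as pointing at nothing
    assert(p_one == nullptr && p_many == nullptr);

    return result;
}
```

Calling use_single_and_array() returns 21. Mixing the two forms (delete on an array, or delete [] on a single variable) is an error that the compiler will generally not catch.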
In using the const keyword with a pointer definition, there are essentially four possibilities, as illustrated below. The thing pointed at can be made constant, or the pointer itself, or both, or neither. In the illustrations below, "OK" means the corresponding line of code would compile, given the previous definitions of i and/or j, while "not OK" means that the line would not compile. We depart from our usual convention here and place a space on either side of the * in the definition of p, to emphasize the position of const.
Here are the four cases:
Case 1 (neither is constant):
    int i = 1;
    int j = 2;
    int * p = &i;
    p = &j;  //OK (the pointer value can be altered)
    *p = 6;  //OK (the value pointed at can also be altered)

Case 2 (the thing pointed at is constant):
    int i = 1;
    int j = 2;
    const int * p = &i;
    p = &j;  //OK (the pointer value can be altered)
    *p = 6;  //not OK (the value pointed at cannot be altered)

Case 3 (the pointer itself is constant):
    int i = 1;
    int j = 2;
    int * const p = &i;
    p = &j;  //not OK (the pointer value cannot be altered)
    *p = 6;  //OK (the value pointed at can be altered)

Case 4 (both are constant):
    int i = 1;
    int j = 2;
    const int * const p = &i;
    p = &j;  //not OK (the pointer value cannot be altered)
    *p = 6;  //not OK (the value pointed at also cannot be altered)
Actually, the effect is the same if const is placed after the type name, instead of before. Thus the following two lines are equivalent (see Case 2 above):
    const int * p = &i;
    int const * p = &i;
And the following two lines are also equivalent (see Case 4 above):
    const int * const p = &i;
    int const * const p = &i;
In fact, the keyword const can be placed either before or after the type name in the declaration of an "ordinary" named constant as well. That is, although you most often see the first of the following two alternatives, the second is equally valid:
    const int SIZE = 10;
    int const SIZE = 10;
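In practice, const most often appears on pointer parameters, where it tells the caller what a function is allowed to do. The following sketch shows two of the four possibilities in that setting; the function names are hypothetical, and the commented-out lines are ones that would not compile.

```cpp
#include <cassert>

// Pointer to const: the function may move p but may not write through it.
int sum_readonly(const int * p, int n)
{
    int total = 0;
    for (int k = 0; k < n; ++k)
    {
        total += *p;   // OK: reading the value pointed at
        // *p = 0;     // not OK: the value pointed at cannot be altered
        ++p;           // OK: the pointer value can be altered
    }
    return total;
}

// Const pointer: the function may write through p but may not move it.
void set_to_zero(int * const p)
{
    assert(p != nullptr);   // this sketch requires a valid pointer
    *p = 0;                 // OK: the value pointed at can be altered
    // ++p;                 // not OK: the pointer value cannot be altered
}
```

For example, with int a[] = { 1, 2, 3 }; the call sum_readonly(a, 3) returns 6 and leaves a unchanged, while set_to_zero(&a[0]) changes the first element to 0.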
Some issues with normal pointers in C++ are as follows:
For a discussion of C++ smart pointers see here.