Mathematical Background for Algorithm Analysis
In order to deal with the mathematical aspects of algorithm analysis, we need to be sure we have a clear grasp of some notational conventions, and that we understand a few basic principles and formulas.
Interval notation is a convenient means of indicating a certain kind of range of values. First let us assume that we have an underlying set of values of some kind (integer, real, even something exotic like C++ STL iterators or Java iterators). The notations shown below assume that a < b (whatever that means in the given context) and should be interpreted to mean, "All values strictly between a and b, and if a square bracket is adjacent to a value, that value is also included, but if a round bracket is next to a value, that value is omitted." That is,
When dealing with algorithms, we are often processing data that is indexed by a range of values, and when doing so we need to be careful not to commit the insidious off-by-one error. In other words, we need to be sure at all times exactly how many values are in the range we are working with.
So, for example, we need to know that the number of values in the closed interval [a, b] with integer end points is b-a+1, while the number of values in the half-open interval [a, b) with integer end points is b-a. These seem to be the two most frequently encountered intervals.
Another operation we perform from time to time is finding the mid-point of an interval with integer end points, and we need to be clear on this procedure. Usually we are doing this because we want the "middle index" between the first and last values in a range of index values. The "middle" value is just the average of the two values, i.e. middle = (first+last)/2, but we need to look closely at what this calculation gives us.
First, note that only if there is an odd number of values in the range do we have a "true" middle value. If we have an even number of values, they split into two equal groups and that puts us in a bit of a dilemma. What value should we choose to be the "middle"? As a matter of convention, when we refer to the "middle" value in a range containing an even number of values, we shall always mean the largest value in the "bottom half" of the range.
Fortunately, the calculation middle = (first+last)/2 gives us the correct value of middle in both cases (i.e., whether the number of values is odd or even), as you can easily convince yourself by looking at a few examples:
You need to understand the use of
You need to be thoroughly familiar with the following elementary functions (domain, range, and graph, for example):
The analysis of algorithms involves studying something called the computational complexity of an algorithm to determine how much work the algorithm does in performing its task. This, in turn, requires knowing the order of magnitude of any function we might come up with to express the amount of work done. Since such functions tend to fall into a few common categories (polynomial, logarithmic, exponential, and related functions), it is important to know how these functions relate to each other from the point of view of the "growth" of such a function as its argument increases.
Often, when we are computing quantities related to the analysis of algorithms, the calculations do not give us "nice" integer answers, which we frequently prefer to have. At the same time, knowing the "nearest integer", sometimes above and sometimes below, the value we have calculated, may be all we need, in the sense of being "close enough". The floor and ceiling functions are useful for finding the nearest integer value, in either direction, to a non-integer value.
The arguments that we will make in attempting to establish the truth of certain propositions will tend to be more informal than formal. Nevertheless, we may also make use of, and you should certainly be familiar with, a number of the more common and widely used "methods of proof" from your previous mathematical experience. These include
Remember that a single counterexample is sufficient to establish that a general proposition is not true.
It is often useful to be able to count things. Sometimes we just want to know how many things there are, and that's the end of it. At other times we may want to compare that number with some other number, in order to compute a third quantity, such as a ratio, or a probability, for example. One of the things we often need to count is the number of ways we can "choose r things from n things", where r<=n, and an important distinction to make when doing this is whether or not the order of the r things is important for any reason. If if is, we are counting permutations; if it isn't, we are "merely" counting combinations. In any case, there are a number of useful counting (or combinatorial) principles and formulas that you should know.
n! = 1 * 2 * 3 * ... * (n-1) * nand note as well that the value of 0! is defined to be 1.
P(n, r) = n!/(n-r)!
C(n, r) = n!/((n-r)!r!)
Sometimes the analysis of algorithms will require us to compute the probability of an event (also called an outcome), such as the probability that a data element will be found in a certain range of elements, for example. It is therefore useful to have at hand some basic principles of probability calculation and manipulation. The following list is very short and very basic.
P(A and B) = P(A) * P(B)
P(A and B) = P(A) * P(B|A)in which P(B|A) is read as "the probability of B, given A" and is called the conditional probability of B, given A.
P(A or B) = P(A) + P(B) - P(A and B)
1 + 2 + 3 + ... + n = n(n+1)/2
12 + 22 + 32 + ... + n2 = n(n+1)(2n+1)/6
a, a+d, a+2d, ...the nth term is given by the formula
an = a+(n-1)dand the sum of n terms is given by this formula:
Sn = n(2a+(n-1)d)/2
a, ar, ar2, ...the nth term is given by the formula
an = arn-1and the sum of n terms is given by this formula:
Sn = a(rn-1)/(r-1)
1 + 2 + 22 + 23 ... + 2n = 2n+1-1