What is an algorithm?

An algorithm is a procedure for performing a given task. It must consist of a finite sequence of concrete, unambiguous steps, and it must eventually terminate once it has correctly performed its task.

What are the characteristics of a "good" algorithm?

One can, of course, easily argue that the most important feature of a "good" algorithm is that it be correct. However, there are many other issues to consider, and when designing a new algorithm you should always attempt to do the following:

- make it correct for all valid inputs
- make it efficient in its use of time and storage space
- make it simple, clear, and easy to implement and maintain
- make it as general as is practical

What does it mean for an algorithm to be efficient?

An efficient algorithm for solving a problem is one that solves the problem within the constraints on the resources for solving that problem. Thus one algorithm is more efficient than another if it solves the same problem using fewer of the available resources.

How do we measure the efficiency of an algorithm?

Usually the resources we are concerned about when choosing an algorithm are:

- the time the algorithm takes to run
- the storage space the algorithm requires

In any case, whether we are measuring and studying the time taken by an algorithm, or the storage space used by that algorithm, we are dealing with the computational complexity of the algorithm, which can be further refined into time complexity and storage complexity considerations.

Nowadays storage space is much less critical than it used to be, because both internal memory storage and external disk (and other forms of) storage are so cheap and readily available. Consequently, storage complexity is generally given much less attention in algorithm analysis than time complexity.

The time taken for an algorithm to run depends on how much "work" it has to do. A common way of measuring this is to count the number of operations (comparisons, assignments, exchanges, function calls, and perhaps others) that it has to perform during the completion of its task. Each of these operations may take different amounts of time on any particular computer, and such times will certainly vary from machine to machine. Thus the operation count is a machine-independent way of measuring the amount of work.

Furthermore, on any given machine, we often make the simplifying assumption that each operation takes more or less the same amount of time, and we frequently have to apply an artificial delay to each operation just to be able to measure the time taken to run an algorithm on a small data set at all.
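For example, here is a minimal sketch (in Python; the function name is ours, chosen for illustration) of counting the comparisons performed by a linear search instead of timing it:

    def linear_search_count(items, target):
        """Linear search that also reports how many comparisons it made."""
        comparisons = 0
        for i, item in enumerate(items):
            comparisons += 1               # one comparison per item examined
            if item == target:
                return i, comparisons      # found: index and work done so far
        return -1, comparisons             # not found: examined every item

    index, count = linear_search_count([4, 8, 15, 16, 23, 42], 23)
    print(index, count)                    # prints: 4 5

The count of 5 comparisons is the same on any machine, whereas the wall-clock time would not be.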

How do we compare the efficiency of algorithms?

There are essentially two ways:

- empirically, by running the algorithms on the same data sets and measuring the actual times taken
- analytically, by comparing the amount of "work" each algorithm must do, independent of any particular machine

Of course this high-level description raises several questions. Among the most important of these are: What do we actually mean by the "work" done by an algorithm, and how do we actually measure it? We discuss these questions in the context of the efficiency of an algorithm.

What are some factors affecting how fast a computer solves a problem?

There are many factors that will affect how fast a computer solves a particular problem. Here are some of them:

- the speed of the computer's processor
- the amount of memory available
- the programming language, compiler, and operating system used
- the algorithm chosen to solve the problem
- the size of the data set the algorithm must process

Absolute time vs. relative time

Certainly when we ultimately get to the point where we are solving an actual problem with actual data, we are interested in getting it solved in the shortest possible time. That is, we are interested in the absolute time our program (i.e., our algorithm) will take to do the job.

On the other hand, when we are studying and comparing algorithms, and trying to get a sense of how the performance of one compares with that of another, for a given data set, the relevant thing to consider is the relative time. This is most easily measured by taking the ratio of the times, or the ratio of the operation counts, to compare the amount of work being done by one algorithm with that being done by the other.

It is also helpful to compute such ratios when a single algorithm is applied to different data sets, in order to get a feeling for how the performance of the algorithm is affected by an increase in the size of the data set on which it has to work. For example, sometimes doubling the amount of data an algorithm has to work on will simply double the amount of work it has to do (linear search), other times it may quadruple (multiply by 4) the amount it must do (insertion sort and selection sort), and still other times it may simply add a constant amount to the work required (binary search).
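As a sketch of this kind of comparison (using the standard textbook worst-case comparison counts, which we state here rather than derive), doubling n has very different effects on three of the algorithms just mentioned:

    import math

    def worst_case_comparisons(n):
        """Standard worst-case comparison counts for three algorithms."""
        return {
            "linear search":  n,                      # may examine every item
            "selection sort": n * (n - 1) // 2,       # compares every pair
            "binary search":  int(math.log2(n)) + 1,  # halves the range each step
        }

    for n in (1000, 2000):
        print(n, worst_case_comparisons(n))
    # Doubling n doubles the linear search count, roughly quadruples the
    # selection sort count, and adds only one comparison to binary search.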

Analyzing algorithm performance

Being able to ask and answer questions about the performance of an algorithm on data sets of varying size, and about how the performance of one algorithm compares with that of another on data sets of the same size, is an important skill for any programmer to acquire. In the development of commercial software, the performance of a program, and thus, ultimately, the sales of that product, can be severely compromised by the choice of the wrong algorithm at some stage of the development.

It is often helpful to consider the following three scenarios when analyzing algorithm performance, though usually we wind up concentrating on one or another of them:

- the best case: the least work the algorithm could have to do for a data set of a given size
- the worst case: the most work the algorithm could have to do for a data set of a given size
- the average case: the work the algorithm does, averaged over all data sets of a given size
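As a small illustration (a sketch, assuming linear search for a target that is present and equally likely to be in any of the n positions):

    def linear_search_cases(n):
        """Comparison counts for linear search on n items, target present."""
        best = 1                 # target happens to be the first item examined
        worst = n                # target happens to be the last item examined
        average = (n + 1) / 2    # target equally likely to be in any position
        return best, worst, average

    print(linear_search_cases(100))    # prints: (1, 100, 50.5)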

Relevant mathematical functions and "growth of functions"

There are a number of mathematical functions that crop up frequently in the context of algorithm analysis, and you should be thoroughly familiar with them, with their graphs, and with their "rates of growth":

- logarithmic functions (log n)
- polynomial functions (n, n², n³, ...)
- exponential functions (2ⁿ)
- the factorial function (n!)

The point of knowing these functions is that the amount of work done by an algorithm (the operation count, for example) can often be expressed as a function T(n), where n is the number of data items and T is built from one or more of the above functions. Such a function is a mapping from the positive integers to the reals, and is called a complexity function. The notation is meant to suggest that we are measuring the amount of time as a function of n, the number of data items. The standard "complexity classes" of functions against which we compare our algorithms are based on the following functions, which are listed in "increasing order of magnitude":

- 1 (constant)
- log n (logarithmic)
- n (linear)
- n log n
- n² (quadratic)
- n³ (cubic)
- 2ⁿ (exponential)
- n! (factorial)

We sometimes think of each of the above functions as being the canonical form for a collection of functions. This is just a fancy way of saying that it is the "simplest" of an infinite number of functions having the same form. For example, y = x is the "simplest" of all linear functions, a family that also includes y = 3x, y = -2x + 7, and so on.
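For instance, here is a quick numerical illustration (the operation count T(n) = 3n² + 5n + 2 is a made-up example) of why the dominant term alone determines the canonical form:

    def T(n):
        """A hypothetical operation count: 3n^2 + 5n + 2."""
        return 3 * n * n + 5 * n + 2

    for n in (10, 100, 1000, 10000):
        print(n, T(n) / (n * n))
    # The ratio T(n) / n^2 settles toward the constant 3 as n grows, so for
    # large n the function behaves like its canonical form, n^2.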

Our knowledge of the behavior of such functions can then be translated into knowledge about the behavior of the corresponding algorithm. An important part of this process is to be able to "see the forest for the trees", that is, to be able to ignore certain details of the analysis that are not critical to the performance issues that we are analyzing. This is where the key ideas of "asymptotic analysis" are useful.

Asymptotic measures and notation

As you study the discussion in the paragraphs below, you may find it helpful to examine the figures on this page, which illustrate the concepts discussed.

There are a number of useful measures of asymptotic behavior of functions, and, by extension therefore, algorithms:

- Big-O, an asymptotic upper bound on the growth of a function (the measure used most often in practice)
- Big-Omega, an asymptotic lower bound
- Big-Theta, a "tight" bound that is both an upper and a lower bound

There are other asymptotic measures, but these are perhaps the most common, and in fact, as we mentioned above, Big-O is the most frequently encountered, since in practical terms we are most interested in finding an upper bound on how much work our algorithm is going to have to do to solve our problem.
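To make Big-O concrete, here is a small numeric sanity check (not a proof; the function T and the witnesses c and n0 are our own example) that T(n) = 3n² + 5n + 2 is O(n²):

    def T(n):
        return 3 * n * n + 5 * n + 2

    # Big-O witnesses: a constant c and a threshold n0 such that
    # T(n) <= c * n**2 for every n >= n0.
    c, n0 = 4, 6
    assert all(T(n) <= c * n * n for n in range(n0, 10000))
    print("T(n) <= 4 * n**2 for all tested n >= 6, so T(n) is O(n**2)")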

Whenever we have an "order-f(n)" algorithm, a convenient way of thinking of what this means is that the amount of work the algorithm has to do to process n items of data is "proportional to f(n)".

For example, if we have an order-n algorithm, like linear search, the amount of work done in searching through n data items to find a particular value is proportional to n, the number of items.

On the other hand, since selection sort is an order-n² algorithm, the amount of work selection sort does in sorting n data items is proportional to the square of the number of items.
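For reference, here is a minimal selection sort (a standard implementation sketch in Python) whose nested loops make the order-n² behavior visible: the inner loop runs (n-1) + (n-2) + ... + 1 = n(n-1)/2 times in total, which is proportional to n².

    def selection_sort(items):
        """Sort a list in place by repeatedly selecting the smallest item."""
        n = len(items)
        for i in range(n - 1):                 # one pass per position to fill
            smallest = i
            for j in range(i + 1, n):          # scan the unsorted remainder
                if items[j] < items[smallest]:
                    smallest = j
            items[i], items[smallest] = items[smallest], items[i]

    data = [23, 4, 42, 15, 8, 16]
    selection_sort(data)
    print(data)                                # prints: [4, 8, 15, 16, 23, 42]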