Terminology

Not everyone will necessarily agree with each of the following definitions, but taken together they should give you a good sense of what we mean when we use one or more of these terms in context.

  1. A somewhat restricted notion of a type is a simply a collection of values, or "components", or "data items".
  2. A data type is a collection of values (its domain) together with a set of permissible operations on those values.
  3. An abstract data type (ADT) is a data type whose properties (domain and operations) are specified independently of any implementation.
  4. A data structure is the implementation of an ADT. In C++, a class represents an ADT and its implementation.
  5. An application programming interface (API) comprises the set of public operations provided so that "client programmers" can use the implementation of an ADT in their programs.

More on Abstract Data Types (ADTs)

From an operational point of view, an Abstract Data Type (abbreviated ADT) is a collection of data components, together with a collection of operations that are permitted on those data components.

This is a conceptual notion, independent of any programming language, but if you are familiar with a programming language like C++ or Java, you will immediately recognize that in those languages the class is the mechanism that is used to implement this notion of an ADT.

It is very important, however, to separate the concept of a particular ADT from its implementation, and to realize that the same (conceptual) ADT may in fact be implemented in many different ways, either in different programming languages, or even in the same language.

The beauty of the ADT notion is that, since it is a notion, it can be pretty much what we want it to be. It turns out that a number of common concepts (stacks, queues, lists, and so on) are so useful that these have been given standard, tested and debugged implementations and have been placed in libraries (the Standard Template Library in C++ or the Java Foundation Classes in Java, for example). This saves us from re-inventing the wheel every time we need a stack, or a queue, or a list, or some other "container" that is available from such a library.

On the other hand, quite often there are data types that we need, or would find useful, that we just can't find in a library, and then we have to go back to the drawing board. For example, perhaps we want an ADT to provide us with a menu facility for our text-based programs and another one to provide us with an easy way to display on-line help or other items of text when our programs are running. That is, we may want to implement a Menu ADT and TextItems ADT.

As with any ADT, we then need to decide what the data components will be and how they will be represented, as well as what the operations will be and how they will be implemented. In C++, both the interface and the underlying data representation are "visible" in the C++ class specification file (the .h file), even though the data components are generally inaccessible to the "client", since they are normally either private or protected. On the other hand, the implementation of the operations (the implementation of the "interface" to the ADT, i.e., to the class) is contained in an implementation file, and the actual implementation of any operation is both invisible and inaccessible, if the class (the ADT) is made accessible via a file pair that consists of a specification file (a file with a .h extension) and a corresponding object file (one with a .obj or .o or similar extension, depending on your system).

Sometimes the domain (i.e., the data) of an abstract data type may be simple (or atomic) and other times it will be structured, which in turn leads to the ADT itself being called simple or structured.

Each operation of an ADT is implemented as a function and invoked by a function call. Sometimes "operator functions" may permit the use of "built-in" operators to be used to perform operations for a programmer-defined ADT which are analogous to their built-in counterparts.

More on Application Programming Interfaces (APIs)

Once an Abstract Data Type (ADT), with its private and/or protected data and its public interface are all in place, that public interface can be "published", i.e., made available to "client programmers", at which point the public interface will likely be referred to as an API, an Application Programming Interface. It contains (or should contain) all that a client programmer needs to know about how the corresponding ADT can be used in a program.

The operations found in the interface of an ADT may be classified in various ways. The following are typical (but non-orthogonal, i.e., non-mutually exclusive) categories into which any given operation may be placed:

  1. foundation operations (also called manager operations): constructors (including copy constructors) and destructors
  2. transformer operations (also called modifier, converter, manipulator, or mutator operations)
  3. observer operations (also called inspector, accessor or selector operations)
  4. reader and writer operations (getters and setters)
  5. facilitator operations (also called manager or auxiliary operations)
  6. iterator operations [An iterator is something (a member function, or even a separate object) that can act on a sequence as follows: It can get the first item, determine if there are more items, and get the next item. Equivalently, an iterator is an operation that allows us to process, one component at a time, each component in an instance of an ADT implemented by a class.]
  7. helper operations (helper functions may also be external to the ADT)

Questions to ask when meeting a new data structure

  1. Why do we need this data structure?
  2. What does a picture of it look like?
  3. How is it defined? This includes questions of syntax, semantics, terminology.
  4. How is a variable or object of this type declared or initialized?
  5. How is assignment of values of this type handled?
  6. How are values of this type input and output?
  7. How are components of values of this type accessed/processed?
  8. What issues have to be considered if a value of this type is passed as a parameter or returned as the return-value of a function?

Guidelines for choosing a data structure and its associated algorithms

  1. Make sure you are reasonably familiar with common practice, so as to avoid reinventing the wheel.
  2. Assess the needs of the application first, and then make the best choice you can find of a data structure that matches those needs, subject to the following two criteria: