C++ Reference Material | The union Data Structure

The union Data Structure

The union data structure is a legacy feature carried over from the C programming language, and the kind of facility it provides can often be delivered in C++ by using the inheritance mechanism.

A union is like a struct in that it generally has several fields, all of which are public by default. Unlike a struct, however, only one of the fields is used at any given time. In other words, it is a structure that allows the same storage space to be used to store values of different data types at different times. Thus it is necessary, and the programmer's responsibility, to keep track of what is actually stored in the union at any given time.

Unions can be used to conserve memory when a structure is needed in which several pieces of information of different types must be represented but only one will be used at a time, as when the components of a container must contain values of differing data types.

The usual syntax for a union is illustrated by

    union UnionType
    {
        int i;
        double d;
        char c;
    };
    UnionType u;

in which we see the definition of a union data type called UnionType, followed by the declaration of a variable of that type, namely u. The variable u can hold either an int value, or a double value, or a char value at any given time, but only one of these values.
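To make the idea of shared storage concrete, here is a minimal sketch that checks two things about the union defined above: that it is at least as large as its largest member, and that all of its members begin at the same address. The helper function names are our own and are purely illustrative.

```cpp
union UnionType
{
    int i;
    double d;
    char c;
};

// True if the union is big enough to hold its largest member.
bool unionFitsLargestMember()
{
    return sizeof(UnionType) >= sizeof(double)
        && sizeof(UnionType) >= sizeof(int)
        && sizeof(UnionType) >= sizeof(char);
}

// True if all three members start at the same address,
// which is what "sharing the same storage" means in practice.
bool membersShareAddress()
{
    UnionType u;
    u.d = 0.0;
    const void* pi = &u.i;
    const void* pd = &u.d;
    const void* pc = &u.c;
    return pi == pd && pd == pc;
}
```

Both functions should return true on any conforming compiler. The exact value of sizeof(UnionType) depends on the platform.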

The syntax for accessing the value in u is what you would probably expect. Use one of the following, depending on which kind of value is currently stored in the union:

    u.i
    u.d
    u.c

Arrow notation (up->i, if up is a pointer to a union of the above type) can also be used, if and when appropriate.
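A short sketch of both access styles follows. The function names are hypothetical and exist only for illustration. Note that each function reads back the same member it last wrote. In C++, reading a member other than the one most recently stored is generally undefined behavior.

```cpp
union UnionType
{
    int i;
    double d;
    char c;
};

// Stores an int using dot notation and reads the same member back.
int storeAndReadInt(int value)
{
    UnionType u;
    u.i = value;
    return u.i;
}

// Stores a double through a pointer and reads it back with arrow notation.
double storeAndReadDouble(double value)
{
    UnionType u;
    UnionType* up = &u;
    up->d = value;
    return up->d;
}
```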

Because it is the programmer's responsibility to keep track of what is in a union at any time, unions are frequently enclosed in a larger structure which has, perhaps among many other fields, a field called a "tag field" to keep track of what is in the union. For example, the above union might appear in a situation like this:

    enum TagType { INTEGER, DOUBLE, CHARACTER };
    struct DataType
    {
        OtherType otherField;
        TagType tag;
        UnionType u;
    };
    DataType myData;

With a setup like this, if you want to store a double value, you might do this:

    myData.tag = DOUBLE;
    myData.u.d = 3.14;

Then, later on, you can do this:

    if (myData.tag == DOUBLE)
    {
        double value = myData.u.d;  // Safe: the tag says a double is stored here ...
    }
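The tag-field technique can be sketched more fully as follows. This is only one possible arrangement: the helper functions (makeInt, makeDouble, makeChar, describe, asDouble) are our own inventions, and we have left out otherField because OtherType is not defined here. The key point is that the tag and the union member are always set together, and that every read is guarded by the tag.

```cpp
#include <string>

enum TagType { INTEGER, DOUBLE, CHARACTER };

union UnionType
{
    int i;
    double d;
    char c;
};

struct DataType
{
    TagType tag;   // records which member of u is currently in use
    UnionType u;
};

// Hypothetical helpers: each one sets the tag and the matching member together.
DataType makeInt(int value)       { DataType x; x.tag = INTEGER;   x.u.i = value; return x; }
DataType makeDouble(double value) { DataType x; x.tag = DOUBLE;    x.u.d = value; return x; }
DataType makeChar(char value)     { DataType x; x.tag = CHARACTER; x.u.c = value; return x; }

// Returns the name of the type currently stored, as reported by the tag.
std::string describe(const DataType& data)
{
    switch (data.tag)
    {
        case INTEGER:   return "int";
        case DOUBLE:    return "double";
        case CHARACTER: return "char";
    }
    return "unknown";
}

// Converts whatever is stored to a double, reading only the active member.
double asDouble(const DataType& data)
{
    switch (data.tag)
    {
        case INTEGER:   return data.u.i;
        case DOUBLE:    return data.u.d;
        case CHARACTER: return data.u.c;
    }
    return 0.0;
}
```

For example, describe(makeDouble(3.14)) yields "double", and asDouble(makeInt(7)) yields 7.0.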

You can also use an "anonymous" union (a union without a name), as in

    struct DataType
    {
        OtherType otherField;
        TagType tag;
        union
        {
            int i;
            double d;
            char c;
        };
    };

in which case the fields of the union are treated simply as fields of the struct. They are accessed directly, as in myData.d rather than myData.u.d, so no union name appears in the access expression, but there must of course be no name conflicts with other fields in the struct.
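Here is a brief sketch of access through an anonymous union. As before, the function names are hypothetical, and otherField is omitted because OtherType is not defined here.

```cpp
enum TagType { INTEGER, DOUBLE, CHARACTER };

struct DataType
{
    TagType tag;
    union          // anonymous: its members become members of DataType
    {
        int i;
        double d;
        char c;
    };
};

// Stores and reads a double directly through the struct variable.
double storeDoubleDirectly(double value)
{
    DataType myData;
    myData.tag = DOUBLE;
    myData.d = value;      // not myData.u.d
    return myData.d;
}

// The same idea through a pointer to the struct.
char storeCharThroughPointer(char value)
{
    DataType myData;
    DataType* p = &myData;
    p->tag = CHARACTER;
    p->c = value;          // not p->u.c
    return p->c;
}
```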

In either of the above examples the field otherField may be called the "invariant" part of the struct (the part that does not vary in the kind of data it holds), while the union part might be called the "variant" part of the struct (the part that does vary).

Here are some general rules regarding unions that might occasionally come in handy. We give them only for the sake of completeness, since we will be using unions only in the context of binary expression trees:

Like a struct, all members of a union are public by default.
A union can have a constructor to initialize any one of its members. A union without a constructor can be initialized with another union of the same type, or with an initializer (enclosed in braces) of the type of the first member of the union.
A union can have other member functions, such as a destructor, but no member of a union can be declared virtual.
None of a union's data members can be declared static (C++11 lifted this restriction for named unions).
A union can be assigned to another union of the same type, but unions cannot be compared.
A union cannot be used as a base class in inheritance.
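Before C++11, unions can have class objects as data members only if the class has no non-trivial constructor, destructor, or copy assignment operator. C++11 relaxed this rule, but the union must then supply the corresponding special member functions itself.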
Anonymous unions have some additional restrictions:
- They can contain only data members.
- All members must be public.
- If declared at file scope (globally), an anonymous union must be explicitly declared static.
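A few of the rules above can be illustrated with a short sketch. The union and function names here (Number, Plain, copyByAssignment, firstMemberFromBraces) are made up for the purpose. The sketch shows a union with constructors and an ordinary member function, assignment between unions of the same type, and brace initialization of the first member when there is no constructor.

```cpp
union Number
{
    int i;
    double d;

    // Each constructor initializes exactly one member.
    Number(int value) : i(value) {}
    Number(double value) : d(value) {}

    // An ordinary (non-virtual) member function.
    void setInt(int value) { i = value; }
};

// Assigning one union to another of the same type copies its contents.
int copyByAssignment(int value)
{
    Number a(value);   // a holds an int
    Number b(0.0);     // b holds a double
    b = a;             // b now holds the same int as a
    return b.i;
}

// A union without a constructor: braces initialize the first member.
union Plain
{
    int i;
    double d;
};

int firstMemberFromBraces()
{
    Plain p = { 5 };   // initializes p.i
    return p.i;
}
```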

See the Deitel reference for a good discussion of unions.