hashing.txt TextItem file for the hashing program. ======================================== Program Description This program permits the user to perform a small amount of "experimentation" with a certain kind of hashing scheme, and to view the results of modifying the parameters that determine the precise way in which that hashing scheme is implemented. The data source must be a textfile, whose name will be supplied by the user, and which contains birthdate/name pairs of people, in the following format: 18950206Babe Ruth 19320227Elizabeth Taylor 19290115Dr. Martin Luther King, Jr. Note that birthdates are given in what we might (presumptuously) call the most "sensible" format, and one that might have avoided the Y2K fiasco. A sample data file called "famous_names.txt" is supplied. It contains the names of some allegedly "famous" people, as well as their birthdates, most of which came from the web site www.famousbirthdays.com. The program loads this data into a hash table in memory, in which the index for a particular data item is determined in the following way: Screen 1 of 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - first, forming the product of the two integers obtained by computing the date integer divided by 10000 and the date integer modulus 10000. - second, then finding the modulus of the resulting value with respect to the table size chosen by the user. If a collision occurs, the next location examined is determined by applying the same modulus operation to the current location after the user's probe size has been added to the current location. Here is a summary of what the different menu options provide: Option 1 terminates the program. Option 2 displays the information you are now reading. Option 3 allows you to "choose the hashing parameters". Keep in mind that choosing these parameters clears the hash table for new input, and that these parameters *must be set* before loading in an input data file. Screen 2 of 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! The two parameters you can choose (in Option 3) are: 1. the size of the hash table 2. the size of the "probe increment" to use when a collision occurs Option 4 allows you to view the values of the current user-chosen parameters (those set in Option 3) at any time, even the default values before they are set. Both of the default values are 0. Option 5 allows you to read in the data from a file and place that data in the hash table you have set up in Option 3. If the table size and/or the probe increment chosen by the user are such that the data cannot be loaded into the resulting table using that hashing scheme, the user must be returned to the main menu (where the user may opt, among other choices, to try again). There are six potential errors that must be recognized, and reported, if they occur. They are: Screen 3 of 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - first, attempting to load an input data file before the table size and probe size have been set - second, a missing input data file - third, too much data in the file for the hash table, i.e., the number of lines in the file is greater than the size of the hash table - fourth, the hashing scheme not being able to place the data in the table, i.e., the table size and the probe increment and the data are such that the hashing scheme cannot find a location for one or more data items - fifth, attempting to search for something in an empty "database", i.e., an empty hash table - sixth, attempting to display a "hashLog" file before one has been created Note that this error will only occur if there is no current (*or previous*) hashLog file in the current directory. That is, it will only occur on the first run of the program, if Option 6 is chosen before a data file has been read in, or on a later run if the last hashLog file created has been deleted (outside the program) and again Option 6 is chosen before reading in another data file. Screen 4 of 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Option 6 displays some information about how the data was placed in the hash table, simply by displaying the contents of a file called hashLog. This file must have been created by the program when the data in the input data file was read in and the hash table was created. In fact, this table is re-created each time a new input data file is read in to create a new hash table. The contents of this hashLog file are: - first, a list containing each name, in the order it appeared in the input data file, and followed on the same line by a list of all the locations that were examined up until that data item with that name was actually stored - second, a list of the complete contents of the hash table, in the order of the index values in the hash table The exact format of the hashLog file may be observed by running the sample executable with the sample input file. Option 7 allows you to search the hash table database for a person or persons (since there may be more than one) born on a given date. Dates are entered in the form yyyymmdd when searching. One final note: Although it is not an "error" to choose a table size and probe size that are not relatively prime, when choosing these two values via menu Option 3 the user is reminded that not doing so may result in the program's not being able to find places in the table for some of the data. Then the user is given another opportunity to choose different values. Screen 5 of 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ---------------------------------------- ========================================