Assignment #5 - Dictionary Array

Assignment #5 - Dictionary Array Due Thursday, October 14, 2010 at 11:59PM

Overview

This lab asks you to implement a dictionary library via a dynamically allocated array of char pointers. Each time another word is read from the input file, that pointer is copied into the array of pointers. You will apply all that you have learned so far about arrays, pointers, passing pointers, malloc and free, as well as file I/O.

The Dictionaries

The "dictionary" that serves as this lab's main data structure is conceptually a growable array of strings. Since strings are, in C, implemented as arrays of chars, this data structure is really an array of character pointers. The INITIAL_DICT_SIZE should be 10.
The array needs to be a dynamic array because it needs to be growable. In order to make the array grow, we allocate a new array twice as large and set each of its elements to alias the elements of the original array. By aliasing the original elements, I mean initializing the new array by copying the pointer values from the old array into the new one so that the new array references the same data as the old one via the same indexes. Once this is accomplished, the old array can be free()'d and the original pointer can be made to reference the new array. In this way, we have, in effect "grown" the original array -- the original pointer now references a bigger array containing the same data in the same positions.
Please use the following type for the dictionary:
  typedef struct {
    char **wordList;
    unsigned long wordListSize;
    unsigned long count;
  } dictionary;
  
The dictionary should be implemented as a library and, as such, should make use of appropriately named and configured .c and .h files.
The dictionary should support the following operations:

initDict(...) -- Create and initialize an empty dictionary. Think carefully about what the single argument to this method should be. This method should allocate the entire dictioanry -- including the struct. Be careful here.
This method should return a 0 on success and non-0 on failure.

destroyDict(...) -- Frees all of the dictionary's dynamically allocated memory -- the dictinary array, the individual words -- and the dictionary struct, itself.
This method should return a 0 on success and non-0 on failure.

addWordDict(...) -- Accepts a word, represented as a char * and add it to the dictionary, which should be passed by reference.
For easy of implementation, this method should initially use the "next available slot" at the end of the list. But, in your final implementation, it should perform an insertion sort to maintain the dictionary in sorted order. Type "man strcmp" for help on comparing strings.
This method should cause the dictionary to grow as needed. You probably also want to update the count of the number of words within the dictionary.
This method should return a 0 on success and non-0 on failure.

printDict(...) -- This method should print to stdout each word within the dictionary, one word per line. The dictionary should be passed by reference.
This method should return a 0 on success and non-0 on failure.

verifyWordDict(...) -- Accepts a word, represented as a char * and searches the dictionary, passed by reference, for it. If found, this method returns 0. If not found, this method returns 1.
Your initial implementation should probably be a brute-force search. But, once your addWord(...) implements in-order insertion, this method should perform a binary search.

growDict() -- This method causes the dictionary array to double in size as described earlier. You probably want this function to be static so that it can't be called directly by the user and, instead, only by add(...). As with the other methods, the dictionary should be passed in by reference, not just the array.

Main Program

Your main program should serve as a test driver. It should accept its input as command-line arguments. The first arguments are the names of dictionary files. The last argument is the name of a query file. At least one dictionary file is required, but multiple dictioanry files are allowed. Exactly one query file is required.
  lab5 dictionaryFile [dictionaryFile ...] queryFile
  
For each supplied dictionary file name, create a dictionary. As a hint, look at argv to determine the number of dictionaries and create an array -- remember, all but one of the arguments (the query file) is the name of a dictionary file. Once you have created separate dictionary files, process the query file.
Your main program should look up each word in the query file in the corresponding dictionary. It should print out the word and "found" or "not found", as shown in the example, as appropriate.
  0	run	not found
  1	walk	found
  

The Dictionary Files

The dictionary files are plain text files that contain one word per line. Below is a short example:
  code
  nerd
  like
  programming
  text
  test
  this
  fun
  work
  hard
  

The Query File

The query file contains three columns. The first column is a dictionary number. The second column is a word. The third, optional, column is a comment that can safely be ignored by the program. Dictionary #0 is the first dictionary passed in at the command line, #1 is the 2nd, &c.
The first column is separated from the second column by whitespace. Everything after the second column is the third column and can safely be ignored by the program. Please note that third column might be absent and the use of the #-pounds signs in the example below is purely cosmetic.
Please consider the following example:
  0	run 	# not found
  1	walk	# found
  

Consolidated Example

Please consider the following example run with the files as shown:
  lab5 nounsDict.txt verbsDict.txt adverbsDict.txt queries.txt
  
nounsDict.txt:
desk
chair
pencil
pen
computer

verbsDict.txt:
run
jump
sit

adverbsDict.txt:
quickly
slowly
happily
recently
queries.txt:
0	desk		# found
1	desk		# not found
1	run		# found
2	quickly		# found
2	desk		# not found
      
It should produce the following results:
0	desk		found
1	desk		not found
1	run		found
2	quickly		found
2	desk		not found
  

The Test Driver

The test driver should operate as follows:
  testlab5 dictdir resultsdir querylistfile
  
dictdir specifies a directory that contains all of the dictionary files. resultsdir specifies a diretory into which all result files are written. querylistfile specifies a plain text file that contains a list of arguments to be fed to lab 5.
For example, consider the following two invocations of lab 5:
  lab5 nounsDict.txt verbsDict.txt adverbsDict.txt languagequeries.txt
  lab5 colorsDict.txt namesDict.txt animalsDict.txt toythemes.txt
  
We can create a querylist file to represent these two queries executed in this order as follows:
  nounsDict.txt verbsDict.txt adverbsDict.txt languagequeries.txt
  colorsDict.txt namesDict.txt animalsDict.txt toythemes.txt
  
For each line within the query list file, the test driver should run the query and sore the results into a new file to be created within the results directory. The results files should be named test1.txt, test2.txt, test3.txt, etc., etc., etc. The number in the file name should represent the line number from the query list file. In other words the results from the first line of the query list file should be stored within test1.txt. The results from the second line, test2.txt, etc.
Plan of Attack
The following plan of attack is purely a suggestion. Please feel free to charge forward at your pleasure:

Implement the dictionary library's create, addWord, and print methods.
Implement a simple test driver -- and test
Implement verify -- and another test set to your test driver -- and test
Implement destroy -- and verify using a debugger
Implement grow -- add another set to your test driver and test
Retest destroy -- and verify using a debugger
Implent in-order addWord -- and test it again
Implement binary search within verify -- and test it again
Implement a test driver that uses a dictionary file, if yours doesn't already
Implement a test driver that can maintain multiple dictionaries supplied on the command line and test
Implement the use of the query file -- and test
Clean up
Celebrate

We're Here To Help!
As always -- remember, we're here to help!