Write a C++ program that will dynamically resize a word structure array that stores the
frequency of words found in a file. The array should start with enough space to store 2 words and
double each time more space is required.
struct word
{
char str[20];
int frequency;
};
A word will be defined as a sequence of characters separated with whitespace or a combination
of punctuation and a space. Our words will be case insensitive. So, make sure that all
characters are changed to lowercase before entering them into your array. After counting the
frequency of words located in the file, the array should be ordered alphabetically. Then, you need
to display the contents of the word array followed by the size (number of elements) of your word
structure array.
You may not use any of the cstring or string libraries. If you desire to use a function from one of
these libraries, you must write your own version. You can use the functions found in the ctype
library like isspace(), ispunct(), and tolower().
Sample File:
This is a simple test. You
should test more thoroughly than
this.
Sample Output
Enter a filename:
simple.txt
a 1
is 1
more 1
should 1
simple 1
test 2
than 1
this 2
thoroughly 1
you 1
array size: 16
I am having trouble figuring out how to count the frequency of words. What
are the steps for doing this? (How can I analyze the file word by word and then count the frequency?)
What you are being asked to do is create a histogram, which is just a list containing two pieces of information: a word and the number of times that word appears.
#include <iostream> // for std::cout
struct word
{
char str[20];
int frequency;
};
word* histogram; // I am a dynamic array
int histogram_size = 0; // This is the number of words available in the array
int histogram_used = 0; // This is the number of words used in the array
// You'll need a function to grow your histogram array
void resize_histogram()
{
// Create a newer, bigger array
word* new_histogram = new word[ histogram_size * 2 ];
/* copy all the stuff from histogram[] to new_histogram[] */
...
// swap the arrays and destroy the old
delete [] histogram;
histogram = new_histogram;
// don't forget to remember the new size
histogram_size *= 2;
}
// You'll need a function to see if your array has a word in it
int find_index_of_word_in_histogram( const char* s )
{
/* look through all the used elements in the histogram to find the index of s */
/* If s is in the histogram, return its index. */
/* If s is NOT in the histogram, return -1 */
}
// You'll need a function to update the histogram:
// If the histogram has the word in it, increment the word's count (or frequency).
// If the histogram does not have the word in it, add it to the histogram and set its count to 1.
void add_word_to_histogram( const char* s )
{
/* Use the find_index_of_word_in_histogram( s ) function to find a word. */
/* If the word is not found, make sure the histogram has enough space and add it to the end. */
/* Use the resize_histogram() function if necessary. */
...
}
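Since the assignment rules out the cstring and string libraries, the lookup and update functions above need hand-written string comparison and copying. Here is one way the pieces might fit together; treat it as a minimal sketch rather than the required solution. The helper names compare_words() and copy_word() are made up for illustration, and the globals are initialized at their declarations only so that the sketch stands on its own.

```cpp
struct word
{
    char str[20];
    int frequency;
};

// Same globals as the skeleton, initialized here so the sketch is self-contained.
word* histogram = new word[2];
int histogram_size = 2; // capacity of the array
int histogram_used = 0; // number of words stored so far

// Hypothetical stand-in for strcmp(): 0 if equal, negative if a sorts before b.
int compare_words(const char* a, const char* b)
{
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// Hypothetical stand-in for strcpy() that never writes more than 20 bytes.
void copy_word(char* dest, const char* src)
{
    int i = 0;
    for (; i < 19 && src[i] != '\0'; ++i) dest[i] = src[i];
    dest[i] = '\0';
}

void resize_histogram()
{
    word* new_histogram = new word[histogram_size * 2];
    // Copy only the elements that are actually in use.
    for (int i = 0; i < histogram_used; ++i) new_histogram[i] = histogram[i];
    delete [] histogram;
    histogram = new_histogram;
    histogram_size *= 2;
}

int find_index_of_word_in_histogram(const char* s)
{
    for (int i = 0; i < histogram_used; ++i)
        if (compare_words(histogram[i].str, s) == 0) return i;
    return -1; // not found
}

void add_word_to_histogram(const char* s)
{
    int index = find_index_of_word_in_histogram(s);
    if (index >= 0)
    {
        ++histogram[index].frequency; // seen before: just count it
        return;
    }
    if (histogram_used == histogram_size) resize_histogram(); // array is full
    copy_word(histogram[histogram_used].str, s);
    histogram[histogram_used].frequency = 1;
    ++histogram_used;
}
```

The linear search is the simplest thing that works; for the small files in this assignment it should be fast enough.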
int main()
{
// Initialize the histogram: space for two words, but no words used.
histogram = new word[ 2 ];
histogram_size = 2;
// Here's where we'll keep the word we read from file:
char str[20];
while (/* get a word from file */)
{
/* add the word to the histogram */
}
/* sort the histogram array alphabetically (sort_histogram() is one more function you'll need to write) */
sort_histogram();
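/* print the histogram */
for (int n = 0; n < histogram_used; n++)
{
std::cout << histogram[ n ].str << " " << histogram[ n ].frequency << "\n";
}
/* the assignment also wants the array size: the capacity, not the number of words used */
std::cout << "array size: " << histogram_size << "\n";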
// Don't forget to free your histogram
delete [] histogram;
return 0;
}