Counting characters of each word in file

Hi,
I am a beginner in C++ programming and I have an exercise: count the characters of each word in a file (.txt) and find which word length is most common.
Sounds like a good starting problem.
What if two lengths are equally common?
Split that into smaller tasks.

* The words are in a file and you have to read them. Can you do that?
(Apparently we don't need to keep them, but as a test you could print each word on a separate line.)

* Count characters of a word. If you use std::string to hold a word, then this is quite trivial.
(But, if you should ignore punctuation, then it gets more interesting.)

You get a list of numbers from the above. One number per word.

* Count how many times each unique value occurs in the list.

* Each word length gets a count of occurrences. How do you find the largest value in such a list?
(Simpler case: how do you find the largest value in a list of numbers?)

Note that more than one length can be equally frequent.

Example: I, you, car, me, easy, them, giraffe
Lengths: 1, 3, 3, 2, 4, 4, 7

length: occurrences
1: 1
2: 1
3: 2
4: 2
7: 1

The most common are 3 and 4.
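
The counting and "find the largest" steps above can be sketched roughly as follows. This is only one possible way to do it, assuming the words have already been read into a std::vector<std::string>. The function name mostCommonLengths is made up for this illustration and is not part of the exercise.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns every word length that occurs most often.
// More than one length is returned when there is a tie.
std::vector<std::size_t> mostCommonLengths(const std::vector<std::string>& words) {
    // Key: word length, value: number of words with that length.
    std::map<std::size_t, int> occurrences;
    for (const std::string& word : words) {
        ++occurrences[word.size()];
    }

    // First pass: find the largest count.
    int best = 0;
    for (const auto& entry : occurrences) {
        if (entry.second > best) best = entry.second;
    }

    // Second pass: collect every length that reaches that count.
    std::vector<std::size_t> result;
    for (const auto& entry : occurrences) {
        if (entry.second == best) result.push_back(entry.first);
    }
    return result;  // ascending order, because std::map keeps its keys sorted
}
```

Called with the seven example words above, this should return the lengths 3 and 4. The two passes keep the tie handling explicit: first find the highest count, then report every length that has it.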
@keskiverto, Excellent explanation. I wish I had the patience/ability to explain things like that. It's definitely a skill in itself.

Punctuation can get a little hairy. It can attach to the beginning or end of a word. And then there's hyphenated words and that double-dash thing that often attaches to both words but should be seen as separating them. And should "'s" at the end of a word count as two "letters"? Or one? Or none?
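
As a rough sketch of the simplest rule only (ignore punctuation attached to the beginning or end of a word), something like the following could work. The name trimmedLength is invented for this example, and the choice to still count hyphens and apostrophes inside a word is an assumption; the exercise may want something different.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: length of a word after dropping punctuation
// attached to its beginning and end. Punctuation inside the word
// (the hyphen in "well-known", the apostrophe in "dog's") is still counted.
std::size_t trimmedLength(const std::string& word) {
    std::size_t first = 0;
    std::size_t last = word.size();
    // Skip leading punctuation.
    while (first < last && std::ispunct(static_cast<unsigned char>(word[first]))) {
        ++first;
    }
    // Skip trailing punctuation.
    while (last > first && std::ispunct(static_cast<unsigned char>(word[last - 1]))) {
        --last;
    }
    return last - first;
}
```

With this rule "giraffe," counts as 7 and a lone "--" counts as 0, so a zero result can be used to skip tokens that are not words at all.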
I can open a file:
fstream file, result;
file.open("intput.txt", ios::in);
result.open("output.txt", ios::out); 

But I have never used std::string, and I don't know how to use it correctly.
Keskiverto explained it well. I can't explain it nearly as well, but it may help to hear it from more than one source.

Here is one possible approach you could take:

1) Read in from a text file. Instead of using the >> operator, it may be more efficient to read in full lines, so use getline([stream],[string]) and store the result in a temporary string. Hint: use a while loop.
 
while(getline(input,line)){}


2) Now you can break this line into individual words using a stringstream
http://www.cplusplus.com/reference/sstream/stringstream/stringstream/

3) You will need a container to store your words in, use a vector.

4) Populate the vector WHILE strings are present in the stringstream. Hint: use an inner while loop.

 
 while(ss >> tempWord){}


5) Now that the line is broken into words, loop through the words if and ONLY if the vector holds more than 0 words (if it holds 0, this is an empty line, possibly spacing between paragraphs). Create an integer named count to hold the letter count, and declare it outside the main while loop.

6) Add the size of each word to the variable count.

count += words.at(i).size();  // i is the index of the current word in the loop from step 5


Note: you will also need to check whether the last character of a word is punctuation; if so, do not count it.

7) Clear the stringstream for the next line.


ss.str("");
ss.clear();


8) Print the number of letters.
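
Put together, steps 1 to 6 might look roughly like the sketch below. It is a variation, not the exact recipe: it records each word's length instead of storing the words in a vector, and it builds a fresh std::istringstream per line so the clearing in step 7 is not needed. The function name wordLengths is made up for this example, and punctuation is not handled.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the stream line by line, splits each line into
// words, and returns the length of every word in reading order.
std::vector<std::size_t> wordLengths(std::istream& input) {
    std::vector<std::size_t> lengths;
    std::string line;
    while (std::getline(input, line)) {      // step 1: one full line at a time
        std::istringstream ss(line);         // step 2: a fresh stream for this line
        std::string word;
        while (ss >> word) {                 // step 4: an empty line yields no words
            lengths.push_back(word.size());  // step 6: record the size of the word
        }
    }
    return lengths;
}
```

Because the function takes a std::istream&, it can be given a std::ifstream opened on the text file, or a std::istringstream when trying it out without a file.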


If you have any further questions, please do ask. Hopefully I can help, and hopefully this was of some help.
A simple way to keep track of the number of occurrences of each word size is to set up an array of, say, 50 integers, count[50]. I doubt there are many words over 50 characters.

So for giraffe, with 7 characters, add one to count[7]; for cat, add one to count[3]; and so on.

Now you can go through the list and discover all sorts of things, biggest word size, smallest (1), most frequent, least frequent (1) or (0)?
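
A minimal sketch of that array idea, under a few assumptions of mine: the array has 51 entries so that index 50 is valid (count[50] would only allow indices 0 to 49), longer words are simply skipped, and the names kMaxLength and countLengths are invented for the example.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Assumption taken from the post: words longer than 50 characters are rare.
const std::size_t kMaxLength = 50;

// Hypothetical helper: counts[n] holds how many words have exactly n characters.
// Words longer than kMaxLength are skipped so the index always stays in range.
std::array<int, kMaxLength + 1> countLengths(const std::vector<std::string>& words) {
    std::array<int, kMaxLength + 1> counts{};  // every entry starts at zero
    for (const std::string& word : words) {
        if (word.size() <= kMaxLength) {
            ++counts[word.size()];
        }
    }
    return counts;
}
```

Walking the returned array from index 1 upwards then answers the questions above: the highest index with a non-zero entry is the biggest word size, and the largest entry marks the most frequent length.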
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

void menu() {
	cout << "\n";
	cout << "MENU\n";
	cout << "| 1. Count lines\n";
	cout << "| 2. Count empty lines (without text)\n";
	cout << "| 3. Count lines which are longer than input n(integer)\n";
	cout << "| 4. Count an input word\n";
	cout << "| 0. Quit\n";
	cout << "\n";
}

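void countLines(string lines) {
	
	int numLines = 0;
	ifstream file;
	file.open("your_text_file.txt");
	// Loop on getline itself: testing eof() before reading counts one line
	// too many and never ends if the file could not be opened.
	while (getline(file, lines)) {
		numLines++;
	}
	cout << "In the file are " << numLines << " lines." << endl;
	file.close();
}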

void emptyLines(string lines) {
	
	int numLines = 0;
	ifstream file;
	file.open("your_text_file.txt");
	// Only lines that were actually read are tested.
	while (getline(file, lines)) {
		if (lines.empty())
			numLines++;
	}
	cout << "In the file are " << numLines << " lines without text." << endl;
	file.close();
}

void countNLines(string lines, int n) {
	
	int numLines = 0;
	ifstream file;
	file.open("your_text_file.txt");
	while (getline(file, lines)) {
		// Cast so a signed int is not compared with an unsigned length.
		if (static_cast<int>(lines.length()) > n)
			numLines++;
	}
	cout << "In the file are " << numLines << " lines longer than input length." << endl;
	file.close();
}

void countEnteredWord(string lines, string word) {
	
	int numWord = 0;
	ifstream file;
	file.open("your_text_file.txt");
	cout << "Enter the word: ";
	cin >> word;
	// Loop on the extraction itself so the last word is not compared twice.
	while (file >> lines) {
		if (word == lines)
			numWord++;
	}
	cout << "The entered word occurs " << numWord << " times in the file." << endl;
	file.close();
}
	

int main()
{
	string lines, word, text;
	
	int options, n;
	while (true) {
		menu();
		cin >> options;
		switch (options) {
		case 0:
			cout << "Quit";
			return 0;
		case 1:
			countLines(lines);
			break;
		case 2:
			emptyLines(lines);
			break;
		case 3:
			// Read n before calling countNLines; otherwise n is used uninitialized.
			cout << "Enter the length of the line n(integer): ";
			cin >> n;
			countNLines(lines, n);
			break;
		case 4:
			countEnteredWord(lines, word);
			break;
		default:
			cout << "This option doesn't exist, try again!!!";
		}
	}
}