How to find the mode of a vector of ints

Hi everyone!

I have been using this site for a bit now and I can't do any studying without it. I am just finishing a C++ course so I am somewhat versed in C++.

My problem is finding the mode of a vector of ints. The user enters whole-number values, positive or negative, in any order, and the program displays the mode along with other statistics. I found some code on this site that helped somewhat, but it did not work as posted, so I had to revise it.

The part of the code that is giving me trouble is inside the loop that iterates over the vector.

	copy(istream_iterator<double>(cin), istream_iterator<double>(), back_inserter(v));
	sort(v.begin(), v.end());

	double mode = 0;
	double curVal = 0;
	double localFreq = 0;
	double maxFreq = 0;

	for( vector<double>::const_iterator iter = v.begin(); iter != v.end(); ++iter ) {
		sum = sum + *iter;
		
		if( curVal != *iter ) {
			curVal = *iter;
			localFreq = 0;
		}

		while( *iter == curVal ){
			localFreq++;
			break;
		}

		if( localFreq > maxFreq ) {
			maxFreq = localFreq;
			mode = curVal;
		}

		cout << *iter << "\t= " << sum << endl;
	}


I display the output further down in the code, as follows:
	// Mode
	if( localFreq == maxFreq )
		cout << "\nThere is no mode\n" << endl;
	else
		cout << "\nMode = " << mode << "\n" << endl;


Now this code 'works', but only under certain circumstances. If you enter 1 1 2 2 3, it reports Mode = 1, which is wrong: 1 and 2 are tied, so there is no single mode. And if you enter 1 2 3 3, it reports that there is no mode, although the mode is 3.

Any help is appreciated, and thank you!
Your code doesn't find a mode if the numbers aren't sorted already. You should create something that maps each number to the number of times it appears in the vector and then find the one with the highest count.
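For anyone wondering what such a mapping could look like, here is a minimal sketch using std::map. It assumes a C++11 compiler, and the name find_modes is made up for this example; it is not taken from any code in this thread. It returns every value tied for the highest count, so the caller decides how to report ties.

```cpp
#include <map>
#include <vector>

// Returns every value that occurs most often in `values`.
// An empty input gives an empty result.
std::vector<int> find_modes(const std::vector<int>& values) {
    std::map<int, int> counts;              // value -> number of occurrences
    for (int v : values) {
        ++counts[v];                        // operator[] starts a new entry at 0
    }

    int best = 0;                           // highest count seen
    for (const auto& entry : counts) {
        if (entry.second > best) best = entry.second;
    }

    std::vector<int> modes;
    for (const auto& entry : counts) {
        if (entry.second == best) modes.push_back(entry.first);
    }
    return modes;                           // ascending: std::map iterates in key order
}
```

With the input 1 1 2 2 3 this returns {1, 2}, and with 1 2 3 3 it returns {3}. The map does not need the input to be sorted.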
The numbers are sorted: the second line of the snippet sorts the vector, and I have tested it to make sure it does. But if I were to map the numbers, how would I go about doing that?
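Since the vector is sorted, equal values sit next to each other, so the runs can also be counted in a single pass without a map. One likely reason the original snippet misreports is that its final test compares localFreq, which by then holds the count of the last run only, against maxFreq. The sketch below avoids that by collecting the modes while scanning. The name modes_of_sorted is hypothetical, and it assumes the caller has already sorted the input.

```cpp
#include <cstddef>
#include <vector>

// Assumes `sorted` is in ascending order, so equal values are adjacent.
// Returns every value whose run is the longest; empty input gives an empty result.
std::vector<double> modes_of_sorted(const std::vector<double>& sorted) {
    std::vector<double> modes;
    std::size_t best = 0;                   // length of the longest run so far
    std::size_t i = 0;
    while (i < sorted.size()) {
        std::size_t j = i;
        while (j < sorted.size() && sorted[j] == sorted[i]) ++j;  // end of this run
        std::size_t run = j - i;
        if (run > best) {                   // strictly longer run: start over
            best = run;
            modes.clear();
        }
        if (run == best) modes.push_back(sorted[i]);
        i = j;                              // jump to the next distinct value
    }
    return modes;
}
```

For the sorted input 1 1 2 2 3 this gives {1, 2}, and for 1 2 3 3 it gives {3}.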
OK, I've done a lot of work and got it to work, well, mostly. It is in a very different format from what I had before.
	double mode;

	vector<double> v;
	vector<int> vOccurrence;

	set<double> occurSet;
	
	copy(istream_iterator<double>(cin), istream_iterator<double>(), back_inserter(v));
	sort(v.begin(), v.end());

	vOccurrence.resize(v.size());

	for( vector<double>::const_iterator iter = v.begin(); iter != v.end(); ++iter ) {
		sum = sum + *iter;//used for mean

		// vOccurrence will be used later to find the mode
		for( size_t i = 0; i < v.size(); ++ i) {
			if( *iter == v[i] )
				++vOccurrence[i];
		}
	}


The code above runs through the user's input vector (v). For each element, it goes through v again, counts how many times that value occurs, and stores the count in another vector (vOccurrence).

	// Mode
	for( vector<int>::const_iterator iter = vOccurrence.begin(); iter != vOccurrence.end(); ++iter ) {
		if( *iter > maxOccur ) {
			maxOccur = *iter;
		}
	}
	for( size_t i = 0; i < vOccurrence.size(); ++i) {
			if( vOccurrence[i] == maxOccur ) {
				mode = v[i];
				occurSet.insert(mode);
			}
	}
	cout << "\nMode =";
	for( set<double>::const_iterator iter = occurSet.begin(); iter != occurSet.end(); ++iter )
		cout << " " << *iter << endl;


The code above runs through the vOccurrence vector to find the largest count, which belongs to the most frequently occurring number. Since v and vOccurrence are the same size, their elements correspond.

That is, v[0] holds the first value in the sorted input, and vOccurrence[0] holds the number of times that value occurs.

But since several elements of vOccurrence can hold that largest count, I put the most frequently occurring inputs into a set, which gets rid of duplicates. So this outputs a single mode or, if there are several, all of them.

e.g. user's input: 1 1 2 3 4 4 5 6 7 7 output: Mode = 1 4 7

With that ironed out, I still have to handle the possibility of no mode. As it stands now, if the user inputs 1 2 3 4, the output will be Mode = 1 2 3 4. Which is not correct. Again, any help is appreciated!
i am planning now of making a program that will do the love game "flames"
david leonard wrote:
Which is not correct.

It seems correct to me. 1, 2, 3, and 4 all appear the most frequently in the data set. I don't think you could create a set of data without a mode outside of an empty set.
I agree with Zhuge, but if you still want to prevent that behavior, you can keep track of the highest frequency, and clear() the vector if it turns out to be 1.
@dillu333: Could you please try to contribute better to discussions, instead of basically spamming threads with a vaguely related, barely readable comment? And this forum does not have a signature option for a reason; I don't want to see your website ad.
Don't bother. It's a bot.
I understand the point that Zhuge made, and it is correct, but unfortunately that's not how I am supposed to submit it. If every number given occurs only once, I have to display no mode. What I am thinking of doing now is going through the occurrence vector: if all values are equal to 1, then there is no mode, because each number has occurred once (thank you to helios). I will update you once I get it working.

Thanks for the input
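The rule described above (no mode when every count is 1) boils down to checking whether the largest count is greater than 1. A small sketch of that check follows; the name has_mode is invented here, and the vector is assumed to be laid out like vOccurrence, with one count per input value.

```cpp
#include <algorithm>
#include <vector>

// `occurrences[i]` is how many times the i-th input value appears,
// in the same layout as the vOccurrence vector in this thread.
// Returns false when every value appears once, or when there is no input.
bool has_mode(const std::vector<int>& occurrences) {
    if (occurrences.empty()) return false;
    return *std::max_element(occurrences.begin(), occurrences.end()) > 1;
}
```

For the input 1 2 3 4 the counts are 1 1 1 1, so has_mode returns false. For 1 2 2 the counts are 1 2 2 and it returns true.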
So I now have the mode working great, thank you again. The final step in this project is a frequency distribution graph. I have figured out the code for the increment value, but now I'm stuck. I have to split whatever the data set is into 10 ranges, count how many of the numbers fall in each range, and then give that count as a fraction of all the numbers.

e.g.

if user input is: 1 2 3 4 5
output would be:
[1.00..1.50) = 1 : 0.20
[1.50..2.00) = 0 : 0.00
[2.00..2.50) = 1 : 0.20
[2.50..3.00) = 0 : 0.00
[3.00..3.50) = 1 : 0.20
[3.50..4.00) = 0 : 0.00
[4.00..4.50) = 1 : 0.20
[4.50..5.00) = 0 : 0.00
[5.00..5.50) = 1 : 0.20
[5.50..6.00) = 0 : 0.00

I have the increment value code here:
	// frequency distribution
	double increment = ( ( max + 1 ) - min ) / 10;


Any ideas on how I should go about doing this? I have a vague idea that I am trying to work with, but I don't think it will work. Again, any help is appreciated.
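One possible way to do the counting, given the increment above, is to turn each value directly into a range index instead of testing it against every range. This is only a sketch: bin_counts and its parameters are names made up for this example, and it assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <vector>

// Counts how many values fall into each of `bins` equal-width, half-open ranges
// [lo + k*width, lo + (k+1)*width). Values outside every range are skipped.
std::vector<int> bin_counts(const std::vector<double>& values,
                            double lo, double width, std::size_t bins) {
    std::vector<int> counts(bins, 0);
    if (width <= 0) return counts;          // nothing sensible to count
    for (double v : values) {
        if (v < lo) continue;               // below the first range
        double idx = (v - lo) / width;      // which range v lands in
        if (idx >= static_cast<double>(bins)) continue;  // above the last range
        ++counts[static_cast<std::size_t>(idx)];
    }
    return counts;
}
```

With the input 1 2 3 4 5, lo = 1 and width = ((5 + 1) - 1) / 10 = 0.5, the ten counts come out as 1 0 1 0 1 0 1 0 1 0. Dividing each count by the number of inputs (5) gives the 0.20 and 0.00 values in the example output.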
So I got the frequency graph to work. It took some creative thinking, and personally I think it is a little messy, but it does do the job... I think. Could people please look over what I have written and let me know if something seems wrong or if there is anything I should change?

I know that some people on this site are in my class; I have seen similar posts with questions about the project(s) in the same time period as the assignment. So if you intend to use the code I have provided, please, please, please make sure that you UNDERSTAND what I have written, and please change the variable names.

I will be posting the whole program after the assignment due date, in case anyone is interested in it or just wants something to reference. I am not the greatest coder, but I would like to share my work as an AID for people.

Anyway, now that I've said my piece, this is the frequency graph code to look over.

	double range = ( ( max + 1 ) - min ) / 10;
	prevVal = range + min;

	for( int i = 0; i < 10; ++i ) {
		c = 0;

		if( i == 0 ) {
			cout << "[" << min << ".." << prevVal << ")";

			for( vector<double>::const_iterator iter = v.begin(); iter != v.end(); ++iter )
				if( *iter < prevVal )
					c = c + 1;

			cout << " = " << c << " : " << c / v.size()  << endl;
		}
		else {
			cout << "[" << prevVal << "..";
			nextVal = range + prevVal;
			cout << nextVal << ")";

			for( vector<double>::const_iterator iter = v.begin(); iter != v.end(); ++iter )
				if( *iter >= prevVal && *iter < nextVal)
					c = c + 1;

			if( c >= 1 )
				cout << " = " << c << " : " << c / v.size() << endl;
			else
				cout << " = " << 0 << " : " << 0 << endl;
			
			prevVal = range + prevVal;
			
		}
	}


Thank you to everyone who commented. Although it was not too much help, I appreciate everyone who contributed to help solve my problems. I will use this site in the future and plan on helping others... if I can.

Cheers!
Hey man, great stuff. I am really stuck on the mode still... I am very close but don't know why it is not working. Can you please email me your code? I will not in any way, shape or form steal it from you; I just want to understand how you have done it. I have been stuck on this for 8 hours now... Anyway, thanks for all the help. My email is bend0verplz@hotmail.com (the 0 is a zero!) if you decide to send it to me.

Thanks a lot,

Jackson
OK, so I have figured out how to get the mode and the no-mode case... Now I am stuck on multiple modes: every time I run the program, it gives me no mode when there is more than one. Any ideas?
Jackson, I am really sorry I didn't respond last night; I hope you were able to find help. But here is the code for anyone who is interested.

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <cmath>    // pow, sqrt
#include <iterator> // istream_iterator, back_inserter
#include <set>
using namespace std;

int main() {
	cout << "Statistics, <c> 2010 David Leonard\n" <<
		"Enter integer values separated by white spaces, when done enter ^Z ( Ctrl + Z ).\n" << endl;

	double sum = 0;
	double variance = 0;
	double sDeviation = 0;
	double maxOccur = 0;
	double mode;	
	int ckOccur = 0;

	vector<double> vUserInput;
	vector<int> vOccurrence;

	set<double> occurSet;

	copy(istream_iterator<double>(cin), istream_iterator<double>(), back_inserter(vUserInput));
	sort(vUserInput.begin(), vUserInput.end());
	
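	// front() and back() are undefined on an empty vector, so stop if nothing was entered
	if( vUserInput.empty() ) {
		cout << "\nNo values were entered." << endl;
		return 0;
	}

	double min = vUserInput.front();
	double max = vUserInput.back();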

	vOccurrence.resize(vUserInput.size());

	for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter ) {
		sum = sum + *iter;//used for mean

		// vOccurrence will be used later to find the mode
		for( size_t i = 0; i < vUserInput.size(); ++ i) {
			if( *iter == vUserInput[i] )
				++vOccurrence[i];
		}
	}		

	// N/min/max values
	cout << "\nResults:"
			"\nN = " << vUserInput.size() << 
			"\nMin = " << min <<
			"\nMax = " << max << endl;
	//end n/min/max values

	// Arithmetic mean
	double mean = sum / vUserInput.size();
	cout << "Arithmetic Mean = " << mean << endl;
	//end mean

	//Statistical Median
	if( vUserInput.size() % 2 == 1 )
		cout << "Statistical Median = " << vUserInput[vUserInput.size()/2] << endl;
	else if( vUserInput.size() % 2 == 0 ){
		double medianEle1 = vUserInput[(vUserInput.size() / 2) - 1];
		double medianEle2 = vUserInput[vUserInput.size() / 2];
		double median = (medianEle1 + medianEle2)/2;
		cout << "Statistical Median = " << median << endl;
	}//end statistical median

	// Variance
	for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter ) {
		variance = pow( (*iter - mean), 2 ) + variance;
	}
	variance = variance / vUserInput.size();
	cout << "Variance = " << variance << endl;
	//end variance

	// Standard deviation
	sDeviation = sqrt( variance );
	cout << "Standard Deviation = " << sDeviation << endl;
	//end standard deviation

	// Mode
	for( vector<int>::const_iterator iter = vOccurrence.begin(); iter != vOccurrence.end(); ++iter ) {
		if( *iter == 1 )//if the vector is filled with 1's it means each number has occurred only once, therefore no mode.
			ckOccur = ckOccur + 1;
		else if( *iter > maxOccur )
			maxOccur = *iter;
	}

	for( size_t i = 0; i < vOccurrence.size(); ++i) {
			if( vOccurrence[i] == maxOccur ) {
				mode = vUserInput[i];
				occurSet.insert(mode);
			}
	}

	if( ckOccur == vOccurrence.size() )
		cout << "\nThere is no mode\n" << endl;
	else {
		cout << "\nMode =";
		for( set<double>::const_iterator iter = occurSet.begin(); iter != occurSet.end(); ++iter )
			cout << " " << *iter;
		cout << "\n" << endl;
	}//end mode

	// frequency distribution
	double prevVal = 0;
	double nextVal = 0;
	double c = 0;

	double range = ( ( max + 1 ) - min ) / 10;
	prevVal = range + min;

	for( int i = 0; i < 10; ++i ) {
		c = 0;//zero out c so that we can start a new count

		if( i == 0 ) {
			cout << "[ " << min << " . . " << prevVal << " )";

			for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter )
				if( *iter < prevVal )
					c = c + 1;

			cout << "\t= " << c << " : " << c / vUserInput.size()  << endl;
		}
		else {
			cout << "[ " << prevVal << " . . ";
			nextVal = range + prevVal;
			cout << nextVal << " )";

			for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter )
				if( *iter >= prevVal && *iter < nextVal)
					c = c + 1;

			if( c >= 1 )
				cout << "\t= " << c << " : " << c / vUserInput.size() << endl;
			else
				cout << "\t= " << 0 << " : " << 0 << endl;
			
			prevVal = range + prevVal;
			
		}
	}//end frequency distribution

	cout << "\n" << endl; //giving some space between the two graphs

	// histogram
	prevVal = 0;
	nextVal = 0;
	c = 0;

	range = ( ( max + 1 ) - min ) / 10;
	prevVal = range + min;

	for( int i = 0; i < 10; ++i ) {
		c = 0;//zero out c so that we can start a new count

		if( i == 0 ) {
			cout << "[ " << min << " . . " << prevVal << " )\t";

			for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter )
				if( *iter < prevVal )
					cout << '*';

			cout << endl;
		}
		else {
			cout << "[ " << prevVal << " . . ";
			nextVal = range + prevVal;
			cout << nextVal << " )\t";

			for( vector<double>::const_iterator iter = vUserInput.begin(); iter != vUserInput.end(); ++iter )
				if( *iter >= prevVal && *iter < nextVal)
					cout << '*';

			cout << endl;
			
			prevVal = range + prevVal;			
		}

	}//end histogram

}//end main 
No problem! Thanks for the post onto the mp3 project lol it seems a LOTTTTT harder...