Thanks for your reply.
1. I am trying to use SMILE (Structural Modeling, Inference, and Learning Engine) for this work. In SMILE, I iterate over the values with DSL_dataset::At and store the data row by row. Once the iteration is over, I want to compare the values in one column (they all belong to the same variable) using set::size().
2. Since I will discretize the data first, I only care about integer values when doing the comparison.
3. This is not homework. I tried to find an example of how to store a dataset in a set container and use set::size() to return the number of unique values in a column, but I could not find a good one. BTW, I am teaching myself the things you mentioned.
Well, I know nothing about SMILE or programming on a Mac, but I can babble semi-intelligently about the STL... :-,
A set keeps only unique elements, sorted by value, and does not remember the order in which they were inserted, so it does not look like the right structure to use to store your data. I would opt for a vector. Since everything is a number except for the first column (I don't know what 'high' and 'low' mean...) you might as well use doubles.
// If that 'high' and 'low' stuff means anything...
const double high_value = 10.0;
const double low_value = 0.0;
typedef vector <double> post_t;
typedef vector <post_t> all_posts_t;
You could make a more advanced class if you like (which inherits from vector or some SMILE type, so long as it provides iterators to the elements).
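Since you said you store the data row by row, here is a rough sketch of filling that vector-of-vectors. I don't know the SMILE API, so build_posts and its plain array argument are made-up stand-ins; in your code the inner loop would read each value from your dataset instead.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector <double> post_t;
typedef std::vector <post_t> all_posts_t;

// Hypothetical helper (not part of SMILE or the STL): copies a plain
// row-major array into the table, one row at a time.
all_posts_t build_posts( const double* values, std::size_t rows, std::size_t columns )
{
  all_posts_t posts;
  posts.reserve( rows );
  for (std::size_t r = 0; r < rows; ++r)
  {
    post_t post;
    post.reserve( columns );
    // Replace this read with whatever your dataset provides per cell.
    for (std::size_t c = 0; c < columns; ++c)
      post.push_back( values[ r * columns + c ] );
    posts.push_back( post );
  }
  return posts;
}
```

Every row ends up with the same number of columns, which is what lets you pick values out of a column by index later on.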
To compare unique values in a column, you need to be able to collect the values out of that column, and then you need to condense them down to the distinct ones. The STL's <algorithm> header provides a number of useful functions for just this kind of thing.
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>
...
using namespace std;
// This is my transformation functor.
// Given a column number (at construction) and a post_t (when used),
// it returns the value in the given column.
//
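struct f_extract_column
{
  int column_number;
  f_extract_column( int column_number ): column_number( column_number ) { }
  // const, since extracting a value does not modify the functor
  double operator () ( const post_t& post ) const
  {
    // at() throws std::out_of_range if the row is too short
    return post.at( column_number );
  }
};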
// This function counts the number of unique values in a column by
// first copying the column values out into a std::set and then returning
// the size of the set.
//
int unique_values_in_column( const all_posts_t& posts, int column_number )
{
  // Taking posts by const reference avoids copying the whole table.
  set <double> column;
  transform(
    posts.begin(),
    posts.end(),
    insert_iterator <set <double> > ( column, column.begin() ),
    f_extract_column( column_number )
  );
  return static_cast <int> ( column.size() );
}
Hmm... I'm sure there is a more elegant way to do this, but just off the top of my head that's it for now...
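For what it's worth, one variant that might read more simply is a plain loop that inserts straight into the set, with no functor at all. This is only a sketch under the same typedefs as above; count_unique_in_column is just a name I picked.

```cpp
#include <cstddef>
#include <set>
#include <vector>

typedef std::vector <double> post_t;
typedef std::vector <post_t> all_posts_t;

// Hypothetical alternative: walk the rows and insert the chosen column's
// value into a std::set. Duplicates are dropped by the set, so its size
// is the number of distinct values.
std::size_t count_unique_in_column( const all_posts_t& posts, std::size_t column_number )
{
  std::set <double> seen;
  for (all_posts_t::const_iterator it = posts.begin(); it != posts.end(); ++it)
    seen.insert( it->at( column_number ) );  // at() throws if the row is too short
  return seen.size();
}
```

Comparing doubles for exact equality is usually risky, but if the values really are discretized to whole numbers before they go in, as you describe, it should behave as expected. You could also switch the element type to int in that case.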