Hashing way of bounded non-sequential in

Hashing way of bounded non-sequential index for contiguous data structure

Dear experts

I am amazed at performance promotion by using contiguous data structure (std::vector) instead of linked data structures.

I often have no choice but to use bounded non-sequential index (e.g. 1,5, 10, 12 ..) for data collections. In such a situation, associative array is well-known technique as the below,

std::unordered_map<int, Data> dat;
dat[1]= Data();
dat[5]= Data();
// ... and so on
// and loop access to dat

When I change it into contiguous way, I can obtain dramatic speed up for (sequential) accessing each element (probably due to cache efficiency)

std::map<int, int> index;
std::vector<Data> dat;

int ind = 0;
index[1] = ind; ind++; dat.push_back(Data());
index[5] = ind; ind++; dat.push_back(Data());
// ... and so on
// and loop access to dat via index;

My question is, is there more smart way to construct bounded non-sequential index mapping to sequential index, not by std::map(or std::unordered_map) but by cheaper way (such as hash functions).

(index range is bounded R: [min_index : max_index] (interval is irregular), R->[0,1,2,....])

Kind regards

Last edited on

dutch (2548)

not by std::map (or std::unordered_map) but by cheaper way (such as hash functions).

std::unordered_map is a hash table.

Mitsuru (156)

Thank you for your reply.

std::unordered_map is a hash table.

Yes, it is. std::unordered_map is hash techiniques (1) hasing "key" by hash function (2) process hash table (collision check/store etc).

I wonder if my first way is to do something with unnecessary process (e.g. is hash table needed, otherwise just hash function works well? do more suitable hash table exist?).

I appologize for my ambiguous questions. But, I think this indexing problems seem common to programming. I wonder if there is more sophisticated way.

Kind regards

JLBorges (13770)

Have you considered using a sparse representation of the matrix / vector?
For instance, using Eigen: https://eigen.tuxfamily.org/dox-devel/group__TutorialSparse.html

Some other libraries: https://en.wikipedia.org/wiki/Sparse_matrix#Software

Mitsuru (156)

Dear JLBorges

Thank you for your profound reply.

I am willing to consider the data structure, compressed sparse row (CSR) as you told.

BTW, I have noticed that some (originally) linked data structures are proposed to be represented as contiguous form, which not only save memory space, but also lead to efficient computations based on sequential & cache efficiency. But, such a contiguous form seems to entail usage limitations (e.g. size of elements is previously known, dynamic construction is inefficient, random access is inefficient, just for sequential access etc.)

I would like to be able to use them properly (with knowledge of trade-off and limitation) in case-by-case manner.
(If someone knows summary or review of such a data structure, I hope you tell me)

Kind regards

Last edited on

Topic archived. No new replies allowed.

Hashing way of bounded non-sequential index for contiguous data structure

C++

Forum