Parallel programming working with large amounts of data

Hi,

I need to speed up the execution of code in some software I have. The process doesn't use more than 10% of my processor. It's written in VC6 and I'm now migrating to C++ Builder 2009 or VC2008. When I do this I want to speed up the execution of one function, which does the following:

There are two large arrays, array[50000][3] and array2[50000][3], and using these two arrays of numbers I need to do a very intensive calculation. It takes around 3-5 minutes on my Core i7 920 processor but only utilizes about 10-15% of the processor. It actually runs faster on older CPUs which aren't multicore.

Can someone recommend any libraries I can use to take advantage of threads to speed up this process?

I do know that working on shared data is a problem, but the calculation only reads the data, and having a copy of the data for each thread is not a problem: I have no memory limitations, as it's scientific software.

Thank you for reading this; any help/advice/links will be appreciated.

MC.
See the second part of my first post: http://www.cplusplus.com/forum/general/12018/
Thanks for the reply. I had OpenMP in mind, but C++ Builder 2009 doesn't support it, although it does support Boost, and there is always the WinAPI.

Unlike in that post, I don't have outputs everywhere; it's just one big calculation, multiple nested for loops in one function, which assigns the final result to a variable. This is the only part I want to execute using threads.

It looks like you have experience in this. Could you advise an approach if array1 has to be written to and array2 is read-only? Do you think it would be efficient to split array1 logically, letting one thread do the first 50% and a second thread work on the rest of the array?

What I usually do is let thread i start processing at element i and advance by n elements. For example, for n=4, thread 1 would process elements 1, 5, 9, ...
I find that that's easier to manage and it makes it easier to convert single-threaded code to multi-threaded.
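To make that concrete, here is a minimal sketch of the interleaved scheme. It is only an illustration under some assumptions: it uses std::thread from C++11 as a stand-in for whichever thread library you end up with, it counts threads from 0 rather than 1, and heavyCalc, stridedWorker and runStrided are made-up names, not anything from your code. Each thread writes only to its own elements of the output and the input is only read, so the threads never touch the same element.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-element calculation; stands in for the real one.
double heavyCalc(double x) { return x * x + 1.0; }

// Thread `id` handles elements id, id + n, id + 2n, ... so no two threads
// ever write to the same element of `out`. `in` is only read.
void stridedWorker(const std::vector<double>& in, std::vector<double>& out,
                   std::size_t id, std::size_t n)
{
    for (std::size_t k = id; k < in.size(); k += n)
        out[k] = heavyCalc(in[k]);
}

// Runs the calculation over `in` with `n` threads (n >= 1) and returns the results.
std::vector<double> runStrided(const std::vector<double>& in, std::size_t n)
{
    std::vector<double> out(in.size());
    std::vector<std::thread> threads;
    for (std::size_t id = 0; id < n; ++id)
        threads.emplace_back(stridedWorker, std::cref(in), std::ref(out), id, n);
    for (std::thread& t : threads)
        t.join();  // wait for every thread before reading `out`
    return out;
}
```

Calling runStrided(data, 4) would process the elements with four threads. One thing to keep in mind: neighbouring elements written by different threads can share a cache line, which may cost some speed, so it is worth timing this against a split into contiguous blocks.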
I'm relatively new to this, but wouldn't that cause data race problems? Could you link me to an example or some documentation? I haven't been able to find anything useful because I'm not sure exactly what I'm looking for.
Visual C++ 2008 Professional ships with OpenMP support. So does Visual C++ 2005 Professional, but you have to set compiler parameters for each project to use OpenMP: [1] select Configuration Properties -> C/C++ -> Code Generation -> Runtime Library -> Multi-threaded (/MT); [2] always build the project as "Release" rather than "Debug". If you do not see the "VCOMP90D.DLL" error after steps [1] and [2] you are in luck, because that error takes extra effort to fix. In both versions the OpenMP pragmas are only honored when the project is compiled with /openmp (C/C++ -> Language -> OpenMP Support). For these reasons I recommend VC++ 2008 Pro for OpenMP. BTW: VC++ 2008 Express Edition does not support OpenMP.

Depending on the computation you want to perform on the two arrays, you need to change the current sequential algorithm into a parallel one, i.e., decide which variables should be shared among the threads and which can be local to each thread (each thread having its own copy). The following code illustrates the point with an example:
// sum up n elements of array a: sequential algorithm
// (a is assumed to be an int array with at least n elements)
int sum = 0;
int n = 5;
for (int i = 0; i < n; i++)
    sum += a[i];


The corresponding OpenMP code looks as follows:
********* for illustrative purposes *********
int sum = 0;
int n = 5;
#pragma omp parallel shared(n, a, sum) // OpenMP directive: start a team of threads
{
    int sumLocal = 0; // declared inside the region, so each thread has its own copy
    #pragma omp for // OpenMP directive: split the loop iterations among the threads
    for (int i = 0; i < n; i++)
        sumLocal += a[i];
    #pragma omp critical
    sum += sumLocal; // only one thread does the update at a time
} /*-- End of parallel region --*/
printf("Value of sum after parallel region: %d\n", sum);
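For a plain sum like this, OpenMP also has a reduction clause that handles the per-thread partial sums and the final combination for you. The following is just a sketch of the same loop written that way; parallelSum is a made-up name, and the code assumes OpenMP is enabled in the compiler (without it the pragma is ignored and the loop simply runs sequentially).

```cpp
// Sums the first n elements of a. With OpenMP enabled, each thread gets a
// private copy of `sum`, and the copies are added together after the loop.
int parallelSum(const int* a, int n)
{
    int sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

This gives the same result as the explicit critical section, but there is less to get wrong.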


It looks like OpenMP is a perfect solution to your problem. You may pick up any of the OpenMP books recommended at http://openmp.org/wp/. Good luck!
(If you PM me your code, I will see if I can work out an OpenMP solution for you).
You may also download the free, open-source, cross-platform Boost C++ libraries from
http://www.boost.org/

and use its thread class for the parallel computation. This assumes three things:

(1) you have worked out a parallel algorithm, with a prototype such as template <typename T> T ArrayProduct(T array1, T array2), for computing the product of array1[50000][3] and array2[50000][3]; (2) ArrayProduct() is reentrant, so multiple threads can execute it concurrently; (3) you want ten threads working concurrently on the computation. The Boost C++ code would then look as follows:
#include <boost/thread/thread.hpp>
#include <iostream>

// array1, array2 and ArrayProduct() are assumed to be declared elsewhere.
void EveryThreadDoThis()
{
    ArrayProduct(array1, array2);
}

int main(int argc, char* argv[])
{
    boost::thread_group threads; // use boost's thread_group to
    for (int i = 0; i < 10; ++i) // create ten threads
        threads.create_thread(&EveryThreadDoThis); // each of them does its computation concurrently
    threads.join_all(); // wait until all threads complete their computation
}


It looks (and indeed is) very simple using Boost C++, but the complexity lies in designing the parallel algorithm ArrayProduct and in dealing with synchronization among the ten threads, if such an issue arises.
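One part of that design is giving each thread its own piece of the work, since in the snippet above every thread calls ArrayProduct with the same arguments. Below is a rough sketch of one way to do it, splitting the rows into contiguous blocks. It rests on several assumptions: std::thread (C++11) is used as a stand-in for boost::thread, the per-row formula is invented purely for illustration, each row's result is assumed to go into the third column of array1 while array2 is only read, and Row, processRows and processInBlocks are made-up names.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

typedef std::array<double, 3> Row;  // one row of array[50000][3]

// Hypothetical stand-in for the real calculation on rows [begin, end).
// It writes only to column 2 of its own rows of a1 and only reads a2.
void processRows(std::vector<Row>& a1, const std::vector<Row>& a2,
                 std::size_t begin, std::size_t end)
{
    for (std::size_t r = begin; r < end; ++r)
        a1[r][2] = a1[r][0] * a2[r][0] + a1[r][1] * a2[r][1];
}

// Splits the rows into numThreads contiguous blocks, one per thread.
void processInBlocks(std::vector<Row>& a1, const std::vector<Row>& a2,
                     std::size_t numThreads)
{
    const std::size_t rows = a1.size();
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < numThreads; ++t) {
        std::size_t begin = rows * t / numThreads;      // first row of this block
        std::size_t end = rows * (t + 1) / numThreads;  // one past its last row
        threads.emplace_back(processRows, std::ref(a1), std::cref(a2), begin, end);
    }
    for (std::thread& th : threads)
        th.join();  // wait until all blocks are done
}
```

Because the blocks do not overlap and array2 is never written, no locking is needed in this sketch. If the real calculation for one row reads values that another thread writes, that no longer holds and some synchronization would be required.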

Hope this helps. Have fun with parallel computation!

Visual Studio 6 included a profiler. Why not run it thru the profiler to see where the time's being spent?
Thanks for the replies. OpenMP looks really good; unfortunately for me I have to use BCB2007/2009, which means OpenMP is out of the question. I will probably have to use Boost.

Can I instead have, let's say, two threads with two different methods, one iterating over all EVEN indices from 0 to n and the other over all ODD indices from 1 to n? The calculation involves setting the 3rd column of the entire array1; array2 is only read from.

I found some examples using the WinAPI, but read that using the WinAPI in C++ to create and terminate threads causes memory leaks. Does this hold true?

Why does the function have to be prototyped as typename T?

Sorry for all these questions that might sound silly to some; it's hard to find one source of good information like you guys =)
->Can I instead have, let's say, two threads with two different methods, one iterating over all EVEN indices from 0 to n and the other over all ODD indices from 1 to n? The calculation involves setting the 3rd column of the entire array1; array2 is only read from.
Yes, you can.

->I found some examples using the WinAPI, but read that using the WinAPI in C++ to create and terminate threads causes memory leaks. Does this hold true?
I do not think so.

->Why does the function have to be prototyped as typename T?
You may choose whatever function signature you like. typename T was just a placeholder I made up.
Thank you very much, all of you. I will post again if I have any further questions. C++ Builder 2009 does come with the Boost libraries, and I'll let you know the results.
Is there any way to isolate the boost thread library?
What do you mean?
The Boost libraries are 170 MB unzipped, and I don't want to have to include all that just to use boost::thread. Isn't there a way I can get just the headers needed for thread alone somewhere?
You can get (on Windows) only the modules and headers you need. On Linux, you'll have to download and compile the entire library.
Yes, that's what I'm interested in; it's come down to using BCB2007. Do you know anywhere I can download this set of headers by themselves?
Thank you all, I've solved all my problems =) I should have come across the Boost libraries earlier.