MPI send and receive

(1) Put code in code tags. Otherwise it is unreadable.

(2) Start with something simple - like sending a small 1-d array from one processor to another (see the sketch below).

Your code as it stands is unsalvageable.
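
For what it's worth, a minimal sketch of that "something simple" - sending a small 1-d array from rank 0 to rank 1 - could look like this (it assumes exactly two processes and is just an illustration, not part of the program further down):

#include <iostream>
#include "mpi.h"

int main( int argc, char* argv[] )
{
   int rank;
   MPI_Init( &argc, &argv );
   MPI_Comm_rank( MPI_COMM_WORLD, &rank );

   const int N = 5;
   double a[N];

   if ( rank == 0 )
   {
      for ( int i = 0; i < N; i++ ) a[i] = i + 1;          // fill with 1, 2, 3, 4, 5
      MPI_Send( a, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD );  // send N doubles to rank 1, tag 0
   }
   else if ( rank == 1 )
   {
      MPI_Status stat;
      MPI_Recv( a, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &stat );   // receive from rank 0
      for ( int i = 0; i < N; i++ ) std::cout << a[i] << ' ';
      std::cout << '\n';
   }

   MPI_Finalize();
}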
What does unsalvageable mean?
It means it's so broken that it's preferable to rewrite it from scratch rather than attempting to fix it.

By the way, dictionaries exist.
If you want a cheap and nasty version for TWO processors you can try this.

For convenience, I've flattened the arrays to 1-d, and used vectors rather than new/delete. (You can get at the data buffer with the .data() member function).
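
To make the flattening concrete (a tiny stand-alone illustration, not part of the program below): element (i, j) of a rows x cols matrix stored row-major lives at index i*cols + j, and .data() gives the raw pointer that MPI_Send/MPI_Recv expect.

#include <iostream>
#include <vector>
using namespace std;

int main()
{
   int cols = 3;
   vector<double> A = { 1, 2, 3,
                        4, 5, 6 };          // conceptually a 2 x 3 matrix, stored row-major
   int i = 1, j = 2;
   cout << A[i * cols + j] << '\n';         // element (1,2) -> prints 6
   double *buf = A.data();                  // raw pointer to the first element,
   cout << buf[0] << '\n';                  // suitable as an MPI buffer argument; prints 1
}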

#include <iostream>
#include <vector>
#include <numeric>
#include "mpi.h"
using namespace std;

int main( int argc, char* argv[] )
{
   int rank, nproc;
   MPI_Status stat;
   int nums[3];
   int rows, cols, r0, r1;

   // Initialise MPI
   MPI_Init( &argc, &argv );
   MPI_Comm_size( MPI_COMM_WORLD, &nproc );
   MPI_Comm_rank( MPI_COMM_WORLD, &rank  );

   int tag = 1;    // Not crucial, but useful to increment for debugging a crash

   if ( rank == 0 )                                   // Root processor
   {
      rows = 4;
      vector<double> A =                              // "Flattened" array (4 x 5 matrix)
                   { 1, 2, 3, 4, 5,
                     6, 7, 8, 9, 10,
                     11, 12, 13, 14, 15,
                     16, 17, 18, 19, 20 };
      vector<double> B = { 10, 20, 30, 40, 50 };      // RHS (5-element vector)
      vector<double> result( rows );

      // Do half the rows on the root and half on the other processor
      r0 = rows / 2;   r1 = rows - r0;   cols = B.size();
      nums[0] = r0;   nums[1] = r1;   nums[2] = cols;

      // Send data to other processor
      MPI_Send( nums, 3, MPI_INT, 1, tag++, MPI_COMM_WORLD );
      MPI_Send( A.data() + r0 * cols, r1 * cols, MPI_DOUBLE, 1, tag++, MPI_COMM_WORLD );
      MPI_Send( B.data(), cols, MPI_DOUBLE, 1, tag++, MPI_COMM_WORLD );

      // Do multiplies for the first r0 rows
      for ( int i = 0; i < r0; i++ ) result[i] = inner_product( A.begin() + i * cols, A.begin() + (i+1) * cols, B.begin(), 0.0 );

      // Receive results back for the last r1 rows from the other processor
      MPI_Recv( result.data() + r0, r1, MPI_DOUBLE, 1, tag++, MPI_COMM_WORLD, &stat );

      // print results
      for ( int j = 0; j < rows; j++ ) cout << result[j] << '\n';
   }
   else                                               // Processor 1
   {
      // Receive data; need sizes first
      MPI_Recv( nums, 3, MPI_INT, 0, tag++, MPI_COMM_WORLD, &stat );
      r0 = nums[0];   r1 = nums[1];   cols = nums[2];
      vector<double> A(r1*cols), B(cols), result(r1);
      MPI_Recv( A.data(), r1 * cols, MPI_DOUBLE, 0, tag++, MPI_COMM_WORLD, &stat );
      MPI_Recv( B.data(), cols, MPI_DOUBLE, 0, tag++, MPI_COMM_WORLD, &stat );

      // Do multiplies for the last r1 rows
      for ( int i = 0; i < r1; i++ ) result[i] = inner_product( A.begin() + i * cols, A.begin() + (i+1) * cols, B.begin(), 0.0 );

      // Send back data
      MPI_Send( result.data(), r1, MPI_DOUBLE, 0, tag++, MPI_COMM_WORLD );
   }

   MPI_Finalize();
}


With Microsoft MPI (yes, really!) and g++:
Batch file to compile:
set OPT1="C:\Program Files (x86)\Microsoft SDKs\MPI\Include"
set OPT2="C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\msmpi.lib"
g++ -I%OPT1% -o mult.exe mult.cpp %OPT2%


"C:\Program Files\Microsoft MPI\bin"\mpiexec -n 2 mult.exe
550
1300
2050
2800
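
(If you're on Linux with MPICH or Open MPI instead - an assumption about your setup, not what was used above - the usual wrapper scripts do the same job:)

mpicxx -o mult mult.cpp
mpiexec -n 2 ./mult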
Many thanks for your helpful answer.
I just don't understand this part:
A.data() + r0 * cols, r1 * cols in the second MPI_Send. Could you please explain it in more detail?
I am trying to arrange it so that:
(a) root (processor 0) is the only processor that knows the whole matrix;
(b) processor 0 deals with the first r0 rows of the multiply and processor 1 deals with the last r1 rows.

If the array is flattened (i.e. written out sequentially) then the first element that root must send to processor 1 is the one at index r0*cols (remember that arrays count from 0). A.data() is a pointer to the start of the array, so A.data()+r0*cols points to that element. r1 rows of cols elements then means sending r1*cols elements in total. That is the second MPI_Send in the rank-0 branch above.

At the receiving end, processor 1 doesn't need to know about the whole array, so it can receive the data straight into the start of its own A buffer; i.e. the pointer A.data() in the corresponding MPI_Recv.
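
To put numbers on it (a worked illustration using the values from the program above, rows = 4 and cols = 5):

#include <iostream>
using namespace std;

int main()
{
   int rows = 4, cols = 5;
   int r0 = rows / 2;              // rows done on processor 0       -> 2
   int r1 = rows - r0;             // rows done on processor 1       -> 2
   int offset = r0 * cols;         // index of first element to send -> 10
   int count  = r1 * cols;         // number of elements to send     -> 10

   cout << "send starts at A.data() + " << offset
        << " and covers " << count << " doubles\n";
   // Processor 1 receives those 10 doubles straight into the start
   // of its own (r1 x cols) buffer, i.e. at A.data() on its side.
}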