Derefrencing Null pointer and buffer overrun warnings

Hello!

I have the following code that works properly most of the time but rarely outputs garbage. I suspect it has something to do with the memory allocation and pointer. Any help to address the issue would be appreciated. I do get the warnings related to "C6011 Derefrencing Null pointer" and "C6386 buffer overrun". How can I allocate memory for the cmplx type in a smarter way and address the Null pointer issue?



struct arrdata CPA(cmplx* arrC, int csize) {

int size = (csize % 2 == 1 ? (csize-1)/2 : (csize-2)/2);

/// Allocate memory
cmplx** matX = (cmplx**) malloc(sizeof(cmplx*) * size);
for (int i = 0; i < size; i++)
matX[i] = (cmplx*) malloc(sizeof(cmplx) * size);
zeroCmplxMatrix(matX, size, size);

cmplx* vecX = (cmplx*) malloc(sizeof(cmplx) * size);
zeroCmplxArray(vecX, size);
cmplx* arrB = (cmplx*) malloc(sizeof(cmplx) * (size+1));
zeroCmplxArray(arrB, size+1);
cmplx* arrA = (cmplx*) malloc(sizeof(cmplx) * (size+1));
zeroCmplxArray(arrA, size+1);

for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++)
matX[i][j] = arrC[i+j+1];

vecX[i] = -arrC[i+size+1];
}

struct lapackdata lupade = ctrlLapackPade(matX, size);
ctrlLapackSolve(lupade, vecX, size);

arrB[0] = 1;
for (int i = 1; i <= size; i++)
arrB[i] = vecX[size-i];

arrA[0] = arrB[0]*arrC[0];
arrA[1] = arrB[0]*arrC[1] + arrB[1]*arrC[0];

for (int i = 2; i <= size; i++)
for (int j = 0; j < i+1; j++)
arrA[i] += arrB[j]*arrC[i-j];

/// Free resources
free(lupade.indP);
free(lupade.matLU);
free(vecX);
for (int i = 0; i < size; i++)
free(matX[i]);
free(matX);


struct arrdata adata;
adata.arrA = arrA;
adata.arrB = arrB;
adata.length = size+1;

return adata;
}
Last edited on
Use code tags.
Provide complete compileable code.
Your snippet may leak memory - where do you free the memory for arrA and arrB?

You do realise that there is a <complex> type in c++?
And a <vector> container?
Last edited on
Is this C code?

When you use malloc() you need to check the returned value for NULL to check that memory has actually been allocated.
Am I not allocating memory for the <complex> type in c++? This is one struct among many so there is no way to send the whole compileable code but I have traced the issue to this struct. @lastchance: Can you please help me define and allocate these variables in a smarter way?

How can I check with malloc to see if memory is allocated? Yes this is primarily a C code but I have used some features of c++ such as the cmplx type.

Here is how I set the values to zero for the complex array and matrix.
void zeroCmplxArray(cmplx* arrA, int size) {
for (int i = 0; i < size; i++)
arrA[i] = cmplx(0,0);
}


void zeroCmplxMatrix(cmplx** matA, int rows, int cols) {
for (int i = 0; i < rows; i++)
for (int j = 0; j < cols; j++)
matA[i][j] = cmplx(0,0);
}
Last edited on
I have used some features of c++ such as the cmplx type.


There's no such type.

complex<double> maybe? Or have you aliased to that?


If you use c++ features like std::vector then you won't have to worry about allocating and deallocating. And it will probably throw an exception if it can't allocate.

Run it through a debugger and see what line it actually fails at, and what the values of variables or status of pointers are then.
The challenge is that it does not fail. It "executes" the code and just output some nonsensical result. The nonsense happens exactly inside that Struct. Before that everything is done correctly. That is why there are some memory allocation issue or things are being overwritten. This works 99% of time. Only under a very rare choice of csize(=15) that makes size=7 this happens. Does this number 7 ring any bell why things fall apart?

Can you please provide an example and be more specific here on how I can use std::vector instead of all malloc and freeing?
Last edited on
What are you actually trying to do with your code?

I see vague references to linear algebra libraries (lapack) and what I suspect involves permutation and LU decomposition. So what is the target?
That section of the code does the following:

Take a vector and an integer as input and then solves a linear system of the general form Ax=b. It constructs the matrix A and the rightside vector b based on the input vector arrC and solves for the x. Here input is arrC and x is arrB. Once arrB is obtained it constructs the second vector arrA. Those two vectors are the outcome of these computation. Lapack is just for LU factorization to solve the linear system of the general form Ax=b.


Here is some comments above respective lines:

/// Assign values
for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++)
matX[i][j] = arrC[i+j+1];

vecX[i] = -arrC[i+size+1];
}

/// Solve matrix equation
struct lapackdata lupade = ctrlLapackPade(matX, size);
ctrlLapackSolve(lupade, vecX, size);

/// Denominator - arrB
arrB[0] = 1;
for (int i = 1; i <= size; i++)
arrB[i] = vecX[size-i];

/// Numerator - arrA
/// a0 = b0*c0
arrA[0] = arrB[0]*arrC[0];
/// a1 = b0*c1 + b1*c0
arrA[1] = arrB[0]*arrC[1] + arrB[1]*arrC[0];

for (int i = 2; i <= size; i++)
for (int j = 0; j < i+1; j++)
arrA[i] += arrB[j]*arrC[i-j];
Last edited on
Your post is very unclear.

It looks like you are inputting a single array arrC, from which you are constructing the matrix "A" (your array matX[][]) and the RHS "b" (your array vecX). They aren't "general". You also create the [i] component of vecX multiple times.

Your matrix is symmetric, so you might be better using Cholesky factorisation, not LU factorisation.

I have no idea how you are inverting your lower-triangular and upper-triangular system. This looks very odd. It should be a trivial back-substitution.

You are also freeing resources in lupade that you didn't allocate in the first place.
Last edited on
You are raising a number of good questions that might be relevant to the issue or not. First the algorithm is a classical way of constructing rational functions from power series. So it is absolutely correct and I have used this code for a long time. This is a rare issue that I just discovered after testing this code for many months.

Your point on Cholesky versus LU valid but LU works and I can visit that later. I doubt the issue is due to LU but I cannot rule it out completely.

I suggest we start from memory allocation. How do you suggest we allocate memory to the complex vectors arrA,arrB, vecX and the complex matrix matX in a robust fashion?

Let's fix that and then we move forward to address any remaining "bad coding" if the issue persists?

BTW, vecX[i] is defined once for each i inside the for loop. So not clear if I understood your point.
for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++)
matX[i][j] = arrC[i+j+1];

vecX[i] = -arrC[i+size+1];
}

Last edited on
@sinasb,
Put your code in code tags (first item in the format menu); then it will be readable and we will be able to see from indentation where loops start and end.

There is no built-in type cmplx in C or C++; how have you defined this?

Are you calculating Pade approximants? Maybe say so if that is the case, as people will then have an idea what you are attempting to calculate.

For every malloc there should be a free. That is not the case in your code, so you will have to explain where the memory assigned is freed. For example, arrA and arrB. Of course, they may be freed in code that you have not shown us. Similarly, you didn't allocate lupade.indP or lupade.matLU in this snippet, but you do try to free them. If you leak memory then eventually you will run out of memory and allocations won't happen.

Use a debugger (like gdb). It will tell you exactly where the code failed and the values of variables/status of pointers at those points.

There exist tools like valgrind that will tell you if you are leaking memory.

To be honest, I would rewrite your code in c++ and use std::vector and std::complex<double>. There are also other free matrix libraries which will provide you with linear solvers.

It is very difficult to debug code that we cannot compile and run for ourselves.
Last edited on
Here you go, @sinasb. Help yourself. You can use it for either double or complex<double>. There are better solvers (e.g. PLU), but for small N Gaussian elimination is fine.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
#include <iostream>
#include <iomanip>
#include <vector>
#include <complex>
#include <algorithm>
#include <cmath>
#include <cassert>
#include <cstdlib>
using namespace std;

const double SMALL = 1.0e-30;

// Prototype functions
template<typename T> bool solve( vector< vector<T> > A, vector<T> B, vector<T> &X );
template<typename T> T polynomial( const vector<T> &P, T z );


//======================================================================


// Pade approximant [N][N] class
template<typename T>
class Pade
{
public:
   int N;
   vector<T> A, B;           // polynomials in numerator and denominator, both of degree N

   Pade( const vector<T> &C );
   T evaluate( T z ) { return polynomial<T>( A, z ) / polynomial<T>( B, z ); }
};


template <typename T>
Pade<T>::Pade( const vector<T> &C )
{
   assert( C.size() % 2 );     // expect C to have degree 2N; i.e. 2N+1 coefficients
   N = C.size() / 2;
   A = vector<T>(1+N);
   B = vector<T>(1+N);

   vector< vector<T> > M( N, vector<T>( N ) );
   vector<T> RHS( N );
   for ( int i = 0; i < N; i++ )
   {
      for ( int j = 0; j < N; j++ ) M[i][j] = C[N+i-j];
      RHS[i] = -C[N+i+1];
   }
   vector<T> X( N );
   if ( !solve( M, RHS, X ) )
   {
      cerr << "Sorry. Stuffed.\n";
      exit( 1 );
   }
   copy( X.begin(), X.end(), B.begin() + 1 );

   B[0] = 1.0;
   for ( int i = 0; i <= N; i++ )
   {
      for ( int j = 0; j <= i; j++ ) A[i] += B[j] * C[i-j];
   }
}


//======================================================================


int main()
{
   const int N = 7;
// using TYPE = complex<double>;
   using TYPE = double;

   vector<TYPE> C(2*N+1);

   // Let's try exp(); I can manage that
   C[0] = 1;
   for ( int j = 1; j <= 2 * N; j++ ) C[j] = C[j-1] / ( j + 0.0 );

   Pade<TYPE> pade( C );

   cout << "Table of approximants to exp(z) for Pade approximant [N][N] with N = " << pade.N << "\n\n";
   // Compare
   for ( int i = 0; i <= 10; i++ )
   {
      TYPE z = i * 0.5;
      cout << setw( 20 ) << z << setw( 20 ) << exp( z ) << setw( 20 ) << pade.evaluate( z ) << '\n';
   }
}


//======================================================================


template<typename T> bool solve( vector< vector<T> > A, vector<T> B, vector<T> &X )
//--------------------------------------
// Solve AX = B by Gaussian elimination
//--------------------------------------
{
   int n = A.size();

   // Row operations for i = 0, ,,,, n - 2 (n-1 not needed)
   for ( int i = 0; i < n - 1; i++ )
   {
      // Pivot: find row r below with largest element in column i
      int r = i;
      double maxA = abs( A[i][i] );
      for ( int k = i + 1; k < n; k++ )
      {
         double val = abs( A[k][i] );
         if ( val > maxA )
         {
            r = k;
            maxA = val;
         }
      }
      if ( r != i )
      {
         for ( int j = i; j < n; j++ ) swap( A[i][j], A[r][j] );
         swap( B[i], B[r] );
      }

      // Row operations to make upper-triangular
      T pivot = A[i][i];
      if ( abs( pivot ) < SMALL ) return false;            // Singular matrix

      for ( int r = i + 1; r < n; r++ )                    // On lower rows
      {
         T multiple = A[r][i] / pivot;                     // Multiple of row i to clear element in ith column
         for ( int j = i; j < n; j++ ) A[r][j] -= multiple * A[i][j];
         B[r] -= multiple * B[i];
      }
   }
   if ( abs( A[n-1][n-1] ) < SMALL ) return false;         // Singular matrix

   // Back-substitute
   X = B;
   for ( int i = n - 1; i >= 0; i-- )
   {
      for ( int j = i + 1; j < n; j++ )  X[i] -= A[i][j] * X[j];
      X[i] /= A[i][i];
   }

   return true;
}

//======================================================================

template<typename T> T polynomial( const vector<T> &P, T z )
{
   T result{};
   for ( int i = P.size() - 1; i >= 0; i-- ) result = P[i] + result * z;
   return result;
}

//====================================================================== 


Table of approximants to exp(z) for Pade approximant [N][N] with N = 7

                   0                   1                   1
                 0.5             1.64872             1.64872
                   1             2.71828             2.71828
                 1.5             4.48169             4.48169
                   2             7.38906             7.38906
                 2.5             12.1825             12.1825
                   3             20.0855             20.0855
                 3.5             33.1155             33.1155
                   4             54.5982             54.5982
                 4.5             90.0171             90.0173
                   5             148.413             148.415

Last edited on
@lastchance: Thanks for sharing the code. I just realized that you have a strong numerical analysis background. Have you done calculations with large N in higher precision using c++? When N grows precision has to increase inevitably at some point or the approximation would be inaccurate. I need some time to go through your code and do some comparison but I am glad that I have met someone in this forum who knows Pade :)

And sorry for the confusion regarding 'cmplx'. Yes we are using the same complex type. Here is what I had done. Apologies!

1
2
3
#include <complex>
using namespace std;
typedef complex<double> cmplx;
Have you done calculations with large N in higher precision using c++?

If N is large then it may be worth going for a better matrix solver.

The version below uses LU factorisation, with row permutation for stability.

I've set the degree N to 5 so that you can compare the numerator and denominator polynomial coefficients for exp(z) with those given in Wikipedia: https://en.wikipedia.org/wiki/Padé_approximant
They compare OK.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
#include <iostream>
#include <iomanip>
#include <vector>
#include <complex>
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <cassert>
using namespace std;

const double SMALL = 1.0e-30;

// Prototype functions
template<typename T> bool solveLU( vector< vector<T> > A, vector<T> B, vector<T> &X );
template<typename T> T polynomial( const vector<T> &P, T z );


//======================================================================


// Pade approximant [N][N] class
template<typename T>
class Pade
{
public:
   int N;
   vector<T> A, B;           // polynomials in numerator and denominator, both of degree N

   Pade( const vector<T> &C );
   T evaluate( T z ) { return polynomial<T>( A, z ) / polynomial<T>( B, z ); }
};


template <typename T>
Pade<T>::Pade( const vector<T> &C )
{
   assert( C.size() % 2 );     // expect C to have degree 2N; i.e. 2N+1 coefficients
   N = C.size() / 2;
   A = vector<T>(1+N);
   B = vector<T>(1+N);

   vector< vector<T> > M( N, vector<T>( N ) );
   vector<T> RHS( N );
   for ( int i = 0; i < N; i++ )
   {
      for ( int j = 0; j < N; j++ ) M[i][j] = C[N+i-j];
      RHS[i] = -C[N+i+1];
   }
   vector<T> X( N );
   if ( !solveLU( M, RHS, X ) )
   {
      cerr << "Sorry. Stuffed.\n";
      exit( 1 );
   }
   copy( X.begin(), X.end(), B.begin() + 1 );

   B[0] = 1.0;
   for ( int i = 0; i <= N; i++ )
   {
      for ( int j = 0; j <= i; j++ ) A[i] += B[j] * C[i-j];
   }
}


//======================================================================


int main()
{
   const int N = 5;
// using TYPE = complex<double>;
   using TYPE = double;

   vector<TYPE> C(2*N+1);

   // Let's try exp(); I can manage that
   C[0] = 1;
   for ( int j = 1; j <= 2 * N; j++ ) C[j] = C[j-1] / ( j + 0.0 );

   Pade<TYPE> pade( C );

   cout << "Table of approximants to exp(z) for Pade approximant [N][N] with N = " << pade.N << '\n';
   cout << "Polynomial coefficients in numerator  : ";   for ( auto e : pade.A ) cout << e << "  ";   cout << '\n';
   cout << "Polynomial coefficients in denominator: ";   for ( auto e : pade.B ) cout << e << "  ";   cout << "\n\n";
   // Compare
   for ( int i = 0; i <= 10; i++ )
   {
      TYPE z = i * 0.5;
      cout << setw( 20 ) << z << setw( 20 ) << exp( z ) << setw( 20 ) << pade.evaluate( z ) << '\n';
   }
}


//======================================================================


template<typename T> bool solveLU( vector< vector<T> > A, vector<T> B, vector<T> &X )
// Solve Ax = B by (permuted) LU factorisation
{
   int n = A.size();

   vector< vector<T> > L( n, vector<T>( n ) );
   vector< vector<T> > U( n, vector<T>( n ) );

   for ( int i = 0; i < n; i++ )
   {
      // Pivot: find row r below with largest element in column i
      int r = i;
      double maxA = abs( A[i][i] );
      for ( int k = i + 1; k < n; k++ )
      {
         double val = abs( A[k][i] );
         if ( val > maxA )
         {
            r = k;
            maxA = val;
         }
      }

      if ( r != i )
      {
         for ( int j = 0; j < n; j++ )
         {
            swap( A[i][j], A[r][j] );
            if ( j < i ) swap( L[i][j], L[r][j] );
         }
         swap( B[i], B[r] );
      }

      // ith row -> U
      for ( int j = i; j < n; j++ )
      {
         U[i][j] = A[i][j];
         for ( int k = 0; k < i; k++ ) U[i][j] -= L[i][k] * U[k][j];
      }

      if ( abs( U[i][i] ) < SMALL ) return false;          // can't factorise

      // ith column -> L
      for ( int j = i; j < n; j++ )
      {
         L[j][i] = A[j][i];
         for ( int k = 0; k < i; k++ ) L[j][i] -= L[j][k] * U[k][i];
         L[j][i] /= U[i][i];
      }
   }

   // Solve lower-triangular system LY = B, where Y = UX
   vector<T> Y = B;
   for ( int i = 0; i < n; i++ )
   {
      if ( abs( L[i][i] ) < SMALL ) return false;          // L is singular
      for ( int j = 0; j < i; j++ ) Y[i] -= L[i][j] * Y[j];
      Y[i] /= L[i][i];
   }

   // Solve upper-triangular system  UX = Y
   X = Y;
   for ( int i = n - 1; i >= 0; i-- )
   {
      if ( abs( U[i][i] ) < SMALL ) return false;          // U is singular
      for ( int j = i + 1; j < n; j++ ) X[i] -= U[i][j] * X[j];
      X[i] /= U[i][i];
   }

   return true;
}


//======================================================================


template<typename T> T polynomial( const vector<T> &P, T z )
{
   T result{};
   for ( int i = P.size() - 1; i >= 0; i-- ) result = P[i] + result * z;
   return result;
}


//====================================================================== 


Table of approximants to exp(z) for Pade approximant [N][N] with N = 5
Polynomial coefficients in numerator  : 1  0.5  0.111111  0.0138889  0.000992063  3.30688e-05  
Polynomial coefficients in denominator: 1  -0.5  0.111111  -0.0138889  0.000992063  -3.30688e-05  

                   0                   1                   1
                 0.5             1.64872             1.64872
                   1             2.71828             2.71828
                 1.5             4.48169             4.48169
                   2             7.38906             7.38906
                 2.5             12.1825             12.1825
                   3             20.0855              20.086
                 3.5             33.1155             33.1197
                   4             54.5982             54.6311
                 4.5             90.0171             90.2356
                   5             148.413             149.697
I have not done any software-precision implementation but I have used open-source packages that use GNU Multiple Precision Arithmetic Library. It slows down the run-time significantly. However if you are just looking at Pade computations those packages, such as PARI/GP can do fast and precise computations for even [100/100]. What slows down the process is the power series computation. In the example you sent me, there is an explicit function to approximate. So obtaining the Taylor series is fast even in software precision. In many interesting applications, there are implicit functions that require successive linearization for a vector function to develop the power series and that is in itself an enormous undertaking, i.e. developing the software-precision implementation of the power series for use in Pade approximant.
The problem with memory corruption is that the error is usually not where the symptom appears. Your symptom is in CPA(), but the error itself could have happened thousands of statements before CPA was ever called.
Topic archived. No new replies allowed.