I decided to publish a tool that I have been using rather successfully to automate the implementation of compute-intensive code typical of HPC. Such code normally comes in a number of variations: the same formula must be translated efficiently for the 1D, 2D and 3D cases, with different approximations, with different base types, for CUDA and CPU, and so on. This is often done with C++ templates used as an "extension" of C ("C with templates"). But it appears that not all of the typical requirements can be expressed easily with C++ templates. Generating C code makes it possible to write and debug less and still get all those variations from a single source.
The most comprehensive sample so far is test7.cpp, which implements a fairly typical boundary update problem for a regular grid in several variations. The basic outline of the task solved by test7.cpp is described in the project's README file.
I hope that the way this task is solved in test7.cpp is better than any attempt to do the same with C++ templates. But the question is: maybe I am wrong? How would YOU solve this problem in modern C++ without code generation?
If you need the same source to produce both native and CUDA code, then code generation is the way to go.
However, in my opinion it would be better if you just design a language and write a compiler for it, rather than requiring your users to write things like
function_("void", "f", "bool c, int &a, int &b")(
if_("c")(
"a = b;\n"
) << else_()(
"b = a;\n"
)
);
Mixing code and metacode makes for a confusing mess.
> However, in my opinion it would be better if you just design a language and write a compiler for it
Thanks for the feedback! Yes, I thought about creating some sort of DSL, but for now I have settled on this simple solution for several reasons:
1) Designing a good language for these purposes means "reinventing Fortran". It is too complex a task, at least for me. I'm much more into HPC than into language design.
2) Generating code in the same language as the generator itself, or a compatible one, promises two benefits:
a) One can use the same data structures both to perform the actual calculation and for code generation. If the code is generated in a separate language, some domain definitions have to be replicated in that language.
b) It opens the way to a kind of run-time reflection, in which the generator produces specifically tuned code, compiles it, then dynamically loads and runs it, without restarting or relinking.
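To make point a) more concrete, here is a minimal sketch of one data structure serving both purposes. All names in it (Term, Stencil, apply, emit_c) are hypothetical and invented for illustration; this is not how the published tool is organized, only one way the idea could look.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical 1D stencil description (illustrative names only).
struct Term {
    int offset;     // neighbor offset relative to index i
    double weight;  // coefficient applied to that neighbor
};
using Stencil = std::vector<Term>;

// Use 1: evaluate the stencil directly at index i.
double apply(const Stencil& s, const std::vector<double>& u, int i) {
    double sum = 0.0;
    for (const Term& t : s) {
        const int j = i + t.offset;
        assert(j >= 0 && j < static_cast<int>(u.size()));  // stay inside the array
        sum += t.weight * u[j];
    }
    return sum;
}

// Use 2: emit C source for a function computing the same expression.
std::string emit_c(const Stencil& s, const std::string& name) {
    std::ostringstream out;
    out.precision(17);  // enough digits to round-trip a double
    out << "double " << name << "(const double *u, int i) {\n    return 0.0";
    for (const Term& t : s)
        out << "\n        + (" << t.weight << ") * u[i + (" << t.offset << ")]";
    out << ";\n}\n";
    return out.str();
}
```

With a stencil such as {{-1, 1.0}, {0, -2.0}, {1, 1.0}}, apply computes the value in the host program, and emit_c returns C text for the same formula, so the coefficients are defined in exactly one place.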
But the main question for me is whether we can reach the same goal as in test7.cpp with C++ templates only (putting CUDA aside). I'm asking because I'm no expert in template metaprogramming, especially in its C++17 state.
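As a starting point for that comparison, here is a rough template-only sketch of two of the variation axes mentioned above: dimension and base type as template parameters, and the approximation order selected with C++17 `if constexpr`. Grid, fill_boundary and first_derivative are hypothetical names made up for this post; they are not taken from test7.cpp, and the sketch does not cover everything that sample does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical regular grid; Dim and T are the variation axes.
template <typename T, std::size_t Dim>
struct Grid {
    std::array<std::size_t, Dim> extent;  // points along each axis
    std::vector<T> data;                  // row-major storage

    explicit Grid(const std::array<std::size_t, Dim>& e)
        : extent(e), data(cell_count(e), T{}) {}

    static std::size_t cell_count(const std::array<std::size_t, Dim>& e) {
        std::size_t n = 1;
        for (std::size_t x : e) n *= x;
        return n;
    }
};

// Set every boundary point to `value`; one source for 1D, 2D, 3D and any T.
template <typename T, std::size_t Dim>
void fill_boundary(Grid<T, Dim>& g, T value) {
    std::array<std::size_t, Dim> idx{};
    for (std::size_t flat = 0; flat < g.data.size(); ++flat) {
        // Decode the flat row-major index into per-axis coordinates.
        std::size_t rest = flat;
        for (std::size_t d = Dim; d-- > 0;) {
            idx[d] = rest % g.extent[d];
            rest /= g.extent[d];
        }
        bool on_boundary = false;
        for (std::size_t d = 0; d < Dim; ++d)
            on_boundary = on_boundary || idx[d] == 0 || idx[d] + 1 == g.extent[d];
        if (on_boundary) g.data[flat] = value;
    }
}

// Approximation order chosen at compile time; u points at the center sample.
template <int Order, typename T>
T first_derivative(const T* u, T h) {
    assert(h != T(0));
    if constexpr (Order == 2) {
        return (u[1] - u[-1]) / (2 * h);  // central difference, 2nd order
    } else {
        static_assert(Order == 4, "only orders 2 and 4 are sketched");
        return (-u[2] + 8 * u[1] - 8 * u[-1] + u[-2]) / (12 * h);  // 4th order
    }
}
```

This handles the dimension, type and order axes in plain C++17, but the index decoding runs at run time for every point, and the CPU/CUDA split is left out entirely, so it is only a partial answer to the question.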