Profiling is possible without altering any C++ implementation and without writing a C++ testbed. Using Google's gperftools combined with Cython, you can profile C++ code by writing a Python script that runs the MOOSE functions in question.
First, the cython, gperftools and libc6-prof packages have to be installed. Second, a Cython wrapper has to be written for three gperftools functions. After that, moose has to be recompiled with the 'profile' option. Finally, the wrapper can be imported into an arbitrary Python script, which makes the gperftools functions available there.
Cython:
~$ sudo apt-get install cython
gperftools: download it from here, then install it.
libc6-prof:
~$ sudo apt-get install libc6-prof
kcachegrind (optional, for interpreting profiler output):
~$ sudo apt-get install kcachegrind
The simplest way to build the wrapper is to write a Cython script that wraps the gperftools functions, plus a Python script that compiles the wrapped functions and links them against the gperftools library.
Let's call the cython script gperftools_wrapped.pyx:
cdef extern from "gperftools/profiler.h":
    int ProfilerStart(char* fname)
    void ProfilerStop()
    void ProfilerFlush()

def ProfStart(fname):
    return ProfilerStart(fname)

def ProfStop():
    ProfilerStop()

def ProfFlush():
    ProfilerFlush()
Here we define a python function for each function of gperftools that we wrap. More functions can be wrapped for more custom profiling (see ProfilerStartWithOptions()).
The python compiler script may look something like this (setup.py):
from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = 'gperftools_wrapped',
    ext_modules = cythonize("gperftools_wrapped.pyx"),
)
Now setup.py can be run in the following manner, adding the -lprofiler flag:
~$ python setup.py build_ext --inplace -lprofiler
If everything went right, you should now have gperftools_wrapped.c, gperftools_wrapped.so, and a build directory as the result of the compilation.
Put gperftools_wrapped.so next to your Python testbed and import it as gperftools_wrapped; you can then profile Python C extensions. But (!) the C extensions themselves must first be compiled with the -lprofiler flag.
To profile moose, recompile it after changing the BUILD setting in the Makefile:
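One way to make sure the profiler is always stopped, even when the profiled code raises an exception, is to wrap the start and stop calls in a context manager. The following is only a sketch: `profiled` is a hypothetical helper name, not part of gperftools, and it receives the start, flush and stop callables as arguments so that it can be used with the ProfStart, ProfFlush and ProfStop functions of the wrapper without importing them itself.

```python
from contextlib import contextmanager


@contextmanager
def profiled(fname, start, flush, stop):
    """Run the enclosed block between start(fname) and flush()/stop().

    `profiled` is a hypothetical helper, not part of gperftools.
    Pass in ProfStart, ProfFlush and ProfStop from gperftools_wrapped.
    """
    start(fname)
    try:
        yield
    finally:
        # Always flush and stop, even if the profiled code raised.
        flush()
        stop()


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without the compiled wrapper.
    calls = []
    with profiled("demo.prof",
                  start=lambda f: calls.append(("start", f)),
                  flush=lambda: calls.append(("flush",)),
                  stop=lambda: calls.append(("stop",))):
        sum(range(1000))  # the code to be profiled goes here
    print(calls)
```

With the compiled wrapper available, the same helper would be used as `with profiled("hsolve.prof", ProfStart, ProfFlush, ProfStop): ...`.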
BUILD=profile
Essentially, you need to add the -lprofiler flag. If the flags corresponding to the "profile" BUILD option do not include -lprofiler (which is probably the case), add it yourself.
Flags to use for example:
CXXFLAGS = -pg -lprofiler -fpermissive -fno-strict-aliasing -fPIC -Wall -Wno-long-long -pedantic -DUSE_GENESIS_PARSER
It may be enough to add the -lprofiler flag only to the Makefile that compiles the C++ code you are interested in profiling (not tested). Then recompile moose.
Before profiling, always set PYTHONPATH to the directory from which Python picks up the moose module. This is needed to get function names in the profile, and it should be done explicitly even if PYTHONPATH is already set elsewhere, e.g. in your .bashrc script. Example:
export PYTHONPATH=/path_to_moose/python/
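To confirm which copy of a module Python would actually pick up after setting PYTHONPATH, a small check like the one below can help. It is a sketch that assumes Python 3.4 or newer (for `importlib.util.find_spec`); `module_origin` is a hypothetical helper name introduced here for illustration.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # "moose" is only found if PYTHONPATH (or the installation)
    # makes it importable; otherwise this prints None.
    print(module_origin("moose"))
```

If the printed path does not lie under the directory you exported, Python is importing a different moose build than the one you recompiled for profiling.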
To test profiling let's use an existing demo to check the runtime of HSolve functions.
From the moose directory alter the script at Demos/traub_2005/py/test_hsolve_tcr.py. First import the wrapper we just made.
from gperftools_wrapped import *
Then edit the testHSolve function, adding the wrapper functions:
def testHSolve(self):
    ProfStart("hsolve.prof")
    self.schedule(simdt, plotdt, 'hsolve')
    self.runsim(simtime, pulsearray=pulsearray)
    self.savedata()
    ProfFlush()
    ProfStop()

def testEE(self):
    pass
    #self.schedule(simdt, plotdt, 'ee')
    #self.runsim(simtime, pulsearray=pulsearray)
    #self.savedata()
You can also comment out the body of the testEE() function, as shown above, so that the script runs faster.
After running the Python script you should have a file named hsolve.prof. As you can see, the string passed to ProfStart() determines the name of the profiler's output file.
You can interpret the output using pprof alone or, if you installed it, with kcachegrind. Note that for the 'program' parameter of pprof you should provide the _moose.so file inside /path_to_moose/python/moose/.
pprof text method:
~$ pprof --text /path_to_moose/python/moose/_moose.so hsolve.prof > log
~$ less log
kcachegrind method:
~$ pprof --callgrind /path_to_moose/python/moose/_moose.so hsolve.prof > output.callgrind
~$ kcachegrind output.callgrind
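The two pprof invocations above can also be driven from Python, for example at the end of a profiling script. The sketch below assumes the tool is on the PATH under the name `pprof` (some distributions may install it under a different name); `pprof_command` and `write_report` are hypothetical helper names, and only the `--text` and `--callgrind` modes used in this document are covered.

```python
import subprocess


def pprof_command(output_format, binary, profile):
    """Build the pprof argument list for 'text' or 'callgrind' output."""
    if output_format not in ("text", "callgrind"):
        raise ValueError("unsupported format: %r" % (output_format,))
    return ["pprof", "--" + output_format, binary, profile]


def write_report(output_format, binary, profile, out_path):
    """Run pprof and redirect its standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.check_call(
            pprof_command(output_format, binary, profile), stdout=out
        )


if __name__ == "__main__":
    # Paths follow the placeholders used in this document.
    print(pprof_command("callgrind",
                        "/path_to_moose/python/moose/_moose.so",
                        "hsolve.prof"))
```

Calling `write_report("callgrind", "/path_to_moose/python/moose/_moose.so", "hsolve.prof", "output.callgrind")` would then produce the file that kcachegrind opens.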