You are not logged in.

#1 2010-06-04 02:35:49

ogronom
Member
From: Toronto, Canada
Registered: 2008-05-06
Posts: 123

gprof: time difference

Hi guys,

One of my test case takes approximately 10 times more time than it should be. I tried to use gprof to find where is the bottleneck, and get the following thing.

time ./test_rintegral.out
... <skipped output>...
./test_rintegral.out  29.79s user 0.08s system 99% cpu 30.058 total

Around 30 seconds.

While the output of profile

Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 20.53      0.42     0.42  6812301     0.00     0.00  grid_lin_log_table_eval_function_derivative
 15.46      0.74     0.32  6808185     0.00     0.00  power_free_bound_eval
 14.01      1.03     0.29  6810796     0.00     0.00  d_free_coulomb_eval_all
 11.59      1.27     0.24  6810796     0.00     0.00  general_dirac_free_eval
 10.63      1.50     0.22  6810796     0.00     0.00  general_dirac_bound_eval
 10.39      1.71     0.21  6810796     0.00     0.00  d_free_coulomb_tail_eval
  7.25      1.86     0.15  6791217     0.00     0.00  bound_artificial_eval
  3.38      1.93     0.07  6809490     0.00     0.00  integrant_change_variables_eval
  3.38      2.00     0.07     1306     0.00     0.00  quad_romberg_2
  0.97      2.02     0.02        1     0.02     0.02  general_dirac_bound_init_bound_artificial
  0.48      2.03     0.01     1305     0.00     0.00  power_free_bound_eval_const
  0.48      2.04     0.01        2     0.01     0.01  general_function_init_integrant_change_variables
  0.48      2.05     0.01        1     0.01     0.02  rintegral_power_fb_const
  0.48      2.06     0.01                             d_free_coulomb_eval
  0.24      2.06     0.01        1     0.01     0.01  general_grid_init_grid_lin_log_table
  0.24      2.07     0.01                             d_free_coulomb_tail_conjugate_eval_asymptotic
  0.00      2.07     0.00    19579     0.00     0.00  d_power_mexp_eval
  0.00      2.07     0.00     2811     0.00     0.00  grid_lin_log_table_eval
  0.00      2.07     0.00     1502     0.00     0.00  grid_lin_log_eval_function_derivative
  0.00      2.07     0.00     1502     0.00     0.00  solve_linear_log_eq
  0.00      2.07     0.00     1500     0.00     0.00  analityc_inner_fb_eval_0_0_0
  0.00      2.07     0.00     1500     0.00     0.00  general_inner_eval
  0.00      2.07     0.00     1500     0.00     0.00  test_rinner_compare_given
  0.00      2.07     0.00     1306     0.00     0.00  rintegral_power_fb_origin_without_power
  0.00      2.07     0.00     1303     0.00     0.00  bisect_find_index
  0.00      2.07     0.00     1303     0.00     0.00  cubic_spline_eval
  0.00      2.07     0.00        4     0.00     0.00  grid_lin_log_eval_inverse
  0.00      2.07     0.00        2     0.00     0.00  power_series_eval
  0.00      2.07     0.00        2     0.00     0.00  rintegral_power_fb_asymptotic_without_power
  0.00      2.07     0.00        1     0.00     0.00  bound_artificial_init
  0.00      2.07     0.00        1     0.00     0.00  cubic_spline_calc
  0.00      2.07     0.00        1     0.00     0.00  cubic_spline_ialloc
  0.00      2.07     0.00        1     0.00     0.00  cubic_spline_ifree
  0.00      2.07     0.00        1     0.00     0.00  cubic_spline_init
  0.00      2.07     0.00        1     0.00     0.00  d_free_coulomb_init_k
  0.00      2.07     0.00        1     0.00     0.00  d_free_coulomb_tail_init_finish
  0.00      2.07     0.00        1     0.00     0.00  general_dirac_free_init_d_free_coulomb
...<skipped the rest. The thing is that cumulative time stuck on 2.07>...

Therefore I get that, gprof thinks that cumulative time of execution is 2 seconds.

What do I miss?

Offline

#2 2010-06-04 16:38:55

stqn
Member
Registered: 2010-03-19
Posts: 1,191
Website

Re: gprof: time difference

My guess is that the 6 M calls * 8 functions = 48 M function calls take some time smile.

Maybe try some inlining or -fomit-frame-pointer or eliminating some calls...

Offline

#3 2010-06-04 17:25:29

ogronom
Member
From: Toronto, Canada
Registered: 2008-05-06
Posts: 123

Re: gprof: time difference

Thats obvious dealing with this right now.

The question is why gprof shows that cumulative time is only 2s, while it takes 30s to execute program.

Last edited by ogronom (2010-06-04 17:25:49)

Offline

#4 2010-06-04 18:13:44

stqn
Member
Registered: 2010-03-19
Posts: 1,191
Website

Re: gprof: time difference

Maybe the flat profile doesn't take calling time into account? Have you tried --graph?
(I've never used gprof.)

Edit: I don't think what I wrote makes any sense. I'll stop here smile.

Last edited by stqn (2010-06-04 18:37:06)

Offline

#5 2010-06-04 18:30:26

ogronom
Member
From: Toronto, Canada
Registered: 2008-05-06
Posts: 123

Re: gprof: time difference

By default gprof generates both of them. (Values of time are the same). I just posted flat one to get the idea.

Offline

#6 2010-06-04 21:14:39

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 859
Website

Re: gprof: time difference

gprof doesn't provide "inclusive" timing, to my knowledge, which means that time spent in system calls is largely unaccounted for.

I usually use "valgrind --tool=callgrind --separate-recs=10 --simulate-cache=yes --dump-instr=yes --collect-jumps=yes" for profiling, and kcachegrind to view the data.  A major benefit is that it doesn't require the code to be compiled with any particular set of flags, although "-g" is helpful.

Offline

#7 2010-06-05 12:58:05

ogronom
Member
From: Toronto, Canada
Registered: 2008-05-06
Posts: 123

Re: gprof: time difference

Thanks

Offline

Board footer

Powered by FluxBB