FreeBSD perf stat equivalent

Is there a FreeBSD equivalent to the Linux command `perf stat -r 10 <your app and arguments>`, which runs a command 10 times and gives timing data?
 
You're probably looking for this:
[Attached image: YT0f69i.png]
 
Thank you, though I am not the OP :p. Didn't mean to hijack the thread, sorry :(. Still, I think this answers the original question.
Indeed. I didn't look closely and assumed by your verbiage you were the original poster. I'll leave that to eatonphil to clarify if everything that was needed was answered.
 
And is there a sampling profiler available (like perf record or Oracle collect)?

I know about callgrind and cachegrind :), which are a bit different.
 
I would not call "perf stat" a timing benchmark tool.

It will generate output like

Code:
          0.546392      task-clock:u (msec)       #    0.443 CPUs utilized            ( +-  3.87% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
               159      page-faults:u             #    0.291 M/sec
            327450      cycles:u                  #    0.599 GHz                      ( +-  7.86% )  (77.75%)
           1653670      stalled-cycles-frontend:u #  505.01% frontend cycles idle     ( +-  4.26% )
           1585953      stalled-cycles-backend:u  #  484.33% backend cycles idle      ( +-  4.29% )
            187607      instructions:u            #    0.57  insn per cycle
                                                  #    8.81  stalled cycles per insn  ( +-  0.00% )
             41232      branches:u                #   75.463 M/sec                    ( +-  0.00% )
              2965      branch-misses:u           #    7.19% of all branches          ( +- 37.10% )  (22.25%)

       0.001233042 seconds time elapsed                                          ( +- 11.59% )
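
The right-hand column of that output is derived from the raw counts on the left. As a rough sketch (the variable names are mine, and I am assuming the "stalled cycles per insn" figure is based on the frontend count, which is what matches the numbers here), the arithmetic can be reproduced like this:

```python
# Raw values copied from the perf stat output above.
task_clock_ms = 0.546392      # task-clock:u (msec)
elapsed_s = 0.001233042       # seconds time elapsed
cycles = 327450
stalled_frontend = 1653670
instructions = 187607
branches = 41232
branch_misses = 2965

task_clock_s = task_clock_ms / 1000.0

# CPU time divided by wall-clock time.
cpus_utilized = task_clock_s / elapsed_s
# Cycles per second of CPU time, expressed in GHz.
ghz = cycles / task_clock_s / 1e9
# Instructions retired per cycle.
ipc = instructions / cycles
# Assumption: perf uses the frontend stall count here; it matches the output.
stalled_per_insn = stalled_frontend / instructions
# Share of branches that were mispredicted.
branch_miss_pct = 100.0 * branch_misses / branches

print(f"{cpus_utilized:.3f} CPUs utilized")
print(f"{ghz:.3f} GHz")
print(f"{ipc:.2f} insn per cycle")
print(f"{stalled_per_insn:.2f} stalled cycles per insn")
print(f"{branch_miss_pct:.2f}% of all branches")
```

None of these derived figures is a timing measurement as such, apart from the elapsed time on the last line.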


Tools like this* basically do one or both of two things
  1. Use Performance Monitoring Counters (PMCs). These are hardware counters that count events such as cache misses and branch misses. The counts can either be aggregated into statistics or recorded and later viewed as a flamegraph or call tree.
  2. Use sampling. In this mode, the tool runs an extra thread with a timer (say 99 Hz) that takes a snapshot of the call stack each time the timer fires. Again, with postprocessing this can generate flamegraphs and call trees.
The OP asked about `perf stat`, which corresponds to point 1 in aggregation mode.
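
To make point 2 a bit more concrete, here is a toy sampling profiler in Python. It is only a user-space illustration of the idea (a timer thread snapshotting the call stack), not how perf or pmcstat are implemented. The names `sample_stacks`, `profile` and `busy_loop` are made up for the example.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(target_thread_id, counts, stop_event, hz=99):
    """Sampler thread: snapshot the target thread's call stack about hz times per second."""
    interval = 1.0 / hz
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            # Function names from outermost caller to innermost callee.
            stack = tuple(f.name for f in traceback.extract_stack(frame))
            counts[stack] += 1
        time.sleep(interval)


def profile(func, *args, hz=99):
    """Run func(*args) in the calling thread while a sampler thread records its stacks."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), counts, stop_event, hz),
        daemon=True,
    )
    sampler.start()
    try:
        result = func(*args)
    finally:
        stop_event.set()
        sampler.join()
    return result, counts


def busy_loop(duration_s):
    """Hypothetical workload: spin for the given number of seconds."""
    deadline = time.perf_counter() + duration_s
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


if __name__ == "__main__":
    _, counts = profile(busy_loop, 0.5)
    # One line per distinct stack: "outer;...;inner count" (folded-stack style).
    for stack, n in counts.most_common():
        print(";".join(stack), n)
```

The semicolon-separated lines it prints are the kind of aggregated stack counts that the postprocessing step turns into a flamegraph or call tree.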

hyperfine looks interesting but it seems to be just a benchmarking framework to time multiple runs.
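
For comparison, timing multiple runs needs no counters at all. Below is a minimal sketch of that idea in Python. `time_runs` and `summarize` are hypothetical helpers, and the spread it reports is the sample standard deviation relative to the mean, which is my choice and not necessarily the same statistic that perf or hyperfine print.

```python
import statistics
import subprocess
import sys
import time


def time_runs(argv, repeats=10):
    """Run argv `repeats` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean seconds, sample standard deviation as a percentage of the mean)."""
    mean = statistics.mean(durations)
    if len(durations) < 2:
        return mean, 0.0
    return mean, 100.0 * statistics.stdev(durations) / mean


if __name__ == "__main__":
    # Default to a trivial command when no arguments are given.
    argv = sys.argv[1:] or [sys.executable, "-c", "pass"]
    mean, rel = summarize(time_runs(argv, repeats=10))
    print(f"{mean:.9f} seconds time elapsed ( +- {rel:.2f}% )")
```

This only covers the last line of the `perf stat` output. It says nothing about cycles, instructions or branch misses.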

I need to look at pmcstat more closely.

* Linux perf, Oracle collect and Intel VTune are the ones I know of. There is also gprof, but that needs the application to be recompiled.

Update:
Oracle collect has now been open-sourced and is part of GNU binutils as gprofng: https://sourceware.org/binutils/wiki/gprofng
 