NetBSD Documentation: Kernel Profiling HOWTO

This document describes how kernel profiling works and how to use it. It was written by Matthias Drochner.

Kernel Profiling



How does it work

Two independent sets of data about the behaviour of the profiled code are recorded: how often each function is called (call graph profiling), and the time spent in each function. The time is not measured directly. It is estimated from the probability that the program counter lies within the function at a randomly chosen moment, and that probability is in turn estimated by the fraction of profiling timer interrupts that occur while the function is executing. The gprof(1) utility interprets the data. Because the two data sets are not correlated with each other, there are some limitations, which are noted in the BUGS section of the gprof(1) man page.
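The estimate described above comes down to one proportion. The following is only a sketch of that arithmetic; the function and parameter names are invented for illustration and are not taken from the kernel or from gprof(1).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the time spent in one function from statistical samples.
 * func_samples:    profiling timer interrupts that hit the function
 * total_samples:   all profiling timer interrupts during the run
 * elapsed_seconds: wall-clock length of the profiling run
 * All three names are hypothetical.
 */
double
estimate_seconds(uint64_t func_samples, uint64_t total_samples,
    double elapsed_seconds)
{
    assert(func_samples <= total_samples);

    if (total_samples == 0)
        return 0.0;    /* nothing was sampled, so nothing to estimate */

    /* The sample fraction approximates the probability of the PC
     * being inside the function at a random moment. */
    return elapsed_seconds * (double)func_samples / (double)total_samples;
}
```

For instance, 250 hits out of 1000 samples over an 8 second run would give an estimate of 2 seconds for that function.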

Kernel profiling and user program profiling are mostly similar; there are only small differences in the way the profiling data are accessed and how the profiling is controlled.

The data related to kernel profiling are located within a global structure, _gmonparam, which is initialized by kmstartup() (in kern/subr_prof.c) during system initialization. The user-level control program kgmon(8) uses sysctl(3) calls for control and data access, and in part kvm(3) accesses, even in the standard case where a live kernel is profiled.

Call graph recording

The profiling flag (-pg) causes the compiler to emit a call to mcount() on every function entry. Machine-specific glue dispatches this call to _mcount(frompc, selfpc), which is implemented in sys/lib/libkern/mcount.c. frompc is the address the function was called from, and selfpc is the address of the called function itself.

For every (frompc, selfpc) pair encountered during profiling, a struct tostruct is allocated from the array pointed to by _gmonparam.tos. Entries are simply allocated from the beginning of the array to the end, in order of first use. kmstartup() derives the size of the array from the kernel's text size by a heuristic that appears to be an educated guess.

The struct tostruct entries contain the address of the called function (selfpc) together with a histogramming counter. Entries belonging to the same calling address form a linked list. The list heads (i.e. the index within the _gmonparam.tos array of the first entry belonging to a particular calling address) are located in a second array, _gmonparam.froms, which is indexed by the calling address (frompc) divided by some value. That value should not be larger than the minimal distance between two calls within the code; see also the comments in sys/sys/gmon.h.

Note that for standard function calls there is only one selfpc for each frompc, so that the typical list consists of one member only.
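The arrangement of list heads and arc entries can be pictured with a small user-space model. This is a simplified sketch and not the code in sys/lib/libkern/mcount.c: the struct names, the table sizes, the divisor and the helper functions are all assumptions made for illustration, and addresses are taken as offsets from the start of the profiled text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TEXT_SIZE  4096u   /* assumed size of the profiled text, in bytes */
#define FROM_SCALE 4u      /* assumed divisor applied to the calling address */
#define NFROMS     (TEXT_SIZE / FROM_SCALE)
#define NTOS       64u     /* assumed arc table size; slot 0 means "none" */

struct arc {               /* loosely modelled on struct tostruct */
    uintptr_t     selfpc;  /* address of the called function */
    unsigned long count;   /* how often this arc was taken */
    uint16_t      link;    /* next arc with the same frompc, 0 ends the list */
};

struct callgraph {         /* hypothetical stand-in for the _gmonparam fields */
    uint16_t   froms[NFROMS];  /* list heads, indexed by frompc / FROM_SCALE */
    struct arc tos[NTOS];      /* arcs, handed out in order of first use */
    uint16_t   next_free;      /* next unused slot in tos[] */
};

void
callgraph_init(struct callgraph *g)
{
    memset(g, 0, sizeof(*g));
    g->next_free = 1;      /* index 0 is reserved as the end-of-list marker */
}

/* Record one call. Returns 0 on success, -1 if frompc lies outside the
 * profiled text or the arc table is full. */
int
record_call(struct callgraph *g, uintptr_t frompc, uintptr_t selfpc)
{
    uint16_t *head, i;

    if (frompc >= TEXT_SIZE)
        return -1;
    head = &g->froms[frompc / FROM_SCALE];

    /* Walk the list for this call site; it usually has one member. */
    for (i = *head; i != 0; i = g->tos[i].link) {
        if (g->tos[i].selfpc == selfpc) {
            g->tos[i].count++;
            return 0;
        }
    }

    /* First use of this (frompc, selfpc) pair: take the next free entry. */
    if (g->next_free >= NTOS)
        return -1;
    i = g->next_free++;
    g->tos[i].selfpc = selfpc;
    g->tos[i].count = 1;
    g->tos[i].link = *head;
    *head = i;
    return 0;
}

/* Look up the counter for one arc; 0 if the arc was never recorded. */
unsigned long
arc_count(const struct callgraph *g, uintptr_t frompc, uintptr_t selfpc)
{
    uint16_t i;

    if (frompc >= TEXT_SIZE)
        return 0;
    for (i = g->froms[frompc / FROM_SCALE]; i != 0; i = g->tos[i].link)
        if (g->tos[i].selfpc == selfpc)
            return g->tos[i].count;
    return 0;
}
```

A call site that always reaches the same function produces a one-element list in this model. A second selfpc for the same frompc, such as a call through a function pointer, adds a second entry to that list.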

Statistical profiling

When profiling is started, a profiling timer interrupt is set up which calls statclock() (see sys/kern/kern_clock.c). This should be a timer independent of the normal system clock, to avoid interference with functions that run synchronously with system clock ticks. statclock() is used for both user program and kernel profiling.

The program counter at the time of the interruption, divided by some value again, is used as index into the histogram _gmonparam.kcount and the corresponding cell is incremented.
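The histogram update can be sketched as follows. This is a user-space model under assumed values, not the code in statclock(): the struct, the scale factor and the cell count are hypothetical, and only the name kcount is borrowed from _gmonparam.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SCALE 8u      /* assumed bytes of text covered by one cell */
#define HIST_CELLS 512u    /* assumed number of histogram cells */

struct pc_histogram {      /* hypothetical stand-in for the _gmonparam fields */
    uintptr_t lowpc;                 /* start of the profiled text */
    uint16_t  kcount[HIST_CELLS];    /* one counter per HIST_SCALE bytes */
};

/* Account for one profiling timer interrupt that found the program
 * counter at pc. Returns 1 if the sample was counted, 0 if pc lies
 * outside the profiled range. */
int
sample_pc(struct pc_histogram *h, uintptr_t pc)
{
    size_t cell;

    if (pc < h->lowpc)
        return 0;
    cell = (pc - h->lowpc) / HIST_SCALE;
    if (cell >= HIST_CELLS)
        return 0;
    h->kcount[cell]++;
    return 1;
}
```

With these values, every program counter within the same 8 byte window lands in the same cell, so the divisor sets the resolution of the resulting profile.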

How to use it



Generated from %NetBSD: index.xml,v 1.3 2006/02/27 13:54:49 kano Exp %
Copyright © 1994-2006 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.
NetBSD® is a registered trademark of The NetBSD Foundation, Inc.