COMP 320 Tutorial - Performance Profiling
Tutorial goals:
Often you want to know about the performance aspects of your program. Of course, when you are developing your program, you should understand its asymptotic "big-O" complexity behavior. But frequently you want more detail than that.
Modern architectures are sufficiently complicated (by features like superscalar pipelining and multi-level caching) that static analyses cannot both accurately and precisely describe program running time. What we are left with are dynamic analyses, i.e., running or simulating programs and analyzing what happened.
You might already have discovered two simple UNIX tools:
- time program will report the time taken to complete the program. It reports elapsed ("wall clock") time, user CPU time, and system CPU time. The latter two count processor time spent on that program, broken down into "user time" corresponding to your code and "system time" corresponding to various low-level routines used by your code.
- top will report the time and space used so far by programs currently running.
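The user/system split that time reports can also be read from inside a running program. The following is a minimal sketch, assuming a POSIX system that provides getrusage(); the helper names cpu_times and report_cpu_times are hypothetical and are not part of the course materials.

```c
#include <stdio.h>
#include <sys/resource.h>  /* getrusage, struct rusage (POSIX) */
#include <sys/time.h>      /* struct timeval */

/* Convert a struct timeval to seconds. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical helper: store the user and system CPU time consumed
 * so far by this process. Returns 0 on success, -1 on failure. */
int cpu_times(double *user_sec, double *sys_sec)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *user_sec = tv_seconds(ru.ru_utime);  /* time in your own code */
    *sys_sec = tv_seconds(ru.ru_stime);   /* time in the kernel on your behalf */
    return 0;
}

/* Print both CPU times; the layout is an assumption, not time's format. */
void report_cpu_times(void)
{
    double user_sec, sys_sec;

    if (cpu_times(&user_sec, &sys_sec) == 0)
        printf("user %.3fs  sys %.3fs\n", user_sec, sys_sec);
}
```

Calling report_cpu_times() just before a program exits should give figures comparable to the user and system times printed by time, although the two will not match exactly.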
The three pieces of performance information that developers are usually most interested in are
How could you track these? One approach is to change the program's source code so that it records them itself.
ctime()) and compute the time
consumed. I.e., add a stopwatch to each function.
Note that in each case, the act of profiling will change the profile at least a little.
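The stopwatch idea can be sketched by hand as follows. This is only an illustration: the function work and the variables work_ticks and work_calls are hypothetical names. It uses the standard clock() from <time.h>, which returns the processor time used by the program so far (ctime(), by contrast, only formats a calendar time as a string).

```c
#include <stdio.h>
#include <time.h>   /* clock, clock_t, CLOCKS_PER_SEC */

/* Accumulated CPU time and call count for one instrumented function. */
static clock_t work_ticks = 0;
static unsigned long work_calls = 0;

/* Hypothetical function under study: sums the integers 0..n-1. */
unsigned long work(unsigned long n)
{
    clock_t start = clock();            /* start the stopwatch */
    unsigned long sum = 0;

    for (unsigned long i = 0; i < n; i++)
        sum += i;

    work_ticks += clock() - start;      /* stop it and accumulate */
    work_calls++;
    return sum;
}

/* CPU seconds spent inside work() so far. */
double work_seconds(void)
{
    return (double)work_ticks / CLOCKS_PER_SEC;
}

/* Print a one-line summary for the instrumented function. */
void report_work(void)
{
    printf("work: %lu calls, %.6f s\n", work_calls, work_seconds());
}
```

The two extra clock() calls and the counter updates are themselves executed on every call, which is one concrete way in which the act of profiling changes the profile.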
Many modern processors have profiling abilities built in. Profiling still requires a software interface to this hardware. Advantages include that profiling doesn't require modifying the source code, and thus the act of profiling doesn't modify the code's behavior. Disadvantages include that such profiling has a fixed hardware-determined set of capabilities, and it is more difficult to map the profiling results back to the source code.
We could integrate the first two of these with the printf-debugging advocated in a prior tutorial. Instead, UNIX has standard tools for these:
- prof and the similar gprof from GNU
- tcov
gprof
prof and gprof are essentially identical in purpose. We'll use the latter, since it has more features.
Both are tools for analyzing the run time required by a program and breaking this time down among all the various defined functions.
In essence, each is a tool for automatically adding a stopwatch to each function.
Using gprof is a three-step process:
1. Compile with the -pg flag with gcc. (Use the -xpg flag with cc.) This adds code into the resulting executable to do the profiling.
2. Run the program, program. The resulting profiling information is put into the file gmon.out.
3. Run gprof program, which interprets the information in gmon.out and prints a human-readable summary.
By default, this produces a lot of output. (Use gprof program | more to page through it all!)
For each function, it shows the percentage of time spent in that function and how that time is broken down among the functions it calls.
(Note: It also includes functions from system libraries.)
Various command-line options are available to limit what you see; see man gprof, or this manual.
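To try the three steps on something small, a workload with one deliberately slow function and one faster one makes the per-function breakdown easy to read. The following is a hypothetical stand-in, not the course's sample program; all function names are invented for illustration.

```c
#include <stdio.h>

/* Trial division by every candidate below n: deliberately slow. */
static int is_prime_slow(unsigned n)
{
    if (n < 2)
        return 0;
    for (unsigned d = 2; d < n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Trial division only while d*d <= n: far fewer iterations.
 * (Assumes n is small enough that d*d does not overflow.) */
static int is_prime_fast(unsigned n)
{
    if (n < 2)
        return 0;
    for (unsigned d = 2; d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Count the primes below limit using the given primality test. */
unsigned count_primes(unsigned limit, int (*is_prime)(unsigned))
{
    unsigned count = 0;

    for (unsigned n = 0; n < limit; n++)
        if (is_prime(n))
            count++;
    return count;
}

/* Run both variants so that each shows up in the profile. */
void run_workload(unsigned limit)
{
    printf("slow: %u primes\n", count_primes(limit, is_prime_slow));
    printf("fast: %u primes\n", count_primes(limit, is_prime_fast));
}
```

Calling run_workload() with a limit in the tens of thousands from main, compiling with gcc -pg, running the executable, and then running gprof on it should attribute most of the time to is_prime_slow. The exact figures depend on the machine and on compiler settings.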
Compile the sample program
without any compiler optimizations,
and use
Compile the program using various compiler optimizations
( E.g., here are the unoptimized and optimized results for one set of test inputs, as executed on one of the Owlnet servers.
See also [BO] Section 5.15.1.
tcov
Note: tcov is used with cc, while
gcov is used with gcc. However,
gcov is not yet available on Owlnet.
tcov is similar to prof/gprof.
But, rather than timing each function,
it counts the number of times each statement is executed.
Using it is a four-step process:
1. Compile with the -xprofile=tcov flag with cc. Again, you can use our Makefile. This adds code into the resulting executable to do the profiling.
2. Run the program, program. The resulting profiling information is put into the file program.profile/tcovd.
3. Run tcov -x program.profile source1.c source2.c ... (listing all your source files, or using *.c) to separate that profiling information into readable files source1.tcov, source2.tcov, ...
4. Look at the files *.tcov.
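What a coverage tool records can be imitated by hand with one counter per statement of interest. The sketch below is only an illustration of the idea, not how tcov is implemented; the counter names and the collatz_steps example are hypothetical.

```c
#include <stdio.h>

/* One counter per instrumented statement (hand-placed, hypothetical). */
enum { AT_LOOP_BODY, AT_EVEN_BRANCH, AT_ODD_BRANCH, N_COUNTERS };
static unsigned long hits[N_COUNTERS];

/* Number of Collatz steps needed to reach 1 from n (requires n >= 1). */
unsigned collatz_steps(unsigned long n)
{
    unsigned steps = 0;

    while (n != 1) {
        hits[AT_LOOP_BODY]++;
        if (n % 2 == 0) {
            hits[AT_EVEN_BRANCH]++;
            n /= 2;
        } else {
            hits[AT_ODD_BRANCH]++;
            n = 3 * n + 1;
        }
        steps++;
    }
    return steps;
}

/* Print the execution count recorded at each instrumented point. */
void report_hits(void)
{
    printf("loop body:   %lu\n", hits[AT_LOOP_BODY]);
    printf("even branch: %lu\n", hits[AT_EVEN_BRANCH]);
    printf("odd branch:  %lu\n", hits[AT_ODD_BRANCH]);
}
```

A statement whose count stays at zero was never exercised by the test inputs, and a statement with a very large count is a candidate for closer attention when optimizing.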
Now that you have this information, what do you do with it?
Within this course, several classes have described optimization techniques, and more will be described in later classes. Some of these will be helpful in the last assignment, where optimizing time and space resources is part of the requirements.
See also [BO] Section 5.15.2.