CodeAnalyst leverages a third-party profiling tool called OProfile. OProfile provides a suite of command-line utilities, which CodeAnalyst has modified to add support for AMD processors. AMD CodeAnalyst 2.7 requires a Linux® kernel that supports OProfile version 0.9.1 or later.
opcontrol is a command-line tool used to control a profiling session. It allows users to collect data from the command line or from a script. The latter capability is especially useful because the utility can be invoked from test scripts or from scripts that perform extensive or often-repeated experiments.
The opcontrol command line utility allows users to configure, start, and stop data collection. In the version provided by CodeAnalyst, it supports:
The profiling results can be imported into the CodeAnalyst Graphical User Interface (GUI) for analysis. A CodeAnalyst project must be created to import and view performance data that was collected using the command line utility.
The command line switches to opcontrol set up and control performance data collection. Analysis often concentrates on the performance of a particular program. The OProfile command line utility (opcontrol) can launch the program to be analyzed, or it can launch a test script that launches the program of interest. The ability to launch a command file offers considerable flexibility when writing test scripts, because specific test cases can be encapsulated in individual script files. The following is an example of how to use opcontrol in a script:
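As a minimal sketch of such a script, the function below brackets one workload with an OProfile session. It uses only the stock opcontrol switches --reset, --start, --dump and --shutdown, and it runs the workload itself rather than relying on any launch switch, since that switch is not named in this section. The OPCONTROL variable, the function name and the sample workload are illustrative assumptions. The sketch also assumes the session has already been configured (events, kernel image setting) and that the script runs with sufficient privileges.

```shell
#!/bin/sh
# Sketch only: wrap one workload in an OProfile session.
# OPCONTROL can be overridden to point at a specific opcontrol binary (assumption).
OPCONTROL=${OPCONTROL:-opcontrol}

# usage: profile_workload <program> [<args> ...]
profile_workload() {
    "$OPCONTROL" --reset || return 1     # discard samples from earlier sessions
    "$OPCONTROL" --start || return 1     # start the daemon and begin collecting
    "$@"                                 # run the program or test script
    status=$?                            # remember the workload's exit status
    "$OPCONTROL" --dump                  # flush collected samples to disk
    "$OPCONTROL" --shutdown              # stop collecting and stop the daemon
    return "$status"
}

# Example with a hypothetical workload:
#   profile_workload ./my_benchmark --iterations 100
```

Returning the workload's own exit status lets a calling test script tell a failed test case apart from a successful one after the profile has been written.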
Time-based profiling works by collecting samples at specified time intervals. Over a period of time, the samples collected can show which blocks of code use the most processor time. See Time-Based Profiling for further information.
The timer interval determines how often a TBP sample is taken. On Linux, time-based profiling uses the event CPU_CLK_UNHALTED (performance counter event 0x76), which counts the cycles during which the processor is running, that is, not in a halted state. This event allows system idle time to be factored out of IPC (or CPI) measurements automatically, provided the OS halts the CPU when going idle. An interval in seconds or milliseconds can be converted to a cycle count using the processor clock speed. For instance, on a processor running at 800 MHz, a 1 millisecond interval corresponds to 800,000 cycles (800,000,000 cycles per second multiplied by 0.001 second). To specify this interval for time-based profiling with the opcontrol tool, enter:
opcontrol --event=CPU_CLK_UNHALTED:800000::1:1
Please see the section below for more information on the opcontrol command line options.
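The conversion from a time interval to a CPU_CLK_UNHALTED count can be sketched as a small shell helper. The function names tbp_count and tbp_event are hypothetical; the arithmetic assumes a fixed clock speed given in MHz and an interval given in whole milliseconds.

```shell
# Convert a sampling interval into a CPU_CLK_UNHALTED count.
# usage: tbp_count <clock_mhz> <interval_ms>
tbp_count() {
    # 1 MHz = 1000 cycles per millisecond
    echo $(( $1 * 1000 * $2 ))
}

# Build the full event specification, with kernel and user profiling enabled.
# usage: tbp_event <clock_mhz> <interval_ms>
tbp_event() {
    echo "CPU_CLK_UNHALTED:$(tbp_count "$1" "$2")::1:1"
}

tbp_event 800 1    # prints CPU_CLK_UNHALTED:800000::1:1
```

On systems where the clock speed varies with power management, the computed count corresponds to the intended interval only while the processor runs at the assumed speed.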
Event-based profiling works by using performance counters in the processor to count the number of times a specific processor event occurs. When the specified counter threshold of an event is reached, OProfile collects a sample from the processor. Up to four events can be profiled in a given session, and each event can be assigned a different counter threshold. EBP requires an APIC-enabled system. See Event-Based Profiling for further information.
To successfully use EBP, the user needs to consult the performance monitor event tables. See the section on Performance Monitoring Events or the BIOS and Kernel Developer's Guide for the AMD processor in your test platform. For a general description of how to use these performance monitoring features, refer to the AMD64 Architecture Programmer's Manual, Volume 2, order# 24593, "Debug and Performance Resources" section.
The Event Select, Unit Mask and Event Count (sampling period) must be specified for each event to be measured. The OProfile utility accepts event specifications in the following format:
[OPROFILE_EVENT_NAME]:[Count]:[Unit Mask]:[Kernel]:[User]
- [OPROFILE_EVENT_NAME] specifies the name of the event to be profiled.
- [Count] is a decimal number that specifies the Event Count (sampling period).
- [Unit Mask] is a two-digit hexadecimal value that specifies the Unit Mask value for the event.
- [Kernel] is 0 or 1 to disable or enable kernel-space profiling.
- [User] is 0 or 1 to disable or enable user-space profiling.
A complete list of events can be viewed using the command opcontrol -l.
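The field order described above can be sketched as a shell helper that assembles and sanity-checks an event specification. The function name event_spec is hypothetical, and the checks cover only what this section states (a decimal count, and 0 or 1 for the kernel and user fields); event names and unit masks are not validated against the processor's event tables.

```shell
# Assemble NAME:COUNT:UNITMASK:KERNEL:USER from its parts.
# usage: event_spec <name> <count> <unit_mask_or_empty> <kernel> <user>
event_spec() {
    name=$1 count=$2 mask=$3 kernel=$4 user=$5
    # The count must be a decimal number.
    case $count in
        ''|*[!0-9]*) echo "count must be decimal: $count" >&2; return 1 ;;
    esac
    # Kernel and User are on/off switches.
    for flag in "$kernel" "$user"; do
        case $flag in
            0|1) ;;
            *) echo "kernel/user must be 0 or 1: $flag" >&2; return 1 ;;
        esac
    done
    # An empty unit mask leaves the third field blank.
    echo "$name:$count:$mask:$kernel:$user"
}

event_spec RETIRED_INSTRUCTIONS 250000 "" 1 1
# prints RETIRED_INSTRUCTIONS:250000::1:1
```

The resulting string can be passed to opcontrol as the value of the --event switch.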
Consider, for example, the DCache Refill From L2 or System event which can be used to measure only refills from system memory through the use of a Unit Mask that qualifies the event. The Event name is "DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE" and a Unit Mask value of 0x01 measures only refills from system memory. Using an Event Count of 25,000, the full opcontrol event specification is:
opcontrol --event=DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE:25000:0x1:1:1
The Retired Instructions event (Event Select 0x0C0) does not require a Unit Mask. Using an Event Count of 250,000, the full opcontrol event specification is:
opcontrol --event=RETIRED_INSTRUCTIONS:250000::1:1
The Event Count field of the event specification determines how many times the event must occur before an EBP sample is collected. This field is important because some events occur so frequently that processing the samples becomes very slow, or collecting them causes the system to stop responding. The latter may happen if the specified Event Count is too small for the event. For example, event 0x040 (DCache Access) occurs very frequently, so a count of 10,000 generates a very large number of samples even during a normal 5-second profiling session. It is advisable to specify a large count when profiling an event for the first time, and then to adjust the count in later runs to obtain more statistically useful data.
The following example command measures the Data Cache Accesses event (Event Select 0x040) using an Event Count of 100,000:
opcontrol --event=DATA_CACHE_ACCESSES:100000::1:1
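The effect of the Event Count on sample volume can be estimated with simple arithmetic: the sample rate is roughly the event rate divided by the count. The sketch below assumes a purely hypothetical rate of 500 million data cache accesses per second; the real rate depends on the processor and the workload. The function name samples_per_second is also an assumption.

```shell
# Estimate how many samples per second a given Event Count produces.
# usage: samples_per_second <events_per_second> <event_count>
samples_per_second() {
    echo $(( $1 / $2 ))
}

rate=500000000                        # hypothetical event rate, not a measured value
samples_per_second "$rate" 10000      # prints 50000
samples_per_second "$rate" 100000     # prints 5000
```

Under this assumption, a 5-second session yields about 250,000 samples with a count of 10,000 but only about 25,000 with a count of 100,000, which is why starting with a large count is the safer choice.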
The user is allowed to specify up to four events on the command line. For example, to specify events DATA_CACHE_ACCESSES and RETIRED_INSTRUCTIONS in the same session with reasonable counts, enter the following command:
opcontrol --event=DATA_CACHE_ACCESSES:100000::1:1 -e RETIRED_INSTRUCTIONS:25000::1:1
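When event lists are generated by a script, the four-event limit can be checked before opcontrol is invoked. The helper below is a sketch under that assumption: the name event_args is hypothetical, and the function only prints the command line rather than running it.

```shell
# Print an opcontrol command line for one to four event specifications.
# usage: event_args <spec> [<spec> ...]
event_args() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 4 ]; then
        echo "expected 1 to 4 event specifications, got $#" >&2
        return 1
    fi
    args=""
    for spec in "$@"; do
        args="$args --event=$spec"    # one --event switch per specification
    done
    echo "opcontrol$args"
}

event_args DATA_CACHE_ACCESSES:100000::1:1 RETIRED_INSTRUCTIONS:25000::1:1
# prints opcontrol --event=DATA_CACHE_ACCESSES:100000::1:1 --event=RETIRED_INSTRUCTIONS:25000::1:1
```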
After profiling has begun, opcontrol prints confirmation messages and returns to the command prompt. To stop the profiling session, run opcontrol with the corresponding option flag, as listed in opcontrol --help.
Instruction-Based Sampling collects performance data on instruction fetch (IBS fetch sampling) and macro-op execution (IBS op sampling). IBS fetch sampling provides information about instruction cache (IC) and instruction translation lookaside buffer (ITLB) behavior, and about other aspects of the instruction fetch process. IBS op sampling provides information about the execution of macro-ops that are issued from AMD64 instructions. IBS op data is wide-ranging and covers branch and memory access operations. See Instruction-Based Sampling for more information.
IBS fetch sampling and IBS op sampling are controlled by the --ibs-fetch and --ibs-op switches. The --ibs-fetch switch takes a single decimal value which is the fetch interval (sampling period) for IBS fetch sampling. IBS fetch sampling counts completed fetches to determine the next fetch operation to monitor and sample. The --ibs-op switch takes a single decimal value which is the op interval (sampling period) for IBS op sampling. IBS op sampling counts processor cycles to determine the next macro-op to monitor and sample. IBS fetch and op sampling may be enabled independently:
opcontrol --ibs-fetch=250000
opcontrol --ibs-op=250000
or IBS fetch and op sample data may be collected at the same time:
opcontrol --ibs-fetch=250000 --ibs-op=250000
Profiles collected with the command line utility can be imported into a CodeAnalyst project. With the default configuration, opcontrol creates profiles in the /var/lib/oprofile/samples/current directory, which contains the samples collected by the OProfile daemon.
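Before importing, a script can confirm that the session actually produced sample files. The sketch below counts regular files under the default sample directory named above. The function name count_samples is hypothetical, and reading the directory may require elevated privileges depending on the system.

```shell
# Count sample files under an OProfile session directory.
# usage: count_samples [<samples_dir>]
count_samples() {
    dir=${1:-/var/lib/oprofile/samples/current}
    if [ ! -d "$dir" ]; then
        echo "no sample directory at $dir" >&2
        return 1
    fi
    # Count regular files at any depth; strip padding that some wc versions add.
    find "$dir" -type f | wc -l | tr -d ' '
}

# Example:
#   count_samples                      # default location
#   count_samples /path/to/samples     # hypothetical alternative location
```

A result of 0, or a missing directory, suggests that data collection did not run or that the samples were not flushed to disk before the session ended.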
The profile data can be imported into the CodeAnalyst GUI for further review and interpretation. Profile data is imported into a CodeAnalyst project; see Creating a CodeAnalyst Project to create one. The section Importing Profile Data into CodeAnalyst illustrates the import process.
To see the list of command line switches, run "opcontrol --help" at the command prompt. To identify the CodeAnalyst-provided version of the "opcontrol" tool, run "opcontrol --version" as shown in the figure below.