366 lines
18 KiB
Markdown
366 lines
18 KiB
Markdown
|
# Performance Monitor (`perfmon`) example
|
||
|
|
||
|
(See the README.md file in the upper level 'examples' directory for more information about examples.)
|
||
|
|
||
|
## Overview
|
||
|
This example illustrates usage of `perfmon` APIs to monitor and profile functions.
|
||
|
The example will calculate performance statistic for simple test function.
|
||
|
The simple test function could be exchanged to one from the user.
|
||
|
|
||
|
The example contain test function that will be executed with perfmon component and collect CPU statistic. The test function will be executed 200 times in each test case. The first test case collect statistic from all available performance counters, and second test just from defined in the list.
|
||
|
|
||
|
## How to use example
|
||
|
|
||
|
### Hardware Required
|
||
|
|
||
|
Example should be able to run on any commonly available ESP32 development board.
|
||
|
|
||
|
### Configure the project
|
||
|
|
||
|
```
|
||
|
make menuconfig
|
||
|
```
|
||
|
|
||
|
* Set serial port under Serial Flasher Options.
|
||
|
|
||
|
### Build and Flash
|
||
|
|
||
|
Enter `make -j4 flash monitor` if you are using GNU Make based build system or enter `idf.py build flash monitor` if you' are using CMake based build system.
|
||
|
|
||
|
(To exit the serial monitor, type ``Ctrl-]``.)
|
||
|
|
||
|
See the [Getting Started Guide](https://docs.espressif.com/projects/esp-idf/en/latest/get-started/index.html) for full steps to configure and use ESP-IDF to build projects.
|
||
|
|
||
|
## Example Output
|
||
|
|
||
|
1. Example starts and call first test. The first test call test function 'exec_test_function'
|
||
|
|
||
|
```bash
|
||
|
I (288) example: Start
|
||
|
I (288) example: Start test with printing all available statistic
|
||
|
Value = 750, select = 0, mask = 0001. Counts cycles.
|
||
|
Amount of cycles
|
||
|
Value = 0, select = 1, mask = 0001. Overflow of counter.
|
||
|
Overflow counter
|
||
|
Value = 0, select = 2, mask = 0001. Successfully Retired Instructions.
|
||
|
|
||
|
JX instructions
|
||
|
Value = 0, select = 2, mask = 0002. Successfully Retired Instructions.
|
||
|
|
||
|
CALLXn instructions
|
||
|
Value = 3, select = 2, mask = 0004. Successfully Retired Instructions.
|
||
|
|
||
|
return instructions (RET, RETW, ...)
|
||
|
Value = 0, select = 2, mask = 0008. Successfully Retired Instructions.
|
||
|
|
||
|
supervisor return instructions (RFDE, RFE, RFI, RFWO, RFWU)
|
||
|
Value = 100, select = 2, mask = 0010. Successfully Retired Instructions.
|
||
|
|
||
|
Conditional branch instructions where execution
|
||
|
transfers to the target (aka. taken branch),
|
||
|
or loopgtz/loopnez instr where execution skips
|
||
|
the loop (aka. not-taken loop)
|
||
|
Value = 0, select = 2, mask = 0020. Successfully Retired Instructions.
|
||
|
|
||
|
J instr
|
||
|
Value = 1, select = 2, mask = 0040. Successfully Retired Instructions.
|
||
|
|
||
|
CALLn instr
|
||
|
Value = 0, select = 2, mask = 0080. Successfully Retired Instructions.
|
||
|
|
||
|
Conditional branch instr where execution
|
||
|
falls through (aka. not-taken branch)
|
||
|
Value = 0, select = 2, mask = 0100. Successfully Retired Instructions.
|
||
|
|
||
|
Loop instr where execution falls into loop (aka. taken loop)
|
||
|
Value = 0, select = 2, mask = 0400. Successfully Retired Instructions.
|
||
|
|
||
|
Last inst of loop and execution transfers
|
||
|
to LBEG (aka. loopback taken)
|
||
|
Value = 0, select = 2, mask = 0800. Successfully Retired Instructions.
|
||
|
|
||
|
Last inst of loop and execution falls
|
||
|
through to LEND (aka. loopback fallthrough)
|
||
|
Value = 309, select = 2, mask = 8000. Successfully Retired Instructions.
|
||
|
|
||
|
Non-branch instr (aka. non-CTI)
|
||
|
Value = 0, select = 3, mask = 0002. Data-related GlobalStall cycles.
|
||
|
Store buffer full stall
|
||
|
Value = 0, select = 3, mask = 0004. Data-related GlobalStall cycles.
|
||
|
Store buffer conflict stall
|
||
|
Value = 0, select = 3, mask = 0008. Data-related GlobalStall cycles.
|
||
|
Data Cache-miss stall (unused)
|
||
|
Value = 0, select = 3, mask = 0010. Data-related GlobalStall cycles.
|
||
|
Data RAM/ROM/XLMI busy stall
|
||
|
Value = 0, select = 3, mask = 0020. Data-related GlobalStall cycles.
|
||
|
Data inbound-PIF request stall (includes s32c1i)
|
||
|
Value = 0, select = 3, mask = 0040. Data-related GlobalStall cycles.
|
||
|
MHT lookup stall
|
||
|
Value = 0, select = 3, mask = 0080. Data-related GlobalStall cycles.
|
||
|
Uncached load stall (included in MHT lookup stall below)
|
||
|
Value = 0, select = 3, mask = 0100. Data-related GlobalStall cycles.
|
||
|
Bank-conflict stall
|
||
|
Value = 0, select = 4, mask = 0001. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
ICache-miss stall
|
||
|
Value = 0, select = 4, mask = 0002. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
Instruction RAM/ROM busy stall
|
||
|
Value = 0, select = 4, mask = 0004. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
Instruction RAM inbound-PIF request stall
|
||
|
Value = 0, select = 4, mask = 0008. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
TIE port stall
|
||
|
Value = 0, select = 4, mask = 0010. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
External RunStall signal status
|
||
|
Value = 0, select = 4, mask = 0020. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
Uncached fetch stall
|
||
|
Value = 1, select = 4, mask = 0040. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
FastL32R stall
|
||
|
Value = 0, select = 4, mask = 0080. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
Iterative multiply stall
|
||
|
Value = 0, select = 4, mask = 0100. Instruction-related and Other Glob
|
||
|
alStall cycles.
|
||
|
Iterative divide stall
|
||
|
Value = 0, select = 5, mask = 0001. Exceptions and Pipeline Replays.
|
||
|
Other Pipeline Replay (i.e. excludes cache miss etc.)
|
||
|
Value = 0, select = 5, mask = 0002. Exceptions and Pipeline Replays.
|
||
|
Level-1 interrupt
|
||
|
Value = 0, select = 5, mask = 0004. Exceptions and Pipeline Replays.
|
||
|
Greater-than-level-1 interrupt
|
||
|
Value = 0, select = 5, mask = 0008. Exceptions and Pipeline Replays.
|
||
|
Debug exception
|
||
|
Value = 0, select = 5, mask = 0010. Exceptions and Pipeline Replays.
|
||
|
NMI
|
||
|
Value = 0, select = 5, mask = 0020. Exceptions and Pipeline Replays.
|
||
|
Window exception
|
||
|
Value = 0, select = 5, mask = 0040. Exceptions and Pipeline Replays.
|
||
|
Allocate exception
|
||
|
Value = 0, select = 5, mask = 0080. Exceptions and Pipeline Replays.
|
||
|
Other exceptions
|
||
|
Value = 0, select = 5, mask = 0100. Exceptions and Pipeline Replays.
|
||
|
HW-corrected memory error
|
||
|
Value = 0, select = 6, mask = 0001. Hold and Other Bubble cycles.
|
||
|
Processor domain PSO bubble
|
||
|
Value = 0, select = 6, mask = 0004. Hold and Other Bubble cycles.
|
||
|
R hold caused by Data Cache miss(unused)
|
||
|
Value = 0, select = 6, mask = 0008. Hold and Other Bubble cycles.
|
||
|
R hold caused by Store release
|
||
|
Value = 0, select = 6, mask = 0010. Hold and Other Bubble cycles.
|
||
|
R hold caused by register dependency
|
||
|
Value = 0, select = 6, mask = 0020. Hold and Other Bubble cycles.
|
||
|
R hold caused by MEMW, EXTW or EXCW
|
||
|
Value = 0, select = 6, mask = 0040. Hold and Other Bubble cycles.
|
||
|
R hold caused by Halt instruction (TX only)
|
||
|
Value = 322, select = 6, mask = 0080. Hold and Other Bubble cycles.
|
||
|
CTI bubble (e.g. branch delay slot)
|
||
|
Value = 0, select = 6, mask = 0100. Hold and Other Bubble cycles.
|
||
|
WAITI bubble i.e. a cycle spent in WaitI power down mode.
|
||
|
Value = 417, select = 7, mask = 0001. Instruction TLB Accesses (per inst
|
||
|
ruction retiring).
|
||
|
ITLB Hit
|
||
|
Value = 0, select = 7, mask = 0002. Instruction TLB Accesses (per inst
|
||
|
ruction retiring).
|
||
|
Replay of instruction due to ITLB miss
|
||
|
Value = 0, select = 7, mask = 0004. Instruction TLB Accesses (per inst
|
||
|
ruction retiring).
|
||
|
HW-assisted TLB Refill completes
|
||
|
Value = 0, select = 7, mask = 0008. Instruction TLB Accesses (per inst
|
||
|
ruction retiring).
|
||
|
ITLB Miss Exception
|
||
|
Value = 0, select = 8, mask = 0001. Instruction Memory Accesses (per i
|
||
|
nstruction retiring).
|
||
|
Instruction Cache Hit
|
||
|
Value = 0, select = 8, mask = 0002. Instruction Memory Accesses (per i
|
||
|
nstruction retiring).
|
||
|
Instruction Cache Miss
|
||
|
Value = 420, select = 8, mask = 0004. Instruction Memory Accesses (per i
|
||
|
nstruction retiring).
|
||
|
All InstRAM or InstROM accesses
|
||
|
Value = 0, select = 8, mask = 0008. Instruction Memory Accesses (per i
|
||
|
nstruction retiring).
|
||
|
Bypass (i.e. uncached) fetch
|
||
|
Value = 3, select = 9, mask = 0001. Data TLB Accesses.
|
||
|
DTLB Hit
|
||
|
Value = 0, select = 9, mask = 0002. Data TLB Accesses.
|
||
|
Replay of load/store due to DTLB miss
|
||
|
Value = 0, select = 9, mask = 0004. Data TLB Accesses.
|
||
|
HW-assisted TLB Refill completes
|
||
|
Value = 0, select = 9, mask = 0008. Data TLB Accesses.
|
||
|
DTLB Miss Exception
|
||
|
Value = 0, select = 10, mask = 0001. Load Instruction (Data Memory).
|
||
|
Data Cache Hit(unused)
|
||
|
Value = 0, select = 10, mask = 0002. Load Instruction (Data Memory).
|
||
|
Data Cache Miss(unused)
|
||
|
Value = 3, select = 10, mask = 0004. Load Instruction (Data Memory).
|
||
|
Load from local memory i.e. DataRAM, DataROM, InstRAM, InstROM
|
||
|
|
||
|
Value = 0, select = 10, mask = 0008. Load Instruction (Data Memory).
|
||
|
Bypass (i.e. uncached) load
|
||
|
Value = 0, select = 13, mask = 0001. Load Instruction (Data Memory).
|
||
|
Data Cache Hit(unused)
|
||
|
Value = 0, select = 13, mask = 0002. Load Instruction (Data Memory).
|
||
|
Data Cache Miss(unused)
|
||
|
Value = 0, select = 13, mask = 0004. Load Instruction (Data Memory).
|
||
|
Load from local memory i.e. DataRAM, DataROM, InstRAM, InstROM
|
||
|
|
||
|
Value = 0, select = 13, mask = 0008. Load Instruction (Data Memory).
|
||
|
Bypass (i.e. uncached) load
|
||
|
Value = 0, select = 16, mask = 0001. Load Instruction (Data Memory).
|
||
|
Data Cache Hit (unused)
|
||
|
Value = 0, select = 16, mask = 0002. Load Instruction (Data Memory).
|
||
|
Data Cache Miss (unused)
|
||
|
Value = 0, select = 16, mask = 0004. Load Instruction (Data Memory).
|
||
|
Load from local memory i.e. DataRAM, DataROM, InstRAM, InstROM
|
||
|
|
||
|
Value = 0, select = 16, mask = 0008. Load Instruction (Data Memory).
|
||
|
Bypass (i.e. uncached) load
|
||
|
Value = 0, select = 11, mask = 0001. Store Instruction (Data Memory).
|
||
|
Data Cache Hit (unused)
|
||
|
Value = 0, select = 11, mask = 0002. Store Instruction (Data Memory).
|
||
|
Data Cache Miss (unused)
|
||
|
Value = 0, select = 11, mask = 0004. Store Instruction (Data Memory).
|
||
|
Store to local memory i.e. DataRAM, InstRAM
|
||
|
Value = 0, select = 11, mask = 0008. Store Instruction (Data Memory).
|
||
|
PIF Store
|
||
|
Value = 0, select = 14, mask = 0001. Store Instruction (Data Memory).
|
||
|
Data Cache Hit(unused)
|
||
|
Value = 0, select = 14, mask = 0002. Store Instruction (Data Memory).
|
||
|
Data Cache Miss(unused)
|
||
|
Value = 0, select = 14, mask = 0004. Store Instruction (Data Memory).
|
||
|
Store to local memory i.e. DataRAM, InstRAM
|
||
|
Value = 0, select = 14, mask = 0008. Store Instruction (Data Memory).
|
||
|
PIF Store
|
||
|
Value = 0, select = 17, mask = 0001. Store Instruction (Data Memory).
|
||
|
Data Cache Hit (unused)
|
||
|
Value = 0, select = 17, mask = 0002. Store Instruction (Data Memory).
|
||
|
Data Cache Miss (unused)
|
||
|
Value = 0, select = 17, mask = 0004. Store Instruction (Data Memory).
|
||
|
Store to local memory i.e. DataRAM, InstRAM
|
||
|
Value = 0, select = 17, mask = 0008. Store Instruction (Data Memory).
|
||
|
PIF Store
|
||
|
Value = 0, select = 12, mask = 0001. Accesses to Data Memory (Load, Sto
|
||
|
re, S32C1I, ...).
|
||
|
Cache Miss
|
||
|
Value = 0, select = 15, mask = 0001. Accesses to Data Memory (Load, Sto
|
||
|
re, S32C1I, ...).
|
||
|
Cache Miss
|
||
|
Value = 0, select = 18, mask = 0001. Accesses to Data Memory (Load, Sto
|
||
|
re, S32C1I, ...).
|
||
|
Cache Miss
|
||
|
Value = 415, select = 22, mask = 0001. Multiple Load/Store.
|
||
|
0 stores and 0 loads
|
||
|
Value = 3, select = 22, mask = 0002. Multiple Load/Store.
|
||
|
0 stores and 1 loads
|
||
|
Value = 0, select = 22, mask = 0004. Multiple Load/Store.
|
||
|
1 stores and 0 loads
|
||
|
Value = 0, select = 22, mask = 0008. Multiple Load/Store.
|
||
|
1 stores and 1 loads
|
||
|
Value = 0, select = 22, mask = 0010. Multiple Load/Store.
|
||
|
0 stores and 2 loads
|
||
|
Value = 0, select = 22, mask = 0020. Multiple Load/Store.
|
||
|
2 stores and 0 loads
|
||
|
Value = 0, select = 23, mask = 0001. Outbound PIF.
|
||
|
Castout
|
||
|
Value = 0, select = 23, mask = 0002. Outbound PIF.
|
||
|
Prefetch
|
||
|
Value = 0, select = 24, mask = 0001. Inbound PIF.
|
||
|
Data DMA
|
||
|
Value = 0, select = 24, mask = 0002. Inbound PIF.
|
||
|
Instruction DMA
|
||
|
Value = 0, select = 26, mask = 0001. Prefetch.
|
||
|
I prefetch-buffer-lookup hit
|
||
|
Value = 0, select = 26, mask = 0002. Prefetch.
|
||
|
D prefetch-buffer-lookup hit
|
||
|
Value = 0, select = 26, mask = 0004. Prefetch.
|
||
|
I prefetch-buffer-lookup miss
|
||
|
Value = 0, select = 26, mask = 0008. Prefetch.
|
||
|
D prefetch-buffer-lookup miss
|
||
|
Value = 0, select = 26, mask = 0020. Prefetch.
|
||
|
Direct fill to (L1) Data Cache (unused)
|
||
|
Value = 0, select = 27, mask = 0001. iDMA.
|
||
|
active cycles
|
||
|
Value = 0, select = 28, mask = 0001. Length of Instructions.
|
||
|
16-bit
|
||
|
Value = 0, select = 28, mask = 0002. Length of Instructions.
|
||
|
24-bit
|
||
|
Value = 0, select = 28, mask = 0004. Length of Instructions.
|
||
|
32-bit
|
||
|
Value = 0, select = 28, mask = 0008. Length of Instructions.
|
||
|
40-bit
|
||
|
Value = 0, select = 28, mask = 0010. Length of Instructions.
|
||
|
48-bit
|
||
|
Value = 0, select = 28, mask = 0020. Length of Instructions.
|
||
|
56-bit
|
||
|
Value = 0, select = 28, mask = 0040. Length of Instructions.
|
||
|
64-bit
|
||
|
Value = 0, select = 28, mask = 0080. Length of Instructions.
|
||
|
72-bit
|
||
|
Value = 0, select = 28, mask = 0100. Length of Instructions.
|
||
|
80-bit
|
||
|
Value = 0, select = 28, mask = 0200. Length of Instructions.
|
||
|
88-bit
|
||
|
Value = 0, select = 28, mask = 0400. Length of Instructions.
|
||
|
96-bit
|
||
|
Value = 0, select = 28, mask = 0800. Length of Instructions.
|
||
|
104-bit
|
||
|
Value = 0, select = 28, mask = 1000. Length of Instructions.
|
||
|
112-bit
|
||
|
Value = 0, select = 28, mask = 2000. Length of Instructions.
|
||
|
120-bit
|
||
|
Value = 0, select = 28, mask = 4000. Length of Instructions.
|
||
|
128-bit
|
||
|
```
|
||
|
|
||
|
2. Example calls second test.
|
||
|
|
||
|
```
|
||
|
I (1588) example: Start test with user defined statistic
|
||
|
Value = 743, select = 0, mask = 0001. Counts cycles.
|
||
|
Amount of cycles
|
||
|
Value = 417, select = 2, mask = 8dff. Successfully Retired Instructions.
|
||
|
|
||
|
JX instructions
|
||
|
CALLXn instructions
|
||
|
return instructions (RET, RETW, ...)
|
||
|
supervisor return instructions (RFDE, RFE, RFI, RFWO, RFWU)
|
||
|
Conditional branch instructions where execution
|
||
|
transfers to the target (aka. taken branch),
|
||
|
or loopgtz/loopnez instr where execution skips
|
||
|
the loop (aka. not-taken loop)
|
||
|
J instr
|
||
|
CALLn instr
|
||
|
Conditional branch instr where execution
|
||
|
falls through (aka. not-taken branch)
|
||
|
Loop instr where execution falls into loop (aka. taken loop)
|
||
|
Last inst of loop and execution transfers
|
||
|
to LBEG (aka. loopback taken)
|
||
|
Last inst of loop and execution falls
|
||
|
through to LEND (aka. loopback fallthrough)
|
||
|
Non-branch instr (aka. non-CTI)
|
||
|
Value = 3, select = 10, mask = 0004. Load Instruction (Data Memory).
|
||
|
Load from local memory i.e. DataRAM, DataROM, InstRAM, InstROM
|
||
|
|
||
|
Value = 0, select = 11, mask = 0004. Store Instruction (Data Memory).
|
||
|
Store to local memory i.e. DataRAM, InstRAM
|
||
|
Value = 321, select = 6, mask = 01ed. Hold and Other Bubble cycles.
|
||
|
Processor domain PSO bubble
|
||
|
R hold caused by Data Cache miss(unused)
|
||
|
R hold caused by Store release
|
||
|
R hold caused by MEMW, EXTW or EXCW
|
||
|
R hold caused by Halt instruction (TX only)
|
||
|
CTI bubble (e.g. branch delay slot)
|
||
|
WAITI bubble i.e. a cycle spent in WaitI power down mode.
|
||
|
Value = 0, select = 6, mask = 0010. Hold and Other Bubble cycles.
|
||
|
R hold caused by register dependency
|
||
|
Value = 0, select = 1, mask = 0001. Overflow of counter.
|
||
|
Overflow counter
|
||
|
I (1788) example: The End
|
||
|
|
||
|
```
|