[GIT PULL] perf tools changes for v6.19
Perf event/metric description
-----------------------------
Unify all event and metric descriptions in JSON format.
This greatly simplifies event parsing and handling.
From the user's point of view, perf list now provides richer
information about hardware events, like the following.
$ perf list hw
List of pre-defined events (to be used in -e or -M):
legacy hardware:
branch-instructions
[Retired branch instructions [This event is an alias of branches]. Unit: cpu]
branch-misses
[Mispredicted branch instructions. Unit: cpu]
branches
[Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
bus-cycles
[Bus cycles, which can be different from total cycles. Unit: cpu]
cache-misses
[Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
cache-references
[Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
cpu-cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
instructions
[Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts. Unit: cpu]
ref-cycles
[Total cycles; not affected by CPU frequency scaling. Unit: cpu]
But the most notable changes are in perf stat. On the right side,
the default metrics are better named and aligned. :)
$ perf stat -- perf test -w noploop
Performance counter stats for 'perf test -w noploop':
11 context-switches # 10.8 cs/sec cs_per_second
0 cpu-migrations # 0.0 migrations/sec migrations_per_second
3,612 page-faults # 3532.5 faults/sec page_faults_per_second
1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized
110,466 branch-misses # 0.0 % branch_miss_rate (88.66%)
6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%)
4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%)
27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%)
TopdownL1 # 0.3 % tma_backend_bound
# 9.3 % tma_bad_speculation (89.05%)
# 9.7 % tma_frontend_bound (77.86%)
# 80.7 % tma_retiring (88.81%)
1.025318171 seconds time elapsed
1.013248000 seconds user
0.012014000 seconds sys
Deferred unwinding support
--------------------------
With kernel support [1], perf can use deferred callchains for
userspace stack traces with frame pointers, like below:
$ perf record --call-graph fp,defer ...
This is transparent to users of other commands like perf report and
perf script, which merge the deferred callchains into the preceding
samples as if they had been collected together.
[1] https://git.kernel.org/torvalds/c/c69993ecdd4dfde2b7da08b022052a33b203da07
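A minimal end-to-end sketch (the workload name is illustrative; the
kernel commit above must be present for the defer mode to work):
$ perf record --call-graph fp,defer -- ./myapp
$ perf report                         # deferred user callchains are merged transparently
$ perf script --no-merge-callchains   # keep the separate CALLCHAIN_DEFERRED records instead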
ARM SPE updates
---------------
* Extensive enhancements to support various kinds of memory operations
including GCS, MTE allocation tags, memcpy/memset, register access,
and SIMD operations.
* Add inverted data source filter (inv_data_src_filter) support to
  exclude certain data sources (see the example after this list).
* Improve documentation.
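As a rough sketch of the new data source filter (the mask value and
workload name are illustrative, and the filter requires FEAT_SPE_FDS;
set bits in the mask exclude the corresponding data source IDs):
$ perf record -e arm_spe/load_filter=1,inv_data_src_filter=0x2/ -- ./mybench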
Vendor event updates
--------------------
* Intel: Updated event files for Sierra Forest, Panther Lake, Meteor Lake,
Lunar Lake, Granite Rapids, and others.
* Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE definitions.
* RISC-V: Added JSON support for T-HEAD C920V2.
Misc
----
* Improve pointer tracking in data type profiling. This gives better
  output when a variable uses container_of() to convert types.
* Annotation support for perf c2c report in the TUI. Press the 'a' key
  in the cacheline browser window to enter the annotation view, which
  shows the instruction causing the cacheline contention (see the
  example after this list).
* Lots of fixes and test coverage improvements!
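A minimal sketch of the new c2c annotation flow (the workload name is
illustrative):
$ perf c2c record -- ./myapp
$ perf c2c report    # select a cacheline entry, then press 'a' to open the annotation view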
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCaTUiWgAKCRCMstVUGiXM
gzO3AQCaPM1/xAOtZ3Z21QEBrP+A0yFhmWMkI54IqZLsFl6qzQD/fvuorMblR+9W
Nlr0Yyyo3zWnl2CD6s6AraIcLR5gVQs=
=mjYC
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"Perf event/metric description:
Unify all event and metric descriptions in JSON format. This greatly
simplifies event parsing and handling.
From the user's point of view, perf list now provides richer
information about hardware events, like the following.
$ perf list hw
List of pre-defined events (to be used in -e or -M):
legacy hardware:
branch-instructions
[Retired branch instructions [This event is an alias of branches]. Unit: cpu]
branch-misses
[Mispredicted branch instructions. Unit: cpu]
branches
[Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
bus-cycles
[Bus cycles, which can be different from total cycles. Unit: cpu]
cache-misses
[Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
cache-references
[Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
cpu-cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
instructions
[Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts. Unit: cpu]
ref-cycles
[Total cycles; not affected by CPU frequency scaling. Unit: cpu]
But the most notable changes are in perf stat. On the right side,
the default metrics are better named and aligned. :)
$ perf stat -- perf test -w noploop
Performance counter stats for 'perf test -w noploop':
11 context-switches # 10.8 cs/sec cs_per_second
0 cpu-migrations # 0.0 migrations/sec migrations_per_second
3,612 page-faults # 3532.5 faults/sec page_faults_per_second
1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized
110,466 branch-misses # 0.0 % branch_miss_rate (88.66%)
6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%)
4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%)
27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%)
TopdownL1 # 0.3 % tma_backend_bound
# 9.3 % tma_bad_speculation (89.05%)
# 9.7 % tma_frontend_bound (77.86%)
# 80.7 % tma_retiring (88.81%)
1.025318171 seconds time elapsed
1.013248000 seconds user
0.012014000 seconds sys
Deferred unwinding support:
With kernel support (commit c69993ecdd4d: "perf: Support deferred
user unwind"), perf can use deferred callchains for userspace stack
traces with frame pointers, like below:
$ perf record --call-graph fp,defer ...
This is transparent to users of other commands like perf report and
perf script, which merge the deferred callchains into the preceding
samples as if they had been collected together.
ARM SPE updates:
- Extensive enhancements to support various kinds of memory
operations including GCS, MTE allocation tags, memcpy/memset,
register access, and SIMD operations.
- Add inverted data source filter (inv_data_src_filter) support to
exclude certain data sources.
- Improve documentation.
Vendor event updates:
- Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
Lake, Lunar Lake, Granite Rapids, and others.
- Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
definitions.
- RISC-V: Added JSON support for T-HEAD C920V2.
Misc:
- Improve pointer tracking in data type profiling. This gives better
output when a variable uses container_of() to convert types.
- Annotation support for perf c2c report in the TUI. Press the 'a' key
in the cacheline browser window to enter the annotation view, which
shows the instruction causing the cacheline contention.
- Lots of fixes and test coverage improvements!"
* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
libperf: Use 'extern' in LIBPERF_API visibility macro
perf stat: Improve handling of termination by signal
perf tests stat: Add test for error for an offline CPU
perf stat: When no events, don't report an error if there is none
perf tests stat: Add "--null" coverage
perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
perf stat: Allow no events to open if this is a "--null" run
perf test kvm: Add some basic perf kvm test coverage
perf tests evlist: Add basic evlist test
perf tests script dlfilter: Add a dlfilter test
perf tests kallsyms: Add basic kallsyms test
perf tests timechart: Add a perf timechart test
perf tests top: Add basic perf top coverage test
perf tests buildid: Add purge and remove testing
perf tests c2c: Add a basic c2c
perf c2c: Clean up some defensive gets and make asan clean
perf jitdump: Fix missed dso__put
perf mem-events: Don't leak online CPU map
perf hist: In init, ensure mem_info is put on error paths
...
if (he->mem_info)
|
||||
al_addr = mem_info__iaddr(he->mem_info)->al_addr;
|
||||
|
||||
c2c_he = container_of(he, struct c2c_hist_entry, he);
|
||||
return hist_entry__tui_annotate(he, c2c_he->evsel, NULL, al_addr);
|
||||
}
|
||||
|
||||
static void c2c_browser__update_nr_entries(struct hist_browser *hb)
|
||||
{
|
||||
u64 nr_entries = 0;
|
||||
|
|
@ -2617,6 +2694,7 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
|
|||
" ENTER Toggle callchains (if present) \n"
|
||||
" n Toggle Node details info \n"
|
||||
" s Toggle full length of symbol and source line columns \n"
|
||||
" a Toggle annotation view \n"
|
||||
" q Return back to cacheline list \n";
|
||||
|
||||
if (!he)
|
||||
|
|
@ -2651,6 +2729,9 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
|
|||
c2c.node_info = (c2c.node_info + 1) % 3;
|
||||
setup_nodes_header();
|
||||
break;
|
||||
case 'a':
|
||||
perf_c2c__toggle_annotation(browser);
|
||||
break;
|
||||
case 'q':
|
||||
goto out;
|
||||
case '?':
|
||||
|
|
@ -3006,6 +3087,7 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
const char *display = NULL;
|
||||
const char *coalesce = NULL;
|
||||
bool no_source = false;
|
||||
const char *disassembler_style = NULL, *objdump_path = NULL;
|
||||
const struct option options[] = {
|
||||
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
|
||||
"file", "vmlinux pathname"),
|
||||
|
|
@ -3033,6 +3115,10 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
OPT_BOOLEAN(0, "stitch-lbr", &c2c.stitch_lbr,
|
||||
"Enable LBR callgraph stitching approach"),
|
||||
OPT_BOOLEAN(0, "double-cl", &chk_double_cl, "Detect adjacent cacheline false sharing"),
|
||||
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
|
||||
"Specify disassembler style (e.g. -M intel for intel syntax)"),
|
||||
OPT_STRING(0, "objdump", &objdump_path, "path",
|
||||
"objdump binary to use for disassembly and annotations"),
|
||||
OPT_PARENT(c2c_options),
|
||||
OPT_END()
|
||||
};
|
||||
|
|
@ -3040,6 +3126,12 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
const char *output_str, *sort_str = NULL;
|
||||
struct perf_env *env;
|
||||
|
||||
annotation_options__init();
|
||||
|
||||
err = hists__init();
|
||||
if (err < 0)
|
||||
goto out;
|
||||
|
||||
argc = parse_options(argc, argv, options, report_c2c_usage,
|
||||
PARSE_OPT_STOP_AT_NON_OPTION);
|
||||
if (argc)
|
||||
|
|
@ -3052,6 +3144,27 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
if (c2c.stats_only)
|
||||
c2c.use_stdio = true;
|
||||
|
||||
/**
|
||||
* Annotation related options disassembler_style, objdump_path are set
|
||||
* in the c2c_options, so we can use them here.
|
||||
*/
|
||||
if (disassembler_style) {
|
||||
annotate_opts.disassembler_style = strdup(disassembler_style);
|
||||
if (!annotate_opts.disassembler_style) {
|
||||
err = -ENOMEM;
|
||||
pr_err("Failed to allocate memory for annotation options\n");
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
if (objdump_path) {
|
||||
annotate_opts.objdump_path = strdup(objdump_path);
|
||||
if (!annotate_opts.objdump_path) {
|
||||
err = -ENOMEM;
|
||||
pr_err("Failed to allocate memory for annotation options\n");
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
err = symbol__validate_sym_arguments();
|
||||
if (err)
|
||||
goto out;
|
||||
|
|
@ -3126,6 +3239,38 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
if (err)
|
||||
goto out_mem2node;
|
||||
|
||||
if (c2c.use_stdio)
|
||||
use_browser = 0;
|
||||
else
|
||||
use_browser = 1;
|
||||
|
||||
/*
|
||||
* Only in the TUI browser we are doing integrated annotation,
|
||||
* so don't allocate extra space that won't be used in the stdio
|
||||
* implementation.
|
||||
*/
|
||||
if (perf_c2c__has_annotation(NULL)) {
|
||||
int ret = symbol__annotation_init();
|
||||
|
||||
if (ret < 0)
|
||||
goto out_mem2node;
|
||||
/*
|
||||
* For searching by name on the "Browse map details".
|
||||
* providing it only in verbose mode not to bloat too
|
||||
* much struct symbol.
|
||||
*/
|
||||
if (verbose > 0) {
|
||||
/*
|
||||
* XXX: Need to provide a less kludgy way to ask for
|
||||
* more space per symbol, the u32 is for the index on
|
||||
* the ui browser.
|
||||
* See symbol__browser_index.
|
||||
*/
|
||||
symbol_conf.priv_size += sizeof(u32);
|
||||
}
|
||||
annotation_config__init();
|
||||
}
|
||||
|
||||
if (symbol__init(env) < 0)
|
||||
goto out_mem2node;
|
||||
|
||||
|
|
@ -3135,11 +3280,6 @@ static int perf_c2c__report(int argc, const char **argv)
|
|||
goto out_mem2node;
|
||||
}
|
||||
|
||||
if (c2c.use_stdio)
|
||||
use_browser = 0;
|
||||
else
|
||||
use_browser = 1;
|
||||
|
||||
setup_browser(false);
|
||||
|
||||
err = perf_session__process_events(session);
|
||||
|
|
@ -3210,6 +3350,7 @@ out_mem2node:
|
|||
out_session:
|
||||
perf_session__delete(session);
|
||||
out:
|
||||
annotation_options__exit();
|
||||
return err;
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -42,7 +42,6 @@ struct feature_status supported_features[] = {
|
|||
FEATURE_STATUS("dwarf", HAVE_LIBDW_SUPPORT),
|
||||
FEATURE_STATUS("dwarf_getlocations", HAVE_LIBDW_SUPPORT),
|
||||
FEATURE_STATUS("dwarf-unwind", HAVE_DWARF_UNWIND_SUPPORT),
|
||||
FEATURE_STATUS("auxtrace", HAVE_AUXTRACE_SUPPORT),
|
||||
FEATURE_STATUS_TIP("libbfd", HAVE_LIBBFD_SUPPORT, "Deprecated, license incompatibility, use BUILD_NONDISTRO=1 and install binutils-dev[el]"),
|
||||
FEATURE_STATUS("libbpf-strings", HAVE_LIBBPF_STRINGS_SUPPORT),
|
||||
FEATURE_STATUS("libcapstone", HAVE_LIBCAPSTONE_SUPPORT),
|
||||
|
|
|
|||
|
|
@ -19,7 +19,8 @@
|
|||
#include "util/tool.h"
|
||||
#include "util/util.h"
|
||||
|
||||
static int process_header_feature(struct perf_session *session __maybe_unused,
|
||||
static int process_header_feature(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event __maybe_unused)
|
||||
{
|
||||
session_done = 1;
|
||||
|
|
|
|||
|
|
@ -197,18 +197,20 @@ static int perf_event__drop_oe(const struct perf_tool *tool __maybe_unused,
|
|||
}
|
||||
#endif
|
||||
|
||||
static int perf_event__repipe_op2_synth(struct perf_session *session,
|
||||
static int perf_event__repipe_op2_synth(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
return perf_event__repipe_synth(session->tool, event);
|
||||
return perf_event__repipe_synth(tool, event);
|
||||
}
|
||||
|
||||
static int perf_event__repipe_op4_synth(struct perf_session *session,
|
||||
static int perf_event__repipe_op4_synth(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event,
|
||||
u64 data __maybe_unused,
|
||||
const char *str __maybe_unused)
|
||||
{
|
||||
return perf_event__repipe_synth(session->tool, event);
|
||||
return perf_event__repipe_synth(tool, event);
|
||||
}
|
||||
|
||||
static int perf_event__repipe_attr(const struct perf_tool *tool,
|
||||
|
|
@ -237,8 +239,6 @@ static int perf_event__repipe_event_update(const struct perf_tool *tool,
|
|||
return perf_event__repipe_synth(tool, event);
|
||||
}
|
||||
|
||||
#ifdef HAVE_AUXTRACE_SUPPORT
|
||||
|
||||
static int copy_bytes(struct perf_inject *inject, struct perf_data *data, off_t size)
|
||||
{
|
||||
char buf[4096];
|
||||
|
|
@ -258,12 +258,11 @@ static int copy_bytes(struct perf_inject *inject, struct perf_data *data, off_t
|
|||
return 0;
|
||||
}
|
||||
|
||||
static s64 perf_event__repipe_auxtrace(struct perf_session *session,
|
||||
static s64 perf_event__repipe_auxtrace(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_inject *inject = container_of(tool, struct perf_inject,
|
||||
tool);
|
||||
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
|
||||
int ret;
|
||||
|
||||
inject->have_auxtrace = true;
|
||||
|
|
@ -296,18 +295,6 @@ static s64 perf_event__repipe_auxtrace(struct perf_session *session,
|
|||
return event->auxtrace.size;
|
||||
}
|
||||
|
||||
#else
|
||||
|
||||
static s64
|
||||
perf_event__repipe_auxtrace(struct perf_session *session __maybe_unused,
|
||||
union perf_event *event __maybe_unused)
|
||||
{
|
||||
pr_err("AUX area tracing not supported\n");
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
static int perf_event__repipe(const struct perf_tool *tool,
|
||||
union perf_event *event,
|
||||
struct perf_sample *sample __maybe_unused,
|
||||
|
|
@ -661,12 +648,13 @@ static int perf_event__repipe_exit(const struct perf_tool *tool,
|
|||
}
|
||||
|
||||
#ifdef HAVE_LIBTRACEEVENT
|
||||
static int perf_event__repipe_tracing_data(struct perf_session *session,
|
||||
static int perf_event__repipe_tracing_data(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
perf_event__repipe_synth(session->tool, event);
|
||||
perf_event__repipe_synth(tool, event);
|
||||
|
||||
return perf_event__process_tracing_data(session, event);
|
||||
return perf_event__process_tracing_data(tool, session, event);
|
||||
}
|
||||
#endif
|
||||
|
||||
|
|
@ -680,12 +668,12 @@ static int dso__read_build_id(struct dso *dso)
|
|||
|
||||
mutex_lock(dso__lock(dso));
|
||||
nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
|
||||
if (filename__read_build_id(dso__long_name(dso), &bid, /*block=*/true) > 0)
|
||||
if (filename__read_build_id(dso__long_name(dso), &bid) > 0)
|
||||
dso__set_build_id(dso, &bid);
|
||||
else if (dso__nsinfo(dso)) {
|
||||
char *new_name = dso__filename_with_chroot(dso, dso__long_name(dso));
|
||||
|
||||
if (new_name && filename__read_build_id(new_name, &bid, /*block=*/true) > 0)
|
||||
if (new_name && filename__read_build_id(new_name, &bid) > 0)
|
||||
dso__set_build_id(dso, &bid);
|
||||
free(new_name);
|
||||
}
|
||||
|
|
@ -1348,7 +1336,7 @@ static int process_build_id(const struct perf_tool *tool,
|
|||
{
|
||||
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
|
||||
|
||||
return perf_event__process_build_id(inject->session, event);
|
||||
return perf_event__process_build_id(tool, inject->session, event);
|
||||
}
|
||||
|
||||
static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_t machine_pid)
|
||||
|
|
@ -1780,9 +1768,10 @@ static int host__repipe(const struct perf_tool *tool,
|
|||
return perf_event__repipe(tool, event, sample, machine);
|
||||
}
|
||||
|
||||
static int host__finished_init(struct perf_session *session, union perf_event *event)
|
||||
static int host__finished_init(const struct perf_tool *tool, struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
struct perf_inject *inject = container_of(session->tool, struct perf_inject, tool);
|
||||
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
|
||||
struct guest_session *gs = &inject->guest_session;
|
||||
int ret;
|
||||
|
||||
|
|
@ -1829,7 +1818,7 @@ static int host__finished_init(struct perf_session *session, union perf_event *e
|
|||
if (ret)
|
||||
return ret;
|
||||
|
||||
return perf_event__repipe_op2_synth(session, event);
|
||||
return perf_event__repipe_op2_synth(tool, session, event);
|
||||
}
|
||||
|
||||
/*
|
||||
|
|
@ -2538,6 +2527,7 @@ int cmd_inject(int argc, const char **argv)
|
|||
inject.tool.auxtrace = perf_event__repipe_auxtrace;
|
||||
inject.tool.bpf_metadata = perf_event__repipe_op2_synth;
|
||||
inject.tool.dont_split_sample_group = true;
|
||||
inject.tool.merge_deferred_callchains = false;
|
||||
inject.session = __perf_session__new(&data, &inject.tool,
|
||||
/*trace_event_repipe=*/inject.output.is_pipe,
|
||||
/*host_env=*/NULL);
|
||||
|
|
|
|||
|
|
@ -2014,7 +2014,7 @@ static int __cmd_record(const char *file_name, int argc, const char **argv)
|
|||
for (j = 1; j < argc; j++, i++)
|
||||
rec_argv[i] = STRDUP_FAIL_EXIT(argv[j]);
|
||||
|
||||
BUG_ON(i != rec_argc);
|
||||
BUG_ON(i + 2 != rec_argc);
|
||||
|
||||
ret = kvm_add_default_arch_event(&i, rec_argv);
|
||||
if (ret)
|
||||
|
|
|
|||
|
|
@ -130,7 +130,7 @@ static void default_print_event(void *ps, const char *topic,
|
|||
if (deprecated && !print_state->deprecated)
|
||||
return;
|
||||
|
||||
if (print_state->pmu_glob && pmu_name && !strglobmatch(pmu_name, print_state->pmu_glob))
|
||||
if (print_state->pmu_glob && (!pmu_name || !strglobmatch(pmu_name, print_state->pmu_glob)))
|
||||
return;
|
||||
|
||||
if (print_state->exclude_abi && pmu_type < PERF_TYPE_MAX && pmu_type != PERF_TYPE_RAW)
|
||||
|
|
@ -283,8 +283,8 @@ static void default_print_metric(void *ps,
|
|||
}
|
||||
|
||||
struct json_print_state {
|
||||
/** @fp: File to write output to. */
|
||||
FILE *fp;
|
||||
/** The shared print_state */
|
||||
struct print_state common;
|
||||
/** Should a separator be printed prior to the next item? */
|
||||
bool need_sep;
|
||||
};
|
||||
|
|
@ -292,7 +292,7 @@ struct json_print_state {
|
|||
static void json_print_start(void *ps)
|
||||
{
|
||||
struct json_print_state *print_state = ps;
|
||||
FILE *fp = print_state->fp;
|
||||
FILE *fp = print_state->common.fp;
|
||||
|
||||
fprintf(fp, "[\n");
|
||||
}
|
||||
|
|
@ -300,7 +300,7 @@ static void json_print_start(void *ps)
|
|||
static void json_print_end(void *ps)
|
||||
{
|
||||
struct json_print_state *print_state = ps;
|
||||
FILE *fp = print_state->fp;
|
||||
FILE *fp = print_state->common.fp;
|
||||
|
||||
fprintf(fp, "%s]\n", print_state->need_sep ? "\n" : "");
|
||||
}
|
||||
|
|
@ -370,9 +370,26 @@ static void json_print_event(void *ps, const char *topic,
|
|||
{
|
||||
struct json_print_state *print_state = ps;
|
||||
bool need_sep = false;
|
||||
FILE *fp = print_state->fp;
|
||||
FILE *fp = print_state->common.fp;
|
||||
struct strbuf buf;
|
||||
|
||||
if (deprecated && !print_state->common.deprecated)
|
||||
return;
|
||||
|
||||
if (print_state->common.pmu_glob &&
|
||||
(!pmu_name || !strglobmatch(pmu_name, print_state->common.pmu_glob)))
|
||||
return;
|
||||
|
||||
if (print_state->common.exclude_abi && pmu_type < PERF_TYPE_MAX &&
|
||||
pmu_type != PERF_TYPE_RAW)
|
||||
return;
|
||||
|
||||
if (print_state->common.event_glob &&
|
||||
(!event_name || !strglobmatch(event_name, print_state->common.event_glob)) &&
|
||||
(!event_alias || !strglobmatch(event_alias, print_state->common.event_glob)) &&
|
||||
(!topic || !strglobmatch_nocase(topic, print_state->common.event_glob)))
|
||||
return;
|
||||
|
||||
strbuf_init(&buf, 0);
|
||||
fprintf(fp, "%s{\n", print_state->need_sep ? ",\n" : "");
|
||||
print_state->need_sep = true;
|
||||
|
|
@ -446,9 +463,16 @@ static void json_print_metric(void *ps __maybe_unused, const char *group,
|
|||
{
|
||||
struct json_print_state *print_state = ps;
|
||||
bool need_sep = false;
|
||||
FILE *fp = print_state->fp;
|
||||
FILE *fp = print_state->common.fp;
|
||||
struct strbuf buf;
|
||||
|
||||
if (print_state->common.event_glob &&
|
||||
(!print_state->common.metrics || !name ||
|
||||
!strglobmatch(name, print_state->common.event_glob)) &&
|
||||
(!print_state->common.metricgroups || !group ||
|
||||
!strglobmatch(group, print_state->common.event_glob)))
|
||||
return;
|
||||
|
||||
strbuf_init(&buf, 0);
|
||||
fprintf(fp, "%s{\n", print_state->need_sep ? ",\n" : "");
|
||||
print_state->need_sep = true;
|
||||
|
|
@ -521,10 +545,12 @@ int cmd_list(int argc, const char **argv)
|
|||
.fp = stdout,
|
||||
.desc = true,
|
||||
};
|
||||
struct print_state json_ps = {
|
||||
.fp = stdout,
|
||||
struct json_print_state json_ps = {
|
||||
.common = {
|
||||
.fp = stdout,
|
||||
},
|
||||
};
|
||||
void *ps = &default_ps;
|
||||
struct print_state *ps = &default_ps;
|
||||
struct print_callbacks print_cb = {
|
||||
.print_start = default_print_start,
|
||||
.print_end = default_print_end,
|
||||
|
|
@ -572,9 +598,11 @@ int cmd_list(int argc, const char **argv)
|
|||
argc = parse_options(argc, argv, list_options, list_usage,
|
||||
PARSE_OPT_STOP_AT_NON_OPTION);
|
||||
|
||||
if (json)
|
||||
ps = &json_ps.common;
|
||||
|
||||
if (output_path) {
|
||||
default_ps.fp = fopen(output_path, "w");
|
||||
json_ps.fp = default_ps.fp;
|
||||
ps->fp = fopen(output_path, "w");
|
||||
}
|
||||
|
||||
setup_pager();
|
||||
|
|
@ -590,14 +618,13 @@ int cmd_list(int argc, const char **argv)
|
|||
.print_metric = json_print_metric,
|
||||
.skip_duplicate_pmus = json_skip_duplicate_pmus,
|
||||
};
|
||||
ps = &json_ps;
|
||||
} else {
|
||||
default_ps.last_topic = strdup("");
|
||||
assert(default_ps.last_topic);
|
||||
default_ps.visited_metrics = strlist__new(NULL, NULL);
|
||||
assert(default_ps.visited_metrics);
|
||||
ps->last_topic = strdup("");
|
||||
assert(ps->last_topic);
|
||||
ps->visited_metrics = strlist__new(NULL, NULL);
|
||||
assert(ps->visited_metrics);
|
||||
if (unit_name)
|
||||
default_ps.pmu_glob = strdup(unit_name);
|
||||
ps->pmu_glob = strdup(unit_name);
|
||||
else if (cputype) {
|
||||
const struct perf_pmu *pmu = perf_pmus__pmu_for_pmu_filter(cputype);
|
||||
|
||||
|
|
@ -606,14 +633,16 @@ int cmd_list(int argc, const char **argv)
|
|||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
default_ps.pmu_glob = strdup(pmu->name);
|
||||
ps->pmu_glob = strdup(pmu->name);
|
||||
}
|
||||
}
|
||||
print_cb.print_start(ps);
|
||||
|
||||
if (argc == 0) {
|
||||
default_ps.metrics = true;
|
||||
default_ps.metricgroups = true;
|
||||
if (!unit_name) {
|
||||
ps->metrics = true;
|
||||
ps->metricgroups = true;
|
||||
}
|
||||
print_events(&print_cb, ps);
|
||||
goto out;
|
||||
}
|
||||
|
|
@ -633,41 +662,58 @@ int cmd_list(int argc, const char **argv)
|
|||
zfree(&default_ps.pmu_glob);
|
||||
default_ps.pmu_glob = old_pmu_glob;
|
||||
} else if (strcmp(argv[i], "hw") == 0 ||
|
||||
strcmp(argv[i], "hardware") == 0)
|
||||
print_symbol_events(&print_cb, ps, PERF_TYPE_HARDWARE,
|
||||
event_symbols_hw, PERF_COUNT_HW_MAX);
|
||||
else if (strcmp(argv[i], "sw") == 0 ||
|
||||
strcmp(argv[i], "hardware") == 0) {
|
||||
char *old_event_glob = ps->event_glob;
|
||||
|
||||
ps->event_glob = strdup("legacy hardware");
|
||||
if (!ps->event_glob) {
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
zfree(&ps->event_glob);
|
||||
ps->event_glob = old_event_glob;
|
||||
} else if (strcmp(argv[i], "sw") == 0 ||
|
||||
strcmp(argv[i], "software") == 0) {
|
||||
char *old_pmu_glob = default_ps.pmu_glob;
|
||||
char *old_pmu_glob = ps->pmu_glob;
|
||||
static const char * const sw_globs[] = { "software", "tool" };
|
||||
|
||||
for (size_t j = 0; j < ARRAY_SIZE(sw_globs); j++) {
|
||||
default_ps.pmu_glob = strdup(sw_globs[j]);
|
||||
if (!default_ps.pmu_glob) {
|
||||
ps->pmu_glob = strdup(sw_globs[j]);
|
||||
if (!ps->pmu_glob) {
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
zfree(&default_ps.pmu_glob);
|
||||
zfree(&ps->pmu_glob);
|
||||
}
|
||||
default_ps.pmu_glob = old_pmu_glob;
|
||||
ps->pmu_glob = old_pmu_glob;
|
||||
} else if (strcmp(argv[i], "cache") == 0 ||
|
||||
strcmp(argv[i], "hwcache") == 0)
|
||||
print_hwcache_events(&print_cb, ps);
|
||||
else if (strcmp(argv[i], "pmu") == 0) {
|
||||
default_ps.exclude_abi = true;
|
||||
strcmp(argv[i], "hwcache") == 0) {
|
||||
char *old_event_glob = ps->event_glob;
|
||||
|
||||
ps->event_glob = strdup("legacy cache");
|
||||
if (!ps->event_glob) {
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
default_ps.exclude_abi = false;
|
||||
zfree(&ps->event_glob);
|
||||
ps->event_glob = old_event_glob;
|
||||
} else if (strcmp(argv[i], "pmu") == 0) {
|
||||
ps->exclude_abi = true;
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
ps->exclude_abi = false;
|
||||
} else if (strcmp(argv[i], "sdt") == 0)
|
||||
print_sdt_events(&print_cb, ps);
|
||||
else if (strcmp(argv[i], "metric") == 0 || strcmp(argv[i], "metrics") == 0) {
|
||||
default_ps.metricgroups = false;
|
||||
default_ps.metrics = true;
|
||||
ps->metricgroups = false;
|
||||
ps->metrics = true;
|
||||
metricgroup__print(&print_cb, ps);
|
||||
} else if (strcmp(argv[i], "metricgroup") == 0 ||
|
||||
strcmp(argv[i], "metricgroups") == 0) {
|
||||
default_ps.metricgroups = true;
|
||||
default_ps.metrics = false;
|
||||
ps->metricgroups = true;
|
||||
ps->metrics = false;
|
||||
metricgroup__print(&print_cb, ps);
|
||||
}
|
||||
#ifdef HAVE_LIBPFM
|
||||
|
|
@ -675,43 +721,40 @@ int cmd_list(int argc, const char **argv)
|
|||
print_libpfm_events(&print_cb, ps);
|
||||
#endif
|
||||
else if ((sep = strchr(argv[i], ':')) != NULL) {
|
||||
char *old_pmu_glob = default_ps.pmu_glob;
|
||||
char *old_event_glob = default_ps.event_glob;
|
||||
char *old_pmu_glob = ps->pmu_glob;
|
||||
char *old_event_glob = ps->event_glob;
|
||||
|
||||
default_ps.event_glob = strdup(argv[i]);
|
||||
if (!default_ps.event_glob) {
|
||||
ps->event_glob = strdup(argv[i]);
|
||||
if (!ps->event_glob) {
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
|
||||
default_ps.pmu_glob = strdup("tracepoint");
|
||||
if (!default_ps.pmu_glob) {
|
||||
zfree(&default_ps.event_glob);
|
||||
ps->pmu_glob = strdup("tracepoint");
|
||||
if (!ps->pmu_glob) {
|
||||
zfree(&ps->event_glob);
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
zfree(&default_ps.pmu_glob);
|
||||
default_ps.pmu_glob = old_pmu_glob;
|
||||
zfree(&ps->pmu_glob);
|
||||
ps->pmu_glob = old_pmu_glob;
|
||||
print_sdt_events(&print_cb, ps);
|
||||
default_ps.metrics = true;
|
||||
default_ps.metricgroups = true;
|
||||
ps->metrics = true;
|
||||
ps->metricgroups = true;
|
||||
metricgroup__print(&print_cb, ps);
|
||||
zfree(&default_ps.event_glob);
|
||||
default_ps.event_glob = old_event_glob;
|
||||
zfree(&ps->event_glob);
|
||||
ps->event_glob = old_event_glob;
|
||||
} else {
|
||||
if (asprintf(&s, "*%s*", argv[i]) < 0) {
|
||||
printf("Critical: Not enough memory! Trying to continue...\n");
|
||||
continue;
|
||||
}
|
||||
default_ps.event_glob = s;
|
||||
print_symbol_events(&print_cb, ps, PERF_TYPE_HARDWARE,
|
||||
event_symbols_hw, PERF_COUNT_HW_MAX);
|
||||
print_hwcache_events(&print_cb, ps);
|
||||
ps->event_glob = s;
|
||||
perf_pmus__print_pmu_events(&print_cb, ps);
|
||||
print_sdt_events(&print_cb, ps);
|
||||
default_ps.metrics = true;
|
||||
default_ps.metricgroups = true;
|
||||
ps->metrics = true;
|
||||
ps->metricgroups = true;
|
||||
metricgroup__print(&print_cb, ps);
|
||||
free(s);
|
||||
}
|
||||
|
|
@ -719,12 +762,12 @@ int cmd_list(int argc, const char **argv)
|
|||
|
||||
out:
|
||||
print_cb.print_end(ps);
|
||||
free(default_ps.pmu_glob);
|
||||
free(default_ps.last_topic);
|
||||
free(default_ps.last_metricgroups);
|
||||
strlist__delete(default_ps.visited_metrics);
|
||||
free(ps->pmu_glob);
|
||||
free(ps->last_topic);
|
||||
free(ps->last_metricgroups);
|
||||
strlist__delete(ps->visited_metrics);
|
||||
if (output_path)
|
||||
fclose(default_ps.fp);
|
||||
fclose(ps->fp);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
|
|
|||
|
|
@ -1,4 +1,5 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <errno.h>
|
||||
#include <inttypes.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
|
|
|
|||
|
|
@ -730,8 +730,6 @@ static void record__sig_exit(void)
|
|||
raise(signr);
|
||||
}
|
||||
|
||||
#ifdef HAVE_AUXTRACE_SUPPORT
|
||||
|
||||
static int record__process_auxtrace(const struct perf_tool *tool,
|
||||
struct mmap *map,
|
||||
union perf_event *event, void *data1,
|
||||
|
|
@ -889,40 +887,6 @@ static int record__auxtrace_init(struct record *rec)
|
|||
return auxtrace_parse_filters(rec->evlist);
|
||||
}
|
||||
|
||||
#else
|
||||
|
||||
static inline
|
||||
int record__auxtrace_mmap_read(struct record *rec __maybe_unused,
|
||||
struct mmap *map __maybe_unused)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline
|
||||
void record__read_auxtrace_snapshot(struct record *rec __maybe_unused,
|
||||
bool on_exit __maybe_unused)
|
||||
{
|
||||
}
|
||||
|
||||
static inline
|
||||
int auxtrace_record__snapshot_start(struct auxtrace_record *itr __maybe_unused)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline
|
||||
int record__auxtrace_snapshot_exit(struct record *rec __maybe_unused)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int record__auxtrace_init(struct record *rec __maybe_unused)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
static int record__config_text_poke(struct evlist *evlist)
|
||||
{
|
||||
struct evsel *evsel;
|
||||
|
|
@ -983,7 +947,6 @@ static int record__config_tracking_events(struct record *rec)
|
|||
*/
|
||||
if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
|
||||
perf_pmus__num_core_pmus() > 1) {
|
||||
|
||||
/*
|
||||
* User space tasks can migrate between CPUs, so when tracing
|
||||
* selected CPUs, sideband for all CPUs is still needed.
|
||||
|
|
@ -1388,10 +1351,27 @@ static int record__open(struct record *rec)
|
|||
struct perf_session *session = rec->session;
|
||||
struct record_opts *opts = &rec->opts;
|
||||
int rc = 0;
|
||||
bool skipped = false;
|
||||
bool removed_tracking = false;
|
||||
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
if (removed_tracking) {
|
||||
/*
|
||||
* Normally the head of the list has tracking enabled
|
||||
* for sideband data like mmaps. If this event is
|
||||
* removed, make sure to add tracking to the next
|
||||
* processed event.
|
||||
*/
|
||||
if (!pos->tracking) {
|
||||
pos->tracking = true;
|
||||
evsel__config(pos, opts, &callchain_param);
|
||||
}
|
||||
removed_tracking = false;
|
||||
}
|
||||
try_again:
|
||||
if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
|
||||
bool report_error = true;
|
||||
|
||||
if (evsel__fallback(pos, &opts->target, errno, msg, sizeof(msg))) {
|
||||
if (verbose > 0)
|
||||
ui__warning("%s\n", msg);
|
||||
|
|
@ -1403,13 +1383,72 @@ try_again:
|
|||
pos = evlist__reset_weak_group(evlist, pos, true);
|
||||
goto try_again;
|
||||
}
|
||||
rc = -errno;
|
||||
evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
|
||||
ui__error("%s\n", msg);
|
||||
goto out;
|
||||
#if defined(__aarch64__) || defined(__arm__)
|
||||
if (strstr(evsel__name(pos), "cycles")) {
|
||||
struct evsel *pos2;
|
||||
/*
|
||||
* Unfortunately ARM has many events named
|
||||
* "cycles" on PMUs like the system-level (L3)
|
||||
* cache which don't support sampling. Only
|
||||
* display such failures to open when there is
|
||||
* only 1 cycles event or verbose is enabled.
|
||||
*/
|
||||
evlist__for_each_entry(evlist, pos2) {
|
||||
if (pos2 == pos)
|
||||
continue;
|
||||
if (strstr(evsel__name(pos2), "cycles")) {
|
||||
report_error = false;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
#endif
|
||||
if (report_error || verbose > 0) {
|
||||
ui__error("Failure to open event '%s' on PMU '%s' which will be "
|
||||
"removed.\n%s\n",
|
||||
evsel__name(pos), evsel__pmu_name(pos), msg);
|
||||
}
|
||||
if (pos->tracking)
|
||||
removed_tracking = true;
|
||||
pos->skippable = true;
|
||||
skipped = true;
|
||||
}
|
||||
}
|
||||
|
||||
if (skipped) {
|
||||
struct evsel *tmp;
|
||||
int idx = 0;
|
||||
bool evlist_empty = true;
|
||||
|
||||
/* Remove evsels that failed to open and update indices. */
|
||||
evlist__for_each_entry_safe(evlist, tmp, pos) {
|
||||
if (pos->skippable) {
|
||||
evlist__remove(evlist, pos);
|
||||
continue;
|
||||
}
|
||||
|
||||
/*
|
||||
* Note, dummy events may be command line parsed or
|
||||
* added by the tool. We care about supporting `perf
|
||||
* record -e dummy` which may be used as a permission
|
||||
* check. Dummy events that are added to the command
|
||||
* line and opened along with other events that fail,
|
||||
* will still fail as if the dummy events were tool
|
||||
* added events for the sake of code simplicity.
|
||||
*/
|
||||
if (!evsel__is_dummy_event(pos))
|
||||
evlist_empty = false;
|
||||
}
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
pos->core.idx = idx++;
|
||||
}
|
||||
/* If list is empty then fail. */
|
||||
if (evlist_empty) {
|
||||
ui__error("Failure to open any events for recording.\n");
|
||||
rc = -1;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
|
||||
pr_warning(
|
||||
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
|
||||
|
|
@ -1815,15 +1854,14 @@ record__finish_output(struct record *rec)
|
|||
}
|
||||
|
||||
/* Buildid scanning disabled or build ID in kernel and synthesized map events. */
|
||||
if (!rec->no_buildid) {
|
||||
if (!rec->no_buildid || !rec->no_buildid_cache) {
|
||||
process_buildids(rec);
|
||||
|
||||
if (rec->buildid_all)
|
||||
perf_session__dsos_hit_all(rec->session);
|
||||
}
|
||||
perf_session__write_header(rec->session, rec->evlist, fd, true);
|
||||
|
||||
return;
|
||||
perf_session__cache_build_ids(rec->session);
|
||||
}
|
||||
|
||||
static int record__synthesize_workload(struct record *rec, bool tail)
|
||||
|
|
@ -2883,11 +2921,11 @@ out_free_threads:
|
|||
rec->bytes_written += off_cpu_write(rec->session);
|
||||
|
||||
record__read_lost_samples(rec);
|
||||
record__synthesize(rec, true);
|
||||
/* this will be recalculated during process_buildids() */
|
||||
rec->samples = 0;
|
||||
|
||||
if (!err) {
|
||||
record__synthesize(rec, true);
|
||||
if (!rec->timestamp_filename) {
|
||||
record__finish_output(rec);
|
||||
} else {
|
||||
|
|
@ -3008,7 +3046,7 @@ static int perf_record_config(const char *var, const char *value, void *cb)
|
|||
else if (!strcmp(value, "no-cache"))
|
||||
rec->no_buildid_cache = true;
|
||||
else if (!strcmp(value, "skip"))
|
||||
rec->no_buildid = true;
|
||||
rec->no_buildid = rec->no_buildid_cache = true;
|
||||
else if (!strcmp(value, "mmap"))
|
||||
rec->buildid_mmap = true;
|
||||
else if (!strcmp(value, "no-mmap"))
|
||||
|
|
@ -4117,24 +4155,25 @@ int cmd_record(int argc, const char **argv)
|
|||
record.opts.record_switch_events = true;
|
||||
}
|
||||
|
||||
if (!rec->buildid_mmap) {
|
||||
pr_debug("Disabling build id in synthesized mmap2 events.\n");
|
||||
symbol_conf.no_buildid_mmap2 = true;
|
||||
} else if (rec->buildid_mmap_set) {
|
||||
/*
|
||||
* Explicitly passing --buildid-mmap disables buildid processing
|
||||
* and cache generation.
|
||||
*/
|
||||
rec->no_buildid = true;
|
||||
}
|
||||
if (rec->buildid_mmap && !perf_can_record_build_id()) {
|
||||
pr_warning("Missing support for build id in kernel mmap events.\n"
|
||||
"Disable this warning with --no-buildid-mmap\n");
|
||||
rec->buildid_mmap = false;
|
||||
}
|
||||
|
||||
if (rec->buildid_mmap) {
|
||||
/* Enable perf_event_attr::build_id bit. */
|
||||
rec->opts.build_id = true;
|
||||
/* Disable build-ID table in the header. */
|
||||
rec->no_buildid = true;
|
||||
} else {
|
||||
pr_debug("Disabling build id in synthesized mmap2 events.\n");
|
||||
symbol_conf.no_buildid_mmap2 = true;
|
||||
}
|
||||
|
||||
if (rec->no_buildid_set && rec->no_buildid) {
|
||||
/* -B implies -N for historic reasons. */
|
||||
rec->no_buildid_cache = true;
|
||||
}
|
||||
|
||||
if (rec->opts.record_cgroup && !perf_can_record_cgroup()) {
|
||||
|
|
@ -4231,7 +4270,7 @@ int cmd_record(int argc, const char **argv)
|
|||
|
||||
err = -ENOMEM;
|
||||
|
||||
if (rec->no_buildid_cache || rec->no_buildid) {
|
||||
if (rec->no_buildid_cache) {
|
||||
disable_buildid_cache();
|
||||
} else if (rec->switch_output.enabled) {
|
||||
/*
|
||||
|
|
@ -4266,9 +4305,13 @@ int cmd_record(int argc, const char **argv)
|
|||
record.opts.tail_synthesize = true;
|
||||
|
||||
if (rec->evlist->core.nr_entries == 0) {
|
||||
err = parse_event(rec->evlist, "cycles:P");
|
||||
if (err)
|
||||
struct evlist *def_evlist = evlist__new_default();
|
||||
|
||||
if (!def_evlist)
|
||||
goto out;
|
||||
|
||||
evlist__splice_list_tail(rec->evlist, &def_evlist->core.entries);
|
||||
evlist__delete(def_evlist);
|
||||
}
|
||||
|
||||
if (rec->opts.target.tid && !rec->opts.no_inherit_set)
|
||||
|
|
|
|||
|
|
@ -240,10 +240,11 @@ static void setup_forced_leader(struct report *report,
|
|||
evlist__force_leader(evlist);
|
||||
}
|
||||
|
||||
static int process_feature_event(struct perf_session *session,
|
||||
static int process_feature_event(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
struct report *rep = container_of(session->tool, struct report, tool);
|
||||
struct report *rep = container_of(tool, struct report, tool);
|
||||
|
||||
if (event->feat.feat_id < HEADER_LAST_FEATURE)
|
||||
return perf_event__process_feature(session, event);
|
||||
|
|
@ -1613,6 +1614,7 @@ repeat:
|
|||
report.tool.event_update = perf_event__process_event_update;
|
||||
report.tool.feature = process_feature_event;
|
||||
report.tool.ordering_requires_timestamps = true;
|
||||
report.tool.merge_deferred_callchains = !dump_trace;
|
||||
|
||||
session = perf_session__new(&data, &report.tool);
|
||||
if (IS_ERR(session)) {
|
||||
|
|
|
|||
|
|
@ -33,6 +33,7 @@
|
|||
#include "util/path.h"
|
||||
#include "util/event.h"
|
||||
#include "util/mem-info.h"
|
||||
#include "util/metricgroup.h"
|
||||
#include "ui/ui.h"
|
||||
#include "print_binary.h"
|
||||
#include "print_insn.h"
|
||||
|
|
@ -341,16 +342,8 @@ struct evsel_script {
|
|||
char *filename;
|
||||
FILE *fp;
|
||||
u64 samples;
|
||||
/* For metric output */
|
||||
u64 val;
|
||||
int gnum;
|
||||
};
|
||||
|
||||
static inline struct evsel_script *evsel_script(struct evsel *evsel)
|
||||
{
|
||||
return (struct evsel_script *)evsel->priv;
|
||||
}
|
||||
|
||||
static struct evsel_script *evsel_script__new(struct evsel *evsel, struct perf_data *data)
|
||||
{
|
||||
struct evsel_script *es = zalloc(sizeof(*es));
|
||||
|
|
@ -2002,7 +1995,6 @@ static int perf_sample__fprintf_synth_iflag_chg(struct perf_sample *sample, FILE
|
|||
return len + perf_sample__fprintf_pt_spacing(len, fp);
|
||||
}
|
||||
|
||||
#ifdef HAVE_AUXTRACE_SUPPORT
|
||||
static int perf_sample__fprintf_synth_vpadtl(struct perf_sample *data, FILE *fp)
|
||||
{
|
||||
struct powerpc_vpadtl_entry *dtl = (struct powerpc_vpadtl_entry *)data->raw_data;
|
||||
|
|
@ -2021,13 +2013,6 @@ static int perf_sample__fprintf_synth_vpadtl(struct perf_sample *data, FILE *fp)
|
|||
|
||||
return len;
|
||||
}
|
||||
#else
|
||||
static int perf_sample__fprintf_synth_vpadtl(struct perf_sample *data __maybe_unused,
|
||||
FILE *fp __maybe_unused)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
#endif
|
||||
|
||||
static int perf_sample__fprintf_synth(struct perf_sample *sample,
|
||||
struct evsel *evsel, FILE *fp)
|
||||
|
|
@ -2132,13 +2117,161 @@ static void script_new_line(struct perf_stat_config *config __maybe_unused,
|
|||
fputs("\tmetric: ", mctx->fp);
|
||||
}
|
||||
|
||||
static void perf_sample__fprint_metric(struct perf_script *script,
|
||||
struct thread *thread,
|
||||
struct script_find_metrics_args {
|
||||
struct evlist *evlist;
|
||||
bool system_wide;
|
||||
};
|
||||
|
||||
static struct evsel *map_metric_evsel_to_script_evsel(struct evlist *script_evlist,
|
||||
struct evsel *metric_evsel)
|
||||
{
|
||||
struct evsel *script_evsel;
|
||||
|
||||
evlist__for_each_entry(script_evlist, script_evsel) {
|
||||
/* Skip if perf_event_attr differ. */
|
||||
if (metric_evsel->core.attr.type != script_evsel->core.attr.type)
|
||||
continue;
|
||||
if (metric_evsel->core.attr.config != script_evsel->core.attr.config)
|
||||
continue;
|
||||
/* Skip if the script event has a metric_id that doesn't match. */
|
||||
if (script_evsel->metric_id &&
|
||||
strcmp(evsel__metric_id(metric_evsel), evsel__metric_id(script_evsel))) {
|
||||
pr_debug("Skipping matching evsel due to differing metric ids '%s' vs '%s'\n",
|
||||
evsel__metric_id(metric_evsel), evsel__metric_id(script_evsel));
|
||||
continue;
|
||||
}
|
||||
return script_evsel;
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static int script_find_metrics(const struct pmu_metric *pm,
|
||||
const struct pmu_metrics_table *table __maybe_unused,
|
||||
void *data)
|
||||
{
|
||||
struct script_find_metrics_args *args = data;
|
||||
struct evlist *script_evlist = args->evlist;
|
||||
struct evlist *metric_evlist = evlist__new();
|
||||
struct evsel *metric_evsel;
|
||||
int ret = metricgroup__parse_groups(metric_evlist,
|
||||
/*pmu=*/"all",
|
||||
pm->metric_name,
|
||||
/*metric_no_group=*/false,
|
||||
/*metric_no_merge=*/false,
|
||||
/*metric_no_threshold=*/true,
|
||||
/*user_requested_cpu_list=*/NULL,
|
||||
args->system_wide,
|
||||
/*hardware_aware_grouping=*/false);
|
||||
|
||||
if (ret) {
|
||||
/* Metric parsing failed but continue the search. */
|
||||
goto out;
|
||||
}
|
||||
|
||||
/*
|
||||
* Check the script_evlist has an entry for each metric_evlist entry. If
|
||||
* the script evsel was already set up avoid changing data that may
|
||||
* break it.
|
||||
*/
|
||||
evlist__for_each_entry(metric_evlist, metric_evsel) {
|
||||
struct evsel *script_evsel =
|
||||
map_metric_evsel_to_script_evsel(script_evlist, metric_evsel);
|
||||
struct evsel *new_metric_leader;
|
||||
|
||||
if (!script_evsel) {
|
||||
pr_debug("Skipping metric '%s' as evsel '%s' / '%s' is missing\n",
|
||||
pm->metric_name, evsel__name(metric_evsel),
|
||||
evsel__metric_id(metric_evsel));
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (script_evsel->metric_leader == NULL)
|
||||
continue;
|
||||
|
||||
if (metric_evsel->metric_leader == metric_evsel) {
|
||||
new_metric_leader = script_evsel;
|
||||
} else {
|
||||
new_metric_leader =
|
||||
map_metric_evsel_to_script_evsel(script_evlist,
|
||||
metric_evsel->metric_leader);
|
||||
}
|
||||
/* Mismatching evsel leaders. */
|
||||
if (script_evsel->metric_leader != new_metric_leader) {
|
||||
pr_debug("Skipping metric '%s' due to mismatching evsel metric leaders '%s' vs '%s'\n",
|
||||
pm->metric_name, evsel__metric_id(metric_evsel),
|
||||
evsel__metric_id(script_evsel));
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
/*
|
||||
* Metric events match those in the script evlist, copy metric evsel
|
||||
* data into the script evlist.
|
||||
*/
|
||||
evlist__for_each_entry(metric_evlist, metric_evsel) {
|
||||
struct evsel *script_evsel =
|
||||
map_metric_evsel_to_script_evsel(script_evlist, metric_evsel);
|
||||
struct metric_event *metric_me = metricgroup__lookup(&metric_evlist->metric_events,
|
||||
metric_evsel,
|
||||
/*create=*/false);
|
||||
|
||||
if (script_evsel->metric_id == NULL) {
|
||||
script_evsel->metric_id = metric_evsel->metric_id;
|
||||
metric_evsel->metric_id = NULL;
|
||||
}
|
||||
|
||||
if (script_evsel->metric_leader == NULL) {
|
||||
if (metric_evsel->metric_leader == metric_evsel) {
|
||||
script_evsel->metric_leader = script_evsel;
|
||||
} else {
|
||||
script_evsel->metric_leader =
|
||||
map_metric_evsel_to_script_evsel(script_evlist,
|
||||
metric_evsel->metric_leader);
|
||||
}
|
||||
}
|
||||
|
||||
if (metric_me) {
|
||||
struct metric_expr *expr;
|
||||
struct metric_event *script_me =
|
||||
metricgroup__lookup(&script_evlist->metric_events,
|
||||
script_evsel,
|
||||
/*create=*/true);
|
||||
|
||||
if (!script_me) {
|
||||
/*
|
||||
* As the metric_expr is created, the only
|
||||
* failure is a lack of memory.
|
||||
*/
|
||||
goto out;
|
||||
}
|
||||
list_splice_init(&metric_me->head, &script_me->head);
|
||||
list_for_each_entry(expr, &script_me->head, nd) {
|
||||
for (int i = 0; expr->metric_events[i]; i++) {
|
||||
expr->metric_events[i] =
|
||||
map_metric_evsel_to_script_evsel(script_evlist,
|
||||
expr->metric_events[i]);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
pr_debug("Found metric '%s' whose evsels match those of in the perf data\n",
|
||||
pm->metric_name);
|
||||
evlist__delete(metric_evlist);
|
||||
out:
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct aggr_cpu_id script_aggr_cpu_id_get(struct perf_stat_config *config __maybe_unused,
|
||||
struct perf_cpu cpu)
|
||||
{
|
||||
return aggr_cpu_id__global(cpu, /*data=*/NULL);
|
||||
}
|
||||
|
||||
static void perf_sample__fprint_metric(struct thread *thread,
|
||||
struct evsel *evsel,
|
||||
struct perf_sample *sample,
|
||||
FILE *fp)
|
||||
{
|
||||
struct evsel *leader = evsel__leader(evsel);
|
||||
static bool init_metrics;
|
||||
struct perf_stat_output_ctx ctx = {
|
||||
.print_metric = script_print_metric,
|
||||
.new_line = script_new_line,
|
||||
|
|
@ -2150,23 +2283,84 @@ static void perf_sample__fprint_metric(struct perf_script *script,
|
|||
},
|
||||
.force_header = false,
|
||||
};
|
||||
struct evsel *ev2;
|
||||
u64 val;
|
||||
struct perf_counts_values *count, *old_count;
|
||||
int cpu_map_idx, thread_map_idx, aggr_idx;
|
||||
struct evsel *pos;
|
||||
|
||||
if (!evsel->stats)
|
||||
evlist__alloc_stats(&stat_config, script->session->evlist, /*alloc_raw=*/false);
|
||||
if (evsel_script(leader)->gnum++ == 0)
|
||||
perf_stat__reset_shadow_stats();
|
||||
val = sample->period * evsel->scale;
|
||||
evsel_script(evsel)->val = val;
|
||||
if (evsel_script(leader)->gnum == leader->core.nr_members) {
|
||||
for_each_group_member (ev2, leader) {
|
||||
perf_stat__print_shadow_stats(&stat_config, ev2,
|
||||
evsel_script(ev2)->val,
|
||||
sample->cpu,
|
||||
&ctx);
|
||||
if (!init_metrics) {
|
||||
/* One time initialization of stat_config and metric data. */
|
||||
struct script_find_metrics_args args = {
|
||||
.evlist = evsel->evlist,
|
||||
.system_wide = perf_thread_map__pid(evsel->core.threads, /*idx=*/0) == -1,
|
||||
|
||||
};
|
||||
if (!stat_config.output)
|
||||
stat_config.output = stdout;
|
||||
|
||||
if (!stat_config.aggr_map) {
|
||||
/* TODO: currently only global aggregation is supported. */
|
||||
assert(stat_config.aggr_mode == AGGR_GLOBAL);
|
||||
stat_config.aggr_get_id = script_aggr_cpu_id_get;
|
||||
stat_config.aggr_map =
|
||||
cpu_aggr_map__new(evsel->evlist->core.user_requested_cpus,
|
||||
aggr_cpu_id__global, /*data=*/NULL,
|
||||
/*needs_sort=*/false);
|
||||
}
|
||||
|
||||
metricgroup__for_each_metric(pmu_metrics_table__find(), script_find_metrics, &args);
|
||||
init_metrics = true;
|
||||
}
|
||||
|
||||
if (!evsel->stats) {
|
||||
if (evlist__alloc_stats(&stat_config, evsel->evlist, /*alloc_raw=*/true) < 0)
|
||||
return;
|
||||
}
|
||||
if (!evsel->stats->aggr) {
|
||||
if (evlist__alloc_aggr_stats(evsel->evlist, stat_config.aggr_map->nr) < 0)
|
||||
return;
|
||||
}
|
||||
|
||||
/* Update the evsel's count using the sample's data. */
|
||||
cpu_map_idx = perf_cpu_map__idx(evsel->core.cpus, (struct perf_cpu){sample->cpu});
|
||||
if (cpu_map_idx < 0) {
|
||||
/* Missing CPU, check for any CPU. */
|
||||
if (perf_cpu_map__cpu(evsel->core.cpus, /*idx=*/0).cpu == -1 ||
|
||||
sample->cpu == (u32)-1) {
|
||||
/* Place the counts in the which ever CPU is first in the map. */
|
||||
cpu_map_idx = 0;
|
||||
} else {
|
||||
pr_info("Missing CPU map entry for CPU %d\n", sample->cpu);
|
||||
return;
|
||||
}
|
||||
}
|
||||
thread_map_idx = perf_thread_map__idx(evsel->core.threads, sample->tid);
|
||||
if (thread_map_idx < 0) {
|
||||
/* Missing thread, check for any thread. */
|
||||
if (perf_thread_map__pid(evsel->core.threads, /*idx=*/0) == -1 ||
|
||||
sample->tid == (u32)-1) {
|
||||
/* Place the counts in the which ever thread is first in the map. */
|
||||
thread_map_idx = 0;
|
||||
} else {
|
||||
pr_info("Missing thread map entry for thread %d\n", sample->tid);
|
||||
return;
|
||||
}
|
||||
}
|
||||
count = perf_counts(evsel->counts, cpu_map_idx, thread_map_idx);
|
||||
old_count = perf_counts(evsel->prev_raw_counts, cpu_map_idx, thread_map_idx);
|
||||
count->val = old_count->val + sample->period;
|
||||
count->run = old_count->run + 1;
|
||||
count->ena = old_count->ena + 1;
|
||||
|
||||
/* Update the aggregated stats. */
|
||||
perf_stat_process_counter(&stat_config, evsel);
|
||||
|
||||
/* Display all metrics. */
|
||||
evlist__for_each_entry(evsel->evlist, pos) {
|
||||
cpu_aggr_map__for_each_idx(aggr_idx, stat_config.aggr_map) {
|
||||
perf_stat__print_shadow_stats(&stat_config, pos,
|
||||
aggr_idx,
|
||||
&ctx);
|
||||
}
|
||||
evsel_script(leader)->gnum = 0;
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -2348,7 +2542,7 @@ static void process_event(struct perf_script *script,
|
|||
}
|
||||
|
||||
if (PRINT_FIELD(METRIC))
|
||||
perf_sample__fprint_metric(script, thread, evsel, sample, fp);
|
||||
perf_sample__fprint_metric(thread, evsel, sample, fp);
|
||||
|
||||
if (verbose > 0)
|
||||
fflush(fp);
|
||||
|
|
@ -2512,6 +2706,94 @@ out_put:
|
|||
return ret;
|
||||
}
|
||||
|
||||
static int process_deferred_sample_event(const struct perf_tool *tool,
|
||||
union perf_event *event,
|
||||
struct perf_sample *sample,
|
||||
struct evsel *evsel,
|
||||
struct machine *machine)
|
||||
{
|
||||
struct perf_script *scr = container_of(tool, struct perf_script, tool);
|
||||
struct perf_event_attr *attr = &evsel->core.attr;
|
||||
struct evsel_script *es = evsel->priv;
|
||||
unsigned int type = output_type(attr->type);
|
||||
struct addr_location al;
|
||||
FILE *fp = es->fp;
|
||||
int ret = 0;
|
||||
|
||||
if (output[type].fields == 0)
|
||||
return 0;
|
||||
|
||||
/* Set thread to NULL to indicate addr_al and al are not initialized */
|
||||
addr_location__init(&al);
|
||||
|
||||
if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
|
||||
sample->time)) {
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
if (debug_mode) {
|
||||
if (sample->time < last_timestamp) {
|
||||
pr_err("Samples misordered, previous: %" PRIu64
|
||||
" this: %" PRIu64 "\n", last_timestamp,
|
||||
sample->time);
|
||||
nr_unordered++;
|
||||
}
|
||||
last_timestamp = sample->time;
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
if (filter_cpu(sample))
|
||||
goto out_put;
|
||||
|
||||
if (machine__resolve(machine, &al, sample) < 0) {
|
||||
pr_err("problem processing %d event, skipping it.\n",
|
||||
event->header.type);
|
||||
ret = -1;
|
||||
goto out_put;
|
||||
}
|
||||
|
||||
if (al.filtered)
|
||||
goto out_put;
|
||||
|
||||
if (!show_event(sample, evsel, al.thread, &al, NULL))
|
||||
goto out_put;
|
||||
|
||||
if (evswitch__discard(&scr->evswitch, evsel))
|
||||
goto out_put;
|
||||
|
||||
perf_sample__fprintf_start(scr, sample, al.thread, evsel,
|
||||
PERF_RECORD_CALLCHAIN_DEFERRED, fp);
|
||||
fprintf(fp, "DEFERRED CALLCHAIN [cookie: %llx]",
|
||||
(unsigned long long)event->callchain_deferred.cookie);
|
||||
|
||||
if (PRINT_FIELD(IP)) {
|
||||
struct callchain_cursor *cursor = NULL;
|
||||
|
||||
if (symbol_conf.use_callchain && sample->callchain) {
|
||||
cursor = get_tls_callchain_cursor();
|
||||
if (thread__resolve_callchain(al.thread, cursor, evsel,
|
||||
sample, NULL, NULL,
|
||||
scripting_max_stack)) {
|
||||
pr_info("cannot resolve deferred callchains\n");
|
||||
cursor = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
fputc(cursor ? '\n' : ' ', fp);
|
||||
sample__fprintf_sym(sample, &al, 0, output[type].print_ip_opts,
|
||||
cursor, symbol_conf.bt_stop_list, fp);
|
||||
}
|
||||
|
||||
fprintf(fp, "\n");
|
||||
|
||||
if (verbose > 0)
|
||||
fflush(fp);
|
||||
|
||||
out_put:
|
||||
addr_location__exit(&al);
|
||||
return ret;
|
||||
}
|
||||
|
||||
// Used when scr->per_event_dump is not set
|
||||
static struct evsel_script es_stdout;
|
||||
|
||||
|
|
@ -2729,7 +3011,8 @@ static int process_switch_event(const struct perf_tool *tool,
|
|||
sample->tid);
|
||||
}
|
||||
|
||||
static int process_auxtrace_error(struct perf_session *session,
|
||||
static int process_auxtrace_error(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
if (scripting_ops && scripting_ops->process_auxtrace_error) {
|
||||
|
|
@ -2737,7 +3020,7 @@ static int process_auxtrace_error(struct perf_session *session,
|
|||
return 0;
|
||||
}
|
||||
|
||||
return perf_event__process_auxtrace_error(session, event);
|
||||
return perf_event__process_auxtrace_error(tool, session, event);
|
||||
}
|
||||
|
||||
static int
|
||||
|
|
@ -2785,7 +3068,8 @@ process_bpf_events(const struct perf_tool *tool __maybe_unused,
|
|||
}
|
||||
|
||||
static int
|
||||
process_bpf_metadata_event(struct perf_session *session __maybe_unused,
|
||||
process_bpf_metadata_event(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
perf_event__fprintf(event, NULL, stdout);
|
||||
|
|
@ -3544,7 +3828,8 @@ static void script__setup_sample_type(struct perf_script *script)
|
|||
}
|
||||
}
|
||||
|
||||
static int process_stat_round_event(struct perf_session *session,
|
||||
static int process_stat_round_event(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
struct perf_record_stat_round *round = &event->stat_round;
|
||||
|
|
@ -3559,7 +3844,8 @@ static int process_stat_round_event(struct perf_session *session,
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int process_stat_config_event(struct perf_session *session __maybe_unused,
|
||||
static int process_stat_config_event(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
perf_event__read_stat_config(&stat_config, &event->stat_config);
|
||||
|
|
@ -3593,10 +3879,10 @@ static int set_maps(struct perf_script *script)
|
|||
}
|
||||
|
||||
static
|
||||
int process_thread_map_event(struct perf_session *session,
|
||||
int process_thread_map_event(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_script *script = container_of(tool, struct perf_script, tool);
|
||||
|
||||
if (dump_trace)
|
||||
|
|
@ -3615,10 +3901,10 @@ int process_thread_map_event(struct perf_session *session,
|
|||
}
|
||||
|
||||
static
|
||||
int process_cpu_map_event(struct perf_session *session,
|
||||
int process_cpu_map_event(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_script *script = container_of(tool, struct perf_script, tool);
|
||||
|
||||
if (dump_trace)
|
||||
|
|
@ -3636,7 +3922,8 @@ int process_cpu_map_event(struct perf_session *session,
|
|||
return set_maps(script);
|
||||
}
|
||||
|
||||
static int process_feature_event(struct perf_session *session,
|
||||
static int process_feature_event(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
if (event->feat.feat_id < HEADER_LAST_FEATURE)
|
||||
|
|
@ -3644,14 +3931,13 @@ static int process_feature_event(struct perf_session *session,
|
|||
return 0;
|
||||
}
|
||||
|
||||
#ifdef HAVE_AUXTRACE_SUPPORT
|
||||
static int perf_script__process_auxtrace_info(struct perf_session *session,
|
||||
static int perf_script__process_auxtrace_info(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
int ret = perf_event__process_auxtrace_info(session, event);
|
||||
int ret = perf_event__process_auxtrace_info(tool, session, event);
|
||||
|
||||
if (ret == 0) {
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_script *script = container_of(tool, struct perf_script, tool);
|
||||
|
||||
ret = perf_script__setup_per_event_dump(script);
|
||||
|
|
@ -3659,9 +3945,6 @@ static int perf_script__process_auxtrace_info(struct perf_session *session,
|
|||
|
||||
return ret;
|
||||
}
|
||||
#else
|
||||
#define perf_script__process_auxtrace_info 0
|
||||
#endif
|
||||
|
||||
static int parse_insn_trace(const struct option *opt __maybe_unused,
|
||||
const char *str, int unset __maybe_unused)
|
||||
|
|
@ -3726,6 +4009,7 @@ int cmd_script(int argc, const char **argv)
|
|||
bool header_only = false;
|
||||
bool script_started = false;
|
||||
bool unsorted_dump = false;
|
||||
bool merge_deferred_callchains = true;
|
||||
char *rec_script_path = NULL;
|
||||
char *rep_script_path = NULL;
|
||||
struct perf_session *session;
|
||||
|
|
@ -3879,6 +4163,8 @@ int cmd_script(int argc, const char **argv)
|
|||
"Guest code can be found in hypervisor process"),
|
||||
OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
|
||||
"Enable LBR callgraph stitching approach"),
|
||||
OPT_BOOLEAN('\0', "merge-callchains", &merge_deferred_callchains,
|
||||
"Enable merge deferred user callchains"),
|
||||
OPTS_EVSWITCH(&script.evswitch),
|
||||
OPT_END()
|
||||
};
|
||||
|
|
@ -4108,6 +4394,7 @@ script_found:
|
|||
|
||||
perf_tool__init(&script.tool, !unsorted_dump);
|
||||
script.tool.sample = process_sample_event;
|
||||
script.tool.callchain_deferred = process_deferred_sample_event;
|
||||
script.tool.mmap = perf_event__process_mmap;
|
||||
script.tool.mmap2 = perf_event__process_mmap2;
|
||||
script.tool.comm = perf_event__process_comm;
|
||||
|
|
@ -4134,6 +4421,7 @@ script_found:
|
|||
script.tool.throttle = process_throttle_event;
|
||||
script.tool.unthrottle = process_throttle_event;
|
||||
script.tool.ordering_requires_timestamps = true;
|
||||
script.tool.merge_deferred_callchains = merge_deferred_callchains;
|
||||
session = perf_session__new(&data, &script.tool);
|
||||
if (IS_ERR(session))
|
||||
return PTR_ERR(session);
|
||||
|
|
|
|||
|
|
@ -74,6 +74,7 @@
|
|||
#include "util/intel-tpebs.h"
|
||||
#include "asm/bug.h"
|
||||
|
||||
#include <linux/list_sort.h>
|
||||
#include <linux/time64.h>
|
||||
#include <linux/zalloc.h>
|
||||
#include <api/fs/fs.h>
|
||||
|
|
@ -96,9 +97,18 @@
|
|||
#include <perf/evlist.h>
|
||||
#include <internal/threadmap.h>
|
||||
|
||||
#ifdef HAVE_BPF_SKEL
|
||||
#include "util/bpf_skel/bperf_cgroup.h"
|
||||
#endif
|
||||
|
||||
#define DEFAULT_SEPARATOR " "
|
||||
#define FREEZE_ON_SMI_PATH "bus/event_source/devices/cpu/freeze_on_smi"
|
||||
|
||||
struct rusage_stats {
|
||||
struct stats ru_utime_usec_stat;
|
||||
struct stats ru_stime_usec_stat;
|
||||
};
|
||||
|
||||
static void print_counters(struct timespec *ts, int argc, const char **argv);
|
||||
|
||||
static struct evlist *evsel_list;
|
||||
|
|
@ -128,6 +138,7 @@ static bool interval_count;
|
|||
static const char *output_name;
|
||||
static int output_fd;
|
||||
static char *metrics;
|
||||
static struct rusage_stats ru_stats;
|
||||
|
||||
struct perf_stat {
|
||||
bool record;
|
||||
|
|
@ -228,7 +239,7 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
|
|||
static void perf_stat__reset_stats(void)
|
||||
{
|
||||
evlist__reset_stats(evsel_list);
|
||||
perf_stat__reset_shadow_stats();
|
||||
memset(stat_config.walltime_nsecs_stats, 0, sizeof(*stat_config.walltime_nsecs_stats));
|
||||
}
|
||||
|
||||
static int process_synthesized_event(const struct perf_tool *tool __maybe_unused,
|
||||
|
|
@ -278,17 +289,27 @@ static int read_single_counter(struct evsel *counter, int cpu_map_idx, int threa
|
|||
if (err && cpu_map_idx == 0 &&
|
||||
(evsel__tool_event(counter) == TOOL_PMU__EVENT_USER_TIME ||
|
||||
evsel__tool_event(counter) == TOOL_PMU__EVENT_SYSTEM_TIME)) {
|
||||
u64 val, *start_time;
|
||||
struct perf_counts_values *count =
|
||||
perf_counts(counter->counts, cpu_map_idx, thread);
|
||||
struct perf_counts_values *old_count = NULL;
|
||||
u64 val;
|
||||
|
||||
if (counter->prev_raw_counts)
|
||||
old_count = perf_counts(counter->prev_raw_counts, cpu_map_idx, thread);
|
||||
|
||||
start_time = xyarray__entry(counter->start_times, cpu_map_idx, thread);
|
||||
if (evsel__tool_event(counter) == TOOL_PMU__EVENT_USER_TIME)
|
||||
val = ru_stats.ru_utime_usec_stat.mean;
|
||||
else
|
||||
val = ru_stats.ru_stime_usec_stat.mean;
|
||||
count->ena = count->run = *start_time + val;
|
||||
|
||||
count->val = val;
|
||||
if (old_count) {
|
||||
count->run = old_count->run + 1;
|
||||
count->ena = old_count->ena + 1;
|
||||
} else {
|
||||
count->run++;
|
||||
count->ena++;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
return err;
|
||||
|
|
@ -345,7 +366,7 @@ static int read_counter_cpu(struct evsel *counter, int cpu_map_idx)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int read_affinity_counters(void)
|
||||
static int read_counters_with_affinity(void)
|
||||
{
|
||||
struct evlist_cpu_iterator evlist_cpu_itr;
|
||||
struct affinity saved_affinity, *affinity;
|
||||
|
|
@ -366,6 +387,9 @@ static int read_affinity_counters(void)
|
|||
if (evsel__is_bpf(counter))
|
||||
continue;
|
||||
|
||||
if (evsel__is_tool(counter))
|
||||
continue;
|
||||
|
||||
if (!counter->err)
|
||||
counter->err = read_counter_cpu(counter, evlist_cpu_itr.cpu_map_idx);
|
||||
}
|
||||
|
|
@ -391,16 +415,46 @@ static int read_bpf_map_counters(void)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int read_counters(void)
|
||||
static int read_tool_counters(void)
|
||||
{
|
||||
if (!stat_config.stop_read_counter) {
|
||||
if (read_bpf_map_counters() ||
|
||||
read_affinity_counters())
|
||||
return -1;
|
||||
struct evsel *counter;
|
||||
|
||||
evlist__for_each_entry(evsel_list, counter) {
|
||||
int idx;
|
||||
|
||||
if (!evsel__is_tool(counter))
|
||||
continue;
|
||||
|
||||
perf_cpu_map__for_each_idx(idx, counter->core.cpus) {
|
||||
if (!counter->err)
|
||||
counter->err = read_counter_cpu(counter, idx);
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int read_counters(void)
|
||||
{
|
||||
int ret;
|
||||
|
||||
if (stat_config.stop_read_counter)
|
||||
return 0;
|
||||
|
||||
// Read all BPF counters first.
|
||||
ret = read_bpf_map_counters();
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
// Read non-BPF and non-tool counters next.
|
||||
ret = read_counters_with_affinity();
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
// Read the tool counters last. This way the duration_time counter
|
||||
// should always be greater than any other counter's enabled time.
|
||||
return read_tool_counters();
|
||||
}
|
||||
|
||||
static void process_counters(void)
|
||||
{
|
||||
struct evsel *counter;
|
||||
|
|
@ -434,8 +488,8 @@ static void process_interval(void)
|
|||
pr_err("failed to write stat round event\n");
|
||||
}
|
||||
|
||||
init_stats(&walltime_nsecs_stats);
|
||||
update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000ULL);
|
||||
init_stats(stat_config.walltime_nsecs_stats);
|
||||
update_stats(stat_config.walltime_nsecs_stats, stat_config.interval * 1000000ULL);
|
||||
print_counters(&rs, 0, NULL);
|
||||
}
|
||||
|
||||
|
|
@ -624,8 +678,9 @@ static enum counter_recovery stat_handle_error(struct evsel *counter, int err)
|
|||
*/
|
||||
if (err == EINVAL || err == ENOSYS || err == ENOENT || err == ENXIO) {
|
||||
if (verbose > 0) {
|
||||
ui__warning("%s event is not supported by the kernel.\n",
|
||||
evsel__name(counter));
|
||||
evsel__open_strerror(counter, &target, err, msg, sizeof(msg));
|
||||
ui__warning("%s event is not supported by the kernel.\n%s\n",
|
||||
evsel__name(counter), msg);
|
||||
}
|
||||
return COUNTER_SKIP;
|
||||
}
|
||||
|
|
@ -649,10 +704,11 @@ static enum counter_recovery stat_handle_error(struct evsel *counter, int err)
|
|||
}
|
||||
}
|
||||
if (verbose > 0) {
|
||||
evsel__open_strerror(counter, &target, err, msg, sizeof(msg));
|
||||
ui__warning(err == EOPNOTSUPP
|
||||
? "%s event is not supported by the kernel.\n"
|
||||
: "skipping event %s that kernel failed to open.\n",
|
||||
evsel__name(counter));
|
||||
? "%s event is not supported by the kernel.\n%s\n"
|
||||
: "skipping event %s that kernel failed to open.\n%s\n",
|
||||
evsel__name(counter), msg);
|
||||
}
|
||||
return COUNTER_SKIP;
|
||||
}
|
||||
|
|
@ -713,6 +769,17 @@ static int create_perf_stat_counter(struct evsel *evsel,
|
|||
evsel->core.threads);
|
||||
}
|
||||
|
||||
static void update_rusage_stats(const struct rusage *rusage)
|
||||
{
|
||||
const u64 us_to_ns = 1000;
|
||||
const u64 s_to_ns = 1000000000;
|
||||
|
||||
update_stats(&ru_stats.ru_utime_usec_stat,
|
||||
(rusage->ru_utime.tv_usec * us_to_ns + rusage->ru_utime.tv_sec * s_to_ns));
|
||||
update_stats(&ru_stats.ru_stime_usec_stat,
|
||||
(rusage->ru_stime.tv_usec * us_to_ns + rusage->ru_stime.tv_sec * s_to_ns));
|
||||
}
|
||||
|
||||
static int __run_perf_stat(int argc, const char **argv, int run_idx)
|
||||
{
|
||||
int interval = stat_config.interval;
|
||||
|
|
@ -856,9 +923,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
|
|||
goto err_out;
|
||||
}
|
||||
}
|
||||
if (!has_supported_counters) {
|
||||
evsel__open_strerror(evlist__first(evsel_list), &target, open_err,
|
||||
msg, sizeof(msg));
|
||||
if (!has_supported_counters && !stat_config.null_run) {
|
||||
if (open_err) {
|
||||
evsel__open_strerror(evlist__first(evsel_list), &target, open_err,
|
||||
msg, sizeof(msg));
|
||||
}
|
||||
ui__error("No supported events found.\n%s\n", msg);
|
||||
|
||||
if (child_pid != -1)
|
||||
|
|
@ -938,10 +1007,20 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
|
|||
goto err_out;
|
||||
}
|
||||
|
||||
if (WIFSIGNALED(status))
|
||||
if (WIFSIGNALED(status)) {
|
||||
/*
|
||||
* We want to indicate failure to stop a repeat run,
|
||||
* hence negative. We want the value to be the exit code
|
||||
* of perf, which for termination by a signal is 128
|
||||
* plus the signal number.
|
||||
*/
|
||||
err = 0 - (128 + WTERMSIG(status));
|
||||
psignal(WTERMSIG(status), argv[0]);
|
||||
} else {
|
||||
err = WEXITSTATUS(status);
|
||||
}
|
||||
} else {
|
||||
status = dispatch_events(forks, timeout, interval, ×);
|
||||
err = dispatch_events(forks, timeout, interval, ×);
|
||||
}
|
||||
|
||||
disable_counters();
|
||||
|
|
@ -954,15 +1033,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
|
|||
if (interval && stat_config.summary) {
|
||||
stat_config.interval = 0;
|
||||
stat_config.stop_read_counter = true;
|
||||
init_stats(&walltime_nsecs_stats);
|
||||
update_stats(&walltime_nsecs_stats, t1 - t0);
|
||||
init_stats(stat_config.walltime_nsecs_stats);
|
||||
update_stats(stat_config.walltime_nsecs_stats, t1 - t0);
|
||||
|
||||
evlist__copy_prev_raw_counts(evsel_list);
|
||||
evlist__reset_prev_raw_counts(evsel_list);
|
||||
evlist__reset_aggr_stats(evsel_list);
|
||||
} else {
|
||||
update_stats(&walltime_nsecs_stats, t1 - t0);
|
||||
update_rusage_stats(&ru_stats, &stat_config.ru_data);
|
||||
update_stats(stat_config.walltime_nsecs_stats, t1 - t0);
|
||||
update_rusage_stats(&stat_config.ru_data);
|
||||
}
|
||||
|
||||
/*
|
||||
|
|
@ -981,7 +1060,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
|
|||
if (!STAT_RECORD)
|
||||
evlist__close(evsel_list);
|
||||
|
||||
return WEXITSTATUS(status);
|
||||
return err;
|
||||
|
||||
err_out:
|
||||
if (forks)
|
||||
|
|
@ -1851,6 +1930,35 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static int default_evlist_evsel_cmp(void *priv __maybe_unused,
|
||||
const struct list_head *l,
|
||||
const struct list_head *r)
|
||||
{
|
||||
const struct perf_evsel *lhs_core = container_of(l, struct perf_evsel, node);
|
||||
const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
|
||||
const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
|
||||
const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
|
||||
|
||||
if (evsel__leader(lhs) == evsel__leader(rhs)) {
|
||||
/* Within the same group, respect the original order. */
|
||||
return lhs_core->idx - rhs_core->idx;
|
||||
}
|
||||
|
||||
/* Sort default metrics evsels first, and default show events before those. */
|
||||
if (lhs->default_metricgroup != rhs->default_metricgroup)
|
||||
return lhs->default_metricgroup ? -1 : 1;
|
||||
|
||||
if (lhs->default_show_events != rhs->default_show_events)
|
||||
return lhs->default_show_events ? -1 : 1;
|
||||
|
||||
/* Sort by PMU type (prefers legacy types first). */
|
||||
if (lhs->pmu != rhs->pmu)
|
||||
return lhs->pmu->type - rhs->pmu->type;
|
||||
|
||||
/* Sort by name. */
|
||||
return strcmp(evsel__name((struct evsel *)lhs), evsel__name((struct evsel *)rhs));
|
||||
}
|
||||
|
||||
/*
|
||||
* Add default events, if there were no attributes specified or
|
||||
* if -d/--detailed, -d -d or -d -d -d is used:
|
||||
|
|
@ -1973,48 +2081,39 @@ static int add_default_events(void)
|
|||
stat_config.topdown_level = 1;
|
||||
|
||||
if (!evlist->core.nr_entries && !evsel_list->core.nr_entries) {
|
||||
/* No events so add defaults. */
|
||||
if (target__has_cpu(&target))
|
||||
ret = parse_events(evlist, "cpu-clock", &err);
|
||||
else
|
||||
ret = parse_events(evlist, "task-clock", &err);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
ret = parse_events(evlist,
|
||||
"context-switches,"
|
||||
"cpu-migrations,"
|
||||
"page-faults,"
|
||||
"instructions,"
|
||||
"cycles,"
|
||||
"stalled-cycles-frontend,"
|
||||
"stalled-cycles-backend,"
|
||||
"branches,"
|
||||
"branch-misses",
|
||||
&err);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
/*
|
||||
* Add TopdownL1 metrics if they exist. To minimize
|
||||
* multiplexing, don't request threshold computation.
|
||||
* Add Default metrics. To minimize multiplexing, don't request
|
||||
* threshold computation, but it will be computed if the events
|
||||
* are present.
|
||||
*/
|
||||
if (metricgroup__has_metric_or_groups(pmu, "Default")) {
|
||||
struct evlist *metric_evlist = evlist__new();
|
||||
const char *default_metricgroup_names[] = {
|
||||
"Default", "Default2", "Default3", "Default4",
|
||||
};
|
||||
|
||||
for (size_t i = 0; i < ARRAY_SIZE(default_metricgroup_names); i++) {
|
||||
struct evlist *metric_evlist;
|
||||
|
||||
if (!metricgroup__has_metric_or_groups(pmu, default_metricgroup_names[i]))
|
||||
continue;
|
||||
|
||||
if ((int)i > detailed_run)
|
||||
break;
|
||||
|
||||
metric_evlist = evlist__new();
|
||||
if (!metric_evlist) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
break;
|
||||
}
|
||||
if (metricgroup__parse_groups(metric_evlist, pmu, "Default",
|
||||
if (metricgroup__parse_groups(metric_evlist, pmu, default_metricgroup_names[i],
|
||||
/*metric_no_group=*/false,
|
||||
/*metric_no_merge=*/false,
|
||||
/*metric_no_threshold=*/true,
|
||||
stat_config.user_requested_cpu_list,
|
||||
stat_config.system_wide,
|
||||
stat_config.hardware_aware_grouping) < 0) {
|
||||
evlist__delete(metric_evlist);
|
||||
ret = -1;
|
||||
goto out;
|
||||
break;
|
||||
}
|
||||
|
||||
evlist__for_each_entry(metric_evlist, evsel)
|
||||
|
|
@ -2026,44 +2125,8 @@ static int add_default_events(void)
|
|||
&metric_evlist->metric_events);
|
||||
evlist__delete(metric_evlist);
|
||||
}
|
||||
}
|
||||
list_sort(/*priv=*/NULL, &evlist->core.entries, default_evlist_evsel_cmp);
|
||||
|
||||
/* Detailed events get appended to the event list: */
|
||||
|
||||
if (!ret && detailed_run >= 1) {
|
||||
/*
|
||||
* Detailed stats (-d), covering the L1 and last level data
|
||||
* caches:
|
||||
*/
|
||||
ret = parse_events(evlist,
|
||||
"L1-dcache-loads,"
|
||||
"L1-dcache-load-misses,"
|
||||
"LLC-loads,"
|
||||
"LLC-load-misses",
|
||||
&err);
|
||||
}
|
||||
if (!ret && detailed_run >= 2) {
|
||||
/*
|
||||
* Very detailed stats (-d -d), covering the instruction cache
|
||||
* and the TLB caches:
|
||||
*/
|
||||
ret = parse_events(evlist,
|
||||
"L1-icache-loads,"
|
||||
"L1-icache-load-misses,"
|
||||
"dTLB-loads,"
|
||||
"dTLB-load-misses,"
|
||||
"iTLB-loads,"
|
||||
"iTLB-load-misses",
|
||||
&err);
|
||||
}
|
||||
if (!ret && detailed_run >= 3) {
|
||||
/*
|
||||
* Very, very detailed stats (-d -d -d), adding prefetch events:
|
||||
*/
|
||||
ret = parse_events(evlist,
|
||||
"L1-dcache-prefetches,"
|
||||
"L1-dcache-prefetch-misses",
|
||||
&err);
|
||||
}
|
||||
out:
|
||||
if (!ret) {
|
||||
|
|
@ -2072,7 +2135,7 @@ out:
|
|||
* Make at least one event non-skippable so fatal errors are visible.
|
||||
* 'cycles' always used to be default and non-skippable, so use that.
|
||||
*/
|
||||
if (strcmp("cycles", evsel__name(evsel)))
|
||||
if (!evsel__match(evsel, HARDWARE, HW_CPU_CYCLES))
|
||||
evsel->skippable = true;
|
||||
}
|
||||
}
|
||||
|
|
@ -2136,7 +2199,8 @@ static int __cmd_record(const struct option stat_options[], struct opt_aggr_mode
|
|||
return argc;
|
||||
}
|
||||
|
||||
static int process_stat_round_event(struct perf_session *session,
|
||||
static int process_stat_round_event(const struct perf_tool *tool __maybe_unused,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
struct perf_record_stat_round *stat_round = &event->stat_round;
|
||||
|
|
@ -2148,7 +2212,7 @@ static int process_stat_round_event(struct perf_session *session,
|
|||
process_counters();
|
||||
|
||||
if (stat_round->type == PERF_STAT_ROUND_TYPE__FINAL)
|
||||
update_stats(&walltime_nsecs_stats, stat_round->time);
|
||||
update_stats(stat_config.walltime_nsecs_stats, stat_round->time);
|
||||
|
||||
if (stat_config.interval && stat_round->time) {
|
||||
tsh.tv_sec = stat_round->time / NSEC_PER_SEC;
|
||||
|
|
@ -2161,10 +2225,10 @@ static int process_stat_round_event(struct perf_session *session,
|
|||
}
|
||||
|
||||
static
|
||||
int process_stat_config_event(struct perf_session *session,
|
||||
int process_stat_config_event(const struct perf_tool *tool,
|
||||
struct perf_session *session,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
|
||||
|
||||
perf_event__read_stat_config(&stat_config, &event->stat_config);
|
||||
|
|
@ -2210,10 +2274,10 @@ static int set_maps(struct perf_stat *st)
|
|||
}
|
||||
|
||||
static
|
||||
int process_thread_map_event(struct perf_session *session,
|
||||
int process_thread_map_event(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
|
||||
|
||||
if (st->threads) {
|
||||
|
|
@ -2229,10 +2293,10 @@ int process_thread_map_event(struct perf_session *session,
|
|||
}
|
||||
|
||||
static
|
||||
int process_cpu_map_event(struct perf_session *session,
|
||||
int process_cpu_map_event(const struct perf_tool *tool,
|
||||
struct perf_session *session __maybe_unused,
|
||||
union perf_event *event)
|
||||
{
|
||||
const struct perf_tool *tool = session->tool;
|
||||
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
|
||||
struct perf_cpu_map *cpus;
|
||||
|
||||
|
|
@ -2540,6 +2604,7 @@ int cmd_stat(int argc, const char **argv)
|
|||
unsigned int interval, timeout;
|
||||
const char * const stat_subcommands[] = { "record", "report" };
|
||||
char errbuf[BUFSIZ];
|
||||
struct evsel *counter;
|
||||
|
||||
setlocale(LC_ALL, "");
|
||||
|
||||
|
|
@ -2794,9 +2859,28 @@ int cmd_stat(int argc, const char **argv)
|
|||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
#ifdef HAVE_BPF_SKEL
|
||||
if (target.use_bpf && nr_cgroups &&
|
||||
(evsel_list->core.nr_entries / nr_cgroups) > BPERF_CGROUP__MAX_EVENTS) {
|
||||
pr_warning("Disabling BPF counters due to more events (%d) than the max (%d)\n",
|
||||
evsel_list->core.nr_entries / nr_cgroups, BPERF_CGROUP__MAX_EVENTS);
|
||||
target.use_bpf = false;
|
||||
}
|
||||
#endif // HAVE_BPF_SKEL
|
||||
evlist__warn_user_requested_cpus(evsel_list, target.cpu_list);
|
||||
|
||||
evlist__for_each_entry(evsel_list, counter) {
|
||||
/*
|
||||
* Setup BPF counters to require CPUs as any(-1) isn't
|
||||
* supported. evlist__create_maps below will propagate this
|
||||
* information to the evsels. Note, evsel__is_bperf isn't yet
|
||||
* set up, and this change must happen early, so directly use
|
||||
* the bpf_counter variable and target information.
|
||||
*/
|
||||
if ((counter->bpf_counter || target.use_bpf) && !target__has_cpu(&target))
|
||||
counter->core.requires_cpu = true;
|
||||
}
|
||||
|
||||
if (evlist__create_maps(evsel_list, &target) < 0) {
|
||||
if (target__has_task(&target)) {
|
||||
pr_err("Problems finding threads of monitor\n");
|
||||
|
|
@ -2895,7 +2979,7 @@ int cmd_stat(int argc, const char **argv)
|
|||
evlist__reset_prev_raw_counts(evsel_list);
|
||||
|
||||
status = run_perf_stat(argc, argv, run_idx);
|
||||
if (status == -1)
|
||||
if (status < 0)
|
||||
break;
|
||||
|
||||
if (forever && !interval) {
|
||||
|
|
@ -2936,7 +3020,7 @@ int cmd_stat(int argc, const char **argv)
|
|||
}
|
||||
|
||||
if (!interval) {
|
||||
if (WRITE_STAT_ROUND_EVENT(walltime_nsecs_stats.max, FINAL))
|
||||
if (WRITE_STAT_ROUND_EVENT(stat_config.walltime_nsecs_stats->max, FINAL))
|
||||
pr_err("failed to write stat round event\n");
|
||||
}
|
||||
|
||||
|
|
@ -2965,5 +3049,6 @@ out:
|
|||
|
||||
evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
|
||||
|
||||
return status;
|
||||
/* Only the low byte of status becomes the exit code. */
|
||||
return abs(status);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -1651,7 +1651,7 @@ out_delete:
|
|||
return ret;
|
||||
}
|
||||
|
||||
static int timechart__io_record(int argc, const char **argv)
|
||||
static int timechart__io_record(int argc, const char **argv, const char *output_data)
|
||||
{
|
||||
unsigned int rec_argc, i;
|
||||
const char **rec_argv;
|
||||
|
|
@ -1659,7 +1659,7 @@ static int timechart__io_record(int argc, const char **argv)
|
|||
char *filter = NULL;
|
||||
|
||||
const char * const common_args[] = {
|
||||
"record", "-a", "-R", "-c", "1",
|
||||
"record", "-a", "-R", "-c", "1", "-o", output_data,
|
||||
};
|
||||
unsigned int common_args_nr = ARRAY_SIZE(common_args);
|
||||
|
||||
|
|
@ -1786,7 +1786,8 @@ static int timechart__io_record(int argc, const char **argv)
|
|||
}
|
||||
|
||||
|
||||
static int timechart__record(struct timechart *tchart, int argc, const char **argv)
|
||||
static int timechart__record(struct timechart *tchart, int argc, const char **argv,
|
||||
const char *output_data)
|
||||
{
|
||||
unsigned int rec_argc, i, j;
|
||||
const char **rec_argv;
|
||||
|
|
@ -1794,7 +1795,7 @@ static int timechart__record(struct timechart *tchart, int argc, const char **ar
|
|||
unsigned int record_elems;
|
||||
|
||||
const char * const common_args[] = {
|
||||
"record", "-a", "-R", "-c", "1",
|
||||
"record", "-a", "-R", "-c", "1", "-o", output_data,
|
||||
};
|
||||
unsigned int common_args_nr = ARRAY_SIZE(common_args);
|
||||
|
||||
|
|
@ -1934,6 +1935,7 @@ int cmd_timechart(int argc, const char **argv)
|
|||
.merge_dist = 1000,
|
||||
};
|
||||
const char *output_name = "output.svg";
|
||||
const char *output_record_data = "perf.data";
|
||||
const struct option timechart_common_options[] = {
|
||||
OPT_BOOLEAN('P', "power-only", &tchart.power_only, "output power data only"),
|
||||
OPT_BOOLEAN('T', "tasks-only", &tchart.tasks_only, "output processes data only"),
|
||||
|
|
@ -1976,6 +1978,7 @@ int cmd_timechart(int argc, const char **argv)
|
|||
OPT_BOOLEAN('I', "io-only", &tchart.io_only,
|
||||
"record only IO data"),
|
||||
OPT_BOOLEAN('g', "callchain", &tchart.with_backtrace, "record callchain"),
|
||||
OPT_STRING('o', "output", &output_record_data, "file", "output data file name"),
|
||||
OPT_PARENT(timechart_common_options),
|
||||
};
|
||||
const char * const timechart_record_usage[] = {
|
||||
|
|
@ -2024,9 +2027,9 @@ int cmd_timechart(int argc, const char **argv)
|
|||
}
|
||||
|
||||
if (tchart.io_only)
|
||||
ret = timechart__io_record(argc, argv);
|
||||
ret = timechart__io_record(argc, argv, output_record_data);
|
||||
else
|
||||
ret = timechart__record(&tchart, argc, argv);
|
||||
ret = timechart__record(&tchart, argc, argv, output_record_data);
|
||||
goto out;
|
||||
} else if (argc)
|
||||
usage_with_options(timechart_usage, timechart_options);
|
||||
|
|
|
|||
|
|
@ -1695,11 +1695,13 @@ int cmd_top(int argc, const char **argv)
|
|||
goto out_delete_evlist;
|
||||
|
||||
if (!top.evlist->core.nr_entries) {
|
||||
bool can_profile_kernel = perf_event_paranoid_check(1);
|
||||
int err = parse_event(top.evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu");
|
||||
struct evlist *def_evlist = evlist__new_default();
|
||||
|
||||
if (err)
|
||||
if (!def_evlist)
|
||||
goto out_delete_evlist;
|
||||
|
||||
evlist__splice_list_tail(top.evlist, &def_evlist->core.entries);
|
||||
evlist__delete(def_evlist);
|
||||
}
|
||||
|
||||
status = evswitch__init(&top.evswitch, top.evlist, stderr);
|
||||
|
|
|
|||
|
|
@ -2005,7 +2005,9 @@ static int trace__symbols_init(struct trace *trace, int argc, const char **argv,
|
|||
|
||||
err = __machine__synthesize_threads(trace->host, &trace->tool, &trace->opts.target,
|
||||
evlist->core.threads, trace__tool_process,
|
||||
true, false, 1);
|
||||
/*needs_mmap=*/callchain_param.enabled,
|
||||
/*mmap_data=*/false,
|
||||
/*nr_threads_synthesize=*/1);
|
||||
out:
|
||||
if (err) {
|
||||
perf_env__exit(&trace->host_env);
|
||||
|
|
@ -2067,6 +2069,15 @@ static const struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *n
|
|||
return __syscall_arg_fmt__find_by_name(syscall_arg_fmts__by_name, nmemb, name);
|
||||
}
|
||||
|
||||
/*
|
||||
* v6.19 kernel added new fields to read userspace memory for event tracing.
|
||||
* But it's not used by perf and confuses the syscall parameters.
|
||||
*/
|
||||
static bool is_internal_field(struct tep_format_field *field)
|
||||
{
|
||||
return !strcmp(field->type, "__data_loc char[]");
|
||||
}
|
||||
|
||||
static struct tep_format_field *
|
||||
syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field,
|
||||
bool *use_btf)
|
||||
|
|
@ -2075,6 +2086,10 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
|
|||
int len;
|
||||
|
||||
for (; field; field = field->next, ++arg) {
|
||||
/* assume it's the last argument */
|
||||
if (is_internal_field(field))
|
||||
continue;
|
||||
|
||||
last_field = field;
|
||||
|
||||
if (arg->scnprintf)
|
||||
|
|
@ -2143,6 +2158,7 @@ static int syscall__read_info(struct syscall *sc, struct trace *trace)
|
|||
{
|
||||
char tp_name[128];
|
||||
const char *name;
|
||||
struct tep_format_field *field;
|
||||
int err;
|
||||
|
||||
if (sc->nonexistent)
|
||||
|
|
@ -2199,6 +2215,13 @@ static int syscall__read_info(struct syscall *sc, struct trace *trace)
|
|||
--sc->nr_args;
|
||||
}
|
||||
|
||||
field = sc->args;
|
||||
while (field) {
|
||||
if (is_internal_field(field))
|
||||
--sc->nr_args;
|
||||
field = field->next;
|
||||
}
|
||||
|
||||
sc->is_exit = !strcmp(name, "exit_group") || !strcmp(name, "exit");
|
||||
sc->is_open = !strcmp(name, "open") || !strcmp(name, "openat");
|
||||
|
||||
|
|
|
|||
|
|
@ -1,7 +1,5 @@
|
|||
pmu-events-y += pmu-events.o
|
||||
JDIR = pmu-events/arch/$(SRCARCH)
|
||||
JSON = $(shell [ -d $(JDIR) ] && \
|
||||
find $(JDIR) -name '*.json' -o -name 'mapfile.csv')
|
||||
JSON = $(shell find pmu-events/arch -name '*.json' -o -name '*.csv')
|
||||
JDIR_TEST = pmu-events/arch/test
|
||||
JSON_TEST = $(shell [ -d $(JDIR_TEST) ] && \
|
||||
find $(JDIR_TEST) -name '*.json')
|
||||
|
|
@ -13,6 +11,8 @@ PMU_EVENTS_C = $(OUTPUT)pmu-events/pmu-events.c
|
|||
METRIC_TEST_LOG = $(OUTPUT)pmu-events/metric_test.log
|
||||
TEST_EMPTY_PMU_EVENTS_C = $(OUTPUT)pmu-events/test-empty-pmu-events.c
|
||||
EMPTY_PMU_EVENTS_TEST_LOG = $(OUTPUT)pmu-events/empty-pmu-events.log
|
||||
LEGACY_CACHE_PY = pmu-events/make_legacy_cache.py
|
||||
LEGACY_CACHE_JSON = $(OUTPUT)pmu-events/arch/common/common/legacy-cache.json
|
||||
|
||||
ifeq ($(JEVENTS_ARCH),)
|
||||
JEVENTS_ARCH=$(SRCARCH)
|
||||
|
|
@ -29,13 +29,26 @@ $(PMU_EVENTS_C): $(EMPTY_PMU_EVENTS_C)
|
|||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,gen)cp $< $@
|
||||
else
|
||||
# Copy checked-in json to OUTPUT for generation if it's an out of source build
|
||||
ifneq ($(OUTPUT),)
|
||||
$(OUTPUT)pmu-events/arch/%: pmu-events/arch/%
|
||||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,gen)cp $< $@
|
||||
endif
|
||||
|
||||
$(LEGACY_CACHE_JSON): $(LEGACY_CACHE_PY)
|
||||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,gen)$(PYTHON) $(LEGACY_CACHE_PY) > $@
|
||||
|
||||
GEN_JSON = $(patsubst %,$(OUTPUT)%,$(JSON)) $(LEGACY_CACHE_JSON)
|
||||
|
||||
$(METRIC_TEST_LOG): $(METRIC_TEST_PY) $(METRIC_PY)
|
||||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,test)$(PYTHON) $< 2> $@ || (cat $@ && false)
|
||||
|
||||
$(TEST_EMPTY_PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG)
|
||||
$(TEST_EMPTY_PMU_EVENTS_C): $(GEN_JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG)
|
||||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) none none pmu-events/arch $@
|
||||
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) none none $(OUTPUT)pmu-events/arch $@
|
||||
|
||||
$(EMPTY_PMU_EVENTS_TEST_LOG): $(EMPTY_PMU_EVENTS_C) $(TEST_EMPTY_PMU_EVENTS_C)
|
||||
$(call rule_mkdir)
|
||||
|
|
@ -63,10 +76,10 @@ $(OUTPUT)%.pylint_log: %
|
|||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,test)pylint "$<" > $@ || (cat $@ && rm $@ && false)
|
||||
|
||||
$(PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG) \
|
||||
$(PMU_EVENTS_C): $(GEN_JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG) \
|
||||
$(EMPTY_PMU_EVENTS_TEST_LOG) $(PMU_EVENTS_MYPY_TEST_LOGS) $(PMU_EVENTS_PYLINT_TEST_LOGS)
|
||||
$(call rule_mkdir)
|
||||
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) $(JEVENTS_MODEL) pmu-events/arch $@
|
||||
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) $(JEVENTS_MODEL) $(OUTPUT)pmu-events/arch $@
|
||||
endif
|
||||
|
||||
# pmu-events.c file is generated in the OUTPUT directory so it needs a
|
||||
|
|
|
|||
|
|
@ -388,55 +388,55 @@
|
|||
"MetricExpr": "L1D_CACHE_RW / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache access - demand",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_access_prefetches",
|
||||
"MetricExpr": "L1D_CACHE_PRFM / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache access - prefetch",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_RW / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses_read",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_RD / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses - read",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses_write",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_WR / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses - write",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_prefetch_misses",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_PRFM / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache prefetch misses",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "ase_scalar_mix",
|
||||
"MetricExpr": "ASE_SCALAR_SPEC / OP_SPEC",
|
||||
"BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) scalar operations",
|
||||
"MetricGroup": "Instructions",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
},
|
||||
{
|
||||
"MetricName": "ase_vector_mix",
|
||||
"MetricExpr": "ASE_VECTOR_SPEC / OP_SPEC",
|
||||
"BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) vector operations",
|
||||
"MetricGroup": "Instructions",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
"ScaleUnit": "100percent of cache accesses"
|
||||
}
|
||||
]
|
||||
|
|
|
|||
|
|
@ -81,7 +81,7 @@
|
|||
"BriefDescription": "L2D TLB access"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 access to instruciton TLB that caused a page table walk. This event counts on any instruciton access which causes L2I_TLB_REFILL to count",
|
||||
"PublicDescription": "Level 2 access to instruction TLB that caused a page table walk. This event counts on any instruction access which causes L2I_TLB_REFILL to count",
|
||||
"EventCode": "0x35",
|
||||
"EventName": "L2I_TLB_ACCESS",
|
||||
"BriefDescription": "L2I TLB access"
|
||||
|
|
|
|||
|
|
@ -0,0 +1,9 @@
|
|||
[
|
||||
{
|
||||
"BriefDescription": "ddr cycles event",
|
||||
"EventCode": "0x00",
|
||||
"EventName": "imx94_ddr.cycles",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
}
|
||||
]
|
||||
|
|
@ -0,0 +1,450 @@
|
|||
[
|
||||
{
|
||||
"BriefDescription": "bandwidth usage for lpddr5 evk board",
|
||||
"MetricName": "imx94_bandwidth_usage.lpddr5",
|
||||
"MetricExpr": "(( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x000\\,axi_id\\=0x000@ + imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x000\\,axi_id\\=0x000@ ) * 32 / duration_time) / (4266 * 1000000 * 4)",
|
||||
"ScaleUnit": "1e2%",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bandwidth usage for lpddr4 evk board",
|
||||
"MetricName": "imx94_bandwidth_usage.lpddr4",
|
||||
"MetricExpr": "(( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x000\\,axi_id\\=0x000@ + imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x000\\,axi_id\\=0x000@ ) * 32 / duration_time) / (4266 * 1000000 * 4)",
|
||||
"ScaleUnit": "1e2%",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of all masters read from ddr",
|
||||
"MetricName": "imx94_ddr_read.all",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x000\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of all masters write to ddr",
|
||||
"MetricName": "imx94_ddr_write.all",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x000\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of all a55 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.a55_all",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x3fc\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of all a55 write from ddr",
|
||||
"MetricName": "imx94_ddr_write.a55_all",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3fc\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 0 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.a55_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x3ff\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 0 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.a55_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3ff\\,axi_id\\=0x000@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.a55_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x00f\\,axi_id\\=0x001@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.a55_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x001@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 2 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.a55_2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x00f\\,axi_id\\=0x002@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 2 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.a55_2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x002@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 3 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.a55_3",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x00f\\,axi_id\\=0x003@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of a55 core 3 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.a55_3",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x003@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m7 core1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.m7_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x00f\\,axi_id\\=0x004@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m7 core1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.m7_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x004@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m33 core1 (in netc) read from ddr",
|
||||
"MetricName": "imx94_ddr_read.m33_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x00f\\,axi_id\\=0x005@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m33 core1 (in netc) write to ddr",
|
||||
"MetricName": "imx94_ddr_write.m33_1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x005@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pcie2 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.pcie2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x00f\\,axi_id\\=0x006@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pcie2 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.pcie2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x006@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of Cortex-A DSU L3 evicted/ACP transactions read from ddr",
|
||||
"MetricName": "imx94_ddr_read.cortex_a_dsu",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x00f\\,axi_id\\=0x007@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of Cortex-A DSU L3 evicted/ACP transactions write to ddr",
|
||||
"MetricName": "imx94_ddr_write.cortex_a_dsu",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x007@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m33 core0 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.m33_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x00f\\,axi_id\\=0x008@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m33 core0 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.m33_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x008@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m7 core0 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.m7_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x00f\\,axi_id\\=0x009@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of m7 core0 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.m7_0",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x009@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of sentinel read from ddr",
|
||||
"MetricName": "imx94_ddr_read.sentinel",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x00f\\,axi_id\\=0x00a@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of sentinel write to ddr",
|
||||
"MetricName": "imx94_ddr_write.sentinel",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x00a@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of edma1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.edma1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x00f\\,axi_id\\=0x00b@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of edma1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.edma1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x00b@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of edma2 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.edma2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x00f\\,axi_id\\=0x00c@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of edma2 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.edma2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x00c@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of netc read from ddr",
|
||||
"MetricName": "imx94_ddr_read.netc",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x00f\\,axi_id\\=0x00d@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of netc write to ddr",
|
||||
"MetricName": "imx94_ddr_write.netc",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x00d@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of aonmix read from ddr",
|
||||
"MetricName": "imx94_ddr_read.aonmix",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x00f\\,axi_id\\=0x00f@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of aonmix write to ddr",
|
||||
"MetricName": "imx94_ddr_write.aonmix",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x00f\\,axi_id\\=0x00f@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of npumix read from ddr",
|
||||
"MetricName": "imx94_ddr_read.npumix",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x3f0\\,axi_id\\=0x010@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of npumix write to ddr",
|
||||
"MetricName": "imx94_ddr_write.npumix",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x010@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.usdhc1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x3f0\\,axi_id\\=0x0b0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.usdhc1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x0b0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc2 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.usdhc2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x3f0\\,axi_id\\=0x0c0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc2 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.usdhc2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x0c0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc3 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.usdhc3",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x3f0\\,axi_id\\=0x0d0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usdhc3 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.usdhc3",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x0d0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of xspi read from ddr",
|
||||
"MetricName": "imx94_ddr_read.xspi",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x3f0\\,axi_id\\=0x0f0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of xspi write to ddr",
|
||||
"MetricName": "imx94_ddr_write.xspi",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x0f0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pcie1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.pcie1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x3f0\\,axi_id\\=0x100@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pcie1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.pcie1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x100@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usb1 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.usb1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x3f0\\,axi_id\\=0x140@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usb1 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.usb1",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x140@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usb2 read from ddr",
|
||||
"MetricName": "imx94_ddr_read.usb2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt2\\,axi_mask\\=0x3f0\\,axi_id\\=0x150@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of usb2 write to ddr",
|
||||
"MetricName": "imx94_ddr_write.usb2",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x150@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pxp read from ddr",
|
||||
"MetricName": "imx94_ddr_read.pxp",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt0\\,axi_mask\\=0x3f0\\,axi_id\\=0x2a0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of pxp write to ddr",
|
||||
"MetricName": "imx94_ddr_write.pxp",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x2a0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of dcif read from ddr",
|
||||
"MetricName": "imx94_ddr_read.dcif",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_rd_beat_filt1\\,axi_mask\\=0x3f0\\,axi_id\\=0x2b0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "bytes of dcif write to ddr",
|
||||
"MetricName": "imx94_ddr_write.dcif",
|
||||
"MetricExpr": "( imx9_ddr0@eddrtq_pm_wr_beat_filt\\,axi_mask\\=0x3f0\\,axi_id\\=0x2b0@ ) * 32",
|
||||
"ScaleUnit": "9.765625e-4KB",
|
||||
"Unit": "imx9_ddr",
|
||||
"Compat": "imx94"
|
||||
}
|
||||
]
|
||||
|
|
@ -0,0 +1,72 @@
|
|||
[
|
||||
{
|
||||
"EventName": "cpu-cycles",
|
||||
"BriefDescription": "Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles].",
|
||||
"LegacyConfigCode": "0"
|
||||
},
|
||||
{
|
||||
"EventName": "cycles",
|
||||
"BriefDescription": "Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles].",
|
||||
"LegacyConfigCode": "0"
|
||||
},
|
||||
{
|
||||
"EventName": "instructions",
|
||||
"BriefDescription": "Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts.",
|
||||
"LegacyConfigCode": "1"
|
||||
},
|
||||
{
|
||||
"EventName": "cache-references",
|
||||
"BriefDescription": "Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU.",
|
||||
"LegacyConfigCode": "2"
|
||||
},
|
||||
{
|
||||
"EventName": "cache-misses",
|
||||
"BriefDescription": "Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.",
|
||||
"LegacyConfigCode": "3"
|
||||
},
|
||||
{
|
||||
"EventName": "branches",
|
||||
"BriefDescription": "Retired branch instructions [This event is an alias of branch-instructions].",
|
||||
"LegacyConfigCode": "4"
|
||||
},
|
||||
{
|
||||
"EventName": "branch-instructions",
|
||||
"BriefDescription": "Retired branch instructions [This event is an alias of branches].",
|
||||
"LegacyConfigCode": "4"
|
||||
},
|
||||
{
|
||||
"EventName": "branch-misses",
|
||||
"BriefDescription": "Mispredicted branch instructions.",
|
||||
"LegacyConfigCode": "5"
|
||||
},
|
||||
{
|
||||
"EventName": "bus-cycles",
|
||||
"BriefDescription": "Bus cycles, which can be different from total cycles.",
|
||||
"LegacyConfigCode": "6"
|
||||
},
|
||||
{
|
||||
"EventName": "stalled-cycles-frontend",
|
||||
"BriefDescription": "Stalled cycles during issue [This event is an alias of idle-cycles-frontend].",
|
||||
"LegacyConfigCode": "7"
|
||||
},
|
||||
{
|
||||
"EventName": "idle-cycles-frontend",
|
||||
"BriefDescription": "Stalled cycles during issue [This event is an alias of stalled-cycles-fronted].",
|
||||
"LegacyConfigCode": "7"
|
||||
},
|
||||
{
|
||||
"EventName": "stalled-cycles-backend",
|
||||
"BriefDescription": "Stalled cycles during retirement [This event is an alias of idle-cycles-backend].",
|
||||
"LegacyConfigCode": "8"
|
||||
},
|
||||
{
|
||||
"EventName": "idle-cycles-backend",
|
||||
"BriefDescription": "Stalled cycles during retirement [This event is an alias of stalled-cycles-backend].",
|
||||
"LegacyConfigCode": "8"
|
||||
},
|
||||
{
|
||||
"EventName": "ref-cycles",
|
||||
"BriefDescription": "Total cycles; not affected by CPU frequency scaling.",
|
||||
"LegacyConfigCode": "9"
|
||||
}
|
||||
]
|
||||
|
|
@ -0,0 +1,151 @@
|
|||
[
|
||||
{
|
||||
"BriefDescription": "Average CPU utilization",
|
||||
"MetricExpr": "(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "CPUs_utilized",
|
||||
"ScaleUnit": "1CPUs",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Context switches per CPU second",
|
||||
"MetricExpr": "(software@context\\-switches\\,name\\=context\\-switches@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "cs_per_second",
|
||||
"ScaleUnit": "1cs/sec",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Process migrations to a new CPU per CPU second",
|
||||
"MetricExpr": "(software@cpu\\-migrations\\,name\\=cpu\\-migrations@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "migrations_per_second",
|
||||
"ScaleUnit": "1migrations/sec",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Page faults per CPU second",
|
||||
"MetricExpr": "(software@page\\-faults\\,name\\=page\\-faults@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "page_faults_per_second",
|
||||
"ScaleUnit": "1faults/sec",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Instructions Per Cycle",
|
||||
"MetricExpr": "instructions / cpu\\-cycles",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "insn_per_cycle",
|
||||
"MetricThreshold": "insn_per_cycle < 1",
|
||||
"ScaleUnit": "1instructions",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Max front or backend stalls per instruction",
|
||||
"MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "stalled_cycles_per_instruction",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Frontend stalls per cycle",
|
||||
"MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "frontend_cycles_idle",
|
||||
"MetricThreshold": "frontend_cycles_idle > 0.1",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Backend stalls per cycle",
|
||||
"MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "backend_cycles_idle",
|
||||
"MetricThreshold": "backend_cycles_idle > 0.2",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Cycles per CPU second",
|
||||
"MetricExpr": "cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "cycles_frequency",
|
||||
"ScaleUnit": "1GHz",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Branches per CPU second",
|
||||
"MetricExpr": "branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "branch_frequency",
|
||||
"ScaleUnit": "1000M/sec",
|
||||
"MetricConstraint": "NO_GROUP_EVENTS",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Branch miss rate",
|
||||
"MetricExpr": "branch\\-misses / branches",
|
||||
"MetricGroup": "Default",
|
||||
"MetricName": "branch_miss_rate",
|
||||
"MetricThreshold": "branch_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "L1D miss rate",
|
||||
"MetricExpr": "L1\\-dcache\\-load\\-misses / L1\\-dcache\\-loads",
|
||||
"MetricGroup": "Default2",
|
||||
"MetricName": "l1d_miss_rate",
|
||||
"MetricThreshold": "l1d_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "LLC miss rate",
|
||||
"MetricExpr": "LLC\\-load\\-misses / LLC\\-loads",
|
||||
"MetricGroup": "Default2",
|
||||
"MetricName": "llc_miss_rate",
|
||||
"MetricThreshold": "llc_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "L1I miss rate",
|
||||
"MetricExpr": "L1\\-icache\\-load\\-misses / L1\\-icache\\-loads",
|
||||
"MetricGroup": "Default3",
|
||||
"MetricName": "l1i_miss_rate",
|
||||
"MetricThreshold": "l1i_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "dTLB miss rate",
|
||||
"MetricExpr": "dTLB\\-load\\-misses / dTLB\\-loads",
|
||||
"MetricGroup": "Default3",
|
||||
"MetricName": "dtlb_miss_rate",
|
||||
"MetricThreshold": "dtlb_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "iTLB miss rate",
|
||||
"MetricExpr": "iTLB\\-load\\-misses / iTLB\\-loads",
|
||||
"MetricGroup": "Default3",
|
||||
"MetricName": "itlb_miss_rate",
|
||||
"MetricThreshold": "itlb_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "L1 prefetch miss rate",
|
||||
"MetricExpr": "L1\\-dcache\\-prefetch\\-misses / L1\\-dcache\\-prefetches",
|
||||
"MetricGroup": "Default4",
|
||||
"MetricName": "l1_prefetch_miss_rate",
|
||||
"MetricThreshold": "l1_prefetch_miss_rate > 0.05",
|
||||
"ScaleUnit": "100%",
|
||||
"DefaultShowEvents": "1"
|
||||
}
|
||||
]
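
For reference, metrics defined this way can also be requested by name; an illustrative invocation
(not part of the patch, and assuming the names above resolve through -M like any other metric):

$ perf stat -M CPUs_utilized,cs_per_second -- perf test -w noploop
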
@@ -3,13 +3,15 @@
     "Unit": "software",
     "EventName": "cpu-clock",
     "BriefDescription": "Per-CPU high-resolution timer based event",
-    "ConfigCode": "0"
+    "ConfigCode": "0",
+    "ScaleUnit": "1e-6msec"
 },
 {
     "Unit": "software",
     "EventName": "task-clock",
     "BriefDescription": "Per-task high-resolution timer based event",
-    "ConfigCode": "1"
+    "ConfigCode": "1",
+    "ScaleUnit": "1e-6msec"
 },
 {
     "Unit": "software",

@@ -70,5 +70,17 @@
"EventName": "system_tsc_freq",
"BriefDescription": "The amount a Time Stamp Counter (TSC) increases per second",
"ConfigCode": "12"
},
{
"Unit": "tool",
"EventName": "core_wide",
"BriefDescription": "1 if not SMT, if SMT are events being gathered on all SMT threads 1 otherwise 0",
"ConfigCode": "13"
},
{
"Unit": "tool",
"EventName": "target_cpu",
"BriefDescription": "1 if CPUs being analyzed, 0 if threads/processes",
"ConfigCode": "14"
}
]
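
Tool events are normally selectable by name in perf stat (duration_time already works this way);
an illustrative command, assuming system_tsc_freq is addressable the same way:

$ perf stat -e duration_time,system_tsc_freq -- sleep 1
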
@@ -20,5 +20,6 @@
0x489-0x8000000000000008-0x[[:xdigit:]]+,v1,sifive/p550,core
0x489-0x8000000000000[1-6]08-0x[9b][[:xdigit:]]+,v1,sifive/p650,core
0x5b7-0x0-0x0,v1,thead/c900-legacy,core
0x5b7-0x80000000090c0d00-0x2047000,v1,thead/c900-legacy,core
0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core
0x31e-0x8000000000008a45-0x[[:xdigit:]]+,v1,andes/ax45,core

@@ -7,17 +7,17 @@
 {
     "BriefDescription": "Cycles per Instruction",
     "MetricName": "cpi",
-    "MetricExpr": "CPU_CYCLES / INSTRUCTIONS if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "CPU_CYCLES / INSTRUCTIONS if has_event(CPU_CYCLES) else 0"
 },
 {
     "BriefDescription": "Problem State Instruction Ratio",
     "MetricName": "prbstate",
-    "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100 if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100 if has_event(PROBLEM_STATE_INSTRUCTIONS) else 0"
 },
 {
     "BriefDescription": "Level One Miss per 100 Instructions",
     "MetricName": "l1mp",
-    "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100 if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100 if has_event(L1I_DIR_WRITES) else 0"
 },
 {
     "BriefDescription": "Percentage sourced from Level 2 cache",
@@ -52,7 +52,7 @@
 {
     "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
     "MetricName": "est_cpi",
-    "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS) if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS) if has_event(CPU_CYCLES) else 0"
 },
 {
     "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",

@@ -7,17 +7,17 @@
 {
     "BriefDescription": "Cycles per Instruction",
     "MetricName": "cpi",
-    "MetricExpr": "CPU_CYCLES / INSTRUCTIONS if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "CPU_CYCLES / INSTRUCTIONS if has_event(CPU_CYCLES) else 0"
 },
 {
     "BriefDescription": "Problem State Instruction Ratio",
     "MetricName": "prbstate",
-    "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100 if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100 if has_event(PROBLEM_STATE_INSTRUCTIONS) else 0"
 },
 {
     "BriefDescription": "Level One Miss per 100 Instructions",
     "MetricName": "l1mp",
-    "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100 if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100 if has_event(L1I_DIR_WRITES) else 0"
 },
 {
     "BriefDescription": "Percentage sourced from Level 2 cache",
@@ -52,7 +52,7 @@
 {
     "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
     "MetricName": "est_cpi",
-    "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS) if has_event(INSTRUCTIONS) else 0"
+    "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS) if has_event(L1C_TLB2_MISSES) else 0"
 },
 {
     "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",

[hunks @@ -877,7 +877,7 @@ through @@ -968,7 +968,7 @@: the PEBS load-latency events
 MEM_UOPS_RETIRED.LOAD_LATENCY_GT_{4,8,16,32,64,128,256,512} change "Counter" from
 "0,1,2,3,4,5" to "0,1"]

[hunks @@ -32,8 +32,9 @@ and @@ -41,8 +42,9 @@: cpu_atom ARITH.DIV_OCCUPANCY and ARITH.DIV_UOPS
 gain "Deprecated": "1" and their BriefDescription is replaced by "This event is deprecated."]

[hunks @@ -247,7 +247,7 @@ through @@ -331,7 +331,7 @@: the same
 MEM_UOPS_RETIRED.LOAD_LATENCY_GT_{4,8,16,32,64,128,256,512} "Counter" change from
 "0,1,2,3,4,5" to "0,1" in a second event file]

[hunk @@ -9,16 +9,18 @@: ARITH.DIV_OCCUPANCY and ARITH.DIV_UOPS gain "Deprecated": "1" and their
 BriefDescription is replaced by "This event is deprecated."]

[hunks @@ -8,6 +8,16 @@ through @@ -1395,6 +1686,16 @@: new cpu_atom and cpu_lowpower cache and
 memory events are added: DL1.DIRTY_EVICTION, L2_LINES_IN.{E,F,I,M,S}, L2_LINES_OUT.USELESS_HWPF,
 L2_PREFETCHES_THROTTLED.{DPT,DTP,DTP_OVERRIDE,XQ_THRESH}, L2_REQUEST.{ALL,HIT,MISS,REJECTS},
 LLC_PREFETCHES_THROTTLED.{DPT,DTP,DTP_OVERRIDE,HIT_RATE,XQ_THRESH}, LONGEST_LAT_CACHE.{MISS,REFERENCE},
 MEM_BOUND_STALLS_LOAD.SBFULL, MEM_LOAD_UOPS_MISC_RETIRED.LOCAL_DRAM, MEM_LOAD_UOPS_RETIRED.L3_HIT and
 MEM_UOPS_RETIRED.{ALL,MRN_LOADS,MRN_STORES,STLB_MISS,STLB_MISS_LOADS,STLB_MISS_STORES}; in the same
 file MEM_UOPS_RETIRED.LOAD_LATENCY_GT_4 through _GT_2048 change "Counter" from "0,1,2,3,4,5,6,7"
 to "0,1"]

[hunks @@ -1,4 +1,14 @@ through @@ -473,6 +501,51 @@: new cpu_atom floating-point events are added:
 ARITH.FPDIV_ACTIVE, ARITH.FPDIV_OCCUPANCY, ARITH.FPDIV_UOPS and FP_VINT_UOPS_EXECUTED.{ALL,P0,P1,P2,P3}]

[hunks @@ -29,6 +29,42 @@ through @@ -741,5 +795,23 @@: new cpu_atom front-end events are added:
 BACLEARS.{COND,INDIRECT,RETURN,UNCOND}, DECODE_RESTRICTION.PREDECODE_WRONG and
 MS_DECODED.{MS_BUSY,MS_ENTRY,NANO_CODE}]

[hunks @@ -1,4 +1,13 @@ through @@ -155,6 +210,15 @@: new cpu_atom memory events are added:
 LD_HEAD.{ANY,OTHER,PGWALK,ST_ADDR,ST_DATA,WCB_FULL} and MACHINE_CLEARS.MEMORY_ORDERING_FAST]

|
|||
"UMask": "0x8",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of unhalted cycles a Core is blocked due to a lock In Progress issued by another core",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x63",
|
||||
"EventName": "BUS_LOCK.BLOCKED_CYCLES",
|
||||
"PublicDescription": "Counts the number of unhalted cycles a Core is blocked due to a lock In Progress issued by another core. Counts on a per core basis.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of unhalted cycles a Core is blocked due to an Accepted lock it issued, includes both split and non-split lock cycles.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x63",
|
||||
"EventName": "BUS_LOCK.LOCK_CYCLES",
|
||||
"PublicDescription": "Counts the number of unhalted cycles a Core is blocked due to an Accepted lock it issued, includes both split and non-split lock cycles. Counts on a per core basis.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of non-split locks such as UC locks issued by a Core (does not include cache locks)",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x63",
|
||||
"EventName": "BUS_LOCK.NON_SPLIT_LOCKS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of split locks issued by a Core",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x63",
|
||||
"EventName": "BUS_LOCK.SPLIT_LOCKS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles the L2 Prefetchers are at throttle level 0",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x32",
|
||||
"EventName": "DYNAMIC_PREFETCH_THROTTLER.LEVEL0_SOC",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles the L2 Prefetcher throttle level is at 1",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x32",
|
||||
"EventName": "DYNAMIC_PREFETCH_THROTTLER.LEVEL1_SOC",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles the L2 Prefetcher throttle level is at 2",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x32",
|
||||
"EventName": "DYNAMIC_PREFETCH_THROTTLER.LEVEL2_SOC",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles the L2 Prefetcher throttle level is at 3",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x32",
|
||||
"EventName": "DYNAMIC_PREFETCH_THROTTLER.LEVEL3_SOC",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles the L2 Prefetcher throttle level is at 4",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x32",
|
||||
"EventName": "DYNAMIC_PREFETCH_THROTTLER.LEVEL4_SOC",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "This event is deprecated. [This event is alias to MISC_RETIRED.LBR_INSERTS]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -86,5 +169,41 @@
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of prefetch requests that were promoted in the XQ to a demand request.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xf4",
|
||||
"EventName": "XQ_PROMOTION.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x7",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of prefetch requests that were promoted in the XQ to a demand code read.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xf4",
|
||||
"EventName": "XQ_PROMOTION.CRDS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of prefetch requests that were promoted in the XQ to a demand read.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xf4",
|
||||
"EventName": "XQ_PROMOTION.DRDS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of prefetch requests that were promoted in the XQ to a demand RFO.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xf4",
|
||||
"EventName": "XQ_PROMOTION.RFOS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
}
|
||||
]
@@ -30,6 +30,16 @@
"UMask": "0x3",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles when any of the integer dividers are active.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"CounterMask": "1",
|
||||
"EventCode": "0xcd",
|
||||
"EventName": "ARITH.IDIV_ACTIVE",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Cycles when integer divide unit is busy executing divide or square root operations.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -41,6 +51,24 @@
"UMask": "0x8",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of active integer dividers per cycle.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xcd",
|
||||
"EventName": "ARITH.IDIV_OCCUPANCY",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of integer divider uops executed per cycle.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xcd",
|
||||
"EventName": "ARITH.IDIV_UOPS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Number of occurrences where a microcode assist is invoked by hardware.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -117,6 +145,15 @@
"UMask": "0x7e",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of not taken JCC branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc4",
|
||||
"EventName": "BR_INST_RETIRED.COND_NTAKEN",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0x7f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Not taken branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -252,6 +289,15 @@
"UMask": "0xfb",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of near indirect JMP branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc4",
|
||||
"EventName": "BR_INST_RETIRED.INDIRECT_JMP",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0xef",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of near indirect JMP branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -261,6 +307,17 @@
"UMask": "0xef",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "This event is deprecated. Refer to new event BR_INST_RETIRED.INDIRECT_CALL",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Deprecated": "1",
|
||||
"Errata": "ARL011",
|
||||
"EventCode": "0xc4",
|
||||
"EventName": "BR_INST_RETIRED.IND_CALL",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0xfb",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of near CALL branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -318,6 +375,15 @@
"UMask": "0xf7",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of taken branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc4",
|
||||
"EventName": "BR_INST_RETIRED.NEAR_TAKEN",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0xc0",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Taken branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -440,6 +506,15 @@
"UMask": "0x151",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of mispredicted not taken JCC branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc5",
|
||||
"EventName": "BR_MISP_RETIRED.COND_NTAKEN",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0x7f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Mispredicted non-taken conditional branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -613,6 +688,15 @@
"UMask": "0xc0",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of mispredicted near indirect JMP branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc5",
|
||||
"EventName": "BR_MISP_RETIRED.INDIRECT_JMP",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0xef",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of mispredicted near indirect JMP branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -622,6 +706,15 @@
"UMask": "0xef",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of mispredicted near taken branch instructions retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc5",
|
||||
"EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Number of near branch instructions retired that were mispredicted and taken.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -689,6 +782,15 @@
"UMask": "0x48",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the total number of BTCLEARS.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe8",
|
||||
"EventName": "BTCLEAR.ANY",
|
||||
"PublicDescription": "Counts the total number of BTCLEARS which occurs when the Branch Target Buffer (BTB) predicts a taken branch.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Core clocks when the thread is in the C0.1 light-weight slower wakeup time but more power saving optimized state.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1187,6 +1289,15 @@
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on all Integer ports.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0xff",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on a load port.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1197,6 +1308,42 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on integer port 0.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.P0",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on integer port 1.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.P1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on integer port 2.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.P2",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x20",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on integer port 3.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.P3",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x40",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on integer port 0,1, 2, 3.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1327,6 +1474,15 @@
"UMask": "0x4",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of occurrences a retired load was blocked for any of the following reasons: utlb_miss, 4k_alias, unknown_sta/bad_fwd, unready_fwd (includes md blocks and esp consuming load blocks)",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x03",
|
||||
"EventName": "LD_BLOCKS.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of occurrences a retired load gets blocked because its address exactly matches an older store whose data is not ready (a.k.a. unknown). unready_fwd",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1392,6 +1548,25 @@
"UMask": "0x2",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of demand loads that match on a wcb (request buffer) allocated by an L1 hardware prefetch [This event is alias to LOAD_HIT_PREFETCH.HW_PF]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x4c",
|
||||
"EventName": "LOAD_HIT_PREFETCH.HWPF",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "This event is deprecated. [This event is alias to LOAD_HIT_PREFETCH.HWPF]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Deprecated": "1",
|
||||
"EventCode": "0x4c",
|
||||
"EventName": "LOAD_HIT_PREFETCH.HW_PF",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1432,6 +1607,15 @@
"SampleAfterValue": "20003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of machine clears that flush the pipeline and restart the machine without the use of microcode.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc3",
|
||||
"EventName": "MACHINE_CLEARS.ANY_FAST",
|
||||
"SampleAfterValue": "20003",
|
||||
"UMask": "0xff",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Number of machine clears (nukes) of any type.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1462,6 +1646,15 @@
"UMask": "0x8",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of machine clears that flush the pipeline and restart the machine without the use of microcode.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc3",
|
||||
"EventName": "MACHINE_CLEARS.DISAMBIGUATION_FAST",
|
||||
"SampleAfterValue": "20003",
|
||||
"UMask": "0x88",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of nukes due to memory renaming",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1471,6 +1664,15 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of machine clears that flush the pipeline and restart the machine without the use of microcode.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc3",
|
||||
"EventName": "MACHINE_CLEARS.MRN_NUKE_FAST",
|
||||
"SampleAfterValue": "20003",
|
||||
"UMask": "0x90",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of times that the machine clears due to a page fault. Covers both I-Side and D-Side (Loads/Stores) page faults. A page fault occurs when either the page is not present, or an access violation.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1574,6 +1776,15 @@
"UMask": "0x20",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of LBR entries recorded. Requires LBRs to be enabled in IA32_LBR_CTL.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe4",
|
||||
"EventName": "MISC_RETIRED.LBR_INSERTS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "LBR record is inserted",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1593,6 +1804,86 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of CLFLUSH, CLWB, and CLDEMOTE instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe0",
|
||||
"EventName": "MISC_RETIRED1.CL_INST",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0xff",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of LFENCE instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe0",
|
||||
"EventName": "MISC_RETIRED1.LFENCE",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of RDPMC, RDTSC, and RDTSCP instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe0",
|
||||
"EventName": "MISC_RETIRED1.RDPMC_RDTSC_P",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Count the number of WRMSR instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe0",
|
||||
"EventName": "MISC_RETIRED1.WRMSR",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of faults and software interrupts with vector < 32.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe1",
|
||||
"EventName": "MISC_RETIRED2.FAULT_ALL",
|
||||
"PublicDescription": "Counts the number of faults and software interrupts with vector < 32, including VOE cases.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of PSB+ nuke events and ToPA trap events.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe1",
|
||||
"EventName": "MISC_RETIRED2.INTEL_PT_CLEARS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of user interrupts delivered.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe1",
|
||||
"EventName": "MISC_RETIRED2.ULI_DELIVERY",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of SENDUIPI instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe1",
|
||||
"EventName": "MISC_RETIRED2.ULI_SENDUIPI",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x9",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of VM exits.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xe1",
|
||||
"EventName": "MISC_RETIRED2.VM_EXIT",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Cycles when Reservation Station (RS) is empty for the thread.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1643,6 +1934,15 @@
"UMask": "0x4",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number issue slots not consumed due to a color request for an FCW or MXCSR control register when all 4 colors (copies) are already in use",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x75",
|
||||
"EventName": "SERIALIZATION.COLOR_STALLS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of issue slots where no uop could issue due to an IQ scoreboard that stalls allocation until a specified older uop retires or (in the case of jump scoreboard) executes. Commonly executed instructions with IQ scoreboards include LFENCE and MFENCE.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1720,6 +2020,15 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Fixed Counter: Counts the number of issue slots not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
|
||||
"Counter": "Fixed counter 4",
|
||||
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
|
||||
"PublicDescription": "Fixed Counter: Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Counts all issue slots blocked during this recovery window including relevant microcode flows and while uops are not yet available in the IQ. Also, includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x5",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1836,6 +2145,14 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x74",
|
||||
"EventName": "TOPDOWN_BE_BOUND.ALL_NON_ARCH",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1951,6 +2268,14 @@
"UMask": "0x6",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x71",
|
||||
"EventName": "TOPDOWN_FE_BOUND.ALL_NON_ARCH",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -2148,6 +2473,14 @@
"UMask": "0x7",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of consumed retirement slots.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x72",
|
||||
"EventName": "TOPDOWN_RETIRING.ALL_NON_ARCH",
|
||||
"SampleAfterValue": "1000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of consumed retirement slots.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -2367,6 +2700,14 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops retired",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc2",
|
||||
"EventName": "UOPS_RETIRED.ALL",
|
||||
"SampleAfterValue": "2000003",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the total number of uops retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -2414,6 +2755,15 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops retired that were delivered by the loop stream detector (LSD).",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc2",
|
||||
"EventName": "UOPS_RETIRED.LSD",
|
||||
"SampleAfterValue": "2000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS). This includes uops from flows due to complex instructions, faults, assists, and inserted flows.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -8,6 +8,15 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts walks that miss the PDE_CACHE",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x08",
|
||||
"EventName": "DTLB_LOAD_MISSES.PDE_CACHE_MISS",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of first level TLB misses but second level hits due to a demand load that did not start a page walk. Accounts for all page sizes. Will result in a DTLB write from STLB.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -47,6 +56,16 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of page walks completed due to load DTLB misses to any page size.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x08",
|
||||
"EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
|
||||
"PublicDescription": "Counts the number of page walks completed due to loads (including SW prefetches) whose address translations missed in all Translation Lookaside Buffer (TLB) levels and were mapped to any page size. Includes page walks that page fault.",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0xe",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (All page sizes)",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -175,6 +194,15 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts walks that miss the PDE_CACHE",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x49",
|
||||
"EventName": "DTLB_STORE_MISSES.PDE_CACHE_MISS",
|
||||
"SampleAfterValue": "2000003",
|
||||
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of first level TLB misses but second level hits due to stores that did not start a page walk. Accounts for all page sizes. Will result in a DTLB write from STLB.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -215,6 +243,16 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of page walks completed due to store DTLB misses to any page size.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x49",
|
||||
"EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
|
||||
"PublicDescription": "Counts the number of page walks completed due to stores whose address translations missed in all Translation Lookaside Buffer (TLB) levels and were mapped to any page size. Includes page walks that page fault.",
|
||||
"SampleAfterValue": "2000003",
|
||||
"UMask": "0xe",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Store misses in all TLB levels causes a page walk that completes. (All page sizes)",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -244,6 +282,16 @@
"UMask": "0x8",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of page walks completed due to store DTLB misses to a 2M or 4M page.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x49",
|
||||
"EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M",
|
||||
"PublicDescription": "Counts the number of page walks completed due to stores whose address translations missed in all Translation Lookaside Buffer (TLB) levels and were mapped to 2M or 4M pages. Includes page walks that page fault.",
|
||||
"SampleAfterValue": "2000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Page walks completed due to a demand data store to a 2M/4M page.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -324,6 +372,16 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of times there was an ITLB miss and a new translation was filled into the ITLB.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x81",
|
||||
"EventName": "ITLB.FILLS",
|
||||
"PublicDescription": "Counts the number of times the machine was unable to find a translation in the Instruction Translation Lookaside Buffer (ITLB) and a new translation was filled into the ITLB. The event is speculative in nature, but will not count translations (page walks) that are begun and not finished, or translations that are finished but not filled into the ITLB.",
|
||||
"SampleAfterValue": "200003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of page walks initiated by a instruction fetch that missed the first and second level TLBs.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -342,6 +400,15 @@
"UMask": "0x1",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts walks that miss the PDE_CACHE",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x85",
|
||||
"EventName": "ITLB_MISSES.PDE_CACHE_MISS",
|
||||
"SampleAfterValue": "2000003",
|
||||
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of first level TLB misses but second level hits due to an instruction fetch that did not start a page walk. Account for all pages sizes. Will result in an ITLB write from STLB.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -501,6 +568,24 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of occurrences a load gets blocked because of a micro TLB miss",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x03",
|
||||
"EventName": "LD_BLOCKS.DTLB_MISS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles that the head (oldest load) of the load buffer is stalled due to a DTLB miss",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x05",
|
||||
"EventName": "LD_HEAD.DTLB_MISS",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a DTLB miss.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -518,5 +603,33 @@
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x90",
|
||||
"Unit": "cpu_lowpower"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of PMH walks that hit in the L1 or WCBs",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xbc",
|
||||
"EventName": "PAGE_WALKER_LOADS.DTLB_L1_HIT",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of PMH walks that hit in the L2",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xbc",
|
||||
"EventName": "PAGE_WALKER_LOADS.DTLB_L2_HIT",
|
||||
"PublicDescription": "Counts the number of PMH walks that hit in the L2. Includes L2 Hit resulting from and L1D eviction of another core in the same module which is longer latency than a typical L2 hit.",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Count number of any STLB flush attempts (Entire, PCID, InvPage, CR3 write, etc)",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xbd",
|
||||
"EventName": "TLB_FLUSHES.STLB_ANY",
|
||||
"SampleAfterValue": "20003",
|
||||
"UMask": "0x20",
|
||||
"Unit": "cpu_atom"
|
||||
}
|
||||
]
@@ -22,7 +22,7 @@
"Unit": "CHA"
},
{
"BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_cha_tor_inserts.ia_miss",
"BriefDescription": "LLC misses - Uncacheable reads (from cpu). Derived from unc_cha_tor_inserts.ia_miss",
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "LLC_MISSES.UNCACHEABLE",
@@ -316,32 +316,32 @@
"Unit": "iMC"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth read (MB/sec). Derived from unc_m_pmm_rpq_inserts",
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth read (MiB/sec). Derived from unc_m_pmm_rpq_inserts",
|
||||
"Counter": "0,1,2,3",
|
||||
"EventCode": "0xE3",
|
||||
"EventName": "UNC_M_PMM_BANDWIDTH.READ",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "6.103515625E-5MB/sec",
|
||||
"ScaleUnit": "6.103515625E-5MiB/sec",
|
||||
"Unit": "iMC"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth total (MB/sec). Derived from unc_m_pmm_rpq_inserts",
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth total (MiB/sec). Derived from unc_m_pmm_rpq_inserts",
|
||||
"Counter": "0,1,2,3",
|
||||
"EventCode": "0xE3",
|
||||
"EventName": "UNC_M_PMM_BANDWIDTH.TOTAL",
|
||||
"MetricExpr": "UNC_M_PMM_RPQ_INSERTS + UNC_M_PMM_WPQ_INSERTS",
|
||||
"MetricName": "UNC_M_PMM_BANDWIDTH.TOTAL",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "6.103515625E-5MB/sec",
|
||||
"ScaleUnit": "6.103515625E-5MiB/sec",
|
||||
"Unit": "iMC"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth write (MB/sec). Derived from unc_m_pmm_wpq_inserts",
|
||||
"BriefDescription": "Intel Optane DC persistent memory bandwidth write (MiB/sec). Derived from unc_m_pmm_wpq_inserts",
|
||||
"Counter": "0,1,2,3",
|
||||
"EventCode": "0xE7",
|
||||
"EventName": "UNC_M_PMM_BANDWIDTH.WRITE",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "6.103515625E-5MB/sec",
|
||||
"ScaleUnit": "6.103515625E-5MiB/sec",
|
||||
"Unit": "iMC"
|
||||
},
|
||||
{
@@ -488,12 +488,12 @@
"UMask": "0x2"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Retired load instructions which data sources missed L3 but serviced from local dram",
|
||||
"BriefDescription": "Retired load instructions which data sources missed L3 but serviced from dram homed in the local socket",
|
||||
"Counter": "0,1,2,3",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd3",
|
||||
"EventName": "MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM",
|
||||
"PublicDescription": "Retired load instructions which data sources missed L3 but serviced from local DRAM. Available PDIST counters: 0",
|
||||
"PublicDescription": "Retired load instructions which data sources missed L3 but serviced from DRAM homed in the local socket. Available PDIST counters: 0",
|
||||
"RetirementLatencyMax": 4146,
|
||||
"RetirementLatencyMean": 115.83,
|
||||
"RetirementLatencyMin": 0,
@@ -9,6 +9,15 @@
"PublicDescription": "UNC_CHACMS_CLOCKTICKS",
|
||||
"Unit": "CHACMS"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "UNC_CHACMS_DISTRESS_ASSERTED",
|
||||
"Counter": "0,1,2,3",
|
||||
"EventCode": "0x35",
|
||||
"EventName": "UNC_CHACMS_DISTRESS_ASSERTED",
|
||||
"PerPkg": "1",
|
||||
"PortMask": "0x000",
|
||||
"Unit": "CHACMS"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles FAST trigger is received from the global FAST distress wire.",
|
||||
"Counter": "0,1,2,3",
@@ -6050,7 +6050,7 @@
"EventName": "UNC_CHA_SNOOP_RESP.RSPIFWD",
|
||||
"Experimental": "1",
|
||||
"PerPkg": "1",
|
||||
"PublicDescription": "Counts when a a transaction with the opcode type RspIFwd Snoop Response was received which indicates a remote caching agent forwarded the data and the requesting agent is able to acquire the data in E (Exclusive) or M (modified) states. This is commonly returned with RFO (the Read for Ownership issued before a write) transactions. The snoop could have either been to a cacheline in the M,E,F (Modified, Exclusive or Forward) states.",
|
||||
"PublicDescription": "Counts when a transaction with the opcode type RspIFwd Snoop Response was received which indicates a remote caching agent forwarded the data and the requesting agent is able to acquire the data in E (Exclusive) or M (modified) states. This is commonly returned with RFO (the Read for Ownership issued before a write) transactions. The snoop could have either been to a cacheline in the M,E,F (Modified, Exclusive or Forward) states.",
|
||||
"UMask": "0x4",
|
||||
"Unit": "CHA"
|
||||
},
@@ -6072,7 +6072,7 @@
"EventName": "UNC_CHA_SNOOP_RESP.RSPSFWD",
|
||||
"Experimental": "1",
|
||||
"PerPkg": "1",
|
||||
"PublicDescription": "Counts when a a transaction with the opcode type RspSFwd Snoop Response was received which indicates a remote caching agent forwarded the data but held on to its current copy. This is common for data and code reads that hit in a remote socket in E (Exclusive) or F (Forward) state.",
|
||||
"PublicDescription": "Counts when a transaction with the opcode type RspSFwd Snoop Response was received which indicates a remote caching agent forwarded the data but held on to its current copy. This is common for data and code reads that hit in a remote socket in E (Exclusive) or F (Forward) state.",
|
||||
"UMask": "0x8",
|
||||
"Unit": "CHA"
|
||||
},
@@ -243,7 +243,7 @@
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of L2 prefetches initiated by either the L2 Stream or AMP that were throttled due to exceeding the XQ threshold set by either XQ_THRESOLD_DTP or XQ_THRESHOLD. Counts on a per core basis.",
|
||||
"BriefDescription": "Counts the number of L2 prefetches initiated by either the L2 Stream or AMP that were throttled due to exceeding the XQ threshold set by either XQ_THRESHOLD_DTP or XQ_THRESHOLD. Counts on a per core basis.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x28",
|
||||
"EventName": "L2_PREFETCHES_THROTTLED.XQ_THRESH",
@@ -464,7 +464,7 @@
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of LLC prefetches throttled due to exceeding the XQ threshold set by either XQ_THRESOLD_DTP or LLC_XQ_THRESHOLD. Counts on a per core basis.",
|
||||
"BriefDescription": "Counts the number of LLC prefetches throttled due to exceeding the XQ threshold set by either XQ_THRESHOLD_DTP or LLC_XQ_THRESHOLD. Counts on a per core basis.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x29",
|
||||
"EventName": "LLC_PREFETCHES_THROTTLED.XQ_THRESH",
@@ -1089,7 +1089,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_128",
@@ -1101,7 +1101,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_16",
@@ -1113,7 +1113,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_256",
@@ -1125,7 +1125,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_32",
@@ -1137,7 +1137,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_4",
@@ -1149,7 +1149,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_512",
@@ -1161,7 +1161,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_64",
@@ -1173,7 +1173,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_8",
@@ -178,6 +178,7 @@
"EventCode": "0xf4",
|
||||
"EventName": "XQ_PROMOTION.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x7",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
@@ -21,8 +21,9 @@
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of active floating point and integer dividers per cycle.",
|
||||
"BriefDescription": "This event is deprecated.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Deprecated": "1",
|
||||
"EventCode": "0xcd",
|
||||
"EventName": "ARITH.DIV_OCCUPANCY",
|
||||
"SampleAfterValue": "1000003",
@@ -30,8 +31,9 @@
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of floating point and integer divider uops executed per cycle.",
|
||||
"BriefDescription": "This event is deprecated.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Deprecated": "1",
|
||||
"EventCode": "0xcd",
|
||||
"EventName": "ARITH.DIV_UOPS",
|
||||
"SampleAfterValue": "1000003",
@@ -1023,6 +1025,15 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on secondary integer ports 0,1,2,3.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb3",
|
||||
"EventName": "INT_UOPS_EXECUTED.2ND",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x80",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on all Integer ports.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1205,7 +1216,7 @@
"EventCode": "0x03",
|
||||
"EventName": "LD_BLOCKS.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x10",
|
||||
"UMask": "0x1f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
@@ -1613,6 +1624,15 @@
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of issue slots where no uop could issue due to an IQ scoreboard that stalls allocation until a specified older uop retires or (in the case of jump scoreboard) executes. Commonly executed instructions with IQ scoreboards include LFENCE and MFENCE.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x75",
|
||||
"EventName": "SERIALIZATION.IQ_JEU_SCB",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of issue slots not consumed by the backend due to a micro-sequencer (MS) scoreboard, which stalls the front-end from issuing from the UROM until a specified older uop retires.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1,7 +1,7 @@
Family-model,Version,Filename,EventType
GenuineIntel-6-(97|9A|B7|BA|BF),v1.34,alderlake,core
GenuineIntel-6-BE,v1.34,alderlaken,core
GenuineIntel-6-C[56],v1.13,arrowlake,core
GenuineIntel-6-(97|9A|B7|BA|BF),v1.35,alderlake,core
GenuineIntel-6-BE,v1.35,alderlaken,core
GenuineIntel-6-C[56],v1.14,arrowlake,core
GenuineIntel-6-(1C|26|27|35|36),v5,bonnell,core
GenuineIntel-6-(3D|47),v30,broadwell,core
GenuineIntel-6-56,v12,broadwellde,core
@@ -13,24 +13,24 @@ GenuineIntel-6-CF,v1.20,emeraldrapids,core
GenuineIntel-6-5[CF],v13,goldmont,core
|
||||
GenuineIntel-6-7A,v1.01,goldmontplus,core
|
||||
GenuineIntel-6-B6,v1.10,grandridge,core
|
||||
GenuineIntel-6-A[DE],v1.15,graniterapids,core
|
||||
GenuineIntel-6-A[DE],v1.16,graniterapids,core
|
||||
GenuineIntel-6-(3C|45|46),v36,haswell,core
|
||||
GenuineIntel-6-3F,v29,haswellx,core
|
||||
GenuineIntel-6-7[DE],v1.24,icelake,core
|
||||
GenuineIntel-6-6[AC],v1.28,icelakex,core
|
||||
GenuineIntel-6-6[AC],v1.30,icelakex,core
|
||||
GenuineIntel-6-3A,v24,ivybridge,core
|
||||
GenuineIntel-6-3E,v24,ivytown,core
|
||||
GenuineIntel-6-2D,v24,jaketown,core
|
||||
GenuineIntel-6-(57|85),v16,knightslanding,core
|
||||
GenuineIntel-6-BD,v1.18,lunarlake,core
|
||||
GenuineIntel-6-(AA|AC|B5),v1.17,meteorlake,core
|
||||
GenuineIntel-6-BD,v1.19,lunarlake,core
|
||||
GenuineIntel-6-(AA|AC|B5),v1.18,meteorlake,core
|
||||
GenuineIntel-6-1[AEF],v4,nehalemep,core
|
||||
GenuineIntel-6-2E,v4,nehalemex,core
|
||||
GenuineIntel-6-CC,v1.00,pantherlake,core
|
||||
GenuineIntel-6-CC,v1.02,pantherlake,core
|
||||
GenuineIntel-6-A7,v1.04,rocketlake,core
|
||||
GenuineIntel-6-2A,v19,sandybridge,core
|
||||
GenuineIntel-6-8F,v1.35,sapphirerapids,core
|
||||
GenuineIntel-6-AF,v1.12,sierraforest,core
|
||||
GenuineIntel-6-AF,v1.13,sierraforest,core
|
||||
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
|
||||
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v59,skylake,core
|
||||
GenuineIntel-6-55-[01234],v1.37,skylakex,core
@@ -970,7 +970,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_1024",
@@ -982,7 +982,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_128",
@@ -994,7 +994,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_16",
@@ -1006,7 +1006,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_2048",
@@ -1018,7 +1018,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_256",
@@ -1030,7 +1030,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_32",
@@ -1042,7 +1042,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_4",
@@ -1054,7 +1054,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_512",
@@ -1066,7 +1066,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_64",
@@ -1078,7 +1078,7 @@
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Counter": "0,1",
|
||||
"Data_LA": "1",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_8",
@@ -383,6 +383,15 @@
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of unhalted cycles when the core is stalled due to a demand load miss which missed all the caches, a snoop was required, and hits in other core or module on same die. Another core provides the data with a fwd, no fwd, or hitM.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x34",
|
||||
"EventName": "MEM_BOUND_STALLS_LOAD.LLC_MISS_OTHERMOD",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts all retired load instructions.",
|
||||
"Counter": "0,1,2,3",
@@ -727,6 +736,16 @@
"UMask": "0x40",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of load ops retired that hit in the L3 cache in which a snoop was required and modified data was forwarded.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xd4",
|
||||
"EventName": "MEM_LOAD_UOPS_MISC_RETIRED.L3_HIT_SNOOP_HITM",
|
||||
"PublicDescription": "Counts the number of load ops retired that hit in the L3 cache in which a snoop was required and modified data was forwarded. Available PDIST counters: 0,1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of load ops retired that hit the L1 data cache.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -830,6 +849,16 @@
"SampleAfterValue": "100021",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of memory uops retired. A single uop that performs both a load AND a store will be counted as 1, not 2 (e.g. ADD [mem], CONST).",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xd0",
|
||||
"EventName": "MEM_UOPS_RETIRED.ALL",
|
||||
"PublicDescription": "Counts the number of memory uops retired. A single uop that performs both a load AND a store will be counted as 1, not 2 (e.g. ADD [mem], CONST). Available PDIST counters: 0,1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x83",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of load ops retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -1371,5 +1400,14 @@
"SampleAfterValue": "100003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to an icache miss",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x71",
|
||||
"EventName": "TOPDOWN_FE_BOUND.ICACHE",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x20",
|
||||
"Unit": "cpu_atom"
|
||||
}
|
||||
]
@@ -273,6 +273,69 @@
"UMask": "0x3f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on all floating point ports.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.ALL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1f",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer port 0.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.P0",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer port 1.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.P1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer port 2.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.P2",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x8",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer port 3.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.P3",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x10",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer port 0, 1, 2, 3.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.PRIMARY",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1e",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of uops executed on floating point and vector integer store data port.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xb2",
|
||||
"EventName": "FP_VINT_UOPS_EXECUTED.STD",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x1",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of floating point operations retired that required microcode assist.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -282,5 +345,15 @@
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of floating point divide uops retired (x87 and sse, including x87 sqrt).",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc2",
|
||||
"EventName": "UOPS_RETIRED.FPDIV",
|
||||
"PublicDescription": "Counts the number of floating point divide uops retired (x87 and sse, including x87 sqrt). Available PDIST counters: 0,1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x40",
|
||||
"Unit": "cpu_atom"
|
||||
}
|
||||
]
@@ -8,6 +8,15 @@
"UMask": "0xf4",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of cycles that the head (oldest load) of the load buffer is stalled due to request buffers full or lock in progress.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0x05",
|
||||
"EventName": "LD_HEAD.WCB_FULL",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x2",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of machine clears due to memory ordering caused by a snoop from an external agent. Does not count internally generated machine clears such as those due to memory disambiguation.",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
@@ -329,6 +329,17 @@
"SampleAfterValue": "400009",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "This event is deprecated. [This event is alias to BR_MISP_RETIRED.NEAR_INDIRECT]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"Deprecated": "1",
|
||||
"EventCode": "0xc5",
|
||||
"EventName": "BR_MISP_RETIRED.ALL_NEAR_IND",
|
||||
"PublicDescription": "This event is deprecated. [This event is alias to BR_MISP_RETIRED.NEAR_INDIRECT] Available PDIST counters: 0,1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x50",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Mispredicted conditional branch instructions retired.",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -570,6 +581,16 @@
"UMask": "0x8040",
|
||||
"Unit": "cpu_core"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired. [This event is alias to BR_MISP_RETIRED.ALL_NEAR_IND]",
|
||||
"Counter": "0,1,2,3,4,5,6,7",
|
||||
"EventCode": "0xc5",
|
||||
"EventName": "BR_MISP_RETIRED.NEAR_INDIRECT",
|
||||
"PublicDescription": "Counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired. [This event is alias to BR_MISP_RETIRED.ALL_NEAR_IND] Available PDIST counters: 0,1",
|
||||
"SampleAfterValue": "1000003",
|
||||
"UMask": "0x50",
|
||||
"Unit": "cpu_atom"
|
||||
},
|
||||
{
|
||||
"BriefDescription": "Miss-predicted near indirect branch instructions retired (excluding returns) [This event is alias to BR_MISP_RETIRED.INDIRECT]",
|
||||
"Counter": "0,1,2,3,4,5,6,7,8,9",
|
||||
|
|
@@ -1126,6 +1147,70 @@
"UMask": "0x10",
"Unit": "cpu_core"
},
{
"BriefDescription": "Counts the number of uops executed on secondary integer ports 0,1,2,3.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.2ND",
"SampleAfterValue": "1000003",
"UMask": "0x80",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on all Integer ports.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.ALL",
"SampleAfterValue": "1000003",
"UMask": "0xff",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on a load port.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.LD",
"PublicDescription": "Counts the number of uops executed on a load port. This event counts for integer uops even if the destination is FP/vector",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on integer port 0.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.P0",
"SampleAfterValue": "1000003",
"UMask": "0x8",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on integer port 1.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.P1",
"SampleAfterValue": "1000003",
"UMask": "0x10",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on integer port 2.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.P2",
"SampleAfterValue": "1000003",
"UMask": "0x20",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on integer port 3.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.P3",
"SampleAfterValue": "1000003",
"UMask": "0x40",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on integer port 0,1, 2, 3.",
"Counter": "0,1,2,3,4,5,6,7",
@@ -1135,6 +1220,25 @@
"UMask": "0x78",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on a Store address port.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.STA",
"PublicDescription": "Counts the number of uops executed on a Store address port. This event counts integer uops even if the data source is FP/vector",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops executed on an integer store data and jump port.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xb3",
"EventName": "INT_UOPS_EXECUTED.STD_JMP",
"SampleAfterValue": "1000003",
"UMask": "0x4",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Number of vector integer instructions retired of 128-bit vector-width.",
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1236,7 +1340,7 @@
"EventName": "LD_BLOCKS.ALL",
"PublicDescription": "Counts the number of retired loads that are blocked for any of the following reasons: DTLB miss, address alias, store forward or data unknown (includes memory disambiguation blocks and ESP consuming load blocks). Available PDIST counters: 0,1",
"SampleAfterValue": "1000003",
-"UMask": "0x10",
+"UMask": "0x1f",
"Unit": "cpu_atom"
},
{
@@ -1360,6 +1464,15 @@
"UMask": "0x20",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of machine clears due to program modifying data (self modifying code) within 1K of a recently fetched code page.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc3",
"EventName": "MACHINE_CLEARS.SMC",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Self-modifying code (SMC) detected.",
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1507,6 +1620,25 @@
"UMask": "0x4",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots where no uop could issue due to an IQ scoreboard that stalls allocation until a specified older uop retires or (in the case of jump scoreboard) executes. Commonly executed instructions with IQ scoreboards include LFENCE and MFENCE.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x75",
"EventName": "SERIALIZATION.IQ_JEU_SCB",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots not consumed by the backend due to a micro-sequencer (MS) scoreboard, which stalls the front-end from issuing from the UROM until a specified older uop retires.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x75",
"EventName": "SERIALIZATION.NON_C01_MS_SCB",
"PublicDescription": "Counts the number of issue slots not consumed by the backend due to a micro-sequencer (MS) scoreboard, which stalls the front-end from issuing from the UROM until a specified older uop retires. The most commonly executed instruction with an MS scoreboard is PAUSE.",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.",
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1582,6 +1714,42 @@
"SampleAfterValue": "1000003",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to Fast Nukes such as Memory Ordering Machine clears and MRN nukes",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.FASTNUKE",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to a branch mispredict that resulted in LSD exit.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.LSD_MISPREDICT",
"SampleAfterValue": "1000003",
"UMask": "0x8",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to Branch Mispredict",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.MISPREDICT",
"SampleAfterValue": "1000003",
"UMask": "0x4",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to a machine clear (nuke).",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.NUKE",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls. [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"Counter": "0,1,2,3,4,5,6,7",
@@ -1591,6 +1759,15 @@
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to due to certain allocation restrictions.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.ALLOC_RESTRICTIONS",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls. [This event is alias to TOPDOWN_BE_BOUND.ALL]",
"Counter": "0,1,2,3,4,5,6,7",
@@ -1600,6 +1777,33 @@
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to LSD entry.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.LSD",
"SampleAfterValue": "1000003",
"UMask": "0x80",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to memory reservation stall (scheduler not being able to accept another uop). This could be caused by RSV full or load/store buffer block.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.MEM_SCHEDULER",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to iq/jeu scoreboards or ms scb",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.SERIALIZATION",
"SampleAfterValue": "1000003",
"UMask": "0x10",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Fixed Counter: Counts the number of retirement slots not consumed due to front end stalls.",
"Counter": "Fixed counter 5",
@@ -1617,6 +1821,78 @@
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to BAClear",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.BRANCH_DETECT",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to BTClear",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.BRANCH_RESTEER",
"SampleAfterValue": "1000003",
"UMask": "0x40",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to ms",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.CISC",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to decode stall",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.DECODE",
"SampleAfterValue": "1000003",
"UMask": "0x8",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to latency related stalls including BACLEARs, BTCLEARs, ITLB misses, and ICache misses.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.FRONTEND_LATENCY",
"SampleAfterValue": "1000003",
"UMask": "0x72",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to itlb miss",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.ITLB_MISS",
"SampleAfterValue": "1000003",
"UMask": "0x10",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend that do not categorize into any other common frontend stall",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.OTHER",
"SampleAfterValue": "1000003",
"UMask": "0x80",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to predecode wrong",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.PREDECODE",
"SampleAfterValue": "1000003",
"UMask": "0x4",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Fixed Counter: Counts the number of consumed retirement slots.",
"Counter": "Fixed counter 6",
@@ -1841,6 +2117,25 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
{
"BriefDescription": "Counts the number of integer divide uops retired.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.IDIV",
"PublicDescription": "Counts the number of integer divide uops retired. Available PDIST counters: 0,1",
"SampleAfterValue": "1000003",
"UMask": "0x80",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Counts the number of uops that are from the complex flows issued by the micro-sequencer (MS). This includes uops from flows due to complex instructions, faults, assists, and inserted flows.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.MS",
"SampleAfterValue": "1000003",
"UMask": "0x4",
"Unit": "cpu_atom"
},
{
"BriefDescription": "UOPS_RETIRED.MS",
"Counter": "0,1,2,3,4,5,6,7,8,9",
@@ -1887,5 +2182,13 @@
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_core"
},
{
"BriefDescription": "Counts the number of x87 uops retired, includes those in ms flows.",
"Counter": "0,1,2,3,4,5,6,7",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.X87",
"SampleAfterValue": "1000003",
"Unit": "cpu_atom"
}
]
@@ -327,7 +327,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_1024",
@@ -338,7 +338,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_128",
@@ -349,7 +349,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_16",
@@ -360,7 +360,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_2048",
@@ -371,7 +371,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_256",
@@ -382,7 +382,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_32",
@@ -393,7 +393,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_4",
@@ -404,7 +404,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_512",
@@ -415,7 +415,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_64",
@@ -426,7 +426,7 @@
},
{
"BriefDescription": "Counts the number of tagged load uops retired that exceed the latency threshold defined in MEC_CR_PEBS_LD_LAT_THRESHOLD - Only counts with PEBS enabled.",
-"Counter": "0,1,2,3,4,5,6,7",
+"Counter": "0,1",
"Data_LA": "1",
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.LOAD_LATENCY_GT_8",
@@ -9,6 +9,15 @@
"PublicDescription": "UNC_CHACMS_CLOCKTICKS",
"Unit": "CHACMS"
},
{
"BriefDescription": "UNC_CHACMS_DISTRESS_ASSERTED",
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHACMS_DISTRESS_ASSERTED",
"PerPkg": "1",
"PortMask": "0x000",
"Unit": "CHACMS"
},
{
"BriefDescription": "Counts the number of cycles FAST trigger is received from the global FAST distress wire.",
"Counter": "0,1,2,3",
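Once perf is rebuilt against the updated JSON, the newly added events are usable by name. A minimal sketch, assuming a hybrid machine that exposes the cpu_atom PMU (event names are taken from the additions above; availability depends on the exact CPU model):
$ perf stat -e cpu_atom/UOPS_RETIRED.FPDIV/,cpu_atom/INT_UOPS_EXECUTED.ALL/ -a -- sleep 1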