L3CAT not giving the expected results on the caterpillar benchmark #289

Open
apinas opened this issue Feb 14, 2025 · 2 comments

apinas commented Feb 14, 2025

Hi!
I was trying to reproduce the Intel ECI caterpillar results by running the benchmark with and without L3 cache allocation on an i7-13700E machine.
I am using kernel 6.12.11 with PREEMPT_RT.

Basically, I have some isolated cores on which I run the application, while a non-isolated core generates some "noise" using memcpy:

[apinas@zodd ~]$ cat /sys/devices/system/cpu/isolated
4-7

Using the pqos tool, I partition the L3 cache so that the isolated cores get half of the cache and the non-isolated cores get the other half. To do this, I run:

export RDT_PROBE_MSR=1
pqos -R
pqos -e "llc:0=0x03f;llc:3=0xfc0"
pqos -a "llc:0=0-3,8-15;llc:3=4-7"
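As a sanity check on the two masks, the sketch below derives contiguous way masks from a start bit and a way count, then confirms that they do not overlap and that together they cover all 12 ways. The helper name `way_mask` is hypothetical and not part of pqos; the 12-way figure is the one pqos reports for this L3 cache.

```sh
# way_mask START COUNT: print a contiguous capacity bitmask of COUNT ways
# starting at bit START (hypothetical helper, not part of pqos).
way_mask() {
    printf '0x%03x\n' $(( ((1 << $2) - 1) << $1 ))
}

low=$(way_mask 0 6)    # ways 0-5, the mask given to COS0
high=$(way_mask 6 6)   # ways 6-11, the mask given to COS3
echo "COS0 mask: $low"
echo "COS3 mask: $high"

# The masks must not share ways, and together should cover all 12 ways.
echo "overlap: $(printf '0x%x' $(( $low & $high )))"
echo "union:   $(printf '0x%03x' $(( $low | $high )))"
```

This prints 0x03f and 0xfc0 for the two masks, an overlap of 0x0 and a union of 0xfff, so the split itself is a clean half-and-half partition.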

The output is

sudo -E pqos -R
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
WARN: CPUID.0x7.0: Monitoring capability not supported!
WARN: Cache allocation not supported on model name '13th Gen Intel(R) Core(TM) i7-13700E'!
ERROR: RDMSR failed for reg[0xca0] on lcore 0
WARN: Cache allocation not supported on model name '13th Gen Intel(R) Core(TM) i7-13700E'!
ERROR: RDMSR failed for reg[0xca0] on lcore 0
Allocation reset successful

sudo -E pqos -e "llc:0=0x03f;llc:3=0xfc0"
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
WARN: CPUID.0x7.0: Monitoring capability not supported!
WARN: Cache allocation not supported on model name '13th Gen Intel(R) Core(TM) i7-13700E'!
ERROR: RDMSR failed for reg[0xca0] on lcore 0
SOCKET 0 L3CA COS0 => MASK 0x3f
SOCKET 0 L3CA COS3 => MASK 0xfc0
Allocation configuration altered.

[apinas@zodd intel]$ sudo -E pqos -a "llc:0=0-3,8-15;llc:3=4-7"
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
WARN: CPUID.0x7.0: Monitoring capability not supported!
WARN: Cache allocation not supported on model name '13th Gen Intel(R) Core(TM) i7-13700E'!
ERROR: RDMSR failed for reg[0xca0] on lcore 0
Allocation configuration altered.

I can tell that the configuration has been applied:

sudo -E pqos -V -s
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
INFO: Requested interface: AUTO
INFO: resctrl not detected. Kernel version 4.10 or higher required
INFO: Selected interface: MSR
INFO: CACHE: type 1, level 1, max id sharing this cache 2 (1 bits)
DEBUG: CACHE: not inclusive, direct mapped, 8 way(s), 64 set(s), line size 64, 1 partition(s)
INFO: CACHE: type 2, level 1, max id sharing this cache 2 (1 bits)
DEBUG: CACHE: not inclusive, direct mapped, 8 way(s), 128 set(s), line size 64, 1 partition(s)
INFO: CACHE: type 3, level 2, max id sharing this cache 8 (3 bits)
DEBUG: CACHE: not inclusive, direct mapped, 16 way(s), 4096 set(s), line size 64, 1 partition(s)
INFO: CACHE: type 3, level 3, max id sharing this cache 128 (7 bits)
DEBUG: CACHE: not inclusive, complex cache indexing, 12 way(s), 40960 set(s), line size 64, 1 partition(s)
DEBUG: Detected core 0, socket 0, NUMAnode 0, L2 ID 0, L3 ID 0, APICID 0
DEBUG: Detected core 1, socket 0, NUMAnode 0, L2 ID 1, L3 ID 0, APICID 8
DEBUG: Detected core 2, socket 0, NUMAnode 0, L2 ID 2, L3 ID 0, APICID 16
DEBUG: Detected core 3, socket 0, NUMAnode 0, L2 ID 3, L3 ID 0, APICID 24
DEBUG: Detected core 4, socket 0, NUMAnode 0, L2 ID 4, L3 ID 0, APICID 32
DEBUG: Detected core 5, socket 0, NUMAnode 0, L2 ID 5, L3 ID 0, APICID 40
DEBUG: Detected core 6, socket 0, NUMAnode 0, L2 ID 6, L3 ID 0, APICID 48
DEBUG: Detected core 7, socket 0, NUMAnode 0, L2 ID 7, L3 ID 0, APICID 56
DEBUG: Detected core 8, socket 0, NUMAnode 0, L2 ID 8, L3 ID 0, APICID 64
DEBUG: Detected core 9, socket 0, NUMAnode 0, L2 ID 8, L3 ID 0, APICID 66
DEBUG: Detected core 10, socket 0, NUMAnode 0, L2 ID 8, L3 ID 0, APICID 68
DEBUG: Detected core 11, socket 0, NUMAnode 0, L2 ID 8, L3 ID 0, APICID 70
DEBUG: Detected core 12, socket 0, NUMAnode 0, L2 ID 9, L3 ID 0, APICID 72
DEBUG: Detected core 13, socket 0, NUMAnode 0, L2 ID 9, L3 ID 0, APICID 74
DEBUG: Detected core 14, socket 0, NUMAnode 0, L2 ID 9, L3 ID 0, APICID 76
DEBUG: Detected core 15, socket 0, NUMAnode 0, L2 ID 9, L3 ID 0, APICID 78
WARN: CPUID.0x7.0: Monitoring capability not supported!
INFO: Monitoring capability not detected
INFO: CPUID.0x7.0: L3 CAT not detected. Checking brand string...
DEBUG: CPU brand string '13th Gen Intel(R) Core(TM) i7-13700E'
WARN: Cache allocation not supported on model name '13th Gen Intel(R) Core(TM) i7-13700E'!
INFO: Checking model and family ID...
INFO: Probing msr....
ERROR: RDMSR failed for reg[0xca0] on lcore 0
INFO: L3CA capability detected
INFO: L3 CAT details: CDP support=0, CDP on=0, #COS=16, #ways=12, ways contention bit-mask 0x0
INFO: L3 CAT details: cache size 31457280 bytes, way size 2621440 bytes
INFO: L3 CAT details: I/O RDT support=0, I/O RDT on=0
INFO: CPUID.0x7.0: L2 CAT not supported
INFO: L2CA capability not detected
INFO: CPUID.0x7.0: MBA not supported
INFO: MBA capability not detected
DEBUG: allocation init OK
DEBUG: monitoring init aborted: feature not present
DEBUG: I/O RDT init aborted: feature not present
L3CA COS definitions for Socket 0:
L3CA COS0 => MASK 0x3f
L3CA COS1 => MASK 0xfff
L3CA COS2 => MASK 0xfff
L3CA COS3 => MASK 0xfc0
L3CA COS4 => MASK 0xfff
L3CA COS5 => MASK 0xfff
L3CA COS6 => MASK 0xfff
L3CA COS7 => MASK 0xfff
L3CA COS8 => MASK 0xfff
L3CA COS9 => MASK 0xfff
L3CA COS10 => MASK 0xfff
L3CA COS11 => MASK 0xfff
L3CA COS12 => MASK 0xfff
L3CA COS13 => MASK 0xfff
L3CA COS14 => MASK 0xfff
L3CA COS15 => MASK 0xfff
Core information for socket 0:
Core 0, L2ID 0, L3ID 0 => COS0
Core 1, L2ID 1, L3ID 0 => COS0
Core 2, L2ID 2, L3ID 0 => COS0
Core 3, L2ID 3, L3ID 0 => COS0
Core 4, L2ID 4, L3ID 0 => COS3
Core 5, L2ID 5, L3ID 0 => COS3
Core 6, L2ID 6, L3ID 0 => COS3
Core 7, L2ID 7, L3ID 0 => COS3
Core 8, L2ID 8, L3ID 0 => COS0
Core 9, L2ID 8, L3ID 0 => COS0
Core 10, L2ID 8, L3ID 0 => COS0
Core 11, L2ID 8, L3ID 0 => COS0
Core 12, L2ID 9, L3ID 0 => COS0
Core 13, L2ID 9, L3ID 0 => COS0
Core 14, L2ID 9, L3ID 0 => COS0
Core 15, L2ID 9, L3ID 0 => COS0
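The L3 figures in the debug output are internally consistent, which can be checked with a few lines of shell arithmetic. The numbers are copied from the log above; the variable names are only illustrative.

```sh
# Values taken from the "pqos -V -s" output above.
ways=12
sets=40960
line_size=64

cache_bytes=$(( ways * sets * line_size ))
way_bytes=$(( cache_bytes / ways ))
echo "cache size: $cache_bytes bytes"
echo "way size:   $way_bytes bytes"

# Each of the two masks (0x03f and 0xfc0) selects 6 of the 12 ways.
echo "per-COS share: $(( 6 * way_bytes / 1024 / 1024 )) MiB"
```

The results match the reported cache size (31457280 bytes) and way size (2621440 bytes), and show that each class of service is nominally given 15 MiB.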

Then I run the caterpillar test as:

/opt/benchmarking/caterpillar/start-benchmark.py --irq_core 0 --nn_core 3 --caterpillar_args '-c 6 -l results.txt -s 5000'

After running for several hours, both with and without CAT enabled, I got the following results:

[Image: caterpillar results with and without CAT enabled]

There does not seem to be much of a difference between the CAT-enabled and the non-CAT-enabled runs (in fact, the CAT version reports more jitter), contrary to what happens in the results shown on the Intel ECI site.

I hope you can help me understand what is happening and how I could get better results!

As a side note, I assume the pqos utility is using the MSR interface because, despite having enabled CONFIG_X86_CPU_RESCTRL=y in the kernel and set the kernel argument rdt=l3cat, I am unable to mount the resctrl filesystem:

mount -t resctrl resctrl /sys/fs/resctrl
mount: /sys/fs/resctrl: mount point does not exist.
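To narrow down this mount failure, a minimal sketch like the following checks whether the running kernel has registered the resctrl filesystem and whether the mount point directory exists. The function name `resctrl_status` is hypothetical; the default paths are the standard Linux ones. As far as I understand, the kernel only creates /sys/fs/resctrl when resctrl initializes successfully, so a missing directory would suggest the kernel did not detect any RDT resource, but treat that as an assumption to verify against the kernel documentation.

```sh
# resctrl_status [FILESYSTEMS_FILE] [MOUNT_POINT]
# Report whether the kernel registered the resctrl filesystem and whether
# the mount point directory exists. Hypothetical helper; the defaults are
# the standard Linux paths.
resctrl_status() {
    fs_file=${1:-/proc/filesystems}
    mnt=${2:-/sys/fs/resctrl}

    if grep -q 'resctrl' "$fs_file" 2>/dev/null; then
        echo "filesystem: registered"
    else
        echo "filesystem: not registered"
    fi

    if [ -d "$mnt" ]; then
        echo "mount point: present"
    else
        echo "mount point: missing"
    fi
}

resctrl_status
```

The two path arguments exist only so the check can be pointed at other files for testing; on the machine itself, calling it without arguments is enough.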

@rkanagar
Contributor

Hi @apinas

Please refer to the resctrl kernel documentation:
https://docs.kernel.org/arch/x86/resctrl.html

mount -t resctrl resctrl [-o cdp[,cdpl2][,mba_MBps][,debug]] /sys/fs/resctrl

I can see that the allocation and monitoring features are not supported on your machine.

Please check them using the commands below:
pqos --iface=msr -d
pqos --iface=msr -D

cpuid -1 -l 0x10
cpuid -1 -l 0x10 -s 2
cpuid -1 -l 0x10 -s 1
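When reading the cpuid output, the sketch below may help decode the architectural RDT enumeration bits. It assumes the bit positions as I understand them from the Intel SDM: CPUID leaf 0x7 subleaf 0, EBX bit 12 (monitoring) and bit 15 (allocation), and CPUID leaf 0x10 subleaf 0, EBX bits 1 to 3 (L3 CAT, L2 CAT, MBA). The function names and the sample register values are hypothetical; substitute the EBX values that cpuid prints on the machine.

```sh
# bit_set VALUE BIT: succeed if bit BIT is set in VALUE.
bit_set() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# decode_rdt LEAF7_EBX LEAF10_EBX: print which RDT features are enumerated.
# Bit positions are assumptions taken from the Intel SDM, see the text above.
decode_rdt() {
    bit_set "$1" 12 && echo "RDT monitoring (leaf 0x7 EBX bit 12)"
    bit_set "$1" 15 && echo "RDT allocation (leaf 0x7 EBX bit 15)"
    bit_set "$2" 1 && echo "L3 CAT (leaf 0x10 EBX bit 1)"
    bit_set "$2" 2 && echo "L2 CAT (leaf 0x10 EBX bit 2)"
    bit_set "$2" 3 && echo "MBA (leaf 0x10 EBX bit 3)"
    return 0
}

# Made-up sample values, not read from the i7-13700E:
decode_rdt 0x00008000 0x00000002
```

If none of these bits are set, the feature is not architecturally enumerated, which is consistent with pqos falling back to the brand string and MSR probing in the log.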

Thanks,
Raghavan K.

apinas commented Feb 14, 2025

Hi @rkanagar,
by running the commands you provided, I can see that cache allocation is reported as unsupported because the processor does not support RDT. However, according to the Intel® Resource Director Technology (Intel® RDT) Architecture Specification, Table 7-9, the Intel Core i7-13700E Processor supports L3 cache allocation.
On the previously mentioned Intel ECI caterpillar results page, the benchmark was run on an Intel Core i7-1185GRE Processor. That processor does not support RDT either but, like the i7-13700E, it implements a "non-architectural version" of L3 cache allocation, so I think I should be able to obtain the same results.
