|
Borislav Petkov |
b231f2 |
From: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Date: Tue, 25 Jul 2017 14:14:21 -0700
Subject: x86/intel_rdt/cqm: Documentation for resctrl based RDT Monitoring
Git-commit: 1640ae9471ae41eb18d2b214f1f40af3c4ed3828
Patch-mainline: v4.14-rc1
References: fate#323965

Add a description of resctrl based RDT(resource director technology)
monitoring extension and its usage.

[Tony: Added descriptions for how monitoring and allocation are measured
and some cleanups]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-3-git-send-email-vikas.shivappa@linux.intel.com

Acked-by: Borislav Petkov <bp@suse.de>
---
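As a quick illustration of the monitoring layout documented in this patch,
a minimal shell sketch that sums llc_occupancy over all L3 domains of one
group might look like the following. The helper name rdt_group_occupancy
and its argument are hypothetical; the sketch only assumes the
mon_data/mon_L3_XX/llc_occupancy files described in the patch.

```shell
# Sum llc_occupancy (in bytes) over every L3 domain of one resctrl group.
# $1: group directory, e.g. /sys/fs/resctrl/p1 (example path, an assumption)
rdt_group_occupancy() {
    group_dir=$1
    total=0
    for f in "$group_dir"/mon_data/mon_L3_*/llc_occupancy; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        read -r bytes < "$f" || continue
        total=$((total + bytes))
    done
    echo "$total"
}
```

Called as "rdt_group_occupancy /sys/fs/resctrl/p1", it prints one total for
the group; a group without monitoring files yields 0.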
 Documentation/x86/intel_rdt_ui.txt | 316 ++++++++++++++++++++++++++++++++-----
 1 file changed, 278 insertions(+), 38 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index c491a1b82de2..76f21e2ac176 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -6,8 +6,8 @@ Fenghua Yu <fenghua.yu@intel.com>
 Tony Luck <tony.luck@intel.com>
 Vikas Shivappa <vikas.shivappa@intel.com>

-This feature is enabled by the CONFIG_INTEL_RDT_A Kconfig and the
-X86 /proc/cpuinfo flag bits "rdt", "cat_l3" and "cdp_l3".
+This feature is enabled by the CONFIG_INTEL_RDT Kconfig and the
+X86 /proc/cpuinfo flag bits "rdt", "cqm", "cat_l3" and "cdp_l3".

 To use the feature mount the file system:

@@ -17,6 +17,13 @@ mount options are:

 "cdp": Enable code/data prioritization in L3 cache allocations.

+RDT features are orthogonal. A particular system may support only
+monitoring, only control, or both monitoring and control.
+
+The mount succeeds if either allocation or monitoring is present, but
+only those files and directories supported by the system will be created.
+For more details on the behavior of the interface during monitoring
+and allocation, see the "Resource alloc and monitor groups" section.

 Info directory
 --------------
@@ -24,7 +31,12 @@ Info directory
 The 'info' directory contains information about the enabled
 resources. Each resource has its own subdirectory. The subdirectory
 names reflect the resource names.
-Cache resource(L3/L2) subdirectory contains the following files:
+
+Each subdirectory contains the following files with respect to
+allocation:
+
+Cache resource(L3/L2) subdirectory contains the following files
+related to allocation:

 "num_closids": The number of CLOSIDs which are valid for this
 resource. The kernel uses the smallest number of
@@ -36,7 +48,8 @@ Cache resource(L3/L2) subdirectory contains the following files:
 "min_cbm_bits": The minimum number of consecutive bits which
 must be set when writing a mask.

-Memory bandwitdh(MB) subdirectory contains the following files:
+Memory bandwidth(MB) subdirectory contains the following files
+with respect to allocation:

 "min_bandwidth": The minimum memory bandwidth percentage which
 user can request.
@@ -52,48 +65,152 @@ Memory bandwitdh(MB) subdirectory contains the following files:
 non-linear. This field is purely informational
 only.

-Resource groups
----------------
+If RDT monitoring is available there will be an "L3_MON" directory
+with the following files:
+
+"num_rmids": The number of RMIDs available. This is the
+ upper bound for how many "CTRL_MON" + "MON"
+ groups can be created.
+
+"mon_features": Lists the monitoring events if
+ monitoring is enabled for the resource.
+
+"max_threshold_occupancy":
+ Read/write file provides the largest value (in
+ bytes) at which a previously used LLC_occupancy
+ counter can be considered for re-use.
+
+
+Resource alloc and monitor groups
+---------------------------------
+
 Resource groups are represented as directories in the resctrl file
-system. The default group is the root directory. Other groups may be
-created as desired by the system administrator using the "mkdir(1)"
-command, and removed using "rmdir(1)".
+system. The default group is the root directory which, immediately
+after mounting, owns all the tasks and cpus in the system and can make
+full use of all resources.
+
+On a system with RDT control features additional directories can be
+created in the root directory that specify different amounts of each
+resource (see "schemata" below). The root and these additional top level
+directories are referred to as "CTRL_MON" groups below.
+
+On a system with RDT monitoring the root directory and other top level
+directories contain a directory named "mon_groups" in which additional
+directories can be created to monitor subsets of tasks in the CTRL_MON
+group that is their ancestor. These are called "MON" groups in the rest
+of this document.
+
+Removing a directory will move all tasks and cpus owned by the group it
+represents to the parent. Removing one of the created CTRL_MON groups
+will automatically remove all MON groups below it.
+
+All groups contain the following files:
+
+"tasks":
+ Reading this file shows the list of all tasks that belong to
+ this group. Writing a task id to the file will add a task to the
+ group. If the group is a CTRL_MON group the task is removed from
+ whichever previous CTRL_MON group owned the task and also from
+ any MON group that owned the task. If the group is a MON group,
+ then the task must already belong to the CTRL_MON parent of this
+ group. The task is removed from any previous MON group.
+
+
+"cpus":
+ Reading this file shows a bitmask of the logical CPUs owned by
+ this group. Writing a mask to this file will add and remove
+ CPUs to/from this group. As with the tasks file a hierarchy is
+ maintained where MON groups may only include CPUs owned by the
+ parent CTRL_MON group.
+
+
+"cpus_list":
+ Just like "cpus", only using ranges of CPUs instead of bitmasks.

-There are three files associated with each group:

-"tasks": A list of tasks that belongs to this group. Tasks can be
- added to a group by writing the task ID to the "tasks" file
- (which will automatically remove them from the previous
- group to which they belonged). New tasks created by fork(2)
- and clone(2) are added to the same group as their parent.
- If a pid is not in any sub partition, it is in root partition
- (i.e. default partition).
+When control is enabled all CTRL_MON groups will also contain:

-"cpus": A bitmask of logical CPUs assigned to this group. Writing
- a new mask can add/remove CPUs from this group. Added CPUs
- are removed from their previous group. Removed ones are
- given to the default (root) group. You cannot remove CPUs
- from the default group.
+"schemata":
+ A list of all the resources available to this group.
+ Each resource has its own line and format - see below for details.

-"cpus_list": One or more CPU ranges of logical CPUs assigned to this
- group. Same rules apply like for the "cpus" file.
+When monitoring is enabled all MON groups will also contain:

-"schemata": A list of all the resources available to this group.
- Each resource has its own line and format - see below for
- details.
+"mon_data":
+ This contains a set of files organized by L3 domain and by
+ RDT event. E.g. on a system with two L3 domains there will
+ be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
+ directories has one file per event (e.g. "llc_occupancy",
+ "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
+ files provide a read out of the current value of the event for
+ all tasks in the group. In CTRL_MON groups these files provide
+ the sum for all tasks in the CTRL_MON group and all tasks in
+ MON groups. See the examples section for more details on usage.

-When a task is running the following rules define which resources
-are available to it:
+Resource allocation rules
+-------------------------
+When a task is running the following rules define which resources are
+available to it:

 1) If the task is a member of a non-default group, then the schemata
-for that group is used.
+ for that group is used.

 2) Else if the task belongs to the default group, but is running on a
-CPU that is assigned to some specific group, then the schemata for
-the CPU's group is used.
+ CPU that is assigned to some specific group, then the schemata for the
+ CPU's group is used.

 3) Otherwise the schemata for the default group is used.

+Resource monitoring rules
+-------------------------
+1) If a task is a member of a MON group, or non-default CTRL_MON group
+ then RDT events for the task will be reported in that group.
+
+2) If a task is a member of the default CTRL_MON group, but is running
+ on a CPU that is assigned to some specific group, then the RDT events
+ for the task will be reported in that group.
+
+3) Otherwise RDT events for the task will be reported in the root level
+ "mon_data" group.
+
+
+Notes on cache occupancy monitoring and control
+-----------------------------------------------
+When moving a task from one group to another you should remember that
+this only affects *new* cache allocations by the task. E.g. you may have
+a task in a monitor group showing 3 MB of cache occupancy. If you move
+it to a new group and immediately check the occupancy of the old and new
+groups you will likely see that the old group is still showing 3 MB and
+the new group zero. When the task accesses locations still in cache from
+before the move, the h/w does not update any counters. On a busy system
+you will likely see the occupancy in the old group go down as cache lines
+are evicted and re-used while the occupancy in the new group rises as
+the task accesses memory and loads into the cache are counted based on
+membership in the new group.
+
+The same applies to cache allocation control. Moving a task to a group
+with a smaller cache partition will not evict any cache lines. The
+process may continue to use them from the old partition.
+
+Hardware uses a CLOSID (Class of Service ID) and an RMID (Resource Monitoring ID)
+to identify a control group and a monitoring group respectively. Each
+resource group is mapped to these IDs based on the kind of group. The
+number of CLOSIDs and RMIDs is limited by the hardware, so creation of
+a "CTRL_MON" directory may fail if we run out of either CLOSIDs or RMIDs,
+and creation of a "MON" group may fail if we run out of RMIDs.
+
+max_threshold_occupancy - generic concepts
+------------------------------------------
+
+Note that an RMID, once freed, may not be immediately available for use, as
+cache lines of its previous user are still tagged with that RMID.
+Hence such RMIDs are placed on a limbo list and rechecked until their cache
+occupancy has gone down. If at some point the system has many limbo RMIDs
+that are not yet ready to be reused, the user may see -EBUSY
+during mkdir.
+
+max_threshold_occupancy is a user configurable value to determine the
+occupancy at which an RMID can be freed.

 Schemata files - general concepts
 ---------------------------------
@@ -143,22 +260,22 @@ SKUs. Using a high bandwidth and a low bandwidth setting on two threads
 sharing a core will result in both threads being throttled to use the
 low bandwidth.

-L3 details (code and data prioritization disabled)
---------------------------------------------------
+L3 schemata file details (code and data prioritization disabled)
+----------------------------------------------------------------
 With CDP disabled the L3 schemata format is:

 L3:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

-L3 details (CDP enabled via mount option to resctrl)
-----------------------------------------------------
+L3 schemata file details (CDP enabled via mount option to resctrl)
+------------------------------------------------------------------
 When CDP is enabled L3 control is split into two separate resources
 so you can specify independent masks for code and data like this:

 L3data:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
 L3code:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

-L2 details
-----------
+L2 schemata file details
+------------------------
 L2 cache does not support code and data prioritization, so the
 schemata format is always:

@@ -185,6 +302,8 @@ L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
 L3CODE:0=fffff;1=fffff;2=fffff;3=fffff

+Examples for RDT allocation usage:
+
 Example 1
 ---------
 On a two socket machine (one L3 cache per socket) with just four bits
@@ -410,3 +529,124 @@ void main(void)
 /* code to read and write directory contents */
 resctrl_release_lock(fd);
 }
+
+Examples for RDT Monitoring along with allocation usage:
+
+Reading monitored data
+----------------------
+Reading an event file (e.g. mon_data/mon_L3_00/llc_occupancy)
+shows the current snapshot of LLC occupancy of the corresponding MON
+group or CTRL_MON group.
+
+
+Example 1 (Monitor CTRL_MON group and subset of tasks in CTRL_MON group)
+---------
+On a two socket machine (one L3 cache per socket) with just four bits
+for cache bit masks:
+
+# mount -t resctrl resctrl /sys/fs/resctrl
+# cd /sys/fs/resctrl
+# mkdir p0 p1
+# echo "L3:0=3;1=c" > /sys/fs/resctrl/p0/schemata
+# echo "L3:0=3;1=3" > /sys/fs/resctrl/p1/schemata
+# echo 5678 > p1/tasks
+# echo 5679 > p1/tasks
+
+The default resource group is unmodified, so we have access to all parts
+of all caches (its schemata file reads "L3:0=f;1=f").
+
+Tasks that are under the control of group "p0" may only allocate from the
+"lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
+Tasks in group "p1" use the "lower" 50% of cache on both sockets.
+
+Create monitor groups and assign a subset of tasks to each monitor group.
+
+# cd /sys/fs/resctrl/p1/mon_groups
+# mkdir m11 m12
+# echo 5678 > m11/tasks
+# echo 5679 > m12/tasks
+
+Fetch data (data shown in bytes)
+
+# cat m11/mon_data/mon_L3_00/llc_occupancy
+16234000
+# cat m11/mon_data/mon_L3_01/llc_occupancy
+14789000
+# cat m12/mon_data/mon_L3_00/llc_occupancy
+16789000
+
+The parent CTRL_MON group shows the aggregated data.
+
+# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
+31234000
+
+Example 2 (Monitor a task from its creation)
+---------
+On a two socket machine (one L3 cache per socket)
+
+# mount -t resctrl resctrl /sys/fs/resctrl
+# cd /sys/fs/resctrl
+# mkdir p0 p1
+
+An RMID is allocated to the group once it is created and hence the <cmd>
+below is monitored from its creation.
+
+# echo $$ > /sys/fs/resctrl/p1/tasks
+# <cmd>
+
+Fetch the data
+
+# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
+31789000
+
+Example 3 (Monitor without CAT support or before creating CAT groups)
+---------
+
+Assume a system like HSW that has only CQM and no CAT support. In this case
+resctrl will still mount, but CTRL_MON directories cannot be created.
+The user can still create MON groups within the root group and is thereby
+able to monitor all tasks, including kernel threads.
+
+This can also be used to profile jobs' cache size footprint before
+allocating them to different allocation groups.
+
+# mount -t resctrl resctrl /sys/fs/resctrl
+# cd /sys/fs/resctrl
+# mkdir mon_groups/m01
+# mkdir mon_groups/m02
+
+# echo 3478 > /sys/fs/resctrl/mon_groups/m01/tasks
+# echo 2467 > /sys/fs/resctrl/mon_groups/m02/tasks
+
+Monitor the groups separately and also get per-domain data. From the
+output below it is apparent that the tasks are mostly doing work on
+domain (socket) 0.
+
+# cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_00/llc_occupancy
+31234000
+# cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_01/llc_occupancy
+34555
+# cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_00/llc_occupancy
+31234000
+# cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_01/llc_occupancy
+32789
+
+
+Example 4 (Monitor real time tasks)
+-----------------------------------
+
+A single socket system which has real time tasks running on cores 4-7
+and non real time tasks on other cpus. We want to monitor the cache
+occupancy of the real time threads on these cores.
+
+# mount -t resctrl resctrl /sys/fs/resctrl
+# cd /sys/fs/resctrl
+# mkdir p1
+
+Move the cpus 4-7 over to p1
+# echo f0 > p1/cpus
+
+View the llc occupancy snapshot
+
+# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
+11234000
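
The value f0 written to the "cpus" file in Example 4 is the hexadecimal
bitmask with bits 4-7 set, one bit per logical CPU. A minimal sketch of
that conversion is shown below. The helper name cpu_range_to_mask is
hypothetical, and because it relies on plain shell arithmetic it assumes
CPU numbers below 62.

```shell
# Print the hex bitmask covering logical CPUs first..last (inclusive).
# Hypothetical helper for illustration; assumes last < 62 so the
# result fits in the shell's signed 64-bit arithmetic.
cpu_range_to_mask() {
    first=$1
    last=$2
    width=$((last - first + 1))
    # Build 'width' one-bits, then shift them up to start at bit 'first'.
    printf '%x\n' $(( ((1 << width) - 1) << first ))
}
```

For CPUs 4-7, "cpu_range_to_mask 4 7" prints f0. Since the patch describes
"cpus_list" as taking CPU ranges instead of bitmasks, writing the range to
that file should be an equivalent way to avoid the conversion.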