From 67df77845c181166d4bc324cbb0382f7e81c7631 Mon Sep 17 00:00:00 2001
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Date: Mon, 17 Aug 2020 11:22:57 +0530
Subject: [PATCH] powerpc/numa: Restrict possible nodes based on platform

References: bsc#1209999 ltc#202140 bsc#1142685 ltc#179509 FATE#327775 git-fixes
Patch-mainline: v5.10-rc1
Git-commit: 67df77845c181166d4bc324cbb0382f7e81c7631

As per draft LoPAPR (Revision 2.9_pre7), section B.5.3 "Run Time
Abstraction Services (RTAS) Node" available at:
  https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200611.pdf

... there are 2 device tree properties:

  "ibm,max-associativity-domains"
    which defines the maximum number of domains that the firmware, i.e.
    PowerVM, can support.

and:

  "ibm,current-associativity-domains"
    which defines the maximum number of domains that the current
    platform can support.

The value of "ibm,max-associativity-domains" is always greater than or
equal to that of the "ibm,current-associativity-domains" property. If the
latter property is not available, use "ibm,max-associativity-domains" as a
fallback. In this yet to be released LoPAPR, "ibm,current-associativity-domains"
is mentioned on page 833 / B.5.3, which is covered under the
"Appendix B. System Binding" section.

Currently powerpc uses the "ibm,max-associativity-domains" property
while setting the possible number of nodes. This is currently set at
32. However, the possible number of nodes for a platform may be
significantly less. Hence set the possible number of nodes based on
the "ibm,current-associativity-domains" property.

Nathan Lynch had raised a valid concern that post LPM (Live Partition
Migration), a user could DLPAR add processors and memory
with "new" associativity properties:
  https://lore.kernel.org/linuxppc-dev/871rljfet9.fsf@linux.ibm.com/t/#u

He also pointed out that "ibm,max-associativity-domains" has the same
contents on all currently available PowerVM systems, unlike
"ibm,current-associativity-domains", and hence may be better able to
handle the new NUMA associativity properties.

However, with the recent commit dbce45628085 ("powerpc/numa: Limit
possible nodes to within num_possible_nodes"), all new NUMA
associativity properties are capped to the initially set nr_node_ids.
Hence this commit should be safe with any new DLPAR add post LPM.

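As a rough illustration of the capping described above (a userspace
sketch, not the kernel code): a node id derived from a newly added
resource is only accepted if it is below the limit fixed at boot. The
names cap_nid and NO_NODE are hypothetical, and mapping out-of-range ids
to a "no node" sentinel is an assumption about how the cap behaves:

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no usable node". */
#define NO_NODE (-1)

/*
 * Accept a node id only if it is below the limit fixed at boot (the
 * role nr_node_ids plays in the text above). Assumption: ids at or
 * beyond the limit are rejected rather than wrapped or clamped.
 */
static int cap_nid(uint32_t nid, uint32_t nr_ids)
{
	if (nid >= nr_ids)
		return NO_NODE;
	return (int)nid;
}
```

With a limit of 2, ids 0 and 1 pass through unchanged and any larger id
is rejected, so a resource added later cannot enlarge the set of nodes
chosen at boot.
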
  $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains
  /proc/device-tree/rtas/ibm,current-associativity-domains
                   00000005 00000001 00000002 00000002 00000002 00000010
  /proc/device-tree/rtas/ibm,max-associativity-domains
                   00000005 00000001 00000008 00000020 00000020 00000100

  $ cat /sys/devices/system/node/possible ##Before patch
  0-31

  $ cat /sys/devices/system/node/possible ##After patch
  0-1

Note that the maximum number of nodes this platform can support is only 2,
but before the patch the possible nodes were set to 32.

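The lookup with fallback can be sketched in plain userspace C as below.
This is only an illustration under stated assumptions: read_cell and
pick_numnodes are hypothetical helpers, the properties are taken to be
already converted to host-endian 32-bit cells, and a NULL array models a
missing property.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 0 and store cells[index] in *out if the property exists and
 * is long enough, -1 otherwise. NULL models a missing property.
 */
static int read_cell(const uint32_t *cells, size_t ncells, size_t index,
		     uint32_t *out)
{
	if (!cells || index >= ncells)
		return -1;
	*out = cells[index];
	return 0;
}

/* Prefer the "current" property; fall back to "max" if it is absent. */
static int pick_numnodes(const uint32_t *cur, size_t ncur,
			 const uint32_t *max, size_t nmax,
			 size_t depth, uint32_t *numnodes)
{
	if (read_cell(cur, ncur, depth, numnodes) == 0)
		return 0;
	return read_cell(max, nmax, depth, numnodes);
}
```

Feeding in the lsprop values shown above with an index of 4 (an assumed
value; the kernel uses min_common_depth, which is not stated here)
selects 2 from the current property and 0x20 = 32 from the max
property, which matches the 0-1 and 0-31 ranges.
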
This is important because a lot of kernel and user space code allocates
structures for all possible nodes, leading to a lot of memory that is
allocated but not used.

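To make the scaling concrete, here is a small sketch of what a
per-possible-node allocation costs. The structure and its size are
purely hypothetical; only the proportionality to the number of possible
nodes is the point.

```c
#include <stddef.h>

/* Hypothetical per-node bookkeeping structure; the size is illustrative. */
struct per_node_stats {
	unsigned long counters[64];
};

/* Bytes needed when one structure is allocated for every possible node. */
static size_t per_node_bytes(unsigned int possible_nodes)
{
	return (size_t)possible_nodes * sizeof(struct per_node_stats);
}
```

With 32 possible nodes instead of 2, every such allocation is 16 times
larger, even though only 2 nodes can ever be populated on this platform.
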
I ran a simple experiment to create and destroy 100 memory cgroups on
boot on an 8-node machine (Power8 Alpine).

Before patch:
  free -k at boot
               total        used        free      shared  buff/cache   available
  Mem:     523498176     4106816   518820608       22272      570752   516606720
  Swap:      4194240           0     4194240

  free -k after creating 100 memory cgroups
               total        used        free      shared  buff/cache   available
  Mem:     523498176     4628416   518246464       22336      623296   516058688
  Swap:      4194240           0     4194240

  free -k after destroying 100 memory cgroups
               total        used        free      shared  buff/cache   available
  Mem:     523498176     4697408   518173760       22400      627008   515987904
  Swap:      4194240           0     4194240

After patch:
  free -k at boot
               total        used        free      shared  buff/cache   available
  Mem:     523498176     3969472   518933888       22272      594816   516731776
  Swap:      4194240           0     4194240

  free -k after creating 100 memory cgroups
               total        used        free      shared  buff/cache   available
  Mem:     523498176     4181888   518676096       22208      640192   516496448
  Swap:      4194240           0     4194240

  free -k after destroying 100 memory cgroups
               total        used        free      shared  buff/cache   available
  Mem:     523498176     4232320   518619904       22272      645952   516443264
  Swap:      4194240           0     4194240

Observations:
  Fixed kernel takes 137344 KiB (4106816-3969472) less to boot.
  Fixed kernel takes 309184 KiB (4628416-4181888-137344) less to create 100 memcgs.

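The two figures in the observations can be re-derived from the "used"
column of the free -k output. The sketch below only repeats that
arithmetic; the constant and function names are made up for the
example.

```c
/* "used" column of free -k, in KiB, copied from the tables above. */
enum {
	BEFORE_BOOT   = 4106816,
	BEFORE_CREATE = 4628416,
	AFTER_BOOT    = 3969472,
	AFTER_CREATE  = 4181888
};

/* Memory saved at boot by the fixed kernel. */
static long boot_saving(void)
{
	return (long)BEFORE_BOOT - (long)AFTER_BOOT;
}

/* Saving attributable to creating 100 memcgs, excluding the boot saving. */
static long create_saving(void)
{
	return (long)BEFORE_CREATE - (long)AFTER_CREATE - boot_saving();
}
```

boot_saving() evaluates to 137344 and create_saving() to 309184, the
same values quoted in the observations.
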
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
[mpe: Reformat change log a bit for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200817055257.110873-1-srikar@linux.vnet.ibm.com
Acked-by: Michal Suchanek <msuchanek@suse.de>
---
 arch/powerpc/mm/numa.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 1f61fa2148b5..5ddc83ba20f4 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -900,10 +900,19 @@ static void __init find_possible_nodes(void)
 	if (!rtas)
 		return;
 
-	if (of_property_read_u32_index(rtas,
-				"ibm,max-associativity-domains",
+	if (of_property_read_u32_index(rtas, "ibm,current-associativity-domains",
+				min_common_depth, &numnodes)) {
+		/*
+		 * ibm,current-associativity-domains is a fairly recent
+		 * property. If it doesn't exist, then fallback on
+		 * ibm,max-associativity-domains. Current denotes what the
+		 * platform can support compared to max which denotes what the
+		 * Hypervisor can support.
+		 */
+		if (of_property_read_u32_index(rtas, "ibm,max-associativity-domains",
 				min_common_depth, &numnodes))
-		goto out;
+			goto out;
+	}
 
 	for (i = 0; i < numnodes; i++) {
 		if (!node_possible(i))
--
2.40.0
