From: Kurt Garloff <garloff@suse.de>
Subject: Fix calculation of unmapped page cache size
References: FATE309111
Patch-mainline: Never, SUSE specific

Remarks from sle11sp3->sle12 porting by mhocko@suse.cz:
- nr_swap_pages is atomic now
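
For illustration only (not part of the patch): assuming the sle11sp3
version read the plain counter directly, the ported calculation of the
used swap pages becomes the line visible in the hunk below:

	/* sle11sp3: swap_pages = total_swap_pages - nr_swap_pages; */
	unsigned long swap_pages = total_swap_pages
				 - atomic_long_read(&nr_swap_pages);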

Original changelog as per 11sp3:
--------------------------------
Unfortunately, the assumption that NR_FILE_PAGES - NR_FILE_MAPPED
is easily freeable was wrong -- this could lead to us repeatedly
calling shrink_page_cache() from add_to_page_cache() without
making much progress and thus slowing down the system needlessly
(bringing it down to a crawl in the worst case).

Calculating the unmapped page cache pages is not obvious, unfortunately.
There are two upper limits:
* It can't be larger than the overall pagecache size minus the maximum
  of the mapped and shmem page counts. (Those two overlap, unfortunately,
  so we can't simply subtract their sum ...)
* It can't be larger than the inactive plus active FILE LRU lists.

So we take the smaller of those two and divide by two to approximate
the number we're looking for.
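
For illustration, a minimal sketch of the estimate described above; the
function and parameter names are hypothetical stand-ins for the vmstat
counters, not the kernel API:

static unsigned long estimate_unmapped_pagecache(unsigned long file_pages,
						 unsigned long file_mapped,
						 unsigned long shmem,
						 unsigned long active_file,
						 unsigned long inactive_file)
{
	/* Upper limit 1: page cache minus the larger of mapped and shmem */
	unsigned long larger = file_mapped > shmem ? file_mapped : shmem;
	unsigned long by_cache = file_pages - larger;
	/* Upper limit 2: everything on the file LRU lists */
	unsigned long by_lru = active_file + inactive_file;
	unsigned long upper = by_cache < by_lru ? by_cache : by_lru;

	/* Halve the smaller upper limit to approximate what is freeable */
	return upper / 2;
}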

Signed-off-by: Kurt Garloff <garloff@suse.de>

---
 mm/page_alloc.c |   33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7796,14 +7796,43 @@ void zone_pcp_reset(struct zone *zone)
  */
 unsigned long pagecache_over_limit()
 {
-	/* We only want to limit unmapped page cache pages */
+	/* We only want to limit unmapped and non-shmem page cache pages;
+	 * normally all shmem pages are mapped as well, but that does
+	 * not seem to be guaranteed. (Maybe this was just an oprofile
+	 * bug?).
+	 * (FIXME: Do we need to subtract NR_FILE_DIRTY here as well?) */
 	unsigned long pgcache_pages = global_page_state(NR_FILE_PAGES)
-				    - global_page_state(NR_FILE_MAPPED);
+				    - max_t(unsigned long,
+					    global_page_state(NR_FILE_MAPPED),
+					    global_page_state(NR_SHMEM));
+	/* We certainly can't free more than what's on the LRU lists
+	 * minus the dirty ones. (FIXME: pages accounted for in NR_WRITEBACK
+	 * are not on the LRU lists any more, right?) */
+	unsigned long pgcache_lru_pages = global_page_state(NR_ACTIVE_FILE)
+				        + global_page_state(NR_INACTIVE_FILE)
+					- global_page_state(NR_FILE_DIRTY);
 	unsigned long free_pages = global_page_state(NR_FREE_PAGES);
+	unsigned long swap_pages = total_swap_pages - atomic_long_read(&nr_swap_pages);
 	unsigned long limit;
 
+	/* Paranoia */
+	if (unlikely(pgcache_lru_pages > LONG_MAX))
+		return 0;
+	/* We give a bonus for free pages above 6% of total (minus half swap used) */
+	free_pages -= totalram_pages/16;
+	if (likely(swap_pages <= LONG_MAX))
+		free_pages -= swap_pages/2;
+	if (free_pages > LONG_MAX)
+		free_pages = 0;
+
+	/* Limit it to ~94% of the LRU (not everything there may be unmapped) */
+	pgcache_lru_pages -= pgcache_lru_pages/16;
+	pgcache_pages = min_t(unsigned long, pgcache_pages, pgcache_lru_pages);
+
+	/* Effective limit is corrected by effective free pages */
 	limit = vm_pagecache_limit_mb * ((1024*1024UL)/PAGE_SIZE) +
 		FREE_TO_PAGECACHE_RATIO * free_pages;
+
 	if (pgcache_pages > limit)
 		return pgcache_pages - limit;
 	return 0;