From 8c8c383c04f6cbcda38e38b2430cb245da4d7e5a Mon Sep 17 00:00:00 2001 From: Johannes Weiner Date: Sat, 30 Nov 2019 17:50:09 -0800 Subject: mm: memcontrol: try harder to set a new memory.high Setting a memory.high limit below the usage makes almost no effort to shrink the cgroup to the new target size. While memory.high is a "soft" limit that isn't supposed to cause OOM situations, we should still try harder to meet a user request through persistent reclaim. For example, after setting a 10M memory.high on an 800M cgroup full of file cache, the usage shrinks to about 350M: + cat /cgroup/workingset/memory.current 841568256 + echo 10M + cat /cgroup/workingset/memory.current 355729408 This isn't exactly what the user would expect to happen. Setting the value a few more times eventually whittles the usage down to what we are asking for: + echo 10M + cat /cgroup/workingset/memory.current 104181760 + echo 10M + cat /cgroup/workingset/memory.current 31801344 + echo 10M + cat /cgroup/workingset/memory.current 10440704 To improve this, add reclaim retry loops to the memory.high write() callback, similar to what we do for memory.max, to make a reasonable effort that the usage meets the requested size after the call returns. Afterwards, a single write() to memory.high is enough in all but extreme cases: + cat /cgroup/workingset/memory.current 841609216 + echo 10M + cat /cgroup/workingset/memory.current 10182656 790M is not a reasonable reclaim target to ask of a single reclaim invocation. And it wouldn't be reasonable to optimize the reclaim code for it. So asking for the full size but retrying is not a bad choice here: we express our intent, and benefit if reclaim becomes better at handling larger requests, but we also acknowledge that some of the deltas we can encounter in memory_high_write() are just too ridiculously big for a single reclaim invocation to manage. Link: http://lkml.kernel.org/r/20191022201518.341216-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Vladimir Davydov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/memcontrol.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) (limited to 'mm/memcontrol.c') diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 2bd6d470c5f1..94a5b6d831f9 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6091,7 +6091,8 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of)); - unsigned long nr_pages; + unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES; + bool drained = false; unsigned long high; int err; @@ -6102,12 +6103,29 @@ static ssize_t memory_high_write(struct kernfs_open_file *of, memcg->high = high; - nr_pages = page_counter_read(&memcg->memory); - if (nr_pages > high) - try_to_free_mem_cgroup_pages(memcg, nr_pages - high, - GFP_KERNEL, true); + for (;;) { + unsigned long nr_pages = page_counter_read(&memcg->memory); + unsigned long reclaimed; + + if (nr_pages <= high) + break; + + if (signal_pending(current)) + break; + + if (!drained) { + drain_all_stock(memcg); + drained = true; + continue; + } + + reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high, + GFP_KERNEL, true); + + if (!reclaimed && !nr_retries--) + break; + } - memcg_wb_domain_size_changed(memcg); return nbytes; } -- cgit v1.2.3