This repository has been archived on 2026-03-14. You can view files and clone it, but cannot push or open issues or pull requests.
ubports_kernel_google_msm/include/linux
KAMEZAWA Hiroyuki 89c06bd52f memcg: use new logic for page stat accounting
Currently, per-memcg page statistics are recorded in per-page_cgroup flags
by duplicating the page's status into those flags.  The reason is that
memcg can move a page from one group to another, and there is a race
between "move" and "page stat accounting".

Under current logic, assume CPU-A and CPU-B.  CPU-A does "move" and CPU-B
does "page stat accounting".

When CPU-A goes first:

            CPU-A                           CPU-B
                                    update "struct page" info.
    move_lock_mem_cgroup(memcg)
    see pc->flags
    copy page stat to new group
    overwrite pc->mem_cgroup.
    move_unlock_mem_cgroup(memcg)
                                    move_lock_mem_cgroup(memcg)
                                    set pc->flags
                                    update page stat accounting
                                    move_unlock_mem_cgroup(memcg)

Stat accounting is guarded by move_lock_mem_cgroup(), and the "move" logic
(CPU-A) doesn't see changes in "struct page" information.

But it's costly to have the same information both in 'struct page' and
'struct page_cgroup'.  And, there is a potential problem.

For example, assume we have PG_dirty accounting in memcg.
PG_xxx is a flag for struct page.
PCG_xxx is a flag for struct page_cgroup.
(This is just an example.  The same problem can be found in any
 kind of page stat accounting.)

         CPU-A                          CPU-B
    TestSet PG_dirty
    (delay)                        TestClear PG_dirty
                                   if (TestClear(PCG_dirty))
                                        memcg->nr_dirty--
    if (TestSet(PCG_dirty))
        memcg->nr_dirty++

Here, memcg->nr_dirty ends up at +1 even though the page is no longer
dirty, which is wrong.  This race was reported by Greg Thelen
<gthelen@google.com>.  Currently only FILE_MAPPED is supported, and
fortunately it is serialized by the page table lock, so this is not a real
bug _now_.
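
The interleaving above can be replayed step by step in a small userspace
model.  This is only a sketch: struct model_page and the two helpers are
hypothetical names invented for illustration, not kernel APIs, and the
helpers are assumed to report whether the call changed the flag, which is
how the diagram uses TestSet/TestClear.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
    bool pg_dirty;   /* stands in for PG_dirty in struct page */
    bool pcg_dirty;  /* stands in for PCG_dirty in struct page_cgroup */
    long nr_dirty;   /* stands in for memcg->nr_dirty */
};

/* Set *flag and report whether this call changed it from clear to set. */
static bool set_flag_changed(bool *flag)
{
    bool changed = !*flag;

    *flag = true;
    return changed;
}

/* Clear *flag and report whether this call changed it from set to clear. */
static bool clear_flag_changed(bool *flag)
{
    bool changed = *flag;

    *flag = false;
    return changed;
}

/* Replay the interleaving from the diagram, one step at a time. */
static struct model_page replay_duplicated_flag_race(void)
{
    struct model_page p = { false, false, 0 };

    (void)set_flag_changed(&p.pg_dirty);     /* CPU-A: TestSet PG_dirty, then delayed */
    (void)clear_flag_changed(&p.pg_dirty);   /* CPU-B: TestClear PG_dirty */
    if (clear_flag_changed(&p.pcg_dirty))    /* CPU-B: PCG_dirty was never set, */
        p.nr_dirty--;                        /* so no decrement happens */
    if (set_flag_changed(&p.pcg_dirty))      /* CPU-A resumes and sets PCG_dirty */
        p.nr_dirty++;                        /* counter goes to +1 */
    return p;
}
```

After the replay the page flag is clear, but the duplicated flag is set and
the counter still reports one dirty page.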

If this potential problem is caused by having duplicated information in
struct page and struct page_cgroup, we may be able to fix it by using the
original 'struct page' information.  But then we'll have a problem in "move
account".

Assume we use only PG_dirty.

         CPU-A                   CPU-B
    TestSet PG_dirty
    (delay)                    move_lock_mem_cgroup()
                               if (PageDirty(page))
                                      new_memcg->nr_dirty++
                               pc->mem_cgroup = new_memcg;
                               move_unlock_mem_cgroup()
    move_lock_mem_cgroup()
    memcg = pc->mem_cgroup
    new_memcg->nr_dirty++

Accounting information may be double-counted.  This was the original reason
to have PCG_xxx flags, but it seems PCG_xxx has another problem.
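
The double count can be reproduced with a similar single-threaded replay.
Again this is a hedged sketch with hypothetical names (model_group,
model_page, replay_move_race); it follows the diagram literally, so the
move step only increments the new group.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_group {
    long nr_dirty;              /* stands in for memcg->nr_dirty */
};

struct model_page {
    bool pg_dirty;              /* stands in for PG_dirty */
    struct model_group *owner;  /* stands in for pc->mem_cgroup */
};

/* Replay the diagram: CPU-A is delayed between setting the flag and accounting. */
static void replay_move_race(struct model_group *old_group,
                             struct model_group *new_group)
{
    struct model_page page = { false, old_group };
    struct model_group *memcg;

    page.pg_dirty = true;        /* CPU-A: TestSet PG_dirty, then delayed */

    /* CPU-B: the whole move runs under the move lock. */
    if (page.pg_dirty)
        new_group->nr_dirty++;   /* the move already counts the dirty page */
    page.owner = new_group;

    /* CPU-A resumes and accounts the same transition again. */
    memcg = page.owner;          /* this is new_group by now */
    memcg->nr_dirty++;
}
```

Starting from two zeroed groups, the new group ends with nr_dirty == 2 for
a single dirty page.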

I think we need a bigger lock, as follows:

     move_lock_mem_cgroup(page)
     TestSetPageDirty(page)
     update page stats (without any checks)
     move_unlock_mem_cgroup(page)

This fixes both problems, and we don't have to duplicate the page flag into
page_cgroup.  Please note: move_lock_mem_cgroup() is held only when there
is a possibility of "account move" in the system.  So, in most paths, the
status update will proceed without atomic locks.
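
To illustrate why the bigger lock helps, the sketch below treats each
function body as one critical section held under the move lock, so the
flag change and the counter update can never be separated.  All names are
hypothetical, and the decrement of the old group inside locked_move() is an
assumption of this model rather than something stated in the text above.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_group {
    long nr_dirty;
};

struct model_page {
    bool pg_dirty;
    struct model_group *owner;   /* stands in for pc->mem_cgroup */
};

/* Each function body models one section run between move lock and unlock. */
static void locked_set_dirty(struct model_page *page)
{
    if (!page->pg_dirty) {          /* models TestSetPageDirty(page) */
        page->pg_dirty = true;
        page->owner->nr_dirty++;    /* no PCG_xxx flag is consulted */
    }
}

static void locked_clear_dirty(struct model_page *page)
{
    if (page->pg_dirty) {
        page->pg_dirty = false;
        page->owner->nr_dirty--;
    }
}

static void locked_move(struct model_page *page, struct model_group *to)
{
    if (page->pg_dirty) {
        page->owner->nr_dirty--;    /* assumption: the move uncharges the old group */
        to->nr_dirty++;
    }
    page->owner = to;
}

/* The owner's counter must mirror the page flag; the other group must hold 0. */
static bool stats_consistent(const struct model_page *page,
                             const struct model_group *a,
                             const struct model_group *b)
{
    const struct model_group *other = (page->owner == a) ? b : a;

    return page->owner->nr_dirty == (page->pg_dirty ? 1 : 0) &&
           other->nr_dirty == 0;
}
```

Whatever order the three sections are serialized in, stats_consistent()
holds after each one, because no other CPU can observe the flag without the
matching counter update.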

This patch introduces mem_cgroup_begin_update_page_stat() and
mem_cgroup_end_update_page_stat().  Both should be called when modifying
'struct page' information that memcg accounts for, as follows:

     mem_cgroup_begin_update_page_stat()
     modify page information
     mem_cgroup_update_page_stat()
     => never check any 'struct page' info, just update counters.
     mem_cgroup_end_update_page_stat().
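
The calling convention can be sketched as a single-threaded userspace
model.  The model_* names, the move_in_progress field, and the locked
output parameter are hypothetical and chosen for illustration only; they
are not the kernel's signatures, and the model omits the real
synchronization.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_group {
    long nr_file_mapped;
    bool move_in_progress;   /* assumption: true while an account move may run */
    bool move_lock_held;     /* stands in for the move lock */
    int lock_acquisitions;   /* counts how often the lock was really taken */
};

struct model_page {
    bool mapped;
    struct model_group *owner;
};

/* Take the move lock only if an account move is possible right now. */
static void model_begin_update_page_stat(struct model_page *page, bool *locked)
{
    struct model_group *g = page->owner;

    *locked = false;
    if (!g->move_in_progress)
        return;              /* common path: no lock taken */
    g->move_lock_held = true;
    g->lock_acquisitions++;
    *locked = true;
}

/* Update the counter only; no 'struct page' information is checked here. */
static void model_update_page_stat(struct model_page *page, long delta)
{
    page->owner->nr_file_mapped += delta;
}

/* Drop the move lock if begin took it. */
static void model_end_update_page_stat(struct model_page *page, bool locked)
{
    if (locked)
        page->owner->move_lock_held = false;
}

/* A caller brackets the page state change and the stat update. */
static void model_map_page(struct model_page *page)
{
    bool locked;

    model_begin_update_page_stat(page, &locked);
    if (!page->mapped) {                 /* modify page information */
        page->mapped = true;
        model_update_page_stat(page, 1);
    }
    model_end_update_page_stat(page, locked);
}
```

With move_in_progress clear, model_map_page() updates the counter without
touching the lock; with it set, the lock is taken once and released before
returning.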

This patch is slow because we need to call begin_update_page_stat()/
end_update_page_stat() regardless of whether the accounted stat will change
or not.  A following patch adds an easy optimization and reduces the cost.

[akpm@linux-foundation.org: s/lock/locked/]
[hughd@google.com: fix deadlock by avoiding stat lock when anon]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Greg Thelen <gthelen@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-21 17:55:01 -07:00