Balloon steals memory from virtual machines

This article describes my experience with memory ballooning on a virtual machine.

I came across an Elasticsearch server (an ESXi guest with 32 GB of memory) whose main process kept dying from OOM. Even worse, after the OOM event only about 10 GB of memory was left free, and Elasticsearch could not restart because its JVM is configured with -Xms16g in /etc/elasticsearch/jvm.options. So I needed to address the OutOfMemory error.
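
To confirm the heap settings, you can check the JVM options directly (a quick look; the grep pattern below assumes the usual -Xms/-Xmx lines in that file):

grep -E '^-Xm[sx]' /etc/elasticsearch/jvm.options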

To understand what triggered the OOM, we can use dmesg or simply check /var/log/messages, where the OOM killer prints a snapshot of memory usage at the time of the event:

Apr 26 04:18:15 elastichost kernel: kworker/7:1 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
Apr 26 04:18:15 elastichost kernel: kworker/7:1 cpuset=/ mems_allowed=0
Apr 26 04:18:15 elastichost kernel: CPU: 7 PID: 13968 Comm: kworker/7:1 Kdump: loaded Not tainted 3.10.0-957.1.3.el7.x86_64 #1
Apr 26 04:18:15 elastichost kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
Apr 26 04:18:15 elastichost kernel: Workqueue: events_freezable vmballoon_work [vmw_balloon]
Apr 26 04:18:15 elastichost kernel: Call Trace:
Apr 26 04:18:15 elastichost kernel: [<ffffffffa2961e41>] dump_stack+0x19/0x1b
Apr 26 04:18:15 elastichost kernel: [<ffffffffa295c86a>] dump_header+0x90/0x229
Apr 26 04:18:15 elastichost kernel: [<ffffffffa2301052>] ? ktime_get_ts64+0x52/0xf0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23581ef>] ? delayacct_end+0x8f/0xb0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23ba4e4>] oom_kill_process+0x254/0x3d0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23b9f8d>] ? oom_unkillable_task+0xcd/0x120
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23ba036>] ? find_lock_task_mm+0x56/0xc0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23bad26>] out_of_memory+0x4b6/0x4f0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa295d36e>] __alloc_pages_slowpath+0x5d6/0x724
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23c1105>] __alloc_pages_nodemask+0x405/0x420
Apr 26 04:18:15 elastichost kernel: [<ffffffffa240df68>] alloc_pages_current+0x98/0x110
Apr 26 04:18:15 elastichost kernel: [<ffffffffc05f1f84>] vmballoon_work+0x454/0x6ff [vmw_balloon]
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22b9d4f>] process_one_work+0x17f/0x440
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22bade6>] worker_thread+0x126/0x3c0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22bacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22c1c31>] kthread+0xd1/0xe0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22c1b60>] ? insert_kthread_work+0x40/0x40
Apr 26 04:18:15 elastichost kernel: [<ffffffffa2974c37>] ret_from_fork_nospec_begin+0x21/0x21
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22c1b60>] ? insert_kthread_work+0x40/0x40
Apr 26 04:18:15 elastichost kernel: Mem-Info:
Apr 26 04:18:15 elastichost kernel: active_anon:3649243 inactive_anon:439380 isolated_anon:0#012 active_file:480 inactive_file:693 isolated_file:0#012 unevictable:0 dirty:2 writeback:0 unstable:0#012 slab_reclaimable:77845 slab_unreclaimable:10219#012 mapped:10086 shmem:9596 pagetables:16727 bounce:0#012 free:50116 free_pcp:238 free_cma:0
Apr 26 04:18:15 elastichost kernel: Node 0 DMA free:15892kB min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 2829 31993 31993
Apr 26 04:18:15 elastichost kernel: Node 0 DMA32 free:122768kB min:5972kB low:7464kB high:8956kB active_anon:471052kB inactive_anon:471040kB active_file:232kB inactive_file:460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:2897760kB mlocked:0kB dirty:0kB writeback:0kB mapped:584kB shmem:300kB slab_reclaimable:23908kB slab_unreclaimable:3872kB kernel_stack:656kB pagetables:6440kB unstable:0kB bounce:0kB free_pcp:288kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:214 all_unreclaimable? no
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 0 29163 29163
Apr 26 04:18:15 elastichost kernel: Node 0 Normal free:61804kB min:61576kB low:76968kB high:92364kB active_anon:14125920kB inactive_anon:1286480kB active_file:1688kB inactive_file:2312kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:30408704kB managed:29866500kB mlocked:0kB dirty:8kB writeback:0kB mapped:39760kB shmem:38084kB slab_reclaimable:287472kB slab_unreclaimable:36988kB kernel_stack:5680kB pagetables:60468kB unstable:0kB bounce:0kB free_pcp:804kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:698 all_unreclaimable? no
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 0 0 0
Apr 26 04:18:15 elastichost kernel: Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Apr 26 04:18:15 elastichost kernel: Node 0 DMA32: 407*4kB (UEM) 506*8kB (UEM) 431*16kB (UM) 364*32kB (M) 272*64kB (UM) 179*128kB (UM) 99*256kB (UM) 35*512kB (M) 11*1024kB (M) 2*2048kB (M) 0*4096kB = 123164kB
Apr 26 04:18:15 elastichost kernel: Node 0 Normal: 15576*4kB (UM) 39*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 62616kB
Apr 26 04:18:15 elastichost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Apr 26 04:18:15 elastichost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 26 04:18:15 elastichost kernel: 44652 total pagecache pages
Apr 26 04:18:15 elastichost kernel: 34162 pages in swap cache
Apr 26 04:18:15 elastichost kernel: Swap cache stats: add 10884045, delete 10845872, find 3294122/3678936
Apr 26 04:18:15 elastichost kernel: Free swap  = 0kB
Apr 26 04:18:15 elastichost kernel: Total swap = 2097148kB
Apr 26 04:18:15 elastichost kernel: 8388478 pages RAM
Apr 26 04:18:15 elastichost kernel: 0 pages HighMem/MovableOnly
Apr 26 04:18:15 elastichost kernel: 193436 pages reserved
Apr 26 04:18:15 elastichost kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Apr 26 04:18:15 elastichost kernel: [ 3164]     0  3164    47420    12996     100     6314             0 systemd-journal
Apr 26 04:18:15 elastichost kernel: [ 3192]     0  3192    11158        2      24      174         -1000 systemd-udevd
Apr 26 04:18:15 elastichost kernel: [ 3198]     0  3198    66023        0      31      117             0 lvmetad
Apr 26 04:18:15 elastichost kernel: [ 6051]     0  6051    15511       20      29      138         -1000 auditd
Apr 26 04:18:15 elastichost kernel: [ 6073]    81  6073    14557       83      32       88          -900 dbus-daemon
Apr 26 04:18:15 elastichost kernel: [ 6075]    32  6075    18412       16      39      166             0 rpcbind
Apr 26 04:18:15 elastichost kernel: [ 6076]     0  6076    50404        0      37      171             0 gssproxy
Apr 26 04:18:15 elastichost kernel: [ 6077]     0  6077     5422       50      15       41             0 irqbalance
Apr 26 04:18:15 elastichost kernel: [ 6078]     0  6078   118943      155      85      362             0 NetworkManager
Apr 26 04:18:15 elastichost kernel: [ 6086]     0  6086     6594       47      18       41             0 systemd-logind
Apr 26 04:18:15 elastichost kernel: [ 6087]     0  6087    24892        0      42      402             0 VGAuthService
Apr 26 04:18:15 elastichost kernel: [ 6088]     0  6088    56746      113      58      246             0 vmtoolsd
Apr 26 04:18:15 elastichost kernel: [ 6089]   999  6089   153086      130      61     1787             0 polkitd
Apr 26 04:18:15 elastichost kernel: [ 6129]    38  6129    11817       39      27      140             0 ntpd
Apr 26 04:18:15 elastichost kernel: [ 6464]     0  6464    56962      132      62      970             0 snmpd
Apr 26 04:18:15 elastichost kernel: [ 6468]     0  6468    97733     4770     100      482             0 rsyslogd
Apr 26 04:18:15 elastichost kernel: [ 6470]     0  6470   470477     6332     100      781             0 sh-metricbeat
Apr 26 04:18:15 elastichost kernel: [ 6472]     0  6472   250084     2279      72     2367             0 sh-filebeat
Apr 26 04:18:15 elastichost kernel: [ 6483]     0  6483    28189       26      57      231         -1000 sshd
Apr 26 04:18:15 elastichost kernel: [ 6485]     0  6485   143455      107      97     2667             0 tuned
Apr 26 04:18:15 elastichost kernel: [ 6569]     0  6569    76290      625      41      188             0 avagent.bin
Apr 26 04:18:15 elastichost kernel: [ 6574]    29  6574    12239        1      27      254             0 rpc.statd
Apr 26 04:18:15 elastichost kernel: [ 6601]     0  6601    31572       30      18      129             0 crond
Apr 26 04:18:15 elastichost kernel: [ 6609]     0  6609    27523        1      10       32             0 agetty
Apr 26 04:18:15 elastichost kernel: [ 7451]   494  7451   315970    36412     364     8550             0 node
Apr 26 04:18:15 elastichost kernel: [11272]     0 11272    39154        0      80      336             0 sshd
Apr 26 04:18:15 elastichost kernel: [11277]     0 11277    28885        2      12      112             0 bash
Apr 26 04:18:15 elastichost kernel: [16070]   495 16070 47122631  3994129   15027   457539             0 java
Apr 26 04:18:15 elastichost kernel: [16264]   495 16264    18032        0      31      162             0 controller
Apr 26 04:18:15 elastichost kernel: Out of memory: Kill process 16070 (java) score 512 or sacrifice child
Apr 26 04:18:15 elastichost kernel: Killed process 16264 (controller) total-vm:72128kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Apr 26 04:18:15 elastichost kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Apr 26 04:18:15 elastichost kernel: java cpuset=/ mems_allowed=0
Apr 26 04:18:15 elastichost kernel: CPU: 2 PID: 16265 Comm: java Kdump: loaded Not tainted 3.10.0-957.1.3.el7.x86_64 #1
Apr 26 04:18:15 elastichost kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
Apr 26 04:18:15 elastichost kernel: Call Trace:
Apr 26 04:18:15 elastichost kernel: [<ffffffffa2961e41>] dump_stack+0x19/0x1b
Apr 26 04:18:15 elastichost kernel: [<ffffffffa295c86a>] dump_header+0x90/0x229
Apr 26 04:18:15 elastichost kernel: [<ffffffffa2301052>] ? ktime_get_ts64+0x52/0xf0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23581ef>] ? delayacct_end+0x8f/0xb0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23ba4e4>] oom_kill_process+0x254/0x3d0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23b9f8d>] ? oom_unkillable_task+0xcd/0x120
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23ba036>] ? find_lock_task_mm+0x56/0xc0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23bad26>] out_of_memory+0x4b6/0x4f0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa295d36e>] __alloc_pages_slowpath+0x5d6/0x724
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23c1105>] __alloc_pages_nodemask+0x405/0x420
Apr 26 04:18:15 elastichost kernel: [<ffffffffa240df68>] alloc_pages_current+0x98/0x110
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23b6347>] __page_cache_alloc+0x97/0xb0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23b8fa8>] filemap_fault+0x298/0x490
Apr 26 04:18:15 elastichost kernel: [<ffffffffc0484d0e>] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
Apr 26 04:18:15 elastichost kernel: [<ffffffffa22c2dc0>] ? wake_bit_function+0x40/0x40
Apr 26 04:18:15 elastichost kernel: [<ffffffffc0484f0c>] xfs_filemap_fault+0x2c/0x30 [xfs]
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23e444a>] __do_fault.isra.59+0x8a/0x100
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23e49fc>] do_read_fault.isra.61+0x4c/0x1b0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23e93a4>] handle_pte_fault+0x2f4/0xd10
Apr 26 04:18:15 elastichost kernel: [<ffffffffa23ebedd>] handle_mm_fault+0x39d/0x9b0
Apr 26 04:18:15 elastichost kernel: [<ffffffffa296f5e3>] __do_page_fault+0x203/0x500
Apr 26 04:18:15 elastichost kernel: [<ffffffffa296f915>] do_page_fault+0x35/0x90
Apr 26 04:18:15 elastichost kernel: [<ffffffffa296b758>] page_fault+0x28/0x30
Apr 26 04:18:15 elastichost kernel: Mem-Info:
Apr 26 04:18:15 elastichost kernel: active_anon:3607073 inactive_anon:480522 isolated_anon:0#012 active_file:8 inactive_file:0 isolated_file:0#012 unevictable:0 dirty:0 writeback:1 unstable:0#012 slab_reclaimable:75170 slab_unreclaimable:10131#012 mapped:2070 shmem:9592 pagetables:16696 bounce:0#012 free:50006 free_pcp:72 free_cma:0
Apr 26 04:18:15 elastichost kernel: Node 0 DMA free:15892kB min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 2829 31993 31993
Apr 26 04:18:15 elastichost kernel: Node 0 DMA32 free:122560kB min:5972kB low:7464kB high:8956kB active_anon:471000kB inactive_anon:471076kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:2897760kB mlocked:0kB dirty:0kB writeback:0kB mapped:292kB shmem:288kB slab_reclaimable:22832kB slab_unreclaimable:3752kB kernel_stack:672kB pagetables:6408kB unstable:0kB bounce:0kB free_pcp:272kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:215 all_unreclaimable? yes
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 0 29163 29163
Apr 26 04:18:15 elastichost kernel: Node 0 Normal free:61572kB min:61576kB low:76968kB high:92364kB active_anon:13957292kB inactive_anon:1451012kB active_file:32kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:30408704kB managed:29866500kB mlocked:0kB dirty:0kB writeback:4kB mapped:7988kB shmem:38080kB slab_reclaimable:277848kB slab_unreclaimable:36756kB kernel_stack:5664kB pagetables:60376kB unstable:0kB bounce:0kB free_pcp:16kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:500 all_unreclaimable? yes
Apr 26 04:18:15 elastichost kernel: lowmem_reserve[]: 0 0 0 0
Apr 26 04:18:15 elastichost kernel: Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Apr 26 04:18:15 elastichost kernel: Node 0 DMA32: 393*4kB (UEM) 490*8kB (UEM) 453*16kB (EM) 369*32kB (UEM) 269*64kB (M) 180*128kB (M) 98*256kB (M) 35*512kB (M) 12*1024kB (UM) 1*2048kB (M) 0*4096kB = 122148kB
Apr 26 04:18:15 elastichost kernel: Node 0 Normal: 15440*4kB (UM) 1*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 61768kB
Apr 26 04:18:15 elastichost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Apr 26 04:18:15 elastichost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 26 04:18:15 elastichost kernel: 42975 total pagecache pages
Apr 26 04:18:15 elastichost kernel: 33308 pages in swap cache
Apr 26 04:18:15 elastichost kernel: Swap cache stats: add 10884233, delete 10846914, find 3294127/3678944
Apr 26 04:18:15 elastichost kernel: Free swap  = 0kB
Apr 26 04:18:15 elastichost kernel: Total swap = 2097148kB
Apr 26 04:18:15 elastichost kernel: 8388478 pages RAM
Apr 26 04:18:15 elastichost kernel: 0 pages HighMem/MovableOnly
Apr 26 04:18:15 elastichost kernel: 193436 pages reserved
Apr 26 04:18:15 elastichost kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Apr 26 04:18:15 elastichost kernel: [ 3164]     0  3164    47420     4016     100     6314             0 systemd-journal
Apr 26 04:18:15 elastichost kernel: [ 3192]     0  3192    11158        2      24      174         -1000 systemd-udevd
Apr 26 04:18:15 elastichost kernel: [ 3198]     0  3198    66023        0      31      117             0 lvmetad
Apr 26 04:18:15 elastichost kernel: [ 6051]     0  6051    15511       20      29      138         -1000 auditd
Apr 26 04:18:15 elastichost kernel: [ 6073]    81  6073    14557       78      32       88          -900 dbus-daemon
Apr 26 04:18:15 elastichost kernel: [ 6075]    32  6075    18412       16      39      166             0 rpcbind
Apr 26 04:18:15 elastichost kernel: [ 6076]     0  6076    50404        0      37      171             0 gssproxy
Apr 26 04:18:15 elastichost kernel: [ 6077]     0  6077     5422       46      15       41             0 irqbalance
Apr 26 04:18:15 elastichost kernel: [ 6078]     0  6078   118943      155      85      362             0 NetworkManager
Apr 26 04:18:15 elastichost kernel: [ 6086]     0  6086     6594       42      18       41             0 systemd-logind
Apr 26 04:18:15 elastichost kernel: [ 6087]     0  6087    24892        0      42      402             0 VGAuthService
Apr 26 04:18:15 elastichost kernel: [ 6088]     0  6088    56746       91      58      246             0 vmtoolsd
Apr 26 04:18:15 elastichost kernel: [ 6089]   999  6089   153086      130      61     1787             0 polkitd
Apr 26 04:18:15 elastichost kernel: [ 6129]    38  6129    11817       36      27      140             0 ntpd
Apr 26 04:18:15 elastichost kernel: [ 6464]     0  6464    56962      130      62      970             0 snmpd
Apr 26 04:18:15 elastichost kernel: [ 6468]     0  6468    97733     1215     100      485             0 rsyslogd
Apr 26 04:18:15 elastichost kernel: [ 6470]     0  6470   470477     6230     100      781             0 sh-metricbeat
Apr 26 04:18:15 elastichost kernel: [ 6472]     0  6472   250084     2279      72     2367             0 sh-filebeat
Apr 26 04:18:15 elastichost kernel: [ 6483]     0  6483    28189       26      57      231         -1000 sshd
Apr 26 04:18:15 elastichost kernel: [ 6485]     0  6485   143455      107      97     2667             0 tuned
Apr 26 04:18:15 elastichost kernel: [ 6569]     0  6569    76290      625      41      188             0 avagent.bin
Apr 26 04:18:15 elastichost kernel: [ 6574]    29  6574    12239        1      27      254             0 rpc.statd
Apr 26 04:18:15 elastichost kernel: [ 6601]     0  6601    31572       27      18      129             0 crond
Apr 26 04:18:15 elastichost kernel: [ 6609]     0  6609    27523        1      10       32             0 agetty
Apr 26 04:18:15 elastichost kernel: [ 7451]   494  7451   315970    36412     364     8550             0 node
Apr 26 04:18:15 elastichost kernel: [11272]     0 11272    39154        0      80      336             0 sshd
Apr 26 04:18:15 elastichost kernel: [11277]     0 11277    28885        2      12      112             0 bash
Apr 26 04:18:15 elastichost kernel: [16070]   495 16070 47122631  3993992   15027   457699             0 java
Apr 26 04:18:15 elastichost kernel: Out of memory: Kill process 16291 (java) score 512 or sacrifice child
Apr 26 04:18:15 elastichost kernel: Killed process 16070 (java) total-vm:188490524kB, anon-rss:15975968kB, file-rss:0kB, shmem-rss:0kB
Apr 26 04:18:16 elastichost systemd: elasticsearch.service: main process exited, code=killed, status=9/KILL
Apr 26 04:18:16 elastichost systemd: Unit elasticsearch.service entered failed state.
Apr 26 04:18:16 elastichost systemd: elasticsearch.service failed.
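
To locate these events quickly, dmesg or a grep over the syslog file is usually enough (a minimal sketch; log paths and message formats vary by distribution):

# kernel ring buffer (may have rotated out if the event is old)
dmesg -T | grep -iE 'out of memory|oom-killer'

# persistent syslog on RHEL/CentOS 7
grep -iE 'out of memory|oom-killer' /var/log/messages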

The line “Free swap = 0kB” suggests that swap was exhausted when the OOM killer fired. So does Elasticsearch contribute to the swap shortage?

Elasticsearch was the main process on the VM, and I noticed that swap had not been disabled, which goes against the best practice recommended by the Elasticsearch community. To find out whether Elasticsearch was being pushed into swap, we first get its process ID:

pidof java

Elasticsearch happens to be the only Java-based process, and its PID is 2283. The following command shows the swap usage of that process:

cat /proc/2283/status | grep VmSwap

If it shows a non-zero value, then Elasticsearch is using swap and you should expect performance issues. We should definitely disable swapping for Elasticsearch, but does that solve the problem? Probably not, because it does not address what caused the memory pressure in the first place. Disabling swap most likely just lets Elasticsearch last a little longer before something else, such as a shortage of available memory, triggers the OOM killer.
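
For reference, disabling swap for Elasticsearch can be done at the OS level or via memory locking (a sketch, assuming a systemd-managed Elasticsearch on Linux; adjust for your setup):

# option 1: turn off swap entirely on the guest
swapoff -a          # and comment out the swap entry in /etc/fstab to make it permanent

# option 2: let Elasticsearch lock its heap in RAM instead
# set bootstrap.memory_lock: true in /etc/elasticsearch/elasticsearch.yml
# and allow the service to lock memory, e.g. LimitMEMLOCK=infinity in its systemd unit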

In search of the source of the memory shortage, I checked the output of the free command and the top command. The free command (with -mh) shows the following:

              total        used        free      shared  buff/cache   available
Mem:            31G         20G         10G        117M        243M         10G
Swap:          2.0G        249M        1.8G

I tried to follow my example from a previous post to make sense of these memory readings. When I monitored processes with top and watched the RSS column, I could not identify a single process that took more than a few hundred megabytes. I then installed the smem tool from the EPEL repository, and the output of “smem -kt” shows that the total RSS across all processes is only about 372.2M.
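
If smem is not already installed, it can be pulled from the EPEL repository and run like this (commands for CentOS/RHEL 7; package names may differ elsewhere):

yum install -y epel-release
yum install -y smem
smem -kt        # per-process memory in human-readable units, with a totals row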

So about 19.6 GB of memory usage is unaccounted for (the roughly 20 GB that free reports as used, minus the ~0.4 GB of total process RSS). Something in the VM is holding that much memory, and it keeps haunting the system even after the OOM killer has run!

Inspired by this post, I was able to identify the culprit: the memory balloon. We don't have access to the host, but from the guest OS we can check with vmware-toolbox-cmd (this requires the open-vm-tools yum package):

vmware-toolbox-cmd stat balloon

The result shows a whopping 20807 MB held by the memory balloon! This needs to go to the PaaS vendor for investigation, but it is most likely the result of memory over-allocation/over-commitment at the host level, combined with the guest's memory not being reserved. Looking back at the original log snippet, the line containing “events_freezable vmballoon_work” is another indicator that the balloon caused the OOM.
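
A few related queries through the same tool help complete the picture from inside the guest (subcommands provided by open-vm-tools; availability may depend on the tools version):

vmware-toolbox-cmd stat balloon    # memory currently claimed by the balloon, in MB
vmware-toolbox-cmd stat swap       # guest memory swapped by the hypervisor, in MB
vmware-toolbox-cmd stat memres     # memory reservation configured for this VM, in MB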

The hypervisor uses memory ballooning to reclaim memory from a guest. Since the guest OS does not expect its amount of physical memory to change, the hypervisor has to maintain the illusion that the guest still has its fixed amount of physical memory. The hypervisor first computes how much memory it needs to reclaim, then it leverages a low-level mechanism such as a balloon driver (a pseudo-device driver) installed in the guest. The driver communicates with the hypervisor and is told to allocate or de-allocate memory. When the driver is told to allocate memory on behalf of the host, it asks the guest OS to pin the allocated pages in physical memory; those pages are locked, and the physical memory available to the guest OS shrinks. This low-level mechanism explains why it is so hard to account for memory consumption by looking at the processes running in the guest OS.
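
From inside the guest, the only visible traces of the balloon are the driver itself and, if debugfs is mounted, its counters (a sketch; the vmw_balloon module typically exposes its state under the debugfs path below, which may vary by kernel version):

lsmod | grep vmw_balloon            # confirm the balloon driver is loaded
cat /sys/kernel/debug/vmmemctl      # current/target balloon size, if exposed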

Ballooning is a dynamic process: the hypervisor keeps adjusting the size of the balloon. However, the hypervisor, the guest OS and the application process (Elasticsearch in this case) can fall into an unfortunate interaction:

  1. The hypervisor reclaims memory from the guest OS (ballooning);
  2. The guest OS runs out of memory and invokes the OOM killer;
  3. The OOM killer picks a process to kill based on oom_score. Unfortunately, the true culprit (the balloon) is not visible to the guest OS as a process, so it is never a candidate to be killed;
  4. Killing the application process (usually the VM's main application) frees up a fair chunk of guest memory;
  5. The balloon then becomes even more aggressive, since the hypervisor sees more guest memory become “available”;
  6. The application on the guest no longer has the minimum amount of memory it needs to start.

How do we solve this problem? There are several things to consider: on the hypervisor, reserve all or part of the guest memory so the application always has what it needs to start; inside the guest, tune OOM scoring so that non-critical processes are killed first; and run the application under a daemon that restarts it, and so on.
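
For example, OOM scoring can be tuned per service so that the OOM killer prefers other processes over Elasticsearch (a sketch using a systemd drop-in; the value is illustrative, not a recommendation):

# /etc/systemd/system/elasticsearch.service.d/oom.conf
[Service]
OOMScoreAdjust=-500

systemctl daemon-reload
systemctl restart elasticsearch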