Bug #3516

closed

Suricata Out of memory: Kill process

Added by xu hui almost 5 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
Affected Versions:
Effort:
Difficulty:
Label:

Description

Hi team,
I deployed Suricata v5.0.2 on an AWS EC2 instance and use its VXLAN decoding to extract DNS data (only DNS traffic is mirrored to this instance). The machine recently raised an out-of-memory alarm, and the Suricata process was killed by the kernel.
This now happens every day. I hope you can help me: do I need to add another server?

Sample from kern.log:

Mar  7 06:25:09 ip-10-180-102-245 kernel: [99601.312428] new mount options do not match the existing superblock, will be ignored
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718745] W#03-ens5 invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718747] W#03-ens5 cpuset=/ mems_allowed=0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718751] CPU: 3 PID: 21372 Comm: W#03-ens5 Not tainted 4.15.0-1060-aws #62-Ubuntu
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718752] Hardware name: Amazon EC2 c5n.4xlarge/, BIOS 1.0 10/16/2017
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718753] Call Trace:
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718761]  dump_stack+0x6d/0x8e
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718764]  dump_header+0x71/0x285
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718767]  ? security_capable_noaudit+0x4b/0x70
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718769]  oom_kill_process+0x21f/0x420
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718770]  out_of_memory+0x116/0x4e0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718772]  __alloc_pages_slowpath+0xa53/0xe00
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718774]  __alloc_pages_nodemask+0x29a/0x2c0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718777]  alloc_pages_current+0x6a/0xe0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718780]  __page_cache_alloc+0x81/0xa0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718782]  filemap_fault+0x3ea/0x6f0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718784]  ? filemap_map_pages+0x181/0x390
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718786]  ext4_filemap_fault+0x31/0x44
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718788]  __do_fault+0x5b/0x115
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718789]  __handle_mm_fault+0xdef/0x1290
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718792]  ? futex_wake+0x8f/0x180
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718794]  handle_mm_fault+0xb1/0x210
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718797]  __do_page_fault+0x281/0x4b0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718800]  ? ktime_get_ts64+0x51/0xf0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718801]  do_page_fault+0x2e/0xe0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718805]  ? async_page_fault+0x2f/0x50
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718809]  do_async_page_fault+0x51/0x80
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718810]  async_page_fault+0x45/0x50
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718812] RIP: 0033:0x7f5a929b3ad0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718813] RSP: 002b:00007f5a8f484d78 EFLAGS: 00010287
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718814] RAX: 00007f5a842677e0 RBX: 00007f5a842677e0 RCX: 0000000000000000
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718815] RDX: 000000000000003f RSI: 00007f5a8c43a0c2 RDI: 00007f5a84267b80
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718815] RBP: 00007f5a84268180 R08: 000000000000003f R09: 0000000000000003
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718816] R10: 0000000000000055 R11: 00007f5a8c43a0ac R12: 0000000000000003
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718817] R13: 0000563973654170 R14: 00007f5a8427a090 R15: 00007f5a8c43a0c2
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718818] Mem-Info:
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821] active_anon:10339830 inactive_anon:429 isolated_anon:0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821]  active_file:129 inactive_file:56 isolated_file:0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821]  unevictable:0 dirty:16 writeback:0 unstable:0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821]  slab_reclaimable:23954 slab_unreclaimable:35917
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821]  mapped:1293 shmem:502 pagetables:22019 bounce:0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718821]  free:58985 free_pcp:244 free_cma:0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718824] Node 0 active_anon:41359320kB inactive_anon:1716kB active_file:516kB inactive_file:224kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:5172kB dirty:64kB writeback:0kB shmem:2008kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 96256kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718825] Node 0 DMA free:15908kB min:24kB low:36kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718828] lowmem_reserve[]: 0 2972 41199 41199 41199
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718830] Node 0 DMA32 free:157744kB min:4872kB low:7916kB high:10960kB active_anon:2892356kB inactive_anon:0kB active_file:168kB inactive_file:88kB unevictable:0kB writepending:0kB present:3129316kB managed:3063748kB mlocked:0kB kernel_stack:80kB pagetables:5796kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718832] lowmem_reserve[]: 0 0 38226 38226 38226
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718834] Node 0 Normal free:62288kB min:62680kB low:101824kB high:140968kB active_anon:38467196kB inactive_anon:1716kB active_file:512kB inactive_file:1224kB unevictable:0kB writepending:0kB present:39845888kB managed:39148676kB mlocked:0kB kernel_stack:7200kB pagetables:82280kB bounce:0kB free_pcp:976kB local_pcp:104kB free_cma:0kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718837] lowmem_reserve[]: 0 0 0 0 0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718838] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718844] Node 0 DMA32: 8894*4kB (UME) 743*8kB (UME) 573*16kB (UME) 439*32kB (UME) 317*64kB (UME) 201*128kB (UME) 83*256kB (UME) 27*512kB (UME) 12*1024kB (UME) 0*2048kB 0*4096kB = 158112kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718850] Node 0 Normal: 494*4kB (UME) 406*8kB (UME) 3556*16kB (UME) 81*32kB (UME) 6*64kB (M) 2*128kB (M) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 65608kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718856] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718857] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718857] 1223 total pagecache pages
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718858] 0 pages in swap cache
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718859] Swap cache stats: add 0, delete 0, find 0/0
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718859] Free swap  = 0kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718860] Total swap = 0kB
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718860] 10747799 pages RAM
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718861] 0 pages HighMem/MovableOnly
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718861] 190716 pages reserved
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718862] 0 pages cma reserved
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718862] 0 pages hwpoisoned
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718863] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718876] [  759]     0   759    11901      114   131072        0             0 rpcbind
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718877] [ 1068]     0  1068    72022      248   196608        0             0 accounts-daemon
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718879] [ 1073]     0  1073   192063      925   204800        0             0 amazon-ssm-agen
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718881] [ 1078]   103  1078    12635      287   151552        0          -900 dbus-daemon
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718882] [ 1119]     0  1119    17697      240   180224        0             0 systemd-logind
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718883] [ 1123]     0  1123     7083       52   102400        0             0 atd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718885] [ 1137]     0  1137    42706     1945   221184        0             0 networkd-dispat
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718886] [ 1150]     0  1150    46917     1978   253952        0             0 unattended-upgr
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718887] [ 1154]     0  1154    77203       97    98304        0             0 lxcfs
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718888] [ 1159]     0  1159     7962       75   102400        0             0 cron
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718890] [ 1167]     0  1167    16563     3517   167936        0             0 supervisord
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718891] [ 1197]     0  1197   717602     3521   524288        0          -999 containerd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718892] [ 1232]     0  1232    72221      211   208896        0             0 polkitd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718893] [ 1252]     0  1252     4103       37    69632        0             0 agetty
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718895] [ 1258]     0  1258    18075      188   184320        0         -1000 sshd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718897] [ 1271]     0  1271     3722       32    69632        0             0 agetty
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718898] [ 4746]   100  4746    17998      184   172032        0             0 systemd-network
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718899] [ 4778]   101  4778    17660      167   180224        0             0 systemd-resolve
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718901] [ 4807] 62583  4807    35484      148   188416        0             0 systemd-timesyn
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718902] [ 4832]     0  4832    25988     2438   229376        0             0 systemd-journal
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718904] [ 8661]   106  8661     7149       45   102400        0             0 uuidd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718905] [ 8852]     0  8852    10801      262   114688        0         -1000 systemd-udevd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718906] [10433]     0 10433    27632      100   110592        0             0 irqbalance
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718908] [12403]     0 12403   750983     3798   507904        0          -900 snapd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718909] [15024]     0 15024    24427       45    90112        0             0 lvmetad
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718910] [21396]   102 21396    66818      363   180224        0             0 rsyslogd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718911] [12620]   114 12620    24471      214   225280        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718913] [12640]   114 12640    24471      534   217088        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718914] [12642]   114 12642    24471      254   217088        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718915] [12643]   114 12643    24471      254   217088        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718916] [12646]   114 12646    24471      254   217088        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718917] [12647]   114 12647    24471      238   217088        0             0 zabbix_agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718919] [21369]     0 21369 10465803 10203806 82219008        0             0 Suricata-Main
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718920] [12700]     0 12700   811505   107586  1581056        0             0 filebeat
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718921] [20638]     0 20638     9091      245    90112        0             0 ossec-execd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718923] [20643]   113 20643    64696      423   126976        0             0 ossec-agentd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718924] [20650]     0 20650    28048      860   102400        0             0 ossec-syscheckd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718925] [20655]     0 20655   101282      329   135168        0             0 ossec-logcollec
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718927] [20665]     0 20665   104394      766   159744        0             0 wazuh-modulesd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718928] [24815]     0 24815   491679     2218   479232        0             0 filebeat
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718930] [24679]     0 24679   204869      553   258048        0             0 kubelet
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718932] Out of memory: Kill process 21369 (Suricata-Main) score 939 or sacrifice child
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.722647] Killed process 21369 (Suricata-Main) total-vm:41863212kB, anon-rss:40815224kB, file-rss:0kB, shmem-rss:0kB
Mar  7 17:50:16 ip-10-180-102-245 kernel: [140708.385798] oom_reaper: reaped process 21369 (Suricata-Main), now anon-rss:0kB, file-rss:3584kB, shmem-rss:0kB
Mar  7 17:50:17 ip-10-180-102-245 kernel: [140709.304162] device ens5 left promiscuous mode
Mar  7 17:50:18 ip-10-180-102-245 kernel: [140710.614528] device ens5 entered promiscuous mode
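For anyone reading the task dump above: the total_vm and rss columns are counted in pages, not kB. Below is a minimal Python sketch, assuming 4 KiB pages (the usual x86_64 default), that parses a few rows copied from the log and ranks processes by resident memory. The names `parse_oom_rows`, `SAMPLE`, and `TOTAL_RAM_PAGES` are made up for this illustration and are not part of Suricata or any kernel tooling.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages (x86_64 default)

# One row of the OOM-killer task dump:
# [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+"
    r"(?P<name>\S+)\s*$"
)


def parse_oom_rows(lines):
    """Return task-dump rows sorted by resident memory, largest first."""
    rows = []
    for line in lines:
        m = ROW.search(line)
        if m:  # header and non-table lines do not match and are skipped
            rows.append({
                "pid": int(m["pid"]),
                "name": m["name"],
                "rss_pages": int(m["rss"]),
                "rss_kb": int(m["rss"]) * PAGE_KB,
            })
    return sorted(rows, key=lambda r: r["rss_kb"], reverse=True)


# Rows copied verbatim from the kern.log sample in this report.
SAMPLE = """\
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718863] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718895] [ 1258]     0  1258    18075      188   184320        0         -1000 sshd
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718919] [21369]     0 21369 10465803 10203806 82219008        0             0 Suricata-Main
Mar  7 17:50:13 ip-10-180-102-245 kernel: [140705.718920] [12700]     0 12700   811505   107586  1581056        0             0 filebeat
""".splitlines()

TOTAL_RAM_PAGES = 10747799  # from the "pages RAM" line in the same dump

rows = parse_oom_rows(SAMPLE)
for r in rows:
    share = r["rss_pages"] / TOTAL_RAM_PAGES
    print(f'{r["name"]:<16} pid={r["pid"]:<6} rss={r["rss_kb"]} kB ({share:.1%} of RAM)')
```

With the 4 KiB assumption, the Suricata-Main row works out to 10203806 * 4 = 40815224 kB, which matches the anon-rss value on the "Killed process" line, so Suricata alone held roughly 95% of physical memory when it was killed.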

EC2 Config:
c5n.4xlarge
vCPU: 16
Mem: 42G

OS

Linux ip-10-180-102-245 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Suricata version

This is Suricata version 5.0.2 RELEASE

Suricata Config


  - eve-log:
      enabled: yes
      filetype: regular
      filemode: 644
      filename: dns_query-%Y-%m-%d-%H:%M.json
      rotate-interval: 30m

      metadata: yes
      pcap-file: false
      community-id: false
      community-id-seed: 0

      types:
        - dns:
            version: 2
            requests: yes
            responses: no
            types: [a, cname, mx, ns, ptr, txt]

  - eve-log:
      enabled: yes
      filetype: regular
      filemode: 644
      filename: dns_answer-%Y-%m-%d-%H:%M.json
      rotate-interval: 30m

      metadata: yes
      pcap-file: false
      community-id: false
      community-id-seed: 0

      types:
        - dns:
            version: 2
            requests: no
            responses: yes
            formats: [grouped] #[detailed, grouped]
            types: [a, cname, mx, ns, ptr, txt]

af-packet:
  - interface: ens5
    threads: auto
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes

default-packet-size: 9015

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: ["0"]  # include only these CPUs in affinity settings
    - worker-cpu-set:
        cpu: ["1-15"]
        mode: "exclusive" 
        prio:
          low: []
          medium: ["0"]
          high: ["1-15"]
          default: "high" 

# free -m
              total        used        free      shared  buff/cache   available
Mem:          41238       37606         292           1        3340        3116
Swap:             0           0           0
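The free -m output can be turned into numbers for monitoring with a few lines of Python. This is only a sketch; `parse_free_mem` and `FREE_OUTPUT` are hypothetical names, and the text is the output pasted above.

```python
def parse_free_mem(text):
    """Map the column headers of `free -m` to the integer values of the Mem: row."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(header, values))
    raise ValueError("no Mem: row found")


# Output of `free -m` as pasted in this report (values in MiB).
FREE_OUTPUT = """
              total        used        free      shared  buff/cache   available
Mem:          41238       37606         292           1        3340        3116
Swap:             0           0           0
"""

mem = parse_free_mem(FREE_OUTPUT)
used_share = mem["used"] / mem["total"]
print(f'used {mem["used"]} of {mem["total"]} MiB ({used_share:.1%}), '
      f'available {mem["available"]} MiB')
```

Here about 91% of memory is in use and no swap is configured, so there is little headroom left before the kernel OOM killer is invoked.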

NIC traffic from Zabbix

Memory usage from Zabbix

CPU load from Zabbix

