Bug #3900: After a completed rule reload, Suricata sometimes is stuck for 1h with `rs_nfs_state_get_tx` peak - Suricata - Open Information Security Foundation

Actions

Copy link

Bug #3900

closed

After a completed rule reload, Suricata sometimes is stuck for 1h with `rs_nfs_state_get_tx` peak

Added by Andreas Herz over 4 years ago. Updated almost 2 years ago.

Status:

Closed

Priority:

Normal

Assignee:

OISF Dev

Target version:

TBD

Affected Versions:

5.0.3

Effort:

Difficulty:

Label:

Description

We discovered some strange behavior with Suricata on one machine after the rule reload happened.

1/9/2020 -- 06:27:18 - <Notice> - rule reload starting
1/9/2020 -- 06:27:31 - <Info> - 3 rule files processed. 40732 rules successfully loaded, 0 rules failed
1/9/2020 -- 06:27:32 - <Info> - Threshold config parsed: 0 rule(s) found
1/9/2020 -- 06:27:33 - <Info> - 40761 signatures processed. 0 are IP-only rules, 10671 are inspecting packet payload, 30034 inspect application layer, 0 are decoder event only
1/9/2020 -- 06:30:09 - <Info> - cleaning up signature grouping structure... complete
1/9/2020 -- 06:30:09 - <Notice> - rule reload complete
1/9/2020 -- 06:30:09 - <Info> - Unix socket: lost connection with client

I see in htop that 1 process is at 100% cpu load, I still see stats.log being updated but NO values change. So Suricata itself is still running.

With perf top I saw the attached peak of `rs_nfs_state_get_tx` function.

Samples: 1M of event 'cycles', 4000 Hz, Event count (approx.): 53158890162, DSO: suricata                                                                                                                                                      
rs_nfs_state_get_tx  /usr/bin/suricata [Percent: local period]                                                                                                                                                                                 
Percent│                                                                                                                                                                                                                                      ◆
       │                                                                                                                                                                                                                                      ▒
       │                                                                                                                                                                                                                                      ▒
       │    Disassembly of section .text:                                                                                                                                                                                                     ▒
       │                                                                                                                                                                                                                                      ▒
       │    00000000002a0c30 <strlcpy@@Base+0x2a0c0>:                                                                                                                                                                                         ▒
  0.00 │      add  $0x1,%rsi                                                                                                                                                                                                                  ▒
       │      mov  $0xfffffffffffffed0,%rax                                                                                                                                                                                                   ▒
  0.00 │      add  0x50(%rdi),%rax                                                                                                                                                                                                            ▒
  0.00 │      imul $0x130,0x60(%rdi),%rcx                                                                                                                                                                                                     ▒
       │      nop                                                                                                                                                                                                                             ▒
  0.67 │20:   test %rcx,%rcx                                                                                                                                                                                                                  ▒
  0.00 │    ↓ je   3d                                                                                                                                                                                                                         ▒
  0.94 │      add  $0xfffffffffffffed0,%rcx                                                                                                                                                                                                   ▒
 94.22 │      cmp  %rsi,0x130(%rax)                                                                                                                                                                                                           ▒
  2.10 │      lea  0x130(%rax),%rax                                                                                                                                                                                                           ▒
  2.08 │    ↑ jne  20                                                                                                                                                                                                                         ▒
       │    ← retq                                                                                                                                                                                                                            ▒
       │3d:   xor  %eax,%eax                                                                                                                                                                                                                  ▒
  0.00 │    ← retq

It's somehow stuck with the `cmp`.

Since it's just at this machine we suspect that some sort of traffic could produce that. This is also backed by the fact that this doesn't happen on every daily rule reload.

Files

suri_rulereloadbug.png (11.8 KB) suri_rulereloadbug.png

perf top output

Andreas Herz, 09/01/2020 07:02 AM