Bug #6963

rule-reload: potential memory leak in multiple rule reloads

Added by Andreas Herz 7 months ago. Updated about 2 months ago.

Status:
New
Priority:
Normal

Description

There is a potential memory leak in Suricata 6.0.18 and 7.0.4 that is revealed by the memory usage on rule reloads.
It does not require any traffic to be forwarded.

To reproduce it, start Suricata with the ET Open ruleset and default/basic settings. You can use a dummy interface instead of an actual interface that receives traffic:

ip link add dummy0 type dummy
ip link set dummy0 up
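
Suricata can then be started against that interface, for example like this (the config and rules paths are just examples and depend on the installation):

# paths are examples, adjust to your setup
suricata -c /etc/suricata/suricata.yaml -S /var/lib/suricata/rules/suricata.rules -i dummy0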

Once Suricata has started, check the memory usage, for example with htop. After 2-3 minutes, trigger a rule reload (either via suricatasc or by sending the USR2 signal, as shown below).
Observe the memory usage and repeat this a few times. You should see that in most cases the memory usage increases during the reload and is reduced a bit at the end, but the overall difference between before and after the reload is positive.
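
For reference, both ways to trigger the reload, plus a way to read the VIRT/RES values without htop (VmSize/VmRSS in /proc correspond to htop's VIRT/RES; the pidof lookup assumes a single Suricata process):

# via the unix socket
suricatasc -c reload-rules
# or via signal
kill -USR2 $(pidof suricata)
# read VIRT (VmSize) and RES (VmRSS) directly
grep -E 'VmSize|VmRSS' /proc/$(pidof suricata)/status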

On a test run with 6.0.18 I saw the following usage for the Suricata process, with the first value being the "VIRT" and the second the "RES" memory value as read from htop:

3170/732
3443/993
3485/1026
3486/1039
3490/1049

These are the values for 7.0.4:

3271/883
3587/1182
3653/1247
3666/1278
3679/1291

The PR https://github.com/OISF/suricata/pull/9756, which is linked from https://redmine.openinfosecfoundation.org/issues/6454, doesn't change this issue (I tried a backport of that PR to 7.0.4).

Actions #1

Updated by Andreas Herz 7 months ago

Some more additions: the first bump in memory usage happens with the call of "SigLoadSignatures", from 3271/883 to 3888/1499 in the example. This increase is expected, since the new rules are loaded in parallel while the current ruleset is still active and is only swapped out later.
The second bump happens when "DetectEngineReloadThreads" is triggered; this bump is much smaller, from 3888/1499 to 3907/1502, so very minor.
With "DetectEnginePruneFreeList" most of the memory is freed again, down to 3587/1182, and the new addition from the malloc PR reduces it further to 3587/964, which makes sense since less RES memory is used once "malloc_trim" runs. But we still have an overall increase with each reload, so over time the system memory will be exhausted at some point.

Actions #2

Updated by Andreas Herz 6 months ago

Further investigation with different runs of Suricata with the default `glibc` allocator, `jemalloc` and `tcmalloc` showed no difference in the root issue. There is a slight difference in how much memory is used (see table below), but the steady increase is there in all 3 cases.
The additional output described in https://blog.inliniac.net/2014/12/23/profiling-suricata-with-jemalloc/ for jemalloc also showed no actual leak, so it's more of a logical "leak".

`runmode=single` shows the issue as well.
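
For reference, the alternative allocators were swapped in via LD_PRELOAD; the library paths below are examples from a Debian/Ubuntu system and vary per distribution:

# library paths are examples, adjust to your distribution
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 suricata -c /etc/suricata/suricata.yaml -i dummy0
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 suricata -c /etc/suricata/suricata.yaml -i dummy0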

Memory usage (VIRT/RES) after the start, after the first reload and after the second reload:

Allocator         After start   After 1st reload   After 2nd reload
glibc (default)   1065/761      1381/859           1402/1096
tcmalloc          966/854       1607/1494          1657/1595
jemalloc          1185/788      2007/975           2119/972

Actions #3

Updated by Andreas Herz about 2 months ago

I had some time to play around with the suggestion from Victor to see if it's related to threshold, classification and/or reference. I tried an ET ruleset where I removed all metadata (except sid/rev/msg) and all thresholds. The "leak" is still present there as well.
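
For reference, a crude sketch of how such keywords can be stripped from a ruleset (the expressions and file name are just an example, not the exact script I used):

# strip metadata/reference/classtype/threshold keywords from the rules
sed -E -i 's/(metadata|reference|classtype|threshold):[^;]*; ?//g' suricata.rules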
