Bug #6416

open

Suricata not using Myricom SNF driver in a performant way

Added by Peter McPherson about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee:
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

I am not seeing Suricata perform well when running it on a Myricom 10G interface with the SNF (Sniffer10G v3) driver.

I configured and built it with these flags:

./configure --with-libpcap-includes=/opt/snf/include/ --with-libpcap-libraries=/opt/snf/lib/ --prefix=/usr --sysconfdir=/etc --localstatedir=/var

This follows the instructions here: https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Myricom
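
A quick way to double-check that the resulting binary actually resolves the SNF libpcap rather than the system one (assuming it installed to /usr/bin/suricata given the --prefix above) is something like:

# Show which libpcap the suricata binary resolves at runtime; it should point
# at the copy under /opt/snf/lib (the loader may need that directory via
# ldconfig or LD_LIBRARY_PATH, otherwise it falls back to the system libpcap).
ldd /usr/bin/suricata | grep -i pcap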

After that, I run Suricata manually with 15 threads and 256MB per thread defined in suricata.yaml, using:

SNF_NUM_RINGS=15 SNF_FLAGS=0x2 SNF_APP_ID=37 /bin/suricata -c /etc/suricata/suricata.yaml --pcap --runmode=workers
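
The 15 threads and 256MB per thread referenced above live in the pcap section of suricata.yaml; a minimal sketch of that section (the interface name here is just an example, not my actual port) looks like:

pcap:
  - interface: eth2           # example name only; substitute the Myricom port
    threads: 15               # one capture thread per SNF ring
    buffer-size: 268435456    # 256MB per thread, in bytes
    checksum-checks: auto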

I am running on a Dell R640 with 32 cores and 500GB of memory, using the SNF v3.0.26 driver.

When replaying 5Gbps at the Myricom interface, I can confirm that Suricata is using the SNF driver, because I can see the events coming in under the SNF rows in myri_counters. But I see about a 50% drop rate from Suricata, as reported both by myri_counters (dividing "SNF drop ring full" by "SNF recv pkts") and by Suricata itself. Interestingly, I think Suricata might get its drop metrics from the Myricom counters: if I don't reset myri_counters, Suricata reports wacky dropped numbers, like 180%.
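
For concreteness, the ratio I'm describing is along these lines (this sketch assumes myri_counters prints each counter as a "name: value" line; the parsing may need adjusting for other driver versions):

# Rough SNF drop-rate estimate; the "name: value" output layout of
# myri_counters is assumed here, not guaranteed.
drops=$(myri_counters | awk -F: '/SNF drop ring full/ { gsub(/[^0-9]/, "", $2); s += $2 } END { print s + 0 }')
recv=$(myri_counters | awk -F: '/SNF recv pkts/ { gsub(/[^0-9]/, "", $2); s += $2 } END { print s + 0 }')
awk -v d="$drops" -v r="$recv" 'BEGIN { if (r > 0) printf "drop rate: %.1f%%\n", 100 * d / r }'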

There are no other major applications running or listening on the interface.

htop shows a possible elephant flow (screenshot attached): work seems to be distributed across 15 cores as intended, except that one core (the assignment alternates and appears random) is almost always maxed out at 100%. Memory usage looks fine, at about 10% across the board for each worker.

In suricata.log, all of the workers report drop rates between 45% and 100%, most around 98% loss, which is odd given that the overall drop rate was about 50%.
The most glaring statistic might be in stats.log, which shows that most packets received by the "kernel" were dropped. Memory doesn't seem to be the problem; no memcaps are hit. I assume that in this context "kernel" means the libpcap driver I'm pointing it at, which is actually SNF? Not sure how that works.

Date: 10/24/2023 -- 15:57:37 (uptime: 0d, 00h 00m 57s)
------------------------------------------------------------------------------------
Counter                                       | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                        | Total                     | 198656671
capture.kernel_drops                          | Total                     | 176458693
decoder.pkts                                  | Total                     | 22415469
decoder.bytes                                 | Total                     | 19306809457
decoder.ipv4                                  | Total                     | 22369028
decoder.ipv6                                  | Total                     | 4155
decoder.ethernet                              | Total                     | 22415469
decoder.tcp                                   | Total                     | 21960338
tcp.syn                                       | Total                     | 9817
tcp.synack                                    | Total                     | 5412
tcp.rst                                       | Total                     | 3109
decoder.udp                                   | Total                     | 192942
decoder.icmpv4                                | Total                     | 115515
decoder.icmpv6                                | Total                     | 3479
decoder.vlan                                  | Total                     | 15562333
decoder.avg_pkt_size                          | Total                     | 861
decoder.max_pkt_size                          | Total                     | 1518
flow.tcp                                      | Total                     | 716
flow.udp                                      | Total                     | 397
flow.icmpv4                                   | Total                     | 5
flow.icmpv6                                   | Total                     | 7
flow.tcp_reuse                                | Total                     | 1
flow.wrk.spare_sync_avg                       | Total                     | 100
flow.wrk.spare_sync                           | Total                     | 15
defrag.ipv4.fragments                         | Total                     | 100909
defrag.ipv4.reassembled                       | Total                     | 27779
decoder.event.ipv6.zero_len_padn              | Total                     | 2142
decoder.event.udp.pkt_too_small               | Total                     | 263
decoder.event.udp.hlen_invalid                | Total                     | 224
decoder.event.ipv4.frag_overlap               | Total                     | 35080
flow.wrk.flows_evicted_needs_work             | Total                     | 68
flow.wrk.flows_evicted_pkt_inject             | Total                     | 94
flow.wrk.flows_evicted                        | Total                     | 1
flow.wrk.flows_injected                       | Total                     | 68
tcp.sessions                                  | Total                     | 575
tcp.stream_depth_reached                      | Total                     | 4
tcp.reassembly_gap                            | Total                     | 2
tcp.overlap                                   | Total                     | 409
app_layer.flow.tls                            | Total                     | 174
app_layer.flow.dns_udp                        | Total                     | 136
app_layer.tx.dns_udp                          | Total                     | 5659
app_layer.flow.failed_udp                     | Total                     | 261
flow.mgr.full_hash_pass                       | Total                     | 1
flow.spare                                    | Total                     | 10100
flow.mgr.rows_maxlen                          | Total                     | 2
flow.mgr.flows_checked                        | Total                     | 238
flow.mgr.flows_notimeout                      | Total                     | 238
tcp.memuse                                    | Total                     | 9093120
tcp.reassembly_memuse                         | Total                     | 1474560
flow.memuse                                   | Total                     | 7906304
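
(For reference, at this snapshot capture.kernel_drops / capture.kernel_packets = 176458693 / 198656671 ≈ 0.89, so roughly 89% of the packets seen by the capture layer were dropped before Suricata could process them.)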

I tried CPU pinning with:

threading:
  set-cpu-affinity: true
  cpu-affinity:
  - management-cpu-set:
      cpu: ["2-5"]
  - receive-cpu-set:
      cpu:
      - 0
  - worker-cpu-set:
      cpu: ["9-23"]
      mode: exclusive
      prio:
        #low:
        #- 0
        #medium:
        #- 1-2
        #high:
        #- 3
        default: "high"
  detect-thread-ratio: 1.0

But that doesn't seem to make any difference in any way.
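
One thing worth checking is whether the pinning actually takes effect. Something along these lines lists each Suricata thread with the CPU it last ran on (the psr column), so the W# worker threads can be compared against cores 9-23; pidof here assumes a single running suricata process:

# List thread ID, last CPU (psr), and thread name for the suricata process;
# worker threads show up with W# names and should sit on cores 9-23.
ps -L -o tid,psr,comm -p "$(pidof suricata)"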

I tried upping the number of workers and rings to 30 and saw pretty much the same stats. I thought it might improve a bit, since a smaller percentage of the workers would be tied up by the elephant flow, but apparently not.

This is all at 5Gbps. I have tested at different data rates and it does OK up to about 1.9Gbps; anything over that rate induces drops, which is unacceptable in production. I am looking into this because we started seeing drops about two years ago without realizing it. I'm wondering whether something changed in the Suricata source code so that it can no longer efficiently leverage the SNF driver as provided under the libpcap shim? It definitely is using the driver; it just seems to be doing a really poor job of it. But hopefully I'm just missing an important config. Please advise.


Files

htop.PNG (199 KB) - partial htop output when running Suricata (Peter McPherson, 10/24/2023 04:25 PM)
Actions #1

Updated by Peter McPherson about 1 year ago

Oh, and I have about 500 vanilla rules running; not a big load.

Actions #2

Updated by Peter McPherson about 1 year ago

I have discovered that even when using af_packet instead of the Myricom capture, performance is only a bit better: Suricata still starts dropping packets at about 4Gbps, which is worse than I'd like. The traffic I'm capturing looks fairly balanced; there is just one flow that carries a lot of data compared to the others.
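
For reference, a bare-bones af-packet workers section in suricata.yaml looks roughly like the sketch below; the interface name, cluster-id, and ring size are illustrative values, not necessarily what I ran:

af-packet:
  - interface: eth2              # example name only
    threads: 15                  # match the number of worker threads
    cluster-id: 99
    cluster-type: cluster_flow   # flow-hash load balancing across threads
    defrag: yes
    ring-size: 200000            # per-thread ring; larger rings absorb bursts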

When I run "perf top -p <suricata pid>", it shows 89% of samples in "SCACSearch"; I'm not sure what that is or whether it's a problem.

Actions #3

Updated by Peter McPherson about 1 year ago

I have discovered that most of the issue seems to be that my PCAP has a couple of large flows in it: roughly 500MB and 3500 seconds each. When I send traffic that does not contain these, Suricata has no drops. I am not sure whether it's the flow size or the flow duration that is the issue, but do you have any recommendations for getting around it?

Actions #4

Updated by Peter McPherson about 1 year ago

I have also discovered that when we enable decoder-events.rules, we take a huge performance hit. I'm not exactly sure what's special about these rules or whether that's expected; there doesn't seem to be much documentation on them. Any insight you could give would be appreciated.
