Bug #1838
Status: Closed
suricata 3.0* and 3.1 hang after heavy traffic w/ pfring zc (reproducible)
Description
Suricata appears to hang when receiving heavy traffic. It stops reporting
any packet reception and does not recover even after the traffic is
stopped. This is easily reproducible in our environment. CPU usage is low
when the hang occurs, and other programs on the same host continue to
receive traffic normally during this time.
Environment: ESXi 6.0, Ubuntu 14.04 LTS, 16 GB RAM, 6 cores
sudo suricata -c /usr/local/etc/suricata/suricata.yaml --pfring-int="zc:eth3" --pfring-cluster-id=1 --pfring-cluster-type=cluster_flow -v --init-errors-fatal &

Note that capture.kernel_packets and capture.kernel_drops are frozen across all three snapshots below, taken over a span of about 16 minutes:

Date: 7/6/2016 -- 01:47:20 (uptime: 0d, 01h 15m 01s)
------------------------------------------------------------------------------------
Counter                | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total   | 366503069
capture.kernel_drops   | Total   | 1232732

Date: 7/6/2016 -- 01:49:26 (uptime: 0d, 01h 17m 07s)
------------------------------------------------------------------------------------
Counter                | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total   | 366503069
capture.kernel_drops   | Total   | 1232732
------------------------------------------------------------------------------------

Date: 7/6/2016 -- 02:03:11 (uptime: 0d, 01h 30m 52s)
------------------------------------------------------------------------------------
Counter                | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total   | 366503069
capture.kernel_drops   | Total   | 1232732
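To confirm the capture path is stalled (rather than merely slow), one can parse successive stats.log dumps and flag when capture.kernel_packets stops advancing. A minimal sketch assuming the stats.log format shown above; the parse_counters/is_stalled helpers and the two-unchanged-intervals threshold are illustrative choices, not part of Suricata:

```python
import re

def parse_counters(stats_text):
    """Extract counter values from one stats.log dump.

    Returns a dict like {"capture.kernel_packets": 366503069, ...}.
    """
    counters = {}
    for line in stats_text.splitlines():
        # Match lines of the form: "<counter> | <TM name> | <value>"
        m = re.match(r"\s*([\w.]+)\s*\|\s*\S+\s*\|\s*(\d+)", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters

def is_stalled(snapshots, counter="capture.kernel_packets", unchanged=2):
    """Return True if `counter` has not advanced across the last
    `unchanged` consecutive snapshot pairs."""
    values = [s.get(counter) for s in snapshots[-(unchanged + 1):]]
    if len(values) < unchanged + 1 or None in values:
        return False
    return all(a == b for a, b in zip(values, values[1:]))
```

With the three dumps above fed in as snapshots, is_stalled() returns True, matching the observed hang.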
Traffic (~1 Gbps):
sudo tcpreplay -i eth3 --pps=150000 --loop=1000 MGMT.pcap* &
-rw-r--r-- 1 kevin kevin 1382236653 Jun 28 19:14 MGMT.pcap
-rw-r--r-- 1 kevin kevin 1382236235 Jun 28 19:15 MGMT.pcap1
-rw-r--r-- 1 kevin kevin 1382237328 Jun 28 19:39 MGMT.pcap10
-rw-r--r-- 1 kevin kevin 1382236528 Jun 28 19:39 MGMT.pcap11
-rw-r--r-- 1 kevin kevin 1382237364 Jun 28 19:40 MGMT.pcap12
-rw-r--r-- 1 kevin kevin 1382236332 Jun 28 19:41 MGMT.pcap13
-rw-r--r-- 1 kevin kevin 1382236497 Jun 28 19:44 MGMT.pcap14
-rw-r--r-- 1 kevin kevin 1382236975 Jun 28 19:16 MGMT.pcap2
-rw-r--r-- 1 kevin kevin 1382237173 Jun 28 19:16 MGMT.pcap3
-rw-r--r-- 1 kevin kevin 1382237379 Jun 28 19:18 MGMT.pcap4
-rw-r--r-- 1 kevin kevin 1382236296 Jun 28 19:18 MGMT.pcap5
-rw-r--r-- 1 kevin kevin 1382237432 Jun 28 19:19 MGMT.pcap6
-rw-r--r-- 1 kevin kevin 1382236496 Jun 28 19:20 MGMT.pcap7
-rw-r--r-- 1 kevin kevin 1382236177 Jun 28 19:20 MGMT.pcap8
-rw-r--r-- 1 kevin kevin 1382236336 Jun 28 19:38 MGMT.pcap9
This is Suricata version 3.1 RELEASE
Features: PCAP_SET_BUFF LIBPCAP_VERSION_MAJOR=1 PF_RING AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS
SIMD support: SSE_4_2 SSE_4_1 SSE_3
Atomic intrinsics: 1 2 4 8 16 byte(s)
64-bits, Little-endian architecture
GCC version 4.8.4, C version 199901
compiled with -fstack-protector
compiled with _FORTIFY_SOURCE=2
L1 cache line size (CLS)=64
thread local storage method: __thread
compiled with LibHTP v0.5.20, linked against LibHTP v0.5.20

Suricata Configuration:
  AF_PACKET support:                       yes
  PF_RING support:                         yes
  NFQueue support:                         no
  NFLOG support:                           no
  IPFW support:                            no
  Netmap support:                          no
  DAG enabled:                             no
  Napatech enabled:                        no
  Unix socket enabled:                     yes
  Detection enabled:                       yes
  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  hiredis support:                         no
  Prelude support:                         no
  PCRE jit:                                yes
  LUA support:                             yes, through luajit
  libluajit:                               yes
  libgeoip:                                yes
  Non-bundled htp:                         no
  Old barnyard2 support:                   no
  CUDA enabled:                            no
  Hyperscan support:                       no
  Libnet support:                          yes
  Suricatasc install:                      yes
  Profiling enabled:                       no
  Profiling locks enabled:                 no

Development settings:
  Coccinelle / spatch:                     yes
  Unit tests enabled:                      no
  Debug output enabled:                    no
  Debug validation enabled:                no

Generic build parameters:
  Installation prefix:                     /usr/local
  Configuration directory:                 /usr/local/etc/suricata/
  Log directory:                           /usr/local/var/log/suricata/

  --prefix                                 /usr/local
  --sysconfdir                             /usr/local/etc
  --localstatedir                          /usr/local/var

  Host:                                    x86_64-pc-linux-gnu
  Compiler:                                gcc (exec name) / gcc (real)
  GCC Protect enabled:                     no
  GCC march native enabled:                yes
  GCC Profile enabled:                     no
  Position Independent Executable enabled: no
  CFLAGS                                   -g -O2 -march=native
  PCAP_CFLAGS                              -I/usr/local/include
  SECCFLAGS
%YAML 1.1
---

# Suricata configuration file. In addition to the comments describing all
# options in this file, full documentation can be found at:
# https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Suricatayaml

#promithius:
#  publisher: "tcp://127.0.0.1:5557"

# Number of packets allowed to be processed simultaneously. Default is a
# conservative 1024. A higher number will make sure CPU's/CPU cores will be
# more easily kept busy, but may negatively impact caching.
#
# If you are using the CUDA pattern matcher (mpm-algo: ac-cuda), different rules
# apply. In that case try something like 60000 or more. This is because the CUDA
# pattern matcher buffers and scans as many packets as possible in parallel.
#max-pending-packets: 1024

# Runmode the engine should use. Please check --list-runmodes to get the available
# runmodes for each packet acquisition method. Defaults to "autofp" (auto flow pinned
# load balancing).
#runmode: workers

# Specifies the kind of flow load balancer used by the flow pinned autofp mode.
#
# Supported schedulers are:
#
#   round-robin    - Flows assigned to threads in a round robin fashion.
#   active-packets - Flows assigned to threads that have the lowest number of
#                    unprocessed packets (default).
#   hash           - Flow allotted using the address hash. More of a random
#                    technique. Was the default in Suricata 1.2.1 and older.
#
#autofp-scheduler: active-packets
#autofp-scheduler: round-robin

# If the suricata box is a router for the sniffed networks, set it to 'router'. If
# it is a pure sniffing setup, set it to 'sniffer-only'.
# If set to auto, the variable is internally switched to 'router' in IPS mode
# and 'sniffer-only' in IDS mode.
# This feature is currently only used by the reject* keywords.
host-mode: auto

# Run suricata as user and group.
#run-as:
#  user: promithius
#  group: promithius

# Default pid file.
# Will use this file if no --pidfile in command options.
#pid-file: /var/run/suricata.pid

# Daemon working directory
# Suricata will change directory to this one if provided
# Default: "/"
#daemon-directory: "/"

# Preallocated size for packet. Default is 1514 which is the classical
# size for pcap on ethernet. You should adjust this value to the highest
# packet size (MTU + hardware header) on your system.
#default-packet-size: 1514

# The default logging directory. Any log or output file will be
# placed here if it's not specified with a full path name. This can be
# overridden with the -l command line parameter.
default-log-dir: /usr/local/var/log/suricata/

# Unix command socket can be used to pass commands to suricata.
# An external tool can then connect to get information from suricata
# or trigger some modifications of the engine. Set enabled to yes
# to activate the feature. You can use the filename variable to set
# the file name of the socket.
unix-command:
  enabled: no
  #filename: custom.socket

# Configure the type of alert (and other) logging you would like.
outputs:
  # a line based alerts log similar to Snort's fast.log
  - fast:
      enabled: no
      filename: fast.log
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'

  # Extensible Event Format (nicknamed EVE) event log in JSON format
  - eve-log:
      enabled: no
      type: file #file|syslog|unix_dgram|unix_stream
      filename: eve.json
      # the following are valid when type: syslog above
      #identity: "suricata"
      #facility: local5
      #level: Info ## possible levels: Emergency, Alert, Critical,
                   ## Error, Warning, Notice, Info, Debug
      types:
        - alert
        - http:
            extended: yes     # enable this for extended logging information
            # custom allows additional http fields to be included in eve-log
            # the example below adds three additional fields when uncommented
            #custom: [Accept-Encoding, Accept-Language, Authorization]
        - dns
        - tls:
            extended: yes     # enable this for extended logging information
        - files:
            force-magic: no   # force logging magic on all logged files
            force-md5: no     # force logging of md5 checksums
        #- drop
        - ssh

  # alert output for use with Barnyard2
  - unified2-alert:
      enabled: no
      filename: unified2.alert

      # File size limit. Can be specified in kb, mb, gb. Just a number
      # is parsed as bytes.
      #limit: 32mb

      # Sensor ID field of unified2 alerts.
      #sensor-id: 0

      # HTTP X-Forwarded-For support by adding the unified2 extra header that
      # will contain the actual client IP address or by overwriting the source
      # IP address (helpful when inspecting traffic that is being reverse
      # proxied).
      xff:
        enabled: no
        # Two operation modes are available, "extra-data" and "overwrite". Note
        # that in the "overwrite" mode, if the reported IP address in the HTTP
        # X-Forwarded-For header is of a different version than the packet
        # received, it will fall back to "extra-data" mode.
        mode: extra-data
        # Header name where the actual IP address will be reported; if more than
        # one IP address is present, the last IP address will be the one taken
        # into consideration.
        header: X-Forwarded-For

  # a line based log of HTTP requests (no alerts)
  - http-log:
      enabled: no
      filename: http.log
      append: yes
      #extended: yes     # enable this for extended logging information
      #custom: yes       # enable the custom logging format (defined by customformat)
      #customformat: "%{%D-%H:%M:%S}t.%z %{X-Forwarded-For}i %H %m %h %u %s %B %a:%p -> %A:%P"
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'

  # a line based log of TLS handshake parameters (no alerts)
  - tls-log:
      enabled: no        # Log TLS connections.
      filename: tls.log  # File to store TLS logs.
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'
      #extended: yes     # Log extended information like fingerprint
      certs-log-dir: certs # directory to store the certificate files

  # a line based log of DNS requests and/or replies (no alerts)
  - dns-log:
      enabled: no
      filename: dns.log
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'

  # a line based log to be used with pcap file study.
  # this module is dedicated to offline pcap parsing (empty output
  # if used with another kind of input). It can interoperate with
  # pcap parsers like wireshark via the suriwire plugin.
  - pcap-info:
      enabled: no

  # Packet log... log packets in pcap format. 2 modes of operation: "normal"
  # and "sguil".
  #
  # In normal mode a pcap file "filename" is created in the default-log-dir,
  # or as specified by "dir". In Sguil mode "dir" indicates the base directory.
  # In this base dir the pcaps are created in the directory structure Sguil expects:
  #
  # $sguil-base-dir/YYYY-MM-DD/$filename.<timestamp>
  #
  # By default all packets are logged except:
  # - TCP streams beyond stream.reassembly.depth
  # - encrypted streams after the key exchange
  #
  - pcap-log:
      enabled: no
      filename: log.pcap

      # File size limit. Can be specified in kb, mb, gb. Just a number
      # is parsed as bytes.
      limit: 1000mb

      # If set to a value, will enable ring buffer mode. Will keep a maximum
      # of "max-files" of size "limit".
      max-files: 2000

      mode: normal # normal or sguil.
      #sguil-base-dir: /nsm_data/
      #ts-format: usec # sec or usec second format (default) is filename.sec usec is filename.sec.usec
      use-stream-depth: no # If set to "yes" packets seen after reaching stream inspection depth are ignored. "no" logs all packets

  # a full alerts log containing much information for signature writers
  # or for investigating suspected false positives.
  - alert-debug:
      enabled: no
      filename: alert-debug.log
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'

  # alert output to prelude (http://www.prelude-technologies.com/) only
  # available if Suricata has been compiled with --enable-prelude
  - alert-prelude:
      enabled: no
      profile: suricata
      log-packet-content: no
      log-packet-header: yes

  # Stats.log contains data from various counters of the suricata engine.
  # The interval field (in seconds) tells after how long output will be written
  # on the log file.
  - stats:
      enabled: yes
      filename: stats.log
      interval: 8

  # a line based alerts log similar to fast.log into syslog
  - syslog:
      enabled: no
      # reported identity to syslog. If omitted the program name (usually
      # suricata) will be used.
      #identity: "suricata"
      facility: local5
      #level: Info ## possible levels: Emergency, Alert, Critical,
                   ## Error, Warning, Notice, Info, Debug

  # a line based information for dropped packets in IPS mode
  - drop:
      enabled: no
      filename: drop.log
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'

  # output module to store extracted files to disk
  #
  # The files are stored to the log-dir in a format "file.<id>" where <id> is
  # an incrementing number starting at 1. For each file "file.<id>" a meta
  # file "file.<id>.meta" is created.
  #
  # File extraction depends on a lot of things to be fully done:
  # - stream reassembly depth. For optimal results, set this to 0 (unlimited)
  # - http request / response body sizes. Again set to 0 for optimal results.
  # - rules that contain the "filestore" keyword.
  - file-store:
      enabled: no        # set to yes to enable
      log-dir: files     # directory to store the files
      force-magic: no    # force logging magic on all stored files
      force-md5: no      # force logging of md5 checksums
      #waldo: file.waldo # waldo file to store the file_id across runs

  # output module to log files tracked in an easily parsable json format
  - file-log:
      enabled: no
      filename: files-json.log
      append: yes
      #filetype: regular # 'regular', 'unix_stream' or 'unix_dgram'
      force-magic: no    # force logging magic on all logged files
      force-md5: no      # force logging of md5 checksums

# Magic file. The extension .mgc is added to the value here.
#magic-file: /usr/share/file/magic
magic-file: /usr/share/file/magic

# When running in NFQ inline mode, it is possible to use a simulated
# non-terminal NFQUEUE verdict.
# This permits sending all needed packets to suricata via a rule like:
# iptables -I FORWARD -m mark ! --mark $MARK/$MASK -j NFQUEUE
# And below, you can have your standard filtering ruleset. To activate
# this mode, you need to set mode to 'repeat'.
# If you want the packet to be sent to another queue after an ACCEPT decision,
# set mode to 'route' and set the next-queue value.
# On linux >= 3.1, you can set batchcount to a value > 1 to improve performance
# by processing several packets before sending a verdict (worker runmode only).
# On linux >= 3.6, you can set the fail-open option to yes to have the kernel
# accept the packet if suricata is not able to keep pace.
#nfq:
#  mode: accept
#  repeat-mark: 1
#  repeat-mask: 1
#  route-queue: 2
#  batchcount: 20
#  fail-open: yes

#nflog support
#nflog:
    # netlink multicast group
    # (the same as the iptables --nflog-group param)
    # Group 0 is used by the kernel, so you can't use it
    #- group: 2
    # netlink buffer size
    # buffer-size: 18432
    # put default value here
    #- group: default
    # set number of packets to queue inside kernel
    # qthreshold: 1
    # set the delay before flushing packets in the queue inside kernel
    # qtimeout: 100
    # netlink max buffer size
    # max-size: 20000

# af-packet support
# Set threads to > 1 to use PACKET_FANOUT support
af-packet:
  - interface: eth0
    # Number of receive threads (>1 will enable experimental flow pinned
    # runmode)
    threads: 1
    # Default clusterid. AF_PACKET will load balance packets based on flow.
    # All threads/processes that will participate need to have the same
    # clusterid.
    cluster-id: 99
    # Default AF_PACKET cluster type. AF_PACKET can load balance per flow or per hash.
    # This is only supported for Linux kernel > 3.1
    # possible values are:
    #  * cluster_round_robin: round robin load balancing
    #  * cluster_flow: all packets of a given flow are sent to the same socket
    #  * cluster_cpu: all packets treated in kernel by a CPU are sent to the same socket
    cluster-type: cluster_flow
    # In some fragmentation cases, the hash can not be computed. If "defrag" is set
    # to yes, the kernel will do the needed defragmentation before sending the packets.
    defrag: yes
    # To use the ring feature of AF_PACKET, set 'use-mmap' to yes
    use-mmap: yes
    # Ring size will be computed with respect to max_pending_packets and number
    # of threads. You can set the ring size manually, in number of packets, with
    # the following value. If you are using the flow cluster-type and have really
    # network intensive single flows, you may want to set the ring-size
    # independently of the number of threads:
    #ring-size: 2048
    # On a busy system, setting this to yes can help to recover from a packet drop
    # phase. This will result in some packets (at max a ring flush) not being treated.
    #use-emergency-flush: yes
    # recv buffer size, increasing the value could improve performance
    # buffer-size: 32768
    # Set to yes to disable promiscuous mode
    # disable-promisc: no
    # Choose checksum verification mode for the interface. At the moment
    # of the capture, some packets may have an invalid checksum due to
    # offloading of the checksum computation to the network card.
    # Possible values are:
    #  - kernel: use indication sent by kernel for each packet (default)
    #  - yes: checksum validation is forced
    #  - no: checksum validation is disabled
    #  - auto: suricata uses a statistical approach to detect when
    #    checksum off-loading is used.
    # Warning: 'checksum-validation' must be set to yes to have any validation
    #checksum-checks: kernel
    # BPF filter to apply to this interface. The pcap filter syntax applies here.
    #bpf-filter: port 80 or udp
    # You can use the following variables to activate AF_PACKET tap or IPS mode.
    # If copy-mode is set to ips or tap, the traffic coming to the current
    # interface will be copied to the copy-iface interface. If 'tap' is set, the
    # copy is complete. If 'ips' is set, a packet matching a 'drop' action
    # will not be copied.
    #copy-mode: ips
    #copy-iface: eth1
  - interface: eth1
    threads: 1
    cluster-id: 98
    cluster-type: cluster_flow
    defrag: yes
    # buffer-size: 32768
    # disable-promisc: no
  # Put default values here
  - interface: default
    #threads: 1
    #use-mmap: yes

legacy:
  uricontent: enabled

# You can specify a threshold config file by setting "threshold-file"
# to the path of the threshold config file:
# threshold-file: /etc/suricata/threshold.config

# The detection engine builds internal groups of signatures. The engine
# allows us to specify the profile to use for them, to manage memory in an
# efficient way while keeping good performance. For the profile keyword you
# can use the words "low", "medium", "high" or "custom". If you use custom,
# make sure to define the values at "- custom-values" at your convenience.
# Usually you would prefer medium/high/low.
#
# "sgh mpm-context" indicates how the staging should allot mpm contexts for
# the signature groups. "single" indicates the use of a single context for
# all the signature group heads. "full" indicates a mpm-context for each
# group head. "auto" lets the engine decide the distribution of contexts
# based on the information the engine gathers on the patterns from each
# group head.
#
# The option inspection-recursion-limit is used to limit the recursive calls
# in the content inspection code. For certain payload-sig combinations, we
# might end up taking too much time in the content inspection code.
# If the argument specified is 0, the engine uses an internally defined
# default limit. When no value is specified, no limit is placed on the recursion.
detect-engine:
  - profile: medium
  - custom-values:
      toclient-src-groups: 2
      toclient-dst-groups: 2
      toclient-sp-groups: 2
      toclient-dp-groups: 3
      toserver-src-groups: 2
      toserver-dst-groups: 4
      toserver-sp-groups: 2
      toserver-dp-groups: 25
  - sgh-mpm-context: auto
  - inspection-recursion-limit: 3000
  # When rule-reload is enabled, sending a USR2 signal to the Suricata process
  # will trigger a live rule reload. Experimental feature, use with care.
  #- rule-reload: true
  # If set to yes, the loading of signatures will be made after the capture
  # is started. This will limit the downtime in IPS mode.
  #- delayed-detect: yes

# Suricata is multi-threaded. Here the threading can be influenced.
threading:
  # On some cpu's/architectures it is beneficial to tie individual threads
  # to specific CPU's/CPU cores. In this case all threads are tied to CPU0,
  # and each extra CPU/core has one "detect" thread.
  #
  # On Intel Core2 and Nehalem CPU's enabling this will degrade performance.
  #
  set-cpu-affinity: no
  # Tune cpu affinity of suricata threads. Each family of threads can be bound
  # to specific CPUs.
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0 ]  # include only these cpus in affinity settings
    - receive-cpu-set:
        cpu: [ 0 ]  # include only these cpus in affinity settings
    - decode-cpu-set:
        cpu: [ 0, 1 ]
        mode: "balanced"
    - stream-cpu-set:
        cpu: [ "0-1" ]
    - detect-cpu-set:
        cpu: [ "all" ]
        mode: "exclusive" # run detect threads in these cpus
        # Use explicitly 3 threads and don't compute the number via the
        # detect-thread-ratio variable:
        # threads: 3
        prio:
          low: [ 0 ]
          medium: [ "1-2" ]
          high: [ 3 ]
          default: "medium"
    - verdict-cpu-set:
        cpu: [ 3 ]
        prio:
          default: "high"
    - reject-cpu-set:
        cpu: [ 3 ]
        prio:
          default: "low"
    - output-cpu-set:
        cpu: [ "all" ]
        prio:
          default: "medium"
  #
  # By default Suricata creates one "detect" thread per available CPU/CPU core.
  # This setting allows controlling this behaviour. A ratio setting of 2 will
  # create 2 detect threads for each CPU/CPU core. So for a dual core CPU this
  # will result in 4 detect threads. If values below 1 are used, fewer threads
  # are created. So on a dual core CPU a setting of 0.5 results in 1 detect
  # thread being created. Regardless of the setting, at a minimum 1 detect
  # thread will always be created.
  #
  detect-thread-ratio: 1.5

# Cuda configuration.
cuda:
  # The "mpm" profile. When none of these parameters are specified, the engine's
  # internal default values are used, which are the same as the ones specified
  # in the default conf file.
  mpm:
    # The minimum length required to buffer data to the gpu.
    # Anything below this is MPM'ed on the CPU.
    # Can be specified in kb, mb, gb. Just a number indicates it's in bytes.
    # A value of 0 indicates there's no limit.
    data-buffer-size-min-limit: 0
    # The maximum length for data that we would buffer to the gpu.
    # Anything over this is MPM'ed on the CPU.
    # Can be specified in kb, mb, gb. Just a number indicates it's in bytes.
    data-buffer-size-max-limit: 1500
    # The ring buffer size used by the CudaBuffer API to buffer data.
    cudabuffer-buffer-size: 500mb
    # The max chunk size that can be sent to the gpu in a single go.
    gpu-transfer-size: 50mb
    # The timeout limit for batching of packets in microseconds.
    batching-timeout: 2000
    # The device to use for the mpm. Currently we don't support load balancing
    # on multiple gpus. In case you have multiple devices on your system, you
    # can specify the device to use with this conf. By default we hold 0, to
    # specify the first device cuda sees. To find out the device-id associated
    # with the card(s) on the system run "suricata --list-cuda-cards".
    device-id: 0
    # No of Cuda streams used for asynchronous processing. All values > 0 are valid.
    # For this option you need a device with Compute Capability > 1.0.
    cuda-streams: 2

# Select the multi pattern algorithm you want to run for scan/search in
# the engine. The supported algorithms are b2g, b2gc, b2gm, b3g, wumanber,
# ac and ac-gfbs.
#
# The mpm you choose also decides the distribution of mpm contexts for
# signature groups, specified by the conf - "detect-engine.sgh-mpm-context".
# Selecting "ac" as the mpm would require "detect-engine.sgh-mpm-context"
# to be set to "single", because of ac's memory requirements, unless the
# ruleset is small enough to fit in memory, in which case one can
# use "full" with "ac". The rest of the mpms can be run in "full" mode.
#
# There is also a CUDA pattern matcher (only available if Suricata was
# compiled with --enable-cuda): b2g_cuda. Make sure to update your
# max-pending-packets setting above as well if you use b2g_cuda.
mpm-algo: ac

# The memory settings for hash size of these algorithms can vary from lowest
# (2048) - low (4096) - medium (8192) - high (16384) - higher (32768) - max
# (65536). The bloomfilter sizes of these algorithms can vary from low (512) -
# medium (1024) - high (2048).
#
# For the B2g/B3g algorithms, there is support for two different scan/search
# algorithms. For B2g the scan algorithms are B2gScan & B2gScanBNDMq, and the
# search algorithms are B2gSearch & B2gSearchBNDMq. For B3g the scan algorithms
# are B3gScan & B3gScanBNDMq, and the search algorithms are B3gSearch &
# B3gSearchBNDMq.
#
# For B2g the different scan/search algorithms and hash and bloom
# filter size settings. For B3g the different scan/search algorithms and hash
# and bloom filter size settings. For wumanber the hash and bloom filter size
# settings.
pattern-matcher:
  - b2gc:
      search-algo: B2gSearchBNDMq
      hash-size: low
      bf-size: medium
  - b2gm:
      search-algo: B2gSearchBNDMq
      hash-size: low
      bf-size: medium
  - b2g:
      search-algo: B2gSearchBNDMq
      hash-size: low
      bf-size: medium
  - b3g:
      search-algo: B3gSearchBNDMq
      hash-size: low
      bf-size: medium
  - wumanber:
      hash-size: low
      bf-size: medium

# Defrag settings:
defrag:
  memcap: 32mb
  hash-size: 65536
  trackers: 65535  # number of defragmented flows to follow
  max-frags: 65535 # number of fragments to keep (higher than trackers)
  prealloc: yes
  timeout: 60

# Enable defrag per host settings
#  host-config:
#
#    - dmz:
#        timeout: 30
#        address: [192.168.1.0/24, 127.0.0.0/8, 1.1.1.0/24, 2.2.2.0/24, "1.1.1.1", "2.2.2.2", "::1"]
#
#    - lan:
#        timeout: 45
#        address:
#          - 192.168.0.0/24
#          - 192.168.10.0/24
#          - 172.16.14.0/24

# Flow settings:
# By default, the reserved memory (memcap) for flows is 32MB. This is the limit
# for flow allocation inside the engine. You can change this value to allow
# more memory usage for flows.
# The hash-size determines the size of the hash used to identify flows inside
# the engine, and by default the value is 65536.
# At startup, the engine can preallocate a number of flows, to get better
# performance. The number of flows preallocated is 10000 by default.
# emergency-recovery is the percentage of flows that the engine needs to
# prune before unsetting the emergency state. The emergency state is activated
# when the memcap limit is reached; it still allows new flows to be created,
# but prunes them with the emergency timeouts (defined below).
# If the memcap is reached, the engine will try to prune flows
# with the default timeouts. If it doesn't find a flow to prune, it will set
# the emergency bit and it will try again with more aggressive timeouts.
# If that doesn't work, then it will try to kill the least recently seen flows
# not in use.
# The memcap can be specified in kb, mb, gb. Just a number indicates it's
# in bytes.
flow:
  memcap: 256mb
  hash-size: 131072
  prealloc: 100000
  emergency-recovery: 30
  managers: 2
  recyclers: 2

# This option controls the use of vlan ids in the flow (and defrag)
# hashing. Normally this should be enabled, but in some (broken)
# setups where both sides of a flow are not tagged with the same vlan
# tag, we can ignore the vlan id's in the flow hashing.
vlan:
  use-for-tracking: true

# Specific timeouts for flows. Here you can specify the timeouts that the
# active flows will wait to transit from the current state to another, for each
# protocol. The value of "new" determines the seconds to wait after a handshake or
# stream startup before the engine frees the data of that flow if it doesn't
# change state to established (usually if we don't receive more packets
# of that flow). The value of "established" is the amount of
# seconds that the engine will wait to free the flow if it spends that amount
# without receiving new packets or closing the connection. "closed" is the
# amount of time to wait after a flow is closed (usually zero).
#
# There's an emergency mode that will become active under attack circumstances,
# making the engine check flow status faster. These configuration variables
# use the prefix "emergency-" and work similarly to the normal ones.
# Some timeouts don't apply to all the protocols, like "closed", for udp and
# icmp.
flow-timeouts:
  default:
    new: 30
    established: 300
    closed: 0
    emergency-new: 10
    emergency-established: 100
    emergency-closed: 0
  tcp:
    new: 60
    established: 3600
    closed: 120
    emergency-new: 10
    emergency-established: 300
    emergency-closed: 20
  udp:
    new: 30
    established: 300
    emergency-new: 10
    emergency-established: 100
  icmp:
    new: 30
    established: 300
    emergency-new: 10
    emergency-established: 100

# Stream engine settings. Here the TCP stream tracking and reassembly
# engine is configured.
#
# stream:
#   memcap: 32mb                # Can be specified in kb, mb, gb. Just a
#                               # number indicates it's in bytes.
#   checksum-validation: yes    # To validate the checksum of received
#                               # packets. If csum validation is specified as
#                               # "yes", then packets with an invalid csum will not
#                               # be processed by the engine stream/app layer.
#                               # Warning: locally generated traffic can be
#                               # generated without checksum due to hardware offload
#                               # of checksum. You can control the handling of checksum
#                               # on a per-interface basis via the 'checksum-checks'
#                               # option
#   prealloc-sessions: 2k       # 2k sessions prealloc'd per stream thread
#   midstream: false            # don't allow midstream session pickups
#   async-oneside: false        # don't enable async stream handling
#   inline: no                  # stream inline mode
#   max-synack-queued: 5        # Max different SYN/ACKs to queue
#
#   reassembly:
#     memcap: 64mb              # Can be specified in kb, mb, gb. Just a number
#                               # indicates it's in bytes.
#     depth: 1mb                # Can be specified in kb, mb, gb. Just a number
#                               # indicates it's in bytes.
#     toserver-chunk-size: 2560 # inspect raw stream in chunks of at least
#                               # this size. Can be specified in kb, mb,
#                               # gb. Just a number indicates it's in bytes.
#                               # The max acceptable size is 4024 bytes.
#     toclient-chunk-size: 2560 # inspect raw stream in chunks of at least
#                               # this size. Can be specified in kb, mb,
#                               # gb. Just a number indicates it's in bytes.
#                               # The max acceptable size is 4024 bytes.
#     randomize-chunk-size: yes # Take a random value for chunk size around the specified value.
#                               # This lowers the risk of some evasion techniques but could lead
#                               # to detection changes between runs. It is set to 'yes' by default.
#     randomize-chunk-range: 10 # If randomize-chunk-size is active, the value of chunk-size is
#                               # a random value between (1 - randomize-chunk-range/100)*randomize-chunk-size
#                               # and (1 + randomize-chunk-range/100)*randomize-chunk-size. Default value
#                               # of randomize-chunk-range is 10.
#
#     raw: yes                  # 'Raw' reassembly enabled or disabled.
#                               # raw is for content inspection by the detection
#                               # engine.
#
#     chunk-prealloc: 250       # Number of preallocated stream chunks. These
#                               # are used during stream inspection (raw).
#     segments:                 # Settings for reassembly segment pool.
#       - size: 4               # Size of the (data)segment for a pool
#         prealloc: 256         # Number of segments to prealloc and keep
#                               # in the pool.
#
stream:
  memcap: 128mb
  checksum-validation: yes # reject wrong csums
  inline: auto             # auto will use inline mode in IPS mode, yes or no set it statically
  reassembly:
    memcap: 128mb
    depth: 1mb             # reassemble 1mb into a stream
    toserver-chunk-size: 2560
    toclient-chunk-size: 2560
    randomize-chunk-size: yes
    #randomize-chunk-range: 10
    #raw: yes
    #chunk-prealloc: 250
    #segments:
    #  - size: 4
    #    prealloc: 256
    #  - size: 16
    #    prealloc: 512
    #  - size: 112
    #    prealloc: 512
    #  - size: 248
    #    prealloc: 512
    #  - size: 512
    #    prealloc: 512
    #  - size: 768
    #    prealloc: 1024
    #  - size: 1448
    #    prealloc: 1024
    #  - size: 65535
    #    prealloc: 128

# Host table:
#
# Host table is used by the tagging and per host thresholding subsystems.
#
host:
  hash-size: 4096
  prealloc: 1000
  memcap: 16777216

# Logging configuration. This is not about logging IDS alerts, but
# IDS output about what it is doing, errors, etc.
logging:
  # The default log level, can be overridden in an output section.
  # Note that debug level logging will only be emitted if Suricata was
  # compiled with the --enable-debug configure option.
  #
  # This value is overridden by the SC_LOG_LEVEL env var.
  default-log-level: notice

  # The default output format. Optional parameter, should default to
  # something reasonable if not provided. Can be overridden in an
  # output section. You can leave this out to get the default.
  #
  # This value is overridden by the SC_LOG_FORMAT env var.
  #default-log-format: "[%i] %t - (%f:%l) <%d> (%n) -- "

  # A regex to filter output. Can be overridden in an output section.
  # Defaults to empty (no filter).
  #
  # This value is overridden by the SC_LOG_OP_FILTER env var.
  default-output-filter:

  # Define your logging outputs. If none are defined, or they are all
  # disabled you will get the default - console output.
  outputs:
    - console:
        enabled: no
    - file:
        enabled: no
        filename: /var/log/suricata.log
    - syslog:
        enabled: no
        facility: local5
        format: "[%i] <%d> -- "

# Tilera mpipe configuration. For use on Tilera TILE-Gx.
mpipe:
  # Load balancing modes: "static", "dynamic", "sticky", or "round-robin".
  load-balance: dynamic

  # Number of packets in each ingress packet queue. Must be 128, 512, 2048 or 65536.
  iqueue-packets: 2048

  # List of interfaces we will listen on.
  inputs:
    - interface: xgbe2
    - interface: xgbe3
    - interface: xgbe4

  # Relative weight of memory for packets of each mPipe buffer size.
  stack:
    size128: 0
    size256: 9
    size512: 0
    size1024: 0
    size1664: 7
    size4096: 0
    size10386: 0
    size16384: 0

# PF_RING configuration. For use with native PF_RING support.
# For more info see http://www.ntop.org/PF_RING.html
pfring:
  - interface: zc:1@0 #eth3@0
    # Number of receive threads (>1 will enable experimental flow pinned
    # runmode)
    threads: 1

    # Default clusterid. PF_RING will load balance packets based on flow.
    # All threads/processes that will participate need to have the same
    # clusterid.
    # cluster-id: 1

    # Default PF_RING cluster type. PF_RING can load balance per flow or per hash.
    # This is only supported in versions of PF_RING > 4.1.1.
    cluster-type: cluster_flow

    # bpf filter for this interface
    #bpf-filter: tcp

    # Choose checksum verification mode for the interface. At the moment
    # of the capture, some packets may have an invalid checksum due to
    # offloading of the checksum computation to the network card.
    # Possible values are:
    #  - rxonly: only compute checksum for packets received by network card.
    #  - yes: checksum validation is forced
    #  - no: checksum validation is disabled
    #  - auto: suricata uses a statistical approach to detect when
    #    checksum off-loading is used. (default)
    # Warning: 'checksum-validation' must be set to yes to have any validation
    #checksum-checks: auto

  # Second interface
  # - interface: zc:1@1 #eth3@1
  #   threads: 1
  #   cluster-id: 1
  #   cluster-type: cluster_flow

  # Put default values here
  - interface: default
    threads: 1

pcap:
  - interface: eth0
    # On Linux, pcap will try to use mmaped capture and will use buffer-size
    # as the total memory used by the ring. So set this to something bigger
    # than 1% of your bandwidth.
    #buffer-size: 16777216
    #bpf-filter: "tcp and port 25"
    # Choose checksum verification mode for the interface. At the moment
    # of the capture, some packets may have an invalid checksum due to
    # offloading of the checksum computation to the network card.
    # Possible values are:
    #  - yes: checksum validation is forced
    #  - no: checksum validation is disabled
    #  - auto: suricata uses a statistical approach to detect when
    #    checksum off-loading is used. (default)
    # Warning: 'checksum-validation' must be set to yes to have any validation
    #checksum-checks: auto
    # With some accelerator cards using a modified libpcap (like myricom), you
    # may want to have the same number of capture threads as the number of capture
    # rings. In this case, set up the threads variable to N to start N threads
    # listening on the same interface.
    #threads: 16
    # set to no to disable promiscuous mode:
    #promisc: no
    # set snaplen, if not set it defaults to MTU if MTU can be known
    # via ioctl call and to full capture if not.
#snaplen: 1518 # Put default values here - interface: default #checksum-checks: auto pcap-file: # Possible values are: # - yes: checksum validation is forced # - no: checksum validation is disabled # - auto: suricata uses a statistical approach to detect when # checksum off-loading is used. (default) # Warning: 'checksum-validation' must be set to yes to have checksum tested checksum-checks: auto # For FreeBSD ipfw(8) divert(4) support. # Please make sure you have ipfw_load="YES" and ipdivert_load="YES" # in /etc/loader.conf or kldload'ing the appropriate kernel modules. # Additionally, you need to have an ipfw rule for the engine to see # the packets from ipfw. For Example: # # ipfw add 100 divert 8000 ip from any to any # # The 8000 above should be the same number you passed on the command # line, i.e. -d 8000 # ipfw: # Reinject packets at the specified ipfw rule number. This config # option is the ipfw rule number AT WHICH rule processing continues # in the ipfw processing system after the engine has finished # inspecting the packet for acceptance. If no rule number is specified, # accepted packets are reinjected at the divert rule which they entered # and IPFW rule processing continues. No check is done to verify # this will rule makes sense so care must be taken to avoid loops in ipfw. # ## The following example tells the engine to reinject packets # back into the ipfw firewall AT rule number 5500: # # ipfw-reinjection-rule-number: 5500 # Set the default rule path here to search for the files. 
# if not set, it will look at the current working dir default-rule-path: /usr/local/etc/suricata/rules rule-files: - botcc.rules - ciarmy.rules - compromised.rules - drop.rules - dshield.rules - emerging-activex.rules - emerging-attack_response.rules - emerging-chat.rules - emerging-current_events.rules - emerging-dns.rules - emerging-dos.rules - emerging-exploit.rules - emerging-ftp.rules # - emerging-games.rules # - emerging-icmp_info.rules # - emerging-icmp.rules - emerging-imap.rules # - emerging-inappropriate.rules - emerging-malware.rules - emerging-misc.rules # - emerging-mobile_malware.rules - emerging-netbios.rules - emerging-p2p.rules - emerging-policy.rules - emerging-pop3.rules - emerging-rpc.rules # - emerging-scada.rules - emerging-scan.rules - emerging-shellcode.rules - emerging-smtp.rules - emerging-snmp.rules - emerging-sql.rules - emerging-telnet.rules - emerging-tftp.rules - emerging-trojan.rules - emerging-user_agents.rules - emerging-voip.rules - emerging-web_client.rules - emerging-web_server.rules - emerging-web_specific_apps.rules - emerging-worm.rules - tor.rules # - decoder-events.rules # available in suricata sources under rules dir # - stream-events.rules # available in suricata sources under rules dir # - http-events.rules # available in suricata sources under rules dir # - smtp-events.rules # available in suricata sources under rules dir - dns-events.rules # available in suricata sources under rules dir - tls-events.rules # available in suricata sources under rules dir classification-file: /usr/local/etc/suricata/classification.config reference-config-file: /usr/local/etc/suricata/reference.config # Holds variables that would be used by the engine. vars: # Holds the address group vars that would be passed in a Signature. # These would be retrieved during the Signature address parsing stage. 
address-groups: HOME_NET: "[192.168.0.0/16,10.0.0.0/16,172.16.0.0/12,10.1.20.0/24]" EXTERNAL_NET: "!$HOME_NET" HTTP_SERVERS: "$HOME_NET" SMTP_SERVERS: "$HOME_NET" SQL_SERVERS: "$HOME_NET" DNS_SERVERS: "$HOME_NET" TELNET_SERVERS: "$HOME_NET" AIM_SERVERS: "$EXTERNAL_NET" DNP3_SERVER: "$HOME_NET" DNP3_CLIENT: "$HOME_NET" MODBUS_CLIENT: "$HOME_NET" MODBUS_SERVER: "$HOME_NET" ENIP_CLIENT: "$HOME_NET" ENIP_SERVER: "$HOME_NET" # Holds the port group vars that would be passed in a Signature. # These would be retrieved during the Signature port parsing stage. port-groups: HTTP_PORTS: "80" SHELLCODE_PORTS: "!80" ORACLE_PORTS: 1521 SSH_PORTS: 22 DNP3_PORTS: 20000 # Set the order of alerts bassed on actions # The default order is pass, drop, reject, alert action-order: - pass - drop - reject - alert # IP Reputation #reputation-categories-file: /usr/local/etc/suricata/iprep/categories.txt #default-reputation-path: /usr/local/etc/suricata/iprep #reputation-files: # - reputation.list # Host specific policies for defragmentation and TCP stream # reassembly. The host OS lookup is done using a radix tree, just # like a routing table so the most specific entry matches. host-os-policy: # Make the default policy windows. windows: [0.0.0.0/0] bsd: [] bsd-right: [] old-linux: [] linux: [10.0.0.0/8, 192.168.1.100, "8762:2352:6241:7245:E000:0000:0000:0000"] old-solaris: [] solaris: ["::1"] hpux10: [] hpux11: [] irix: [] macos: [] vista: [] windows2k3: [] # Limit for the maximum number of asn1 frames to decode (default 256) asn1-max-frames: 256 # When run with the option --engine-analysis, the engine will read each of # the parameters below, and print reports for each of the enabled sections # and exit. The reports are printed to a file in the default log dir # given by the parameter "default-log-dir", with engine reporting # subsection below printing reports in its own report file. engine-analysis: # enables printing reports for fast-pattern for every rule. 
rules-fast-pattern: yes # enables printing reports for each rule rules: yes #recursion and match limits for PCRE where supported pcre: match-limit: 3500 match-limit-recursion: 1500 # Holds details on the app-layer. The protocols section details each protocol. # Under each protocol, the default value for detection-enabled and " # parsed-enabled is yes, unless specified otherwise. # Each protocol covers enabling/disabling parsers for all ipprotos # the app-layer protocol runs on. For example "dcerpc" refers to the tcp # version of the protocol as well as the udp version of the protocol. # The option "enabled" takes 3 values - "yes", "no", "detection-only". # "yes" enables both detection and the parser, "no" disables both, and # "detection-only" enables detection only(parser disabled). app-layer: protocols: tls: enabled: yes detection-ports: dp: 443 #no-reassemble: yes dcerpc: enabled: yes ftp: enabled: yes ssh: enabled: yes smtp: enabled: yes imap: enabled: detection-only msn: enabled: detection-only smb: enabled: yes detection-ports: dp: 139 # smb2 detection is disabled internally inside the engine. #smb2: # enabled: yes dns: # memcaps. Globally and per flow/state. #global-memcap: 16mb #state-memcap: 512kb # How many unreplied DNS requests are considered a flood. # If the limit is reached, app-layer-event:dns.flooded; will match. #request-flood: 500 tcp: enabled: yes detection-ports: dp: 53 udp: enabled: yes detection-ports: dp: 53 http: enabled: yes # memcap: 64mb ########################################################################### # Configure libhtp. # # # default-config: Used when no server-config matches # personality: List of personalities used by default # request-body-limit: Limit reassembly of request body for inspection # by http_client_body & pcre /P option. # response-body-limit: Limit reassembly of response body for inspection # by file_data, http_server_body & pcre /Q option. 
# double-decode-path: Double decode path section of the URI # double-decode-query: Double decode query section of the URI # # server-config: List of server configurations to use if address matches # address: List of ip addresses or networks for this block # personalitiy: List of personalities used by this block # request-body-limit: Limit reassembly of request body for inspection # by http_client_body & pcre /P option. # response-body-limit: Limit reassembly of response body for inspection # by file_data, http_server_body & pcre /Q option. # double-decode-path: Double decode path section of the URI # double-decode-query: Double decode query section of the URI # # uri-include-all: Include all parts of the URI. By default the # 'scheme', username/password, hostname and port # are excluded. Setting this option to true adds # all of them to the normalized uri as inspected # by http_uri, urilen, pcre with /U and the other # keywords that inspect the normalized uri. # Note that this does not affect http_raw_uri. # Also, note that including all was the default in # 1.4 and 2.0beta1. # # meta-field-limit: Hard size limit for request and response size # limits. Applies to request line and headers, # response line and headers. Does not apply to # request or response bodies. Default is 18k. # If this limit is reached an event is raised. # # Currently Available Personalities: # Minimal # Generic # IDS (default) # IIS_4_0 # IIS_5_0 # IIS_5_1 # IIS_6_0 # IIS_7_0 # IIS_7_5 # Apache_2 ########################################################################### libhtp: default-config: personality: IDS # Can be specified in kb, mb, gb. Just a number indicates # it's in bytes. request-body-limit: 3072 response-body-limit: 3072 # inspection limits request-body-minimal-inspect-size: 32kb request-body-inspect-window: 4kb response-body-minimal-inspect-size: 32kb response-body-inspect-window: 4kb # Take a random value for inspection sizes around the specified value. 
# This lower the risk of some evasion technics but could lead # detection change between runs. It is set to 'yes' by default. #randomize-inspection-sizes: yes # If randomize-inspection-sizes is active, the value of various # inspection size will be choosen in the [1 - range%, 1 + range%] # range # Default value of randomize-inspection-range is 10. #randomize-inspection-range: 10 # decoding double-decode-path: no double-decode-query: no server-config: #- apache: # address: [192.168.1.0/24, 127.0.0.0/8, "::1"] # personality: Apache_2 # # Can be specified in kb, mb, gb. Just a number indicates # # it's in bytes. # request-body-limit: 4096 # response-body-limit: 4096 # double-decode-path: no # double-decode-query: no #- iis7: # address: # - 192.168.0.0/24 # - 192.168.10.0/24 # personality: IIS_7_0 # # Can be specified in kb, mb, gb. Just a number indicates # # it's in bytes. # request-body-limit: 4096 # response-body-limit: 4096 # double-decode-path: no # double-decode-query: no # Profiling settings. Only effective if Suricata has been built with the # the --enable-profiling configure flag. # profiling: # Run profiling for every xth packet. The default is 1, which means we # profile every packet. If set to 1000, one packet is profiled for every # 1000 received. #sample-rate: 1000 # rule profiling rules: # Profiling can be disabled here, but it will still have a # performance impact if compiled in. enabled: no filename: rule_perf.log append: yes # Sort options: ticks, avgticks, checks, matches, maxticks sort: avgticks # Limit the number of items printed at exit. limit: 100 # per keyword profiling keywords: enabled: no filename: keyword_perf.log append: yes # packet profiling packets: # Profiling can be disabled here, but it will still have a # performance impact if compiled in. enabled: no filename: packet_stats.log append: yes # per packet csv output csv: # Output can be disabled here, but it will still have a # performance impact if compiled in. 
enabled: no filename: packet_stats.csv # profiling of locking. Only available when Suricata was built with # --enable-profiling-locks. locks: enabled: no filename: lock_stats.log append: yes # Suricata core dump configuration. Limits the size of the core dump file to # approximately max-dump. The actual core dump size will be a multiple of the # page size. Core dumps that would be larger than max-dump are truncated. On # Linux, the actual core dump size may be a few pages larger than max-dump. # Setting max-dump to 0 disables core dumping. # Setting max-dump to 'unlimited' will give the full core dump file. # On 32-bit Linux, a max-dump value >= ULONG_MAX may cause the core dump size # to be 'unlimited'. coredump: max-dump: unlimited napatech: # The Host Buffer Allowance for all streams # (-1 = OFF, 1 - 100 = percentage of the host buffer that can be held back) hba: -1 # use_all_streams set to "yes" will query the Napatech service for all configured # streams and listen on all of them. When set to "no" the streams config array # will be used. use-all-streams: yes # The streams to listen on streams: [1, 2, 3] # Includes. Files included here will be handled as if they were # inlined in this configuration file. #include: include1.yaml #include: include2.yaml
OS Ubuntu 14.04 LTS
Files
Updated by Peter Manev over 8 years ago
Where do you do the replay - is it on the same machine?
Can you share (please use attach instead of copy/paste) your suricata.log when started in verbose mode (-vvv)?
Does this happen if you use any other capture method - afpacket for example?
In your command line there is - "zc:eth3" but in the pfring section the config is referring to "zc:1@0"/"zc:1@1" - is that intended?
Updated by Victor Julien over 8 years ago
When it hangs again, can you attach to the process with gdb and share the output of 'thread apply all bt'? Attaching to the process: gdb -p $(pidof suricata)
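For reference, one way to capture those backtraces non-interactively (a sketch: it assumes a single suricata process and that you have permission to ptrace it; with a --enable-debug build or unstripped binary the frames are far more readable):

```shell
# Attach to the running Suricata, dump backtraces for every thread,
# then detach -- batch mode exits automatically when done.
sudo gdb --batch -ex "thread apply all bt" -p "$(pidof suricata)" > suricata-bt.txt
```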
Updated by kevin buchanan over 8 years ago
- File 1838-1-autofp.txt 1838-1-autofp.txt added
Awesome response time!!!
Replay:
We send from multiple machines and local VMs. It occurs in a 10G non-ZC VM environment as well. Has yet to occur in a low-traffic production bare metal situation. All using PF_RING 6.3 and
recently 6.4. Both binaries and local compiles, but otherwise stock.
No copy paste :).
Attached: 1838-1-vvv.txt and 1838-1-autofp.txt. Both asserted.
Other instances:
We typically run as:
sudo LD_LIBRARY_PATH=/usr/local/pfring/lib suricata -c /usr/local/etc/suricata/suricata.yaml --pfring-int="eth3" --pfring-cluster-id=1 --pfring-cluster-type=cluster_flow --runmode=autofp
w/ basic suricata.yaml changes to memory allocation and rule ingestion.
zc queue config was from a recent experiment. I'll help recreate and verify
if that'll help. We may be contributors at some point. I'd love to make some
suggestions.
Thanks for looking at this, esp. so quickly. Excellent SW folks!
Updated by kevin buchanan over 8 years ago
- File 1838-1-vvv.txt 1838-1-vvv.txt added
Interesting feature.
The triple verbose file :).
Updated by kevin buchanan over 8 years ago
Victor Julien wrote:
When hangs again, can you attach to the process with gdb and share the output of 'thread apply all bt'. Attaching to the process: gdb -p $(pidof suricata)
Sure. Do I need to enable anything prior to doing so; compile flags?
I'll try and get this before the weekend. Thx
Updated by Victor Julien over 8 years ago
I'm a bit confused. You mention it 'hangs' but the attached outputs show that it aborts on startup. Are there 2 issues?
Updated by kevin buchanan over 8 years ago
- File 1838-2-autofp.txt 1838-2-autofp.txt added
- File 1838-2-vvv.txt 1838-2-vvv.txt added
Victor Julien wrote:
I'm a bit confused. You mention it 'hangs' but the attached outputs show that it aborts on startup. Are there 2 issues?
My bad. That was against an experimental version same config.
See 1838-2* attached. Both startup normally. And yes this is
a stock image we're discussing. Same for 3.0.1.
Looks like I have some work to do as well.
Updated by kevin buchanan over 8 years ago
- File 1838-1-gdb-dualflowhandlers.txt 1838-1-gdb-dualflowhandlers.txt added
- File 1838-2-gdb-single-flowmgrAndrecycler.txt 1838-2-gdb-single-flowmgrAndrecycler.txt added
kevin buchanan wrote:
Victor Julien wrote:
When hangs again, can you attach to the process with gdb and share the output of 'thread apply all bt'. Attaching to the process: gdb -p $(pidof suricata)
Sure. Do I need to enable anything prior to doing so; compile flags?
I'll try and get this before the weekend. Thx
Here are 2 gdb outputs. 1 w/ the original config and the other w/ a single flow manager and recycler. I've got the last session in that state if you require more info. Occurs quickly in a 10G environment.
Stock 3.1 w/ --enable-debug.
Thx
Updated by Victor Julien over 8 years ago
Can you set your max-pending-packets in your yaml to 64000 as a test?
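For anyone following along, that test is a single top-level key in suricata.yaml (64000 is the test value suggested here; the shipped default at the time was 1024):

```yaml
# Number of packets preallocated per packet pool; raised as a test to
# rule out packet-pool exhaustion as the cause of the hang.
max-pending-packets: 64000
```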
Updated by kevin buchanan over 8 years ago
Nothing yet on 2 setups. I'll try and turn the heat up. Thx.
Updated by kevin buchanan over 8 years ago
That number suffices in our environments; for now. Thanks.
Updated by Andreas Herz over 8 years ago
- Assignee set to OISF Dev
- Target version set to TBD
Should we close it or try to narrow it down if we can improve on that side?
Updated by kevin buchanan over 8 years ago
In my experience this could be considered a PSIRT-level issue, so closing it doesn't seem like an option.
Using 64K packets per thread probably works in most environments, but having the flow mgr.
starved waiting on packets ...
Updated by Peter Manev over 8 years ago
The flow-timeouts come into play as well -
tcp: new: 60 established: 3600
it means it would wait 1 hr for an established tcp flow if no packets are seen, before it times out and is cleaned up.
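The values being discussed are the tcp subsection of the flow-timeouts block quoted earlier in this report; lowering established (e.g. to 360, as tried later in this thread) makes idle flows eligible for cleanup much sooner:

```yaml
flow-timeouts:
  tcp:
    new: 60
    established: 3600   # seconds an idle established TCP flow is kept before timeout
```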
Updated by kevin buchanan over 8 years ago
Absolutely. Running w/ max-pending-packets=1024 and flow-timeout=360 now. Will advise; thanks.
Updated by kevin buchanan over 8 years ago
max-pending-packets: 1024
established: 360
Makes a significant improvement in general, thanks. Not sure how that value got changed.
Given this, it's unlikely that others will have issues. However, when traffic was stopped
(original issue), the device didn't recover as well as one might expect. But to be fair
we may not have given it enough time. No problem closing the issue if that's the consensus.
Thanks again.
Updated by kevin buchanan over 8 years ago
Looks like my last comment was premature. While it takes longer it still occurs w/ suggested changes and back to 1024 packets.
Once it occurred again traffic was stopped and then sent @100pps for the shown timespan.
Date: 8/3/2016 -- 06:16:16 (uptime: 0d, 16h 53m 10s)
------------------------------------------------------------------------------------
Counter | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total | 10007146482
capture.kernel_drops | Total | 62456553
decoder.pkts | Total | 10007146483
------------------------------------------------------------------------------------
Date: 8/3/2016 -- 07:20:33 (uptime: 0d, 17h 57m 27s)
------------------------------------------------------------------------------------
Counter | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total | 10007146482
capture.kernel_drops | Total | 62456553
decoder.pkts | Total | 10007146483
Updated by Peter Manev over 8 years ago
What are the pfring specific stats at the same time in /proc/net/pf_ring ?
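A sketch of what is being asked for here (exact file names under /proc/net/pf_ring vary across PF_RING versions, so treat the globs as illustrative):

```shell
# Global PF_RING state, then whatever per-ring/per-socket entries exist.
cat /proc/net/pf_ring/info
ls /proc/net/pf_ring/
ls /proc/net/pf_ring/stats/ 2>/dev/null   # per-application stats, newer versions
```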
Updated by kevin buchanan over 8 years ago
The stats/ directory is empty. This is a VM w/ a recent ntop PF_RING package installed
and currently non-responsive. Default max-pending-packets of 1024 and flow timeout
of 360.
03:07 0:00 sudo suricata -c /usr/local/etc/suricata/suricata.yaml --pfring-int=eth1 --pfring-cluster-id=1 --pfring-cluster-type=cluster_flow --runmode=autofp -v --init-errors-fatal
prom 4455 5.9 12.1 1053108 493660 ? Sl 03:07 60:17 suricata -c /usr/local/etc/suricata/suricata.yaml --pfring-int=eth1 --pfring-cluster-id=1 --pfring-cluster-type=cluster_flow --runmode=autofp -v --init-errors-fatal
PF_RING Version : 6.4.1 (6.4.1-stable:ea4d1f549e438269e6f4acb94bd52922b5f8b23c)
Total rings : 1
Name: eth1
Index: 3
Address: 00:0C:29:42:BE:65
Polling Mode: NAPI
Type: Ethernet
Family: Standard NIC
- Bound Sockets: 1
Max # TX Queues: 2 - Used RX Queues: 2
Standard (non ZC) Options
Ring slots : 4096
Slot version : 16
Capture TX : Yes [RX+TX]
IP Defragment : No
Socket Mode : Standard
Total plugins : 0
Cluster Fragment Queue : 0
Cluster Fragment Discard : 0
Updated by Peter Manev over 8 years ago
Can you try using "-vvv" and share the suricata.log ?
Updated by kevin buchanan over 8 years ago
sudo suricata -c /usr/local/etc/suricata/suricata.yaml --pfring-int=eth1 --pfring-cluster-id=1 --pfring-cluster-type=cluster_flow --runmode=autofp -vvv --init-errors-fatal
Warning: Invalid/No global_log_level assigned by user. Falling back on the default_log_level "Info"
21/9/2016 -- 01:20:42 - <Notice> - This is Suricata version 3.0.1 RELEASE
21/9/2016 -- 01:20:42 - <Info> - CPUs/cores online: 2
21/9/2016 -- 01:20:42 - <Info> - 'default' server has 'request-body-minimal-inspect-size' set to 33882 and 'request-body-inspect-window' set to 4053 after randomization.
21/9/2016 -- 01:20:42 - <Info> - 'default' server has 'response-body-minimal-inspect-size' set to 42119 and 'response-body-inspect-window' set to 16872 after randomization.
21/9/2016 -- 01:20:42 - <Info> - DNS request flood protection level: 500
21/9/2016 -- 01:20:42 - <Info> - DNS per flow memcap (state-memcap): 524288
21/9/2016 -- 01:20:42 - <Info> - DNS global memcap: 16777216
21/9/2016 -- 01:20:42 - <Info> - Protocol detection and parser disabled for modbus protocol.
21/9/2016 -- 01:20:42 - <Info> - Found an MTU of 1500 for 'eth1'
21/9/2016 -- 01:20:42 - <Info> - allocated 3670016 bytes of memory for the defrag hash... 65536 buckets of size 56
21/9/2016 -- 01:20:42 - <Info> - preallocated 65535 defrag trackers of size 168
21/9/2016 -- 01:20:42 - <Info> - defrag memory usage: 14679896 bytes, maximum: 33554432
21/9/2016 -- 01:20:42 - <Info> - allocated 65536 bytes of memory for the host hash... 1024 buckets of size 64
21/9/2016 -- 01:20:42 - <Info> - preallocated 1000 hosts of size 136
21/9/2016 -- 01:20:42 - <Info> - host memory usage: 201536 bytes, maximum: 6291456
21/9/2016 -- 01:20:42 - <Info> - allocated 4194304 bytes of memory for the flow hash... 65536 buckets of size 64
21/9/2016 -- 01:20:42 - <Info> - preallocated 10000 flows of size 296
21/9/2016 -- 01:20:42 - <Info> - flow memory usage: 7154304 bytes, maximum: 33554432
21/9/2016 -- 01:20:42 - <Info> - stream "prealloc-sessions": 2048 (per thread)
21/9/2016 -- 01:20:42 - <Info> - stream "memcap": 33554432
21/9/2016 -- 01:20:42 - <Info> - stream "midstream" session pickups: disabled
21/9/2016 -- 01:20:42 - <Info> - stream "async-oneside": disabled
21/9/2016 -- 01:20:42 - <Info> - stream "checksum-validation": enabled
21/9/2016 -- 01:20:42 - <Info> - stream."inline": disabled
21/9/2016 -- 01:20:42 - <Info> - stream "max-synack-queued": 5
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "memcap": 134217728
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "depth": 1048576
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "toserver-chunk-size": 2616
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "toclient-chunk-size": 2481
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly.raw: enabled
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 4, prealloc 256
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 16, prealloc 512
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 112, prealloc 512
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 248, prealloc 512
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 512, prealloc 512
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 768, prealloc 1024
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 1448, prealloc 1024
21/9/2016 -- 01:20:42 - <Info> - segment pool: pktsize 65535, prealloc 128
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "chunk-prealloc": 250
21/9/2016 -- 01:20:42 - <Info> - stream.reassembly "zero-copy-size": 128
21/9/2016 -- 01:20:42 - <Info> - allocated 262144 bytes of memory for the ippair hash... 4096 buckets of size 64
21/9/2016 -- 01:20:42 - <Info> - preallocated 1000 ippairs of size 136
21/9/2016 -- 01:20:42 - <Info> - ippair memory usage: 398144 bytes, maximum: 16777216
21/9/2016 -- 01:20:42 - <Info> - Delayed detect disabled
21/9/2016 -- 01:20:42 - <Info> - IP reputation disabled
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/botcc.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/compromised.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/dshield.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-activex.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-attack_response.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-chat.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-current_events.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-dns.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-dos.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-exploit.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-ftp.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-imap.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-malware.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-misc.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-mobile_malware.rules
21/9/2016 -- 01:20:42 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-netbios.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-p2p.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-policy.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-pop3.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-rpc.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-scan.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-shellcode.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-smtp.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-snmp.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-sql.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-telnet.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-tftp.rules
21/9/2016 -- 01:20:43 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-trojan.rules
21/9/2016 -- 01:20:44 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-user_agents.rules
21/9/2016 -- 01:20:44 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-voip.rules
21/9/2016 -- 01:20:44 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-web_client.rules
21/9/2016 -- 01:20:44 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-web_server.rules
21/9/2016 -- 01:20:44 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-web_specific_apps.rules
21/9/2016 -- 01:20:45 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/emerging-worm.rules
21/9/2016 -- 01:20:45 - <Info> - Loading rule file: /usr/local/etc/suricata/rules/tls-events.rules
21/9/2016 -- 01:20:45 - <Info> - 35 rule files processed. 17081 rules successfully loaded, 0 rules failed
21/9/2016 -- 01:20:45 - <Info> - 17089 signatures processed. 289 are IP-only rules, 5800 are inspecting packet payload, 13358 inspect application layer, 0 are decoder event only
21/9/2016 -- 01:20:45 - <Info> - building signature grouping structure, stage 1: preprocessing rules... complete
21/9/2016 -- 01:20:45 - <Info> - building signature grouping structure, stage 2: building source address list... complete
21/9/2016 -- 01:20:47 - <Info> - building signature grouping structure, stage 3: building destination address lists... complete
21/9/2016 -- 01:20:47 - <Info> - Threshold config parsed: 0 rule(s) found
21/9/2016 -- 01:20:47 - <Info> - Core dump size set to unlimited.
21/9/2016 -- 01:20:47 - <Info> - stats output device (regular) initialized: stats.log
21/9/2016 -- 01:20:47 - <Info> - AutoFP mode using "Active Packets" flow load balancer
21/9/2016 -- 01:20:47 - <Info> - Using flow cluster mode for PF_RING (iface eth1)
21/9/2016 -- 01:20:47 - <Info> - Going to use 1 ReceivePfring receive thread(s)
21/9/2016 -- 01:20:47 - <Info> - preallocated 1024 packets. Total memory 3600384
21/9/2016 -- 01:20:47 - <Info> - (RxPFR1) Using PF_RING v.6.4.1, interface eth1, cluster-id 1, single-pfring-thread
21/9/2016 -- 01:20:47 - <Info> - RunModeIdsPfringAutoFp initialised
21/9/2016 -- 01:20:47 - <Info> - dropped the caps for main thread
21/9/2016 -- 01:20:47 - <Info> - using 1 flow manager threads
21/9/2016 -- 01:20:47 - <Info> - preallocated 1024 packets. Total memory 3600384
21/9/2016 -- 01:20:47 - <Info> - using 1 flow recycler threads
21/9/2016 -- 01:20:47 - <Notice> - all 4 packet processing threads, 4 management threads initialized, engine started.
Updated by Peter Manev over 8 years ago
I have two suggestions we could try to see if we can pinpoint the problem:
- Try suricata 3.1.2 with runmode workers and pfring
- Try suricata 3.1.2 with runmode workers and afpacket
Updated by kevin buchanan over 8 years ago
It's pretty easy to reproduce. If it's not reproducible there, then I'll see what I can do, but
I don't currently have the time to deal w/ another version this week. It happens 100% of the time
w/ the default packet pool size, in less than 30 mins, at around 200K pps with an average packet size of 450B.
My first guess would be that it's stuck in PacketPoolWaitForN within the flow recycler. Can you
rule that out? Give me some insight and I may be able to help more.
Thx Kevin
Updated by Peter Manev almost 8 years ago
I cannot reproduce that case.
Do you still experience the same problem if you use regular pfring (non-ZC), for example?
Updated by kevin buchanan almost 8 years ago
It occurs w/ or w/o ZC w/ 1024 pkt pools, but not w/ 4096. If I remember correctly 2048 was also a problem.
I switched back to the original mode and config, ran 100000 pps via tcpreplay, and after several hours hit the
same issue in 3.1. Other engines do not experience this loss (tcpdump, ntopng).
This output never changes once the issue occurs, even though traffic was stopped for several minutes
and restarted.
Date: 1/30/2017 -- 00:41:55 (uptime: 0d, 11h 25m 28s)
------------------------------------------------------------------------------------
Counter | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets | Total | 289272133
capture.kernel_drops | Total | 44399579
decoder.pkts | Total | 289319685
decoder.bytes | Total | 98835435648
decoder.invalid | Total | 308
decoder.ipv4 | Total | 284882574
decoder.ipv6 | Total | 622740
decoder.ethernet | Total | 289319685
decoder.tcp | Total | 268885694
decoder.udp | Total | 14024561
decoder.sctp | Total | 3714
decoder.icmpv4 | Total | 1130995
decoder.icmpv6 | Total | 454510
decoder.pppoe | Total | 2834
decoder.vlan | Total | 204
decoder.vlan_qinq | Total | 204
decoder.teredo | Total | 24630
decoder.mpls | Total | 1617
decoder.avg_pkt_size | Total | 341
decoder.max_pkt_size | Total | 1514
defrag.ipv4.fragments | Total | 712530
defrag.ipv4.reassembled | Total | 2601
decoder.ipv4.trunc_pkt | Total | 8
decoder.tcp.invalid_optlen | Total | 100
decoder.udp.hlen_invalid | Total | 200
tcp.sessions | Total | 56637205
tcp.pseudo | Total | 710020
tcp.syn | Total | 109208121
tcp.synack | Total | 5017919
tcp.rst | Total | 9518315
tcp.segment_memcap_drop | Total | 1181762
tcp.stream_depth_reached | Total | 238
tcp.reassembly_gap | Total | 1386125
detect.alert | Total | 14621224
flow_mgr.closed_pruned | Total | 4692919
flow_mgr.new_pruned | Total | 50791364
flow_mgr.est_pruned | Total | 1768216
flow.spare | Total | 100000
flow.emerg_mode_entered | Total | 100
flow.emerg_mode_over | Total | 100
flow.tcp_reuse | Total | 1356464
tcp.memuse | Total | 393216
tcp.reassembly_memuse | Total | 20849024
flow.memuse | Total | 96708864
Thx Kevin
Updated by kevin buchanan over 7 years ago
As I alluded to earlier, this call is the issue:

void PacketPoolWaitForN(int n)
{
#ifdef U_WANT_SC_TO_HANG

There are 4 call sites, in flow-manager.c and flow-timeout.c. Once these are removed, or the call is ignored
as I did, everything works fine as far as I can tell. I am now running w/ 1024 packets and can hammer as desired
w/o losing service.
Thx Kevin
Updated by lee len over 7 years ago
kevin buchanan wrote:

> As I alluded to earlier, this call is the issue:
>
> void PacketPoolWaitForN(int n)
> {
> #ifdef U_WANT_SC_TO_HANG
>
> There are 4 call sites, in flow-manager.c and flow-timeout.c. Once these are removed, or the call is ignored
> as I did, everything works fine as far as I can tell. I am now running w/ 1024 packets and can hammer as desired
> w/o losing service. Thx Kevin
Have you fixed this problem?
It occurred in my environment: Suricata version 3.2.1, with or without pfring, in VMs or Docker.
I used gdb -p [$suricatapid] and found that all worker threads and the receive thread are in pthread_cond_wait@@GLIBC_2.3.2 ( ) from /lib64/libpthread.so.0.
Does this mean the threads are deadlocked?
Updated by Andreas Herz almost 6 years ago
Can you test it again with the most recent version(s) of suricata?
Updated by Andreas Herz over 5 years ago
- Status changed from New to Closed
Hi, we're closing this issue since there have been no further responses.
If you think this bug is still relevant, try to test it again with the
most recent version of suricata and reopen the issue. If you want to
improve the bug report please take a look at
https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Reporting_Bugs