Bug #1174
Closed
segfault in Suricata 2.0
Description
I'm having a segfault occur about once a week with Suricata 2.0. I
think the issue may not be specific to 2.0; we ran 1.4.7 for a
little while and it segfaulted once or twice as well. All the core dumps
I've captured point at a buffer overflow in the memcpy call
at stream-tcp-reassemble.c line 3139.
Stack trace:
(gdb) bt
#0  0x0000003968432925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003968434105 in abort () at abort.c:92
#2  0x0000003968470837 in __libc_message (do_abort=2, fmt=0x3968557930 "*** %s ***: %s terminated\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003968502827 in __fortify_fail (msg=0x39685578d6 "buffer overflow detected") at fortify_fail.c:32
#4  0x0000003968500710 in __chk_fail () at chk_fail.c:29
#5  0x0000000000511230 in memcpy (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050, stream=0x7f75c3ae0058, p=0x33e4230) at /usr/include/bits/string3.h:52
#6  StreamTcpReassembleAppLayer (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050, stream=0x7f75c3ae0058, p=0x33e4230) at stream-tcp-reassemble.c:3139
#7  0x00000000005115c0 in StreamTcpReassembleHandleSegmentUpdateACK (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050, stream=0x7f75c3ae0058, p=0x33e4230) at stream-tcp-reassemble.c:3545
#8  0x0000000000513773 in StreamTcpReassembleHandleSegment (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0, ssn=0x7f75c3ae0050, stream=0x7f75c3ae00a0, p=0x33e4230, pq=<value optimized out>) at stream-tcp-reassemble.c:3573
#9  0x000000000050b09b in HandleEstablishedPacketToClient (tv=0xad3dd80, p=0x33e4230, stt=0x7f75c00008c0, ssn=0x7f75c3ae0050, pq=<value optimized out>) at stream-tcp.c:2091
#10 StreamTcpPacketStateEstablished (tv=0xad3dd80, p=0x33e4230, stt=0x7f75c00008c0, ssn=0x7f75c3ae0050, pq=<value optimized out>) at stream-tcp.c:2337
#11 0x000000000050e670 in StreamTcpPacket (tv=0xad3dd80, p=0x33e4230, stt=0x7f75c00008c0, pq=0xad3deb0) at stream-tcp.c:4243
#12 0x000000000050f4d3 in StreamTcp (tv=0xad3dd80, p=0x33e4230, data=0x7f75c00008c0, pq=<value optimized out>, postpq=<value optimized out>) at stream-tcp.c:4485
#13 0x0000000000524109 in TmThreadsSlotVarRun (tv=0xad3dd80, p=0x33e4230, slot=<value optimized out>) at tm-threads.c:557
#14 0x00000000005242e9 in TmThreadsSlotVar (td=0xad3dd80) at tm-threads.c:814
#15 0x0000003aede079d1 in start_thread (arg=0x7f75cbfff700) at pthread_create.c:301
#16 0x00000039684e8b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
compiled with command:
CFLAGS="-O2 -g" CCFLAGS="-O2 -g" ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --libdir=/usr/lib64 --enable-gccprotect --with-nss-includes=/usr/include/nss3 --with-libnspr-includes=/usr/include/nspr
Suricata Configuration:
  AF_PACKET support:          yes
  PF_RING support:            no
  NFQueue support:            no
  IPFW support:               no
  DAG enabled:                no
  Napatech enabled:           no
  Unix socket enabled:        yes
  Detection enabled:          yes
  libnss support:             yes
  libnspr support:            yes
  libjansson support:         yes
  Prelude support:            no
  PCRE jit:                   no
  libluajit:                  no
  libgeoip:                   no
  Non-bundled htp:            no
  Old barnyard2 support:      no
  CUDA enabled:               no
  Suricatasc install:         yes
  Unit tests enabled:         no
  Debug output enabled:       no
  Debug validation enabled:   no
  Profiling enabled:          no
  Profiling locks enabled:    no
  Coccinelle / spatch:        no

Generic build parameters:
  Installation prefix (--prefix):           /usr
  Configuration directory (--sysconfdir):   /etc/suricata/
  Log directory (--localstatedir):          /var/log/suricata/
  Host:                       x86_64-unknown-linux-gnu
  GCC binary:                 gcc
  GCC Protect enabled:        yes
  GCC march native enabled:   yes
  GCC Profile enabled:        no
Suricata run with command:
suricata -c /etc/suricata/suricata.yaml --af-packet=eth2 -D
suricata.yaml minified:
%YAML 1.1 --- host-mode: sniffer-only default-log-dir: /var/log/suricata/ unix-command: enabled: no outputs: - fast: enabled: no filename: fast.log append: yes - eve-log: enabled: no type: file #file|syslog|unix_dgram|unix_stream filename: eve.json types: - alert - http: extended: yes # enable this for extended logging information - dns - tls: extended: yes # enable this for extended logging information - files: force-magic: no # force logging magic on all logged files force-md5: no # force logging of md5 checksums - ssh - unified2-alert: enabled: yes filename: unified2.alert limit: 32mb sensor-id: 0 xff: enabled: yes mode: extra-data header: X-Forwarded-For - http-log: enabled: no filename: http.log append: yes - tls-log: enabled: no # Log TLS connections. filename: tls.log # File to store TLS logs. append: yes certs-log-dir: certs # directory to store the certificates files - dns-log: enabled: no filename: dns.log append: yes - pcap-info: enabled: no - pcap-log: enabled: no filename: log.pcap limit: 1000mb max-files: 2000 mode: normal # normal or sguil. use-stream-depth: no #If set to "yes" packets seen after reaching stream inspection depth are ignored. 
"no" logs all packets - alert-debug: enabled: no filename: alert-debug.log append: yes - alert-prelude: enabled: no profile: suricata log-packet-content: no log-packet-header: yes - stats: enabled: no filename: stats.log interval: 8 - syslog: enabled: no facility: local5 - drop: enabled: no filename: drop.log append: yes - file-store: enabled: no # set to yes to enable log-dir: files # directory to store the files force-magic: no # force logging magic on all stored files force-md5: no # force logging of md5 checksums - file-log: enabled: no filename: files-json.log append: yes force-magic: no # force logging magic on all logged files force-md5: no # force logging of md5 checksums magic-file: /usr/share/file/magic nfq: af-packet: - interface: eth2 threads: 8 cluster-id: 99 cluster-type: cluster_flow defrag: yes use-mmap: no checksum-checks: no - interface: eth1 threads: 1 cluster-id: 98 cluster-type: cluster_flow defrag: yes - interface: default legacy: uricontent: enabled detect-engine: - profile: high - custom-values: toclient-src-groups: 15 toclient-dst-groups: 15 toclient-sp-groups: 15 toclient-dp-groups: 20 toserver-src-groups: 15 toserver-dst-groups: 15 toserver-sp-groups: 15 toserver-dp-groups: 40 - sgh-mpm-context: auto - inspection-recursion-limit: 3000 threading: set-cpu-affinity: no cpu-affinity: - management-cpu-set: cpu: [ 0 ] # include only these cpus in affinity settings - receive-cpu-set: cpu: [ 0 ] # include only these cpus in affinity settings - decode-cpu-set: cpu: [ 0, 1 ] mode: "balanced" - stream-cpu-set: cpu: [ "0-1" ] - detect-cpu-set: cpu: [ "all" ] mode: "exclusive" # run detect threads in these cpus prio: low: [ 0 ] medium: [ "1-2" ] high: [ 3 ] default: "medium" - verdict-cpu-set: cpu: [ 0 ] prio: default: "high" - reject-cpu-set: cpu: [ 0 ] prio: default: "low" - output-cpu-set: cpu: [ "all" ] prio: default: "medium" detect-thread-ratio: 1.5 cuda: mpm: data-buffer-size-min-limit: 0 data-buffer-size-max-limit: 1500 cudabuffer-buffer-size: 
500mb gpu-transfer-size: 50mb batching-timeout: 2000 device-id: 0 cuda-streams: 2 mpm-algo: ac pattern-matcher: - b2gc: search-algo: B2gSearchBNDMq hash-size: low bf-size: medium - b2gm: search-algo: B2gSearchBNDMq hash-size: low bf-size: medium - b2g: search-algo: B2gSearchBNDMq hash-size: low bf-size: medium - b3g: search-algo: B3gSearchBNDMq hash-size: low bf-size: medium - wumanber: hash-size: low bf-size: medium defrag: memcap: 32mb hash-size: 65536 trackers: 65535 # number of defragmented flows to follow max-frags: 65535 # number of fragments to keep (higher than trackers) prealloc: yes timeout: 60 flow: memcap: 64mb hash-size: 65536 prealloc: 10000 emergency-recovery: 30 vlan: use-for-tracking: true flow-timeouts: default: new: 30 established: 300 closed: 0 emergency-new: 10 emergency-established: 100 emergency-closed: 0 tcp: new: 60 established: 3600 closed: 120 emergency-new: 10 emergency-established: 300 emergency-closed: 20 udp: new: 30 established: 300 emergency-new: 10 emergency-established: 100 icmp: new: 30 established: 300 emergency-new: 10 emergency-established: 100 stream: memcap: 32mb checksum-validation: no # reject wrong csums inline: auto # auto will use inline mode in IPS mode, yes or no set it statically reassembly: memcap: 128mb depth: 1mb # reassemble 1mb into a stream toserver-chunk-size: 2560 toclient-chunk-size: 2560 randomize-chunk-size: yes host: hash-size: 4096 prealloc: 1000 memcap: 16777216 logging: default-log-level: notice default-output-filter: outputs: - console: enabled: yes - file: enabled: yes filename: /var/log/suricata/suricata.log - syslog: enabled: no facility: local5 format: "[%i] <%d> -- " mpipe: load-balance: dynamic iqueue-packets: 2048 inputs: - interface: xgbe2 - interface: xgbe3 - interface: xgbe4 stack: size128: 0 size256: 9 size512: 0 size1024: 0 size1664: 7 size4096: 0 size10386: 0 size16384: 0 pfring: - interface: eth0 threads: 1 cluster-id: 99 cluster-type: cluster_flow - interface: default pcap: - interface: 
eth0 - interface: default pcap-file: checksum-checks: auto ipfw: default-rule-path: /etc/suricata/rules rule-files: - botcc.portgrouped.rules - ciarmy.rules - compromised.rules - drop.rules - dshield.rules - emerging-activex.rules - emerging-attack_response.rules - emerging-chat.rules - emerging-current_events.rules - emerging-dns.rules - emerging-dos.rules - emerging-exploit.rules - emerging-ftp.rules - emerging-games.rules - emerging-imap.rules - emerging-inappropriate.rules - emerging-malware.rules - emerging-misc.rules - emerging-mobile_malware.rules - emerging-netbios.rules - emerging-p2p.rules - emerging-policy.rules - emerging-pop3.rules - emerging-rpc.rules - emerging-scada.rules - emerging-scan.rules - emerging-shellcode.rules - emerging-smtp.rules - emerging-snmp.rules - emerging-sql.rules - emerging-telnet.rules - emerging-tftp.rules - emerging-trojan.rules - emerging-user_agents.rules - emerging-voip.rules - emerging-web_client.rules - emerging-web_server.rules - emerging-web_specific_apps.rules - emerging-worm.rules - tor.rules - http-events.rules # available in suricata sources under rules dir - smtp-events.rules # available in suricata sources under rules dir classification-file: /etc/suricata/rules/classification.config reference-config-file: /etc/suricata/rules/reference.config vars: address-groups: HOME_NET: "[192.168.0.0/16,10.0.0.0/8,172.16.0.0/12,50.114.0.0/16,199.58.198.224/27,199.58.199.0/24,69.27.166.0/26]" EXTERNAL_NET: "!$HOME_NET" HTTP_SERVERS: "$HOME_NET" SMTP_SERVERS: "$HOME_NET" SQL_SERVERS: "$HOME_NET" DNS_SERVERS: "$HOME_NET" TELNET_SERVERS: "$HOME_NET" AIM_SERVERS: "$EXTERNAL_NET" DNP3_SERVER: "$HOME_NET" DNP3_CLIENT: "$HOME_NET" MODBUS_CLIENT: "$HOME_NET" MODBUS_SERVER: "$HOME_NET" ENIP_CLIENT: "$HOME_NET" ENIP_SERVER: "$HOME_NET" port-groups: HTTP_PORTS: "80" SHELLCODE_PORTS: "!80" ORACLE_PORTS: 1521 SSH_PORTS: 22 DNP3_PORTS: 20000 action-order: - pass - drop - reject - alert host-os-policy: windows: [] bsd: [] bsd-right: [] 
old-linux: [] linux: [0.0.0.0/0] old-solaris: [] solaris: [] hpux10: [] hpux11: [] irix: [] macos: [] vista: [] windows2k3: [] asn1-max-frames: 256 engine-analysis: rules-fast-pattern: yes rules: yes pcre: match-limit: 3500 match-limit-recursion: 1500 app-layer: protocols: tls: enabled: yes detection-ports: toserver: 443 dcerpc: enabled: yes ftp: enabled: yes ssh: enabled: yes smtp: enabled: yes imap: enabled: detection-only msn: enabled: detection-only smb: enabled: yes detection-ports: toserver: 139 dns: tcp: enabled: yes detection-ports: toserver: 53 udp: enabled: yes detection-ports: toserver: 53 http: enabled: yes libhtp: default-config: personality: IDS request-body-limit: 3072 response-body-limit: 3072 request-body-minimal-inspect-size: 32kb request-body-inspect-window: 4kb response-body-minimal-inspect-size: 32kb response-body-inspect-window: 4kb double-decode-path: no double-decode-query: no server-config: profiling: rules: enabled: yes filename: rule_perf.log append: yes sort: avgticks limit: 100 keywords: enabled: yes filename: keyword_perf.log append: yes packets: enabled: yes filename: packet_stats.log append: yes csv: enabled: no filename: packet_stats.csv locks: enabled: no filename: lock_stats.log append: yes coredump: max-dump: unlimited napatech: hba: -1 use-all-streams: yes streams: [1, 2, 3]
Let me know if I need to provide any more information or enable features.
Thanks,
Jason
Updated by Victor Julien over 10 years ago
Could you try compiling with --enable-debug and then running that for a while? It will add some extra checks in this part of the code.
Updated by Jason Borden over 10 years ago
I recompiled and ran it with --enable-debug and set default-log-level: debug in suricata.yaml. After ten minutes of running, the log file is already 6GB in size, so I don't think that's going to work. I can change the surrounding SCLogDebug calls to SCLogInfo instead and run at log level info. That might give the pertinent information without having to log so much data.
Updated by Victor Julien over 10 years ago
Oh sorry, I didn't mention that you don't need to change the log level. Keep it at info or notice. Under the hood there is some more aggressive (and costly) checking when debug is compiled in.
Updated by Jason Borden over 10 years ago
OK, good to know. I've got it compiled and running with --enable-debug and I'll report back next time it segfaults.
Updated by Jason Borden over 10 years ago
Had another segfault today with --enable-debug running. Looks about the same as all the others I've had.
(gdb) bt
#0  0x0000003968432925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003968434105 in abort () at abort.c:92
#2  0x0000003968470837 in __libc_message (do_abort=2, fmt=0x3968557930 "*** %s ***: %s terminated\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003968502827 in __fortify_fail (msg=0x39685578d6 "buffer overflow detected") at fortify_fail.c:32
#4  0x0000003968500710 in __chk_fail () at chk_fail.c:29
#5  0x000000000059f0b9 in memcpy (tv=0xed1a760, ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290, stream=0x7f016126c298, p=0x25b9d40) at /usr/include/bits/string3.h:52
#6  StreamTcpReassembleAppLayer (tv=0xed1a760, ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290, stream=0x7f016126c298, p=0x25b9d40) at stream-tcp-reassemble.c:3139
#7  0x00000000005ac52f in StreamTcpReassembleHandleSegmentUpdateACK (tv=0xed1a760, ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290, stream=0x7f016126c298, p=0x25b9d40) at stream-tcp-reassemble.c:3545
#8  0x00000000005ac721 in StreamTcpReassembleHandleSegment (tv=0xed1a760, ra_ctx=0x7f0160000fb0, ssn=0x7f016126c290, stream=0x7f016126c2e0, p=0x25b9d40, pq=<value optimized out>) at stream-tcp-reassemble.c:3573
#9  0x000000000058db07 in HandleEstablishedPacketToServer (tv=0xed1a760, p=0x25b9d40, stt=0x7f01600008c0, ssn=0x7f016126c290, pq=0x7f01600008d0) at stream-tcp.c:1969
#10 StreamTcpPacketStateEstablished (tv=0xed1a760, p=0x25b9d40, stt=0x7f01600008c0, ssn=0x7f016126c290, pq=0x7f01600008d0) at stream-tcp.c:2323
#11 0x0000000000593da0 in StreamTcpPacket (tv=0xed1a760, p=0x25b9d40, stt=0x7f01600008c0, pq=0x29eebc30) at stream-tcp.c:4243
#12 0x00000000005995a9 in StreamTcp (tv=0xed1a760, p=0x25b9d40, data=0x7f01600008c0, pq=0x29eebc30, postpq=<value optimized out>) at stream-tcp.c:4485
#13 0x00000000005c0e69 in TmThreadsSlotVarRun (tv=0xed1a760, p=0x25b9d40, slot=<value optimized out>) at tm-threads.c:557
#14 0x00000000005c10a6 in TmThreadsSlotVar (td=0xed1a760) at tm-threads.c:814
#15 0x0000003aede079d1 in start_thread (arg=0x7f0182ca1700) at pthread_create.c:301
#16 0x00000039684e8b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Let me know where you want to go from here.
Updated by Jason Borden over 10 years ago
I've made some progress on determining the issue here. The problem manifests on line 3129 of stream-tcp-reassemble.c:
if (copy_size > (seg->payload_len - payload_offset)) {
    copy_size = (seg->payload_len - payload_offset);
}
When I get a segfault, seg->payload_len is less than payload_offset, which makes the clamp try to assign a negative value to copy_size. Since copy_size is a uint16_t, the negative value wraps around modulo 65536 into a huge positive one. Then on line 3139 the memcpy tries to copy far more data than fits in the 4k data buffer, causing the crash.
What I'm not sure of is why I occasionally get a seg->payload_len that is less than my payload_offset. I haven't heard of anyone else having this issue, so my thought is that it may be a configuration issue. I'll try reverting some of my settings to the defaults to see if I can figure out the cause.
Updated by Victor Julien over 10 years ago
If you get this again, can you post the following gdb output:
Jump to frame 6 (the one that has "#6 StreamTcpReassembleAppLayer (tv=0xad3dd80, ra_ctx=0x7f75c0000fb0,")
(gdb) f 6
(gdb) print *ssn
(gdb) print ra_base_seq
(gdb) print payload_len
(gdb) print payload_offset
(gdb) print copy_size
(gdb) print *seg
Hopefully this will be enough to find the cause.
Please don't change your settings. That would only hide the bug, because that is what it is :)
Updated by Jason Borden over 10 years ago
(gdb) f 6
#6  StreamTcpReassembleAppLayer (tv=0x9f0de70, ra_ctx=0x7f1b50000fb0, ssn=0x7f1b51c5c860, stream=0x7f1b51c5c868, p=0x27682e0) at stream-tcp-reassemble.c:3139
3139            memcpy(data + data_len, seg->payload +
(gdb) p *ssn
$1 = {res = 0, state = 4 '\004', queue_len = 0 '\000', data_first_seen_dir = -13 '\363', flags = 5648,
  server = {flags = 128, wscale = 6 '\006', os_policy = 5 '\005', isn = 3552530536, next_seq = 3552862937, last_ack = 3552851289, next_win = 3552886425, window = 29312, last_ts = 0, last_pkt_ts = 0, ra_app_base_seq = 3552854200, ra_raw_base_seq = 3552849832, seg_list = 0x7f1b513eda50, seg_list_tail = 0x7f1b49d6f410, sack_head = 0x7f1ae43a6030, sack_tail = 0x7f1ae43a6030},
  client = {flags = 160, wscale = 6 '\006', os_policy = 0 '\000', isn = 1221981603, next_seq = 1221982478, last_ack = 1221982478, next_win = 1222013390, window = 30912, last_ts = 0, last_pkt_ts = 1397754340, ra_app_base_seq = 1221982477, ra_raw_base_seq = 1221982477, seg_list = 0x0, seg_list_tail = 0x0, sack_head = 0x0, sack_tail = 0x0},
  toserver_smsg_head = 0x0, toserver_smsg_tail = 0x0, toclient_smsg_head = 0x7f1b53ab7420, toclient_smsg_tail = 0x7f1b53ab7420, queue = 0x0}
(gdb) p ra_base_seq
$2 = 3552858296
(gdb) p payload_len
$3 = 272
(gdb) p payload_offset
$4 = 11376
(gdb) p copy_size
$5 = 64352
(gdb) p *seg
$6 = {payload = 0x7f1b52851180 "e\243k", payload_len = 10192, pool_size = 65535, seq = 3552846921, next = 0x7f1b49d6f410, prev = 0x0, flags = 1 '\001'}
Updated by Victor Julien over 10 years ago
It seems the problem actually originates from before this packet. We have a case where stream->ra_app_base_seq > stream->last_ack, which shouldn't be possible.
Are you able to reproduce this from a pcap? Would it be possible for you to record some of your traffic and replay it to Suricata (or have Suricata read the pcap file directly) to see if that reproduces the issue?
Updated by Jason Borden over 10 years ago
At this point I'm about 90% sure that the problem is related to having checksum checks turned off. I have not yet been able to reproduce the issue when they are turned on. I'll still try reading from a pcap with checksums off, and record the traffic.
Updated by Victor Julien over 10 years ago
Thanks. Even if the checksum checks somehow influence it, a crash still shouldn't happen in any case.
Updated by Jason Borden over 10 years ago
I've been running with --pcap=eth2 and checksum checks off for the past couple of weeks and haven't been able to reproduce the bug.