Bug #4766
closedFlow leaked when flow->use_cnt access race happens
Description
Description:¶
Flow->use_cnt is used to track if a flow is still being processed by suricata. However, if it becomes unsynced, the flow basically stays in the queue forever, i.e. leaked.
In most cases, Flow->use_cnt seems to be protected by the flow lock, however at least in the following scenario, it is not. Basically, when the packet is bypassed, and the FlowWorker unlocks the flow and defers the flow dereference to the packet output handler TmqhOutputPacketpool, however, TmqhOutputPacketpool calls PACKET_RELEASE_REFS without a flow lock.
Steps to Reproduce:¶
1. Have a stream depth set for one of the app layer protocols, in my case, smb
2. Have a client mounted a smb share
3. Have the client copy a file from the share to local
4. Watch for the use_cnt of the flow to increase and not going back down.
TmThreadsSlotProcessPkt ->TmThreadsSlotVarRun ->FlowWorker ->if (p->flags & PKT_WANTS_FLOW) ->FlowHandlePacket ->FlowGetFlowFromHash ->FlowReference ->FlowIncrUsecnt ->if (likely(p->flow != NULL)) ->if (FlowUpdate(tv, fw, p) == TM_ECODE_DONE) ->FLOWLOCK_UNLOCK(p->flow); ->return TM_ECODE_OK; ->tv->tmqh_out(tv, p) ->TmqhOutputPacketpool ->PACKET_RELEASE_REFS(p)
Build Info:¶
This is Suricata version 6.0.3 RELEASE Features: DEBUG UNITTESTS NFQ PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT LIBCAP_NG LIBNET1.1 HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LUA HAVE_LUAJIT HAVE_LIBJANSSON TLS TLS_C11 MAGIC RUST SIMD support: SSE_4_2 SSE_4_1 SSE_3 Atomic intrinsics: 1 2 4 8 16 byte(s) 64-bits, Little-endian architecture GCC version 10.2.1 20210110, C version 201112 compiled with -fstack-protector compiled with _FORTIFY_SOURCE=2 L1 cache line size (CLS)=64 thread local storage method: _Thread_local compiled with LibHTP v0.5.38, linked against LibHTP v0.5.38 Suricata Configuration: AF_PACKET support: yes eBPF support: yes XDP support: yes PF_RING support: no NFQueue support: yes NFLOG support: yes IPFW support: no Netmap support: no DAG enabled: no Napatech enabled: no WinDivert enabled: no Unix socket enabled: yes Detection enabled: yes Libmagic support: yes libnss support: yes libnspr support: yes libjansson support: yes hiredis support: no hiredis async with libevent: no Prelude support: no PCRE jit: yes LUA support: yes, through luajit libluajit: yes GeoIP2 support: yes Non-bundled htp: no Hyperscan support: yes Libnet support: yes liblz4 support: yes HTTP2 decompression: no Rust support: yes Rust strict mode: no Rust compiler path: /root/.cargo/bin/rustc Rust compiler version: rustc 1.55.0 (c8dfcfe04 2021-09-06) Cargo path: /root/.cargo/bin/cargo Cargo version: cargo 1.55.0 (32da73ab1 2021-08-23) Cargo vendor: yes Python support: yes Python path: /usr/bin/python3 Python distutils yes Python yaml yes Install suricatactl: yes Install suricatasc: yes Install suricata-update: not bundled Profiling enabled: no Profiling locks enabled: no Plugin support (experimental): yes Development settings: Coccinelle / spatch: no Unit tests enabled: yes Debug output enabled: yes Debug validation enabled: no Generic build parameters: Installation prefix: /suricata/install Configuration directory: /etc/suricata/ Log directory: /var/log/suricata/ --prefix /suricata/install --sysconfdir /etc --localstatedir /var --datarootdir /suricata/install/share Host: x86_64-pc-linux-gnu Compiler: gcc (exec name) / g++ (real) GCC Protect enabled: yes GCC march native enabled: no GCC Profile enabled: no Position Independent Executable enabled: no CFLAGS -DUNITTESTS -march=sandybridge -O0 -std=c11 -I${srcdir}/../rust/gen -I${srcdir}/../rust/dist PCAP_CFLAGS -I/usr/include SECCFLAGS -fstack-protector -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security
Updated by Victor Julien about 3 years ago
- Status changed from New to Assigned
- Assignee set to Victor Julien
- Target version set to 7.0.0-beta1
- Label Needs backport to 6.0 added
Great analysis, thanks a lot. Need to review the logic a bit more, but I think we can just decouple the flow and packet as soon as we decide we're in the bypass path.
Updated by Victor Julien about 3 years ago
Just noticed this today on one of my machines, that has bypass enabled
Oct 20 12:31:42 z230 suricata[27592]: suricata: flow-worker.c:191: CheckWorkQueue: Assertion `!(f->use_cnt > 0)' failed.
Probably the same issue.
Updated by Victor Julien about 3 years ago
- Status changed from Assigned to In Review
Updated by Jeff Lucovsky about 3 years ago
- Copied to Bug #4792: Flow leaked when flow->use_cnt access race happens added
Updated by Victor Julien about 3 years ago
- Related to Bug #4650: Stream TCP raw reassembly is leaking added
Updated by Victor Julien about 3 years ago
- Status changed from In Review to Closed
- Priority changed from High to Normal