Bug #6066 (Closed): Memory Corruption in util-streaming-buffer
Description
While running a recent version of Suricata (commit 2ddd26446e3, 12 Apr 2023), we have been getting crashes caused by memory corruption in the streaming buffer.
Our suricata.yaml has:
```yaml
stream:
  inline: yes
  midstream: true
  async-oneside: true
```
We noticed that the instability first manifested when we switched to `inline: yes`.
The crashes manifest as seemingly random memory corruption. Using ASAN, we found a large write (>4 GB) in `GrowRegionToSize()` at util-streaming-buffer.c:609.
The backtrace was:
- [GrowRegionToSize (util-streaming-buffer.c:609)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/util-streaming-buffer.c#L609)
- [BufferInsertAtRegionConsolidate (util-streaming-buffer.c:1109)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/util-streaming-buffer.c#L1109)
- [BufferInsertAtRegionDo (util-streaming-buffer.c:1245)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/util-streaming-buffer.c#L1245)
- [BufferInsertAtRegion (util-streaming-buffer.c:1311)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/util-streaming-buffer.c#L1311)
- [StreamingBufferInsertAt (util-streaming-buffer.c:1331)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/util-streaming-buffer.c#L1331)
- [InsertSegmentDataCustom (stream-tcp-list.c:99)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp-list.c#L99)
- [StreamTcpReassembleInsertSegment (stream-tcp-list.c:650)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp-list.c#L650)
- [StreamTcpReassembleHandleSegmentHandleData (stream-tcp-reassemble.c:804)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp-reassemble.c#L804)
- [StreamTcpReassembleHandleSegment (stream-tcp-reassemble.c:2031)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp-reassemble.c#L2031)
- [HandleEstablishedPacketToClient (stream-tcp.c:2813)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp.c#L2813)
- [StreamTcpPacketStateEstablished (stream-tcp.c:3225)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp.c#L3225)
- [StreamTcpStateDispatch (stream-tcp.c:5238)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp.c#L5238)
- [StreamTcpPacket (stream-tcp.c:5447)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp.c#L5447)
- [StreamTcp (stream-tcp.c:5759)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/stream-tcp.c#L5759)
- [FlowWorkerStreamTCPUpdate (flow-worker.c:377)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/flow-worker.c#L377)
- [FlowWorker (flow-worker.c:555)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/flow-worker.c#L555)
- [TmThreadsSlotVarRun (tm-threads.c:119)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/tm-threads.c#L119)
- [TmThreadsSlotProcessPkt (tm-threads.h:194)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/tm-threads.h#L194)
- [ReceiveDPDKLoop (source-dpdk.c:413)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/source-dpdk.c#L413)
- [TmThreadsSlotPktAcqLoop (tm-threads.c:314)](https://github.com/OISF/suricata/blob/2ddd26446e3a568074650cf16ec4ad8402acdcd4/src/tm-threads.c#L314)
We have identified that the problem is here:
```c
/* for safe printing and general caution, lets memset the
 * new data to 0 */
size_t diff = grow - region->buf_size;
void *new_mem = ((char *)ptr) + region->buf_size;
memset(new_mem, 0, diff);
```
Specifically, in our case `grow` is less than `region->buf_size`, so the unsigned subtraction computing `diff` wraps around to a huge value, and the `memset()` writes NUL bytes over a large chunk of memory.
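A minimal standalone sketch of the wrap-around; the values here are hypothetical and chosen only to illustrate the failure mode:

```c
#include <stdio.h>

int main(void)
{
    /* hypothetical values: the caller asks GrowRegionToSize() for a
     * target size that is smaller than the region's current buffer */
    size_t buf_size = 4096; /* stands in for region->buf_size */
    size_t grow = 1024;     /* stands in for the requested size */

    /* size_t is unsigned, so the subtraction wraps around instead of
     * going negative, producing an enormous length for memset() */
    size_t diff = grow - buf_size;
    printf("diff = %zu\n", diff); /* 18446744073709548544 on 64-bit */
    return 0;
}
```

With a `diff` that large, the `memset()` runs far past the end of the allocation, which matches the >4 GB write ASAN reports.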
We have a workaround that simply detects when this condition will occur and returns `-1` to indicate an error before the buffer is reallocated. When this happens, the requested new `size` passed to `GrowRegionToSize()` is actually less than the existing `region->buf_size`, i.e. the call is trying to shrink the buffer. Simply returning an error here is effective at preventing the memory corruption (and the subsequent segfault), but it is obviously not the right fix, since the reassembly is not handled correctly.
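A self-contained sketch of the idea behind the workaround; the `Region` struct and the function body are simplified stand-ins for the real ones in util-streaming-buffer.c, and only the early-return guard reflects our actual change:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* simplified stand-in for StreamingBufferRegion */
typedef struct Region {
    uint8_t *buf;
    uint32_t buf_size;
} Region;

static int GrowRegionToSize(Region *region, const uint32_t size)
{
    /* the workaround: refuse a "grow" that would actually shrink,
     * since the tail-zeroing below would otherwise wrap around */
    if (size < region->buf_size)
        return -1;

    void *ptr = realloc(region->buf, size);
    if (ptr == NULL)
        return -1;
    /* zero only the newly added tail, as the original code intends */
    memset((char *)ptr + region->buf_size, 0, size - region->buf_size);
    region->buf = ptr;
    region->buf_size = size;
    return 0;
}

int main(void)
{
    Region r = { .buf = calloc(1, 4096), .buf_size = 4096 };
    printf("shrink request -> %d\n", GrowRegionToSize(&r, 1024)); /* -1 */
    printf("grow request   -> %d\n", GrowRegionToSize(&r, 8192)); /*  0 */
    free(r.buf);
    return 0;
}
```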
After applying this workaround, Suricata's stability is much improved, but now we seem to be hitting an assert (which may or may not be related; it is possible these are two separate issues).
The backtrace on this assert is (sorry, no line numbers):
```
StreamingBufferSBBGetData()
StreamReassembleRaw()
PrefilterPktStream()
Prefilter()
DetectRun.part.0()
Detect()
FlowWorker()
TmThreadsSlotVarRun()
ReceiveDPDKLoop()
TmThreadsSlotPktAcqLoop()
```
It isn't obvious what is going wrong to make us hit this assert, but after looking at the region / SBB segment selection logic, we are trying the following patch:
```diff
--- a/src/util-streaming-buffer.c
+++ b/src/util-streaming-buffer.c
@@ -72,10 +72,10 @@ static inline int InclusiveCompare(StreamingBufferBlock *lookup, StreamingBuffer
     const uint64_t tre = intree->offset + intree->len;
     if (lre <= intree->offset) // entirely before
         return -1;
-    else if (lookup->offset < tre && lre <= tre) // (some) overlap
-        return 0;
-    else
-        return 1; // entirely after
+    if (tre <= lookup->offset) // entirely after
+        return 1;
+
+    return 0; // (some) overlap
 }
```
Our reasoning is that the existing logic does not identify the right-hand overlap case (`lookup->offset < tre && lre > tre`) as overlap; it classifies it as entirely after. However, we haven't been able to determine yet whether this patch is effective.
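To make the case concrete, here is a standalone comparison of the two versions of the check on a right-hand overlap; `Block` is a simplified stand-in for `StreamingBufferBlock`, and the two functions transcribe the before/after logic from the patch:

```c
#include <stdint.h>
#include <stdio.h>

/* simplified stand-in for StreamingBufferBlock */
typedef struct Block {
    uint64_t offset;
    uint64_t len;
} Block;

/* the logic before the patch */
static int OldCompare(const Block *lookup, const Block *intree)
{
    const uint64_t lre = lookup->offset + lookup->len;
    const uint64_t tre = intree->offset + intree->len;
    if (lre <= intree->offset) // entirely before
        return -1;
    else if (lookup->offset < tre && lre <= tre) // (some) overlap
        return 0;
    else
        return 1; // entirely after
}

/* the logic after the patch */
static int NewCompare(const Block *lookup, const Block *intree)
{
    const uint64_t lre = lookup->offset + lookup->len;
    const uint64_t tre = intree->offset + intree->len;
    if (lre <= intree->offset) // entirely before
        return -1;
    if (tre <= lookup->offset) // entirely after
        return 1;
    return 0; // (some) overlap
}

int main(void)
{
    /* lookup [10,30) overlaps the right edge of intree [0,20) */
    Block lookup = { .offset = 10, .len = 20 };
    Block intree = { .offset = 0, .len = 20 };
    printf("old: %d (misreports 'entirely after')\n", OldCompare(&lookup, &intree));
    printf("new: %d (reports overlap)\n", NewCompare(&lookup, &intree));
    return 0;
}
```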