Project

General

Profile

Actions

Bug #7004

open

app-layer: wrong tx may be logged for stream rules

Added by Jason Ish 6 months ago. Updated 5 months ago.

Status:
In Progress
Priority:
Normal
Target version:
Affected Versions:
Effort:
Difficulty:
Label:

Description

[Private as it discusses an issue where DNS transactions are missed]

In IDS mode, transaction handling appears to be off, and there exists a discrepancy between IDS and IPS modes. While pointed out in PGSQL, I believe DNS presents a simpler reproduction case just being a simpler protocol with less state.

The test PCAP (dns-tcp-multi-5.pcap) contains 1 TCP session with 3 DNS requests/responses, in order.
- 1) Request/response for suricata.io
- 2) Request/response for oisf.net
- 3) Request/response for google.com

One rule is used:

alert dns any any -> any any (msg:"DNS suricata"; content:"suricata"; sid:1; rev:1;)

Issue 1. IDS Mode

Using a default Suricata configuration:

suricata -S test.rules -r dns-tcp-multi-5.pcap

Observations:
- 2 alerts, logged matching on the "suricata" in the DNS stream
- The transaction in both alerts contain the data from the 3rd transaction, which is for "google.com", which is misleading.

Issue:
- The wrong transaction data is logged. This is very similar to what is being seen in #7000.

Issue 2. IPS Mode

Run the same test above, but in IPS simulation mode:

suricata -S test.rules -r dns-tcp-multi-5.pcap --simulate-ips

Observations:
- 6 alerts logged, one for each request and response
- Given the alert, and pcap_cnt, the tx data seems to be correct. TX data for each request and response is seen.

Issue:
- One might expect to get only 2 alerts in this scenario.

Summary

Being a stream rule there is no direct correlation between the match and the transaction logged. Generally logging the wrong data is worse than not logging it at all, so one idea would be to not log the transaction data in this case. For DNS, most TCP DNS sessions contain one transaction, so maybe in some cases we know we're going to log the correct tx, but in others we know it might not be the correct transaction.

I'm not sure about the IPS issue, I brought it up as it differs from IDS mode, but may be a separate issue.


Files

dns-tcp-multi-other.pcap (1.16 KB) dns-tcp-multi-other.pcap Jason Ish, 05/02/2024 05:20 PM
dns-tcp-multi-4.pcap (1.82 KB) dns-tcp-multi-4.pcap Jason Ish, 05/02/2024 05:20 PM
dns-udp-multi.pcap (622 Bytes) dns-udp-multi.pcap Jason Ish, 05/02/2024 05:20 PM
dns-tcp-multi.pcap (1.42 KB) dns-tcp-multi.pcap Jason Ish, 05/02/2024 05:20 PM
test.rules (85 Bytes) test.rules Jason Ish, 05/02/2024 05:21 PM
eve-rules-2-3.json (4.73 KB) eve-rules-2-3.json Juliana Fajardini Reichow, 05/09/2024 07:52 PM
eve-more-tx-ids-et-rules.json (6.24 KB) eve-more-tx-ids-et-rules.json Juliana Fajardini Reichow, 05/09/2024 07:52 PM
20240512_184916.jpg (4.05 MB) 20240512_184916.jpg attempt at picturing DNS transaction seen in dns-tcp-multi, in IDS mode Juliana Fajardini Reichow, 05/12/2024 09:56 PM
dns-tcp-multi-5.pcap (1.47 KB) dns-tcp-multi-5.pcap Jason Ish, 06/04/2024 03:50 PM

Related issues 3 (1 open2 closed)

Related to Suricata - Bug #7000: pgsql: trigger raw stream reassemblyClosedJuliana Fajardini ReichowActions
Related to Suricata - Optimization #7018: dns/tcp: allow triggering raw stream reassemblyClosedJuliana Fajardini ReichowActions
Related to Suricata - Optimization #7026: app-protos: trigger raw stream reassemblyNewOISF DevActions
Actions #1

Updated by Jason Ish 6 months ago

  • Related to Bug #7000: pgsql: trigger raw stream reassembly added
Actions #2

Updated by Jason Ish 6 months ago

Actions #3

Updated by Juliana Fajardini Reichow 6 months ago

Victor has suggested that I try to map the whole lifetime of a few transactions for DNS, to help visualize this. I'll be focusing on that this week, and share my findings asap.

Updated by Juliana Fajardini Reichow 6 months ago

Not sure where's the best place to keep WIP results, but since this is private for now, I'll share them here.

I've made temporary changes to the code, to:
- dns_log_json_[query/answer]: log tx_id in DNS logs in places where either it wasn't logged or it was first decremented before being logged.
- Alert event: create it adding tx_id that comes from what we track as log_id
- Alert header: add the tx_id used from the PacketAlert info to the output
- dns-json: create event headers using the tx_id that is passed to the function (originally, we don't add this info to this event header)

I have also added 2 more rules:

alert dns any any -> any any (msg:"DNS oisf client"; content:"oisf"; flow:to_client; sid:2; rev:1;)
alert dns any any -> any any (msg:"DNS suricata server"; content:"suricata"; flow:to_server; sid:3; rev:1;) 

Debugging with dns-tcp-multi.pcap:

Suricata only starts to append rules to the AlertQueue after parsing packet 12 - something that I think is related to the tx_id tracking - it seems that the engine only catches up with a packet where it could match the rule with what the packet has around the transactions 4 or 5 - which would explain why we seem to miss the alert for the first transaction, and seemingly alert on wrong data, too - but could be wrong.

The DNS transactions logged don't seem to be necessarily the ones that triggered the alerts. It also doesn't seem to align with the tx_id that is saved in the PacketAlert - at least for some cases?

If I keep only rules sid: 2 and sid:3 (eve-rules-2.-3.json), we only see two alerts - but will log query and answer from what I think should be transactions 5 and 6 (query and answer to suricata.org). When we look at the tx_id logged for the alert for sid:3, from the PacketAlert and from the engine tracking, we see that it's 4. Maybe this could work if we tracked all txs as starting with index 0, but as we use the State's tx_id for many functions that control tx_id handling and tracking - and that one starts from 1, things get messed up, I think. It gets more confusing to me for sid:2, as the oisf.net query tx_id seems to indicate it was detected after the suricata.org one. I must investigate more, I guess.

I'm attaching the resulting eve log (s), including eve-with-more-tx-ids-et-rules.json.

I'll share soon a drawing of what I understand is happening while the engine processes dns-tcp-multi.pcap along with what I thought should be seen at each step - at least in terms of tx_id tracking.

I think it should be possible for us to simplify the tx_id tracking to avoid app-layer having to increment or decrement tx_id info when sending or receiving it from the engine.

TODO: for all cases where we use the tx_id, where are we getting it from, and is it incremented or decremented before we save/log/return?

Actions #5

Updated by Juliana Fajardini Reichow 6 months ago

One attempt to represent what I saw happening for the DNS flow for pcap dns-tcp-multi, with the three rules indicated above, running GDB.
I will bring whatever findings to discuss with Victor tomorrow, and see if any of that makes sense, to know what should be the next step.
Haven't gotten to run this with simulate-ips, yet.

Actions #6

Updated by Juliana Fajardini Reichow 6 months ago

Actions #7

Updated by Jason Ish 6 months ago

Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.

Actions #8

Updated by Juliana Fajardini Reichow 6 months ago

Jason Ish wrote in #note-7:

Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.

And this may be the case for other protos with similar characteristics in similar scenarios, right?

How do we approach this with trusted rule providers?

Actions #9

Updated by Jason Ish 6 months ago

Juliana Fajardini Reichow wrote in #note-8:

Jason Ish wrote in #note-7:

Just noting that there are a handful of rules out there, and rules still being written that don't use the DNS keywords, as we don't provide all the coverage needed. While these rules will catch most DNS traffic as its over UDP, we appear to be blind to single request/response TCP DNS traffic or at least the first/request response in a TCP DNS flow.

And this may be the case for other protos with similar characteristics in similar scenarios, right?

How do we approach this with trusted rule providers?

I think we need to fix it, as we don't have a work-around that I know of, and its expected to work I believe.

Actions #10

Updated by Juliana Fajardini Reichow 5 months ago

  • Status changed from New to In Progress
  • Assignee changed from OISF Dev to Juliana Fajardini Reichow
Current investigation results/conclusions/understandings:
  1. if we trigger raw inspection data for TCP app-protos, the engine properly detects the alerts from stream in IDS mode;
  2. the false positives happen in IPS mode because of the stream nature of the rule. So the alert is actually due to previous traffic on the stream matching;
  3. we'll thus address this by triggering raw stream inspection. I also think I should go back to finishing the frame's documentation from a user perspective, and plan to submit a lightning talk to SuriCon to discuss the types of rules and impacts - such as what we saw with this issue.
Actions #11

Updated by Jason Ish 5 months ago

  • Description updated (diff)
Actions #12

Updated by Jason Ish 5 months ago

  • Description updated (diff)
Actions #13

Updated by Jason Ish 5 months ago

Actions #14

Updated by Jason Ish 5 months ago

The description updated to better reflect the actual error of the misleading TX data being logged with an alert.

Actions #15

Updated by Philippe Antoine 5 months ago

Another solution could be to log the tx data only when there is only one tx

Actions #16

Updated by Juliana Fajardini Reichow 5 months ago

For IPS mode, the logging of the wrong transaction seems even more critical, as for stream rules we get more alerts triggered...

Actions #17

Updated by Jason Ish 5 months ago

  • Private changed from Yes to No
Actions #18

Updated by Juliana Fajardini Reichow about 1 month ago

Actions

Also available in: Atom PDF