Support #2981 (Closed)
"state": "TRUNCATED" for large files (may be caused by CheckGap function)
Description
My team and I are integrating threat intelligence hash feeds with Suricata 4.1.3 using the "filemd5" rule option.
Our goal is to receive alerts when files whose MD5s we consider dangerous pass through Suricata. To achieve this, we do the following:
1. Create a rule:
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"TCP: FILE MD5 Found"; filemd5:md5_list.txt; filestore; sid:11111112; rev:1;)
2. Create a file with a set of hashes:
1276481102f218c981e0324180bafd9f
f18208f33ca9f847dd2e348117e3bc54
cac7b533ba7abfb8591bb6a3f7a95eab
3. Include the rule in suricata.yaml and run Suricata
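For reference, step 3 might look like the following in suricata.yaml. This is a minimal sketch with hypothetical file names and paths (the actual layout depends on the installation); the filemd5 list is typically placed alongside the rule file that references it:

```yaml
# Hypothetical suricata.yaml excerpt for step 3; paths are illustrative.
default-rule-path: /usr/local/etc/suricata/rules
rule-files:
  - md5.rules          # hypothetical file containing the filemd5 rule from step 1
# md5_list.txt from step 2 is placed next to the rule file that references it.
```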
While testing file detection by MD5 sum (the "filemd5" keyword in rules), I ran into unclear behavior: it looks like Suricata can't calculate the MD5 sum of files of about 50 MB and larger. In that case the entry in eve-json.log looks like:
{ "timestamp": "05\/14\/2019-17:22:50.384309",
"ipver": 4,
"srcip": "10.16.159.190",
"dstip": "10.65.67.147",
"protocol": 6,
"sp": 80,
"dp": 41430,
"http_uri":
"\/dev\/INC000010408298\/test_file50mb.img",
"http_host": "ponybuntu.avp.ru",
"http_referer": "<unknown>",
"http_user_agent": "Wget\/1.14 (linux-gnu)",
"filename": "\/dev\/INC000010408298\/test_file50mb.img",
"state": "TRUNCATED",
"stored": false,
"size": 1013331 }
At the same time, with small files everything works well:
{"id": 1,
"timestamp": "05\/14\/2019-17:22:42.507823",
"ipver": 4,
"srcip": "10.16.159.190",
"dstip": "10.65.67.147",
"protocol": 6,
"sp": 80,
"dp": 41428,
"http_uri": "\/dev\/INC000010408298\/test_file10kb.img",
"http_host": "ponybuntu.avp.ru",
"http_referer": "<unknown>",
"http_user_agent": "Wget\/1.14 (linux-gnu)",
"filename": "\/dev\/INC000010408298\/test_file10kb.img",
"state": "CLOSED",
"md5": "1276481102f218c981e0324180bafd9f",
"stored": true,
"size": 10240 }
I tried to understand why Suricata marks large files as truncated. I followed the Self-Help diagram at https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Self_Help_Diagrams, but it didn't help me. So I turned on debug logging and found that, according to https://github.com/OISF/suricata/blob/master/src/stream-tcp-reassemble.c, the while loop in ReassembleUpdateAppLayer may break if the CheckGap function returns false, which appears to happen when Suricata finds a GAP.
In that case Suricata stops reassembling the TCP stream and doesn't calculate the MD5 sum. The debug logs confirm this: after CheckGap was called, file assembly stopped and the reported size no longer increased. Nevertheless, on the client side the file downloads fully and correctly.
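The behavior described above, where file tracking stops at the first gap and the reported size freezes, can be modeled with a toy sketch. This is not Suricata's actual code, only a hypothetical illustration of "hash contiguous bytes, give up at a gap":

```python
import hashlib

def track_file(segments, total_size):
    """Toy model of per-file tracking: hash contiguous bytes only.

    `segments` is a list of (offset, data) tuples as seen on the wire.
    Tracking stops at the first gap, mirroring the observed behavior.
    Purely illustrative; not Suricata's implementation.
    """
    md5 = hashlib.md5()
    expected = 0  # next contiguous byte offset we need
    for offset, data in sorted(segments):
        if offset > expected:  # missing bytes: a GAP was detected
            return {"state": "TRUNCATED", "size": expected, "md5": None}
        md5.update(data[expected - offset:])  # skip any overlapping prefix
        expected = max(expected, offset + len(data))
    if expected < total_size:  # stream ended short of the advertised size
        return {"state": "TRUNCATED", "size": expected, "md5": None}
    return {"state": "CLOSED", "size": expected, "md5": md5.hexdigest()}
```

With a complete stream this yields "state": "CLOSED" plus a digest; drop one segment and it yields "state": "TRUNCATED" with the size frozen at the bytes before the gap, matching the eve-json.log entries above.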
Could you please suggest how to avoid this behavior and force Suricata to calculate MD5 sums of large files with GAPs? Or is this behavior by design and not to be fixed?
Links:
Pcap-file: https://box.kaspersky.com/f/a54978a4b2924b4eb2c9/?dl=1
Debug-log (nohup.out): https://box.kaspersky.com/f/12ef7ee147744ce592b7/?dl=1
suricata.yaml: https://box.kaspersky.com/f/07cb539222fb4e66b870/?dl=1
Thank you in advance!
Updated by Victor Julien over 5 years ago
The pcap shows that while a 50mb file is requested from the server, only about 1mb is received before the flow gets lots of 'TCP previous segment not captured'. So it seems to make sense that the file is incomplete.
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
The pcap shows that while a 50mb file is requested from the server, only about 1mb is received before the flow gets lots of 'TCP previous segment not captured'. So it seems to make sense that the file is incomplete.
The same information is in eve-json.log (... "size": 1013331 }), but the file was nevertheless downloaded to disk in full, at 52 428 800 bytes...
Updated by Victor Julien over 5 years ago
How did you capture the pcap? The pcap suggests packet loss on that flow.
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
How did you capture the pcap? The pcap suggests packet loss on that flow.
It was done by the pcap-log setting in suricata.yaml (I suppose I gave you an incorrect link to suricata.yaml in the issue description; the right file is here: https://box.kaspersky.com/f/ba968a36b3f74115bad2/?dl=1).
Updated by Victor Julien over 5 years ago
Did you also preserve the suricata stats.log from the live run that captured the pcap? If so, can you share it?
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
Did you also preserve the suricata stats.log from the live run that captured the pcap? If so, can you share it?
Yep. Here it is: https://box.kaspersky.com/f/44b08c2647044ec4b16d/?dl=1
Updated by Victor Julien over 5 years ago
This file includes multiple runs, and many of them report some packet loss. So it seems likely that packet loss caused the file to be truncated.
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
This file includes multiple runs, and many of them report some packet loss. So it seems likely that packet loss caused the file to be truncated.
What do you mean by truncated? Repeating myself: all files were downloaded fully and correctly.
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
How are you running Suricata?
sudo /usr/local/bin/suricata -c /usr/local/etc/suricata/suricata.yaml -i ens160 -v
Updated by Victor Julien over 5 years ago
So Suricata runs in IDS mode. This means that even if Suricata has packet loss, the original traffic is not affected by this. This would explain why you still downloaded the file even if Suricata failed to fully track it.
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
So Suricata runs in IDS mode. This means that even if Suricata has packet loss, the original traffic is not affected by this. This would explain why you still downloaded the file even if Suricata failed to fully track it.
You mean that in IDS mode, if Suricata has packet loss, it can't calculate the MD5 sum regardless of what happens in the original traffic?
What should I change in the settings to make Suricata calculate MD5 in all cases?
Thank you in advance.
Updated by Victor Julien over 5 years ago
Yes, you need all the bytes to calculate the md5sum, so losing even a single byte throws it off. We do have a ticket on supporting hashing only the start of the file (#448), but no work has been done in that area.
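The point about needing every byte can be illustrated directly: an MD5 over an incomplete byte stream bears no useful relationship to the MD5 of the complete file, so a digest of a truncated stream can never match a feed entry computed from the whole file. A quick sketch with made-up data:

```python
import hashlib

data = b"A" * 1024      # stand-in for a complete file's contents
damaged = data[:-1]     # the same stream minus a single byte

full = hashlib.md5(data).hexdigest()
partial = hashlib.md5(damaged).hexdigest()

# MD5 is computed over the exact byte sequence; removing one byte
# produces an unrelated digest, not a "close" one.
assert full != partial
```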
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
Yes, you need all the bytes to calculate the md5sum, so losing even a single byte throws it off. We do have a ticket on supporting hashing only the start of the file (#448), but no work has been done in that area.
Ok. I understand that the MD5 sum can only be calculated over all the bytes of a file. But is there a way to wait for all the bytes, rather than stop trying to calculate the MD5 after the first gap in TCP segments?
I mean, is there a possibility to wait until the server re-sends the lost bytes, even if Suricata sees that there are some losses in the stream?
Updated by Victor Julien over 5 years ago
Suricata will declare a 'gap' when it sees that the receiving host has ACK'd data Suricata itself hasn't seen. In this case there will be no retransmit because the host has received the packet(s) Suricata missed.
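The condition described here can be stated as a one-line check. This is a toy model with made-up variable names, not Suricata's actual implementation:

```python
def gap_declared(next_expected_seq, peer_ack):
    """Toy model of the condition described above: the receiving host
    has ACK'd data (peer_ack) beyond what the sensor has reassembled
    contiguously (next_expected_seq). Illustrative only."""
    return peer_ack > next_expected_seq

# The sensor captured bytes up to sequence 1000, but the client ACKs
# 1460: the client received a segment the sensor missed, so no
# retransmission will ever fill the sensor's gap.
assert gap_declared(1000, 1460) is True
assert gap_declared(1460, 1460) is False
```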
Updated by Georgy Varlamov over 5 years ago
Victor Julien wrote:
Suricata will declare a 'gap' when it sees that the receiving host has ACK'd data Suricata itself hasn't seen. In this case there will be no retransmit because the host has received the packet(s) Suricata missed.
So the question is: is it possible to tune Suricata to avoid gaps, so that it can calculate MD5 sums for any correctly downloaded file?
Maybe I should try another mode?
Could you please help me with this task. Thank you.
Updated by Andreas Herz over 5 years ago
- Assignee set to Community Ticket
- Target version set to Support
Updated by Victor Julien over 5 years ago
Tuning guides for Suricata can be found here:
https://github.com/pevma/SEPTun
https://github.com/pevma/SEPTun-Mark-II
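Beyond those guides, capture-side buffering is a common first knob for reducing drops. A hypothetical af-packet excerpt from suricata.yaml; the option names exist in Suricata 4.x, but the values are only starting points and must be tuned per the guides above:

```yaml
# Hypothetical af-packet tuning excerpt; values depend on NIC,
# traffic volume, and host resources -- treat as starting points.
af-packet:
  - interface: ens160
    threads: auto
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes
    use-mmap: yes
    ring-size: 200000      # larger ring absorbs bursts, reducing drops
    buffer-size: 1048576
```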
Updated by Andreas Herz over 5 years ago
- Status changed from New to Feedback
Did those suggestions help you?
Updated by Victor Julien about 5 years ago
- Status changed from Feedback to Closed