Bug #3159
closed
SC_ERR_PCAP_DISPATCH with message "error code -2" upon rule reload completion (4.1.x)
Description
Summary
When reloading rules with the USR2 signal, about half of the reloads end with the error SC_ERR_PCAP_DISPATCH (error_code 20) and the message "error code -2". I am not sure at this time whether this error is a cause for concern or whether it can safely be ignored.
Details
About half of the time, the completion of a rule reload produces a sequence of events like the following:
{"timestamp":"2019-04-26T10:07:54.238852-0500","log_level":"Error","event_type":"engine","engine":{"error_code":20,"error":"SC_ERR_PCAP_DISPATCH","message":"error code -2 "}}
{"timestamp":"2019-04-26T10:07:54.664296-0500","log_level":"Info","event_type":"engine","engine":{"message":"cleaning up signature grouping structure... complete"}}
{"timestamp":"2019-04-26T10:07:54.665821-0500","log_level":"Notice","event_type":"engine","engine":{"message":"rule reload complete"}}
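To put a number on the "about half of the time" observation, the engine log can be scanned for reloads that are preceded by the dispatch error. The following is only a minimal sketch: it assumes the log holds one JSON event per line, and the function names and the 2-second correlation window are illustrative choices, not anything defined by Suricata.

```python
import json
from datetime import datetime, timedelta

# Timestamp layout as it appears in the log events quoted in this report.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# The three events from this report, one JSON document per line.
SAMPLE = [
    '{"timestamp":"2019-04-26T10:07:54.238852-0500","log_level":"Error","event_type":"engine","engine":{"error_code":20,"error":"SC_ERR_PCAP_DISPATCH","message":"error code -2 "}}',
    '{"timestamp":"2019-04-26T10:07:54.664296-0500","log_level":"Info","event_type":"engine","engine":{"message":"cleaning up signature grouping structure... complete"}}',
    '{"timestamp":"2019-04-26T10:07:54.665821-0500","log_level":"Notice","event_type":"engine","engine":{"message":"rule reload complete"}}',
]


def parse_engine_events(lines):
    """Yield (timestamp, engine_dict) for every engine event in the input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event_type") != "engine":
            continue
        ts = datetime.strptime(event["timestamp"], TS_FORMAT)
        yield ts, event.get("engine", {})


def reloads_with_dispatch_error(lines, window=timedelta(seconds=2)):
    """Return (total_reloads, reloads_preceded_by_dispatch_error).

    A reload counts as affected if at least one SC_ERR_PCAP_DISPATCH event
    was logged within `window` before "rule reload complete". The window
    size is an assumption chosen for illustration.
    """
    pending_errors = []
    total = 0
    affected = 0
    for ts, engine in parse_engine_events(lines):
        if engine.get("error") == "SC_ERR_PCAP_DISPATCH":
            pending_errors.append(ts)
        elif engine.get("message") == "rule reload complete":
            total += 1
            if any(timedelta(0) <= ts - err <= window for err in pending_errors):
                affected += 1
            pending_errors.clear()
    return total, affected


if __name__ == "__main__":
    total, affected = reloads_with_dispatch_error(SAMPLE)
    print(f"{affected} of {total} reloads were preceded by SC_ERR_PCAP_DISPATCH")
```

To run this against a real log, pass an open file object instead of SAMPLE, since the function only needs an iterable of lines. With the workers runmode several threads may log the error for the same reload; the sketch counts such a reload once.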
We upgraded from 4.0.6 to 4.1.3 a while back and that is believed to be when this started happening. I checked the 4.0.6 logs and did not see these messages. Upgrading from 4.1.3 to 4.1.4 did not resolve the issue.
We are using the pcap capture method (Myricom) with workers runmode.
I looked into this issue back when I submitted this to the OISF-USERS mailing list (https://lists.openinfosecfoundation.org/pipermail/oisf-users/2019-April/016850.html) and have the following observations:
This error appears to come from line 269 of source-pcap.c (https://github.com/OISF/suricata/blob/7f38ffc8bcfa3bca793eb3be41f112634b48de2a/src/source-pcap.c#L269): we are not loading a pcap file in this case, and pcap file loading is mostly the only other place where this error is thrown.
There is a pcap_dispatch call above this one (line 265), and the conditional on line 267 that leads to this error checks whether the return value of pcap_dispatch is < 0. From https://linux.die.net/man/3/pcap_dispatch, "-2 (is returned) if the loop terminated due to a call to pcap_breakloop() before any packets were processed". The PCAP_ERROR_BREAK (-2) code would only be handled on line 272, once already inside this error branch. There is a pcap_breakloop() call (line 226) inside PcapCallbackLoop, which is called on line 266, but I believe the -2 could instead be the result of the change for 4.1.3 in https://github.com/OISF/suricata/commit/bb26e6216e5190d841529c0ecb1292b9a358ed54#diff-2079412a59d37868318fc953aeddef52, where ReceivePcapBreakLoop was created for PktAcqBreakLoop. So the break possibly originates in tm-threads.c at https://github.com/OISF/suricata/blob/d6903e70c1b653984ca95f8808755efbc6a9ece4/src/tm-threads.c#L1610?
If that is how the error occurs, then I wonder whether we may be losing at least half a second of traffic visibility due to the reconnect on line 277 of source-pcap.c?
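The reasoning above comes down to the fact that a plain "< 0" check cannot tell a deliberate pcap_breakloop() apart from a real capture error. The sketch below models only that distinction. It is written in Python purely for illustration and is not Suricata's code: the two constants carry the values libpcap documents for pcap_dispatch, while the function names are hypothetical.

```python
# Return values documented for libpcap's pcap_dispatch().
PCAP_ERROR = -1        # an error occurred
PCAP_ERROR_BREAK = -2  # pcap_breakloop() was called before any packets were processed


def classify_dispatch_result(r):
    """Map a pcap_dispatch() return value to 'packets', 'break' or 'error'."""
    if r >= 0:
        return "packets"   # number of packets processed (may be 0)
    if r == PCAP_ERROR_BREAK:
        return "break"     # requested interruption, not a capture failure
    return "error"


def naive_is_failure(r):
    """The kind of check described in this report: anything negative is an error."""
    return r < 0


def is_real_failure(r):
    """Hypothetical refinement: only non-break negatives warrant a reconnect."""
    return classify_dispatch_result(r) == "error"


if __name__ == "__main__":
    for r in (5, 0, PCAP_ERROR, PCAP_ERROR_BREAK):
        print(r, classify_dispatch_result(r), naive_is_failure(r), is_real_failure(r))
```

Under this model, a return value of -2 is flagged by the naive check but not by the refined one, which is the case that would decide whether the error log entry and the reconnect are needed after a reload.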
Steps to reproduce
Unknown at this time, other than that the pcap capture method may be required.
Updated by Victor Julien about 5 years ago
- Copied from Bug #3004: SC_ERR_PCAP_DISPATCH with message "error code -2" upon rule reload completion added
Updated by Victor Julien about 5 years ago
- Status changed from Assigned to Closed