Bug #2516
closedDead lock caused by unix command register-tenant
Description
steps:
1, run suricata in multi tenants mode
2, use suricatasc to send a command "register-tenant 2 /etc/suricata/tenant2.yaml"
result:
The command does not return, and check the stack using pstack: Thread 2 (Thread 0x7f6e1d3c3700 (LWP 18202)): #0 0x00007f6e2acc451d in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f6e2acbfe1b in _L_lock_812 () from /lib64/libpthread.so.0 #2 0x00007f6e2acbfce8 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00000000005ccf7a in DetectEngineThreadCtxInitGlobalKeywords (det_ctx=0x1c641800, det_ctx=0x1c641800) at detect-engine.c:1984 #4 ThreadCtxDoInit (de_ctx=de_ctx@entry=0xd15e000, det_ctx=det_ctx@entry=0x1c641800) at detect-engine.c:2260 #5 0x00000000005ce14b in DetectEngineThreadCtxInitForReload (tv=tv@entry=0x240cc60, new_de_ctx=new_de_ctx@entry=0xd15e000, mt=mt@entry=1) at detect-engine.c:2374 #6 0x00000000005d6746 in DetectEngineReloadThreads (new_de_ctx=new_de_ctx@entry=0xd15e000) at detect-engine.c:1362 #7 0x00000000005db208 in DetectEngineMTApply () at detect-engine.c:3530 #8 0x0000000000805c07 in UnixSocketRegisterTenant (cmd=<optimized out>, answer=0x6d73f90, data=<optimized out>) at runmode-unix-socket.c:839 #9 0x00000000008c8c5e in UnixCommandExecute (client=0x13536a80, command=0x7f6e1d3c1310 "{\"command\": \"register-tenant\", \"arguments\": {\"id\": 2, \"filename\": \"/etc/suricata/tenant2.yaml\"}}", this=0xd039c0 <command>) at unix-manager.c:506 #10 UnixCommandRun (client=0x13536a80, this=0xd039c0 <command>) at unix-manager.c:624 #11 UnixMain (this=0xd039c0 <command>) at unix-manager.c:675 #12 UnixManager (th_v=0x137f6000, thread_data=<optimized out>) at unix-manager.c:1130 #13 0x00000000008bf644 in TmThreadsManagement (td=0x137f6000) at tm-threads.c:719 #14 0x00007f6e2acbde25 in start_thread () from /lib64/libpthread.so.0 #15 0x00007f6e2a7e3bad in clone () from /lib64/libc.so.6
master->lock is locked twice in this stack, first in DetectEngineMTApply and the second in DetectEngineThreadCtxInitGlobalKeywords
Updated by Andreas Herz over 6 years ago
- Assignee set to OISF Dev
- Target version set to TBD
Can you add your config, operating system and suricata --build-info?
Updated by kai jiang over 6 years ago
All the information is below. However, they may not related to the issue. I reviewed the codes, and it may be caused by the global key words feature(https://github.com/OISF/suricata/commit/cf9678d926f4e09d962ac1dee1cd808786ccf8cb#diff-67cc0d1f93fb61787a55c2de59cbf89f). It imported a lock for master->lock in DetectEngineThreadCtxInitGlobalKeywords. My suggestion is to set the master->lock as a recursive lock.
Configure:
multi-detect: enabled: yes #selector: direct # direct or vlan selector: vlan loaders: 4 default: yes tenants: mappings:
OS:
centos 7.5
build info:
This is Suricata version 4.0.0-dev (rev a2540e5) Features: UNITTESTS PCAP_SET_BUFF AF_PACKET HAVE_PACKET_FANOUT HAVE_HTP_URI_NORMALIZE_HOOK PCRE_JIT HAVE_NSS HAVE_LIBJANSSON TLS SIMD support: SSE_4_2 SSE_4_1 SSE_3 Atomic intrisics: 1 2 4 8 16 byte(s) 64-bits, Little-endian architecture GCC version 4.8.5 20150623 (Red Hat 4.8.5-28), C version 199901 compiled with _FORTIFY_SOURCE=0 L1 cache line size (CLS)=64 thread local storage method: __thread compiled with LibHTP v0.5.26, linked against LibHTP v0.5.26 Suricata Configuration: AF_PACKET support: yes PF_RING support: no NFQueue support: no NFLOG support: no IPFW support: no Netmap support: no DAG enabled: no Napatech enabled: no Unix socket enabled: yes Detection enabled: yes Libmagic support: no libnss support: yes libnspr support: yes libjansson support: yes hiredis support: no hiredis async with libevent: no Prelude support: no PCRE jit: yes LUA support: no libluajit: no libgeoip: no Non-bundled htp: no Old barnyard2 support: no CUDA enabled: no Hyperscan support: yes Libnet support: no Rust support (experimental): no Experimental Rust parsers: no Rust strict mode: no Suricatasc install: yes Profiling enabled: no Profiling locks enabled: no Development settings: Coccinelle / spatch: no Unit tests enabled: yes Debug output enabled: no Debug validation enabled: no Generic build parameters: Installation prefix: /usr/local Configuration directory: /usr/local/etc/suricata/ Log directory: /usr/local/var/log/suricata/ --prefix /usr/local --sysconfdir /usr/local/etc --localstatedir /usr/local/var Host: x86_64-unknown-linux-gnu Compiler: gcc (exec name) / gcc (real) GCC Protect enabled: no GCC march native enabled: yes GCC Profile enabled: no Position Independent Executable enabled: no CFLAGS -g -march=native PCAP_CFLAGS -I/usr/local/include
Updated by Victor Julien over 6 years ago
- Status changed from New to Assigned
- Assignee changed from OISF Dev to Victor Julien
- Priority changed from High to Normal
- Target version changed from TBD to 4.1rc1
Updated by Victor Julien over 6 years ago
Can you test https://github.com/OISF/suricata/pull/3407 and report if this fixes the issue for you?
Updated by kai jiang over 6 years ago
https://github.com/OISF/suricata/pull/3407/commits/d351f92837986a22ec45a65e787c98b25d3407e1
I tested and can confirm the crash is fixed
https://github.com/OISF/suricata/pull/3407/commits/c1cea9931fcb8cf9ee0220c7094afb4b877ed41a
The dead lock has gone, but I am afraid that other issues may be caused by simply removing the lock. The global keyword list is also updated when execute register-tenant command to register a new tenant, but at the same time it may be used by the detect engine. The safer way I think is to make the lock recursive.
Updated by Victor Julien over 6 years ago
The only function updating those fields is DetectRegisterThreadCtxGlobalFuncs, and that is only called at keyword registration. This is when suricata starts up and is completely single threaded. All further access is read only.
Updated by Victor Julien over 6 years ago
- Status changed from Assigned to Closed