Optimization #426
closedthreshold: rule based thresholding data structure improvement
Description
In recent changes most of the thresholding code moved away from one big lock for the entire thresholding table. With the introduction of a host table, the thresholding that tracks per src or dst is now done in a fine grained manner, reducing lock contention.
For thresholding tracked per rule however, we still use one big evil lock. So we need to fix that. Currently the data structure for storing these threshold entries is an array, with a size of the number of signatures. The signatures internal id (s->num) is the index. The array is stored in the de_ctx.
We need to move away from this. As we are going to support rule reloads, the internal id can no longer be used here. The thresholding table should survive the rule reload. The signatures internal id's will change on a reload so they can't be used as index.
So ideally we'd have some form of hash that uses the gid+sid to lookup the thresholding data struct without a global lock.
Options would be:
- have a hash with per row locking and update threshold entry under row lock.
- have a hash with per row locking for lookups and a per threshold entry lock for checks and updates (this is how flow and host tables work).
- make sure internal id's for sigs remain the same with reloads, using unique id's for new sigs -- too complicated I think.
Other options?
Updated by Victor Julien over 12 years ago
Thinking of another option: most signatures don't have a threshold setting and most setups will not use large number of thresholds through the thresholds.conf file. What this means is that the number of threshold rules is relatively low.
We can also do a separate "threshold id" per rule, that is unique per "gid + sid". During the init stage we can set this by having a simple hash. Due to the small number of threshold enabled rules the global lookup array can be small. Each threshold entry would have it's own lock. The only contention would happen when 2 or more threads set/check a threshold for the same rule, which should be relatively rare.
Then on a rule reload, we can use the same simple hash to lookup the existing id's. Then we'd assign new id's to new rules and resize the lookup array somehow. The resize would require some form of locking though, this might bring back the big evil lock... Maybe a rwlock would be an option:
Lookup(Signature *s) { read_lock(global_table); entry = s->threshold_id; mutex_lock(entry); // check/update mutex_unlock(entry); read_unlock(global_table); } Update() { write_lock(global_table); // resize and init new entries write_unlock(global_table); }
Thoughts?
Updated by Victor Julien over 12 years ago
- Assignee set to OISF Dev
- Target version set to TBD
Updated by Victor Julien about 11 years ago
- Tracker changed from Feature to Optimization
Updated by Victor Julien over 6 years ago
- Assignee changed from OISF Dev to Anonymous
- Effort set to low
- Difficulty set to medium
Updated by Philippe Antoine over 1 year ago
Is this issue about this code ?
detect-engine-threshold.c :
} else if (td->track == TRACK_RULE) {
SCMutexLock(&de_ctx->ths_ctx.threshold_table_lock);
ret = ThresholdHandlePacketRule(de_ctx,p,td,s,pa);
SCMutexUnlock(&de_ctx->ths_ctx.threshold_table_lock);
}
Updated by Victor Julien 7 months ago
- Status changed from New to In Progress
- Assignee changed from Community Ticket to Victor Julien
- Target version changed from TBD to 8.0.0-beta1
Updated by Victor Julien 6 months ago
- Status changed from In Progress to In Review
Updated by Victor Julien 4 months ago
- Subject changed from sid based thresholding data structure improvement to threshold: rule based thresholding data structure improvement
- Status changed from In Review to Closed