Optimization #7208: tcp/reassemble: GetBlock takes O(nlgn) in worst case - Suricata - Open Information Security Foundation

Actions

Copy link

Optimization #7208

closed

tcp/reassemble: GetBlock takes O(nlgn) in worst case

Added by chris tang 9 months ago. Updated about 2 months ago.

Status:

Closed

Priority:

Normal

Assignee:

chris tang

Target version:

8.0.0-beta1

Effort:

Difficulty:

Label:

Description

The GetBlock function search the suitable StreamingBufferBlock by offset, but it relys on SBB_RB_NEXT which degree the sbb red black tree search time from O(logn) to O(n), and the SBB_RB_NEXT increases L1 dcache missing when tcp session within much tcpsegment. In production env, SBB_RB_NEXT would occoupies 8-12% cpu times. Suggest changing the way to find block to standerd red black search:

Subtasks 1 (0 open — 1 closed)

Actions

Copy link

Updated by chris tang 9 months ago

1. Finds the first node A greater than or equal to the offset
2. Find A's prev node B
3. Check B->len+B->offset > offset? yes:return B ,no: return A

According to optimization above, we've decrease the cpu time from 8-12% to under 1%.

Actions

Copy link

Updated by Shivani Bhardwaj 8 months ago

chris tang wrote in #note-1:

1. Finds the first node A greater than or equal to the offset
2. Find A's prev node B
3. Check B->len+B->offset > offset? yes:return B ,no: return A

According to optimization above, we've decrease the cpu time from 8-12% to under 1%.

Thank you.
Q: Could you please tell how is it guaranteed that only the A's prev node B will satisfy B->len + B->offset > offset?
Q: Could it not be any other node prior to A?

Actions

Copy link

Updated by chris tang 8 months ago

Shivani Bhardwaj wrote in #note-2:

chris tang wrote in #note-1:

1. Finds the first node A greater than or equal to the offset
2. Find A's prev node B
3. Check B->len+B->offset > offset? yes:return B ,no: return A

According to optimization above, we've decrease the cpu time from 8-12% to under 1%.

Thank you.
Q: Could you please tell how is it guaranteed that only the A's prev node B will satisfy B->len + B->offset > offset?
Q: Could it not be any other node prior to A?

Thanks for pointing out that, that's also confused me.
So I read the SBBUpdate function code, found that new StreamingBufferBlock node will be adjusted by ConsolidateFwd and ConsolidateBackward to make sure there is only one node at a certain offset+len, overlap would be merged, only gap exists? I'm not sure if my understanding is right. If it is,i think any node prior to B will not satisfy if B satisfied B->len + B->offset > offset ?

Actions

Copy link

Updated by Jason Ish 8 months ago

Target version changed from 7.0.7 to 8.0.0-beta1

Retargeted to 8, will need to scope for backport after further review.

Actions

Copy link

Updated by Shivani Bhardwaj 8 months ago

So I read the SBBUpdate function code, found that new StreamingBufferBlock node will be adjusted by ConsolidateFwd and ConsolidateBackward to make sure there is only one node at a certain offset+len, overlap would be merged, only gap exists? I'm not sure if my understanding is right. If it is,i think any node prior to B will not satisfy if B satisfied B->len + B->offset > offset ?

Thank you for the necessary pointers. Apologies for getting back late on this.
After my analysis, I think you algorithm could actually work and yield more correct results indeed.
However, I'm a bit unsure about the claimed time complexity of the rbtree macros. It seems to me that RB_NEXT (finding successor) will take the same amount of time as RB_PREV (finding predecessor) i.e. O(lg n) per run. Please check the implementation in tree.h in the source code. Lmk wdyt about this?

However, with the change in algorithm that you suggest, particularly the search for first node greater than equal to offset, we always seem to reach the correct node.

I'd like to understand how you'd handle the following.
Assumption: There is no offset bigger than the current offset in the tree.
1. There does exist an overlap with an existing buffer whose offset + len > offset (right most node always?)
2. There is no block covering that offset (so, perhaps we're in a GAP?)

Let's see if your PR passes our QA and if our analysis is correct. :) Please make sure to add your test results and useful pointers/test data for us to verify that as well.

Please feel free to assign the ticket to yourself since you're working on it. Thanks.

Actions

Copy link

Updated by chris tang 8 months ago

Assignee changed from OISF Dev to chris tang

Thank you for your ongoing attention and responses to my tickets.

However, I'm a bit unsure about the claimed time complexity of the rbtree macros. It seems to me that RB_NEXT (finding successor) will take the same amount of time as RB_PREV (finding predecessor) i.e. O(lg n) per run. Please check the implementation in tree.h in the source code. Lmk wdyt about this?

Sorry for my describe, yes the time complexity of RB_NEXT is equal to RB_PREV, i mean, but the GetBlock call O(n) times RB_NEXT.

 for ( ; blk != NULL; blk = SBB_RB_NEXT(blk)) {
        if (blk->offset >= offset)
            return blk;
        else if ((blk->offset + blk->len) > offset) {
            return blk;
        }
    }

It searches from beginning of the tree, one by one, truns rb tree search into line. In worst situation, the node we find is the end block of the streaming buffer, it takes n*logn time to find it. If we find the prior node from root, we can run RB_PREV only once. Here's the code snippet:

static StreamingBufferBlock *GetBlockDo(const StreamingBuffer *sb, const uint64_t offset)
{
    StreamingBufferBlock *blk;
    StreamingBufferBlock *res = GetPriorBlock(sb, offset);

    if (res != NULL) {
        blk = SBB_RB_PREV(res);
    } else {
        blk = SBB_RB_MINMAX(&sb->sbb_tree, RB_INF);
    }
    if (blk && blk->offset + blk->len > offset) {
        return blk;
    }
    return res;
}

However, with the change in algorithm that you suggest, particularly the search for first node greater than equal to offset, we always seem to reach the correct node.

I'd like to understand how you'd handle the following.
Assumption: There is no offset bigger than the current offset in the tree.
1. There does exist an overlap with an existing buffer whose offset + len > offset (right most node always?)
2. There is no block covering that offset (so, perhaps we're in a GAP?)

Pleas look at the code snippet,
1.Return the RB_MAX node, then check if it satisfied offset and length check
2.Like what current GetBlock code is doing: seems that there is't a correct node, so we return NULL.

Let's see if your PR passes our QA and if our analysis is correct. :) Please make sure to add your test results and useful pointers/test data for us to verify that as well.

Please feel free to assign the ticket to yourself since you're working on it. Thanks.

Thanks,here's the commit https://github.com/regit/suricata/pull/51/commits/04906b57fdb0ed5a941389e8630c7783cd15d740 PR and hope it passed the QA.

Actions

Copy link

Updated by Shivani Bhardwaj 8 months ago

chris tang wrote in #note-6:

Thank you for your ongoing attention and responses to my tickets.

However, I'm a bit unsure about the claimed time complexity of the rbtree macros. It seems to me that RB_NEXT (finding successor) will take the same amount of time as RB_PREV (finding predecessor) i.e. O(lg n) per run. Please check the implementation in tree.h in the source code. Lmk wdyt about this?

Sorry for my describe, yes the time complexity of RB_NEXT is equal to RB_PREV, i mean, but the GetBlock call O(n) times RB_NEXT.

ah ok. yes, thank you!

[...]

It searches from beginning of the tree, one by one, truns rb tree search into line. In worst situation, the node we find is the end block of the streaming buffer, it takes n*logn time to find it. If we find the prior node from root, we can run RB_PREV only once. Here's the code snippet:

agreed.

[...]

However, with the change in algorithm that you suggest, particularly the search for first node greater than equal to offset, we always seem to reach the correct node.

I'd like to understand how you'd handle the following.
Assumption: There is no offset bigger than the current offset in the tree.
1. There does exist an overlap with an existing buffer whose offset + len > offset (right most node always?)
2. There is no block covering that offset (so, perhaps we're in a GAP?)

Pleas look at the code snippet,
1.Return the RB_MAX node, then check if it satisfied offset and length check
2.Like what current GetBlock code is doing: seems that there is't a correct node, so we return NULL.

Let's see if your PR passes our QA and if our analysis is correct. :) Please make sure to add your test results and useful pointers/test data for us to verify that as well.

Please feel free to assign the ticket to yourself since you're working on it. Thanks.

Thanks,here's the commit https://github.com/regit/suricata/pull/51/commits/04906b57fdb0ed5a941389e8630c7783cd15d740 PR and hope it passed the QA.

Thank you. Could you please create a PR at https://github.com/OISF/suricata/ so we can see if the QA passes and we're both correct in our analysis? ;)

Actions

Copy link

Updated by chris tang 8 months ago

Shivani Bhardwaj wrote in #note-7:

Thank you. Could you please create a PR at https://github.com/OISF/suricata/ so we can see if the QA passes and we're both correct in our analysis? ;)

Oh sorry for my mistake. I've request a PR at https://github.com/OISF/suricata/ but wrote the wrong address, here's the PR https://github.com/OISF/suricata/pull/11727, and i'm requesting a new PR to format code and sign commit.
Thanks.

Actions

Copy link

Updated by chris tang 8 months ago · Edited

Here is the new PR https://github.com/OISF/suricata/pull/11749

Actions

Copy link

#10

Updated by Shivani Bhardwaj 7 months ago

Subject changed from Tcpreassmble GetBlock needs optimization to tcp/reassemble: GetBlock takes O(nlgn) in worst case
Status changed from New to In Review