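Searching for text in binary data

I have binary data that contains text. The text is known in advance. What is a fast way to search for that text?

For example: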

This is text 1---
!@##$%%#^%&!%^$! <= Assume this line is 3 MB of binary data
Now, This is text 2 ---
!@##$%%#^%&!%^$! <= Assume this line is 2.5 MB of binary data
This is text 3 ---

How can I search for the text "This is text 2"?

Currently I am doing:

size_t count = 0;
size_t s_len = strlen("This is text 2");

// Assume data_len is the length of the data to be searched and data is a pointer (char *) to the start of it.
// Stop once fewer than s_len bytes remain, so memcmp never reads past the end of the buffer.
for (; count + s_len <= data_len; ++count)
{
    if (!memcmp("This is text 2", data + count, s_len))
    {
         printf("%s\n", "Hurray found you...");
    }
}

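> Is there another, more efficient way to do this?
> Would replacing the count logic with memchr('T') logic help? <= Please ignore this if the statement is not clear.
> What is the average-case Big O complexity of memchr?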

Nothing in standard C will help you here, but there is a GNU extension, memmem(), that does exactly this:

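#define _GNU_SOURCE   /* must precede the includes so that memmem() is declared */
#include <string.h>

#define TEXT2 "This is text 2"

/* sizeof(TEXT2) - 1 leaves the terminating NUL out of the search pattern. */
char *pos = memmem(data, data_len, TEXT2, sizeof(TEXT2) - 1);

if (pos != NULL)
{
    /* Found it. */
}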

If you need to be portable to systems that lack it, you can take the glibc implementation of memmem() and incorporate it into your program.
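
If pulling in the glibc source is not an option, a small fallback can be written with standard C only. The following is a minimal sketch, not the glibc algorithm: it uses memchr to jump to each byte that matches the first character of the pattern and then confirms the match with memcmp, which is the memchr('T') idea from the question. The name find_bytes is hypothetical. The worst case is still proportional to data_len times the pattern length, but memchr is usually heavily optimized, so this tends to beat a byte-by-byte loop on data where the first character is rare.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function: returns a pointer to the
   first occurrence of needle in haystack, or NULL if it does not occur. */
static const void *find_bytes(const void *haystack, size_t hay_len,
                              const void *needle, size_t needle_len)
{
    const unsigned char *p = haystack;
    const unsigned char *n = needle;

    if (needle_len == 0)
        return haystack;            /* an empty pattern matches at the start */
    if (hay_len < needle_len)
        return NULL;                /* pattern cannot fit in the buffer */

    /* Last position at which a full match could still begin. */
    const unsigned char *last = p + (hay_len - needle_len);

    while (p <= last) {
        /* Let memchr skip ahead to the next byte equal to needle[0]. */
        p = memchr(p, n[0], (size_t)(last - p) + 1);
        if (p == NULL)
            return NULL;            /* first byte never appears again */
        if (memcmp(p, n, needle_len) == 0)
            return p;               /* whole pattern matches here */
        ++p;                        /* false start, resume after it */
    }
    return NULL;
}
```

With the names used above, the call would be find_bytes(data, data_len, TEXT2, sizeof(TEXT2) - 1). Because every length is passed explicitly, NUL bytes embedded in the binary data do not cut the search short.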
