I am thinking about an effient algorithm for wildcard searching with * representing any characters with any length.
aa*c, she*he, *she*he
When I am supposed to return, say, the beginning index of the first matching instance.
Say the pattern is of length M and the document is of length N, and the pattern has K '*' signs. I can think of a solution that first uses AC Automation to find all occurences of each chunk in O(N + M), with bitmask.
While converting the bitmask to indexes takes O(N * K)
Then binary search for the last possible beginning positions for each chunk. This could take O (K log N)
So the overall time complexity is still O (N * K), any way to do better?
References
https://codeforces.me/blog/entry/111380 https://codeforces.me/problemset/problem/1023/A