Suppose I have a list of strings S, each string at most 50 characters long. I'm looking for a data structure and algorithm that can handle the following query efficiently:
Given another list of strings G (a small list of at most 20 strings), I want to find the lexicographically smallest subset of at most 15 strings from S. The result should satisfy the condition that each string in G is a substring of each and every string in the result. i need fast query.
I've searched extensively but couldn't find a satisfactory answer. Most solutions I found address the case with a single string in G, but I need to handle multiple strings as stated above.
Example: S: ["hello my", "yuck ", "mr elan", "now is the", "parag", "hello", "yustuc"] Query1: G: ["el", "m"] Query2: G: ["yu", "uc"] Desired Result1: ["hello my", "mr elan"] Desired Result2: ["yuck", "yustuc"]
If anyone has insights or knows the answer, please share your knowledge.
First of all, I think u should define lexographically smallest subset. Because a subset is a set, and so its not ordered. So it doesn't make sense to compare subsets of S lexographically.
i meant suppose i get 10 results then those 10 results should lexicographically ordered but for a particular query i get 50 results then among those 50 results i need 15 results such that those results are lexicographically smallest amongst all 50.
in the example "hello my" is lexicographically smaller than "mr elan"