python - Reducing the time complexity of this Program -


question - write function called answer(document, searchterms) returns shortest snippet of document, containing of given search terms. search terms can appear in order.

inputs: (string) document = "many google employees can program" (string list) searchterms = ["google", "program"] output: (string) "google employees can program"   inputs: (string) document = "a b c d a" (string list) searchterms = ["a", "c", "d"]  output: (string) "c d a" 

my program below giving correct answer time complexity high since doing cartesian product. if input high not able clear test cases. not able reduce complexity of program, , appreciated. thanks

import itertools  import sys  def answer(document, searchterms):      min = sys.maxint      matchedstring = ""      stringlist = document.split(" ")      d = dict()      j in range(len(searchterms)):          in range(len(stringlist)):              if searchterms[j] == stringlist[i]:                  d.setdefault(searchterms[j], []).append(i)      element in itertools.product(*d.values()):          sortedlist = sorted(list(element))          differencelist = [t - s s, t in zip(sortedlist, sortedlist[1:])]         if min > sum(differencelist):            min = sum(differencelist)           sortedelement = sortedlist            if sum(differencelist) == len(sortedelement) - 1:             break      try:         in range(sortedelement[0], sortedelement[len(sortedelement)-1]+1):              matchedstring += "".join(stringlist[i]) + " "      except:         pass      return matchedstring 

if wants clone program here code

one solution iterate through document using 2 indices (start , stop). keep track of how many of each of searchterms between start , stop. if not present increase stop until (or reach end of document). when present increase start until before searchterms no longer present. whenever searchterms present check if candidate better previous candidates. should able done in o(n) time (with limited number of search terms or search terms put in set o(1) lookup). like:

start = 0 stop = 0 counts = dict() cand_start = none cand_end = none  while stop < len(document):     if len(counts) < len(searchterms):          term = document[stop]          if term in searchterms:              if term not in counts:                   counts[term] = 1              else:                   counts[term] += 1     else:         if cand_start none or stop-start < cand_stop-cand_start:            cand_start = start            cand_stop = stop         term = document[start]         if term in counts:             if counts[start] == 1:                del counts[start]             else:                counts[start] -= 1         start += 1 

Comments

Popular posts from this blog

c# - Binding a comma separated list to a List<int> in asp.net web api -

Delphi 7 and decode UTF-8 base64 -

html - Is there any way to exclude a single element from the style? (Bootstrap) -