python - Reducing the time complexity of this Program -
question - write function called answer(document, searchterms) returns shortest snippet of document, containing of given search terms. search terms can appear in order.
inputs: (string) document = "many google employees can program" (string list) searchterms = ["google", "program"] output: (string) "google employees can program" inputs: (string) document = "a b c d a" (string list) searchterms = ["a", "c", "d"] output: (string) "c d a"
my program below giving correct answer time complexity high since doing cartesian product. if input high not able clear test cases. not able reduce complexity of program, , appreciated. thanks
import itertools import sys def answer(document, searchterms): min = sys.maxint matchedstring = "" stringlist = document.split(" ") d = dict() j in range(len(searchterms)): in range(len(stringlist)): if searchterms[j] == stringlist[i]: d.setdefault(searchterms[j], []).append(i) element in itertools.product(*d.values()): sortedlist = sorted(list(element)) differencelist = [t - s s, t in zip(sortedlist, sortedlist[1:])] if min > sum(differencelist): min = sum(differencelist) sortedelement = sortedlist if sum(differencelist) == len(sortedelement) - 1: break try: in range(sortedelement[0], sortedelement[len(sortedelement)-1]+1): matchedstring += "".join(stringlist[i]) + " " except: pass return matchedstring
if wants clone program here code
one solution iterate through document using 2 indices (start
, stop
). keep track of how many of each of searchterms
between start
, stop
. if not present increase stop
until (or reach end of document). when present increase start
until before searchterms
no longer present. whenever searchterms
present check if candidate better previous candidates. should able done in o(n)
time (with limited number of search terms or search terms put in set o(1)
lookup). like:
start = 0 stop = 0 counts = dict() cand_start = none cand_end = none while stop < len(document): if len(counts) < len(searchterms): term = document[stop] if term in searchterms: if term not in counts: counts[term] = 1 else: counts[term] += 1 else: if cand_start none or stop-start < cand_stop-cand_start: cand_start = start cand_stop = stop term = document[start] if term in counts: if counts[start] == 1: del counts[start] else: counts[start] -= 1 start += 1
Comments
Post a Comment