Saturday, December 10, 2016

Speeding Up Motif Finding (ROSALIND KMP)

Given: A DNA string s (of length at most 100 kbp) in FASTA format.
Return: The failure array of s.
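
To make the task concrete: with 0-based indexing (an assumption of this sketch), entry k of the failure array is the length of the longest prefix of s that is also a suffix of s[:k+1], not counting s[:k+1] itself. A deliberately naive sketch of that definition could look like this; the helper name failure_array_naive is hypothetical, and the code is meant for checking small inputs only:

```python
def failure_array_naive(s):
    """Failure array straight from the definition (cubic time, illustration only)."""
    result = []
    for k in range(len(s)):
        best = 0
        # Try every proper prefix length j; j <= k excludes the whole of s[:k+1].
        for j in range(1, k + 1):
            if s[:j] == s[k + 1 - j:k + 1]:
                best = j
        result.append(best)
    return result


print(failure_array_naive("ACACG"))  # prints [0, 0, 1, 2, 0]
```

For "ACACG", the prefix "AC" reappears at positions 2..3, which gives the entries 1 and 2; every other position has no matching prefix.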

# Read the FASTA file; header lines (starting with '>') are not part of the sequence.
s = ''
with open('15.txt', 'r') as f:
    for line in f:
        if not line.startswith('>'):
            s += line.strip()
len_s = len(s)
first = s[0]
P = [0]  # failure array; the first entry is always 0
matches = []  # lengths of all prefixes of s that match a suffix ending at i - 1
i = 1
while i < len_s:
    max_match = 0  # longest match ending at position i
    extended = []
    for match_num in matches:
        # A match survives only if the next prefix symbol agrees with s[i].
        if s[i] == s[match_num]:
            match_num_pl1 = match_num + 1
            extended.append(match_num_pl1)
            if match_num_pl1 > max_match:
                max_match = match_num_pl1
    matches = extended
    # s[i] may also start a new match of length 1.
    if s[i] == first:
        matches.append(1)
        if max_match < 1:
            max_match = 1
    P.append(max_match)
    i += 1
# Write the failure array as space-separated integers (no leading space).
with open('15_1.txt', 'w') as f1:
    f1.write(' '.join(str(k) for k in P))
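
The loop above tracks every active match in a list, so on very repetitive input (a long run of a single letter, for example) the work per symbol grows with the position in the string. For comparison, and not as the approach used in this post, here is a minimal sketch of the classic linear-time KMP failure function. The function name failure_array is hypothetical, and the test string is just an example input:

```python
def failure_array(s):
    """Classic KMP failure (prefix) function, linear in len(s)."""
    P = [0] * len(s)
    k = 0  # length of the longest proper prefix of s[:i] that is also its suffix
    for i in range(1, len(s)):
        # Fall back to shorter candidates until one can be extended by s[i].
        while k > 0 and s[i] != s[k]:
            k = P[k - 1]
        if s[i] == s[k]:
            k += 1
        P[i] = k
    return P


print(' '.join(map(str, failure_array("CAGCATGGTATCACAGCAGAG"))))
# prints 0 0 0 1 2 0 0 0 0 0 0 1 2 1 2 3 4 5 3 0 0
```

Instead of keeping all candidate matches, this version keeps only the longest one and reuses the already computed entries of P to find the next shorter candidate, which is what bounds the total work.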
 
