Tuesday, November 22, 2016

Overlap Graphs (ROSALIND GRPH)

Given: A collection of DNA strings in FASTA format having total length at most 10 kbp.
Return: The adjacency list corresponding to O_3. You may return edges in any order.

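import re
from collections import defaultdict

K = 3  # overlap length for O_3

# Read the whole FASTA file; '12.txt' is the downloaded dataset.
with open('12.txt', 'r') as f:
    data = f.read()

# Each record is a '>Rosalind_xxxx' header followed by one or more sequence lines.
# Matching [ACGT\n]+ also works when the file has no trailing newline.
reads = re.findall(r'>(Rosalind_[0-9]+)\n([ACGT\n]+)', data)

suffix = defaultdict(list)  # last K bases  -> IDs of strings ending with them
prefix = defaultdict(list)  # first K bases -> IDs of strings starting with them
for name, seq in reads:
    string = seq.replace('\n', '')
    # A string of length exactly K still has a valid prefix and suffix.
    if len(string) >= K:
        prefix[string[:K]].append(name)
        suffix[string[-K:]].append(name)

# Edge s -> t whenever the suffix of s equals the prefix of t and s is not t.
for kmer, sources in suffix.items():
    for s in sources:
        for t in prefix.get(kmer, []):
            if s != t:
                print(s + ' ' + t)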
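
The same idea can also be wrapped in functions so it is easy to check on a tiny input. This is only a sketch: the helper names `parse_fasta` and `overlap_edges` and the `Seq_*` records are made up for illustration and are not part of ROSALIND or of the script above.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return a list of (label, sequence) pairs from FASTA-formatted text."""
    records = []
    for block in text.split('>')[1:]:
        lines = block.strip().splitlines()
        label = lines[0].strip()
        sequence = ''.join(line.strip() for line in lines[1:])
        records.append((label, sequence))
    return records


def overlap_edges(records, k=3):
    """Return sorted (s, t) label pairs where the k-suffix of s equals the k-prefix of t."""
    by_prefix = defaultdict(list)
    for label, seq in records:
        if len(seq) >= k:
            by_prefix[seq[:k]].append(label)

    edges = []
    for label, seq in records:
        if len(seq) >= k:
            for other in by_prefix.get(seq[-k:], []):
                if other != label:  # no self-loops
                    edges.append((label, other))
    return sorted(edges)


# Made-up input for illustration (not a ROSALIND dataset).
SAMPLE = """>Seq_1
AAATAAA
>Seq_2
AAATTTT
>Seq_3
TTTTCCC
>Seq_4
GGG
"""

for s, t in overlap_edges(parse_fasta(SAMPLE)):
    print(s, t)
```

Indexing only the prefixes and looking up each suffix keeps the work proportional to the number of strings plus the number of edges, instead of comparing every pair of strings.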
