Saturday, January 14, 2017

Finding a Protein Motif (ROSALIND MPRT)

How to read from the web in Python 3 using the standard library:
  • urllib.request.urlopen
    import urllib.request
    response = urllib.request.urlopen('http://www.example.com/')
    html = response.read()
  • urllib.request.urlretrieve
    import urllib.request
    urllib.request.urlretrieve('http://www.example.com/songs/mp3.mp3', 'mp3.mp3')
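When the response is text, as with a FASTA record, read() returns bytes that still have to be decoded. A minimal sketch, assuming UTF-8 as the fallback when the server declares no charset; fetch_text is a hypothetical helper name, not part of urllib:

```python
import urllib.request

def fetch_text(url, default_encoding='utf-8'):
    """Return the body of `url` as a str (hypothetical helper)."""
    # the context manager closes the connection when the block ends
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes
        # use the charset from the Content-Type header if one is declared
        charset = response.headers.get_content_charset() or default_encoding
    return raw.decode(charset)
```

For example, html = fetch_text('http://www.example.com/') would give the page as a string rather than bytes.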
     
    
     
Given: At most 15 UniProt Protein Database access IDs.

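Return: For each protein possessing the N-glycosylation motif, output its given access ID followed by a list of locations in the protein string where the motif can be found. The motif is written N{P}[ST]{P}: N, then any amino acid except P, then S or T, then any amino acid except P.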
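The motif N{P}[ST]{P} (N, then anything but P, then S or T, then anything but P) maps naturally onto a regular expression. A sketch of that idea, assuming overlapping occurrences must all be reported, which is why the pattern is wrapped in a lookahead; motif_positions is a hypothetical helper name:

```python
import re

# the lookahead makes each match zero-width, so overlapping motifs are all found
MOTIF = re.compile(r'(?=N[^P][ST][^P])')

def motif_positions(protein):
    """Return the 1-based start positions of N{P}[ST]{P} in `protein`."""
    return [m.start() + 1 for m in MOTIF.finditer(protein)]
```

For instance, motif_positions('NNTSY') gives [1, 2]: the motif starts at position 1 (NNTS) and again at position 2 (NTSY). Without the lookahead, the second occurrence would be skipped because it overlaps the first.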

import urllib.request
import os

filenames = []  # access IDs, in input order
sequences = []  # protein strings, parallel to filenames
with open('mprt.txt', 'r') as f:
    for l in f:
        l = l.strip()
        if not l:
            continue  # skip blank lines in the input
        filename = l + '.txt'
        filenames.append(l)
        # download the FASTA record to a temporary file
        urllib.request.urlretrieve('http://www.uniprot.org/uniprot/' + l + '.fasta', filename)
        line = ''
        with open(filename, 'r') as f1:
            counter = 0
            for l1 in f1:
                if counter != 0:  # the first line is the FASTA header
                    line += l1
                counter += 1
        line = line.replace('\n', '')
        sequences.append(line)
        os.remove(filename)

res = []
for s in sequences:
    positions = ''
    for i in range(0, len(s)-3):
        # N{P}[ST]{P}: N, not P, S or T, not P
        if s[i] == 'N':
            if s[i+1] != 'P':
                if s[i+2] == 'S' or s[i+2] == 'T':
                    if s[i+3] != 'P':
                        pos = i+1  # positions are 1-based
                        positions += str(pos) + ' '
    res.append(positions)

for q in range(len(filenames)):
    if len(res[q]) > 0:
        print(filenames[q])
        print(res[q])
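The temporary file is not strictly needed: once the FASTA text is in memory, the header can be dropped and the remaining lines joined directly. A sketch, assuming the response holds a single FASTA record; fasta_sequence is a hypothetical helper name:

```python
def fasta_sequence(fasta_text):
    """Return the sequence of a single-record FASTA string, without the header."""
    lines = fasta_text.splitlines()
    # the first line starts with '>' and holds the description, not residues
    return ''.join(line.strip() for line in lines[1:])
```

The input would be the decoded body of the download, for example urllib.request.urlopen(url).read().decode(), so nothing has to be written to disk or removed afterwards.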
