Wednesday, November 30, 2016

Introduction to Random Strings (ROSALIND PROB)

If I want the probability that a string s equals a uniformly random DNA string, it is 0.25 raised to the power of the length of s. Here the random string is not uniform, so the per-symbol probability is not 0.25 but a value that is easy to derive from the GC-content x: x/2 for each of G and C, and (1 - x)/2 for each of A and T.
The hint in this problem is very useful.
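
A minimal sketch of that probability, assuming a GC-content x means G and C each occur with probability x/2 and A and T each with (1 - x)/2. The function name match_probability is hypothetical and used only for illustration.

```python
def match_probability(s, gc_content):
    """Probability that a random string with this GC-content equals s."""
    p_gc = gc_content / 2.0          # probability of G, and of C
    p_at = (1.0 - gc_content) / 2.0  # probability of A, and of T
    prob = 1.0
    for base in s:
        # Symbols are independent, so the per-symbol probabilities multiply.
        prob *= p_gc if base in "GC" else p_at
    return prob


# With GC-content 0.5 every symbol has probability 0.25,
# so the result reduces to 0.25 ** len(s).
print(match_probability("ACGT", 0.5) == 0.25 ** 4)  # True
```

With any other GC-content the same loop applies, only the two per-symbol probabilities change.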

Given: A DNA string s of length at most 100 bp and an array A containing at most 20 numbers between 0 and 1.
Return: An array B having the same length as A in which B[k] represents the common logarithm of the probability that a random string constructed with the GC-content found in A[k] will match s exactly.
Hint: One property of the logarithm function is that for any positive numbers x and y, log10(xy)=log10(x)+log10(y).
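
Applying the hint, the logarithm of the product becomes a sum of per-symbol logarithms, and since only two distinct probabilities occur, it is enough to count the symbols. The sketch below assumes s contains only A, C, G and T and that the GC-content is strictly between 0 and 1 (at exactly 0 or 1, math.log10 would be called on zero and raise ValueError). The function name and the example string are illustrative choices.

```python
import math


def log10_match_probability(s, gc_content):
    """Common logarithm of the probability that a random string equals s."""
    n_gc = s.count("G") + s.count("C")  # symbols with probability gc_content / 2
    n_at = len(s) - n_gc                # symbols with probability (1 - gc_content) / 2
    return (n_gc * math.log10(gc_content / 2.0)
            + n_at * math.log10((1.0 - gc_content) / 2.0))


# Example: 3 G/C symbols and 6 A/T symbols at GC-content 0.129.
print(round(log10_match_probability("ACGATACAA", 0.129), 3))  # -5.737
```

Counting first gives the same value as summing one logarithm per symbol, with two log calls instead of one per symbol.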


from __future__ import division, print_function
import sys
import math

def atcg_prob(x):
    """Map each nucleotide to its probability for GC-content x."""
    cg_prob = float(x)
    at_prob = 1 - cg_prob
    return {
        'A': at_prob / 2,
        'T': at_prob / 2,
        'C': cg_prob / 2,
        'G': cg_prob / 2,
    }

def main():
    # Arguments: the DNA string first, then one or more GC-content values.
    if len(sys.argv) > 2:
        res = []
        for gc in sys.argv[2:]:
            cont = atcg_prob(gc)
            # log10 of a product is the sum of the log10 values of its factors.
            prob = sum(math.log10(cont[nuk]) for nuk in sys.argv[1])
            res.append(str(round(prob, 3)))
        print(' '.join(res))
    else:
        print('Enter a DNA string followed by GC-content values!')

if __name__ == '__main__':
    main()
