Skip to content
Cart

Your Cart

×

You have 0 items in your cart.

Register Sign in Wishlist

Code for chapter 18


Below are listed all the scripts as shown in chapter 18 of the book. These are free to copy and paste into your code editor.

Quick links to each script:

18.1 18.2 18.3 18.4 18.5 18.6 18.7 18.8 18.9 18.10
18.11 18.12 18.13 18.14 18.15 18.16 18.17 18.18 18.19 18.20
18.21 18.22 18.23 18.24 18.25 18.26 18.27 18.28 18.29 18.30
18.31 18.32 18.33 18.34 18.35 18.36 18.37 18.38 18.39 18.40
18.41 18.42 18.43 18.44 18.45 18.46 18.47 18.48 18.49 18.50
18.51 18.52 18.53 18.54 18.55 18.56 18.57 18.58 18.59 18.60
18.61 18.62 18.63 18.64 18.65          

Script 18.1

script_18_1.py
# Script 18.1
mass = 3.4
volume = 1.8
density = mass/volume       # Division
density = round(density, 3) # Round to three decimal places
print(density)
script_18_1.out
[ah@hobbes Documents]$ python script_18_1.py 
1.889
[ah@hobbes Documents]$


Script 18.2

script_18_2.py
# Script 18.2
x = 2     # Integer
y = 5     # Integer
z = 3.0   # Floating point

print(x * y) # 10  - Integer
print(x / y) # 0.4 - Floating point due to division
print(x + z) # 5.0 - Floating point

t = type(x * y - z)  # Get a value's data type
print(t)             # float
script_18_2.out
[ah@hobbes Documents]$ python script_18_2.py 
10
0
5.0
<type 'float'>
[ah@hobbes Documents]$

 

Script 18.3

script_18_3.py
# Script 18.3
x = 1            # An arbitrary integer
print(x.__doc__) # Print Python documentation for this object
script_18_3.out
[ah@hobbes Documents]$ python script_18_3.py 
int(x=0) -> int or long
int(x, base=10) -> int or long

Convert a number or string to an integer, or return 0 if no arguments
are given.  If x is floating point, the conversion truncates towards zero.
If x is outside the integer range, the function returns a long instead.

If x is not a number or if base is given, then x must be a string or
Unicode object representing an integer literal in the given base.  The
literal can be preceded by '+' or '-' and be surrounded by whitespace.
The base defaults to 10.  Valid bases are 0 and 2-36.  Base 0 means to
interpret the base from the string as an integer literal.
>>> int('0b100', base=0)
4
[ah@hobbes Documents]$

 

Script 18.4

script_18_4.py
# Script 18.4
x = True
print(x)
print(bool(0.0)) # False
print(bool(7))   # True
script_18_4.out
[ah@hobbes Documents]$ python script_18_4.py 
True
False
True
[ah@hobbes Documents]$

 

Script 18.5

script_18_5.py
# Script 18.5
x = 2
print(x > 5) # False
y = x < 3
print(y)     # True
script_18_5.out
[ah@hobbes Documents]$ python script_18_5.py 
False
True
[ah@hobbes Documents]$

 

Script 18.6

script_18_6.py
# Script 18.6
x = 1
y = -1
x > y and y > 1   # False - second comparison is False
print(x > y and y > 1)
x != 2 or y == 1  # True - first comparison is True; x does not equal 2
print(x != 2 or y == 1)
not (y > x)       # True - comparison is False; not False is True
print(not (y > x))
script_18_6.out
[ah@hobbes Documents]$ python script_18_6.py 
False
True
True
[ah@hobbes Documents]$

 

Script 18.7

script_18_7.py
# Script 18.7
x = 'Hello'
y = "world"
print(x,y)  # Print both values to screen
script_18_7.out
[ah@hobbes Documents]$ python script_18_7.py 
('Hello', 'world')
[ah@hobbes Documents]$

 

Script 18.8

script_18_8.py
# Script 18.8
x = '''This text flows from one line
to the next line inside triple quotes'''
print(x)
script_18_8.out
[ah@hobbes Documents]$ python script_18_8.py 
This text flows from one line
to the next line inside triple quotes
[ah@hobbes Documents]$

 

Script 18.9

script_18_9.py
# Script 18.9
x = 'abcde'
print(len(x)) # 5
print(x[0])   # First character 'a' at index 0
print(x[-1])  # Last character 'e'
print(x[1:3]) # 'bcd' range from index 1, up to, but not including 3
script_18_9.out
[ah@hobbes Documents]$ python script_18_9.py 
5
a
e
bc
[ah@hobbes Documents]$

 

Script 18.10

script_18_10.py
# Script 18.10
name = 'Mary'
x = 0.12
y = 34121.0

a = 'Hello {}'.format(name)    # name replaces brackets
print(a)                       # 'Hello Mary'

b = 'X {:.5f} Y {:.2e}'.format(x, y)   # 5 dp float 2 dp sci
print(b)                               # 'X 0.12000 Y 3.41e4'
 
c = '{:3d},{:3d}'.format(12,7)   # pad digits to 3 characters
print(c)                         # ' 12,  7'
script_18_10.out
[ah@hobbes Documents]$ python script_18_10.py 
Hello Mary
X 0.12000 Y 3.41e+04
 12,  7
[ah@hobbes Documents]$

 

Script 18.11

script_18_11.py
# Script 18.11
x = ['a', 'b', 'c', 'd']
print(x[2])   # 'c' - the character at index 2
print(x[1:])  # ['b', 'c', 'd'] - from index 1 to the end
x[0] = 'z'    # character at index 0 is set to 'z'
del x[2]      # Deletes item at index 2
print(x)      # ['z', 'b', 'd']

y = [[4,7], [9,6]] # A list containing two other lists
print(y[1])        # [9,6] - the sub-list at y index 1
print(y[1][0])     # 9 - item 0 from sub-list 1  
script_18_11.out
[ah@hobbes Documents]$ python script_18_11.py 
c
['b', 'c', 'd']
['z', 'b', 'd']
[9, 6]
9
[ah@hobbes Documents]$

 

Script 18.12

script_18_12.py
# Script 18.12
x = list('PQRST')     # Create list from text string
print(x)              # ['P','Q','R','S','T'] - list of characters
y = list(range(10))   # A list of the range from 0 up to <10
print(y)              # [0,1,2,3,4,5,6,7,8,9]
script_18_12.out
[ah@hobbes Documents]$ python script_18_12.py 
['P', 'Q', 'R', 'S', 'T']
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[ah@hobbes Documents]$

 

Script 18.13

script_18_13.py
# Script 18.13
x = [4, 1, 0] # Three items
a, b, c = x   # a is 4, b is 1, c is 0
x = ['a', 'b', 'c', 'd']
print(len(x))     # 4 - number of items in list
print('c' in x)   # True - string 'c' is in the list
print(2 in x)     # False - number 2 is not in list
script_18_13.out
[ah@hobbes Documents]$ python script_18_13.py 
4
True
False
[ah@hobbes Documents]$

 

Script 18.14

script_18_14.py
# Script 18.14
x = (9, 3, 1, 0)  # Create a tuple
y = (2,)          # Tuple with one item (needs trailing comma)
z = ()            # Empty tuple

print(x[-1])      # Print the last item - 0
#x[0] = 6          # Does not work! Tuples cannot be changed!

w = list(x)       # Create equivalent list from a tuple
w[0] = 6          # List copy can be changed
print('x',x)
print('w',w)
script_18_14.out
[ah@hobbes Documents]$ python script_18_14.py 
0
('x', (9, 3, 1, 0))
('w', [6, 3, 1, 0])
[ah@hobbes Documents]$

 

Script 18.15

script_18_15.py
# Script 18.15
x = {1,2,3,4,3,2,1}  # x is {1,2,3,4} - duplicates ignored
y = set([3,4,5,6])   # y created from a list
print(len(y))        # 4
print(1 in y)        # False; 1 is not in y
print(x & y)         # {3,4} - intersection, items in both
print(x | y)         # {1,2,3,4,5,6} - union, items in either
script_18_15.out
[ah@hobbes Documents]$ python script_18_15.py 
4
False
set([3, 4])
set([1, 2, 3, 4, 5, 6])
[ah@hobbes Documents]$

 

Script 18.16

script_18_16.py
# Script 18.16
d = {"G":329.21, "C":289.18, "A":313.21, "T":314.19}
print(d['A'])      # 313.21 -  value associated with 'A'
print(len(d))      # 4 - number of key:value pairs
print(d.keys())    # Just keys 'G', 'A', 'T', 'C'
print(d.values())  # Just values 329.21, 313.21, 314.19, 289.18
script_18_16.out
[ah@hobbes Documents]$ python script_18_16.py 
313.21
4
['A', 'C', 'T', 'G']
[313.21, 289.18, 314.19, 329.21]
[ah@hobbes Documents]$

 

Script 18.17

script_18_17.py
# Script 18.17
d = {"G":329.21, "C":289.18, "A":313.21, "T":314.19}
d['T'] = 304.19   # Change the value of an existing item
d['U'] = 291.08   # Add a new key:value pair
print(len(d))     # 5 - dict is larger  
del d['U']        # Delete a key and its value from the dictionary
script_18_17.out
[ah@hobbes Documents]$ python script_18_17.py 
5
[ah@hobbes Documents]$

 

Script 18.18

script_18_18.py
# Script 18.18
x = ['Mon', 'Tue', 'Wed'] # A list of strings
y = ['Fri', 'Sat', 'Sun'] # And another
x.append('Thu')  # Add a single new item to end
print(x)
x.extend(y)       # Extend with items from another collection
print(x)
x.sort()         # Sort contents alphabetically
print(x)
x.remove('Sun')  # Remove an item
print(x)
x.index('Sat')   # Positional index of an item
print(x)
script_18_18.out
[ah@hobbes Documents]$ python script_18_18.py 
['Mon', 'Tue', 'Wed', 'Thu']
['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
['Fri', 'Mon', 'Sat', 'Sun', 'Thu', 'Tue', 'Wed']
['Fri', 'Mon', 'Sat', 'Thu', 'Tue', 'Wed']
['Fri', 'Mon', 'Sat', 'Thu', 'Tue', 'Wed']
[ah@hobbes Documents]$

 

Script 18.19

script_18_19.py
# Script 18.19
s = {'G', 'C', 'A', 'T'}   # A set with 4 strings
t = {'N', 'R', 'Y'}
print(s)
s.add('U')       # Add a single item (if not present)
print(s)
s.update(t)      # Add any new items from another collection
print(s)
script_18_19.out
[ah@hobbes Documents]$ python script_18_19.py 
set(['A', 'C', 'T', 'G'])
set(['A', 'C', 'U', 'T', 'G'])
set(['A', 'C', 'G', 'N', 'R', 'U', 'T', 'Y'])
[ah@hobbes Documents]$

 

Script 18.20

script_18_20.py
# Script 18.20
x = 3
if x < -1 or x > 1:    # Run the next indented lines only when true   
    x *= 2                
    print('Value was doubled')  

print('Value is:', x)  # Always executed, not in indented block
script_18_20.out
[ah@hobbes Documents]$ python script_18_20.py 
Value was doubled
('Value is:', 6)
[ah@hobbes Documents]$

 

Script 18.21

script_18_21.py
# Script 18.21
x = 3
print(x)
if x > 0:
    print('Positive')
elif x < 0:            # Checked if first condition was false
    print('Negative')
else:                  # If all fails
    print('Zero')

x = -5
print(x)
if x > 0:
    print('Positive')
elif x < 0:            # Checked if first condition was false
    print('Negative')
else:                  # If all fails
    print('Zero')
script_18_21.out
[ah@hobbes Documents]$ python script_18_21.py 
3
Positive
-5
Negative
[ah@hobbes Documents]$

 

Script 18.22

script_18_22.py
# Script 18.22
x = 'abc'
if x:              # Test innate truth of x
    print('true')  # This prints; a non-empty string is true
else:
    print('false')
script_18_22.out
[ah@hobbes Documents]$ python script_18_22.py 
true
[ah@hobbes Documents]$

 

Script 18.23

script_18_23.py
# Script 18.23
total = 0
data = [1,4,9,25,36]
for x in data:       # x is first 1, then 4, then 9 etc.
    print(x)         # Current value of x in this cycle
    total += x       # Add current value of x to total
script_18_23.out
[ah@hobbes Documents]$ python script_18_23.py 
1
4
9
25
36
[ah@hobbes Documents]$

 

Script 18.24

script_18_24.py
# Script 18.24
text = 'AGCAGTAGACGAACAT'     # String of characters
for i, x in enumerate(text):  # Extract index and character value
    print(i, x)               # Print index and value for each cycle
script_18_24.out
[ah@hobbes Documents]$ python script_18_24.py 
(0, 'A')
(1, 'G')
(2, 'C')
(3, 'A')
(4, 'G')
(5, 'T')
(6, 'A')
(7, 'G')
(8, 'A')
(9, 'C')
(10, 'G')
(11, 'A')
(12, 'A')
(13, 'C')
(14, 'A')
(15, 'T')
[ah@hobbes Documents]$

 

Script 18.25

script_18_25.py
# Script 18.25
x = 1
while x < 1000:   # Repeat the indented block while this is true
    print(x)
    x *= 2        # Double the value

print(x)  # 1024 - final value stopped the loop: not less than 1000
script_18_25.out
[ah@hobbes Documents]$ python script_18_25.py 
1
2
4
8
16
32
64
128
256
512
1024
[ah@hobbes Documents]$

 

Script 18.26

script_18_26.py
# Script 18.26
t = 0
data = [3, -1, 2, -5, None, 9, -2]
for x in data:  
   if x < 0:
       continue      # Skip the remainder of 'for' loop
   elif x is None:
       break         # Quit entirely
     
   t += x * x        # Otherwise do a calculation
   print(t)

print('Final',t)
script_18_26.out
[ah@hobbes Documents]$ python script_18_26.py 
9
13
94
('Final', 94)
[ah@hobbes Documents]$  

Script 18.27

script_18_27.py
# Script 18.27
squares = []
for x in range(1, 10):
    squares.append(x*x)  # 1, 4, 9, 16, 25, 36...
    print('Option 1',squares)

squares2 = []
squares2 = [x*x for x in range(1,10)]
print('Option 2',squares2)
script_18_27.out
[ah@hobbes Documents]$ python script_18_27.py 
('Option 1', [1])
('Option 1', [1, 4])
('Option 1', [1, 4, 9])
('Option 1', [1, 4, 9, 16])
('Option 1', [1, 4, 9, 16, 25])
('Option 1', [1, 4, 9, 16, 25, 36])
('Option 1', [1, 4, 9, 16, 25, 36, 49])
('Option 1', [1, 4, 9, 16, 25, 36, 49, 64])
('Option 1', [1, 4, 9, 16, 25, 36, 49, 64, 81])
('Option 2', [1, 4, 9, 16, 25, 36, 49, 64, 81])
[ah@hobbes Documents]$

 

Script 18.28

script_18_28.py
# Script 18.28
x = 1
y = 0
try:                 # Run the following block and check for failure
    w = x / y

except ZeroDivisionError as err:        # y was zero
   print('divided by zero, continuing') # warn, but otherwise ignore

except Exception as err:   # Other, unspecified problem
    raise(err)             # Trigger the error, do not continue
script_18_28.out
[ah@hobbes Documents]$ python script_18_28.py 
divided by zero, continuing
[ah@hobbes Documents]$

 

Script 18.29

script_18_29.py
# Script 18.29
def my_calc(x, y=1):     # x is mandatory, y defaults to 1 if not given
    z = x * x - y * y
    return z

a = my_calc(7)           # x is 7, y defaults to 1  
b = my_calc(x=2, y=2)    # name both arguments
c = my_calc(y=9, x=-1)   # name arguments in a different order
d = my_calc(3, y=-2)     # unnamed arguments (x is 3) come first
print(a, b, c, d)
script_18_29.out
[ah@hobbes Documents]$ python script_18_29.py 
(48, 0, -80, 5)
[ah@hobbes Documents]$

 

Script 18.30

script_18_30.py
# Script 18.30
class Person():         # Next indented block is in the class definition

  def __init__(self, name, age):  # Values specified when object is made
    self.name = name              # Link input values to the object
    self.age = age

  def get_first_name(self):       # A second, custom function
    names = self.name.split()     # self refers to the run-time object
    return names[0]               # Give back first word

p1 = Person('Lisa Simpson', 8)    # Make object of Person class
p2 = Person('Bart Simpson', 10)   # Make another
print(p1.age, p2.age)             # Values linked to objects - 8, 10
print(p1.get_first_name())        # Run a linked function - gives 'Lisa'
script_18_30.out
[ah@hobbes Documents]$ python script_18_30.py 
(8, 10)
Lisa
[ah@hobbes Documents]$

 

Script 18.31

script_18_31.py
# Script 18.31
import math           # Import the inbuilt mathematics module
print(math.e)         # An attribute representing e
print(math.exp(2.0))  # Use the exponent (ex) function
script_18_31.out
[ah@hobbes Documents]$ python script_18_31.py 
2.71828182846
7.38905609893
[ah@hobbes Documents]$

 

Script 18.32

script_18_32.py
# Script 18.32
import math as m      # Import as a different name
print(m.log(2.0))     # Use the logarithm function
script_18_32.out
[ah@hobbes Documents]$ python script_18_32.py 
0.69314718056
[ah@hobbes Documents]$

 

Script 18.33

script_18_33.py
# Script 18.33
from math import sqrt, cos   # Import named module components
x = sqrt(3/2)                # Use the square root function
print(cos(x))                # Use the cosine function
script_18_33.out
[ah@hobbes Documents]$ python script_18_33.py 
0.540302305868
[ah@hobbes Documents]$

 

Script 18.34

script_18_34.py
# Script 18.34
import sys
print(sys.argv) # List of words typed at the command line after "python"
print(sys.path) # Directory search path used to locate Python modules
sys.exit()      # Quit the Python program
script_18_34.out
[ah@hobbes Documents]$ python script_18_34.py 
['script_18_34.py']
['/home/ah/Documents', '/usr/lib64/python27.zip', '/usr/lib64/python2.7',
'/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk',
'/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload',
'/usr/lib64/python2.7/site-packages',
'/usr/lib64/python2.7/site-packages/Numeric',
'/usr/lib64/python2.7/site-packages/gtk-2.0',
'/usr/lib64/python2.7/site-packages/wx-3.0-gtk3',
'/usr/lib/python2.7/site-packages']
[ah@hobbes Documents]$

 

Script 18.35

script_18_35.py
# Script 18.35
import random
random.seed(3)                   # Set the random number seed
print(random.randint(1,10))      # A random integer from 1 to 10 inc.
print(random.uniform(0.0, 2.0))  # A random float between 0.0 and 2.0
data = [1,2,3,4,5]
random.shuffle(data)             # Shuffle list order
script_18_35.out
[ah@hobbes Documents]$ python script_18_35.py 
3
1.08845845059
[ah@hobbes Documents]$

 

Script 18.36

script_18_36.py
# Script 18.36
import re
pattern = re.compile('\d+') # Make a regular expression object
                            # (one or more digits)
text = 'A 123 B 456'        # A string to look in

match_obj = pattern.search(text)  # Match inside string  
print(match_obj.group())          # '123' - matching substring
print(match_obj.start)            # 2 - position of match

text_2 = pattern.sub(text, '**')  # Substitute all matches with '**'
print(text_2)                     # 'A ** B **'

hits = pattern.findall(text)      # List of matching sub-strings
print(hits)                       # ['123', '456'])
script_18_36.out
[ah@hobbes Documents]$ python script_18_36.py 
123
<built-in method start of _sre.SRE_Match object at 0x7fb1605f8578>
**
['123', '456']
[ah@hobbes Documents]$

 

Script 18.37

script_18_37.py
# Script 18.37
import sys
print(sys.path)                        # Current module search path
sys.path.append('./')                  # Directory with my_module.py
import my_module                       # Import custom module (no .py)

print(my_module.CONSTANT)       # Use variable from other Python file
print(my_module.my_calc(9,-1))  # Use function from other Python file
script_18_37.out
[ah@hobbes Documents]$ python script_18_37.py 
['/home/ah/Documents', '/usr/lib64/python27.zip', '/usr/lib64/python2.7',
'/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk',
'/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload',
'/usr/lib64/python2.7/site-packages',
'/usr/lib64/python2.7/site-packages/Numeric',
'/usr/lib64/python2.7/site-packages/gtk-2.0',
'/usr/lib64/python2.7/site-packages/wx-3.0-gtk3',
'/usr/lib/python2.7/site-packages']
1.0545718e-34
98
[ah@hobbes Documents]$

 

Script 18.38

script_18_38.py
# Script 18.38
def my_calc(x, y):             # Example function
    return x*x - 2*x*y - y*y

if __name__ == '__main__':     # Is the code run as a main program?
    print(my_calc(2,3))        # Run test code only when it is main
    print(my_calc(-1,0))       # Module imports do not run this block
script_18_38.out
[ah@hobbes Documents]$ python script_18_38.py 
-17
1
[ah@hobbes Documents]$

 

Script 18.39

script_18_39.py
# Script 18.39
path = 'data_file.txt'            # Location of data in file system
file_obj = open(path, 'r')        # Create file object for reading
data = file_obj.read()            # Read all the data (as a string)
print(data)
script_18_39.out
[ah@hobbes Documents]$ python script_18_39.py 
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur.

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.

[ah@hobbes Documents]$

 

Script 18.40

script_18_40.py
# Script 18.1
mass = 3.4
volume = 1.8
density = mass/volume       # Division
density = round(density, 3) # Round to three decimal places
print(density)
script_18_40.out
[ah@hobbes Documents]$ python script_18_40.py 
** First read **
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur.

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.

** Second read **

** Third read **
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur.

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.

[ah@hobbes Documents]$

 

Script 18.41

script_18_41.py
# Script 18.41
path = 'data.txt'           # Location of data in file system
file_obj = open(path, 'r')  # Create file object for reading
for line in file_obj:       # Loop through all lines in file
    print(line)
script_18_41.out
[ah@hobbes Documents]$ python script_18_41.py 
chr        mb_size        proteins

1        249.250        2012

2        243.199        1203

3        198.022        1040

[ah@hobbes Documents]$

 

Script 18.42

script_18_42.py
# Script 18.42
f_name = 'data.txt'
total = 0
with open(f_name) as file_obj:
    file_obj.readline()         # Read first header line; not used
    for line in file_obj:       # Go through each remaining line
        data = line.split()     # Split line at whitespace into a list
        size = float(data[1])   # Make a number from column [1]
        total += size           # Increase total
        print(size,total)
script_18_42.out
[ah@hobbes Documents]$ python script_18_42.py 
(249.25, 249.25)
(243.199, 492.449)
(198.022, 690.471)
[ah@hobbes Documents]$

 

Script 18.43

script_18_43.py
# Script 18.43
import sys                 # Access the sys module
py_script = sys.argv[0]    # 'script_18_43.py'   
data_file = sys.argv[1]    # 'inputFile.txt'

data = open(data_file).read()  # Make file object and read its data

print(data_file)
print(data)
script_18_43.out
[ah@hobbes Documents]$ python script_18_43.py inputFile.txt 
inputFile.txt
  SampleID Gender Weight Age Affected
1       S1   Male   66.3  20     TRUE
2       S2   Male   80.2  21    FALSE
3       S3 Female   53.1  29     TRUE
4       S4 Female   21.3  18     TRUE
5       S5 Female   54.6  NA    FALSE
6       S6 Female   78.1  23     TRUE

[ah@hobbes Documents]$

 

Script 18.44

script_18_44.py
# Script 18.44
from Bio import SeqIO    # Load BioPython module; must be installed
 
file_name = "demoSequences.fasta"     # Location of data
file_obj = open(file_name, "rU")               # Universal read mode

for protein in SeqIO.parse(file_obj, 'fasta'):  # Go through each entry
  print(protein.id)                            # The ID of seq record
  print(protein.seq)                           # The actual sequence

file_obj.close()
script_18_44.out
[ah@hobbes Documents]$ python script_18_44.py 
sp|P52907|CAPZA1_HUMAN
MADFDDRVSDEEKVRIAAKFITHAPPGEFNEVFNDVRLLLNNDNLLREGAAHAFAQYNMDQFTPVKIEGYEDQV
LITEHGDLGNSRFLDPRNKISFKFDHLRKEASDPQPEEADGGLKSWRESCDSALRAYVKDHYSNGFCTVYAKT
IDGQQTIIACIESHQFQPKNFWNGRWRSEWKFTITPPTAQVVGVLKIQVHYYEDGNVQLVSHKDVQDSLTVS
NEAQTAKEFIKIIENAENEYQTAISENYQTMSDTTFKALRRQLPVTRTKIDWNKILSYKIGKEMQNA
sp|Q9Y3C0|CCD53_HUMAN
MDEDGLPLMGSGIDLTKVPAIQQKRTVAFLNQFVVHTVQFLNRFSTVCEEKLADLSLRIQQIETTLNILDAKLS
SIPGLDDVTVEVSPLNVTSVTNGAHPEATSEQPQQNSTQDSGLQESEVSAENILTVAKDPRYARYLKMVQVGV
PVMAIRNKMISEGLDPDLLERPDAPVPDGESEKTVEESSDSESSFSD
1V1A.pdb_A
MLEVVTAGEPLVALVPQEPGHLRGKRLLEVYVGGAEVNVAVALARLGVKVGFVGRVGEDELGAMVEERLRAEGV
DLTHFRRAPGFTGLYLREYLPLGQGRVFYYRKGSAGSALAPGAFDPDYLEGVRFLHLSGITPALSPEARAFSLW
AMEEAKRRGVRVSLDVNYRQTLWSPEEARGFLERALPGVDLLFLSEEEAELLFGRVEEALRALSAPEVVLKRGAK
G-AWAFVDGRRVEGSAFAVEAVDPVGAGDAFAAGYLAGAVWGLPVEERLRLANLLGASVAASRGDHEGAPYRE
DLEVLLK
H.
MGKDNYDVVVIGAVGVDTNVYLPGQDIDFDVEANFTTNIDYPGLAGSYTSRGFARLVDKVAVIAYIGEDHNGW
YVKNELEGDGIDCTFFLDPVGTKRSINFMYKDGRRKNFYDGKASMEVKPDIDICKSILKKTNLAHFNIVNWTRY
LLPVARELGVTISCDIQDITDLDDEYRQDFINYADVLFFSAVNFKDPVPLIKEFIKRKPDRIVICGMGDKGCALG
TEQGIKFYKALDMLEPVVDTNGAGDGLAVGFLSSYFIDGFSLEDSILRGQIVARYTCTKKATSSELITGDKLNKY
FEEMKESG
[ah@hobbes Documents]$

Script 18.45

script_18_45.py
# Script 18.45
import pandas
data_set = pandas.read_csv('in.csv', sep='\t',
                           header=None, names=['A','B','C'])

for col in data_set:            # Loop through column names
  for value in data_set[col]:   # Column names act as keys
    print(col, value)
script_18_45.out
[ah@hobbes Documents]$ python script_18_45.py 
('A', 'chr')
('A', '1')
('A', '2')
('A', '3')
('B', 'mb_size')
('B', '249.250')
('B', '243.199')
('B', '198.022')
('C', 'proteins')
('C', '2012')
('C', '1203')
('C', '1040')
[ah@hobbes Documents]$

 

Script 18.46

script_18_46.py
# Script 18.46
import sqlite3                  # Installed as standard
connection = sqlite3.connect('genome_example.sqlite')

cursor = connection.cursor()    # Make a cursor

make_table_smt = """
CREATE TABLE Genome (
  build_id VARCHAR(12),
  genus TEXT NOT NULL,
  species TEXT NOT NULL,
  strain TEXT,
  num_chromsomes INT,
  mb_length FLOAT,
  pubmed_id VARCHAR(12),
  PRIMARY KEY (build_id));
"""

cursor.execute(make_table_smt)  # Execute SQL string to make table
cursor.close()                  # The cursor is finished
connection.commit()             # Commit the changes to the database
script_18_46.out
[ah@hobbes Documents]$ ls -al genome_example.sqlite 
-rw-r--r--. 1 ah ah 3072 Jan  5 18:19 genome_example.sqlite
[ah@hobbes Documents]$

 

Script 18.47

script_18_47.py
# Script 18.47
import sqlite3                  # Installed as standard
connection = sqlite3.connect('genome_example.sqlite')

cursor = connection.cursor()    # A new cursor into the database

# Create the INSERT statement using "\" to split over two lines
add_genome_smt = 'INSERT INTO Genome (build_id, genus, species, ' \
                 'strain, num_chromsomes) VALUES (?, ?, ?, ?, ?)'

mouse_data = ['GRCm38', 'Mus', 'musculus', 'C57BL/6', 22] # Data to add
cursor.execute(add_genome_smt, mouse_data)                # Add one row

multi_data = [['GRCh38', 'Homo', 'sapiens', None, 25],    # Data to add
              ['GRCh37', 'Homo', 'sapiens', None, 25],
              ['BDGP6', 'Drosophila', 'melanogaster', '2057', 7]]
cursor.executemany(add_genome_smt, multi_data)            # Add many rows

set_size_smt = 'UPDATE Genome SET mb_length=? WHERE build_id=?'
cursor.execute(set_size_smt, [142.573, 'BDGP6'])          # Alter a row
cursor.execute(set_size_smt, [3482.010, 'GRCm38'])

get_organism_smt = 'SELECT genus, species FROM Genome WHERE build_id=?'
genus, species = cursor.execute(get_organism_smt, ['GRCh37']).fetchone()

smt = 'SELECT build_id, genus, species, mb_length FROM Genome'
for build, g, s, sz in cursor.execute(smt):
  print(build, g, s, sz)
   
cursor.close()
script_18_47.out
[ah@hobbes Documents]$ python script_18_47.py 
(u'GRCm38', u'Mus', u'musculus', 3482.01)
(u'GRCh38', u'Homo', u'sapiens', None)
(u'GRCh37', u'Homo', u'sapiens', None)
(u'BDGP6', u'Drosophila', u'melanogaster', 142.573)
[ah@hobbes Documents]$

 

Script 18.48

script_18_48.py
# Script 18.48
from numpy import array       # Import the function to make ndarrays
x = array([[1,2,3],[4,5,6]])  # Make array from list of lists
print(x.shape)                # (2,3) - rows x columns
print(x.size)                 # 6 - elements in total
print(x.ndim)                 # 2 - two axes
print(x.dtype)                # int - whole numbers
y = array(x)                  # Copy an array
y.reshape((3,2))              # [[1,2], [3,4], [5,6]] - Convert to 3 x 2
z = array((3,5,1,2), float)   # Make array forced as floating point
print('y',y)
print('z',z)
script_18_48.out
[ah@hobbes Documents]$ python script_18_48.py 
(2, 3)
6
2
int64
('y', array([[1, 2, 3],
       [4, 5, 6]]))
('z', array([ 3.,  5.,  1.,  2.]))
[ah@hobbes Documents]$

 

Script 18.49

script_18_49.py
# Script 18.49
from numpy import array # Import the function to make ndarrays
x = array([5.1, 6.2, 7.3, 8.4, 9.5]) # Flat array of five float numbers

x[2]         # 7.3                    - number at index 2
x[1:4]       # array([6.2, 7.3, 8.4]) - slice range makes new array
print('x',x)

y = array([[1,2,3], [4,5,6]]) # Two dimensional array

y[1,2]       # 6                       - row one, column two
y[0]         # array([1, 2, 3])        - entire row zero
y[0,:]       # array([1, 2, 3])        - row zero, all columns (same)
y[:,2]       # array([3, 6])           - all rows, column two
y[-1,:]      # array([4, 5, 6])        - last row, all columns
y[:,1:]      # array([[2, 3],[5, 6]])  - column one onwards
y[1,0:2]     # array([4, 5])           - row one, first two columns
y[::-1,:]    # array([[4, 5, 6],[1, 2, 3]]) - reversed rows
y[:,(2,1,0)] # array([[3, 1, 2],[6, 4, 5]]) - new column order
print('y',y)
script_18_49.out
[ah@hobbes Documents]$ python script_18_49.py 
('x', array([ 5.1,  6.2,  7.3,  8.4,  9.5]))
('y', array([[1, 2, 3],
       [4, 5, 6]]))
[ah@hobbes Documents]$

 

Script 18.50

script_18_50.py
# Script 18.50
import numpy as np

a = np.zeros((2,3))          # 2 x 3 array full of 0.0
b = np.ones((3,2), int)      # 3 x 2 array full of 1
c = np.full((2,2), 7.0)      # 2 x 2 array full of 7.0
d = np.identity(3)           # 3 x 3 identity (1 on diagonal 0 elsewhere)
e = np.arange(1.0, 3.0, 0.5) # [1.0, 1.5, 2.0, 2.5]
                             # From 1.0 up to <3.0 in steps of 0.5
print('a',a)
print('b',b)
print('c',c)
print('d',d)
print('e',e)

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 4.0, 5.0])
x + 2    # array([3.0, 4.0, 5.0]) - Add 2 to all values
y / 2    # array([1.5, 2.0, 2.5]) - Divide all values by 2

x + y    # array([4.0, 6.0, 8.0]) - i.e. 1+3, 2+4, 3+5
x * y    # array([3.0, 8.0, 15.0])    
x - y    # array([-2.0, -2.0, -2.0])
x / y    # array([0.33333333, 0.5, 0.6])

angles  = np.array([30.0, 60.0, 90.0, 135.0])
radians = np.radians(angles)
cosines = np.cos(radians)     # array([0.866, 0.50, 0.0, -0.707])
print('angles',angles)
print('radians',radians)
print('cosines',cosines)

np.log([10.0, 2.71828, 1.0])  # array([2.302585, 1.0, 0.0])
np.exp([2.302585, 1.0, 0.0])  # array([10.0, 2.71828, 1.0])
script_18_50.out
[ah@hobbes Documents]$ python script_18_50.py 
('a', array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]]))
('b', array([[1, 1],
       [1, 1],
       [1, 1]]))
('c', array([[ 7.,  7.],
       [ 7.,  7.]]))
('d', array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]]))
('e', array([ 1. ,  1.5,  2. ,  2.5]))
('angles', array([  30.,   60.,   90.,  135.]))
('radians', array([ 0.52359878,  1.04719755,  1.57079633,  2.35619449]))
('cosines', array([  8.66025404e-01,   5.00000000e-01,   6.12323400e-17,
        -7.07106781e-01]))
[ah@hobbes Documents]$

 

Script 18.51

script_18_51.py
# Script 18.51
from numpy import arange, exp, fft, random, pi

l = 0.04
w = 0.1
t = arange(0.0, 100.0, 1.0)       # A range of time values
x = exp(2j*pi*w*t) * exp(-l*t)    # Wave equation using complex numbers
y = fft.fft(x)                    # Fast Fourier transform array of wave

a = random.uniform(0.0, 5.0, 100) # Uniform sample 100 values in [0, 5)
b = random.normal(0.0, 2.5, 100)  # Sample normal distribution
                                  # Mean 0.0, standard deviation 2.5
print(x[0],y[0])
print(a[0],b[0])
script_18_51.out
[ah@hobbes Documents]$ python script_18_51.py 
((1+0j), (0.59324391476340854+1.5043545361088564j))
(1.7221136377293216, 3.4608000427150243)
[ah@hobbes Documents]$

 

Script 18.52

script_18_52.py
# Script 18.52
from matplotlib import pyplot
from numpy import arange, random
 
x_vals = arange(1000)
y_vals = random.poisson(5, 1000)
pyplot.scatter(x_vals, y_vals, s=4, marker='*') # Make scatter plot
pyplot.savefig("script_18_52.png", dpi=300)     # Save graph as an image
pyplot.show()                                   # Show on screen
script_18_52.out
[ah@hobbes Documents]$ python script_18_52.py 
[ah@hobbes Documents]$ ls -al script_18_52.png
-rw-rw-r--. 1 ah ah 78409 Jan  5 22:25 script_18_52.png
[ah@hobbes Documents]$
script_18_52.png
script_18_52.png

 

Script 18.53

script_18_53.py
# Script 18.53
from scipy import ndimage
img_file = 'my_image.png'
pixmap = ndimage.imread(img_file)     # Read image data as array

height, width, channels = pixmap.shape           
red_channel   = pixmap[:,:,0]         # Color channels are last axis
green_channel = pixmap[:,:,1]
print(red_channel)
script_18_53.out
[ah@hobbes Documents]$ python script_18_53.py 
[[ 0  0  0 ...,  0  0  0]
 [ 0  0  0 ...,  0  0  0]
 [ 3  2 14 ...,  8 17 23]
 ...,
 [18 18  7 ..., 18  6  1]
 [10  9  9 ..., 19  9  7]
 [11  1  1 ...,  8  1  3]]
[ah@hobbes Documents]$

 

Script 18.54

script_18_54.py
# Script 18.54
def rev_complement(seq):       # Define function name and input argument
  rev_map = {'A':'T','T':'A',  # DNA letter mapping dictionary
             'G':'C','C':'G'}
  rc_seq = ''                  # Rev. comp. string is initially empty
  rev_comp = ''

  for x in seq:                # Go through each seq letter in turn
    y = rev_map[x]             # look up comp. letter in dictionary
    rev_comp = y + rev_comp    # Rev. comp. string grows
                               # New letter added to _beginning_
 
  return rev_comp              # Rev. comp string sent as output

f_seq = 'AGCATAAGAATAGCAGCAGCGCGA'  # A test DNA one-letter sequence
r_seq = rev_complement(f_seq)       # Run function on test sequence
print(f_seq, r_seq)                 # Print original and rev. complement
script_18_54.out
[ah@hobbes Documents]$ python script_18_54.py 
('AGCATAAGAATAGCAGCAGCGCGA', 'TCGCGCTGCTGCTATTCTTATGCT')
[ah@hobbes Documents]$

 

Script 18.55

script_18_55.py
# Script 18.55
def calc_seq_ident(seq_a, seq_b):  # Define func. taking two input args
  n = min(len(seq_a), len(seq_b))  # Find minimum input seq. length
  count = 0.0                      # Starting identity count is zero

  for i in range(n):          # Loop through valid position indices
    a = seq_a[i]              # Letter at index i for first seq
    b = seq_b[i]              # Letter at index i for second seq
    if a == b and a != '-':   # Test if letters are same and not a gap
      count += 1.0            # .. if they are increase count by one

  return 100.0 * count/n      # Calc. and send back identity as % total
 
seq1 = 'ALIGDPVENTS'
seq2 = 'ALIGN-MENTS'
x = calc_seq_ident(seq1, seq2) # Run on some test sequences

print('Seq identity:', x, '%') # x is 72.7
script_18_55.out
[ah@hobbes Documents]$ python script_18_55.py 
('Seq identity:', 72.72727272727273, '%')
[ah@hobbes Documents]$

 

Script 18.56

script_18_56.py
# Script 18.56
def download_db_id(db_id, url, file_name=None): # Save file is optional

  #from urllib.request import urlopen    # python 3.0 Import urlopen() function  
  from urllib2 import urlopen            # python 2.7 Import urlopen() function  
  response = urlopen(url.format(db_id))  # Makes object to handle link
  data = response.read()                 # Read data from URL
  data = data.decode('utf-8')            # Interpret data as plain text
 
  if file_name:                       # If save file was specified...
    file_obj = open(file_name, 'w')   # Create file object for writing
    file_obj.write(data)              # Save all data to file
    file_obj.close()

  return data                         # Hand back data from function

UP_FASTA_URL = 'http://www.uniprot.org/uniprot/{}.fasta'
PDB_URL = 'http://www.rcsb.org/pdb/cgi/export.cgi/' \
          '{}.pdb?format=PDB&compression=None'

download_db_id('1AFO', PDB_URL, '1AF0.pdb')   # Save .pdb file

data = download_db_id('P02788', UP_FASTA_URL) # Get FASTA data, no save
print(data)                                   # Show downloaded data
script_18_56.out
[ah@hobbes Documents]$ python script_18_56.py 
>sp|P02788|TRFL_HUMAN Lactotransferrin OS=Homo sapiens
GN=LTF PE=1 SV=6
MKLVFLVLLFLGALGLCLAGRRRSVQWCAVSQPEATKCFQWQRNMRK
VRGPPVSCIKRDSPIQCIQAIAENRADAVTLDGGFIYEAGLAPYKLRPV
AAEVYGTERQPRTHYYAVAVVKKGGSFQLNELQGLKSCHTGLRRTAGW
NVPIGTLRPFLNWTGPPEPIEAAVARFFSASCVPGADKGQFPNLCRLCA
GTGENKCAFSSQEPYFSYSGAFKCLRDGAGDVAFIRESTVFEDLSDEAER
DEYELLCPDNTRKPVDKFKDCHLARVPSHAVVARSVNGKEDAIWNLLRQ
AQEKFGKDKSPKFQLFGSPSGQKDLLFKDSAIGFSRVPPRIDSGLYLGS
GYFTAIQNLRKSEEEVAARRARVVWCAVGEQELRKCNQWSGLSEGSVT
CSSASTTEDCIALVLKGEADAMSLDGGYVYTAGKCGLVPVLAENYKSQQ
SSDPDPNCVDRPVEGYLAVAVVRRSDTSLTWNSVKGKKSCHTAVDRTA
GWNIPMGLLFNQTGSCKFDEYFSQSCAPGSDPRSNLCALCIGDEQGEN
KCVPNSNERYYGYTGAFRCLAENAGDVAFVKDVTVLQNTDGNNNEAWA
KDLKLADFALLCLDGKRKPVTEARSCHLAMAPNHAVVSRMDKVERLKQ
VLLHQQAKFGRNGSDCPDKFCLFQSETKNLLFNDNTECLARLHGKTTYE
KYLGPQYVAGITNLKKCSTSPLLEACEFLRK

[ah@hobbes Documents]$ ls -al 1AF0.pdb
-rw-rw-r--. 1 ah ah 2182302 Jan  5 22:39 1AF0.pdb
[ah@hobbes Documents]$

 

Script 18.57

script_18_57.py
# Script 18.57

def download_db_id(db_id, url, file_name=None): # Save file is optional

  #from urllib.request import urlopen   
# python 3.0 Import urlopen() function  
  from urllib2 import urlopen           
# python 2.7 Import urlopen() function  
  response = urlopen(url.format(db_id)) 
# Makes object to handle link
  data = response.read()                
# Read data from URL
  data = data.decode('utf-8')           
# Interpret data as plain text
 
  if file_name:                       # If save file was specified...
    file_obj = open(file_name, 'w')   # Create file object for writing
    file_obj.write(data)              # Save all data to file
    file_obj.close()

  return data                         # Hand back data from function

def get_uniprot_seq_record(db_id):

   import os
   from Bio import SeqIO
   UNIPROT_FASTA_URL = 'http://www.uniprot.org/uniprot/{}.fasta'
   file_name = db_id + '.fasta' 
# Add extension to ID to make file name
   
   if not os.path.exists(file_name):
     download_db_id(db_id, UNIPROT_FASTA_URL, file_name)
   
   file_obj = open(file_name)    # Open file for reading
   
   for seq_record in SeqIO.parse(file_obj, 'fasta'):
     return seq_record           # Give back first record encountered

sr = get_uniprot_seq_record('P18754')
print(sr.id, sr.seq)
script_18_57.out
[ah@hobbes Documents]$ python script_18_57.py 
('sp|P18754|RCC1_HUMAN', Seq('MSPKRIAKRRSPPADAIPKSKKVK
VSHRSHSTEPGLVLTLGQGDVGQLGLGENV...EQS',
SingleLetterAlphabet()))
[ah@hobbes Documents]$ ls -al P18754.fasta
-rw-rw-r--. 1 ah ah 522 Jan  5 22:48 P18754.fasta
[ah@hobbes Documents]$

 

Script 18.58

script_18_58.py
# Script 18.58

def download_db_id(db_id, url, file_name=None): # Save file is optional

  #from urllib.request import urlopen   
# python 3.0 Import urlopen() function  
  from urllib2 import urlopen           
# python 2.7 Import urlopen() function  
  response = urlopen(url.format(db_id))  # Makes object to handle link
  data = response.read()                 # Read data from URL
  data = data.decode('utf-8')            # Interpret data as plain text
 
  if file_name:                       # If save file was specified...
    file_obj = open(file_name, 'w')   # Create file object for writing
    file_obj.write(data)              # Save all data to file
    file_obj.close()

  return data                         # Hand back data from function

def get_uniprot_seq_record(db_id):

   import os
   from Bio import SeqIO
   UNIPROT_FASTA_URL = 'http://www.uniprot.org/uniprot/{}.fasta'
   file_name = db_id + '.fasta'  # Add extension to ID to make file name
   
   if not os.path.exists(file_name):
     download_db_id(db_id, UNIPROT_FASTA_URL, file_name)
   
   file_obj = open(file_name)    # Open file for reading
   
   for seq_record in SeqIO.parse(file_obj, 'fasta'):
     return seq_record           # Give back first record encountered

def blast_search(seq, database, program,          # Mandatory input
                 e_cutoff=1e-10, cpu_cores=None): # Optional parameters
 
  from subprocess import Popen, PIPE # imports for running external jobs
   
  if not cpu_cores:                  # CPU count parameter not specified
    import multiprocessing   
    cpu_cores = multiprocessing.cpu_count() 
# Use all avail. CPU cores
 
  # Make a list of all options and params to run BLAST
  # Output format 6 means table output
  # Numeric params. are converted to strings with str()  
  cmd_args = [program, '-outfmt', '6', '-db', database,
              '-num_threads', str(cpu_cores),
              '-evalue', str(e_cutoff)]  
 
  try:      # Try to create a process to run BLAST, but catch any error
    proc = Popen(cmd_args, stdin=PIPE, stdout=PIPE)
    
  except Exception as err:         # Something went wrong
    print('BLAST command failed')
    print('Command used: "%s"' % ' '.join(cmd_args))
    print(err)
    return []     # Send back an empty list if BLAST failed
 
  template = '>UserQuery\n{}\n'          # FASTA format string template
  in_data = template.format(seq.upper()) # Insert the seq in template
  in_data.encode('utf-8')                # Removes any unicode chars

  out_data, err_data = proc.communicate(in_data) # Send input get output
 
  results_str = out_data.decode('utf-8')    # Interpret output as text
 
  results = []                              # Start with empty results
  for line in results_str.split('\n'):      # Split output into lines
    data = line.split()                     # Split line into sub-list
    
    if data:                                # If line was not blank
      for col in range(3,10):              
        data[col] = int(data[col])          # Some columns are integers

      for col in (2,10,11):  
        data[col] = float(data[col])        # Other columns are float
 
      results.append(data)                  # Add row of data to results
 
  return results                            # Send back results

db = '/home/ah/database/blast/nrfilt'       # BLAST database: must exist
sr = get_uniprot_seq_record('P18754')
print(sr.seq)
blast_hits = blast_search(sr.seq, db, 'blastp') # Run on seq used before

for hit in blast_hits:              # Go through each row in output list
  print(hit[1], hit[10],  hit[11])  # Print some data columns
script_18_58.out
[ah@hobbes Documents]$ python script_18_58.py 
MSPKRIAKRRSPPADAIPKSKKVKVSHRSHSTEPGLVLTLGQGDVGQLGLGENV
MERKKPALVSIPEDVVQAEAGGMHTVCLSKSGQVYSFGCNDEGALGRDTSVEG
SEMVPGKVELQEKVVQVSAGDSHTAALTDDGRVFLWGSFRDNNGVIGLLEPMK
KSMVPVQVQLDVPVVKVASGNDHLVMLTADGDLYTLGCGEQGQLGRVPELFAN
RGGRQGLERLLVPKCVMLKSRGSRGHVRFQDAFCGAYFTFAISHEGHVYGFGL
SNYHQLGTPGTESCFIPQNLTSFKNSTKSWVGFSGGQHHTVCMDSEGKAYSL
GRAEYGRLGLGEGAEEKSIPTLISRLPAVSSVACGASVGYAVTKDGRVFAWGM
GTNYQLGTGQDEDAWSPVEMMGKQLENRVVLSVSSGGQHTVLLVKDKEQS
(u'gi|4502801|ref|NP_001260.1|', 0.0, 863.0)
(u'gi|327199310|ref|NP_001192134.1|', 0.0, 861.0)
(u'gi|119628100|gb|EAX07695.1|', 0.0, 860.0)
(u'gi|397515820|ref|XP_003828141.1|', 0.0, 858.0)
(u'gi|402853628|ref|XP_003891494.1|', 0.0, 856.0)
(u'gi|119628097|gb|EAX07692.1|', 0.0, 854.0)
(u'gi|114796646|ref|NP_001041660.1|', 0.0, 853.0)
(u'gi|402853626|ref|XP_003891493.1|', 0.0, 853.0)
(u'gi|544407822|ref|XP_005544289.1|', 0.0, 852.0)
(u'gi|119628099|gb|EAX07694.1|', 0.0, 850.0)
(u'gi|397515824|ref|XP_003828143.1|', 0.0, 849.0)
(u'gi|4389390|pdb|1A12|A', 0.0, 847.0)
(u'gi|114796644|ref|NP_001041659.1|', 0.0, 846.0)
(u'gi|332245213|ref|XP_003271757.1|', 0.0, 844.0)
(u'gi|119628095|gb|EAX07690.1|', 0.0, 844.0)
(u'gi|402853634|ref|XP_003891497.1|', 0.0, 844.0)
(u'gi|327199318|ref|NP_001192137.1|', 0.0, 843.0)
(u'gi|544407820|ref|XP_005544288.1|', 0.0, 843.0)
(u'gi|397515822|ref|XP_003828142.1|', 0.0, 843.0)
(u'gi|327199324|ref|NP_001192140.1|', 0.0, 843.0)
(u'gi|478503167|ref|XP_004425891.1|', 0.0, 842.0)
(u'gi|410966597|ref|XP_003989817.1|', 0.0, 840.0)
(u'gi|507575528|ref|XP_004671309.1|', 0.0, 838.0)
(u'gi|560898119|ref|XP_006175513.1|', 0.0, 837.0)
(u'gi|544407818|ref|XP_005544287.1|', 0.0, 837.0)
(u'gi|560948741|ref|XP_006196937.1|', 0.0, 837.0)
(u'gi|545490248|ref|XP_005617300.1|', 0.0, 837.0)
(u'gi|511832853|ref|XP_004741077.1|', 0.0, 837.0)
(u'gi|402853632|ref|XP_003891496.1|', 0.0, 837.0)
(u'gi|327315350|ref|NP_001192141.1|', 0.0, 836.0)
(u'gi|472350030|ref|XP_004394708.1|', 0.0, 835.0)
(u'gi|301755120|ref|XP_002913415.1|', 0.0, 835.0)
(u'gi|327315353|ref|NP_001192142.1|', 0.0, 833.0)
(u'gi|355764243|gb|EHH62271.1|', 0.0, 833.0)
(u'gi|545834593|ref|XP_005665197.1|', 0.0, 832.0)
(u'gi|403308348|ref|XP_003944627.1|', 0.0, 832.0)
(u'gi|478503165|ref|XP_004425890.1|', 0.0, 831.0)
(u'gi|478503163|ref|XP_004425889.1|', 0.0, 830.0)
(u'gi|548455340|ref|XP_005676827.1|', 0.0, 830.0)
(u'gi|545207509|ref|XP_005607066.1|', 0.0, 829.0)
(u'gi|426221838|ref|XP_004005113.1|', 0.0, 828.0)
(u'gi|157427942|ref|NP_001098878.1|', 0.0, 827.0)
(u'gi|545490246|ref|XP_005617299.1|', 0.0, 827.0)
(u'gi|560898117|ref|XP_006175512.1|', 0.0, 826.0)
(u'gi|471378580|ref|XP_004377091.1|', 0.0, 826.0)
(u'gi|465990666|ref|XP_004266634.1|', 0.0, 826.0)
(u'gi|348571060|ref|XP_003471314.1|', 0.0, 826.0)
(u'gi|560948739|ref|XP_006196936.1|', 0.0, 825.0)
(u'gi|395856826|ref|XP_003800819.1|', 0.0, 825.0)
(u'gi|507935083|ref|XP_004678748.1|', 0.0, 824.0)
(u'gi|478503171|ref|XP_004425893.1|', 0.0, 824.0)
(u'gi|410966599|ref|XP_003989818.1|', 0.0, 824.0)
(u'gi|545207507|ref|XP_005607065.1|', 0.0, 824.0)
(u'gi|472350028|ref|XP_004394707.1|', 0.0, 824.0)
(u'gi|504164105|ref|XP_004592234.1|', 0.0, 823.0)
(u'gi|507694534|ref|XP_004642946.1|', 0.0, 823.0)
(u'gi|344287488|ref|XP_003415485.1|', 0.0, 823.0)
(u'gi|14278207|pdb|1I2M|B', 0.0, 822.0)
(u'gi|560898115|ref|XP_006175511.1|', 0.0, 821.0)
(u'gi|511832847|ref|XP_004741074.1|', 0.0, 821.0)
(u'gi|511832845|ref|XP_004741073.1|', 0.0, 821.0)
(u'gi|132171|sp|P23800.1|RCC1_MESAU', 0.0, 820.0)
(u'gi|560948737|ref|XP_006196935.1|', 0.0, 820.0)
(u'gi|472350026|ref|XP_004394706.1|', 0.0, 820.0)
(u'gi|335290816|ref|XP_003356291.1|', 0.0, 818.0)
(u'gi|533162396|ref|XP_005395013.1|', 0.0, 818.0)
(u'gi|548455338|ref|XP_005676826.1|', 0.0, 817.0)
(u'gi|507655597|ref|XP_004705111.1|', 0.0, 816.0)
(u'gi|426221844|ref|XP_004005116.1|', 0.0, 816.0)
(u'gi|512815930|ref|XP_004878182.1|', 0.0, 816.0)
(u'gi|431891177|gb|ELK02054.1|', 0.0, 815.0)
(u'gi|344245054|gb|EGW01158.1|', 0.0, 815.0)
(u'gi|537237358|gb|ERE85933.1|', 0.0, 815.0)
(u'gi|488527862|ref|XP_004455742.1|', 0.0, 815.0)
(u'gi|528941068|ref|XP_005203264.1|', 0.0, 815.0)
(u'gi|507935080|ref|XP_004678747.1|', 0.0, 815.0)
(u'gi|354472395|ref|XP_003498425.1|', 0.0, 815.0)
(u'gi|505780544|ref|XP_004603574.1|', 0.0, 814.0)
(u'gi|148698161|gb|EDL30108.1|', 0.0, 814.0)
(u'gi|528768277|gb|EPY87936.1|', 0.0, 814.0)
(u'gi|514463590|ref|XP_005004313.1|', 0.0, 813.0)
(u'gi|471378578|ref|XP_004377090.1|', 0.0, 813.0)
(u'gi|465990664|ref|XP_004266633.1|', 0.0, 813.0)
(u'gi|19527092|ref|NP_598639.1|', 0.0, 813.0)
(u'gi|556729262|ref|XP_005960349.1|', 0.0, 812.0)
(u'gi|148698159|gb|EDL30106.1|', 0.0, 812.0)
(u'gi|504164103|ref|XP_004592233.1|', 0.0, 812.0)
(u'gi|74213820|dbj|BAE29345.1|', 0.0, 812.0)
(u'gi|344287492|ref|XP_003415487.1|', 0.0, 812.0)
(u'gi|532063752|ref|XP_005317746.1|', 0.0, 811.0)
(u'gi|74139909|dbj|BAE31793.1|', 0.0, 811.0)
(u'gi|555967883|ref|XP_005895986.1|', 0.0, 811.0)
(u'gi|532063754|ref|XP_005317747.1|', 0.0, 811.0)
(u'gi|507694530|ref|XP_004642945.1|', 0.0, 810.0)
(u'gi|426221842|ref|XP_004005115.1|', 0.0, 810.0)
(u'gi|189491881|ref|NP_001121661.1|', 0.0, 810.0)
(u'gi|471378576|ref|XP_004377089.1|', 0.0, 809.0)
(u'gi|395856828|ref|XP_003800820.1|', 0.0, 809.0)
(u'gi|528941060|ref|XP_005203260.1|', 0.0, 809.0)
(u'gi|507935078|ref|XP_004678746.1|', 0.0, 808.0)
(u'gi|507935075|ref|XP_004678745.1|', 0.0, 807.0)
(u'gi|471378574|ref|XP_004377088.1|', 0.0, 807.0)
(u'gi|465990661|ref|XP_004266632.1|', 0.0, 807.0)
(u'gi|504164100|ref|XP_004592232.1|', 0.0, 806.0)
(u'gi|344287490|ref|XP_003415486.1|', 0.0, 806.0)
(u'gi|533162394|ref|XP_005395012.1|', 0.0, 805.0)
(u'gi|351695932|gb|EHA98850.1|', 0.0, 805.0)
(u'gi|354472397|ref|XP_003498426.1|', 0.0, 805.0)
(u'gi|308153287|ref|NP_001184011.1|', 0.0, 804.0)
(u'gi|532063756|ref|XP_005317748.1|', 0.0, 804.0)
(u'gi|512815922|ref|XP_004878180.1|', 0.0, 804.0)
(u'gi|532063750|ref|XP_005317745.1|', 0.0, 803.0)
(u'gi|488527860|ref|XP_004455741.1|', 0.0, 803.0)
(u'gi|512815918|ref|XP_004878179.1|', 0.0, 803.0)
(u'gi|512815914|ref|XP_004878178.1|', 0.0, 803.0)
(u'gi|148698160|gb|EDL30107.1|', 0.0, 803.0)
(u'gi|505780540|ref|XP_004603573.1|', 0.0, 802.0)
(u'gi|354472399|ref|XP_003498427.1|', 0.0, 798.0)
(u'gi|488527856|ref|XP_004455739.1|', 0.0, 797.0)
(u'gi|505780537|ref|XP_004603572.1|', 0.0, 797.0)
(u'gi|505780532|ref|XP_004603571.1|', 0.0, 796.0)
(u'gi|558163174|ref|XP_006097311.1|', 0.0, 795.0)
(u'gi|532018884|ref|XP_005353185.1|', 0.0, 794.0)
(u'gi|545490250|ref|XP_005617301.1|', 0.0, 791.0)
(u'gi|281351636|gb|EFB27220.1|', 0.0, 791.0)
(u'gi|488527858|ref|XP_004455740.1|', 0.0, 779.0)
(u'gi|558163170|ref|XP_006097310.1|', 0.0, 776.0)
(u'gi|488527854|ref|XP_004455738.1|', 0.0, 772.0)
(u'gi|521031443|gb|EPQ13229.1|', 0.0, 767.0)
(u'gi|354475663|ref|XP_003500047.1|', 0.0, 761.0)
(u'gi|395730887|ref|XP_003780622.1|', 0.0, 754.0)
(u'gi|470628969|ref|XP_004321307.1|', 0.0, 741.0)
(u'gi|395521930|ref|XP_003765067.1|', 0.0, 740.0)
(u'gi|327199320|ref|NP_001192138.1|', 0.0, 736.0)
(u'gi|545834597|ref|XP_005665199.1|', 0.0, 729.0)
(u'gi|533162398|ref|XP_005395014.1|', 0.0, 729.0)
(u'gi|558153057|ref|XP_006121129.1|', 0.0, 721.0)
(u'gi|558153052|ref|XP_006121128.1|', 0.0, 721.0)
(u'gi|488527864|ref|XP_004455743.1|', 0.0, 718.0)
(u'gi|465966594|gb|EMP31100.1|', 0.0, 715.0)
(u'gi|557317496|ref|XP_006031961.1|', 0.0, 709.0)
(u'gi|529437161|ref|XP_005238344.1|', 0.0, 706.0)
(u'gi|326932912|ref|XP_003212555.1|', 0.0, 696.0)
(u'gi|525022195|ref|XP_005058591.1|', 0.0, 695.0)
(u'gi|525022197|ref|XP_005058592.1|', 0.0, 694.0)
(u'gi|363742391|ref|XP_427366.3|', 0.0, 694.0)
(u'gi|524959499|ref|XP_005079698.1|', 0.0, 693.0)
(u'gi|543277880|ref|XP_005427132.1|', 0.0, 691.0)
(u'gi|543372998|ref|XP_005529718.1|', 0.0, 689.0)
(u'gi|542181352|ref|XP_005496408.1|', 0.0, 687.0)
(u'gi|327315362|ref|NP_001192146.1|', 0.0, 684.0)
(u'gi|543718943|ref|XP_005500705.1|', 0.0, 683.0)
(u'gi|62859419|ref|NP_001016101.1|', 0.0, 664.0)
(u'gi|115528650|gb|AAI24849.1|', 0.0, 664.0)
(u'gi|171846900|gb|AAI61455.1|', 0.0, 663.0)
(u'gi|47682592|gb|AAH70867.1|', 0.0, 662.0)
(u'gi|48735020|gb|AAH72030.1|', 0.0, 661.0)
(u'gi|147900380|ref|NP_001085384.1|', 0.0, 660.0)
(u'gi|2134146|pir||I51558', 0.0, 658.0)
(u'gi|148227322|ref|NP_001081229.1|', 0.0, 657.0)
(u'gi|55742589|ref|NP_998343.1|', 0.0, 637.0)
(u'gi|291190112|ref|NP_001167184.1|', 0.0, 629.0)
(u'gi|348529242|ref|XP_003452123.1|', 0.0, 620.0)
(u'gi|542238103|ref|XP_005456766.1|', 0.0, 615.0)
(u'gi|499018444|ref|XP_004560016.1|', 0.0, 610.0)
(u'gi|530584815|ref|XP_005286204.1|', 0.0, 609.0)
(u'gi|548435391|ref|XP_005744531.1|', 0.0, 607.0)
(u'gi|554870555|ref|XP_005945248.1|', 0.0, 607.0)
(u'gi|551493873|ref|XP_005798503.1|', 0.0, 604.0)
(u'gi|410905257|ref|XP_003966108.1|', 0.0, 604.0)
(u'gi|556961668|ref|XP_005991115.1|', 0.0, 596.0)
(u'gi|47220834|emb|CAG00041.1|', 0.0, 586.0)
(u'gi|537236894|gb|ERE85794.1|', 0.0, 582.0)
(u'gi|527248394|ref|XP_005142516.1|', 0.0, 573.0)
(u'gi|426328672|ref|XP_004025374.1|', 0.0, 566.0)
(u'gi|327289459|ref|XP_003229442.1|', 0.0, 563.0)
(u'gi|297282712|ref|XP_002808329.1|', 0.0, 558.0)
(u'gi|512815934|ref|XP_004878183.1|', 0.0, 558.0)
(u'gi|355562421|gb|EHH19015.1|', 1e-159, 466.0)
(u'gi|556103713|gb|ESO92365.1|', 6e-148, 438.0)
(u'gi|524898165|ref|XP_005105535.1|', 3e-147, 437.0)
(u'gi|260831025|ref|XP_002610460.1|', 2e-142, 432.0)
(u'gi|521035504|gb|EPQ17286.1|', 8e-142, 420.0)
(u'gi|224167221|ref|XP_002198843.1|', 7e-141, 414.0)
(u'gi|449267782|gb|EMC78684.1|', 4e-140, 412.0)
(u'gi|354475661|ref|XP_003500046.1|', 2e-139, 409.0)
(u'gi|432951688|ref|XP_004084886.1|', 9e-139, 409.0)
(u'gi|432951688|ref|XP_004084886.1|', 2e-14, 81.6)
(u'gi|146332865|gb|ABQ22938.1|', 3e-136, 400.0)
(u'gi|390345132|ref|XP_001186731.2|', 4e-136, 408.0)
(u'gi|405976795|gb|EKC41279.1|', 2e-135, 407.0)
(u'gi|443686928|gb|ELT90046.1|', 1e-133, 403.0)
(u'gi|321478568|gb|EFX89525.1|', 1e-132, 399.0)
(u'gi|345492280|ref|XP_001603331.2|', 2e-126, 383.0)
(u'gi|327315376|ref|NP_001192151.1|', 3e-126, 385.0)
(u'gi|221108961|ref|XP_002157481.1|', 1e-122, 374.0)
(u'gi|241750184|ref|XP_002405888.1|', 3e-122, 372.0)
(u'gi|332031143|gb|EGI70720.1|', 5e-122, 372.0)
(u'gi|322786159|gb|EFZ12764.1|', 7e-122, 374.0)
(u'gi|307173133|gb|EFN64233.1|', 2e-121, 373.0)
(u'gi|383865068|ref|XP_003707997.1|', 2e-120, 370.0)
(u'gi|312374979|gb|EFR22435.1|', 8e-120, 367.0)
(u'gi|327315378|ref|NP_001192152.1|', 9e-119, 364.0)
(u'gi|380022175|ref|XP_003694928.1|', 1e-118, 363.0)
(u'gi|156361074|ref|XP_001625345.1|', 2e-118, 363.0)
(u'gi|402880198|ref|XP_003903699.1|', 8e-118, 357.0)
(u'gi|402880198|ref|XP_003903699.1|', 3e-64, 219.0)
(u'gi|328792337|ref|XP_003251711.1|', 2e-117, 362.0)
(u'gi|350419275|ref|XP_003492127.1|', 2e-115, 357.0)
(u'gi|340709117|ref|XP_003393160.1|', 9e-115, 355.0)
(u'gi|478250372|gb|ENN70867.1|', 3e-114, 352.0)
(u'gi|189235771|ref|XP_970173.2|', 2e-112, 348.0)
(u'gi|260795414|ref|XP_002592700.1|', 6e-112, 345.0)
(u'gi|196008365|ref|XP_002114048.1|', 2e-109, 340.0)
(u'gi|498964755|ref|XP_004525461.1|', 2e-109, 342.0)
(u'gi|307213459|gb|EFN88881.1|', 2e-109, 338.0)
(u'gi|170053955|ref|XP_001862909.1|', 8e-109, 340.0)
(u'gi|157116974|ref|XP_001652914.1|', 5e-108, 339.0)
(u'gi|347970520|ref|XP_310274.7|', 2e-107, 338.0)
(u'gi|555703651|gb|ESO06884.1|', 3e-105, 328.0)
(u'gi|557782445|ref|XP_005190835.1|', 9e-105, 331.0)
(u'gi|313213144|emb|CBY37001.1|', 2e-102, 326.0)
(u'gi|347970522|ref|XP_003436591.1|', 6e-102, 322.0)
(u'gi|512893358|ref|XP_004923161.1|', 7e-100, 319.0)
(u'gi|195020061|ref|XP_001985112.1|', 4e-98, 313.0)
(u'gi|195127239|ref|XP_002008076.1|', 8e-98, 312.0)
(u'gi|195376815|ref|XP_002047188.1|', 1e-97, 312.0)
(u'gi|195492131|ref|XP_002093858.1|', 9e-96, 308.0)
(u'gi|194867327|ref|XP_001972047.1|', 1e-95, 307.0)
(u'gi|304445931|pdb|3MVD|K', 1e-95, 304.0)
(u'gi|194750136|ref|XP_001957486.1|', 1e-95, 306.0)
(u'gi|195337773|ref|XP_002035500.1|', 1e-94, 305.0)
(u'gi|195588086|ref|XP_002083789.1|', 2e-94, 306.0)
(u'gi|7660|emb|CAA41417.1|', 2e-94, 305.0)
(u'gi|18859625|ref|NP_523943.1|', 2e-94, 305.0)
(u'gi|340370015|ref|XP_003383542.1|', 2e-94, 301.0)
(u'gi|195431549|ref|XP_002063799.1|', 2e-93, 300.0)
(u'gi|195428731|ref|XP_002062419.1|', 7e-93, 299.0)
(u'gi|555937498|emb|CDJ14619.1|', 2e-92, 303.0)
(u'gi|344242290|gb|EGV98393.1|', 5e-92, 286.0)
(u'gi|198455071|ref|XP_002138000.1|', 8e-90, 291.0)
(u'gi|195157504|ref|XP_002019636.1|', 1e-89, 290.0)
(u'gi|195174428|ref|XP_002027976.1|', 2e-89, 291.0)
(u'gi|556516705|emb|CDJ19951.1|', 3e-89, 295.0)
(u'gi|125977322|ref|XP_001352694.1|', 3e-89, 291.0)
(u'gi|357626115|gb|EHJ76321.1|', 9e-89, 291.0)
(u'gi|167534172|ref|XP_001748764.1|', 6e-85, 278.0)
(u'gi|555689788|gb|ESN93020.1|', 4e-83, 271.0)
(u'gi|242021219|ref|XP_002431043.1|', 2e-82, 268.0)
(u'gi|391339176|ref|XP_003743928.1|', 1e-78, 267.0)
(u'gi|358337819|dbj|GAA56144.1|', 2e-78, 270.0)
(u'gi|426328674|ref|XP_004025375.1|', 3e-78, 250.0)
(u'gi|470305350|ref|XP_004348307.1|', 4e-78, 261.0)
(u'gi|195174458|ref|XP_002027991.1|', 3e-74, 248.0)
(u'gi|391347935|ref|XP_003748209.1|', 2e-71, 241.0)
(u'gi|328766889|gb|EGF76941.1|', 1e-66, 230.0)
(u'gi|387598185|gb|AFJ91748.1|', 1e-65, 219.0)
(u'gi|545369609|ref|XP_005649542.1|', 4e-65, 233.0)
(u'gi|511003208|gb|EPB84619.1|', 2e-63, 221.0)
(u'gi|198463144|ref|XP_002135445.1|', 4e-63, 217.0)
(u'gi|514699980|ref|XP_004997558.1|', 8e-61, 214.0)
(u'gi|291222248|ref|XP_002731129.1|', 1e-60, 204.0)
(u'gi|389642071|ref|XP_003718668.1|', 6e-60, 217.0)
(u'gi|432949840|ref|XP_004084284.1|', 3e-59, 200.0)
(u'gi|384489724|gb|EIE80946.1|', 2e-58, 207.0)
(u'gi|303270823|ref|XP_003054773.1|', 1e-57, 208.0)
(u'gi|550805201|gb|ERS97118.1|', 2e-56, 206.0)
(u'gi|402083663|gb|EJT78681.1|', 8e-56, 205.0)
(u'gi|328712488|ref|XP_001950480.2|', 2e-55, 202.0)
(u'gi|500260767|gb|EOO03290.1|', 3e-55, 203.0)
(u'gi|475675995|gb|EMT73038.1|', 2e-54, 198.0)
(u'gi|346321525|gb|EGX91124.1|', 4e-54, 198.0)
(u'gi|340518879|gb|EGR49119.1|', 1e-53, 197.0)
(u'gi|46127591|ref|XP_388349.1|', 2e-53, 197.0)
(u'gi|340939343|gb|EGS19965.1|', 2e-53, 198.0)
(u'gi|346972722|gb|EGY16174.1|', 3e-53, 198.0)
(u'gi|302885386|ref|XP_003041585.1|', 3e-53, 197.0)
(u'gi|255069897|ref|XP_002507030.1|', 4e-53, 194.0)
(u'gi|342873514|gb|EGU75680.1|', 6e-53, 196.0)
(u'gi|477508915|gb|ENH62208.1|', 7e-53, 196.0)
(u'gi|348685793|gb|EGZ25608.1|', 7e-53, 195.0)
(u'gi|408389507|gb|EKJ68954.1|', 1e-52, 196.0)
(u'gi|384501769|gb|EIE92260.1|', 1e-52, 192.0)
(u'gi|400597595|gb|EJP65325.1|', 1e-52, 194.0)
(u'gi|19112818|ref|NP_596026.1|', 2e-52, 194.0)
(u'gi|430813410|emb|CCJ29231.1|', 2e-52, 193.0)
(u'gi|320588268|gb|EFX00743.1|', 5e-52, 194.0)
(u'gi|336270150|ref|XP_003349834.1|', 5e-52, 194.0)
(u'gi|552812683|ref|XP_005843760.1|', 5e-52, 197.0)
(u'gi|452977561|gb|EME77327.1|', 9e-52, 191.0)
(u'gi|145344573|ref|XP_001416804.1|', 1e-51, 192.0)
(u'gi|512191448|gb|EPE07209.1|', 2e-51, 193.0)
(u'gi|358394418|gb|EHK43811.1|', 4e-51, 191.0)
(u'gi|485925292|gb|EOD49915.1|', 4e-51, 191.0)
(u'gi|517319687|emb|CCT70497.1|', 4e-51, 191.0)
(u'gi|477513564|gb|ENH66050.1|', 8e-51, 196.0)
(u'gi|358385781|gb|EHK23377.1|', 1e-50, 190.0)
(u'gi|367040685|ref|XP_003650723.1|', 3e-50, 189.0)
(u'gi|407927258|gb|EKG20156.1|', 4e-50, 188.0)
(u'gi|85097594|ref|XP_960475.1|', 8e-50, 188.0)
(u'gi|336466175|gb|EGO54340.1|', 9e-50, 188.0)
(u'gi|470319153|gb|EMR10777.1|', 9e-50, 186.0)
(u'gi|258577767|ref|XP_002543065.1|', 2e-49, 186.0)
(u'gi|116201941|ref|XP_001226782.1|', 3e-49, 186.0)
(u'gi|521729061|gb|EPQ59146.1|', 3e-49, 184.0)
(u'gi|171681784|ref|XP_001905835.1|', 6e-49, 185.0)
(u'gi|449297034|gb|EMC93053.1|', 8e-49, 184.0)
(u'gi|528317086|gb|EPY51662.1|', 9e-49, 184.0)
(u'gi|471565888|gb|EMR66351.1|', 1e-48, 183.0)
(u'gi|530741911|gb|EQC40153.1|', 2e-48, 182.0)
(u'gi|303320911|ref|XP_003070450.1|', 3e-48, 183.0)
(u'gi|367030153|ref|XP_003664360.1|', 3e-48, 183.0)
(u'gi|531862984|gb|EQL00397.1|', 3e-48, 184.0)
(u'gi|531862984|gb|EQL00397.1|', 6e-11, 73.6)
(u'gi|50543738|ref|XP_500035.1|', 7e-48, 181.0)
(u'gi|320033050|gb|EFW14999.1|', 9e-48, 182.0)
(u'gi|406602412|emb|CCH46028.1|', 1e-47, 179.0)
(u'gi|308802331|ref|XP_003078479.1|', 1e-47, 181.0)
(u'gi|119179563|ref|XP_001241353.1|', 2e-47, 181.0)
(u'gi|528060942|gb|EPX70506.1|', 2e-47, 181.0)
(u'gi|526199565|gb|EPS40434.1|', 2e-47, 181.0)
(u'gi|301111610|ref|XP_002904884.1|', 2e-47, 180.0)
(u'gi|320583511|gb|EFW97724.1|', 3e-47, 179.0)
(u'gi|385304742|gb|EIF48748.1|', 5e-47, 176.0)
(u'gi|326470311|gb|EGD94320.1|', 9e-47, 179.0)
(u'gi|384501453|gb|EIE91944.1|', 1e-46, 176.0)
(u'gi|449549966|gb|EMD40931.1|', 2e-46, 177.0)
(u'gi|302507508|ref|XP_003015715.1|', 2e-46, 178.0)
(u'gi|325190927|emb|CCA25412.1|', 2e-46, 177.0)
(u'gi|325190927|emb|CCA25412.1|', 3e-11, 74.3)
(u'gi|539432770|gb|ERF69624.1|', 2e-46, 178.0)
(u'gi|327292499|ref|XP_003230948.1|', 3e-46, 177.0)
(u'gi|213409582|ref|XP_002175561.1|', 3e-46, 177.0)
(u'gi|390601153|gb|EIN10547.1|', 4e-46, 177.0)
(u'gi|254574068|ref|XP_002494143.1|', 5e-46, 175.0)
(u'gi|169848433|ref|XP_001830924.1|', 5e-46, 176.0)
(u'gi|527294507|gb|EPS95366.1|', 5e-46, 174.0)
(u'gi|440639174|gb|ELR09093.1|', 7e-46, 176.0)
(u'gi|451996623|gb|EMD89089.1|', 8e-46, 177.0)
(u'gi|429859855|gb|ELA34615.1|', 9e-46, 177.0)
(u'gi|530472733|gb|EQB53285.1|', 1e-45, 176.0)
(u'gi|451847555|gb|EMD60862.1|', 2e-45, 176.0)
(u'gi|296819631|ref|XP_002849878.1|', 2e-45, 175.0)
(u'gi|5478530|gb|AAD43920.1|AF130441_1', 2e-45, 172.0)
(u'gi|5478530|gb|AAD43920.1|AF130441_1', 1e-22, 107.0)
(u'gi|310799740|gb|EFQ34633.1|', 2e-45, 176.0)
(u'gi|315040529|ref|XP_003169642.1|', 3e-45, 174.0)
(u'gi|326481150|gb|EGE05160.1|', 4e-45, 174.0)
(u'gi|557090874|gb|ESQ31521.1|', 4e-45, 172.0)
(u'gi|557090874|gb|ESQ31521.1|', 8e-23, 108.0)
(u'gi|452839962|gb|EME41901.1|', 4e-45, 174.0)
(u'gi|482805757|gb|EOA82843.1|', 6e-45, 174.0)
(u'gi|312282233|dbj|BAJ33982.1|', 7e-45, 171.0)
(u'gi|312282233|dbj|BAJ33982.1|', 1e-22, 107.0)
(u'gi|453083534|gb|EMF11580.1|', 7e-45, 174.0)
(u'gi|379318630|pdb|4DNU|A', 1e-44, 169.0)
(u'gi|379318630|pdb|4DNU|A', 7e-25, 113.0)
(u'gi|379318635|pdb|4DNW|A', 2e-44, 169.0)
(u'gi|379318635|pdb|4DNW|A', 6e-24, 110.0)
(u'gi|297797389|ref|XP_002866579.1|', 2e-44, 170.0)
(u'gi|297797389|ref|XP_002866579.1|', 9e-24, 110.0)
(u'gi|501755350|emb|CCG81367.1|', 3e-44, 171.0)
(u'gi|554913264|gb|ESK94478.1|', 4e-44, 171.0)
(u'gi|383280368|pdb|4D9S|A', 4e-44, 169.0)
(u'gi|383280368|pdb|4D9S|A', 1e-23, 110.0)
(u'gi|309366092|emb|CAP21957.2|', 4e-44, 171.0)
(u'gi|336373725|gb|EGO02063.1|', 4e-44, 171.0)
(u'gi|379318631|pdb|4DNV|A', 5e-44, 167.0)
(u'gi|379318631|pdb|4DNV|A', 8e-24, 110.0)
(u'gi|268530278|ref|XP_002630265.1|', 5e-44, 171.0)
(u'gi|15237540|ref|NP_201191.1|', 6e-44, 169.0)
(u'gi|15237540|ref|NP_201191.1|', 1e-23, 110.0)
(u'gi|159488781|ref|XP_001702381.1|', 6e-44, 171.0)
(u'gi|345563579|gb|EGX46567.1|', 6e-44, 171.0)
(u'gi|393215520|gb|EJD01011.1|', 7e-44, 168.0)
(u'gi|380473989|emb|CCF46021.1|', 7e-44, 171.0)
(u'gi|380473989|emb|CCF46021.1|', 5e-11, 73.9)
(u'gi|341896411|gb|EGT52346.1|', 9e-44, 171.0)
(u'gi|341877072|gb|EGT33007.1|', 9e-44, 171.0)
(u'gi|523422630|emb|CDF88905.1|', 1e-43, 169.0)
(u'gi|482549182|gb|EOA13376.1|', 1e-43, 168.0)
(u'gi|482549182|gb|EOA13376.1|', 2e-23, 110.0)
(u'gi|448119392|ref|XP_004203719.1|', 3e-43, 168.0)
(u'gi|552924917|gb|ESA08835.1|', 3e-43, 167.0)
(u'gi|399218366|emb|CCF75253.1|', 4e-43, 169.0)
(u'gi|17532079|ref|NP_495753.1|', 4e-43, 168.0)
(u'gi|477531419|gb|ENH83126.1|', 6e-43, 169.0)
(u'gi|448116941|ref|XP_004203136.1|', 6e-43, 167.0)
(u'gi|426192396|gb|EKV42332.1|', 1e-42, 167.0)
(u'gi|302417714|ref|XP_003006688.1|', 2e-42, 166.0)
(u'gi|398404986|ref|XP_003853959.1|', 3e-42, 166.0)
(u'gi|325092204|gb|EGC45514.1|', 3e-42, 166.0)
(u'gi|330925699|ref|XP_003301155.1|', 3e-42, 166.0)
(u'gi|396081299|gb|AFN82917.1|', 4e-42, 162.0)
(u'gi|121714259|ref|XP_001274740.1|', 4e-42, 165.0)
(u'gi|409079646|gb|EKM80007.1|', 5e-42, 166.0)
(u'gi|242820402|ref|XP_002487503.1|', 5e-42, 165.0)
(u'gi|308509404|ref|XP_003116885.1|', 5e-42, 165.0)
(u'gi|367005462|ref|XP_003687463.1|', 6e-42, 164.0)
(u'gi|322699843|gb|EFY91602.1|', 6e-42, 165.0)
(u'gi|169610477|ref|XP_001798657.1|', 6e-42, 165.0)
(u'gi|242208551|ref|XP_002470126.1|', 8e-42, 163.0)
(u'gi|242208551|ref|XP_002470126.1|', 4e-11, 73.9)
(u'gi|255089408|ref|XP_002506626.1|', 1e-41, 162.0)
(u'gi|302832638|ref|XP_002947883.1|', 2e-41, 161.0)
(u'gi|367039539|ref|XP_003650150.1|', 2e-41, 163.0)
(u'gi|344231377|gb|EGV63259.1|', 2e-41, 162.0)
(u'gi|402217324|gb|EJT97405.1|', 2e-41, 162.0)
(u'gi|294654461|ref|XP_456522.2|', 3e-41, 162.0)
(u'gi|409049848|gb|EKM59325.1|', 3e-41, 163.0)
(u'gi|409081255|gb|EKM81614.1|', 3e-41, 161.0)
(u'gi|392512630|emb|CAD25339.2|', 3e-41, 159.0)
(u'gi|392512630|emb|CAD25339.2|', 3e-13, 79.7)
(u'gi|392512630|emb|CAD25339.2|', 2e-11, 73.9)
(u'gi|528292911|emb|CCU81601.1|', 3e-41, 163.0)
(u'gi|19074229|ref|NP_584835.1|', 3e-41, 161.0)
(u'gi|19074229|ref|NP_584835.1|', 5e-13, 79.7)
(u'gi|19074229|ref|NP_584835.1|', 2e-11, 74.3)
(u'gi|399169568|emb|CCE29551.1|', 4e-41, 163.0)
(u'gi|310795953|gb|EFQ31414.1|', 8e-41, 160.0)
(u'gi|552926495|gb|ESA10399.1|', 8e-41, 160.0)
(u'gi|255563857|ref|XP_002522929.1|', 9e-41, 160.0)
(u'gi|255563857|ref|XP_002522929.1|', 1e-19, 99.0)
(u'gi|358372284|dbj|GAA88888.1|', 1e-40, 162.0)
(u'gi|494827921|gb|EON64817.1|', 1e-40, 162.0)
(u'gi|154273022|ref|XP_001537363.1|', 1e-40, 161.0)
(u'gi|270004502|gb|EFA00950.1|', 1e-40, 152.0)
(u'gi|378732654|gb|EHY59113.1|', 2e-40, 161.0)
(u'gi|557721580|dbj|GAD99570.1|', 2e-40, 160.0)
(u'gi|392593511|gb|EIW82836.1|', 2e-40, 159.0)
(u'gi|302696855|ref|XP_003038106.1|', 2e-40, 159.0)
(u'gi|401826164|ref|XP_003887176.1|', 2e-40, 157.0)
(u'gi|241957479|ref|XP_002421459.1|', 2e-40, 160.0)
(u'gi|303389130|ref|XP_003072798.1|', 2e-40, 157.0)
(u'gi|449328953|gb|AGE95228.1|', 3e-40, 159.0)
(u'gi|449328953|gb|AGE95228.1|', 5e-13, 79.3)
(u'gi|449328953|gb|AGE95228.1|', 7e-11, 73.2)
(u'gi|212545723|ref|XP_002153015.1|', 4e-40, 160.0)
(u'gi|238883419|gb|EEQ47057.1|', 4e-40, 159.0)
(u'gi|459366611|gb|EMG45917.1|', 4e-40, 159.0)
(u'gi|549045443|emb|CCX33475.1|', 5e-40, 160.0)
(u'gi|3850148|emb|CAA21948.1|', 7e-40, 158.0)
(u'gi|12001926|gb|AAG43106.1|AF049867_1', 8e-40, 158.0)
(u'gi|77022888|ref|XP_888888.1|', 9e-40, 158.0)
(u'gi|322702426|gb|EFY94076.1|', 9e-40, 159.0)
(u'gi|261188107|ref|XP_002620470.1|', 1e-39, 158.0)
(u'gi|255731754|ref|XP_002550801.1|', 1e-39, 158.0)
(u'gi|350639170|gb|EHA27524.1|', 1e-39, 158.0)
(u'gi|239609087|gb|EEQ86074.1|', 1e-39, 158.0)
(u'gi|390600679|gb|EIN10074.1|', 1e-39, 157.0)
(u'gi|531980384|gb|EQL30971.1|', 2e-39, 158.0)
(u'gi|317027708|ref|XP_001399882.2|', 2e-39, 157.0)
(u'gi|145248596|ref|XP_001400637.1|', 2e-39, 158.0)
(u'gi|350634707|gb|EHA23069.1|', 2e-39, 157.0)
(u'gi|225555094|gb|EEH03387.1|', 2e-39, 157.0)
(u'gi|557534224|gb|ESR45342.1|', 2e-39, 156.0)
(u'gi|557534224|gb|ESR45342.1|', 8e-21, 102.0)
(u'gi|156848822|ref|XP_001647292.1|', 3e-39, 157.0)
(u'gi|170090772|ref|XP_001876608.1|', 3e-39, 156.0)
(u'gi|296424151|ref|XP_002841613.1|', 3e-39, 156.0)
(u'gi|403353979|gb|EJY76535.1|', 3e-39, 159.0)
(u'gi|403353979|gb|EJY76535.1|', 7e-14, 82.8)
(u'gi|403353979|gb|EJY76535.1|', 2e-13, 81.6)
(u'gi|380490700|emb|CCF35832.1|', 4e-39, 157.0)
(u'gi|323348646|gb|EGA82889.1|', 4e-39, 156.0)
(u'gi|548850227|gb|ERN08779.1|', 4e-39, 155.0)
(u'gi|548850227|gb|ERN08779.1|', 2e-18, 95.9)
(u'gi|548850227|gb|ERN08779.1|', 2e-18, 95.5)
(u'gi|349578129|dbj|GAA23295.1|', 5e-39, 156.0)
(u'gi|151943711|gb|EDN62021.1|', 5e-39, 156.0)
(u'gi|323337589|gb|EGA78834.1|', 6e-39, 155.0)
(u'gi|323309055|gb|EGA62283.1|', 7e-39, 155.0)
(u'gi|389743006|gb|EIM84191.1|', 8e-39, 155.0)
(u'gi|296082603|emb|CBI21608.3|', 9e-39, 154.0)
(u'gi|296082603|emb|CBI21608.3|', 9e-18, 94.0)
(u'gi|557534225|gb|ESR45343.1|', 9e-39, 154.0)
(u'gi|557534225|gb|ESR45343.1|', 1e-21, 105.0)
(u'gi|557534225|gb|ESR45343.1|', 8e-13, 79.0)
(u'gi|365765541|gb|EHN07048.1|', 1e-38, 155.0)
(u'gi|6321341|ref|NP_011418.1|', 1e-38, 155.0)
(u'gi|145545147|ref|XP_001458258.1|', 1e-38, 155.0)
(u'gi|134056804|emb|CAK37712.1|', 1e-38, 156.0)
(u'gi|526118004|ref|NP_001267915.1|', 2e-38, 154.0)
(u'gi|526118004|ref|NP_001267915.1|', 1e-16, 90.5)
(u'gi|500253801|gb|EON97448.1|', 2e-38, 154.0)
(u'gi|126134081|ref|XP_001383565.1|', 2e-38, 154.0)
(u'gi|326634388|pdb|3OF7|A', 3e-38, 154.0)
(u'gi|531862981|gb|EQL00394.1|', 3e-38, 154.0)
(u'gi|363749193|ref|XP_003644814.1|', 3e-38, 154.0)
(u'gi|255719666|ref|XP_002556113.1|', 3e-38, 154.0)
(u'gi|94469124|gb|ABF18411.1|', 3e-38, 147.0)
(u'gi|50310715|ref|XP_455379.1|', 3e-38, 153.0)
(u'gi|170575713|ref|XP_001893354.1|', 3e-38, 154.0)
(u'gi|550334138|gb|EEE90389.2|', 4e-38, 152.0)
(u'gi|550334138|gb|EEE90389.2|', 7e-20, 99.8)
(u'gi|550334138|gb|EEE90389.2|', 2e-16, 90.1)
(u'gi|224093583|ref|XP_002309939.1|', 4e-38, 152.0)
(u'gi|224093583|ref|XP_002309939.1|', 7e-20, 99.8)
(u'gi|224093583|ref|XP_002309939.1|', 2e-16, 89.7)
(u'gi|557726493|dbj|GAD94871.1|', 5e-38, 154.0)
(u'gi|254586731|ref|XP_002498933.1|', 5e-38, 153.0)
(u'gi|523780116|gb|EPR79468.1|', 5e-38, 150.0)
(u'gi|401625749|gb|EJS43742.1|', 6e-38, 153.0)
(u'gi|226495309|ref|NP_001141147.1|', 6e-38, 152.0)
(u'gi|226495309|ref|NP_001141147.1|', 9e-11, 72.8)
(u'gi|242073146|ref|XP_002446509.1|', 8e-38, 152.0)
(u'gi|242073146|ref|XP_002446509.1|', 4e-20, 100.0)
(u'gi|190344431|gb|EDK36104.2|', 9e-38, 152.0)
(u'gi|514801519|ref|XP_004975663.1|', 2e-37, 151.0)
(u'gi|514801519|ref|XP_004975663.1|', 5e-19, 97.4)
(u'gi|410074851|ref|XP_003955008.1|', 2e-37, 152.0)
(u'gi|429850121|gb|ELA25421.1|', 2e-37, 150.0)
(u'gi|342867548|gb|EGU72488.1|', 2e-37, 149.0)
(u'gi|322701288|gb|EFY93038.1|', 2e-37, 152.0)
(u'gi|514801517|ref|XP_004975662.1|', 2e-37, 150.0)
(u'gi|514801517|ref|XP_004975662.1|', 6e-19, 97.4)
(u'gi|119486678|ref|XP_001262325.1|', 2e-37, 152.0)
(u'gi|402590741|gb|EJW84671.1|', 2e-37, 152.0)
(u'gi|393905325|gb|EJD73936.1|', 2e-37, 152.0)
(u'gi|512203121|gb|EPE31946.1|', 3e-37, 151.0)
(u'gi|471874627|emb|CCO35985.1|', 4e-37, 150.0)
(u'gi|346327098|gb|EGX96694.1|', 4e-37, 151.0)
(u'gi|444318081|ref|XP_004179698.1|', 4e-37, 150.0)
(u'gi|448536004|ref|XP_003871048.1|', 4e-37, 150.0)
(u'gi|168052711|ref|XP_001778783.1|', 5e-37, 150.0)
(u'gi|168052711|ref|XP_001778783.1|', 4e-22, 106.0)
(u'gi|326518338|dbj|BAJ88198.1|', 5e-37, 149.0)
(u'gi|326518338|dbj|BAJ88198.1|', 9e-21, 102.0)
(u'gi|146421762|ref|XP_001486825.1|', 5e-37, 149.0)
(u'gi|145504819|ref|XP_001438376.1|', 6e-37, 150.0)
(u'gi|444313687|ref|XP_004177501.1|', 6e-37, 150.0)
(u'gi|168008673|ref|XP_001757031.1|', 7e-37, 149.0)
(u'gi|168008673|ref|XP_001757031.1|', 6e-22, 105.0)
(u'gi|260950135|ref|XP_002619364.1|', 7e-37, 150.0)
(u'gi|154297017|ref|XP_001548937.1|', 1e-36, 150.0)
(u'gi|365760753|gb|EHN02449.1|', 1e-36, 149.0)
(u'gi|344299635|gb|EGW29988.1|', 1e-36, 149.0)
(u'gi|361125476|gb|EHK97517.1|', 1e-36, 149.0)
(u'gi|545365640|ref|XP_005647780.1|', 1e-36, 148.0)
(u'gi|545365640|ref|XP_005647780.1|', 4e-17, 92.0)
(u'gi|545365640|ref|XP_005647780.1|', 1e-16, 90.1)
(u'gi|545365640|ref|XP_005647780.1|', 3e-14, 83.2)
(u'gi|124088041|ref|XP_001346941.1|', 1e-36, 149.0)
(u'gi|462407499|gb|EMJ12833.1|', 1e-36, 148.0)
(u'gi|462407499|gb|EMJ12833.1|', 1e-17, 93.2)
(u'gi|358389183|gb|EHK26775.1|', 2e-36, 147.0)
(u'gi|429964945|gb|ELA46942.1|', 2e-36, 146.0)
[ah@hobbes Documents]$

 

Script 18.59

script_18_59.py
# Script 18.59
from pandas import DataFrame, Series, ExcelWriter
from matplotlib import pyplot

def download_db_id(db_id, url, file_name=None): # Save file is optional

  #from urllib.request import urlopen   
# python 3.0 Import urlopen() function  
  from urllib2 import urlopen          
# python 2.7 Import urlopen() function  
  response = urlopen(url.format(db_id))  # Makes object to handle link
  data = response.read()                 # Read data from URL
  data = data.decode('utf-8')            # Interpret data as plain text
 
  if file_name:                       # If save file was specified...
    file_obj = open(file_name, 'w')   # Create file object for writing
    file_obj.write(data)              # Save all data to file
    file_obj.close()

  return data                         # Hand back data from function

def blast_search(seq, database, program,          # Mandatory input
                 e_cutoff=1e-10, cpu_cores=None): # Optional parameters
 
  from subprocess import Popen, PIPE # imports for running external jobs
   
  if not cpu_cores:                  # CPU count parameter not specified
    import multiprocessing   
    cpu_cores = multiprocessing.cpu_count()  # Use all avail. CPU cores
 
  # Make a list of all options and params to run BLAST
  # Output format 6 means table output
  # Numeric params. are converted to strings with str()  
  cmd_args = [program, '-outfmt', '6', '-db', database,
              '-num_threads', str(cpu_cores),
              '-evalue', str(e_cutoff)]  
 
  try:      # Try to create a process to run BLAST, but catch any error
    proc = Popen(cmd_args, stdin=PIPE, stdout=PIPE)
    
  except Exception as err:         # Something went wrong
    print('BLAST command failed')
    print('Command used: "%s"' % ' '.join(cmd_args))
    print(err)
    return []     # Send back an empty list if BLAST failed
 
  template = '>UserQuery\n{}\n'          # FASTA format string template
  in_data = template.format(seq.upper()) # Insert the seq in template
  in_data.encode('utf-8')                # Removes any unicode chars

  out_data, err_data = proc.communicate(in_data) # Send input get output
 
  results_str = out_data.decode('utf-8')    # Interpret output as text
 
  results = []                              # Start with empty results
  for line in results_str.split('\n'):      # Split output into lines
    data = line.split()                     # Split line into sub-list
    
    if data:                                # If line was not blank
      for col in range(3,10):              
        data[col] = int(data[col])          # Some columns are integers

      for col in (2,10,11):  
        data[col] = float(data[col])        # Other columns are float
 
      results.append(data)                  # Add row of data to results
 
  return results                            # Send back results

def get_uniprot_seq_record(db_id):

   import os
   from Bio import SeqIO
   UNIPROT_FASTA_URL = 'http://www.uniprot.org/uniprot/{}.fasta'
   file_name = db_id + '.fasta'  # Add extension to ID to make file name
   
   if not os.path.exists(file_name):
     download_db_id(db_id, UNIPROT_FASTA_URL, file_name)
   
   file_obj = open(file_name)    # Open file for reading
   
   for seq_record in SeqIO.parse(file_obj, 'fasta'):
     return seq_record           # Give back first record encountered

cols = ('query_id', 'subject_id', '%_identity', 'alignment_length',  
        'mismatches', 'gap_opens', 'q_start', 'q_end', 's_start',
        's_end', 'evalue', 'bit_score')   # Column/key names

db = '/home/ah/database/blast/nrfilt'     # BLAST database: must exist
sr = get_uniprot_seq_record('P18754')
print(sr.seq)
hits = blast_search(sr.seq, db, 'blastp') # Run on seq used before

data_set = DataFrame(hits, columns=cols)  # Convert to Pandas table

print(data_set.dtypes)                # Show data types of the columns

column = data_set['%_identity']                  # Fetch a column
print(column.min(), column.mean(), column.max()) # numpy-like functions

rows = data_set['%_identity'] > 75    # Conditionally select indices
print(data_set[rows])                 # Show selected indices
 

# Create a new column from existing column series
data_set['offset'] = Series(data_set['q_start']-data_set['s_start'])

data_set.to_csv('BLAST_results.csv', '\t') # Save as tab-separated file

excel_writer = ExcelWriter('BLAST_results.xlsx')       # Write to Excel
data_set.to_excel(excel_writer, sheet_name='Sheet1')   # spreadsheet
excel_writer.save()


# Plot a graph directly from a column Series (uses matplotlib)
ax = column.plot('hist', bins=50, title='Seq identity distribution')

ax.set_xlabel('% Identity') # Set axis labels
ax.set_ylabel('Count')
pyplot.show()               # Display graphics in screen
script_18_59.out
[ah@hobbes Documents]$ python script_18_59.py 
MSPKRIAKRRSPPADAIPKSKKVKVSHRSHSTEPGLVLTLGQGDVGQLGLGEN
VMERKKPALVSIPEDVVQAEAGGMHTVCLSKSGQVYSFGCNDEGALGRDTSV
EGSEMVPGKVELQEKVVQVSAGDSHTAALTDDGRVFLWGSFRDNNGVIGLLEP
MKKSMVPVQVQLDVPVVKVASGNDHLVMLTADGDLYTLGCGEQGQLGRVPEL
FANRGGRQGLERLLVPKCVMLKSRGSRGHVRFQDAFCGAYFTFAISHEGHVY
GFGLSNYHQLGTPGTESCFIPQNLTSFKNSTKSWVGFSGGQHHTVCMDSEGK
AYSLGRAEYGRLGLGEGAEEKSIPTLISRLPAVSSVACGASVGYAVTKDGRVFA
WGMGTNYQLGTGQDEDAWSPVEMMGKQLENRVVLSVSSGGQHTVLLVKDKEQS
query_id             object
subject_id           object
%_identity          float64
alignment_length      int64
mismatches            int64
gap_opens             int64
q_start               int64
q_end                 int64
s_start               int64
s_end                 int64
evalue              float64
bit_score           float64
dtype: object
(24.789999999999999, 52.088226691042053, 100.0)
      query_id                        subject_id  %_identity  \
0    UserQuery       gi|4502801|ref|NP_001260.1|      100.00   
1    UserQuery  gi|327199310|ref|NP_001192134.1|       99.76   
2    UserQuery       gi|119628100|gb|EAX07695.1|      100.00   
3    UserQuery  gi|397515820|ref|XP_003828141.1|       99.76   
4    UserQuery  gi|402853628|ref|XP_003891494.1|       99.05   
5    UserQuery       gi|119628097|gb|EAX07692.1|       96.12   
6    UserQuery  gi|114796646|ref|NP_001041660.1|       96.12   
7    UserQuery  gi|402853626|ref|XP_003891493.1|       99.05   
8    UserQuery  gi|544407822|ref|XP_005544289.1|       99.05   
9    UserQuery       gi|119628099|gb|EAX07694.1|       96.12   
10   UserQuery  gi|397515824|ref|XP_003828143.1|       95.89   
11   UserQuery             gi|4389390|pdb|1A12|A      100.00   
12   UserQuery  gi|114796644|ref|NP_001041659.1|       93.14   
13   UserQuery  gi|332245213|ref|XP_003271757.1|       94.98   
14   UserQuery       gi|119628095|gb|EAX07690.1|       93.14   
15   UserQuery  gi|402853634|ref|XP_003891497.1|       95.21   
16   UserQuery  gi|327199318|ref|NP_001192137.1|       97.86   
17   UserQuery  gi|544407820|ref|XP_005544288.1|       95.21   
18   UserQuery  gi|397515822|ref|XP_003828142.1|       92.92   
19   UserQuery  gi|327199324|ref|NP_001192140.1|       96.91   
20   UserQuery  gi|478503167|ref|XP_004425891.1|       97.39   
21   UserQuery  gi|410966597|ref|XP_003989817.1|       96.91   
22   UserQuery  gi|507575528|ref|XP_004671309.1|       96.20   
23   UserQuery  gi|560898119|ref|XP_006175513.1|       96.44   
24   UserQuery  gi|544407818|ref|XP_005544287.1|       92.26   
25   UserQuery  gi|560948741|ref|XP_006196937.1|       96.20   
26   UserQuery  gi|545490248|ref|XP_005617300.1|       96.44   
27   UserQuery  gi|511832853|ref|XP_004741077.1|       96.67   
28   UserQuery  gi|402853632|ref|XP_003891496.1|       92.26   
29   UserQuery  gi|327315350|ref|NP_001192141.1|       96.44   
..         ...                               ...         ...   
142  UserQuery  gi|326932912|ref|XP_003212555.1|       79.81   
143  UserQuery  gi|525022195|ref|XP_005058591.1|       78.62   
144  UserQuery  gi|525022197|ref|XP_005058592.1|       78.62   
145  UserQuery     gi|363742391|ref|XP_427366.3|       79.57   
146  UserQuery  gi|524959499|ref|XP_005079698.1|       90.69   
147  UserQuery  gi|543277880|ref|XP_005427132.1|       78.86   
148  UserQuery  gi|543372998|ref|XP_005529718.1|       77.80   
149  UserQuery  gi|542181352|ref|XP_005496408.1|       78.15   
150  UserQuery  gi|327315362|ref|NP_001192146.1|       77.91   
151  UserQuery  gi|543718943|ref|XP_005500705.1|       77.96   
152  UserQuery   gi|62859419|ref|NP_001016101.1|       76.90   
153  UserQuery       gi|115528650|gb|AAI24849.1|       76.43   
154  UserQuery       gi|171846900|gb|AAI61455.1|       80.15   
155  UserQuery        gi|47682592|gb|AAH70867.1|       76.74   
156  UserQuery        gi|48735020|gb|AAH72030.1|       75.95   
157  UserQuery  gi|147900380|ref|NP_001085384.1|       75.71   
158  UserQuery            gi|2134146|pir||I51558       79.40   
159  UserQuery  gi|148227322|ref|NP_001081229.1|       76.44   
165  UserQuery  gi|530584815|ref|XP_005286204.1|       77.98   
172  UserQuery       gi|537236894|gb|ERE85794.1|       88.68   
173  UserQuery  gi|527248394|ref|XP_005142516.1|       77.43   
174  UserQuery  gi|426328672|ref|XP_004025374.1|       99.64   
176  UserQuery  gi|297282712|ref|XP_002808329.1|       98.54   
177  UserQuery  gi|512815934|ref|XP_004878183.1|       86.16   
183  UserQuery  gi|224167221|ref|XP_002198843.1|       80.08   
184  UserQuery       gi|449267782|gb|EMC78684.1|       79.75   
185  UserQuery  gi|354475661|ref|XP_003500046.1|       88.00   
188  UserQuery       gi|146332865|gb|ABQ22938.1|       98.46   
239  UserQuery       gi|344242290|gb|EGV98393.1|       86.34   
251  UserQuery  gi|426328674|ref|XP_004025375.1|       90.78   

     alignment_length  mismatches  gap_opens  q_start  q_end  s_start  s_end  \
0                 421           0          0        1    421        1    421   
1                 421           1          0        1    421        1    421   
2                 421           0          0        1    421       55    475   
3                 421           1          0        1    421       55    475   
4                 421           4          0        1    421        1    421   
5                 438           0          1        1    421       87    524   
6                 438           0          1        1    421        1    438   
7                 421           4          0        1    421       55    475   
8                 421           4          0        1    421       55    475   
9                 438           0          1        1    421       55    492   
10                438           1          1        1    421       26    463   
11                413           0          0        9    421        1    413   
12                452           0          1        1    421        1    452   
13                438           5          1        1    421        1    438   
14                452           0          1        1    421       55    506   
15                438           4          1        1    421       26    463   
16                421           9          0        1    421        1    421   
17                438           4          1        1    421       55    492   
18                452           1          1        1    421       26    477   
19                421          13          0        1    421        1    421   
20                421          11          0        1    421        1    421   
21                421          13          0        1    421        1    421   
22                421          16          0        1    421        1    421   
23                421          15          0        1    421        1    421   
24                452           4          1        1    421       55    506   
25                421          16          0        1    421        1    421   
26                421          15          0        1    421       45    465   
27                421          14          0        1    421       10    430   
28                452           4          1        1    421       26    477   
29                421          15          0        1    421        1    421   
..                ...         ...        ...      ...    ...      ...    ...   
142               421          81          2        1    421        1    417   
143               421          88          1        1    421       15    433   
144               421          88          1        1    421        1    419   
145               421          82          2        1    421        1    417   
146               376          22          1        1    363        1    376   
147               421          86          1        4    421        8    428   
148               419          93          0        3    421        2    420   
149               421          91          1        1    421       27    446   
150               421          92          1        1    421        1    420   
151               422          92          1        1    421        1    422   
152               420          94          2        1    418        1    419   
153               420          96          2        1    418        1    419   
154               398          78          1       21    418       22    418   
155               417          94          2        4    418        1    416   
156               420          98          2        1    418        1    419   
157               420          99          2        1    418        1    419   
158               398          81          1       21    418        8    404   
159               416          96          2        4    418        9    423   
165               377          83          0        1    377        1    377   
172               318          36          0       77    394        1    318   
173               350          78          1       72    420       21    370   
174               274           1          0      148    421      219    492   
176               274           4          0      148    421      107    380   
177               318          22          1        1    296        8    325   
183               241          48          0      181    421        1    241   
184               242          49          0      180    421        1    242   
185               225          27          0      197    421        1    225   
188               195           3          0      227    421        1    195   
239               161          22          0      261    421        1    161   
251               141           6          1       25    165       27    160   

            evalue  bit_score  
0     0.000000e+00        863  
1     0.000000e+00        861  
2     0.000000e+00        860  
3     0.000000e+00        858  
4     0.000000e+00        856  
5     0.000000e+00        854  
6     0.000000e+00        853  
7     0.000000e+00        853  
8     0.000000e+00        852  
9     0.000000e+00        850  
10    0.000000e+00        849  
11    0.000000e+00        847  
12    0.000000e+00        846  
13    0.000000e+00        844  
14    0.000000e+00        844  
15    0.000000e+00        844  
16    0.000000e+00        843  
17    0.000000e+00        843  
18    0.000000e+00        843  
19    0.000000e+00        843  
20    0.000000e+00        842  
21    0.000000e+00        840  
22    0.000000e+00        838  
23    0.000000e+00        837  
24    0.000000e+00        837  
25    0.000000e+00        837  
26    0.000000e+00        837  
27    0.000000e+00        837  
28    0.000000e+00        837  
29    0.000000e+00        836  
..             ...        ...  
142   0.000000e+00        696  
143   0.000000e+00        695  
144   0.000000e+00        694  
145   0.000000e+00        694  
146   0.000000e+00        693  
147   0.000000e+00        691  
148   0.000000e+00        689  
149   0.000000e+00        687  
150   0.000000e+00        684  
151   0.000000e+00        683  
152   0.000000e+00        664  
153   0.000000e+00        664  
154   0.000000e+00        663  
155   0.000000e+00        662  
156   0.000000e+00        661  
157   0.000000e+00        660  
158   0.000000e+00        658  
159   0.000000e+00        657  
165   0.000000e+00        609  
172   0.000000e+00        582  
173   0.000000e+00        573  
174   0.000000e+00        566  
176   0.000000e+00        558  
177   0.000000e+00        558  
183  7.000000e-141        414  
184  4.000000e-140        412  
185  2.000000e-139        409  
188  3.000000e-136        400  
239   5.000000e-92        286  
251   3.000000e-78        250  

[172 rows x 12 columns]
[ah@hobbes Documents]$ ls -al *.fasta *.csv *.xlsx
-rw-rw-r--. 1 ah ah 48589 Jan  7 13:54 BLAST_results.csv
-rw-rw-r--. 1 ah ah 45210 Jan  7 13:54 BLAST_results.xlsx
-rw-rw-r--. 1 ah ah   522 Jan  7 13:53 P18754.fasta
[ah@hobbes Documents]$
script_18_59.png
script_18_59.png

 

Script 18.60

script_18_60.py
# Script 18.60
from numpy import random, dot, cov, linalg, concatenate, array
from matplotlib import pyplot

def get_principle_components(data, n=2):

  data = array(data)               # Convert input to array

  mean = data.mean(axis=0)         # Mean vector
  centred_data = (data - mean).T   # Centre all vectors and transpose

  covar = cov(centred_data)        # Get covariance matrix
  evals, evecs = linalg.eig(covar) # Get Eigenvalues and Eigenvectors

  indices = evals.argsort()[::-1]  # E. value indices by decreasing size

  evecs = evecs[:,indices]    # Sort Eigenvecs according to Eigenvals

  p_comp_mat = evecs[:,:n]    # Select required principle components

  return p_comp_mat

size = (100, 3)                     # 100 points times 3 dimensions
d1 = random.normal(0.0, 0.5, size)
d2 = random.normal(0.0, 0.5, size) + array([4.0, 1.0, 2.0])
d3 = random.normal(0.0, 0.5, size) + array([2.0, 0.0, -1.0])
test_data = concatenate([d1, d2, d3], axis=0)

pcomps = get_principle_components (test_data, n=2)   # Run PCA

pc1, pc2 = pcomps.T   # Extract the two PC vectors from columns

for x, y, z in (pc1, pc2):
  x *= 10             # Scale value so it can be seen better on graph       
  y *= 10
  pyplot.plot((0, x), (0, y))  # Plot PC x, y as line from origin (0,0)

x, y, x = test_data.T          # Extract x and y vals from transpose

pyplot.scatter(x, y, s=20, c='#0080FF', marker='o', alpha=0.5)
pyplot.show()
script_18_60.out
[ah@hobbes Documents]$ python script_18_60.py 
[ah@hobbes Documents]$
script_18_60.png
script_18_60.png

 

Script 18.61

script_18_61.py
# Script 18.61
from scipy.stats import norm, poisson
from numpy import arange
from matplotlib import pyplot

poisson_rand_var = poisson(2.0)       # Random variable object

x_xals = arange(0, 10, 1)             # Value to plot probabilities for
y_vals = poisson_rand_var.pmf(x_xals) # Get probabilities from distrib.

pyplot.plot(x_xals, y_vals, color='black') # Make line plot
pyplot.show()                              # Show on screen
script_18_61.out
[ah@hobbes Documents]$ python script_18_61.py 
[ah@hobbes Documents]$
script_18_61.png
script_18_61.png

 

Script 18.62

script_18_62.py
# Script 18.62
from numpy import array
from scipy.stats import norm

def normal_tail_test(values, mv, std, one_sided=True):
 
  norm_rv = norm(mv, std)     # Normal distrib. random variable object
 
  diffs = abs(values-mv)         # Calc differences from distrib. mean
  result = norm_rv.cdf(mv-diffs) # Use cumulative density function
 
  if not one_sided:              # Two-tailed test
    result *= 2                  # Distrib. is symmetric: double area

  return result

mean    = 1.76
std_dev = 0.075
values  = array([1.8, 1.9, 2.0])

result = normal_tail_test (values, mean, std_dev, one_sided=True)
print('Normal one-tail', result)         # [0.297, 0.03097, 0.000687]
script_18_62.out
[ah@hobbes Documents]$ python script_18_62.py 
('Normal one-tail', array([ 0.29690143,  0.03097408,  0.00068714]))
[ah@hobbes Documents]$

 

Script 18.63

script_18_63.py
# Script 18.63
from numpy import array, zeros, fromstring, int8, arange
from matplotlib import pyplot

def plot_fastq_qualities(fastq_file, read_len=100):
    
  file_obj = open(fastq_file)    # Open FASTQ file for reading

  qual_scores = []               # Initial scores are empty list

  line = file_obj.readline()     # Read first line (a header)
  pad = ' ' * read_len           # Lots of spaces (to pad short data)

  while line: # Continue looping while there are more lines

    seq = file_obj.readline()    # Sequence line; not used
    head = file_obj.readline()   # Second header line; not used
    codes = file_obj.readline()  # Quality score code line; used
    codes = codes.strip()        # Remove trailing newline char etc  

    if len(codes) < read_len:    # If quality code string too short
      codes += pad               # Extend with spaces
 
    codes = codes[:read_len]     # Chop codes string to desired length
 
    qvals = fromstring(codes, dtype=int8) # Convert string to numbers
    qvals -= 32                  # Set lowest possible value to zero
    
    qual_scores.append(qvals)    # Add qual. scores for this seq to list
    line = file_obj.readline()   # Read next header line
 
  qual_scores = array(qual_scores)   # Convert final list to array
  positions = arange(1, read_len+1)  # Make array of seq. positions

  ave_scores = qual_scores.mean(axis=0)  # Find mean for each position
  st_devs = qual_scores.std(axis=0)      # Standard dev. for each pos.

  # Plot errorbars of stardard dev. height at score positions
  pyplot.errorbar(positions, ave_scores, yerr=st_devs, color='red')

  # Plot a line graph of scores on top
  pyplot.plot(positions, ave_scores, color='black', linewidth=2)

  pyplot.show()

fastq_file = 'My_DNA_sample.fastq' # Test FASTQ file location
plot_fastq_qualities(fastq_file)
script_18_63.out
[ah@hobbes Documents]$ python script_18_63.py 
[ah@hobbes Documents]$
script_18_63.png
script_18_63.png

 

Script 18.64

script_18_64.py
# Script 18.64
def find_discordant_read_pairs(sam_file, max_sep=2e3, verbose=True):
 
  import os
  from time import time     # Import inside the function, for a change
  from pysam import Samfile, AlignmentFile

  start_time = time()       # Record start time (seconds since 1/1/1970)
 
  root_name, ext = os.path.splitext(sam_file) # Split input file ext.
  out_file = root_name + "_discordant.bam"    # Out file has new ending
 
  in_sam = Samfile(sam_file, 'rb')          # Open BAM file for reading
  out_sam = AlignmentFile(out_file, "wb",   # Make new (empty) BAM file
                          template=in_sam)  # base headers on input file  
 
  chromo_names = in_sam.references          # Fetch chromosome names

  msg_cis = 'Long range: {}:{} - {}:{}'     # Some message templates
  msg_trans = 'Interchromosome: {}:{} - {}:{}'
 
  n = 0                    # Number of discordant read pairs found
  for align in in_sam:     # Go through each input alignment record
    chr_a = align.tid      # Primary chromosome of alignment
    pos_a = align.pos      # Primary base pair position
    chr_b = align.rnext    # Secondary chromosome
    pos_b = align.pnext    # Secondary position

    if chr_a == chr_b:          # If both chromosomes the same
      delta = abs(pos_b-pos_a)  # Get separation of read ends
      
      if delta > max_sep:         # If too big
        out_sam.write(align)      # Write alignment record to output BAM
        n += 1                    # Increase count
        
        if verbose:               # Check option to print a message
          chr_name = chromo_names[chr_a]
          print(msg_cis.format(chr_name, pos_a, chr_name, pos_b))
      
    else:                         # Chromosomes were different
      out_sam.write(align)        # Write alignment record to output BAM
      n += 1                      # Increase count
      
      if verbose:                 # Check option to print a message
        chr_name_a = chromo_names[chr_a]
        chr_name_b = chromo_names[chr_b]
        print(msg_trans.format(chr_name_a, pos_a, chr_name_b, pos_b))
 
  in_sam.close()
  out_sam.close()
 
  delta_time = time()-start_time  # Time taken: current time minus start
 
  # Wite out some useful information
  print('Written {:d} records to BAM file: {}'.format(n, out_file))
  print('Time taken: {:.1f} s'.format(delta_time))
 
bam_file = 'GSM409307_UCSD.H3K4me1.bam'  # Test on a _paired_ BAM file
find_discordant_read_pairs(bam_file, verbose=False)
script_18_64.out
[ah@hobbes Documents]$ python script_18_64.py 
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated.
Written 6146231 records to BAM file: GSM409307_UCSD.H3K4me1_discordant.bam
Time taken: 48.1 s
[ah@hobbes Documents]$ ls -al *.bam
-rw-rw-r--. 1 ah ah 356121115 Jan  8 08:04 GSM409307_UCSD.H3K4me1.bam
-rw-rw-r--. 1 ah ah 300195982 Jan  8 08:15
GSM409307_UCSD.H3K4me1_discordant.bam
[ah@hobbes Documents]$

 

Script 18.65

script_18_65.py
# Script 18.65
from subprocess import call          
# Function to run an external program
#from urllib.request import urlopen  
# python 3.0 Import urlopen() function
from urllib2 import urlopen          
# python 2.7 Import urlopen() function
from pandas import ExcelWriter, read_table
import ruffus

BLAST_DATABASE = '/home/ah/database/blast/nrfilt' 
# Location of BLAST database
UNIPROT_URL = 'http://www.uniprot.org/uniprot/{}'

# Create a list of FASTA file names from UniProt IDs
uniprot_id_list = ['P68431', 'P62805', 'P0C0S8', 'Q96A08']
fasta_files = [x + '.fasta' for x in uniprot_id_list]
 
@ruffus.originate(fasta_files)      # First stage, makes FASTA files
def get_uniprot_seq(fasta_file):    # Downloads a named FASTA file
 
  url = UNIPROT_URL.format(fasta_file)    # Insert file name into URL
  response = urlopen(url)                 # Access remote location
  data = response.read().decode('utf-8')  # Read data as text
 
  file_obj = open(fasta_file, 'w')  # Open file for writing
  file_obj.write(data)              # Save downloaded data
  file_obj.close()

@ruffus.transform(get_uniprot_seq, ruffus.suffix(".fasta"), ".blast")
def run_blast(fasta_file, table_file, e_cutoff=1e-10, cpu_cores=1):
    
  cmd_args = ['blastp', '-query', fasta_file, '-out', table_file,
              '-outfmt', '6', '-num_threads', str(cpu_cores),
              '-db', BLAST_DATABASE, '-evalue', str(e_cutoff)]  
 
  call(cmd_args)  # Run external program with command parameters.

@ruffus.transform(run_blast, ruffus.suffix(".blast"), ".xlsx")
def make_spreadsheet(table_file, xl_file):
 
  # A tuple containing spreadsheet column headings
  columns = ('query_id', 'subject_id', '%_identity', 'alignment_length',
             'mismatches', 'gap_opens', 'q_start', 'q_end', 's_start',
             's_end', 'evalue', 'bit_score')

  data_types = [str, str, float, int, int, int,    # Data type for each
                int, int, int, int, float, float]  # column in sheet
 
 
  data_types = zip(columns, data_types)   # Pair cols with data types
  data_types = dict(data_types)           # Make dict of col_name:dtype

  # Make a data table object with pandas, with the dtype dict
  # allowing the correct conversion of the BLAST table to numbers
  data_frame = read_table(table_file, dtype=data_types,
                                 names=columns, header=None)

  excel_writer = ExcelWriter(xl_file)         # Make Excel file

  data_frame.to_excel(excel_writer,           # Write data to sheet
                      sheet_name='Sheet1')    # in Excel file

  excel_writer.save()

ruffus.pipeline_printout()

ruffus.pipeline_run(multiprocess=4,)

ruffus.pipeline_printout()
script_18_65.out
[ah@hobbes Documents]$ python script_18_65.py 

________________________________________
Tasks which will be run:

Task = 'get_uniprot_seq'
       Job  = [None
             -> P68431.fasta]
         Job needs update: ...
               Missing file [P68431.fasta]
       Job  = [None
             -> P62805.fasta]
         Job needs update: ...
               Missing file [P62805.fasta]
       Job  = [None
             -> P0C0S8.fasta]
         Job needs update: ...
               Missing file [P0C0S8.fasta]
       Job  = [None
             -> Q96A08.fasta]
         Job needs update: ...
               Missing file [Q96A08.fasta]

Task = 'run_blast'
       Job  = [P0C0S8.fasta
             -> P0C0S8.blast]
         Job needs update: ...
               Missing files [P0C0S8.fasta, P0C0S8.blast]
       Job  = [P62805.fasta
             -> P62805.blast]
         Job needs update: ...
               Missing files [P62805.fasta, P62805.blast]
       Job  = [P68431.fasta
             -> P68431.blast]
         Job needs update: ...
               Missing files [P68431.fasta, P68431.blast]
       Job  = [Q96A08.fasta
             -> Q96A08.blast]
         Job needs update: ...
               Missing files [Q96A08.fasta, Q96A08.blast]

Task = 'make_spreadsheet'
       Job  = [P0C0S8.blast
             -> P0C0S8.xlsx]
         Job needs update: ...
               Missing files [P0C0S8.blast, P0C0S8.xlsx]
       Job  = [P62805.blast
             -> P62805.xlsx]
         Job needs update: ...
               Missing files [P62805.blast, P62805.xlsx]
       Job  = [P68431.blast
             -> P68431.xlsx]
         Job needs update: ...
               Missing files [P68431.blast, P68431.xlsx]
       Job  = [Q96A08.blast
             -> Q96A08.xlsx]
         Job needs update: ...
               Missing files [Q96A08.blast, Q96A08.xlsx]

________________________________________

________________________________________
Tasks which will be run:


Task enters queue = 'get_uniprot_seq'
Completed Task = 'get_uniprot_seq'
Task enters queue = 'run_blast'
Completed Task = 'run_blast'
Task enters queue = 'make_spreadsheet'
Completed Task = 'make_spreadsheet'

________________________________________
Tasks which will be run:

________________________________________
[ah@hobbes Documents]$ ls -al *.fasta *.blast *.xlsx
-rw-rw-r--. 1 ah ah 44432 Jan  7 13:59 P0C0S8.blast
-rw-rw-r--. 1 ah ah   213 Jan  7 13:57 P0C0S8.fasta
-rw-rw-r--. 1 ah ah 35802 Jan  7 14:00 P0C0S8.xlsx
-rw-rw-r--. 1 ah ah 43209 Jan  7 13:59 P62805.blast
-rw-rw-r--. 1 ah ah   174 Jan  7 13:57 P62805.fasta
-rw-rw-r--. 1 ah ah 35912 Jan  7 14:00 P62805.xlsx
-rw-rw-r--. 1 ah ah 47852 Jan  7 13:59 P68431.blast
-rw-rw-r--. 1 ah ah   211 Jan  7 13:57 P68431.fasta
-rw-rw-r--. 1 ah ah 38021 Jan  7 14:00 P68431.xlsx
-rw-rw-r--. 1 ah ah 45540 Jan  7 13:59 Q96A08.blast
-rw-rw-r--. 1 ah ah   213 Jan  7 13:57 Q96A08.fasta
-rw-rw-r--. 1 ah ah 35796 Jan  7 14:00 Q96A08.xlsx
[ah@hobbes Documents]$

 

 

Sign In

Please sign in to access your account

Cancel

Not already registered? Create an account now. ×

Find content that relates to you

This site uses cookies to improve your experience. Read more Close

Are you sure you want to delete your account?

This cannot be undone.

Cancel

Thank you for your feedback which will help us improve our service.

If you requested a response, we will make sure to get back to you shortly.

×
Please fill in the required fields in your feedback submission.
×