Python snippets

Lab members can add quick copy-and-paste code blocks below that are useful for common tasks in the lab.

Read and write to a TSV file

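from csv import DictReader

# Read every row into a list of dicts keyed by column name.
# newline='' lets the csv module handle line endings itself.
with open(input_path, 'r', newline='') as fr:
    reader = DictReader(fr, delimiter='\t')
    headers = reader.fieldnames
    rows = [x for x in reader]  # must run while the file is still open

from csv import DictWriter

# Write the rows back out with the same column order
with open(output_path, 'w', newline='') as fw:
    writer = DictWriter(fw, fieldnames=headers, delimiter='\t')
    writer.writeheader()
    for row in rows:
        writer.writerow(row)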
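A common variation on the snippet above is to drop some rows between reading and writing. The following is a minimal sketch of that idea; the function name `filter_tsv` and its parameters are hypothetical and only for illustration.

```python
from csv import DictReader, DictWriter


def filter_tsv(input_path, output_path, column, allowed):
    """Copy rows whose `column` value is in `allowed`; return the number kept."""
    with open(input_path, 'r', newline='') as fr:
        reader = DictReader(fr, delimiter='\t')
        headers = reader.fieldnames
        # DictReader yields every value as a string, so `allowed` should hold strings
        rows = [row for row in reader if row[column] in allowed]

    with open(output_path, 'w', newline='') as fw:
        writer = DictWriter(fw, fieldnames=headers, delimiter='\t')
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Note that the csv writers end lines with '\r\n' by default; pass lineterminator='\n' to DictWriter if you need Unix line endings.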

Subset a GFF3 file using majiq_tools

import majiq_tools as mt

gff3 = 'Homo_sapiens.GRCh38.94.gff3'
save_as = 'Homo_sapiens.GRCh38.94.gff3.sub'
genes = ['ENSG00000215704', 'ENSG00000142615', 'ENSG00000259042', 'ENSG00000251002', 'ENSG00000275552']

def subset_gff3(gff3, genes, save_as):
    # Load the full annotation
    df_gff3 = mt.gff3.load_gff3(gff3)
    # Gene IDs in the GFF3 carry a 'gene:' prefix
    gene_ids = ['gene:' + x for x in genes]
    df_subset = mt.gff3.subset_genes(df_gff3=df_gff3, gene_id=gene_ids)
    mt.gff3.save_gff3(df_subset, save_as)

subset_gff3(gff3, genes, save_as)
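To make clear what subsetting by gene means, here is a rough standard-library sketch of the same idea. It does not use majiq_tools and is not how majiq_tools implements it. It assumes Ensembl-style attributes (ID=gene:..., Parent=gene:..., Parent=transcript:...) and that every parent feature appears before its children in the file; the function name `subset_gff3_lines` is hypothetical.

```python
def subset_gff3_lines(lines, genes):
    """Yield directive lines plus every feature that belongs to the given genes."""
    # IDs of features kept so far; children are matched through their Parent attribute
    kept_ids = {'gene:' + g for g in genes}
    for line in lines:
        if line.startswith('#'):
            # Keep '##' directives (e.g. ##gff-version), drop '###' separators and comments
            if line.startswith('##') and not line.startswith('###'):
                yield line
            continue
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 9:
            continue  # not a well-formed feature line
        attrs = dict(pair.split('=', 1) for pair in fields[8].split(';') if '=' in pair)
        feature_id = attrs.get('ID')
        parents = attrs.get('Parent', '').split(',')
        if feature_id in kept_ids or any(p in kept_ids for p in parents):
            if feature_id:
                kept_ids.add(feature_id)  # so this feature's children are kept too
            yield line
```

Because it takes any iterable of lines, it can be fed an open file handle, with the yielded lines written straight to the output file.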

Allow your program to take arguments

Once a script grows beyond the basics, you should not have to edit the script itself to change options such as input/output paths and filter thresholds. Here is a basic overview of argparse, the standard-library module for command-line arguments and one of the most intuitive options:

import argparse

parser = argparse.ArgumentParser(description='This is just a generally great software, you know~!')

# Short and long form of a required string argument
parser.add_argument('-i', "--input-file", help="input path", required=True)
parser.add_argument("--output-file", help="output path", required=False)
# Typed arguments are converted (and validated) for you
parser.add_argument("--some-float-filter", help="some number thingy", type=float, default=0.05)
parser.add_argument("--number-of-things", type=int,
                    help="this can be a really long, super comprehensive description too, the program will automatically"
                         " make it look nice in the program help screen. Give it a try!"
                    )
# Boolean switch: False unless the flag is given
parser.add_argument("--program-is-awesome", action='store_true', help="If the program is awesome, set this flag")
parser.add_argument("--analysis-type", help="psi, dpsi, or het?", choices=['psi', 'dpsi', 'het'])
# Hidden from the help screen
parser.add_argument("--secret-thing", help=argparse.SUPPRESS)

args = parser.parse_args()
print(args)
print(args.input_file)

This is a basic demonstration of the library. There are many more advanced usages too; if you need something, it probably exists.

The immediate points to note about the add_argument function:

- Specify one or two strings. If there are two, one is the 'short' version (in the example, that's '-i') and the other is the actual name of the argument.

- The name you give the argument should use dashes to separate words; after parsing, the attribute name has all of the dashes replaced with underscores. This is the convention (dashes in shell, underscores in Python).

- There can be a "type" argument, which converts the value to that type (for example float or int) and rejects input that cannot be converted. For a string argument, you simply don't specify any type.

- For a "switch" (boolean) argument, instead of using type, you give action='store_true' as shown.

- You can limit the possible values of a string argument with the "choices" parameter.

- For an argument that you want to use while testing but don't want displayed to users, use help=argparse.SUPPRESS.
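The points above can be checked without touching the shell, because parse_args() accepts a list of strings in place of the real command line. This is a small sketch using a cut-down version of the parser; the file name 'input.tsv' is made up.

```python
import argparse

parser = argparse.ArgumentParser(description='demo parser')
parser.add_argument('-i', '--input-file', required=True)
parser.add_argument('--some-float-filter', type=float, default=0.05)
parser.add_argument('--program-is-awesome', action='store_true')
parser.add_argument('--analysis-type', choices=['psi', 'dpsi', 'het'])

# The list stands in for what the user would type after the script name
args = parser.parse_args(['-i', 'input.tsv', '--some-float-filter', '0.1', '--program-is-awesome'])

print(args.input_file)          # dashes in the flag became underscores in the attribute
print(args.some_float_filter)   # already a float, not the string '0.1'
print(args.program_is_awesome)  # True, because the flag was given
print(args.analysis_type)       # None, because the flag was not given and has no default
```

Passing a value outside "choices" (for example --analysis-type bogus) makes argparse print an error message and exit.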

A nice help text screen will be automatically generated for your program. To see it just run your script with "--help":

usage: samples.py [-h] -i INPUT_FILE [--output-file OUTPUT_FILE] [--some-float-filter SOME_FLOAT_FILTER] [--number-of-things NUMBER_OF_THINGS] [--program-is-awesome] [--analysis-type {psi,dpsi,het}]

This is just a generally great software, you know~!

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        input path
  --output-file OUTPUT_FILE
                        output path
  --some-float-filter SOME_FLOAT_FILTER
                        some number thingy
  --number-of-things NUMBER_OF_THINGS
                        this can be a really long, super comprehensive description too, the program will automatically make it look nice in the program help screen. Give it a try!
  --program-is-awesome  If the program is awesome, set this flag
  --analysis-type {psi,dpsi,het}
                        psi, dpsi, or het?
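As a closing illustration, the TSV and argparse sections of this page can be combined into one small script that takes its paths and threshold from the command line. This is only a sketch: the column name 'pvalue', the '--column' and '--max-value' flags, and the function names are hypothetical choices, not lab conventions.

```python
import argparse
from csv import DictReader, DictWriter


def build_parser():
    parser = argparse.ArgumentParser(description='Keep TSV rows at or below a threshold')
    parser.add_argument('-i', '--input-file', required=True, help='input TSV path')
    parser.add_argument('-o', '--output-file', required=True, help='output TSV path')
    parser.add_argument('--column', default='pvalue', help='numeric column to filter on')
    parser.add_argument('--max-value', type=float, default=0.05,
                        help='keep rows whose column value is <= this')
    return parser


def main(argv=None):
    # argv=None makes argparse read the real command line; tests can pass a list
    args = build_parser().parse_args(argv)

    with open(args.input_file, 'r', newline='') as fr:
        reader = DictReader(fr, delimiter='\t')
        headers = reader.fieldnames
        # Values arrive as strings, so convert before comparing
        rows = [r for r in reader if float(r[args.column]) <= args.max_value]

    with open(args.output_file, 'w', newline='') as fw:
        writer = DictWriter(fw, fieldnames=headers, delimiter='\t')
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

In a real script you would call main() at the bottom of the file, under the usual if __name__ == '__main__': guard, so that the module can also be imported without running.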