Cambridge has been incredible. I’ve met great classmates, awesome fellow Churchill scholars, and even a Nobel laureate. In between meeting new people, there has been a lot of work. My classes went from 0 to 100. So far we’ve covered RNA-seq, clustering, linear models, genome assembly, and much more. We learn about the computational problems associated with these topics, then try to tackle them in practicals or homework assignments. More homework = fewer blog posts, unfortunately.

For one homework assignment, I developed a Python executable application for Windows called Swiss2GO. All code for the app is available here. This application is a simple start to a potentially much larger project. Swiss2GO obtains Gene Ontology (GO) annotations for FASTA files of proteins. Annotations are important because they allow scientists to predict the functions of novel proteins based on similar, already characterized proteins.

Currently, Swiss2GO:

  1. Ingests a FASTA file selected by the user using a file chooser GUI.
  2. Queries the proteins of that file against the SwissProt database using the NCBI BLAST API. *BLAST uses a heuristic that finds short matches and extends them with a modified form of dynamic programming to align the proteins of the FASTA file to the SwissProt database.
  3. Parses the output for UniProt IDs, then obtains GO annotations for those IDs.
  4. Reports these annotations along with e-values of alignments back to the user in the app and in a CSV file.
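
To make steps 3 and 4 a little more concrete, here is a minimal sketch of how the parsing and reporting could look. It is not the actual Swiss2GO code. It assumes hit identifiers look like `sp|P69905|HBA_HUMAN` (possibly with a version suffix on the accession), and the names `extract_accession`, `build_rows`, `write_csv`, and the `go_lookup` callable are all hypothetical. The GO lookup is passed in as a function so the sketch does not commit to any particular web service.

```python
import csv
import re

# Assumption: SwissProt hits are identified like "sp|P69905|HBA_HUMAN",
# optionally with a version suffix on the accession ("sp|P69905.2|...").
ACCESSION_RE = re.compile(r"sp\|([A-Z0-9]+)(?:\.\d+)?\|")


def extract_accession(hit_id):
    """Return the UniProt accession in a hit identifier, or None if absent."""
    match = ACCESSION_RE.search(hit_id)
    return match.group(1) if match else None


def build_rows(hits, go_lookup):
    """Combine alignment hits with GO annotations.

    hits: iterable of (query_id, hit_id, e_value) tuples.
    go_lookup: hypothetical callable mapping an accession to a list of GO IDs.
    """
    rows = []
    for query_id, hit_id, e_value in hits:
        accession = extract_accession(hit_id)
        if accession is None:
            continue  # skip hits whose identifier we cannot parse
        for go_id in go_lookup(accession):
            rows.append({
                "query": query_id,
                "uniprot_id": accession,
                "go_id": go_id,
                "e_value": e_value,
            })
    return rows


def write_csv(rows, handle):
    """Write one row per (query, accession, GO ID) to an open text handle."""
    fieldnames = ["query", "uniprot_id", "go_id", "e_value"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to a real file, open it with `newline=""` so the `csv` module controls the line endings itself.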

I built this app using Tkinter for the GUI, and Biopython and Requests for the API calls.
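
For a rough idea of what the first step involves, here is a small sketch of a Tkinter file chooser followed by FASTA parsing. It is an illustration rather than the app's real code: the function names `choose_fasta_path` and `parse_fasta` are made up, the file extensions in the dialog are my guess, and the parser is a plain standard-library stand-in for what Biopython would normally handle.

```python
import io


def choose_fasta_path():
    """Open a file chooser and return the selected path, or None if cancelled."""
    # Imported here so the parser below can be used without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only show the dialog
    path = filedialog.askopenfilename(
        title="Select a FASTA file",
        # Assumed extensions; adjust to whatever your files use.
        filetypes=[("FASTA files", "*.fasta *.fa *.faa"), ("All files", "*.*")],
    )
    root.destroy()
    return path or None


def parse_fasta(handle):
    """Return a list of (header, sequence) tuples from an open FASTA handle."""
    records = []
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Tiny in-memory example with two made-up records.
example_records = parse_fasta(io.StringIO(">seq1 demo\nMKV\nLLA\n>seq2\nGG\n"))
```

In the real app, each parsed sequence would then be sent to BLAST; that network step is left out here on purpose.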

In the future, I’d like to expand this into an open source alternative to Blast2GO. Side note: I’ll probably change the name. I honestly had no idea Blast2GO was a thing when I started building this and came up with a name!

Check out a picture of Swiss2GO in use: Swiss2GO