bioc218: predicting co-translational pausing with neural networks

Summary

Wanted to share my final paper for bioc218, a computational biology class i took in winter 2013. We were tasked with choosing a particular technique, algorithm, or any material discussed in the class and exploring it in more detail. Neural networks seemed particularly interesting, and not just because i'm a neuroscientist. I wanted to use machine learning algorithms to improve predictions for the project i am working on in Judith Frydman's lab.

An example plot of my analysis of sites for domain pausing. Bars at the top indicate domains, gray regions show the ribosome exit tunnel, TE is the translation efficiency (rate of translation), and the rest is self-explanatory. A local GO database was used to annotate the plot with a description of the protein. The figure was created in R.

For those who want to dive right in, the paper:

bioc218 final paper

We are using codon optimality to predict sites of ribosome pausing, which could help facilitate proper protein domain folding. These predictions would be validated with ribosome profiling or other emerging techniques. The higher-level idea is that synonymous substitutions in a protein-coding region could affect an organism's fitness, as recently shown with Synechococcus elongatus's kaiBC and Neurospora's FRQ. This mechanism has also been seen to play a role in MDR1-induced cancer, and it is possible that other protein folding diseases, especially neurodegenerative ones, might be linked by this common pathway. Elucidating it could lead to novel therapies or reveal basic biological mechanisms.
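To make the idea of scanning for low-optimality stretches concrete, here is a minimal Python sketch. It is not the method from the paper: the per-codon scores in `TOY_OPTIMALITY`, the neutral `DEFAULT_SCORE`, the window size, and the threshold are all made-up values for illustration, and a real analysis would use an actual optimality measure and validate against experimental data.

```python
# Hypothetical per-codon optimality scores in [0, 1]. These numbers are
# invented for illustration; they are not measured values.
TOY_OPTIMALITY = {
    "ATG": 1.0, "GCT": 0.9, "GCC": 0.8,
    "AAA": 0.6, "CTA": 0.2, "CGA": 0.1,
}
DEFAULT_SCORE = 0.5  # assumed neutral score for codons missing from the toy table


def split_codons(cds):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def windowed_optimality(cds, window=3):
    """Mean optimality score for each run of `window` consecutive codons."""
    scores = [TOY_OPTIMALITY.get(c, DEFAULT_SCORE) for c in split_codons(cds)]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_pause_sites(cds, window=3, threshold=0.25):
    """Codon indices where a window starts whose mean score is below `threshold`."""
    means = windowed_optimality(cds, window)
    return [i for i, mean in enumerate(means) if mean < threshold]


if __name__ == "__main__":
    toy_cds = "ATGGCTGCCCGACTACGAAAAGCT"
    print(candidate_pause_sites(toy_cds))
```

Windows flagged this way are only candidates for slow translation; whether they line up with domain boundaries or the ribosome exit tunnel is a separate question the sketch does not address.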

The final report covers both the history and the implementation of neural networks. In addition, i went beyond the requirement by proposing, and demonstrating an initial analysis for, applying neural networks to discover novel sites of domain-specific ribosomal pausing in the genome. While the results indicate much work needs to be done, the conceptual framework is there to continue. Further, i included a short section proposing an extension to current neural network methods, which is supported by an article i discuss briefly. The paper was super fun to write (in LaTeX of course!) and i hope to finish the project at some point this summer.

-biafra
bahanonu [at] alum.mit.edu

©2006-2024 | Site created & coded by Biafra Ahanonu | Updated 17 April 2024
biafra ahanonu