3D Protein Modeling

Postach.io published biology

A few months ago I helped my girlfriend do some protein modeling for a class. I’d done this earlier while in a PhD program, but that was way back when I thought RAM was short for Baba Ram Dass.

You can find the completed project on GitHub at https://github.com/sblack4/bioinformatics-analysis-P70922, where we put a single FASTA sequence through a manual pipeline to produce a 3D model that can be viewed in the browser. The README explains the process in limited detail; the finished product (a static HTML page) can be viewed on GitHub Pages at https://sblack4.github.io/bioinformatics-analysis-P70922/. Scroll to the bottom of the page to play with the little 3D model of our protein, Pz-peptidase.

Quick terminology

structure

  • primary structure = amino acid sequence
  • secondary structure = local folds such as beta-pleated sheets and alpha-helices
  • tertiary structure = 3D shape the sheets & helices make

modeling

  • homology modeling = model by comparing to similar (homologous) proteins*

*It is practically impossible to get the 3D structure of a protein from its sequence alone, so we must try to match it with similar proteins whose structures are already known.


What you will (probably) be doing

Build a pipeline to:

  • read FASTA text
  • search the text against templates on BLAST
  • choose a template from the search results
  • use the template to do sequence alignment
  • use the alignment to generate secondary structure
  • use the secondary structure to do homology modeling & produce a 3D protein structure in .pdb format
  • use the .pdb files to visualize the protein in the browser with JSmol
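The first step, reading FASTA text, is simple enough to sketch in plain Python. This is only a minimal illustration, not what our manual pipeline used: the function name `parse_fasta` and the demo record are made up for the example, and in a real pipeline Biopython's `SeqIO` would likely be the more robust choice.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up record for illustration only; this is NOT the real P70922 sequence.
example = """>demo_protein hypothetical example
MKTAYIAKQR
QISFVKSHFS
"""

records = list(parse_fasta(example))
for name, sequence in records:
    print(name, len(sequence))
```

The joined sequence string is what would be handed to the BLAST search in the next step.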


Tools that look promising

Our pipeline was manual. You will have to automate this:

  • JSmol for visualization
  • BLAST API to find template matches. This should be doable with Biopython’s BLAST module*
  • Biopython’s pairwise2 module to do sequence alignment
  • Biskit or MODELLER for homology modeling
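To make the sequence alignment step less of a black box, here is a rough plain-Python sketch of global alignment (the Needleman-Wunsch algorithm). It is a stand-in for illustration, not Biopython's implementation: the function names and the simple match/mismatch/gap scores are assumptions, and a real protein alignment would normally use a substitution matrix such as BLOSUM62 with affine gap penalties.

```python
from typing import List, Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[int, str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score: List[List[int]] = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell is the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(top: str, bottom: str) -> float:
    """Share of aligned columns where both sequences have the same residue."""
    if not top:
        return 0.0
    same = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * same / len(top)


# Toy sequences, chosen only to keep the example short.
best_score, aligned_a, aligned_b = global_align("ACGT", "AGT")
print(best_score)
print(aligned_a)
print(aligned_b)
print(percent_identity(aligned_a, aligned_b))
```

Percent identity between the target and a candidate template is one rough way to judge whether the template is similar enough to be worth modeling against.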

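*See https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=DeveloperInfo. This looks tricky, but it looks like some people have wrapped the API, including:

  • https://github.com/graik/biskit, which may provide everything we need
  • https://salilab.org/modeller/, which can do modeling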
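For a sense of what talking to the BLAST URL API directly might involve, here is a hedged sketch using only the standard library. The parameter names (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`) and the `RID`/`RTOE` fields reflect my reading of the developer documentation linked above, and searching the `pdb` database is an assumption; the helper names and the sample response are invented for illustration. Check the NCBI page (and its usage guidelines) before relying on any of it. Biopython's BLAST module wraps the same service and is probably the easier route.

```python
import re
from typing import Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_put_query(sequence: str, program: str = "blastp",
                    database: str = "pdb") -> str:
    """Encode the form fields for submitting a search (CMD=Put)."""
    return urlencode({
        "CMD": "Put",
        "PROGRAM": program,
        "DATABASE": database,  # assumption: protein structures database
        "QUERY": sequence,
    })


def parse_submission(page: str) -> Tuple[str, int]:
    """Pull the request ID and estimated wait (seconds) out of the reply."""
    rid = re.search(r"^\s*RID = (\S+)", page, re.MULTILINE)
    rtoe = re.search(r"^\s*RTOE = (\d+)", page, re.MULTILINE)
    if rid is None or rtoe is None:
        raise ValueError("no RID/RTOE found in BLAST response")
    return rid.group(1), int(rtoe.group(1))


def submit_search(sequence: str) -> Tuple[str, int]:
    """POST the query to NCBI. Network call; not executed in this sketch."""
    data = build_put_query(sequence).encode("ascii")
    with urlopen(BLAST_URL, data=data) as response:
        return parse_submission(response.read().decode("utf-8", errors="replace"))


# Hand-written stand-in for the relevant part of a reply, not real server output.
sample_page = """<!--QBlastInfoBegin
    RID = EXAMPLE0RID
    RTOE = 15
QBlastInfoEnd
-->"""

print(build_put_query("MKT"))
print(parse_submission(sample_page))
```

After submitting, the request ID would be used to poll for status and then fetch the results, which is where template hits for the "choose template" step come from.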


For some background reading try Bitesize Bio https://bitesizebio.com/38005/computation-protein-modeling/

This book looks promising. Check out chapters 2.1 and 2.2: http://readiab.org/book/latest/

Awesome lists are cool https://github.com/danielecook/Awesome-Bioinformatics

We’re replicating part of this https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113968/
