DNA to Amino Acid - a sample Perl script
Archive - Originally posted on "The Horse's Mouth" - 2011-06-24 22:22:32 - Graham Ellis
A really rewarding course this week: Perl programming for a dozen bright delegates in the bioinformatics field. These are the people who have defined the human code as billions of Cs, As, Ts and Gs, and who are now fuzzy matching against that code to help in medical research. I hope I am forgiven for that simplistic explanation - THEY are the experts at the algorithms, not me. On a course such as this we add their knowledge of the data to my Perl and come up with a blend that is greater than the sum of its parts.
The C, A, T and G letters go together in groups of three (codons), and each codon codes for an amino acid. There are 64 possible codons, covering the 20 standard amino acids plus the stop signals. A hash is a very good way to set up and use this data, and a regular expression substitute is a short (but, it must be admitted, inefficient) way of translating three letters at a time. Of course, you don't know where in the sequence to start, so there are three possible reading frames and therefore three possible strings ... see the sample program [here].
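For readers without the linked program to hand, here is a minimal sketch of the approach described above. It is not the program from the course. The names `%codon` and `translate_frame` and the sample sequence are made up for illustration, and the table assumes the standard genetic code, with `*` marking a stop codon and `X` standing in for any triplet not in the table.

```perl
use strict;
use warnings;

# Build the codon hash for the standard genetic code (assumed here).
# The amino acid letters are listed in T, C, A, G order for each of the
# three codon positions; '*' marks a stop codon.
my @bases = qw(T C A G);
my @amino = split //,
    'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon;
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            $codon{"$first$second$third"} = shift @amino;
        }
    }
}

# Translate one reading frame; $offset is 0, 1 or 2 (hypothetical helper).
sub translate_frame {
    my ($dna, $offset) = @_;
    my $frame = uc substr($dna, $offset);

    # Drop any trailing one or two bases that do not make a full codon.
    $frame = substr($frame, 0, length($frame) - length($frame) % 3);

    # The regular expression substitute: replace each group of three
    # letters with its hash entry, or 'X' if the triplet is unknown.
    $frame =~ s/(...)/exists $codon{$1} ? $codon{$1} : 'X'/ge;
    return $frame;
}

# Made-up sample sequence; we don't know where to start, so try all three.
my $dna = 'ATGGCCAAATGA';
for my $offset (0 .. 2) {
    printf "Frame %d: %s\n", $offset + 1, translate_frame($dna, $offset);
}
```

With the sample sequence this prints MAK*, WPN and GQM for frames 1, 2 and 3. The substitute evaluates a piece of Perl for every match, which is where the inefficiency comes from; on long sequences, stepping through the string with substr would probably be quicker.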
I'm sitting in Edinburgh Waverley station this evening, awaiting the sleeper train back to London. Home, midmorning tomorrow. It's been a long week too ...