Open Reading Frames

AUTHOR

L. Grondin

http://rosalind.info/problems/orf/

Sample input

>Rosalind_99
    AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG

Sample output

MLLGSFRLIPKETLIQVAGSSPCNLS
    M
    MGMTPRLGLESLLE
    MTPRLGLESLLE
use v6;



constant DNA-codon = Hash.new: <
    TTT F      CTT L      ATT I      GTT V
    TTC F      CTC L      ATC I      GTC V
    TTA L      CTA L      ATA I      GTA V
    TTG L      CTG L      ATG M      GTG V
    TCT S      CCT P      ACT T      GCT A
    TCC S      CCC P      ACC T      GCC A
    TCA S      CCA P      ACA T      GCA A
    TCG S      CCG P      ACG T      GCG A
    TAT Y      CAT H      AAT N      GAT D
    TAC Y      CAC H      AAC N      GAC D
    TAA Stop   CAA Q      AAA K      GAA E
    TAG Stop   CAG Q      AAG K      GAG E
    TGT C      CGT R      AGT S      GGT G
    TGC C      CGC R      AGC S      GGC G
    TGA Stop   CGA R      AGA R      GGA G
    TGG W      CGG R      AGG R      GGG G
>;

my $default-input = "AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG";

sub revc($dna) {
    $dna.comb.reverse.join.trans:
    [<A C T G>] => [<T G A C>]
}

sub orf($dna) {
    my %match;
    my @match = (gather for $dna, revc $dna {
        take .match: rx/ ATG [ <[ACGT]>**3 ]*? <before TAA|TAG|TGA> /, :overlap;
    })>>.list.flat;

    %match{
        [~] map { DNA-codon{$_} }, .match: rx/ <[ACGT]>**3 /, :g
    }++ for @match;

    return %match.keys.sort;
}

sub MAIN(Str $input = $default-input) {
    .say for orf $input;
}

# vim: expandtab shiftwidth=4 ft=perl6

See Also

afrq-grondilu.pl

Counting Disease Carriers

aspc-grondilu.pl

Introduction to Alternative Splicing

cons-grondilu.pl

Consensus and Profile

conv-grondilu.pl

Comparing Spectra with the Spectral Convolution

cstr-grondilu.pl

Creating a Character Table from Genetic Strings

ctbl-grondilu.pl

Creating a Character Table

dbpr-grondilu.pl

Introduction to Protein Databases

dna-gerdr.pl

Counting DNA Nucleotides

dna-grondilu.pl

Counting DNA Nucleotides

eubt-grondilu.pl

Enumerating Unrooted Binary Trees

eval-grondilu.pl

Expected Number of Restriction Sites

fib-grondilu.pl

Rabbits and Recurrence Relations

fibd-grondilu.pl

Mortal Fibonacci Rabbits

gc-gerdr.pl

Computing GC Content

grph-grondilu.pl

Overlap Graphs

hamm-grondilu.pl

Counting Point Mutations

iev-grondilu.pl

Calculating Expected Offspring

indc-grondilu.pl

Independent Segregation of Chromosomes

iprb-grondilu.pl

Mendel's First Law

itwv-grondilu.pl

Finding Disjoint Motifs in a Gene

lcsq-grondilu.pl

Finding a Shared Spliced Motif

lia-grondilu.pl

Independent Alleles

lrep-grondilu-p5.pl

mmch-grondilu.pl

Maximum Matchings and RNA Secondary Structures

mprt-grondilu.pl

Finding a Protein Motif

mrna-grondilu.pl

Inferring mRNA from Protein

nwck-grondilu.pl

Distances in Trees

pmch-grondilu.pl

Perfect Matchings and RNA Secondary Structures

pper-grondilu.pl

Partial Permutations

prob-grondilu.pl

Introduction to Random Strings

qrt-grondilu.pl

Quartets

README.md

revc-gerdr.pl

Complementing a Strand of DNA

rna-gerdr.pl

Transcribing DNA into RNA

rstr-grondilu.pl

Matching Random Motifs

sexl-grondilu.pl

Sex-Linked Inheritance

sgra-grondilu.pl

Using the Spectrum Graph to Infer Peptides

spec-grondilu.pl

Inferring Protein from Spectrum

sseq-grondilu.pl

Finding a Spliced Motif

subs-grondilu.pl

Finding a Motif in DNA

suff-grondilu.pl

Encoding Suffix Trees

tran-grondilu.pl

Transitions and Transversions

trie-grondilu.pl

Introduction to Pattern Matching

The Camelia image is copyright 2009 by Larry Wall. "Raku" is trademark of the Yet Another Society. All rights reserved.