Using the Spectrum Graph to Infer Peptides

AUTHOR

L. Grondin

http://rosalind.info/problems/sgra/

Sample input

3524.8542
3623.5245
3710.9335
3841.974
3929.00603
3970.0326
4026.05879
4057.0646
4083.08025

Sample output

WMSPG

use v6;

my $default-input = q:to/END/;
    3524.8542
    3623.5245
    3710.9335
    3841.974
    3929.00603
    3970.0326
    4026.05879
    4057.0646
    4083.08025
    END

sub MAIN($input-file = Nil) {
    my $input = $input-file ?? $input-file.IO.slurp !! $default-input;

    my $mass-table-file = $*SPEC.catdir($*PROGRAM-NAME.IO.dirname,
                                        "monoisotopic-mass-table.txt");
    my %mass-table = slurp($mass-table-file).words;
    my @L = $input.lines.sort(+*);  # order the masses numerically, not as strings
    my %invert-mass-table = %mass-table.invert.hash;

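    # Build a graph with an edge $u -> $v whenever the gap between two
    # spectrum masses equals the mass of a single amino acid.
    sub spectrum-graph(@L) {
        my %edges;
        for ^@L -> $i {
            my $u = @L[$i];
            for @L[$i+1..*-1] -> $v {
                my $amino-acid = %invert-mass-table{$v - $u};
                %edges{$u}.push: {
                    next-mass  => $v,
                    amino-acid => $amino-acid,
                } if defined $amino-acid;
            }
        }
        return %edges;
    }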

    my %graph = spectrum-graph(@L);
    sub find-protein($initial-mass) {
        return '' unless %graph{$initial-mass} :exists;
        gather for %graph{$initial-mass}[] {
            take .<amino-acid> «~« find-protein(.<next-mass>);
        }
    }

    say max :by(*.chars), map &find-protein, @L;
}

# vim: expandtab shiftwidth=4 ft=perl6
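
The script above looks up each mass difference as an exact key in the inverted mass table. That only succeeds when the difference stringifies to exactly the text stored in the table file. The sketch below shows one possible alternative: pick the residue whose mass lies within a small tolerance of the difference. It is an illustration under assumptions, not part of the original script. The sub name `closest-residue`, the `$tolerance` default of 0.001, and the hard-coded five-residue subset of the monoisotopic mass table are all hypothetical choices made for this example; `@peaks` is a hand-picked subset of the sample input.

```raku
use v6;

# Assumption: a small subset of the standard monoisotopic mass table,
# hard-coded here instead of being read from monoisotopic-mass-table.txt.
my %mass-table =
    G => 57.02146, P => 97.05276, S => 87.03203,
    M => 131.04049, W => 186.07931;

# Hypothetical helper: return the residue whose mass is within
# $tolerance of $delta, or Nil when no residue is close enough.
sub closest-residue($delta, :$tolerance = 0.001) {
    my $best = %mass-table.min({ abs(.value - $delta) });
    abs($best.value - $delta) <= $tolerance ?? $best.key !! Nil;
}

# Six of the nine sample masses, already in increasing order.
my @peaks = 3524.8542, 3710.9335, 3841.974, 3929.00603, 4026.05879, 4083.08025;

# Walk consecutive pairs of peaks and translate each gap into a residue.
my $peptide = @peaks.rotor(2 => -1)
                    .map({ closest-residue(.[1] - .[0]) // '?' })
                    .join;
say $peptide;  # WMSPG
```

The first gap, 3710.9335 - 3524.8542 = 186.0793, differs from the tabulated mass of W (186.07931) by 0.00001, so an exact-key lookup would miss it while the tolerance check accepts it. The right tolerance depends on the precision of the data set, so treat 0.001 as a placeholder.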

See Also

afrq-grondilu.pl

Counting Disease Carriers

aspc-grondilu.pl

Introduction to Alternative Splicing

cons-grondilu.pl

Consensus and Profile

conv-grondilu.pl

Comparing Spectra with the Spectral Convolution

cstr-grondilu.pl

Creating a Character Table from Genetic Strings

ctbl-grondilu.pl

Creating a Character Table

dbpr-grondilu.pl

Introduction to Protein Databases

dna-gerdr.pl

Counting DNA Nucleotides

dna-grondilu.pl

Counting DNA Nucleotides

eubt-grondilu.pl

Enumerating Unrooted Binary Trees

eval-grondilu.pl

Expected Number of Restriction Sites

fib-grondilu.pl

Rabbits and Recurrence Relations

fibd-grondilu.pl

Mortal Fibonacci Rabbits

gc-gerdr.pl

Computing GC Content

grph-grondilu.pl

Overlap Graphs

hamm-grondilu.pl

Counting Point Mutations

iev-grondilu.pl

Calculating Expected Offspring

indc-grondilu.pl

Independent Segregation of Chromosomes

iprb-grondilu.pl

Mendel's First Law

itwv-grondilu.pl

Finding Disjoint Motifs in a Gene

lcsq-grondilu.pl

Finding a Shared Spliced Motif

lia-grondilu.pl

Independent Alleles

lrep-grondilu-p5.pl

mmch-grondilu.pl

Maximum Matchings and RNA Secondary Structures

mprt-grondilu.pl

Finding a Protein Motif

mrna-grondilu.pl

Inferring mRNA from Protein

nwck-grondilu.pl

Distances in Trees

orf-grondilu.pl

Open Reading Frames

pmch-grondilu.pl

Perfect Matchings and RNA Secondary Structures

pper-grondilu.pl

Partial Permutations

prob-grondilu.pl

Introduction to Random Strings

qrt-grondilu.pl

Quartets

README.md

revc-gerdr.pl

Complementing a Strand of DNA

rna-gerdr.pl

Transcribing DNA into RNA

rstr-grondilu.pl

Matching Random Motifs

sexl-grondilu.pl

Sex-Linked Inheritance

spec-grondilu.pl

Inferring Protein from Spectrum

sseq-grondilu.pl

Finding a Spliced Motif

subs-grondilu.pl

Finding a Motif in DNA

suff-grondilu.pl

Encoding Suffix Trees

tran-grondilu.pl

Transitions and Transversions

trie-grondilu.pl

Introduction to Pattern Matching
