Finding a Protein Motif

AUTHOR

L. Grondin

http://rosalind.info/problems/mprt/

Sample input

A2Z669
    B5ZC00
    P07204_TRBM_HUMAN
    P20840_SAG1_YEAST

Sample output

B5ZC00
    85 118 142 306 395
    P07204_TRBM_HUMAN
    47 115 116 382 409
    P20840_SAG1_YEAST
    79 109 135 248 306 348 364 402 485 501 614
use v6;



my @default-data = qw{
    A2Z669
    B5ZC00
    P07204_TRBM_HUMAN
    P20840_SAG1_YEAST
};

sub MAIN($input-file = Nil) {
    my @input = $input-file ?? $input-file.IO.lines !! @default-data;
    my $N-glycosylation = rx / N <-[P]> <[ST]> <-[P]> /;
    my $base-path = $*PROGRAM-NAME.IO.dirname;
    for @input -> $id {
        my $fasta-name = $*SPEC.catdir($base-path, "$id.fasta");
        my $fasta = $fasta-name.IO.e
            ?? $fasta-name.IO.slurp
            !! qqx{wget -O - -q "http://www.uniprot.org/uniprot/$id.fasta"};
        given join '', grep /^ <.alpha>+ $/, $fasta.lines {
            when $N-glycosylation {
                say $id;
                my @arr = gather for m:overlap/$N-glycosylation/ { take .from + 1}
                say "{@arr}"
            }
        }
    }
}

# vim: expandtab shiftwidth=4 ft=perl6

See Also

afrq-grondilu.pl

Counting Disease Carriers

aspc-grondilu.pl

Introduction to Alternative Splicing

cons-grondilu.pl

Consensus and Profile

conv-grondilu.pl

Comparing Spectra with the Spectral Convolution

cstr-grondilu.pl

Creating a Character Table from Genetic Strings

ctbl-grondilu.pl

Creating a Character Table

dbpr-grondilu.pl

Introduction to Protein Databases

dna-gerdr.pl

Counting DNA Nucleotides

dna-grondilu.pl

Counting DNA Nucleotides

eubt-grondilu.pl

Enumerating Unrooted Binary Trees

eval-grondilu.pl

Expected Number of Restriction Sites

fib-grondilu.pl

Rabbits and Recurrence Relations

fibd-grondilu.pl

Mortal Fibonacci Rabbits

gc-gerdr.pl

Computing GC Content

grph-grondilu.pl

Overlap Graphs

hamm-grondilu.pl

Counting Point Mutations

iev-grondilu.pl

Calculating Expected Offspring

indc-grondilu.pl

Independent Segregation of Chromosomes

iprb-grondilu.pl

Mendel's First Law

itwv-grondilu.pl

Finding Disjoint Motifs in a Gene

lcsq-grondilu.pl

Finding a Shared Spliced Motif

lia-grondilu.pl

Independent Alleles

lrep-grondilu-p5.pl

mmch-grondilu.pl

Maximum Matchings and RNA Secondary Structures

mrna-grondilu.pl

Inferring mRNA from Protein

nwck-grondilu.pl

Distances in Trees

orf-grondilu.pl

Open Reading Frames

pmch-grondilu.pl

Perfect Matchings and RNA Secondary Structures

pper-grondilu.pl

Partial Permutations

prob-grondilu.pl

Introduction to Random Strings

qrt-grondilu.pl

Quartets

README.md

revc-gerdr.pl

Complementing a Strand of DNA

rna-gerdr.pl

Transcribing DNA into RNA

rstr-grondilu.pl

Matching Random Motifs

sexl-grondilu.pl

Sex-Linked Inheritance

sgra-grondilu.pl

Using the Spectrum Graph to Infer Peptides

spec-grondilu.pl

Inferring Protein from Spectrum

sseq-grondilu.pl

Finding a Spliced Motif

subs-grondilu.pl

Finding a Motif in DNA

suff-grondilu.pl

Encoding Suffix Trees

tran-grondilu.pl

Transitions and Transversions

trie-grondilu.pl

Introduction to Pattern Matching

The Camelia image is copyright 2009 by Larry Wall. "Raku" is trademark of the Yet Another Society. All rights reserved.