lrep-grondilu-p5

#!/usr/bin/perl
use strict;
use warnings;

# parsing data
my $dna = <DATA>;
my $k = <DATA>;
my @edge = map [ split / +/ ], <DATA>;

# enumerating nodes
my %node; for my $edge (@edge) { $node{$edge->[$_]}++ for 0, 1 }
my @node = keys %node;

# enumerating parents
my %parent; $parent{$_->[1]} = $_->[0] for @edge;

# building tree-like structure
my $tree = {};
$tree->{$_->[0]}{$_->[1]} = [ @$_[2, 3] ] for @edge;

sub count_offspring {
    my $node = shift;
    return 1 unless keys %{$tree->{$node}};
    my $count;
    $count += count_offspring($_) for keys %{$tree->{$node}};
    return $count;
}

sub build_substr {
    my $node = shift;
    my $substr = '';
    while (exists $parent{$node}) {
        my $il = $tree->{$parent{$node}}{$node};
        $substr = substr($dna, $$il[0]-1, $$il[1]) . $substr;
        $node = $parent{$node};
    }
    return $substr;
}

my $found = '';
for my $node (@node) {
    my $count = count_offspring $node;
    if ($count >= $k) {
        my $substr = build_substr $node;
        $found = $substr if length($substr) > length($found);
    }
}
print "$found\n";


__DATA__
CATACATAC$
2
node1 node2 1 1
node1 node7 2 1
node1 node14 3 3
node1 node17 10 1
node2 node3 2 4
node2 node6 10 1
node3 node4 6 5
node3 node5 10 1
node7 node8 3 3
node7 node11 5 1
node8 node9 6 5
node8 node10 10 1
node11 node12 6 5
node11 node13 10 1
node14 node15 6 5
node14 node16 10 1

See Also

afrq-grondilu.pl

Counting Disease Carriers

aspc-grondilu.pl

Introduction to Alternative Splicing

cons-grondilu.pl

Consensus and Profile

conv-grondilu.pl

Comparing Spectra with the Spectral Convolution

cstr-grondilu.pl

Creating a Character Table from Genetic Strings

ctbl-grondilu.pl

Creating a Character Table

dbpr-grondilu.pl

Introduction to Protein Databases

dna-gerdr.pl

Counting DNA Nucleotides

dna-grondilu.pl

Counting DNA Nucleotides

eubt-grondilu.pl

Enumerating Unrooted Binary Trees

eval-grondilu.pl

Expected Number of Restriction Sites

fib-grondilu.pl

Rabbits and Recurrence Relations

fibd-grondilu.pl

Mortal Fibonacci Rabbits

gc-gerdr.pl

Computing GC Content

grph-grondilu.pl

Overlap Graphs

hamm-grondilu.pl

Counting Point Mutations

iev-grondilu.pl

Calculating Expected Offspring

indc-grondilu.pl

Independent Segregation of Chromosomes

iprb-grondilu.pl

Mendel's First Law

itwv-grondilu.pl

Finding Disjoint Motifs in a Gene

lcsq-grondilu.pl

Finding a Shared Spliced Motif

lia-grondilu.pl

Independent Alleles

mmch-grondilu.pl

Maximum Matchings and RNA Secondary Structures

mprt-grondilu.pl

Finding a Protein Motif

mrna-grondilu.pl

Inferring mRNA from Protein

nwck-grondilu.pl

Distances in Trees

orf-grondilu.pl

Open Reading Frames

pmch-grondilu.pl

Perfect Matchings and RNA Secondary Structures

pper-grondilu.pl

Partial Permutations

prob-grondilu.pl

Introduction to Random Strings

qrt-grondilu.pl

Quartets

README.md

revc-gerdr.pl

Complementing a Strand of DNA

rna-gerdr.pl

Transcribing DNA into RNA

rstr-grondilu.pl

Matching Random Motifs

sexl-grondilu.pl

Sex-Linked Inheritance

sgra-grondilu.pl

Using the Spectrum Graph to Infer Peptides

spec-grondilu.pl

Inferring Protein from Spectrum

sseq-grondilu.pl

Finding a Spliced Motif

subs-grondilu.pl

Finding a Motif in DNA

suff-grondilu.pl

Encoding Suffix Trees

tran-grondilu.pl

Transitions and Transversions

trie-grondilu.pl

Introduction to Pattern Matching

The Camelia image is copyright 2009 by Larry Wall. "Raku" is trademark of the Yet Another Society. All rights reserved.