Type: source /Volumes/USB/Unix_and_Perl_course/.profile each time you open a terminal.
grep: print the lines that match a pattern. -v inverts the match, -i ignores case, -c counts matching lines.
$ grep "ATGTGA" intro_IME_data.fasta | less
$ grep -i ACGTC * | head # show the first 10 matching lines
$ head -n 1 chr1.fasta | sed 's/Chr1/Chromosome 1/' # head -n 1 prints the first line; sed substitutes text.
Pipes: "|" sends the output of one command to the input of the next. Inside less, press "/" to search forward for a pattern such as "ATGTGA", or "?" to search backward.
die "non-DNA character in input\n" if ($input =~ /[efijlopqxz]/i);
die ... if ...: statement-modifier syntax; stops the Perl script with an error message when the condition is true.
$sequence =~ tr/A-Z/a-z/;
push @animals, "fox";
my $length = @animals;
my @gene_names = qw(unc-10 cyc-1 act-1 let-7 dyf-2);
my $joined_names = join(", ", @gene_names);
my @digest = split("", $dna); # an empty pattern splits the string $dna into single characters
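A minimal sketch of split and join as inverse operations, assuming a short made-up sequence in $dna; the variable names are illustrative only.

```perl
use strict; use warnings;

# Example sequence (assumed for illustration).
my $dna = "ATGC";

my @bases  = split("", $dna);    # one element per character
my $count  = @bases;             # number of characters
my $dashed = join("-", @bases);  # glue the pieces back together with "-"
my $same   = join("", @bases);   # joining on "" restores the original string

print "$count bases: $dashed ($same)\n";
```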
If you assign an array to a scalar variable, the scalar receives the number of elements in the array.
Difference between:
$length = @animals; # scalar context: $length is the size of the array
($length) = @animals; # list context: $length is the first element of @animals
scalar(@array): forces scalar context, so it returns the length of the array.
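The three forms above can be compared side by side. This is a small sketch with made-up contents for @animals.

```perl
use strict; use warnings;

# Example data (assumed for illustration).
my @animals = ("cat", "dog", "fox");

my $count   = @animals;          # scalar context: number of elements
my ($first) = @animals;          # list context: first element
my $forced  = scalar(@animals);  # scalar() forces scalar context

print "$count $first $forced\n";
```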
Array indexes: a fractional index such as 1.2 or 1.7 is truncated to 1, not rounded. A negative index counts from the end, so -1 is the last element.
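A short sketch of fractional and negative indexes, using a made-up @animals array.

```perl
use strict; use warnings;

# Example data (assumed for illustration).
my @animals = ("cat", "dog", "fox", "owl");

my $second       = $animals[1.7];  # index is truncated to 1
my $last         = $animals[-1];   # last element
my $next_to_last = $animals[-2];   # second from the end

print "$second $last $next_to_last\n";
```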
@sorted_list = sort{$a <=> $b or uc($a) cmp uc($b)} @list;
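In the sort block above, "or" chains two comparisons: the second is used only when the first returns 0 (a tie). The sketch below shows each comparison separately, with made-up lists.

```perl
use strict; use warnings;

# Example data (assumed for illustration).
my @numbers = (10, 9, 2, 33);
my @names   = qw(unc-10 Act-1 let-7 Cyc-1);

my @by_string = sort @numbers;                         # default: string order
my @by_number = sort { $a <=> $b } @numbers;           # numeric order
my @by_name   = sort { uc($a) cmp uc($b) } @names;     # case-insensitive order

print "@by_string\n@by_number\n@by_name\n";
```

The default string sort puts 10 before 2 because it compares character by character, which is why numeric data needs <=>.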
foreach $animal (@animals) {print "$animal\n"}
for my $i (0..5) {print "$i\n"}
0, "0", "" (the empty string) and undef are considered false; every other value is true.
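A quick way to check which values count as false; the test values are chosen for illustration. Note that the string "0.0" is true, because only the exact string "0" is treated as false.

```perl
use strict; use warnings;

# Values to test (assumed for illustration).
my @values  = (0, "0", "", "0.0", "a");
my @verdict = map { $_ ? "true" : "false" } @values;

for my $i (0 .. $#values) {
    print "'$values[$i]' is $verdict[$i]\n";
}
```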
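Loop control: next, redo, last. next and last correspond to continue and break in C; redo restarts the current iteration without re-testing the loop condition.
while(<>) is shorthand for while(defined($_ = <>)). The chomp() function removes a trailing \n from the end of a line if present.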
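A small sketch of next and last inside a for loop; the range and conditions are made up for illustration.

```perl
use strict; use warnings;

my @kept;
for my $i (0 .. 9) {
    next if $i % 2;   # skip odd numbers, go to the next iteration
    last if $i > 6;   # leave the loop entirely
    push @kept, $i;
}

print "@kept\n";
```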
#!/usr/bin/perl
# filemunge.pl
use strict; use warnings;
open(IN, "<", $ARGV[0]) or die "error opening $ARGV[0] for reading: $!";
open(OUT, ">", "$ARGV[0].munge") or die "error creating $ARGV[0].munge: $!";
while(<IN>) {
chomp;
my $rev = reverse $_;
print OUT "$rev\n";
}
close IN;
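close OUT;
How to do file I/O: http://www.ualberta.ca/~hquamen/303/filehandles.html ; $! holds the system error message after a failed call such as open. select(HANDLE) makes HANDLE the default output for print. Perl also lets you store a filehandle in a regular scalar, e.g. open(my $fh, "<", $file).
The reverse() function reverses both lists and strings: in list context it reverses the order of a list, in scalar context it reverses the characters of a string.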
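A sketch of how context changes what reverse() does, with made-up data. The last line shows a common surprise: in list context a single string is a one-element list, so it comes back unchanged.

```perl
use strict; use warnings;

# Example data (assumed for illustration).
my @bases = qw(A C G T);
my $dna   = "ATGC";

my @rev_list  = reverse @bases;              # list context: element order reversed
my $rev_str   = reverse $dna;                # scalar context: characters reversed
my $unchanged = join("", reverse($dna));     # list context: one-element list, no change

print "@rev_list $rev_str $unchanged\n";
```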
# hash
my %genetic_code = (
ATG => 'Met',
AAA => 'Lys',
CCA => 'Pro',
);
foreach my $key (keys %genetic_code) {
print "$key $genetic_code{$key}\n";
}
if (exists $genetic_code{AAA}) {print "AAA codon has a value\n"}
else {print "No values set for AAA codon\n"}
delete $genetic_code{AAA};
The keys() function returns a list of the hash's keys and values() returns a list of its values; neither comes back in a guaranteed order.
Variable $&: the string matched by the last successful pattern match.
uc() and lc() return an uppercase or lowercase copy of a string; the original is not modified:
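Because keys() gives no guaranteed order, a common approach is to sort the keys before looping. A minimal sketch, reusing the three codons from the hash example:

```perl
use strict; use warnings;

my %genetic_code = (ATG => 'Met', AAA => 'Lys', CCA => 'Pro');

my @codons = sort keys %genetic_code;             # keys in alphabetical order
my @amino  = map { $genetic_code{$_} } @codons;   # values in the same order
my $size   = keys %genetic_code;                  # scalar context: number of keys

print "$size codons: @codons => @amino\n";
```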
my $str = "What is Perl Language for";
lc($str);
print $str, "\n";
# displays: What is Perl Language for
$str = lc($str);
print $str, "\n";
# displays: what is perl language for
if($text =~ m/A{1,3}/) {...} # matches between 1 and 3 As
if($text =~ m/C{42}/) {...} # matches exactly 42 Cs
if($text =~ m/T{6,}/) {...} # matches at least 6 Ts
# to match a literal "." escape it with a backslash "\.", e.g. if($sequence =~ m/A\. thaliana/) {...}
my @fields = split; # with no arguments, split works on $_ and splits on whitespace, skipping leading whitespace
In list context, a regular expression match returns the values captured by its parenthesized groups: my ($beg, $end) = $line =~ /(\d+)\.\.(\d+)/;
A filehandle can only be read through once: after it reaches end of file, further reads return nothing. "If your inner loop is a filehandle iterator, then you will need to reset it." (for example by reopening the file or rewinding with seek).
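A short sketch that tries the quantifiers above on a made-up string containing four As and six Ts. Quantifiers are greedy, so A{1,3} captures as many As as it is allowed to.

```perl
use strict; use warnings;

# Example string (assumed for illustration): 4 As followed by 6 Ts.
my $text = "GGAAAATTTTTTCC";

my $has_a   = ($text =~ m/A{1,3}/) ? "yes" : "no";   # 1 to 3 As
my $has_c42 = ($text =~ m/C{42}/)  ? "yes" : "no";   # exactly 42 Cs
my $has_t6  = ($text =~ m/T{6,}/)  ? "yes" : "no";   # at least 6 Ts
my ($run)   = $text =~ m/(A{1,3})/;                  # the text that A{1,3} matched

print "$has_a $has_c42 $has_t6 $run\n";
```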
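A minimal sketch of the capture idiom, using a made-up coordinate line. When the match fails the list is empty, so the variables stay undefined.

```perl
use strict; use warnings;

# Example input lines (assumed for illustration).
my $line = "CDS 1234..5678";
my ($beg, $end) = $line =~ /(\d+)\.\.(\d+)/;

my $other = "no coordinates here";
my ($no_beg, $no_end) = $other =~ /(\d+)\.\.(\d+)/;

print "beg=$beg end=$end\n";
print "second line did not match\n" unless defined $no_beg;
```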
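One way to reset a filehandle is seek($fh, 0, 0), which rewinds it to the start. The sketch below uses an in-memory string as a stand-in for a real file, so the data and names are assumptions for illustration.

```perl
use strict; use warnings;

# In-memory stand-in for a two-line file (assumed for illustration).
my $data = "A\nB\n";
open(my $fh, "<", \$data) or die "cannot open in-memory file: $!";

my @first  = <$fh>;   # reads both lines
my @second = <$fh>;   # already at end of file: reads nothing
seek($fh, 0, 0);      # rewind to the beginning
my @third  = <$fh>;   # reads both lines again
close $fh;

print scalar(@first), " ", scalar(@second), " ", scalar(@third), "\n";
```

In a nested loop, the seek call would go just before the inner while loop so that each outer iteration reads the inner file from the top.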