perl notes

edit

Programming Style

edit
  • Do not use "package::foo();" directly. Must use objects. Using package::foo() is non-object.
Even if the object is only used once "$::g_res = new Objx();"
  • Always create a main() subroutine to avoid global conflicts.

Basic

edit
  • foreach note
$hh{abc}=123;
$hh{def}=456;
$hh{ghi}=789;
foreach $line (keys(%hh)) {    # any changes to hh inside foreach loop will not be taken
  $hh{jkl}=23;                 # That is, jkl will not show up in print
  print "$line\n";
}
  • stat
use Fcntl ':mode';
($dum,$dum,$mode,$dum,$dum,$dum,$dum,$size,$dum,$mtime)=stat($fname);
$mode = S_IMODE($mode)
  • "my" is local to the loop/block (not only on the function level).
  • Deleting a hash element
delete($h{abc});       # deleted
$h{abc}=();            # assign undef to element "abc". "abc" still exist.
$h{abc}=undef;         # same as $h{abc}=();
  • Clearing a hash element
 %{$this->{hipats}}=();
 my $hipat = \%{$this->{hipats}};       # pointer to the hash
 $$hipat{abc}=23;
 print "Orig: $hipat\n";
 foreach $line (keys(%{$hipat})) {
   print "$line\n";
 }
 print "Clearing\n";
 %{$this->{hipats}}=();                 # clear it, the address is still the same.
 print "Orig: $hipat\n";                # Same as above
 foreach $line (keys(%{$hipat})) {      # Result is cleared
   print "$line\n";
 }
  • File checks
-e Exist
-w Writable
-x Executable
  • Avoiding Greedy Expressions
if($line=~/(a.*?b)/) { }
  • References vs Deep Copy
my %hcopy = %{$john->{h}};   # This is not a reference. Performs a deep copy.
my $hr=\%hh;                 # This is a reference. Same as: $hr={ key=>val, key1=>val1, };
${$hr}{key}="new";           # same as $hh{key}
  • Hash scalar reference
 $this->{val} = 453;
 my $rval = \$this->{val};
 my $xxval = \${$this->{val}};
 print "$this->{val}\n";        # 453
 print "$$rval\n";              # 453
 print "$$xxval\n";             # undefined
 
 my $nval = \${$this->{nval}};
 $$nval = 444;
 print "$$nval\n";              # 444
 print "${$this->{nval}}\n";    # 444
 print "$this->{nval}\n";       # SCALAR(0x0000)
  • Hash of different types
 $rec = {
   TEXT     => $string,
   ARY      => [ @aryvariable ],   
   LOOKUP   => { %some_table },
   THATCODE => \&some_function,
   THISCODE => sub { ....code.... },
   HANDLE   => \*STDOUT,
 }

 # Accessing the hash
 foreach $k (keys(%{$rec->{LOOKUP}})) {   };
 print( $rec->{TEXT} );
 print( ${$rec}{TEXT} );   # same as above
 
 # Function reference
 $rec->{THATCODE}();       # Call some_function()
 $rec->{THISCODE}();       # Call the anonymous function
 $rf = \&some_function;
 &$rf();
  • Sorting
String:       sort( { $a cmp $b } @arr );    # just reverse $a and $b for reverse. Default is string.
Numeric:      sort( { $b <=> $a } @arr );
sort by keys: sort( keys( %hh ));
sort by values (return the keys):
    sort( { $hh{$a} <=> $hh{$b} } keys(%hh) );
  • hash as argument
foo({arg1=>1, arg2=>2});
exit(0);
sub foo {
  my($harg) = @_;
  my %h=%{$harg};    # This makes a physical copy of the hash
  foreach $k (keys(%h)) {
    print "$k=[$h{$k}]\n";
  }
}
  • hash use:
$hy{top}{abc}=1;   # this will not compile during strict!
$hy{top}{abc}{key1}=1;
$hy{top}{abc}{key2}=2;
$hy{top}{def}{key2}=2;

Modules

edit
  • Basename
use File::Basename;
$fname   = &File::Basename::basename($path);
$dirname = &File::Basename::dirname($path);
  • Cwd
use Cwd;
print cwd();

Negative Regex

edit
  • Contributor: Dan Phillips
$line="blah hsw foo bar";
if($line=~/^(?!.*bdw.*)/) {
     print "case 1 True\n";
} else {
     print "case 1 False\n";
}

$line="fee fi foe fum bdw";
if($line=~/^(?!.*bdw.*)/) {
     print "case 2 True\n";
} else {
     print "case 2 False\n";
}

Pointers

edit
  • Function pointer
push(@ar, \&func1);
push(@ar, \&func2);
push(@ar, sub { print "i'm in anonymous\n"; } );
&{$ar[1]}();   # result: i'm in func2
&{$ar[2]}();   # result: i'm in anonymous
exit(0);

sub func1 {
  print "i'm in func1\n";
}
sub func2 {
  print "i'm in func2\n";
}

Goodie Stuff

edit

Trap uninitialized

edit
# Put this at top of file
$SIG{__WARN__} = sub { for ($_[0]) { &process_warn_subr; } };   # trap uninitialized values

.....

# die out if uninitialized warning happens
sub process_warn_subr {
  package process_warn_subr;
  my @c = caller(1);

  if(/Use of uninitialized value/i) {
    print "ERROR: perl uninitialized value access detected in $0:\n";
    print "-e- => package: $c[0]\n";
    print "-e- => file   : $c[1]\n";
    print "-e- => line   : $c[2]\n";

    ;# promote warning to a fatal version
    die "-e- => trap: $_";
  } else {
    ;# other warning cases to catch go here
    warn "-w- => trap: $_";
  }

}

Trap Ctrl+C

edit
# Put this at top of file
$SIG{'INT'} = 'dokill';    # or "= sub {  }" also works

....

sub dokill {
  die("Ctrl+C happened\n\n");      # pressing ctrl+c while inside dokill() has no effect.
}  # NOTE: All DESTROY object routines are called here.

fork(), parent, child code

edit
# NOTE!!!!!! pls use "package MakeChild()" instead!
# from Joanna H
my $pid = fork();
if ($pid) {
  # parent
  push(@childs, $pid);
} elsif ($pid == 0) {
  # child
  local $SIG{INT} = 'IGNORE';
  $cmd = "/bin/sleep 10 ; echo \"done sleeping\"";
  system($cmd );
  exit(0);
} else {
  #could not fork
}

#waiting for child to finish
foreach (@::childs) {
  waitpid($_, 0);
}

system vs exit numbers

edit
The following numbers are $res value when $res=system("command");
# exit(0)  -  0
# exit(1)  -  256
# exit(2)  -  512
# ctrl+c   -  2
# exit(-1) - 65280
# die      - 2304

Value of $1 and $2 are retained

edit
  • See example below:
 if($ARGV[0]=~/(\w+):(\w+)/) {
   try2($1);
   try2($2);   # value of $2 is the *real* $2, not the $2 from try2()
 }
 exit(0);
sub try2 {
 my($var) = @_;
 print "try2 input: [$var]\n";
 if($var=~/(\w)(\w+)/) {
   print "try2 inside: [$1] [$2]\n";
 }
}

Objects

edit
  • Usage example
use Person;
my $john=new Person("John", "Male");

print Person::direct()."\n";    # Access "static" methods directly
print $john->{NAME}."\n";       # Retrieves the {NAME} property.
print $john->name."\n";         # Calls the name() method. $john->name() is the same

print @{$john->array};          # Array access
print %{$john->hash};           # Hash access

$john=();    # calls the destructor.

  • Object that contain hash
package Data;
sub new {
  my ($class)=@_;
  my $this = {};
  bless($this, $class);

  my $hs = {};
  $hs->{data1}=33;       # information hash
  $hs->{data2}=35;

  $this->{dd}=$hs;       # Assign it
  $this->{tag}="TAG";

  return($this);
}


package UserObject;
sub new {
  my ($class)=@_;
  my $this = {};
  bless($this, $class);

  $this->{obj} = new Data();

  my $all = $this->{obj}->{dd};     # Access the hash of the Data object
  my $line;
  foreach $line (keys(%{$all})) {   # Reference way
    print "$line ${$all}{$line}\n";
    ${$ali}{$line}+=100;            # increment it
  }
  foreach $line (keys(%{$this->{obj}->{dd}})) {        # Direct way
    print "$line ".${$this->{obj}->{dd}}{$line}."\n";  # Incremented value is seen here
  }

  return($this);
}

Object Template

edit
# =============================================================
# OBJECT template
# =============================================================
use strict;
package Person;

# Constructor
sub new {
  my ($class, $name, $sx)=@_;    # 1st arg is always the classname ($class=="Person")
  my $this = {};
  bless($this, $class);

  $this->{NAME} = $name || ();    # Property
  $this->{AGE}  =  3;
  $this->{SEX} = $sx;             # same as $$this{SEX}, $this->{SEX}

  return($this);
}

# Methods
sub peers {
  my($this, @peer) = @_;

  # alias all properties to be used
  my $PEER = \@{$this->{PEER}};
  my $SEX  = \$this->{SEX};
  my $HH   = \%{$this->{hh}};            # Access by ${$HH}{...}
  my $AA   = \@{$this->{ary}};           # Access by $$AA[..]

  $this->SUPER::method();                 # to access the base method

  if($#peer>=0) {
    push(@{$$PEER}, @peer);
  }
  return($$PEER);
}

sub direct {            # Can be accessed from main directly via: Person::direct(). However, don't call methods directly! (violation to programming style)
  return($static_sex);
}

sub DESTROY {
  print("I'm doing destructor\n");
  # NOTE: DESTROY is not called if ctrl+C happened. Add the following line in the constructor:
  # $SIG{'INT'} = sub { die("Ctrl+C happened\n\n"); } ;    # This is necessary for DESTROY to be called even if Ctrl+C

}

# =============================================================
# Inheritance template
# =============================================================
use strict;

package Person2;
use obj;                # The base obj. Remove this if the base obj is on the same file
use vars qw(@ISA);
@ISA = ("Person");      # inherits from Person

# Constructor
sub new {                             # ok to inherit as long as it is not on main file.
  my ($class, $n, $s, $job)=@_;       # 1st arg is always the classname
  my $this = new Person($n, $s);      # same as $class->Person::new($n, $s);
  bless($this, $class);

  .....  # e.g.  $$JOB = $job;

  return($this);
}

; # override any methods that needed to be overridden...

Use and package scope

edit
  • Given the following myuse.pm
#!/usr/intel/bin/perl5.85 -w
use strict;

package myuse;

my $abc = 123;

sub try1 {
  print "i'm try1 [$abc]\n";
  $abc++;
  return;
}

sub returnabc {
  return($abc);
}

1;
  • See below notes on variable scope on package:
use myuse;
main();
sub main {

  print "i'm in main\n";

  #try1();  # error, undefined subroutine
  &myuse::try1();   # valid
  &myuse::try1();   # valid

  print "in main: $myuse::abc\n";   # undefined
  print "in main via method: ".&myuse::returnabc()."\n";

}

package inside a function (used for scoping)

edit
main();
foo();
mainlocal::foo();
exit(0);

sub main {
  package mainlocal;      # used to localize a group of functions
  my $var=1;
  print "I'm in main var=$var\n";
  foo();          # this will call local
  &::fooban();    # this will call main fooban
  &::foo();       # this will call main foo
  $var++;
  print "I'm exiting main\n";
  return;

  sub foo {
    print "i'm in local foo var=$var\n";   # accessing $var is illegal (perl only show as warning)
  }
}

sub foo {
  print "i'm in main foo\n";
}
sub fooban {
  print "i'm in main fooban\n";
}

Benchmarks

edit

perl invoke windows vs unix

edit
  • invoking perl: 1000 system call to perl:
Windows: 93ms per perl invoke (via system)
Windows: 43ms per touch invoke (via system)
UNIX:    40ms per perl invoke

hash format comparison

edit
executed three times: 616148 keys
using $$hd:      0.67 sec
using $hd->{}:   0.67 sec
using ${$hd}:    0.67 sec
using direct hh: 0.65 sec

string concatenation

edit
a) $res="$res$txt"  # very slow
vs
b) $res.=$txt       # much faster (more than 3x)

Unit testing

edit
use strict;
use Test;

BEGIN { plan tests => 2, todo => [1] }

# test #1
# ok(<function>, <expect>);
ok(func1(1),1);   # this is fail

# test #2
ok(func1(1),2);   # this is pass
exit(0);

# this is the function being tested
sub func1 {
  my($i) = @_;
  return($i+1);
}
  • output:
1..2 todo 1;
# Running under perl version 5.008005 for linux
# Current time local: Thu Sep  2 10:54:08 2010
# Current time GMT:   Thu Sep  2 17:54:08 2010
# Using Test.pm version 1.25
Name "main::ary" used only once: possible typo at /nfs/pdx/home/jqdelosr/perl/notes.pl line 42.
not ok 1
# Test 1 got: "2" (notes.pl at line 18 *TODO*)
#   Expected: "1"
#  notes.pl line 18 is: ok(func1(1),1);
ok 2

Others

edit

Max perl require size - 5.0MB

edit
  • see tvpvhelp#21302 for details
  • jen() is a function to encode.