0)This is to be interpreted as described in RFC 2119
So you learned the basics, what's next?
In a 40-minute timeslot I can't provide you with a lot of practice, so my main focus will be on the other parts
Most of the advanced features regarding captures and backreferences are new in Perl 5.10
(?X...)
(?:...)
$ and \b
(?=...)
(?!...)
(?<=...)
(?<!...)
(?<=\w)(?=\W|$)|(?<=^|\W)(?=\w)
(\d*)(?<=.{17})
s/Alice(?! and Bob)/Anita/g
Perl(?=.*sucks)
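To see the lookahead examples above in action, here is a minimal sketch. The sample strings and the variable names ($text, $renamed, @lines, @hits) are made up for illustration and are not from the slides.

```perl
use strict;
use warnings;

# Negative lookahead: rename Alice unless she is followed by " and Bob".
my $text = "Alice and Bob met Alice and Carol; Alice left.";
(my $renamed = $text) =~ s/Alice(?! and Bob)/Anita/g;
print "$renamed\n";

# Positive lookahead: match "Perl" only if "sucks" appears later on the line.
# The lookahead checks the text ahead without consuming it.
my @lines = ("Perl sucks less", "Perl rocks");
my @hits  = grep { /Perl(?=.*sucks)/ } @lines;
print "$_\n" for @hits;
```

Only the first "Alice" survives, because it is the only one followed by " and Bob".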
Perl 5.10 provides some new options for captures and back references:
/(?<hour>\d\d):(?<minute>\d\d)/
%+ (i.e. $+{hour} etc.)
\g{hour}
\g{-1}
/(?|(\d\d)(\d\d)|(\d\d):(\d\d))/
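A short sketch of how these four features might be used together. The input strings and variable names (%time, $doubled, $letter, @parsed) are illustrative assumptions; only the regexp constructs come from the list above.

```perl
use strict;
use warnings;
use 5.010;    # named captures, \g{...} and (?|...) need Perl 5.10 or later

# Named captures end up in the %+ hash.
my %time;
if ("Meeting at 09:45" =~ /(?<hour>\d\d):(?<minute>\d\d)/) {
    %time = (hour => $+{hour}, minute => $+{minute});
}

# \g{name} refers back to a named group: here it finds a doubled word.
my $doubled = "this is is a test" =~ /\b(?<word>\w+)\s+\g{word}\b/
    ? $+{word} : undef;

# \g{-1} refers to the most recently opened group: here a doubled letter.
my $letter = "bookkeeper" =~ /(\w)\g{-1}/ ? $1 : undef;

# Branch reset: both alternatives number their groups from $1.
my @parsed;
for my $t ("0945", "09:45") {
    push @parsed, "$1/$2" if $t =~ /(?|(\d\d)(\d\d)|(\d\d):(\d\d))/;
}

say "$time{hour}h $time{minute}m, doubled '$doubled', letter '$letter', @parsed";
```

Without the branch reset, the second alternative would fill $3 and $4 instead of $1 and $2.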
With global matches enabled, Perl keeps track of the position it has reached while performing matches on a string
//g - enable global matching
//c - don't reset pos() on failure
/\G.../ - anchor at pos()
pos() = 12 - setting pos()
while(<>) {
print(" lowercase") and redo if /\G[a-z]+\s+/gc;
print(" uppercase") and redo if /\G[A-Z]+\s+/gc;
print(" digits") and redo if /\G[0-9]+\s+/gc;
print(" mixed") and redo if /\G[a-zA-Z0-9]+\s+/gc;
print(" noise") and redo if /\G[^a-zA-Z0-9]+\s+/gc;
print "\n";
}
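The individual pieces (pos(), //c, \G and assigning to pos()) can also be tried in isolation. This is a small sketch on a made-up string; the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $str = "aaa bbb ccc";

# In scalar context, a //g match records where it stopped.
my $matched     = $str =~ /\w+/g;
my $after_first = pos($str);          # end of "aaa"

# With //c a failed match leaves pos() alone instead of resetting it.
my $failed       = $str =~ /\d+/gc;
my $after_failed = pos($str);

# \G anchors the next match exactly at pos().
my $second = $str =~ /\G\s*(\w+)/gc ? $1 : undef;

# pos() is assignable: jump straight to offset 8.
pos($str) = 8;
my $third = $str =~ /\G(\w+)/gc ? $1 : undef;

print "$after_first $after_failed $second $third\n";
```

Dropping the c modifier from the failing match would reset pos() to undef, and the following \G match would start again from the beginning of the string.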
(?>...)
^\w+:
^(?>\w+):
^(?>a*)ab
a*+ is equivalent to (?>a*)
a++ is equivalent to (?>a+)
a?+ is equivalent to (?>a?)
a{min,max}+ is equivalent to (?>a{min,max})
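A minimal sketch of what "no backtracking into the group" means in practice. The test strings and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use 5.010;    # possessive quantifiers need Perl 5.10 or later

my $line = "subject: hello";

# Both match: the atomic group simply refuses to give characters back.
my $greedy = $line =~ /^\w+:/     ? 1 : 0;
my $atomic = $line =~ /^(?>\w+):/ ? 1 : 0;

# a* normally backs off one "a" so that "ab" can match ...
my $plain = "aaab" =~ /^a*ab/ ? 1 : 0;

# ... but (?>a*) keeps every "a" it took, so the following "a" can never match.
my $never = "aaab" =~ /^(?>a*)ab/ ? 1 : 0;

# The possessive quantifier is shorthand for the same thing.
my $possessive = "aaab" =~ /^a*+ab/ ? 1 : 0;

say "greedy=$greedy atomic=$atomic plain=$plain never=$never possessive=$possessive";
```

The benefit shows up on non-matching input: with ^(?>\w+): the engine gives up as soon as the colon is missing instead of retrying every shorter prefix.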
The qr// operator turns the regexp into a value you can store and pass around
@regexp = ( qr/foo/, qr/bar/, qr/baz/ );
while(defined( $line = <> )) {
print $line if grep { $line =~ $_ } @regexp;
}
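Because a qr// value is an ordinary scalar, it can be passed to a subroutine or interpolated into a larger pattern. A sketch under that assumption; matches_any, $word and $pair are hypothetical names introduced here.

```perl
use strict;
use warnings;

my @regexp = ( qr/foo/, qr/bar/, qr/baz/ );

# Hypothetical helper: number of patterns the line matches (0 means none).
sub matches_any {
    my ($line, @patterns) = @_;
    return scalar grep { $line =~ $_ } @patterns;
}

# A compiled regexp keeps its own flags when interpolated:
# $word stays case-insensitive inside $pair.
my $word = qr/[a-z]+/i;
my $pair = qr/^($word)=($word)$/;

my @kv = "KEY=value" =~ $pair;
print "key=$kv[0] value=$kv[1]\n" if @kv;
print "hit\n" if matches_any("a bar b", @regexp);
```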

It is easy to try to solve everything with a single regular expression but ...
However, don't end up like this:
qr/
(((0[48]|[2468][048]|[13579][26])00|\d\d(0[48]|[2468]
[048]| [13579] [26])) |(( [02468] [1235679]| [13579]
[01345789])00|\d\d([02468][1235679]|[13579][01345789]
))(?!(-|)02\g{-1}29))(?<sep>-|)(02(?!\g{sep}3)|0[469]
(?!\g{sep}31)|11(?!\g{sep} 31)|0[13578]|1[02])\g{sep}
(0[1-9]|[12][0-9]|3[01])
/x;
For improved readability, use /x to add whitespace and comments:
$isLeapYear =
qr/(
# Either century divisible by 400:
( 0[48] | [2468][048] | [13579][26] ) 00
|
# Or year divisible by 4, but not a century
\d\d ( 0[48] | [2468][048] | [13579][26] )
)/x;
Often people want to match "this" and "that" in a single regexp. It's possible, but ...
print if /^(?=.*foo)(?=.*bar)/;
print if /foo/ && /bar/;
Which one is easier to read?
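A quick check that the two forms select the same lines, on made-up sample data (the @lines, @single and @split names are assumptions for illustration):

```perl
use strict;
use warnings;

my @lines = ("foo then bar", "bar then foo", "only foo", "neither");

# One regexp with two lookaheads ...
my @single = grep { /^(?=.*foo)(?=.*bar)/ } @lines;

# ... versus two simple regexps joined with &&.
my @split = grep { /foo/ && /bar/ } @lines;

print "single: @single\n";
print "split:  @split\n";
```

Both keep the first two lines. One difference to be aware of: "." does not match a newline by default, so on multi-line strings the lookahead version only looks as far as the end of the first line.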
Remember this?
qr/
(((0[48]|[2468][048]|[13579][26])00|\d\d(0[48]|[2468]
[048]| [13579] [26])) |(( [02468] [1235679]| [13579]
[01345789])00|\d\d([02468][1235679]|[13579][01345789]
))(?!(-|)02\g{-1}29))(?<sep>-|)(02(?!\g{sep}3)|0[469]
(?!\g{sep}31)|11(?!\g{sep} 31)|0[13578]|1[02])\g{sep}
(0[1-9]|[12][0-9]|3[01])
/x;
It validates dates of the form 'YYYY-MM-DD' or 'YYYYMMDD'.
Exercise: Add support for 'dd/mm/yyyy'
Try this instead:
my @daysInMonth = ( 0, 31, 28, 31, 30, 31,
30, 31, 31, 30, 31, 30, 31);
if( /(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)/ ) {
return if $+{m} == 0
|| $+{d} == 0
|| $+{m} > 12;
return 1 if $+{d} <= $daysInMonth[ $+{m} ];
return 1 if $+{m} == 2
&& $+{d} == 29
&& leapYear( $+{y} );
return;
}
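The snippet above calls leapYear() without showing it and uses return, so it presumably lives inside a subroutine. One way it could be completed is sketched below. The body of leapYear, the wrapper name validDate and the ^...$ anchors are assumptions, not part of the original slide.

```perl
use strict;
use warnings;
use 5.010;

my @daysInMonth = ( 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 );

# Assumed helper: the usual Gregorian leap year rule.
sub leapYear {
    my ($y) = @_;
    return ( $y % 4 == 0 && $y % 100 != 0 ) || $y % 400 == 0;
}

# Hypothetical wrapper: returns 1 for a valid 'YYYY-MM-DD' date, 0 otherwise.
sub validDate {
    my ($date) = @_;
    return 0 unless $date =~ /^(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)$/;
    my ( $y, $m, $d ) = ( $+{y}, $+{m}, $+{d} );

    return 0 if $m == 0 || $m > 12 || $d == 0;
    return 1 if $d <= $daysInMonth[$m];
    return 1 if $m == 2 && $d == 29 && leapYear($y);
    return 0;
}

say "$_: ", validDate($_) ? "ok" : "invalid"
    for qw(2024-02-29 2023-02-29 1900-02-29 2000-02-29 2023-04-31);
```

The regexp only checks the shape of the input; the calendar rules sit in plain Perl where they are easy to read and test.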
Ready-to-use regexps for a lot of common cases:
use Test::Regexp 'no_plan';
match subject => "Foo bar",
keep_pattern => qr /(?<first_word>\w+)\s+(\w+)/,
captures => [[first_word => 'Foo'], ['bar']];
no_match subject => "Baz",
pattern => qr /Quux/;