

Listing 2-17 onceupon: Using the /m modifier

#!/usr/bin/perl -w
$story = "Once upon a time\nthere was a bad man, who died.\nThe end.\n";
$story =~ s/^(\w)+\b/WORD/mg;    # Turn the first word of every line into WORD
print $story;

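The ^ in s/^(\w)+\b/WORD/mg matches not just once, but three times: at the beginning of the string, as usual, and after each of the first two newlines. (The third newline ends the string, so no line follows it for ^ to match.) On each line, onceupon matches the first word and replaces it with WORD; the parentheses capture only the word's last letter, and that capture goes unused.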

% onceupon
WORD upon a time
WORD was a bad man, who died.
WORD end.


The Metacharacters \A and \Z
\A matches the beginning of a string, like ^.
\Z matches the end of a string, like $.
But unlike ^ and $, neither \A nor \Z matches multiple times when /m is used.

When using /m, ^ matches multiple times: at the beginning of the string and immediately after every newline that does not end the string. Likewise, $ matches immediately before every newline and at the very end of the string. \A and \Z are stricter: they ignore embedded newlines. \A matches only at the actual beginning of the string, and \Z matches only at its end (or just before a newline that ends it). In onceupon, we could have substituted \A for ^:

$story =~ s/\A(\w)+\b/WORD/mg;

which would replace only the first word of the string, like this:

WORD upon a time
there was a bad man, who died.
The end.
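
The difference between ^ and \A under /m can be checked by counting substitutions. The following is a minimal sketch rather than one of the book's listings: the variable names are illustrative, and it relies on s/// returning the number of replacements it made.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $story = "Once upon a time\nthere was a bad man, who died.\nThe end.\n";

# With ^ and /m, the pattern can match at the start of every line.
my $with_caret  = $story;
my $caret_count = ($with_caret =~ s/^(\w)+\b/WORD/mg);

# With \A, the pattern can match only at the start of the whole string,
# even though /m and /g are still present.
my $with_anchor  = $story;
my $anchor_count = ($with_anchor =~ s/\A(\w)+\b/WORD/mg);

print "^ replaced $caret_count word(s):\n$with_caret";
print "\\A replaced $anchor_count word(s):\n$with_anchor";
```

Working on copies of $story keeps the original string intact, so both versions can be compared side by side.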

Because the /s modifier lets . match newlines, you can match phrases split across lines.

$lines = "Perl 5 is object\noriented";

print "Hit!         \n" if $lines =~ /object.oriented/;
print "Hit with /m! \n" if $lines =~ /object.oriented/m;
print "Hit with /s! \n" if $lines =~ /object.oriented/s;

will print only Hit with /s!

/m and /s aren’t mutually exclusive. Suppose you want to remove everything between the outer BEGIN and END below.

We'd like to keep this line
BEGIN
but not this line
or this one,
or the BEGIN or END.
END
This line should stay, too.

You can do it using /m to match the BEGIN and END and /s to treat the newlines as plain characters.

s/^BEGIN.*^END//sm
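
As a runnable sketch of that substitution, the sample text can be placed in a here-document. The names $text and $trimmed and the END_TEXT terminator are chosen here for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The sample text from above, held in a single multiline string.
my $text = <<'END_TEXT';
We'd like to keep this line
BEGIN
but not this line
or this one,
or the BEGIN or END.
END
This line should stay, too.
END_TEXT

# /m lets ^ match at the start of each line; /s lets . cross newlines.
# .* is greedy, so the match runs to the last line that starts with END.
(my $trimmed = $text) =~ s/^BEGIN.*^END//sm;

# The newline that followed END is not part of the match, so an empty
# line remains. Matching ^END\n instead would remove it as well.
print $trimmed;
```

Because the BEGIN and END inside "or the BEGIN or END." are not at the start of a line, ^ prevents them from being treated as delimiters.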

Perl also provides a few special variables for accessing the results of a match.


The Special Variables $`, $', $&, and $+
$` (or $PREMATCH) returns everything before the matched string.
$' (or $POSTMATCH) returns everything after the matched string.
$& (or $MATCH) returns the entire matched string.
$+ (or $LAST_PAREN_MATCH) returns the contents of the last parenthetical match.

Each of these variables has two names: a short name ($`, $', $&, $+) and a synonymous English name ($PREMATCH, $POSTMATCH, $MATCH, $LAST_PAREN_MATCH). You can use the English names only if you place the statement use English at the top of your program.

$_ = "Macron Pilcrow";
/cr../;
print $&;

prints cron, as does

use English;
$_ = "Macron Pilcrow";
/cr../;
print $MATCH;

(The use English statement loads the English module, which makes the longer names available. Modules are the subject of Chapter 8; the English module is discussed in Chapter 8, Session 2. Every “punctuation” special variable has an English name ($_ is also $ARG); there’s a complete list in Appendix H.)
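
Of the four variables, $+ is probably the least obvious. It tends to be most useful when a pattern offers several parenthesized alternatives and you don't care which one matched. The sketch below uses made-up input strings and an illustrative @found array; it is not one of the book's listings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @found;

# Hypothetical input lines. Each one matches only one of the two
# alternatives, so only one pair of parentheses captures anything.
for my $line ('Version: 5.004', 'Revision: 1.12') {
    if ($line =~ /Version: (.*)|Revision: (.*)/) {
        # $+ holds whichever group actually matched, so there is
        # no need to test $1 and $2 separately.
        push @found, $+;
        print "$line gives $+\n";
    }
}
```

Without $+, the loop body would have to check whether $1 or $2 was defined after each match.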

quadrang (Listing 2-18) matches the string quadrangle against a very simple regex: /d(.)/.

Listing 2-18 quadrang: Using $`, $', $&, and $+

#!/usr/bin/perl -wl
$_ = 'quadrangle';

/d(.)/;                      # match a d followed by any character

print $`;                    # What's before dr
print $';                    # What's after dr
print $&;                    # What matched /d(.)/
print $+;                    # What matched (.)

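If the match had failed, it would not have set any of the four variables: each would keep whatever value the last successful match gave it, or remain undefined if there had been no earlier match. But quadrangle does match, so here’s what you see: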

% quadrang
qua
angle
dr
r

Why was each variable printed on a separate line? Because quadrang uses the -l command-line flag.


The -l Command-Line Flag
When the -l command-line flag is supplied, Perl adds a newline after every print().

The -l flag lets you omit the "\n" at the end of every print(). It does a little bit more, too: It automatically chomp()s each input line when used with -n or -p and, if you supply digits afterward, it assigns that value (interpreted as an octal number) to the output record separator $\ (see Chapter 6, Session 4).
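
The output half of -l can be imitated inside a script by setting $\ yourself. The sketch below is an approximation under stated assumptions: it prints to an in-memory filehandle (which needs a Perl built with PerlIO, the default since 5.8) so the result can be inspected, and the names $captured and $out are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $captured = '';
open(my $out, '>', \$captured)
    or die "Cannot open in-memory filehandle: $!";

{
    # Roughly what -l does for output: every print() gets $\ appended.
    # On the command line, -l015 would correspond to $\ = "\015".
    local $\ = "\n";
    print $out 'qua';
    print $out 'angle';
}
close($out);

# $\ is back to its default here, so nothing extra is appended.
print $captured;
```

Using local confines the change to the enclosing block, whereas -l on the #! line affects every print() in the program.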

