Listing 2-17 onceupon: Using the /m modifier
#!/usr/bin/perl -w
$story = "Once upon a time\nthere was a bad man, who died.\nThe end.\n";
$story =~ s/^(\w)+\b/WORD/mg; # Turn the first word of every line into WORD
print $story;
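The ^ in s/^(\w)+\b/WORD/mg matches not just once, but three times: at the beginning of the string, as usual, and after each of the first two newlines. (Even with /m, ^ doesn't match after a newline that is the last character in the string.) Each time, onceupon replaces the first word in the line with WORD. The parentheses capture only the last character of that word, because a repeated group keeps its final iteration, but the substitution never uses the capture.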
% onceupon
RESULT:
WORD upon a time
WORD was a bad man, who died.
WORD end.
The Metacharacters \A and \Z
- \A matches the beginning of a string, like ^.
- \Z matches the end of a string, like $.
- But unlike ^ and $, neither \A nor \Z matches multiple times when /m is used.
When using /m, ^ matches multiple times: at the beginning of the string and immediately after every embedded newline. Likewise, $ matches immediately before every newline and at the very end of the string. \A and \Z are stricter; they match only at the actual beginning and end of the entire string, ignoring the embedded newlines (although \Z, like $ without /m, still matches just before a newline that ends the string). In onceupon, we could have substituted \A for ^:
$story =~ s/\A(\w)+\b/WORD/mg;
which would replace only the first word of the string with WORD, like this:
WORD upon a time
there was a bad man, who died.
The end.
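One way to see the difference between ^ and \A under /m is to count substitutions, since s///g returns the number of replacements it made. The following is a minimal sketch that reuses the onceupon string; the variable names ($caret_count, $anchor_count, and the two copies) are illustrative and not part of the original listing.

```perl
#!/usr/bin/perl -w
use strict;

my $story = "Once upon a time\nthere was a bad man, who died.\nThe end.\n";

# Work on copies so each substitution starts from the same text.
my $caret_copy  = $story;
my $anchor_copy = $story;

# s///g returns the number of substitutions performed.
my $caret_count  = ($caret_copy  =~ s/^(\w)+\b/WORD/mg);   # ^ matches at every line start
my $anchor_count = ($anchor_copy =~ s/\A(\w)+\b/WORD/mg);  # \A matches only at the string start

print "^ replaced $caret_count word(s); \\A replaced $anchor_count word(s)\n";
```

With ^, every line gets a replacement; with \A, the /m and /g modifiers make no difference because there is only one place the pattern can match.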
Because the /s modifier lets . match newlines, you can match phrases split across lines.
$lines = "Perl 5 is object\noriented";
print "Hit! \n" if $lines =~ /object.oriented/;
print "Hit with /m! \n" if $lines =~ /object.oriented/m;
print "Hit with /s! \n" if $lines =~ /object.oriented/s;
will print only Hit with /s!
/m and /s aren't mutually exclusive. Suppose you want to remove everything between the outer BEGIN and END below.
We'd like to keep this line
BEGIN
but not this line
or this one, or the BEGIN or END.
END
This line should stay, too.
You can do it using /m to match the BEGIN and END and /s to treat the newlines as plain characters.
s/^BEGIN.*^END//sm
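To check how the two modifiers cooperate, the substitution can be run against the sample text. This is a sketch only: the exact line breaks in $text are an assumption about how the sample was laid out, and the variable name is illustrative.

```perl
#!/usr/bin/perl -w
use strict;

# Assumed layout of the sample text; BEGIN and END each start a line.
my $text = "We'd like to keep this line\n"
         . "BEGIN\n"
         . "but not this line\n"
         . "or this one, or the BEGIN or END.\n"
         . "END\n"
         . "This line should stay, too.\n";

# /m lets ^ match at the start of each line;
# /s lets .* run across the newlines in between.
$text =~ s/^BEGIN.*^END//sm;

print $text;
```

Note that the BEGIN and END inside the fourth line are ignored because neither starts a line, and that the newline following the outer END is not part of the match, so an empty line is left behind where the block used to be.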
Perl also provides a few special variables for accessing the results of a match.
The Special Variables $`, $', $&, and $+
- $` (or $PREMATCH) returns everything before the matched string.
- $' (or $POSTMATCH) returns everything after the matched string.
- $& (or $MATCH) returns the entire matched string.
- $+ (or $LAST_PAREN_MATCH) returns the contents of the last parenthetical match.
Each of these variables has two names: a short name ($`, $', $&, $+) and a synonymous English name ($PREMATCH, $POSTMATCH, $MATCH, $LAST_PAREN_MATCH). You can use the English names only if you place the statement use English at the top of your program.
$_ = "Macron Pilcrow";
/cr../;
print $&;
prints cron, as does
use English;
$_ = "Macron Pilcrow";
/cr../;
print $MATCH;
(The use English statement loads the English module, which makes the longer names available. Modules are the subject of Chapter 8; the English module is discussed in Chapter 8, Session 2. Every punctuation special variable has an English name ($_ is also $ARG); there's a complete list in Appendix H.)
quadrang (Listing 2-18) matches the string quadrangle to a very simple regex: /d(.)/.
Listing 2-18 quadrang: Using $`, $', $&, and $+
#!/usr/bin/perl -wl
$_ = 'quadrangle';
/d(.)/;   # match a d followed by any character
print $`; # What's before dr
print $'; # What's after dr
print $&; # What matched /d(.)/
print $+; # What matched (.)
If it didn't match, the four variables would keep whatever they held from the last successful match (here nothing, because there wasn't one). But quadrangle does match, so here's what you see:
% quadrang
RESULT:
qua
angle
dr
r
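In quadrang there is only one pair of parentheses, so $+ and $1 hold the same thing. $+ becomes more useful when a pattern contains alternative groups and you don't know in advance which one matched. The sketch below illustrates that; the helper last_paren and the sample strings are hypothetical and chosen only for the demonstration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns whatever the matching alternative captured.
# Only one of the two groups takes part in a successful match, and $+
# holds the text of that group.
sub last_paren {
    my ($string) = @_;
    return $string =~ /(\d+)|([a-z]+)/ ? $+ : undef;
}

my $digits  = last_paren('--42--');    # the first group matches
my $letters = last_paren('--perl--');  # the second group matches

print "$digits\n$letters\n";
```

Without $+, the caller would have to test $1 and $2 in turn to find out which one is defined.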
Why was each variable printed on a separate line? Because quadrang uses the -l command-line flag.
The -l Command-Line Flag
When the -l command-line flag is supplied, Perl adds a newline after every print().
The -l flag lets you skip the newlines at the end of every print(). It does a little bit more, too: It automatically chomp()s when used with -n or -p and, if you supply digits afterward, it assigns that value (interpreted as an octal number) to the line terminator $\ (see Chapter 6, Session 4).
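The effect of -l on print() can be approximated inside a script by setting $\ yourself. The following is a rough sketch under that assumption, not a description of how -l is implemented; it prints to an in-memory filehandle ($out, writing into $captured) only so that the result can be inspected afterward.

```perl
#!/usr/bin/perl -w
use strict;

my $captured = '';
{
    # Assumption: setting $\ by hand mimics what -l does for print().
    # local restores the old value when the block ends.
    local $\ = "\n";

    # An in-memory filehandle collects the output in $captured.
    open(my $out, '>', \$captured) or die "Cannot open in-memory handle: $!";
    print $out 'qua';     # no explicit "\n" needed
    print $out 'angle';
    close($out);
}

print $captured;
```

Each print() gets its newline from $\, which is why quadrang can omit the "\n" from every statement.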