

Review

Robert: Regular expressions don’t seem very regular to me. It’s hard to keep track of all the metacharacters, modifiers, and functions with one- and two-letter names. Is Perl deliberately cryptic?

Cybill: Some people think so. But that’s the price you pay for being able to express complicated patterns concisely. The more you shrink something, the more cryptic it gets.

Robert: Wonderful. And we’re only half-done with the chapter. I bet things get even worse in the second half.

Cybill: Well, there’s one easy rule to remember: Backslash weird characters to get their normal meaning and normal characters to get their weird meaning.

Robert: That makes sense. Still, I’m not sure how many more of these special characters and metacharacters I can handle.

Cybill: There’s a full list in Appendix I.
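Cybill's backslash rule can be sketched in a few lines. This is only an illustration with made-up strings: the variable names are hypothetical, and the rule is a rule of thumb (not every letter becomes special when backslashed).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "Weird" character: an unescaped . matches any character except newline;
# backslashing it (\.) gives back its normal meaning, a literal period.
my $dot_any     = ('3x14' =~ /3.14/)  ? 1 : 0;   # . matches the x
my $dot_literal = ('3x14' =~ /3\.14/) ? 1 : 0;   # \. needs a real period

# "Normal" character: a plain d matches the letter d;
# backslashing it (\d) gives it a weird meaning, any digit.
my $plain_d = ('room 7' =~ /d/)  ? 1 : 0;        # no letter d in 'room 7'
my $digit_d = ('room 7' =~ /\d/) ? 1 : 0;        # \d matches the 7

printf "%-20s %d\n", '3x14   =~ /3.14/',  $dot_any;
printf "%-20s %d\n", '3x14   =~ /3\.14/', $dot_literal;
printf "%-20s %d\n", 'room 7 =~ /d/',     $plain_d;
printf "%-20s %d\n", 'room 7 =~ /\d/',    $digit_d;
```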

Session 5
Pattern Anchors and Some More Special Variables

The patterns you’ve used so far match substrings regardless of their position in the string (Figure 2-8). In Session 3, we used the agree program (Listing 2-7) to illustrate simple matching. Listing 2-15 features agree again (this time slightly beefed up with an explicit m// and the /i modifier).


Figure 2-8  Matching is like a funnel

Listing 2-15 agree: An example of matching with =~

#!/usr/bin/perl -w
print 'Yes or no? ';
chomp($answer = <>);
if ($answer =~ m/y/i) { # TRUE if $answer is y or yes or okay or NO WAY!
    print "I'm glad you agree!\n";
}

But even with its improvements, we may still be able to fool agree.

% agree
Yes or no? yes
I’m glad you agree!

% agree
Yes or no? not really
I’m glad you agree!

Whoops—because not really contains a y, agree agreed. We need a way to anchor the y to the beginning of the string so that only strings starting with y match (Figure 2-9).


Figure 2-9  Anchors lock down the beginning and end of a string


Anchors: the ^ and $ Special Characters
^ matches the beginning of the string.
$ matches the end.

Use ^ and $ whenever you can; an anchored pattern gives the regex engine fewer positions to try, which often speeds up matching.

^ and $ don’t match particular characters, just positions. A few examples:

/^Whereas/

matches any string beginning with Whereas.

/\s$/

matches any string ending with whitespace.

$greeting =~ s/Goodbye$/Adieu/;

replaces the Goodbye at the end of $greeting with Adieu.

/^Turkey stuffing$/

matches only one string: Turkey stuffing.
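The anchored patterns above can be tried out in a short script. This is a minimal sketch: the helper anchors_matched and the sample strings are made up for illustration and are not part of the book's listings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report which of the anchored patterns match $s.
sub anchors_matched {
    my ($s) = @_;
    my @hits;
    push @hits, 'starts with Whereas'  if $s =~ /^Whereas/;
    push @hits, 'ends with whitespace' if $s =~ /\s$/;
    push @hits, 'is Turkey stuffing'   if $s =~ /^Turkey stuffing$/;
    return @hits;
}

# Made-up sample strings, chosen only to exercise each anchor.
for my $s ('Whereas the parties agree', 'And Whereas', 'trailing space ',
           'Turkey stuffing', 'Turkey stuffing recipe') {
    my @hits = anchors_matched($s);
    print "'$s': ", (@hits ? join(', ', @hits) : 'no match'), "\n";
}

# s/// with $ replaces only the Goodbye at the end of the string.
my $greeting = 'Goodbye, and Goodbye';
$greeting =~ s/Goodbye$/Adieu/;
print "$greeting\n";
```

Note that 'And Whereas' and 'Turkey stuffing recipe' match nothing: the text is present, but not at the anchored position.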

/^Turkey stuffing$/ is the first regular expression in this chapter that matches exactly one string because it’s anchored at both beginning and end. (But it’s useless: If there’s only one possible match, you should be using eq instead.) agree2 (Listing 2-16) will match any $answer that begins with a y or a Y.

Listing 2-16 agree2: Using ^ to anchor a pattern

#!/usr/bin/perl

print 'Yes or no? ';
chomp($answer = <>);

if ($answer =~ /^y/i) {  # TRUE if $answer is y or yes or Yo or yeah or yahoo!
    print "I'm glad you agree!\n";
}

% agree2
Yes or no? Yes
I’m glad you agree!

% agree2
Yes or no? no way!

% agree2
Yes or no? yup
I’m glad you agree!
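To compare the two tests without typing answers by hand, the unanchored and anchored patterns can be wrapped in subroutines and run over a list. This is a sketch only: the sub names agrees_unanchored and agrees_anchored and the list of answers are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrappers around the two tests discussed in this session.
sub agrees_unanchored { my ($answer) = @_; return $answer =~ m/y/i ? 1 : 0; }
sub agrees_anchored   { my ($answer) = @_; return $answer =~ /^y/i ? 1 : 0; }

# Made-up answers; each contains a y somewhere.
for my $answer ('yes', 'Yes', 'yup', 'not really', 'no way!', 'okay') {
    printf "%-12s unanchored=%d anchored=%d\n",
        $answer, agrees_unanchored($answer), agrees_anchored($answer);
}
```

Every answer in the list satisfies the unanchored test, but only those that begin with y or Y satisfy the anchored one.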


The /m and /s Modifiers for s/// and m//
/m lets ^ and $ match more than once inside a string.
/s lets . match newlines.

m// and s/// each have two related modifiers: /m and /s.

Got that? By default, m// and s/// assume that the string being matched is a single line of text. But strings can have embedded newlines, as in the string “Line ONE \n Line TWO. \n”. The /m modifier tells Perl to treat such a string as several lines, so that ^ and $ match at the beginning and end of each line as well as at the beginning and end of the whole string. onceupon (Listing 2-17) shows the /m modifier in action.
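Before turning to that listing, here is a minimal sketch of what the two modifiers change. It assumes a variant of the two-line string above with the extra spaces removed, and the variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample: two lines in one string, without the extra spaces.
my $text = "Line ONE\nLine TWO\n";

# Without /m, ^ matches only at the very start of the string.
my @starts_default = ($text =~ /^Line/g);
# With /m, ^ also matches just after each embedded newline.
my @starts_multi   = ($text =~ /^Line/mg);

# Without /s, . will not match a newline, so ONE.+TWO cannot span lines.
my $dot_default = ($text =~ /ONE.+TWO/)  ? 1 : 0;
# With /s, . matches newlines too, so the pattern can cross the line break.
my $dot_single  = ($text =~ /ONE.+TWO/s) ? 1 : 0;

print "^Line without /m: ", scalar(@starts_default), " match(es)\n";
print "^Line with /m:    ", scalar(@starts_multi),   " match(es)\n";
print "ONE.+TWO without /s: $dot_default\n";
print "ONE.+TWO with /s:    $dot_single\n";
```

The two modifiers are independent: /m changes only what ^ and $ do, and /s changes only what . does, so they can be combined when both behaviors are wanted.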

