

Listing 2-8 tickets6: Specifying a substring

#!/usr/bin/perl
@movies = ('Flashdance', 'Invasion of the Body Snatchers', 'King Kong',
           'Raiders of the Lost Ark', 'Flash Gordon');
print 'What movie would you like to see? ';
chomp($_ = <>);
foreach $movie (@movies) {
    $found = ($movie =~ /$_/); # $found is TRUE if $movie contains $_
    if ($found) {
       print "Oh!  You mean $movie!\n";
       last;                   # exit the foreach loop
    }
}
if (!$found) { print "Hmmmm.  Never heard of $_.\n"; }

Pay close attention to the line that sets $found. First, $movie is matched against the regex /$_/. If $movie contains $_, the match returns TRUE and $found is set to 1. In that case, the subsequent if statement fires, the title is printed, and last exits the foreach loop.

% tickets6
RESULT: What movie would you like to see? Raiders
Oh!  You mean Raiders of the Lost Ark!
% tickets6
RESULT: What movie would you like to see? Kong
Oh!  You mean King Kong!
% tickets6
RESULT: What movie would you like to see? body snatchers
Hmmmm.  Never heard of body snatchers.

Whoops! tickets6 didn’t recognize body snatchers because the capitalization wasn’t correct. Wouldn’t it be nice if you didn’t have to worry about capitalization? Luckily, s/// and m// have the /i modifier, which tells Perl to perform case-insensitive matches.


The /i Modifier for s/// and m//
The /i modifier enables case-insensitive matching.

Had tickets6 used /$_/i in place of /$_/, it would have accepted both body snatchers and Body Snatchers (and boDy SnAtchERS...).
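As a minimal sketch of that change, the loop from tickets6 is wrapped below in a helper called find_movie() (a hypothetical name, not part of the original listing) with /i added to the match. As in tickets6, the user's text is interpolated into the regex unchanged, so any regex metacharacters it contains keep their special meaning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# find_movie() is a hypothetical helper, not part of tickets6.
# It returns the first title that contains $query, ignoring case,
# or nothing (undef in scalar context) if no title matches.
sub find_movie {
    my ($query, @titles) = @_;
    foreach my $title (@titles) {
        return $title if $title =~ /$query/i;   # /i: ignore capitalization
    }
    return;
}

my @movies = ('Flashdance', 'Invasion of the Body Snatchers', 'King Kong',
              'Raiders of the Lost Ark', 'Flash Gordon');

foreach my $query ('body snatchers', 'boDy SnAtchERS', 'Casablanca') {
    my $movie = find_movie($query, @movies);
    if (defined $movie) { print "Oh!  You mean $movie!\n" }
    else                { print "Hmmmm.  Never heard of $query.\n" }
}
```

The first two queries find Invasion of the Body Snatchers; the third finds nothing.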

If you want to compare two strings ignoring capitalization (as in C’s strcasecmp()), you could use the /i modifier and regex anchors (covered in Session 5). Or you could use these functions instead:


The lc(), uc(), lcfirst(), and ucfirst() Functions
lc(STRING) returns STRING with all letters lowercased.
uc(STRING) returns STRING with all letters uppercased.
lcfirst(STRING) returns STRING with the first letter lowercased.
ucfirst(STRING) returns STRING with the first letter uppercased.

Both

lc($string1) eq lc($string2)

and

uc($string1) eq uc($string2)

will be TRUE if $string1 and $string2 are equivalent strings (regardless of case).
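Here is a short sketch of that comparison, along with the four functions applied to sample strings. The helper name same_ignoring_case() is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# same_ignoring_case() is a hypothetical helper: it returns 1 if the two
# strings are equal once both are lowercased, and 0 otherwise.
sub same_ignoring_case {
    my ($string1, $string2) = @_;
    return lc($string1) eq lc($string2) ? 1 : 0;
}

print same_ignoring_case('King Kong', 'KING kong'), "\n";   # prints 1
print same_ignoring_case('King Kong', 'King Konga'), "\n";  # prints 0

print lc('King Kong'), "\n";          # king kong
print uc('King Kong'), "\n";          # KING KONG
print lcfirst('King Kong'), "\n";     # king Kong
print ucfirst('flash gordon'), "\n";  # Flash gordon
```

Unlike a regex match, eq compares the whole string, so King Kong and King Konga are not considered equal.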

But back to tickets6.

% tickets6
RESULT: What movie would you like to see? Flash
Oh!  You mean Flashdance!

Hmmm...Flash Gordon would have been a better choice. Unfortunately, tickets6 immediately matched Flash to Flashdance, leaving the foreach loop before Flash Gordon was tested. The \b metacharacter will help.


The \b and \B Metacharacters
\b is a word boundary; \B is anything that’s not.

So

/or\b/

matches any word that ends in or, and

/\bmulti/

matches any word that begins with multi, and

/\b$_\b/

matches any string that contains $_ as an unbroken word. That’s what tickets6 needs.
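The sketch below shows one way tickets6 might apply this. The lookup lives in a helper called find_whole_word() (a hypothetical name, not from the original listing), and the only change to the match itself is the pair of \b anchors around the user's text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# find_whole_word() is a hypothetical helper, not part of tickets6.
# It returns the first title that contains $query as an unbroken word,
# or nothing (undef in scalar context) if no title does.
sub find_whole_word {
    my ($query, @titles) = @_;
    foreach my $title (@titles) {
        return $title if $title =~ /\b$query\b/;
    }
    return;
}

my @movies = ('Flashdance', 'Invasion of the Body Snatchers', 'King Kong',
              'Raiders of the Lost Ark', 'Flash Gordon');

foreach my $query ('Flash', 'Flashdance', 'Raid') {
    my $movie = find_whole_word($query, @movies);
    if (defined $movie) { print "Oh!  You mean $movie!\n" }
    else                { print "Hmmmm.  Never heard of $query.\n" }
}
```

Flash now skips Flashdance, because the d that follows Flash is a word character and so there is no boundary there, and finds Flash Gordon instead. The trade-off is that partial words such as Raid no longer match anything.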

The tickets7 program (Listing 2-9) uses two new regex features, \d and parentheses, to ask users to pay for their movie.

Listing 2-9 tickets7: Using parentheses and \d

#!/usr/bin/perl -w

print 'That will be $7.50, please: ';
chomp($money = <>);

# If the user types 6.75, this places 6 into $dollars and 75 into $cents.
($dollars, $cents) = ($money =~ /(\d+)\.(\d\d)/);


$cents += ($dollars * 100);
if ($cents < 750) { print "That's not enough!\n" }

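The line that assigns to ($dollars, $cents) matches $money against the regex /(\d+)\.(\d\d)/. Each pair of parentheses captures part of the match: the first captures the digits before the decimal point and the second captures the two digits after it. In list context, the match returns those captured pieces, which land in $dollars and $cents.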

% tickets7
RESULT: That will be $7.50, please: $6.75
That's not enough!
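If the input contains no dollars-and-cents amount at all, the match fails and both $dollars and $cents are left undefined. The sketch below assumes a helper called to_cents() (a hypothetical name, not part of tickets7) that uses the same regex but checks for that case before doing any arithmetic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# to_cents() is a hypothetical helper, not part of tickets7.
# It returns the amount in cents, or nothing (undef in scalar context)
# if $money does not contain digits, a decimal point, and two more digits.
sub to_cents {
    my ($money) = @_;
    my ($dollars, $cents) = ($money =~ /(\d+)\.(\d\d)/);
    return unless defined $dollars;   # a failed match returns an empty list
    return $dollars * 100 + $cents;
}

foreach my $input ('$6.75', '7.50', 'seven') {
    my $total = to_cents($input);
    if    (!defined $total) { print "${input}: I don't understand that amount.\n" }
    elsif ($total < 750)    { print "${input}: That's not enough!\n" }
    else                    { print "${input}: Thank you!\n" }
}
```

The leading dollar sign in $6.75 is harmless because the regex is not anchored: it simply finds 6.75 inside the string.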

As you might have guessed, \d matches digits.


The \d and \D Metacharacters
\d is any digit (0 through 9); \D is any character that isn’t.

These statements all do the same thing:

print "That's not a number from 0 to 9!" if ! /\d/;

print "That's not a number from 0 to 9!" unless m^\d^;

if ($_ !~ /\d/) { print "That's not a number from 0 to 9!" }

print "That's not a number from 0 to 9!" unless /0|1|2|3|4|5|6|7|8|9/;
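All four statements use \d (or its spelled-out equivalent); \D works the same way in reverse. The sketch below compares the two on a few sample strings. The helper names has_digit() and has_non_digit() are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers: each returns 1 if the string contains at least
# one character of the given kind, and 0 otherwise.
sub has_digit     { my ($s) = @_; return $s =~ /\d/ ? 1 : 0; }
sub has_non_digit { my ($s) = @_; return $s =~ /\D/ ? 1 : 0; }

foreach my $input ('42', 'route 66', 'hello') {
    printf "%-10s contains a digit: %d   contains a non-digit: %d\n",
           $input, has_digit($input), has_non_digit($input);
}
```

A string such as 42 contains digits and nothing else, so /\d/ matches it and /\D/ does not; route 66 matches both; hello matches only /\D/.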

Remember the Perl motto: There’s more than one way to do it.

