It's been a while since we've seen any new functions. In fact, there haven't been any in this chapter until now, because m//, s///, and tr/// are actually operators, not functions. But here's one: grep() (Figure 2-10).
Figure 2-10 The grep() function extracts certain elements from an array
The grep() Function
grep(EXPRESSION, ARRAY) extracts any elements from ARRAY for which EXPRESSION is TRUE.
grep() returns a subarray of ARRAY's elements. Often, but not always, EXPRESSION is a regular expression.
@has_digits = grep(/\d/, @array);
@has_digits now contains any elements of @array with digits.
Within a grep(), $_ is temporarily set to each element of the array in turn, so you can use $_ in the EXPRESSION:
@odds = grep($_ % 2, @numbers);
@odds now contains all the odd numbers in @numbers.
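The two one-liners above can be tried out with a short script. This is only a minimal sketch: the contents of @array and @numbers are made-up sample data, not values from the text.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample data, for illustration only.
my @array   = ('abc', 'a1', '42', 'x-y');
my @numbers = (1, 2, 3, 4, 5, 10);

# Keep the elements that contain at least one digit.
my @has_digits = grep(/\d/, @array);

# Keep the elements for which $_ % 2 is nonzero, that is, the odd numbers.
my @odds = grep($_ % 2, @numbers);

print "@has_digits\n";   # prints: a1 42
print "@odds\n";         # prints: 1 3 5
```

In both calls the original arrays are left untouched; grep() only selects from them.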
You can even use grep() to emulate a foreach loop that modifies array elements.
grep(s/a/b/g, @strings);
replaces all a's with b's in every element of @strings. It also returns a list of the elements in which a substitution occurred, which the above statement ignores. Try to avoid this; people who hate side effects consider this poor form since a foreach loop is a cleaner alternative (not to mention faster).
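For comparison, the foreach alternative mentioned above might look like the following minimal sketch. The contents of @strings are hypothetical sample data.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample data, for illustration only.
my @strings = ('banana', 'cab', 'xyz');

# foreach aliases $_ to each element in turn, so s/// edits @strings in place.
foreach (@strings) {
    s/a/b/g;
}

print "@strings\n";   # prints: bbnbnb cbb xyz
```

The loop makes the intent (modify every element) explicit and builds no throwaway return list.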
Movie sequels are never as good as the originals. Luckily, they're usually easy to identify by the Roman numerals in their titles. We'd like to weed them out from an array of movies. We'll do this using grep(). (The name comes from the ed editor command g/re/p, which globally searches for a regular expression and prints the matching lines. That's not what the Perl grep does, however.) nosequel (Listing 2-22) demonstrates grep().
Listing 2-22 nosequel: Using grep() to extract array elements
#!/usr/bin/perl -w
@movies = ('Taxi Driver', 'Rocky IV', 'Casablanca', 'Godfather II',
           'Friday the 13th Part VI', 'I, Claudius', 'Pulp Fiction',
           'Police Academy III');
# extract elements that don't (!) contain a word boundary (\b) followed by
# one or more I's or V's ((I|V)+), at the end of the string ($).
@good_movies = grep( ! /\b(I|V)+$/, @movies);
print "@good_movies";
nosequel excludes any movie with I's and V's at the end of its title. (Hopefully we'll never need to exclude X's or L's.)
% nosequel
Taxi Driver Casablanca I, Claudius Pulp Fiction
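Should X's and L's ever need excluding after all, one possible variation swaps the alternation for a character class. This is a sketch under assumptions, not part of nosequel: the movie list is partly made up ('Rocky XL' is hypothetical), and the class [IVXL] is just one way to write the pattern.

```perl
#!/usr/bin/perl -w
use strict;

# Sample titles; 'Rocky XL' is a hypothetical sequel used for illustration.
my @movies = ('Casablanca', 'Rocky IV', 'Rocky XL', 'I, Claudius', 'Malcolm X');

# Drop titles ending in a word made up only of the letters I, V, X, and L.
my @good_movies = grep( ! /\b[IVXL]+$/, @movies);

print "@good_movies\n";   # prints: Casablanca I, Claudius
```

Note the false positive: 'Malcolm X' is not a sequel, but it ends in a lone X and is excluded anyway. The wider the pattern, the more titles like this it catches.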
/\LEE\EEE/
$code = '82940374837';
($first, $sep, $second) = ($code =~ /^([2-9])(0|1)([2-9])/);
($first, $sep, $second) = ($code =~ /^([^01]+)(0|1)([^01]+)/);
($first, $sep, $second) = ($code =~ /^([^0-1])+(0|1)([^0-1])+/);
($first, $sep, $second) = ($code =~ /^([^2-9+])(0|1)([^2-9+])/);
@codelines = grep(!/[^2-9]+/, @text);
@codelines = grep(/./, @text);
@codelines = grep(/[^01]+/, @text);
@codelines = grep(/[23-89]+/, @text);
@bignums = grep($_ > 1234, @numbers);
@bignums = grep($_ = 1234, @numbers);
@bignums = grep(/\d\d\d\d/, @numbers) > 1234;
@bignums = grep('$_ > 1234', @numbers);
Difficulty: Hard
Write a program that reads a Perl script and prints a modified version of the script: All comments should be removed and one-line if statements of the form if (CONDITION) {STATEMENT} should be changed to STATEMENT if CONDITION. Assume that STATEMENT doesn't contain a semicolon.
Hint: Use grep() and backreferences.
Pat yourself on the back; you've now covered most of regular expressions. The remaining two sessions in this chapter cover some regex arcana: features that you should know about, but might well never need.