Refactoring |
![]() |
Improving the Design of Existing Code | |
by Martin Fowler, with contributions by Kent Beck, John Brant, William Opdyke, and Don Roberts | |
Chapter 1 |
I originally wrote the Perl5 version shortly after I received the book in 2003 and then more
or less forgot about it. I recently revisited my code, which I will not
make available here, and it made me WinCE™ (ha, ha, killer joke, I know).
Ladies and gentlemen, I have four
words for ya: my old code sucks. Ever heard of pseudohashes? No? Good. Pseudohashes were
an experimental feature that has fortunately gone the way of the Dodo. What I thought
was a good idea at the time (using the fields pragma) is considered quite the
contrary by today's standards. Writing OOP code in Perl5 has improved dramatically since we
have Moose. Furthermore, since perl
5.10, we now have a real switch/case statement, only it has a proper name:
given/when. (It's also much more
powerful.)
And today it is preferable to create constant values through the
Readonly module, as recommended in
“Perl Best Practices” (O'Reilly) by Damian Conway. The old way still works, but it's
more obscure than necessary:
*FOO = \1;More common was the use of the
constant pragma, or, if you wanted to demonstrate your deep knowledge of Perl:
sub FOO () { 1 }
Both of these variants had the disadvantage to produce constants
that couldn't be interpolated into strings. Finally, I used chromatic's
Modern::Perl
so I don't have to write the whole litany of use strict; use warnings; use feature ":5.10";
etc. on top of every module.
So I rewrote the whole thing in a coding style that should be more
appealing to a year 2011 perl programmer. OK, time for a little rant: Perl-haters will always
look down on Perl5 code no matter how it is written. You only have to read the comments on
Proggit
whenever an article about Perl (any version) gets posted. To use a little conversation snippet from
Frank Zappa's “Porn Wars”, only modifying it slightly:
Speaker1: Have you seen this line noise?
Speaker2: Unbelievable!
Speaker3: What, all those crazy symbols?
Speaker1: Every symbol found on the keyboard? Parseable only by Perl itself? Unbelievable!
I also included links to the different versions of the modules as they undergo
their refactory evolution, so you can test them yourself. Here's a
little test script you can use with prove, the tool that comes with Test::More:
$ prove -v rental.tI recommend using symlinks for testing the different versions:
$ ln -sf Movie1.pm Movie.pm $ ln -sf Rental1.pm Rental.pm $ ln -sf Customer1.pm Customer.pmThen you overwrite the symlinks successively as the refactoring progresses and run
prove
to make sure everything still works correctly (Poor man's SCM).
I tried to write the code as strict as possible, so I used two useful Moose
extensions:
MooseX::Params::Validate
and
MooseX::StrictConstructor.
But once I validated input and then merely pass it around to other methods, I don't validate it again and
again ad nauseam.
While I'm deeply fascinated by
MooseX::Declare,
I chose not to use it here, because it's still considered experimental by its authors. Once it reaches
version 1.0 or so, I will probably rewrite the code again, because it's too cool. It isn't even a
source filter!
One final note: CamelCase is_not_much_used in Perl, but kept the style of the original Java code as
much as possible. The same goes for choice of variable names: $each is a
horrible name for a variable, especially when used as an iteration alias, but I kept
anyway, for the same reason.
Feel free to write any suggestions to: jczeus [at] googlemail [dot] com"
This document was made entirely with open-source-tools: vim editor,
Dia for the diagrams.
You can order the original book at the following addresses:
ISBN 0-201-48567-2 (Amazon)
ISBN 0-201-48567-2 (Barnes & Noble)
Jean-Christophe Zeus, February 2011
So I'm going to start this book with an example of refactoring. During the process I'll tell you a lot about how refactoring works and give you a sense of the process of refactoring. I can then provide the usual principles-style introduction.
With an introductory example, however, I run into a big problem. If I pick a large program, describing it and how it is refactored is too complicated for any reader to work through. (I tried and even a slightly complicated example runs to more than a hundred pages.) However, if I pick a program that is small enough to be comprehensible, refactoring does not look like it is worthwhile.
Thus I'm in the classic bind of anyone who wants to describe techniques that are useful for real-world programs. Frankly it is not worth the effort to do the refactoring that I'm going to show you on a small program like the one I'm going to use. But if the code I'm showing you is part of a larger system, then the refactoring soon becomes important. So I have to ask you to look at this and imagine it in the context of a much larger system.
Several classes represent various video elements. Here's a class diagram to show them (Figure 1.1).
![]() |
| Figure 1.1 Class diagram of the starting-point classes. Only the most important features are shown. The notation is Unified Modeling Language [Fowler, UML]. |
I'll show the code for each of these classes in turn.
# Movie.pm package Movie; use Modern::Perl; use Moose; use Moose::Util::TypeConstraints; use MooseX::StrictConstructor; use Readonly; Readonly our $CHILDRENS => 2; Readonly our $REGULAR => 0; Readonly our $NEW_RELEASE => 1; enum 'PriceCode' => [$REGULAR, $NEW_RELEASE, $CHILDRENS]; has 'priceCode' => ( is => 'rw', isa => 'PriceCode', default => $REGULAR, reader => 'getPriceCode', writer => 'setPriceCode', ); has 'title' => ( is => 'ro', isa => 'Str', required => 1, reader => 'getTitle', ); no Moose; __PACKAGE__->meta->make_immutable; 1;
# Rental.pm package Rental; use Modern::Perl; use Moose; use Moose::Util::TypeConstraints; use MooseX::StrictConstructor; use Movie; subtype 'PositiveInt' => as 'Int', => where { $_ > 0 } => message { "$_ is no positive number" }; has 'movie' => ( is => 'ro', isa => 'Movie', required => 1, reader => 'getMovie', ); has 'daysRented' => ( is => 'ro', isa => 'PositiveInt', default => 1, reader => 'getDaysRented', ); no Moose; __PACKAGE__->meta->make_immutable; 1;
# Customer.pm package Customer; use Modern::Perl; use Moose; use MooseX::StrictConstructor; use MooseX::Params::Validate; use Rental; has 'name' => ( is => 'ro', isa => 'Str', required => 1, reader => 'getName', ); has 'rentals' => ( is => 'ro', isa => 'ArrayRef[Rental]', default => sub { [] }, reader => '_getRentals', ); sub addRental { my $self = shift; my ($rental) = pos_validated_list(\@_, { isa => 'Rental' }); push @{ $self->_getRentals }, $rental; return; }Customer also has the method that produces a statement. Figure 1.2 shows the interactions for this method.
![]() |
| Figure 1.2 Interactions for the statement method |
sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = 0.0; # determine amounts for each line given ($each->getMovie->getPriceCode) { when ($Movie::REGULAR) { $thisAmount += 2.0; if ($each->getDaysRented > 2) { $thisAmount += ($each->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $thisAmount += $each->getDaysRented * 3; } when ($Movie::CHILDRENS) { $thisAmount += 1.5; if ($each->getDaysRented > 3) { $thisAmount += ($each->getDaysRented - 3) * 1.5; } } } # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } no Moose; __PACKAGE__->meta->make_immutable; 1;
Even so the program works. Is this not just an aesthetic judgment, a dislike of ugly code? It is until we want to change the system. The compiler doesn't care whether the code is ugly or clean. But when we change the system, there is a human involved, and humans do care. A poorly designed system is hard to change. Hard because it is hard to figure out where the changes are needed. If it is hard to figure out what to change, there is a strong chance that the programmer will make a mistake and introduce bugs.
In this case we have a change that the users would like to make. First they want a statement printed in HTML so that the statement can be Web enabled and fully buzzword compliant. Consider the impact this change would have. As you look at the code you can see that it is impossible to reuse any of the behavior of the current statement method for an HTML statement. Your only recourse is to write a whole new method that duplicates much of the behavior of statement. Now, of course, this is not too onerous. You can just copy the statement method and make whatever changes you need.
But what happens when the charging rules change? You have to fix both
statement and htmlStatement and ensure the fixes are
consistent. The problem with copying and pasting code comes when you have to
change it later. If you are writing a program that you don't expect to change,
then cut and paste is fine. If the program is long lived and likely to change,
then cut and paste is a menace.
This brings me to a second change. The users want to make changes to the way they classify movies, but they haven't yet decided on the change they are going to make. They have a number of changes in mind. These changes will affect both the way renters are charged for movies and the way that frequent renter points are calculated. As an experienced developer you are sure that whatever scheme users come up with, the only guarantee you're going to have is that they will change it again within six months.
The statement method is where the changes have to be made to deal with changes in classification and charging rules. If, however, we copy the statement to an HTML statement, we need to ensure that any changes are completely consistent. Furthermore, as the rules grow in complexity it's going to be harder to figure out where to make the changes and harder to make them without making a mistake.
You may be tempted to make the fewest possible changes to the program; after all, it works fine. Remember the old engineering adage: "if it ain't broke, don't fix it." The program may not be broken, but it does hurt. It is making your life more difficult because you find it hard to make the changes your users want. This is where refactoring comes in.
Because the statement result produces a string, I create a few customers, give each customer a few rentals of various kinds of films, and generate the statement strings. I then do a string comparison between the new string and some reference strings that I have hand checked. I set up all of these tests so I can run them from one command on the command line. The tests take only a few seconds to run, and as you will see, I run them often.
An important part of these tests is the way they report their results. They either say “OK”, meaning that all the strings are identical to the reference strings, or they print a list of failures: lines that turned out differently. The tests are thus self-checking. It is vital to make tests self-checking. If you don't, you end up spending time hand checking some numbers from the test against some numbers of a desk pad, and that slows you down.
As we do the refactoring, we will lean on the tests. I'm going to be relying on the tests to tell me whether I introduce a bug. It is essential for refactoring that you have good tests. It's worth spending the time to build the tests, because the tests give you the security you need to change the program later. This is such an important part of refactoring that I go into more detail on testing in Chapter 4.
The first phase of refactorings in this chapter show how I split up the long method and move the pieces to better classes. My aim is to make it easier to write an HTML statement method with much less duplication of code.
My first step is to find a logical clump of code and use Extract Method (110). An obvious piece here is the switch statement. This looks like it would make a good chunk to extract into its own method.
When I extract a method, as in any refactoring, I need to know what can go wrong. If I do the extraction badly, I could introduce a bug into the program. So before I do the refactoring I need to figure out how to do it safely. I've done this refactoring a few times before, so I've written down the safe steps in the catalog.
First I need to look in the fragment for any variables that are local in scope
to the method we are looking at, the local variables and parameters. This
segment of code uses two: $each and
$thisAmount. Of
these $each is not modified by the code but
$thisAmount
is modified. Any non-modified variable I can pass in as a parameter. Modified
variables need more care. If there is only one, I can return it. The temp
is initialized to 0 each time around the loop and is not altered until the
switch gets to it. So I can just assign the result.
The following table shows the code before and after refactoring. The before code is on the left, the resulting code on the right. The code I'm extracting from the original and any changes in the new code that I don't think are immediately obvious are in boldface type. As I continue with this chapter, I'll continue with this left-right convention.
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = 0.0; # determine amounts for each line given ($each->getMovie->getPriceCode) { when ($Movie::REGULAR) { $thisAmount += 2.0; if ($each->getDaysRented > 2) { $thisAmount += ($each->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $thisAmount += $each->getDaysRented * 3; } when ($Movie::CHILDRENS) { $thisAmount += 1.5; if ($each->getDaysRented > 3) { $thisAmount += ($each->getDaysRented - 3) * 1.5; } } } # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = $self->_amountFor($each); # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } sub _amountFor { my ($self, $each) = @_; my $thisAmount = 0.0; given ($each->getMovie->getPriceCode) { when ($Movie::REGULAR) { $thisAmount += 2; if ($each->getDaysRented > 2) { $thisAmount += ($each->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $thisAmount += $each->getDaysRented * 3; } when ($Movie::CHILDRENS) { $thisAmount += 1.5; if ($each->getDaysRented > 3) { $thisAmount += ($each->getDaysRented - 3) * 1.5; } } } } |
Whenever I make a change like this, I test. I didn't get off to a very good start - the tests blew up. A couple of the test figures gave me the wrong answer. I was puzzled for a few seconds then realized what I had done. Foolishly I had forgotten the return statement:
package Customer; ... sub _amountFor { my ($self, $each) = @_; my $thisAmount = 0.0; given ($each->getMovie->getPriceCode) { when ($Movie::REGULAR) { $thisAmount += 2; if ($each->getDaysRented > 2) { $thisAmount += ($each->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $thisAmount += $each->getDaysRented * 3; } when ($Movie::CHILDRENS) { $thisAmount += 1.5; if ($each->getDaysRented > 3) { $thisAmount += ($each->getDaysRented - 3) * 1.5; } } } return $thisAmount; } 1;It's the kind of silly mistakes that I often make, and it can be a pain to track down. In this case Perl simply returns the result of the last statement in the method without complaining. Fortunately it was easy to find in this case, because the change was so small and I had a good set of tests. Here is the essence of the refactoring process illustrated by accident. Because each change is so small, any errors are easy to find. You don't spend a long time debugging, even if you are as careless as I am.
Because I'm working in Perl, I need to analyze the code to figure out what to do with the local variables. With a tool, however, this can be made really simple. Such a tool does exist in Smalltalk, the Refactoring Browser. With this tool refactoring is very simple. I just highlight the code, pick “Extract Method” from the menus, type in a method name, and it's done. Furthermore, the tool doesn't make silly mistakes like mine. Padre is a IDE that is capable of doing some refactorings in Perl.
Now that I've broken the original method down into chunks, I can work on them
separately. I don't like some of the variable names in _amountFor,
and this is a good place to change them.
Here's the original code and the renamed code:
# package Customer; ... sub _amountFor { my ($self, $each) = @_; my $thisAmount = 0.0; given ($each->getMovie->getPriceCode) { when ($Movie::REGULAR) { $thisAmount += 2.0; if ($each->getDaysRented > 2) { $thisAmount += ($each->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $thisAmount += $each->getDaysRented * 3; } when ($Movie::CHILDRENS) { $thisAmount += 1.5; if ($each->getDaysRented > 3) { $thisAmount += ($each->getDaysRented - 3) * 1.5; } } } return $thisAmount; } |
# package Customer; ... sub _amountFor { my ($self, $aRental) = @_; my $result = 0.0; given ($aRental->getMovie->getPriceCode) { when ($Movie::REGULAR) { $result += 2.0; if ($aRental->getDaysRented > 2) { $result += ($aRental->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $aRental->getDaysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($aRental->getDaysRented > 3) { $result += ($aRental->getDaysRented - 3) * 1.5; } } } return $result; } |
Once I've done the renaming, I test to ensure I haven't broken anything.
Is renaming worth the effort? Absolutely. Good code should communicate what it is doing clearly, and variable names are a key to clear code. Never be afraid to change the names of things to improve clarity. With good find and replace tools, it is usually not difficult. The strict pragma, runtime type checking and testing will highlight anything you miss. Remember
Code that communicates its purpose is very important. I often refactor just when I'm reading some code. That way as I gain understanding about the program, I embed that understanding into the code for later so I don't forget what I learned.
amountFor, I can see that it uses information from
the rental, but does not use information from the customer.
This immediately raises my suspicions that the method is on the wrong object.
In most cases a method should be on the object whose data it uses, thus the
method should be moved to the rental. To do this I use Move Method (142)
. With this you first copy the code over to the rental, adjust it to fit in
its new home, and compile (with perl -c)
, as follows:
package Customer; ... sub _amountFor { my ($self, $aRental) = @_; my $result = 0.0; given ($aRental->getMovie->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($aRental->getDaysRented > 2) { $result += ($aRental->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $aRental->getDaysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($aRental->getDaysRented > 3) { $result += ($aRental->getDaysRented - 3) * 1.5; } } } return $result; } |
package Rental; ... sub getCharge { my $self = shift; my $result = 0.0; given ($self->getMovie->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($self->getDaysRented > 2) { $result += ($self->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $self->getDaysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($self->getDaysRented > 3) { $result += ($self->getDaysRented - 3) * 1.5; } } } return $result; } |
In this case fitting into its home means removing the
second parameter and renaming $aRental
to $self. I also renamed the method as I did the
move.
I can now test to see whether the method works. To do this I replace the body
of Customer::_amountFor to delegate to the new method.
package Customer; ... sub _amountFor { my ($self, $aRental) = @_; return $aRental->getCharge; }I can now and test to see whether I've broken anything.
The next step is to find every reference to the old method and adjust the reference to use the new method, as follows.
In this case this step is easy because we just created the method and it is in only one place. In general, however, you need to do a "find" across all the classes that might be using that method:
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = $self->_amountFor($each); # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = $each->getCharge; # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
When I've made the change (Figure 1.3) the next thing is to remove the old method. Perl should tell me whether I missed anything. I then test to see if I've broken anything.
![]() |
| Figure 1.3 State of classes after moving the charge method |
Sometimes I leave the old method to delegate to the new method. This is useful if it is a public method and I don't want to change the interface of the other class.
There is certainly some more I would like to do to Rental::getCharge
but I will leave it for the moment and return to
Customer::statement.
The next thing that strikes me is that $thisAmount
is now redundant. It is set to the result of
$each->getCharge and not changed
afterward. Thus I can eliminate $thisAmount
by using Replace Temp with Query (120):
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { my $thisAmount = $each->getCharge; # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t$thisAmount\n"; $totalAmount += $thisAmount; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; $totalAmount += $each->getCharge; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
Once I've made that change I test to make sure I haven't broken anything.
I like to get rid of temporary variables such as this as much as possible. Temps are often a problem in that they cause a lot of parameters to be passed around when they don't have to be. You can easily lose track of what they are there for. They are particularly insidious in long methods. Of course there is a performance price to pay; here the charge is now calculated twice. But it is easy to optimize that in the rental class, and you can optimize much more effectively when the code is properly factored. I'll talk more about that issue later in Refactoring and Performance on page 69.
Again we look at the use of locally scoped variables. Again
$each is used and can be passed in
as a parameter. The other temp used is
$frequentRenterPoints. In this case
$frequentRenterPoints does have a value beforehand. The
body of the extracted method doesn't read the value, however, so we don't need
to pass it in as a parameter as long as we use an appending assignment.
I did the extraction, tested and then did a move and tested again. With refactoring, small steps are the best; that way less tends to go wrong.
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { # add frequent renter points ++$frequentRenterPoints; # add bonus for a two day new release rental if (($each->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $each->getDaysRented > 1) { ++$frequentRenterPoints; } # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; $totalAmount += $each->getCharge; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { $frequentRenterPoints += $each->getFrequentRenterPoints; # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; $totalAmount += $each->getCharge; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } package Rental; ... sub getFrequentRenterPoints { my $self = shift; if (($self->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $self->getDaysRented > 1) { return 2 } else { return 1 } } |
I'll summarize the changes I just made with some before-and-after Unified Modeling Language (UML) diagrams (Figures 1.4 through 1.7). Again the diagrams on the left are before the change; those on the right are after the change.
![]() | |
| Figure 1.4 Class diagram before extraction and movement of the frequent renter points calculation | Figure 1.6 Class diagram after extraction and movement of the frequent renter points calculation |
![]() |
![]() |
| Figure 1.5 Sequence diagram before extraction and movement of the frequent renter points calculation | Figure 1.7 Sequence diagram after extraction and movement of the frequent renter points calculation |
$totalAmount and
$frequentRenterPoints with query
methods. Queries are accessible to any method in the class and thus encourage
a cleaner design without long, complex methods.
I began by replacing $totalAmount
with a charge method on customer:
package Customer; ... sub statement { my $self = shift; my $totalAmount = 0.0; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { $frequentRenterPoints += $each->getFrequentRenterPoints; # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; $totalAmount += $each->getCharge; } # add footer lines $result .= "Amount owed is $totalAmount\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { $frequentRenterPoints += $each->getFrequentRenterPoints; # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; } # add footer lines $result .= "Amount owed is " . $self->getTotalCharge . "\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } sub getTotalCharge { my $self = shift; my $result = 0.0; for my $each (@{ $self->_getRentals }) { $result += $each->getCharge; } return $result; } |
This isn't the simplest case of Replace Temp with Query (120),
$totalAmount was assigned to within
the loop, so I have to copy the loop into the query method.
After testing that refactoring, I did the same for
$frequentRenterPoints:
package Customer; ... sub statement { my $self = shift; my $frequentRenterPoints = 0; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { $frequentRenterPoints += $each->getFrequentRenterPoints; # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; } # add footer lines $result .= "Amount owed is " . $self->getTotalCharge . "\n"; $result .= "You earned $frequentRenterPoints frequent renter points"; return $result; } |
package Customer; ... sub statement { my $self = shift; my $result = 'Rental Record for ' . $self->getName . "\n"; for my $each (@{ $self->_getRentals }) { # show figures for this rental $result .= "\t" . $each->getMovie->getTitle . "\t" . $each->getCharge . "\n"; } # add footer lines $result .= "Amount owed is " . $self->getTotalCharge . "\n"; $result .= "You earned " . $self->getTotalFrequentRenterPoints . " frequent renter points"; return $result; } sub getTotalFrequentRenterPoints { my $self = shift; my $result = 0; for my $each (@{ $self->_getRentals }) { $result += $each->getFrequentRenterPoints; } return $result; } |
Figures 1.8 through 1.11 show the change for these refactorings in the class diagrams and the interaction diagram for the statement method.
![]() |
![]() |
| Figure 1.8 Class diagram before extraction of the totals | Figure 1.10 Class diagram after extraction of the totals |
![]() |
![]() |
| Figure 1.9 Sequence diagram before extraction of the totals | Figure 1.11 Sequence diagram after extraction of the totals |
It is worth stopping to think a bit about the last refactoring. The concern with this refactoring lies in performance. The old code executed the "for" loop once, the new code executes it three times. A for loop that takes a long time might impair performance. Many programmers would not do this refactoring simply for this reason. But note the words if and might. Until I profile I cannot tell how much time is needed for the loop to calculate or whether the loop is called often enough for it to affect the overall performance of the system. Don't worry about this while refactoring. When you optimize you will have to worry about it, but you will be in a much better position to do something about it, and you will have more options to optimize effectively (see the discussion on page 69).
These queries are now available for any code written in the customer class. They can easily be added to the interface of the class should other parts of the system need this information. Without queries like these, other methods have to deal with knowing about the rentals and building the loops. In a complex system, that will lead to much more code to write and mantain.
You can see the difference immediately with the htmlStatement. I
am now at the point where I can take off my refactoring hat and put on my
adding function hat. I can write htmlStatement as follows and add
appropriate tests:
package Customer; ... sub htmlStatement { my $self = shift; my $result = "<H1>Rentals for <EM>" . $self->getName . "</EM></H1><P>\n"; for my $each (@{ $self->_getRentals }) { # show figures for each rental $result .= $each->getMovie->getTitle . ": " . $each->getCharge . "<BR>\n"; } # add footer lines $result .= "<P>You owe <EM>" . $self->getTotalCharge . "</EM><P>\n"; $result .= "On this rental you earned <EM>" . $self->getTotalFrequentRenterPoints . "</EM> frequent renter points<P>"; return $result; }
By extracting the calculations I can create the htmlStatement
method and reuse all of the calculation code that was in the original statement
method. I didn't copy and paste, so if the calculation rules change I have only
one place in the code to go. Any other kind of statement will be really quick
and easy to prepare. The refactoring did not take long. I spent most of the
time figuring out what the code did, and I would have had to do that anyway.
Some code is copied from the ASCII version, mainly due to setting up the loop. Further refactoring could clean that up. Extracting methods for header, footer and detail line are one route I could take. You can see how to do this in the example for Form Template Method (345). But now the users are clamoring again. They are getting ready to change the classification of the movies in the store. It's still not clear what changes they want to make, but it sounds like new classifications will be introduced, and the existing ones could well be changed. The charges and frequent renter point allocations for these classifications are to be decided. At the moment, making these kind of changes is awkward. I have to get into the charge and frequent renter point methods and alter the conditional code to make changes to film classifications. Back on with the refactoring hat.
This implies that getCharge should move onto movie:
package Rental; ... sub getCharge { my $self = shift; my $result = 0.0; given ($self->getMovie->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($self->getDaysRented > 2) { $result += ($self->getDaysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $self->getDaysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($self->getDaysRented > 3) { $result += ($self->getDaysRented - 3) * 1.5; } } } return $result; } |
package Movie; ... sub getCharge { my ($self, $daysRented) = @_; my $result = 0.0; given ($self->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($daysRented > 2) { $result += ($daysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $daysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($daysRented > 3) { $result += ($daysRented - 3) * 1.5; } } } return $result; } package Rental; ... sub getCharge { my $self = shift; return $self->getMovie->getCharge($self->getDaysRented); } |
For this to work I had to pass in the length of the rental, which of course is data from the rental. The method effectively uses two pieces of data, the length of the rental and the type of the movie. Why do I prefer to pass the length of the rental to the movie rather than the movie type to the rental? It's because the proposed changes are all about adding new types. Type information generally tends to be more volatile. If I change the movie type, I want the least ripple effect, so I prefer to calculate the charge within the movie.
I added the method to movie and then changed the getCharge on
rental to use the new method.
Once I've moved the getCharge method, I'll do the same with the
frequent renter point calculation. That keeps both things that vary with the
type together on the class that has the type (Figures 1.12 and 1.13):
package Rental; ... sub getFrequentRenterPoints { my $self = shift; if (($self->getMovie->getPriceCode == $Movie::NEW_RELEASE) && $self->getDaysRented > 1) { return 2 } else { return 1 } } |
package Movie; ... sub getFrequentRenterPoints { my ($self, $daysRented) = @_; if (($self->getPriceCode == $Movie::NEW_RELEASE) && $daysRented > 1) { return 2 } else { return 1 } } package Rental; ... sub getFrequentRenterPoints { my $self = shift; return $self->getMovie->getFrequentRenterPoints($self->getDaysRented); } |
![]() |
| Figure 1.12 lass diagram before moving methods to movie |
![]() |
| Figure 1.13 Class diagram after moving methods to movie |
![]() |
| Figure 1.14 Using inheritance on movie |
This allows me to replace the switch statement by using polymorphism. Sadly it
has one slight flaw - it doesn't work. A movie can change its classification
during its lifetime. An object cannot change its class during its lifetime.
(In fact, we could do this with Moose by writing
RegularMovie, ChildrensMovie and NewReleaseMovie
as roles and apply them to the Movie instance at runtime. What I don't
like with this solution, however, is that you can't currently replace a role
with another, you can only add them.)
There is a solution, however, the State pattern [Gang of Four]. With
the State pattern the classes look like Figure 1.15.
![]() |
| Figure 1.15 Using the State pattern on movie |
By adding the indirection we can do the subclassing from the price code object and change the price whenever we need to.
If you are familiar with the Gang of Four patterns, you may wonder, "Is this a state, or is it a strategy?" Does the price class represent an algorithm for calculating the price (in which case I prefer to call it the Pricer or PricingStrategy), or does it represent a state of the movie (Star Trek X is a new release). At this stage the choice of pattern (and name) reflects how you want to think about the structure. At the moment I'm thinking about this as state of movie. If I later decide a strategy communicates my intention better, I will refactor to do this by changing the names.
To introduce the state pattern I will use three refactorings. First I'll move the type code behavior into the state pattern with Replace Type Code with State/Strategy (227). Then I can use Move Method (142) to move the switch statement into the price class. Finally I'll use Replace Conditional with Polymorphism (255) to eliminate the switch statement.
I begin with Replace Type Code with State/Strategy (227). [...] First I add the new classes. I provide the type code behavior in the price object. I do this with an abstract method on price and concrete methods in the subclasses:
# Price.pm use Modern::Perl; use Movie; { package Price; use Moose::Role; requires 'getPriceCode'; no Moose::Role; } { package ChildrensPrice; use Moose; with 'Price'; sub getPriceCode { return $Movie::CHILDRENS } no Moose; __PACKAGE__->meta->make_immutable; } { package NewReleasePrice; use Moose; with 'Price'; sub getPriceCode { return $Movie::NEW_RELEASE } no Moose; __PACKAGE__->meta->make_immutable; } { package RegularPrice; use Moose; with 'Price'; sub getPriceCode { return $Movie::REGULAR } no Moose; __PACKAGE__->meta->make_immutable; } 1;
Now I need to change the movie
class. I replace the priceCode attribute by a price
attribute. Users of the class, however, should not be able to see the
difference from the outside. To achieve this, I have to write my own BUILD method so that
the user can continue using a priceCode key-value pair when calling the constructor of
the class. Unfortunately, this also means that I can no longer use MooseX::StrictConstructor.
I also write my own getPriceCode and setPriceCode methods, which
were previously automatically generated by Moose. These new methods work as before, as far
as the user of the class can tell, but internally they use the new price attribute.
package Movie; ... use MooseX::StrictConstructor; ... has 'priceCode' => ( is => 'rw', isa => 'PriceCode', default => $REGULAR, reader => 'getPriceCode', writer => 'setPriceCode', ); |
package Movie; ... use MooseX::Params::Validate; ... use Price; ... has 'price' => ( is => 'rw', isa => 'Price', default => sub { RegularPrice->new }, reader => 'getPrice', writer => 'setPrice', ); ... sub BUILD { my ($self, $args) = @_; if (exists $args->{priceCode}) { $self->setPriceCode($args->{priceCode}); } } sub getPriceCode { my $self = shift; return $self->getPrice->getPriceCode; } sub setPriceCode { my $self = shift; my ($priceCode) = pos_validated_list(\@_, { isa => 'PriceCode' }); given ($priceCode) { when ($Movie::REGULAR) { $self->setPrice(RegularPrice->new); } when ($Movie::CHILDRENS) { $self->setPrice(ChildrensPrice->new); } when ($Movie::NEW_RELEASE) { $self->setPrice(NewReleasePrice->new); } } return; } |
I can now test, and the more complex methods don't realize the world has changed.
Now I apply Move Method (142) to getCharge. It is simple to
move:
package Movie; ... sub getCharge { my ($self, $daysRented) = @_; my $result = 0.0; given ($self->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($daysRented > 2) { $result += ($daysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $daysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($daysRented > 3) { $result += ($daysRented - 3) * 1.5; } } } return $result; } |
package Movie; ... sub getCharge { my ($self, $daysRented) = @_; return $self->getPrice->getCharge($daysRented); } { package Price; ... sub getCharge { my ($self, $daysRented) = @_; my $result = 0.0; given ($self->getPriceCode) { when ($Movie::REGULAR) { $result += 2; if ($daysRented > 2) { $result += ($daysRented - 2) * 1.5; } } when ($Movie::NEW_RELEASE) { $result += $daysRented * 3; } when ($Movie::CHILDRENS) { $result += 1.5; if ($daysRented > 3) { $result += ($daysRented - 3) * 1.5; } } } return $result; } ... } |
Once it is moved I can start using Replace Conditional with Polymorphism (255).
I do this by taking one leg of the case statement at time and
creating an overriding method. I start with RegularPrice.
This overrides the parent case statement, which I just leave as it is. I compile and test for this case then take the next leg, compile and test. (To make sure I'm executing the subclass code, I like to throw in a deliberate bug and run it to ensure the tests blow up. Not that I'm paranoid or anything.)
{
package Price;
...
sub getCharge
{
my ($self, $daysRented) = @_;
my $result = 0.0;
given ($self->getPriceCode) {
when ($Movie::REGULAR) {
$result += 2;
if ($daysRented > 2) {
$result += ($daysRented - 2) * 1.5;
}
}
when ($Movie::NEW_RELEASE) {
$result += $daysRented * 3;
}
when ($Movie::CHILDRENS) {
$result += 1.5;
if ($daysRented > 3) {
$result += ($daysRented - 3) * 1.5;
}
}
}
return $result;
}
...
}
|
{
package Price;
...
}
{
package RegularPrice;
...
sub getCharge
{
my ($self, $daysRented) = @_;
my $result = 2.0;
if ($daysRented > 2) {
$result += ($daysRented - 2) * 1.5;
}
return $result;
}
...
}
{
package ChildrensPrice;
...
sub getCharge
{
my ($self, $daysRented) = @_;
my $result = 1.5;
if ($daysRented > 3) {
$result += ($daysRented - 3) * 1.5;
}
return $result;
}
...
}
{
package NewReleasePrice;
...
sub getCharge
{
my ($self, $daysRented) = @_;
return $daysRented * 3;
}
...
}
|
When I've done that with all the legs, I make the
Price::getCharge method abstract:
{
package Price;
...
requires qw(getPriceCode getCharge);
...
}
I can now do the same procedure with getFrequentRenterPoints.
First I move the method over to Price:
package Movie; ... sub getFrequentRenterPoints { my ($self, $daysRented) = @_; if (($self->getPriceCode == $Movie::NEW_RELEASE) && $daysRented > 1) { return 2 } else { return 1 } } |
package Movie; ... sub getFrequentRenterPoints { my ($self, $daysRented) = @_; return $self->getPrice->getFrequentRenterPoints($daysRented); } { package Price; ... sub getFrequentRenterPoints { my ($self, $daysRented) = @_; if (($self->getPriceCode == $Movie::NEW_RELEASE) && $daysRented > 1) { return 2 } else { return 1 } } ... } |
In this case however, I don't make the superclass method abstract. Instead, I create an overriding method for the new releases and leave a defined method (as the default) on the superclass:
{
package Price;
...
sub getFrequentRenterPoints { return 1 }
...
}
{
package NewReleasePrice;
...
sub getFrequentRenterPoints
{
my ($self, $daysRented) = @_;
return ($daysRented > 1) ? 2 : 1;
}
...
}
Putting in the state pattern was quite an effort. Was it worth it? The gain is that if I change any price's behavior, add new prices, or add extra price-dependent behavior, the change will be much easier to make. The rest of the application does not know about the use of the state pattern. For the tiny amout of behavior I currently have, it is not a big deal. In a more complex system with a dozen or so price-dependent methods, this would make a big difference. All these changes were small steps. It seems slow to write it this way, but not once did I have to open the debugger, so the process actually flowed quite quickly. It took me much longer to write this section of the book than it did to change the code.
I've now completed the second major refactoring. It is going to be much easier to change the classification structure of movies, and to alter the rules for charging and the frequent renter points system. Figures 1.16 and 1.17 show how the state pattern works with price information.
![]() |
| Figure 1.16 Interactions using the state pattern |
![]() |
| Figure 1.17 Class diagram after addition of the state pattern |
The most important lesson from this example is the rhythm of refactoring: test, small change, test, small change, test, small change. It is that rhythm that allows refactoring to move quickly and safely.
If you're with me this far, you should now understand what refactoring is all about. We can now move on to some background, principles, and theory (although not too much!)