Category Archives: Refactoring

Coupling: The factory method

One of the challenges we face with coding is dealing with coupling. Coupling is an important aspect of programming, it tells us how much our code is tangled. When coupling is too high, we can’t easily re-use code. When the coupling is too low it does little. You can measure coupling, there are several metrics for it even (for instance “Coupling between Objects, CBO”).

In this blog post I’d like to talk about a subtle introduction of coupling: when you introduce a factory method.

Consider you have an interesting piece of code, and this piece of code has quite a lot of properties:

[sourcecode language=”java”]
class Person {
private String firstName;
private String lastName;
// .. more properties here

public void subcribeTo(Subscription subscription) {
// do something interesting here
}

}
[/sourcecode]

The problem is, because of the amount of properties and other dependencies, we’d like to simplify its creation by introducing a Factory method. In this case we are building a web application, so we take the Request as input to read the parameters:

[sourcecode language=”java”]
class Person {
private String firstName;
private String lastName;
// .. more properties here

public static Person create(HttpServletRequest request, .. more arguments here .. ) {
this.firstName = request.getParameter("firstName");
// .. read more properties
// .. set up dependencies, etc.
}

public void subcribeTo(Subscription subscription) {
// do something interesting here
}

}
[/sourcecode]

In the code that uses Person, it becomes easier to construct the Person and we’re happy with that. However, we have introduced coupling on several levels:

  • We construct the object with specific parameters in the create method. If we want to create from different parameters, we cannot use it. There is a coupling between the parameters and the properties.
  • The object is constructed using a Request object. We cannot now move the class to an application that does not use the web. A person has nothing to do with a request, it is just convenience that we put the factory method in the Person class. There is a coupling between the code of Person and the dependency delivering the Request object.

There are several ways to deal with this. But lets start with the last reason of coupling. It is easy to fix this coupling by creating a Factory class within your web application. From there you can generate the Person object out of a request. The Person class has no create method anymore, and thus is not tightly coupled to a Request class. The newly created Factory however is coupled to the Request, which is fine as it is meant to convert Requests into Person objects. Hence we could even name it that way:

[sourcecode language=”Java”]
class Person {
private String firstName;
private String lastName;
// .. more properties here

Person(String firstName, String lastName, …) {
this.firstName = firstName;
this.lastName = lastName;
// …
}

public void subcribeTo(Subscription subscription) {
// do something interesting here
}

}

class PersonFromRequestFactory {

// .. dependencies here

public Person create(HttpServletRequest request) {
Person person = new Person(request.getParameter("firstName"), )
// .. read more properties
// .. set up dependencies in Person, etc.
}

}
[/sourcecode]

Once we have this Factory, you can take it a step further:
If you have different kind of request parameters to create the same object you could create different methods in the new Factory:

[sourcecode language=”Java”]
class PersonFromRequestFactory {

// .. dependencies here

public Person createFromRegistrationForm(HttpServletRequest request) {
Person person = new Person(request.getParameter("firstName"), )
// .. read more properties
// .. set up dependencies in Person, etc.
}

public Person createFromSubscriptionForm(HttpServletRequest request) {
Person person = new Person(request.getParameter("givenName"), )
// .. read more properties
// .. set up dependencies in Person, etc.
}

}
[/sourcecode]

You could also create a Parameter object and go from there. For instance, if your web application uses Spring, you could wire your request parameters to an object (called “Form binding“) automagically and use that object as input in your Factory. This way it is type safe:

[sourcecode language=”Java”]
class PersonFromRequestFactory {

// .. dependencies here

public Person create(RegistrationForm form) {
Person person = new Person(form.getFirstName(), …)
// .. read more properties
// .. set up dependencies in Person, etc.
}

public Person createFromSubscriptionForm(SubscriptionForm form) {
Person person = new Person(form.getGivenName(), )
// .. read more properties
// .. set up dependencies in Person, etc.
}

}
[/sourcecode]

But how do you test all this?
Did you notice the Person has private fields, and no get/set methods? The only way to set the fields is using the Person constructor. How do you test the correct construction of this Person class from the request? Since we are not able to read the properties, we have to use other ways to test that code. I’ll cover that in the next blog post.

My answer for – What is wrong with this code #02

A lot of things can be found in the snippet at “What is wrong with this code #02″. Here it is again:

[sourcecode language=”java”]
public MyObject {

private String myField;

… (imagine multiple fields) …

public GenericObject toGenericObject() {
GenericObjectFactory genericObjectFactory = GenericObjectFactory.getInstance();
GenericObject genericObject = genericObjectFactory.create("myObject");
genericObject.setField("myField", myField);
// imagine multiple lines using genericObject.setField("field", field)
return genericObject;
}

}
[/sourcecode]

Already many things are pointed out in the comments of the initial post.

One thing that immediately struck me is this line:
[sourcecode language=”java”]
GenericObjectFactory genericObjectFactory = GenericObjectFactory.getInstance();
[/sourcecode]

Don’t do this.

GenericObjectFactory implies with the getInstance to be a Singleton. Even though it is a singleton, never retrieve and instance inside your method like that.

Even though there “should be only one instance” of the GenericObjectFactory, it still is a dependency for serializing MyObject. Never hide your dependencies. It makes your code hide its intent, it’s hard to decouple, test, refactor, in short hard to maintain. The code should clearly communicate its intent and usage(s).

In my opinion, there are three (well four sort-off) options to deal with this:

1. Deliver GenericObjectFactory as a parameter (invasive in method signature, but automated refactoring tools (i.e. IntelliJ) handle this perfectly):
[sourcecode language=”java”]
public GenericObject toGenericObject(GenericObjectFactory genericObjectFactory ) {
GenericObject genericObject = genericObjectFactory.create("myObject");
[/sourcecode]

2. Inject GenericObjectFactory as a property of MyObject (invasive in constructor)
[sourcecode language=”java”]
public MyObject {

private final GenericObjectFactory genericObjectFactory;

public MyObject(final GenericObjectFactory genericObjectFactory) {
this.genericObjectFactory = genericObjectFactory;
}

private String myField;

… (imagine multiple fields) …

public GenericObject toGenericObject() {
GenericObject genericObject = genericObjectFactory.create("myObject");

[/sourcecode]

2B. If you find option two too invasive, add the default constructor again but let it call the new constructor (makes point 2 less invasive, remove this default constructor after period of time).
[sourcecode language=”java”]
public MyObject {

private final GenericObjectFactory genericObjectFactory;

public MyObject(final GenericObjectFactory genericObjectFactory) {
this.genericObjectFactory = genericObjectFactory;
}

// add this default constructor
public MyObject() {
this(GenericObjectFactory.getInstance());
}

private String myField;

… (imagine multiple fields) …

public GenericObject toGenericObject() {
GenericObject genericObject = genericObjectFactory.create("myObject");

[/sourcecode]

3. Introduce protected getInstance method in MyObject (least invasive, increases testability)
[sourcecode language=”java”]
public MyObject {

… snip …

protected GenericObjectFactory getInstance() {
return GenericObjectFactory.getInstance();
}

public GenericObject toGenericObject() {
GenericObjectFactory genericObjectFactory = getInstance();
GenericObject genericObject = genericObjectFactory.create("myObject");

}

[/sourcecode]
This is sorted by my personal preference. Although option one and option two trade places from time to time. The third option is less favorable, as you still do not expose the need for dependencies, but it does make it more testable. It is also the least invasive way to refactor, which is sometimes beneficial in legacy projects with lots of untested code.

To conclude, never hide your dependencies, if you ever do see some class using getInstance, please, refactor it. It will make your, and your fellow-programmers, life a lot easier.

Thank you!

Working with legacy code – how to start & reveal intent

Recently I posted my opinion about regression. Regression bugs are likely to occur on projects with a lot of legacy code. I consider legacy code as untested code.

At the legacy coderetreat we used a small codebase (you can find it here). With that codebase we exercised in sessions to improve the code. The nice thing is that this is very similar with your daily job. You open up a project and you have to make changes in code you haven’t seen before and do not understand.

In order to get a better understanding you can use various techniques. I have practiced them with the legacy coderetreat and also applied this at work. In this blog post I’d like to share my experiences. Btw: If you haven’t experienced a coderetreat yet, join one. Just like kata’s, they are really worth your time (and more fun)!

Step 1: Get a sense of what is happening
Before we can do anything, we have to understand what we need to change and how it has impact on the system. One way to find out is to simply execute the code. You could just run the application, or… you could try to write a simple unit test executing the ‘main method’ you think that should be ran. Poke around with the parameters, and see what happens.

I prefer writing Characterization tests. The benefit is that while I am trying to understand what is happening, I am also building a safety net. Writing a Characterization test goes like this:
– create new test
– do some setup
– run specific piece of code (method) you want to try out
– check outcome / read state
– create assertion to make it pass with the outcome

When I don’t know what it actually does, I call my tests ‘monkey‘. Once I know the behavior with the given input, I rename the test to what the behavior is. Example:

[sourcecode language=”java”]
package com.adaptionsoft.games.uglytrivia;

import org.junit.Assert;
import org.junit.Test;

import static org.hamcrest.core.Is.*;
import static org.junit.Assert.*;

public class GameTest {

@Test
public void isPlayableReturnsFalseWhenInitialized() {
Game game = new Game();
assertThat(game.isPlayable(), is(false));
}

@Test
public void isPlayableReturnsTrueWithTwoPlayers() {
Game game = new Game();
game.add("Stefan");
game.add("Niels");
assertThat(game.isPlayable(), is(true));
}

@Test
public void monkey() {
Game game = new Game();
game.add("Stefan");
game.add("Niels");
game.roll(5);
// no idea yet what happens, need to look into roll method to get a clue
}

}
[/sourcecode]

So this gives me a rough idea what is happening, and it gives me a suite of tests.

It is important that you focus on black box tests. Try not to bother about the internals. If you are deep-stubbing in your test setup then try to think of a different way to approach the problem. Sometimes it is not possible to do black box testing, only then you need to do white box testing. In these cases deep-stubbing is often needed. Deep stubbing indicates a design problem: your class is bothered with internal states of other objects. You can reduce this by applying Tell Don’t Ask.

Step 2: Reveal intent.
This is even less invasive (actually it is not invasive at all if done well) than the small refactorings I have blogged about in the past.

To reveal intent:
– go through the code, find magic numbers and strings. Introduce constants for them with descriptive names
– find method names that do not describe well their behavior, and rename them. Try to keep the name about behavior, and if it does more then one thing, concate these behaviors with “And”.
– do the same for variables

This may sound trivial, but it really enhances the understandability of the code. As a bonus your understanding of the code is increased a lot, and all you did was renaming things and perhaps introduced a few constants. Let me show you how much it matters:

Can you find things to improve in this code?
[sourcecode language=”java”]
if (roll % 2 != 0) {
isGettingOutOfPenaltyBox = true;

System.out.println(players.get(currentPlayer) + " is getting out of the penalty box");
places[currentPlayer] = places[currentPlayer] + roll;
if (places[currentPlayer] > 11) places[currentPlayer] = places[currentPlayer] – 12;

System.out.println(players.get(currentPlayer)
+ "’s new location is "
+ places[currentPlayer]);
System.out.println("The category is " + currentCategory());
askQuestion();
} else {
[/sourcecode]

What about this?
[sourcecode language=”java”]
if (roll % 2 != 0) {
isGettingOutOfPenaltyBox = true;

System.out.println(players.get(currentPlayer) + " is getting out of the penalty box");
places[currentPlayer] = places[currentPlayer] + roll;
if (places[currentPlayer] > PLACE_BEFORE_STARTING_PLACE) places[currentPlayer] = places[currentPlayer] – MAX_PLACES;

System.out.println(players.get(currentPlayer)
+ "’s new location is "
+ places[currentPlayer]);
System.out.println("The category is " + getCurrentCategoryForCurrentPlayerOnPlace());
askQuestionAndRemoveFromQuestionFromDeck();
} else {
[/sourcecode]

This method name is called “roll” initially. If you would sum up all its behavior it would be more like:

[sourcecode language=”java”]
public void movePlayerAmountRolledAndAskQuestionOrWhenInPenaltyBoxIfUnevenRolledGetOutOfPenaltyBox(int roll) {
[/sourcecode]

Who would ever accept such a long method name? I would, but it should trigger something. This method name tells you there is way too much going on in one place. And, since the method is public, we communicate to other classes what this thing is doing.

It is ok to rename multiple times. The longer you work with the code, the better you understand it. When the method names do not reflect their real intent, make it clearer and improve their names. Communicating what the code actually *does* is important, make it explicit. especially if the method name violates conventions (ie, a getSomething() method that is not getting a property, but does more than that.)

It is very tempting to extract expressions and methods
Before you do this. Make sure you have the Characterization tests and integration tests in place. The tests will tell you if you have broken something while refactoring using extract method or extract conditions into variables. Yes, even such small refactoring’s could cause bugs.

Here an example, take this expression:
[sourcecode language=”java”]
if (rolled % 2 != 0) {
[/sourcecode]

Which you could turn into (extract into variable):

[sourcecode language=”java”]
boolean isUnevenRoll = roll % 2 != 0;
if (isUnevenRoll) {
[/sourcecode]

Or extract method:

[sourcecode language=”java”]
if (isUneven(roll)) {
[/sourcecode]

I prefer (automated!) extract method over extracting into variables. The main reason is that extracting into methods introduce very small pieces of code that you can re-use. You could eventually even find that the methods are not particularly bound to the current class’ behavior and move them out of this class into a new class. With variables this is much harder to see and refactor.

With these two steps, we could have brought the code we had earlier into a state like this:

[sourcecode language=”java”]
if (isUneven(roll)) {
isGettingOutOfPenaltyBox = true;

System.out.println(getCurrentPlayer() + " is getting out of the penalty box");
moveCurrentPlayer(roll);

System.out.println(getCurrentPlayer()
+ "’s new location is "
+ places[currentPlayer]);
System.out.println("The category is " + getCurrentCategoryForCurrentPlayerOnPlace());
askQuestionAndRemoveFromQuestionFromDeck();
} else {
[/sourcecode]

Conclusion
When working with legacy code, it is of importance to understand the code before making changes. In order to understand the code we can use introduce constants or rename methods to make the code reveal its intent. Using Characterization tests we can fixate the current behavior and label it in our tests names. Then, once we have this test suite, we can start using small refactoring’s like extract method or extract variable to make conditionals reveal their intent.

When creating a test suite, creating mostly black box tests will help us in the future when refactoring opposed to white box tests. Sometimes white box tests cannot be avoided.

Without any tests we can already have more insight in what is happening. With a test suite we can more safely start refactoring.

More about coderetreats
I have been greatly inspired by the legacy code retreat day, where we could experiment more in our spare time. Just like the previous time I have learned a lot, and I am convinced that others will benefit from this as well. Therefor I have decided to lend a hand and offer to organize and facilitate a coderetreat myself at the end of this year. Stay tuned!

Regression – Lets stop it!

I hate it.

You change something and you can’t tell if your change broke something in the system.

If you’re lucky, you did not break anything. Or nobody noticed it.
Next to that, on the same scale of luck, the potential bugs are found in the manual testing phase.

But often there is no time to do all the regression testing by hand. It will take days and days, and the change you made looked so insignificant. It should go live. What could possibly go wrong?.

Then it happens. You’re live, your changes work, but the inevitable happens. Regression!

Of course, this has to be fixed. We can’t let our customers have a product with new features while the features of the previous version(s) are broken.

And so the patching process begins.

I call it patching, because often you are not done with one patch. While you were working hard to get the first patch live, other regression bugs are found and need to be patched asap as well! And so you end up with a few patches. You could be done with with a few patch releases. But it could easily extend ten-fold.

This process is very stressful for the customer and the development team. As the team is working to get these patches out soon, the customer is unhappy with his ‘broken system’. Even worse, once a few bugs are found, more testing is done on the live system to make sure everything still works, and more regression bugs poor in, adding up to the stress. To the development team it begins to look like…

From the customer’s point of view, it looks like the team working on the product is not in control. It is as if the team does not seem to know what they are doing. To them their product, which seemed rock solid at start, is degrading to a house of cards.

You can debate about high and low impact issues, and the matter of urgency to fix these issues. The perception of the customer is likely to be the same, regardless.

This is how I see it:

It is us developers who are responsible for letting regression happen.

Not testers.
Not project managers.
Not stakeholders.
Not the customer.

It is us and us alone.

We write the code, we change the code, we are in control of the code (at least we should be!).

Even if you happened to be depending on a third-party system, it is your job to keep an eye out on that system. Verify that it behaves as you would expect it to. Why? Your system depends on the behaviour of another system, trusting that this behaviour does not change is not enough. You have to be *sure*.

Its all about attitude
Do you always deal with regression bugs after each release?

Stop accepting it, it is not normal.

Rather, start thinking about how you can prevent this. Don’t look how other people could prevent this. Think of what you could do right now. There are many ways to reduce the amount of regression bugs. For instance: add tests before changing any code. Fixate the behavior with black box tests. When you refactor, keep running your tests so you know you did not break existing behaviour. Add new tests for new features you introduce. Create a test suite that you can trust. Make integration tests. Is it hard to write tests? Make it easier. Don’t back away from the code, it is your code and you should be in control.

Besides the code, improve your own skills. Start reading about how to deal with legacy code. Attend a legacy code retreat to hone your skills. Practice, practice, practice! Get better.

Reap what you sow in your daily work.

But isn’t the whole team responsible?
Ah, of course! But does that mean that you, as a developer can now do less? Would it be okay in a team to not test, because you have testers? (“its their job right?”).

In a team we all have our strengths and weaknesses.

We understand code, and we can change code. No other role in your team is responsible for understanding the code then you. Being in a team does not make you less responsible.

Again, it is all about attitude. Stand for your craft, deliver high quality work and make sure the system is in check. It should be you who controls the system.

Attitude, again
There are developers out there who really think they know everything of the system. And to be honest, I once had a time where I always knew what changes had impact and what I could do. And even though I was right about the impact on changes…

…I was at least wrong as many times as I was right.

But sometimes it is not just being over-confident. Sometimes it is being ignorant, or even arrogant.

Please, don’t be like this guy…

Just because it is hard, doesn’t mean you shouldn’t do it
Regression is a pain. It can be dealt with.

It is not easy.

You will not completely eliminate regression bugs. But with the correct mindset, tools and safety-net(s), you will greatly reduce the amount of regression bugs.

It is necessary. Just do it. For the love of our (your!) craft, do it, for everyone who depends on us:

The customer.
The stakeholders.
The project managers.
Yes, even the testers.

First coderetreat of 2012 in Amsterdam – Retrospective

At the end of 2011 I started organizing a coderetreat. It started on twitter around October. I’ve also posted about it in my last mini blog. The original event can be found here.

If anyone was interested, they could sign up (max 25 people) for free. All you needed to do was bring your best humor and if possible a laptop with your preferred dev environment set up. (Its not hard to organize one, check here if you’re interested)

If you want to know more about what a coderetreat is, click here. Even better: join a coderetreat somewhere near you and experience it. It is way better than just reading about it 🙂

Honing the craft together

Coderetreat

Lets start with a management summary:

It was awesome!

It reminded me of my experience with the bowling game kata last year. Since you’re repeating the exercise over and over again, you will find different approaches. Even better, because you’re switching pairs, you will have a different mindset literally to approach the problem presented by the coderetreat. Instead of writing a bowling game, you will be working on Conway’s Game Of Life.

The most notable things of that day where:

  • In the very first session we where let ‘free’. We could approach this problem how we wanted. Me and my pair where able to implement the first three rules. However we where not able to implement the fourth rule. Our design was not easy enough to revive dead cells. (gosh, this reminds me of the bowling game code kata first attempt…)
  • The second session we got to choose from different constraints. I picked the “no conditionals” one, because I can get my methods under 4 lines without pain. Programming without no conditions is a whole different story though.
  • The third session with ‘only check in within 2 minutes, else revert everything’ was an eye opener! It really forced you into thinking how to make all (baby) steps. Hence, I am using this at work now and it really works. I commit 10 times more often. Although I don’t make the 2 minute mark yet at work (5 minutes is easy though now).
  • The fourth session was fun, as we where able to implement *all rules* (opposed to the first session), but without the code we had implemented in the first session! We totally isolated the behaviour (this session was called “tdd as if you meant it”) and it blew our minds.

Will I attend more coderetreats? You bet! Just need to take a look at the list of events and pick an appropiate one. If I attend one, I will let you know (on twitter surely, perhaps on this blog).

If you want to know how it looked like, click here to see som pictures of the coderetreat.

I loved the coderetreat, and I’ll surely organize one again in the future. I would recommend anyone who loves his profession to join a coderetreat and practice. You’ll learn new things for sure!

How hard can it be, right? 😉

How to fry your co-developers brain; and then make it better

Here is a little example of code I’ve been faced with (not written by me), that struck me. Although the syntax is correct (it is javascript), it took me a little while to actually understand what is going on.

Here is the code:
[sourcecode language=”javascript”]
someObject: function(data) {
return data.json ? data.json.stateObject ? data.json.stateObject : {} : {};
},
[/sourcecode]

And here is how I refactored it.
[sourcecode language=”javascript”]
someObject: function(data) {
if (data.json) {
if (data.json.stateObject) {
return data.json.stateObject;
}
}
return {};
}
[/sourcecode]

So I mentioned brianpower in a while ago. Michael Feathers though calls such a thing “forming a mental model“. Which is exactly what I meant. You also hear this in the scene of Usability; the less your mental model matches with the actual object you use (expectations), the less likely you’ll probably understand it, let alone *use it*.

So tell me, which one is easier to understand? And how much impact do you think this has?

Creating (default) test instances of a class, without exposing default constructor.

Did you ever need to just have an instance of a class you cannot instantiate because the default constructor is not available? Do you want to create a test instance just to be used in unit tests? Don’t want to break up the design of your code just for testing? This post might help you:

Sample code:

[sourcecode language=”java”]
public MyClass {
final int someField;

private MyClass() {
// may not use this
someField = -1;
}

public MyClass(int someFieldValue) {
someField = someFieldValue;
}

int getSomeField() {
return someField;
}
}
[/sourcecode]

So lets say this class is used in a lot of places. And all we want is have a default test instance. We don’t care about the internals. Lets say we only use it to toss around using an interface which we mock in our tests. Like so:

[sourcecode language=”java”]
public void someUnitTestUsingMyClassAndMockingAnInterface() {
MyClass instance = new MyClass(-1); // this works, but its not really descriptive
MyClass someotherMyClass = new MyClass(-1); // like, what is -1? we don’t want to be bothered by this
EasyMock.expect(someinterface.doSomethingWithMyClass(instance)).andReturn(someotherMyClass);

}

[/sourcecode]

There are two ways to approach this problem:

  • Make private constructor public
  • Use a Factory or a creation method

So, just making the default constructor public sounds easy and is used very often. And, because we want to let our fellow developers know it is just to be used in unit-tests we place a comment in them. This mostly results in something like this:

[sourcecode language=”java”]
public MyClass {
final int someField;

public MyClass() {
// Use this only in tests!
someField = -1;
}
[/sourcecode]

However, this has a few drawbacks:

  • You cannot see if the default constructor should only be used for tests, unless you look into the code itself and read that comment line
  • You broke with the initial design decision to have only objects instantiated with a given “someField”

So how do we prevent this?

We use a factory, or a creation method. Lets start with the first, the factory: If there is a factory, just use it to get a test instance. If the factory lacks a ‘createTestInstance()’ method, then add it. Like so:

[sourcecode language=”java”]
public MyClassFactoryImpl implements MyClassFactory {

public MyClass createTestInstance() {
return new MyClass(-1);
}

}
[/sourcecode]

We could discuss about wether you want the method to be factory method to be static here, and so forth, but the idea is clear.

However, when there is no factory present it is unlikely you want to create a factory just for your unit tests. In that case I suggest you use the creation method, also refered to as the factory method pattern. Simply add a public static method that returns a new instance. The final result would be:

[sourcecode language=”java”]
public MyClass {
final int someField;

public static MyClass createTestInstance() {
return new MyClass();
}

private MyClass() {
// may not use this
someField = -1;
}

public MyClass(int someFieldValue) {
someField = someFieldValue;
}

int getSomeField() {
return someField;
}
}

[/sourcecode]

The benefits are clear:

  • The default constructor remains private, so the design is left untouched
  • It is clear what the createTestInstance method does. No comments needed.

And in your unit test (using EasyMock as well) it looks like this:

[sourcecode language=”java”]
public void someUnitTestUsingMyClassAndMockingAnInterface() {
MyClass instance = MyClass.createTestInstance();
MyCLass someotherMyClass = MyClass.createTestInstance();
EasyMock.expect(someinterface.doSomethingWithMyClass(instance)).andReturn(someotherMyClass);

}

[/sourcecode]

Note: When you wind up with multiple create methods you should consider using the Factory pattern instead. Yes, creating a factory is the best thing once creation logic starts to dominate your class. (ie, think of the Single Responsibility Principle).
Note 2: If you’re a design purist, you might even debate that the public constructor of MyClass should be a creation method. It improves consistency as all instances are created by creation methods. It also has the benefit of describing what kind of instance you get (ie, what the field means). Such improvements are valid and should be made. I consider these improvements small refactorings. They should never be undervalued.

An example of refactoring

As I have promised in my previous post, I would post an example of small refactorings in order to greatly improve the readability and understandability of code.

I own a little project called Dune II – The Maker, and I started writing it a little over 10 years ago. In those years I have learned a lot. I did not have much time in those days to apply my new knowledge to the project. You could say the software was rotting. In order to make it better I need to refactor a lot and I encounter the best examples to improve code without pointing fingers :). In any case I have experienced you have to make mistakes in order to get better. I hope you will learn from the mistakes I made.

So here is a little example I have just checked in the dune2themaker repository, I’ll give you the before (revision 411) and after (revision 412). Of course, I have taken smaller steps to get to the end result. First the original piece of code:

Revision 411 (before)
[sourcecode language=”cpp”]
void cGame::think_winlose() {
bool bSucces = false;
bool bFailed = true;

// determine if player is still alive
for (int i = 0; i < MAX_STRUCTURES; i++)
if (structure[i])
if (structure[i]->getOwner() == 0) {
bFailed = false; // no, we are not failing just yet
break;
}

// determine if any unit is found
if (bFailed) {
// check if any unit is ours, if not, we have a problem (airborn does not count)
for (int i = 0; i < MAX_UNITS; i++)
if (unit[i].isValid())
if (unit[i].iPlayer == 0) {
bFailed = false;
break;
}
}

// win by money quota
if (iWinQuota > 0) {
if (player[0].credits >= iWinQuota) {
// won!
bSucces = true;
}
} else {
// determine if any player (except sandworm) is dead
bool bAllDead = true;
for (int i = 0; i < MAX_STRUCTURES; i++)
if (structure[i])
if (structure[i]->getOwner() > 0 && structure[i]->getOwner()
!= AI_WORM) {
bAllDead = false;
break;
}

if (bAllDead) {
// check units now
for (int i = 0; i < MAX_UNITS; i++)
if (unit[i].isValid())
if (unit[i].iPlayer > 0 && unit[i].iPlayer != AI_WORM)
if (units[unit[i].iType].airborn == false) {
bAllDead = false;
break;
}

}

if (bAllDead)
bSucces = true;

}

// On succes…
if (bSucces) {
// <snip>

}

if (bFailed) {
// <snip>

}
}
[/sourcecode]

The intention of the think_winlose() function is to determine if the player has won or lost, and if so it transitions the game state. These transitions have been snipped.

So when does a player win or lose? It depends if there is a ‘win quota’, or not. The win quota is a number, whenever it is above zero it means the player has to collect at least that many of credits (spice) in order to win. If the win quota is not set, the default win rule : destroy everything of the enemy, will be used. (do you notice I need this much text for just a simple rule? Which I could have prevented If I had code that said this in the first place? At the bottom of this post you can see what I mean :))

Lets take a look at the code and point out what could be done better:

  • There are two booleans bSuccess and bFailed. Which is confusing and ambigious. What is succesfull? What did fail? Why aren’t they one boolean?
  • There are comments all over the place, meaning we could refactor these pieces to code so comments are not needed. (Comments are seen as clutter and should be removed)
  • The code formatting could be done better. If statements should start with { and end with }, even with one line.

And there are more things you will probably find yourself. What I’ll do is point out a few things that could be improved. If you just want to see the final result, just take a look below.

Lets start with the booleans bSuccess and bFailed. Why are there two booleans and whey are they called so vaguely? A little bit of searching in the code and we find out that bSuccess actually means “Mission is accomplished” (player has won), and bFailed means the player has no units and structures (which implicates the player has lost the game). They are not the same boolean, because a player could be alive and not have yet won the game of course. Now we know they are not actually the same boolean, but their naming was vague. A simple “rename variable” made things easier to understand!

[sourcecode language=”cpp”]
void cGame::think_winlose() {
bool bMissionAccomplished = false;
bool isPlayerAlive= true;
[/sourcecode]
(when posting this I realize the two booleans are named differently, consistency is also important to improve readability, so either both should start with “is” or both with a “b”, I prefer the first though)

Right after the booleans a few for loops are used just to find out if there is anything alive for the player. A little bit below we see such for loops again, but for the AI. This is duplicate code and should be removed. Extracting them into a method and make them return a boolean value is easy to do:

[sourcecode language=”cpp”]
bool cGame::playerHasAnyStructures(int iPlayerId) {
for (int i = 0; i < MAX_STRUCTURES; i++) {
if (structure[i]) {
if (structure[i]->getOwner() == iPlayerId) {
return true;
}
}
}
return false;
}
[/sourcecode]

(Again, while posting this I realize this could be even improved a bit more, the iPlayerId should be called ownerId (or the getOwner should be a getPlayerId), so it is obvious we match two of the same kind. Now it could confuse us: is an owner the same as the playerId? Since I know it is, why isn’t it called that way?… :))

Since we extract these for loops we can now set the isPlayerAlive boolean immidiately instead of setting a variable within the loop as it was done in the original example above. Reducing 24 lines into one!:

[sourcecode language=”cpp”]
bool isPlayerAlive = playerHasAnyStructures(HUMAN) || playerHasAnyGroundUnits(HUMAN);
[/sourcecode]

The final result of revision 412 is shown below. It will clearly show the major improvement regarding readability and understandability. Any other developer who comes to this code can see what it does and it is almost a no-brainer.

Result revision 412
[sourcecode language=”cpp”]
void cGame::think_winlose() {
bool bMissionAccomplished = false;
bool isPlayerAlive = playerHasAnyStructures(HUMAN) || playerHasAnyGroundUnits(HUMAN);

if (isWinQuotaSet()) {
bMissionAccomplished = playerHasMetQuota(HUMAN);
} else {
bool isAnyAIPlayerAlive = false;
for (int i = (HUMAN + 1); i < AI_WORM; i++ ) {
if (playerHasAnyStructures(i) || playerHasAnyGroundUnits(i)) {
isAnyAIPlayerAlive = true;
break;
}
}

bMissionAccomplished = !isAnyAIPlayerAlive;
}

if (bMissionAccomplished) {
// <snip>

} else if (!isPlayerAlive) {
// <snip>

}
}
[/sourcecode]

The tremendous power of tiny refactorings

More and more I am being intrigued by the power of a small code refactorings. The positive impact it has on the readability, the maintainability and understandability of your code is great. It keeps code clean(er) and since the changes you make are really small (I’ll demonstrate how small), the chance they will break things is small. Of course, with unit tests (you are writing them right?) making sure you did not break anything: a small refactoring is a low-risk high-benefit practice.

In my experience, small refactorings are undervalued. In fact, I undervalued them much myself since not too long ago. They are disregarded as refactorings that don’t help at all, because it is obvious what the code does. However, the flaw in this rationale, as I see it, is that the intended audience is not only you but also the other developer you work with. Also, you know what code does right now. But would you understand it as quickly if you did not look at it for a week and came back? Would another developer understand the code right away?

When working on code, you’re constantly trying to ‘translate’ the code in your mind in order to know what it is doing. Doing this it leads you to where the bugs are or the areas where you need to make changes, etcetera. This process of ‘translating’ code in your mind comes at a price. Literally the energy you need to burn in your brain to grasp the meaning of a piece of code: brainpower; The easier we understand code, the less brainpower we need. The less energy we burn by understanding what is going on, the more energy we have left to create new things, or fix that bug.

I’ve created a little example. The code below represents an implementation of a mail service. The mail service allows you to send an email using a method that uses 4 parameters: to, from, the subject and the message. When all parameters are filled, the email needs to be sent. That is the only requirement for now. Of course, later we might want to validate if the given email adress of from and to are valid. But for the sake of the argument, lets keep it simple. The following code is ‘mind-boggling’, compared to its simple intention:

[sourcecode language=”java”]
public class MailServiceImpl implements MailService {

public void sendMail(String from, String to, String subject, String message) {
if (from != null && !"".equals(from) &&
to != null && !"".equals(to) &&
subject != null && !"".equals(subject) &&
message != null && !"".equals(subject)) {
// send the email
}
}

}
[/sourcecode]

Basically what this says it that any field may not be null or an empty string. It took 4 lines for just to show. Even though you recognize the pattern of a ‘null or empty check’, it costs you time and energy to make that translation happen. So here is a first suggestion to make it read easier:

[sourcecode language=”java”]
public class MailServiceImpl implements MailService {

public void sendMail(String from, String to, String subject, String message) {
if (parametersAreNotNullOrEmpty(from, to, subject, message)) {
// send the email
}
}

private boolean parametersAreNotNullOrEmpty(String from, String to, String subject,
String message) {
return from != null && !"".equals(from) &&
to != null && !"".equals(to) &&
subject != null && !"".equals(subject) &&
message != null && !"".equals(subject);
}

}
[/sourcecode]

When another developer is reading the sendMail method, he will now know that when the parametersAreNotNullOrEmpty the mail will be sent. It does not need any translation, the method name just says what it does! Simple! By doing this, you greatly reduce the needed brainpower to understand what is going on. The refactoring method used is called Extract method.

Reading code is sometimes easy for your brain to handle. Sometimes your brain seems to explode because of the complex statements and context you need to be aware of. It is strongly tied with the Cyclomatic Complexity, the Coupling between Objects (CBO) and the lack of Cohesion in your code. If you are using any tools to measure your code, like Sonar for example, look for these metrics to find code that needs attention. But it is always better to refactor while you have made the translation in your brain, if you see things can be written simpler to reduce the needed brainpower, by all means do so. Not giving software the appropriate attention might let your code rot. Small refactorings help you prevent that.

I hope you have seen a bit of the power of small refactorings. I will get back to them in my future posts as I will post more concrete examples and how I would/have dealt with them. To me, small refactorings need to be part of your system and are introduced when you do TDD. All too often when the code works, it is not looked at again. Making these small refactorings can make a big difference and take relatively no time.