bookmark_borderAn example of refactoring

As I have promised in my previous post, I would post an example of small refactorings in order to greatly improve the readability and understandability of code.

I own a little project called Dune II – The Maker, and I started writing it a little over 10 years ago. In those years I have learned a lot. I did not have much time in those days to apply my new knowledge to the project. You could say the software was rotting. In order to make it better I need to refactor a lot and I encounter the best examples to improve code without pointing fingers :). In any case I have experienced you have to make mistakes in order to get better. I hope you will learn from the mistakes I made.

So here is a little example I have just checked in the dune2themaker repository, I’ll give you the before (revision 411) and after (revision 412). Of course, I have taken smaller steps to get to the end result. First the original piece of code:

Revision 411 (before)

void cGame::think_winlose() {
	bool bSucces = false;
	bool bFailed = true;

	// determine if player is still alive
	for (int i = 0; i < MAX_STRUCTURES; i++)
		if (structure[i])
			if (structure[i]->getOwner() == 0) {
				bFailed = false; // no, we are not failing just yet
				break;
			}

	// determine if any unit is found
	if (bFailed) {
		// check if any unit is ours, if not, we have a problem (airborn does not count)
		for (int i = 0; i < MAX_UNITS; i++)
			if (unit[i].isValid())
				if (unit[i].iPlayer == 0) {
					bFailed = false;
					break;
				}
	}

	// win by money quota
	if (iWinQuota > 0) {
		if (player[0].credits >= iWinQuota) {
			// won!
			bSucces = true;
		}
	} else {
		// determine if any player (except sandworm) is dead
		bool bAllDead = true;
		for (int i = 0; i < MAX_STRUCTURES; i++)
			if (structure[i])
				if (structure[i]->getOwner() > 0 && structure[i]->getOwner()
						!= AI_WORM) {
					bAllDead = false;
					break;
				}

		if (bAllDead) {
			// check units now
			for (int i = 0; i < MAX_UNITS; i++)
				if (unit[i].isValid())
					if (unit[i].iPlayer > 0 && unit[i].iPlayer != AI_WORM)
						if (units[unit[i].iType].airborn == false) {
							bAllDead = false;
							break;
						}

		}

		if (bAllDead)
			bSucces = true;

	}

	// On succes...
	if (bSucces) {
		// <snip>

	}

	if (bFailed) {
		// <snip>

	}
}

The intention of the think_winlose() function is to determine if the player has won or lost, and if so it transitions the game state. These transitions have been snipped.

So when does a player win or lose? It depends if there is a ‘win quota’, or not. The win quota is a number, whenever it is above zero it means the player has to collect at least that many of credits (spice) in order to win. If the win quota is not set, the default win rule : destroy everything of the enemy, will be used. (do you notice I need this much text for just a simple rule? Which I could have prevented If I had code that said this in the first place? At the bottom of this post you can see what I mean :))

Lets take a look at the code and point out what could be done better:

  • There are two booleans bSuccess and bFailed. Which is confusing and ambigious. What is succesfull? What did fail? Why aren’t they one boolean?
  • There are comments all over the place, meaning we could refactor these pieces to code so comments are not needed. (Comments are seen as clutter and should be removed)
  • The code formatting could be done better. If statements should start with { and end with }, even with one line.

And there are more things you will probably find yourself. What I’ll do is point out a few things that could be improved. If you just want to see the final result, just take a look below.

Lets start with the booleans bSuccess and bFailed. Why are there two booleans and whey are they called so vaguely? A little bit of searching in the code and we find out that bSuccess actually means “Mission is accomplished” (player has won), and bFailed means the player has no units and structures (which implicates the player has lost the game). They are not the same boolean, because a player could be alive and not have yet won the game of course. Now we know they are not actually the same boolean, but their naming was vague. A simple “rename variable” made things easier to understand!

void cGame::think_winlose() {
	bool bMissionAccomplished = false;
	bool isPlayerAlive= true;

(when posting this I realize the two booleans are named differently, consistency is also important to improve readability, so either both should start with “is” or both with a “b”, I prefer the first though)

Right after the booleans a few for loops are used just to find out if there is anything alive for the player. A little bit below we see such for loops again, but for the AI. This is duplicate code and should be removed. Extracting them into a method and make them return a boolean value is easy to do:

bool cGame::playerHasAnyStructures(int iPlayerId) {
    for (int i = 0; i < MAX_STRUCTURES; i++) {
		if (structure[i]) {
			if (structure[i]->getOwner() == iPlayerId) {
				return true;
			}
		}
	}
    return false;
}

(Again, while posting this I realize this could be even improved a bit more, the iPlayerId should be called ownerId (or the getOwner should be a getPlayerId), so it is obvious we match two of the same kind. Now it could confuse us: is an owner the same as the playerId? Since I know it is, why isn’t it called that way?… :))

Since we extract these for loops we can now set the isPlayerAlive boolean immidiately instead of setting a variable within the loop as it was done in the original example above. Reducing 24 lines into one!:

bool isPlayerAlive = playerHasAnyStructures(HUMAN) || playerHasAnyGroundUnits(HUMAN);

The final result of revision 412 is shown below. It will clearly show the major improvement regarding readability and understandability. Any other developer who comes to this code can see what it does and it is almost a no-brainer.

Result revision 412

void cGame::think_winlose() {
	bool bMissionAccomplished = false;
	bool isPlayerAlive = playerHasAnyStructures(HUMAN) || playerHasAnyGroundUnits(HUMAN);

    if (isWinQuotaSet()) {
		bMissionAccomplished = playerHasMetQuota(HUMAN);
	} else {
		bool isAnyAIPlayerAlive = false;
		for (int i = (HUMAN + 1); i < AI_WORM; i++ ) {
			if (playerHasAnyStructures(i) || playerHasAnyGroundUnits(i)) {
				isAnyAIPlayerAlive = true;
				break;
			}
		}

		bMissionAccomplished = !isAnyAIPlayerAlive;
	}

    if (bMissionAccomplished) {
		// <snip>
		
	} else if (!isPlayerAlive) {
		// <snip>

	}
}

bookmark_borderThe tremendous power of tiny refactorings

More and more I am being intrigued by the power of a small code refactorings. The positive impact it has on the readability, the maintainability and understandability of your code is great. It keeps code clean(er) and since the changes you make are really small (I’ll demonstrate how small), the chance they will break things is small. Of course, with unit tests (you are writing them right?) making sure you did not break anything: a small refactoring is a low-risk high-benefit practice.

In my experience, small refactorings are undervalued. In fact, I undervalued them much myself since not too long ago. They are disregarded as refactorings that don’t help at all, because it is obvious what the code does. However, the flaw in this rationale, as I see it, is that the intended audience is not only you but also the other developer you work with. Also, you know what code does right now. But would you understand it as quickly if you did not look at it for a week and came back? Would another developer understand the code right away?

When working on code, you’re constantly trying to ‘translate’ the code in your mind in order to know what it is doing. Doing this it leads you to where the bugs are or the areas where you need to make changes, etcetera. This process of ‘translating’ code in your mind comes at a price. Literally the energy you need to burn in your brain to grasp the meaning of a piece of code: brainpower; The easier we understand code, the less brainpower we need. The less energy we burn by understanding what is going on, the more energy we have left to create new things, or fix that bug.

I’ve created a little example. The code below represents an implementation of a mail service. The mail service allows you to send an email using a method that uses 4 parameters: to, from, the subject and the message. When all parameters are filled, the email needs to be sent. That is the only requirement for now. Of course, later we might want to validate if the given email adress of from and to are valid. But for the sake of the argument, lets keep it simple. The following code is ‘mind-boggling’, compared to its simple intention:

public class MailServiceImpl implements MailService {

	public void sendMail(String from, String to, String subject, String message) {
		if (from != null && !"".equals(from) &&
			to != null && !"".equals(to) &&
			subject != null && !"".equals(subject) &&
			message != null && !"".equals(subject)) {
			// send the email
		}
	}

}

Basically what this says it that any field may not be null or an empty string. It took 4 lines for just to show. Even though you recognize the pattern of a ‘null or empty check’, it costs you time and energy to make that translation happen. So here is a first suggestion to make it read easier:

public class MailServiceImpl implements MailService {

	public void sendMail(String from, String to, String subject, String message) {
		if (parametersAreNotNullOrEmpty(from, to, subject, message)) {
			// send the email
		}
	}

	private boolean parametersAreNotNullOrEmpty(String from, String to, String subject,
			String message) {
		return from != null && !"".equals(from) &&
			to != null && !"".equals(to) &&
			subject != null && !"".equals(subject) &&
			message != null && !"".equals(subject);
	}

}

When another developer is reading the sendMail method, he will now know that when the parametersAreNotNullOrEmpty the mail will be sent. It does not need any translation, the method name just says what it does! Simple! By doing this, you greatly reduce the needed brainpower to understand what is going on. The refactoring method used is called Extract method.

Reading code is sometimes easy for your brain to handle. Sometimes your brain seems to explode because of the complex statements and context you need to be aware of. It is strongly tied with the Cyclomatic Complexity, the Coupling between Objects (CBO) and the lack of Cohesion in your code. If you are using any tools to measure your code, like Sonar for example, look for these metrics to find code that needs attention. But it is always better to refactor while you have made the translation in your brain, if you see things can be written simpler to reduce the needed brainpower, by all means do so. Not giving software the appropriate attention might let your code rot. Small refactorings help you prevent that.

I hope you have seen a bit of the power of small refactorings. I will get back to them in my future posts as I will post more concrete examples and how I would/have dealt with them. To me, small refactorings need to be part of your system and are introduced when you do TDD. All too often when the code works, it is not looked at again. Making these small refactorings can make a big difference and take relatively no time.

bookmark_borderA Software Quality Model (Part II) – Translating customer language into metrics, scoring quality

In my previous post I have explained the context of my thesis, and the various software quality models that are evaluated.

For my thesis I have extended the Software Quality Model of Willmer. Although it is not an exact implementation of the model, it is inspired by it. Also, the influence of the customer is processed into this model. The goal of this model is to translate customer desires into metrics, in order to calculate the total quality.

When talking about software quality, it is either very concrete (I want a red car!) or very abstract (it has to be reliable!). Customers tend to tell their ‘experiences of quality’ in sentences. This is the first step of the model. Try to get a few (eight at most) of these sentences. They must be distinctive. (Don’t have four sentences preach about security…)

Translating these sentences into concrete, measurable ‘things’ for developers, is another story. But before doing that , ask your customer what the relative importance is of all these sentences you just have written down. Just imagine that there is this situation where you have to go live. But , there is an issue that needs to be tackled, all blocking, one for each aspect (sentence). If you could pick one, which would you pick? Would you tackle the first sentence, the second? etc?

Of course, your customer will would tell  in his own domain language what the most important thing is. Try to map that (and confirm) with these sentences. Score them and try to get an ‘order of importance’. After you have done that, you have your first important goal reached: You know the relative importance of each quality sentence (aspect from now on).

So, what now? The next step is to map these aspects/sentences to Software Quality Attributes (also known as non-functional requirements). You either need a Software Engineer to do that, or even better, try to do it with your team. Before mapping these, try to make a selection of Quality Attributes first that are most relevant to you. Ie, try to use tree of Boehm or ISO9160 as reference. Within my theses I have used 9 Quality Attributes, some Quality Attributes are ‘sub-quality attributes’ of others. Example of Quality Attributes: Understandability, Reliability, Security, Availability, Complexity, etc.

The result of this mapping is that you get for each aspect several quality attributes. Not all quality attributes are applicable to an aspect. Try to figure out how much quality attributes are applicable to an aspect. Do this by asking teammembers and for each member that selects a quality attribute, score it. This way you can calculate relatively how your team thinks the mapping should be. This is important, because the eventual result of your measurement (see below) should be the product of the customer and the team that work on that product.

So, you have a few aspects, and each aspect has a few quality attributes. All that is left is to map metrics to quality attributes. Mapping this is fairly easy, there are quite a bunch of metrics out there. Each metric tries to measure (a piece of) a quality attribute. Some are easy, like complexity (quality attribute) can be measured by (although it is not limited to) the Cyclomatic Complexity metric by McCabe.

So basically you end up with this:

Aspect(n..8) –> Quality Attribute(n) –> Metrics(n)

Where:

  • the total quality of the system, is the combination of all aspects (all aspects relativity make 100%)
  • you should keep eight aspects (believe me, more will only make it harder to distinguish and make decisions)
  • you should attach quality attributes to each aspect, and determine their relative applicability to this aspect
  • you should attach metrics to quality attributes

So in the end, how do you score quality? Is it possible with this model? Yes, certainly it is.

Once you have found metrics, and attached them to quality attributes. You should formulate ‘scoring rules’. This means, you need to write down how you will interpret results of a metric and translate that onto a scale of 1 to 10. A scoring rule could be:

“Lines of Code (LOC) may be 1000 for a file max. Every 100 lines more substracts one point. (Giving 10 points with 1000 or lower and with 2000 lines a 1)”

This means, a LOC of 1000 will score a 10. A LOC of 1500 scores a 5, a LOC of 2000 or higher scores a 1.

Do this for all metrics, and eventually you will be able to calculate the total quality of the system.

In order to make this more concrete, here an example of such a calculation:

Total Quality Score = sum of scores of each aspect

Aspect score = (Sum of all relative scores of all applicable attributes) * relative importance

Attribute score = (Sum of all relative scores of all applicable metrics)

Example (For the sake of this example, attributes are scored already)

Aspect #1

Is for 30% important (relative to Aspect #2)

Attributes:

A -> for 40% applicable

B -> for 60% applicable

Aspect #2

Is for 70% important (relative to Aspect #1)

Attributes:

C -> for 70% applicable

B -> for 30% applicable

Scoring:

Attribute A scores 7 (out of 10)

Attribute B scores 5 (out of 10)

Attribute C scores 8 (out of 10)

Total quality calculation:

Aspect #1

A = 7 * 40% = 2,8

B = 5 * 60% = 3,0

Absolute score is 2,8 + 3,0 = 5,8

Aspect #2

C = 8 * 70% = 5,6

B = 5 * 30% = 1,5

Absolute score is 5,6 + 1,5 = 7,1

Total quality is:

Aspect #1 –> 5,8 * 30% (importance) = 1,74

Aspect #2 –> 7,1 * 70% (importance) = 4,97

Total quality score is 1,74 + 6,71 ==> 6,71

6,71 on a score of 1 to 10 (1 being bad, 10 being best).

bookmark_borderThe need for quality of Web Applications

I’ve been working on my thesis (about meassuring Software Quality) and its reaching its final state. I will blog some posts about my thoughts and findings about Software Quality from here. This post was a bit old, I dusted it off and actualised it a bit. More to come!

Here it comes:

Today, it is unthinkable to not use an internet related service in your daily life. Think about staying in touch with family and friends using email (gmail, yahoo, hotmail, etc), social networking (twitter, facebook, hyves, etc) or more.

If you want to book a trip, you arrange this using the internet. When you want to buy a new laptop, you find the specs and reviews on the internet. You use a comparison site to find the cheapest (web)store. And if you want to buy it, you use the internet.

It is fair to say that the internet has become more and more important to our lives. Even more than we imagined 5 years ago. How would we deal when our mail provider is down? How would we react when facebook is down? What about twitter? How do we do our bank business when we cannot reach the online-banking site?

I think you got the point.

Companies developing and maintaining (ecommerce) websites or services are constantly pressured to deliver high quality software in the shortest possible amount of time. Having these factors, time and quality; one cannot exclude the other. They do influence each other though. When you want to deliver fast, you ought to make sacrifices in terms of designing and testing the product. This will eventually lead to a lower quality. The opposite is true as well, if you spend more time in the design and testing, you will likely deliver higher quality. Also, unlike with traditional software, it is not always the first to deliver that is the best.

In the land of the internet, it is not only the case to get users to your website, but also to keep them revisiting. If you deliver a fancy new feature, you might get the attention. But if you did not put enough effort to make it high quality, you might lose them to your competitor who delivers a similiar feature a few days later, but twice as good. Remember, your visitors can easily switch if you cannot deliver what they expect.

So what does a visitor expect from a website? Looking around on the net you can find some definitions of quality of good websites. That is the biggest question of all. How can you deliver fast, and yet have some sort of quality to rely on?

You could use a Software Quality Model for that, like the ISO-9126. I would say, you’d use a model like the ISO-9126 and create/use a Web Application Quality Standard.

So what is this standard? I haven’t found a definite model, although I have found a good resource here.
In my opinion, the following Software Quality Attributes would belong in such a model:
performance. Your visitors will leave if loading a page takes longer than 2 seconds. Directly influencing possible sales!
security. Atleast be aware of the top ten of OWASP. Better yet, make sure you are not vulnerable to these exploits.
maintainability. Developers should be able to easily extend code, or fix bugs without causing too much regression.
reliability. Should your website simply shut down once the load is too high? How much ‘page faults’ may appear? One of thousand requests? One of hundred?.

Each of these attributes can be split into several measurable units. For example, performance wise you could measure the amount of time it takes to get a page loaded in the browser. Maintainability has some known measures (like McCabe’s Cyclomatic Complexity).

Note: Recently I figured that the ISO-9126 is actually not a model, but a ‘standard’. It really does not tell you *how* to do things, but it does give guidelines. The Software Quality model I’ve made does tell  you how to do things.

bookmark_borderChoosing a Software Quality Model

For my thesis (Software Quality) I am currently figuring out how to choose a proper Software Quality Model in order to lay down the foundation of a Software Quality Definition for the project I am working on.

For now I have choosen this approach, and I’d like to hear from others how they (would) have tackled this problem.

It is good to know that I have narrowed down what “Software Quality” is. I’ll be focussing on the product (Garvin, Gosby) quality and the targetted audience are Software Engineers.

At first I will determine what kind of product I am working on. In this case it is a website, selling things. Some characteristics come to mind, like “24/7” or “should display always the correct (price) information”.

With these characteristics, I’ll look up Software Quality Attributes that would fit in.

This gives me a table, with in the first column all the characteristics, and next to them a few Software Quality Attributes that would fall into that specific characteristic description.

Now that I have a list of Software Quality Attributes, I can look up what Quality Model has the most matching Software Quality Attributes in them as well. Models that focus on different attributes would not interest me anymore, because it would not apply to the kind of product I focus on.

I found a nice model comparison, with Software Quality Attributes, from NASA. For your convenience I show it here:

With this, it would be possible to choose a Software Quality Model and work from there.

I am half-way with this method, and so far I think this is the way to go. If you think this is a bit too easy, or I should really take other considerations. Please let me know.

Note: thanks to Wouter for his suggestion who got me into this direction.