Developer’s Guide: How To Write Good Tests

Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse. — Michael C. Feathers

This is an attempt to help developers write better test code.

Fundamentals Of Writing Good Tests

Writing tests first is not difficult at all. Once a programmer adopts this practice, it becomes a habit that helps them view systems inside-out as well as outside-in, and end-to-end. But to do this, the individual must have a passion for being a good software engineer: someone who wants to design better components, write efficient, maintainable, human-readable code, and build a robust system. We will try to cover the basic units that help us write better tests.


Functions are the building blocks of any system. We have to take care in how we write functions so that we can ensure their behavior is always as expected: no surprises and no magic!

General guidelines for writing quality functions:

  • A function should be small, 8–10 lines max.
  • A function should do just one thing.
  • A function’s name should be descriptive and accurately state what it does.
  • A function should take few arguments. This also helps in designing the data structure.
  • A function should have no side effects. A function or expression is said to have a side effect if it modifies some state variable value outside its local environment, that is to say has an observable effect besides returning a value to the invoker of the operation.
  • A function should not use flag arguments. A flag argument is a kind of function argument that tells the function to carry out a different operation depending on its value. Split the function into several independent functions that the client can call directly, without the flag.
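As a rough illustration of the last two guidelines, here is a minimal Python sketch. The function names, the tax rate, and the use of integer cents are all assumptions made up for this example; they are not taken from any particular codebase.

```python
TAX_PERCENT = 20  # hypothetical tax rate, for illustration only


# Harder to test: the flag selects between two behaviors, and the function
# mutates the caller's list (a side effect).
def add_price(prices, price_cents, include_tax):
    if include_tax:
        price_cents = price_cents * (100 + TAX_PERCENT) // 100
    prices.append(price_cents)


# Easier to test: one small function per behavior, no flag, no mutation.
def price_with_tax(price_cents):
    """Return the price in cents with tax applied."""
    return price_cents * (100 + TAX_PERCENT) // 100


def with_added_price(prices, price_cents):
    """Return a new list instead of modifying the one passed in."""
    return [*prices, price_cents]
```

With the second version, the caller decides whether tax applies by choosing which function to call, and each function can be checked on its own with a single assertion.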

Unit Test

A test that verifies the behavior of some small part of the overall system.

What makes a test a unit test is that the System Under Test (SUT) is a very small subset of the overall system and may be unrecognizable to someone who is not involved in building the software. The actual SUT may be as small as a single object or method that is a consequence of one or more design decisions although its behavior may also be traced back to some aspect of the functional requirements. There is no need for unit tests to be readable, recognizable or verifiable by the customer or business domain expert.

A test is NOT a unit test if:

  • It talks to the database.
  • It communicates across the network.
  • It touches the file system.
  • It can’t run correctly at the same time as any of your other unit tests.
  • You have to do special things to your environment (such as editing config files) to run it.

Unit tests encourage good design and rapid feedback and they seem to help teams avoid a lot of trouble.
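To make the rules above concrete, here is a small sketch in Python. The function `count_todo_lines` is a hypothetical example, not something from a real project. Because it accepts any iterable of lines, the test can hand it an in-memory `io.StringIO` object in place of a real file, so the test never touches the file system.

```python
import io


def count_todo_lines(lines):
    """Count lines containing 'TODO'. Accepts any iterable of strings."""
    return sum(1 for line in lines if "TODO" in line)


def test_counts_only_lines_with_todo():
    # io.StringIO stands in for an open file, so nothing touches the disk.
    fake_file = io.StringIO("x = 1\n# TODO: rename x\nprint(x)  # TODO\n")
    assert count_todo_lines(fake_file) == 2


def test_empty_input_has_no_todos():
    assert count_todo_lines([]) == 0
```

Both tests need no setup, no configuration, and no shared state, so they can run in any order and alongside any other unit test.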

Test Driven Development (TDD)

TDD is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved only enough to pass the new tests. This is opposed to software development that allows software to be added that is not proven to meet requirements.

The Three Laws of TDD:

  • First Law: You may not write production code until you have written a failing unit test.
  • Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
  • Third Law: You may not write more production code than is sufficient to pass the currently failing test.

Behavior-Driven Development (BDD)

BDD is a software development process that puts feature behaviors first.

A behavior is how a feature operates within a well-defined scenario of inputs, actions, and outcomes. Behaviors are identified using specification by example. Behavior specs become the requirements, the acceptance criteria, and the acceptance tests.

The standard way to write BDD test specs is the “Given-When-Then” format of the so-called Gherkin language.

  • Given steps should use past or present-perfect tense, because they represent an initial state that must already be established.
  • When steps should use present tense, because they represent actions actively performed as part of the behavior.
  • Then steps should use present or future tense, because they represent what should happen after the behavior actions.


Feature: Google Searching; Scenario: Search result linking

  • Given Google search results for “panda” are shown
  • When the user clicks the first result link
  • Then the page for the chosen result link is displayed
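The scenario above could be mirrored in a plain Python test along the following lines. This is only a sketch: `FakeBrowser` and its methods are hypothetical stand-ins invented for this example, and a real project would more likely bind the Gherkin steps to code through a BDD tool (behave and pytest-bdd are two that exist for Python).

```python
class FakeBrowser:
    """Hypothetical in-memory stand-in for a real browser session."""

    def __init__(self):
        self.results = []
        self.current_page = None

    def show_results(self, query):
        # Pretend the search engine returned two result links.
        self.results = [
            f"https://example.com/{query}/1",
            f"https://example.com/{query}/2",
        ]
        self.current_page = "results"

    def click_result(self, index):
        self.current_page = self.results[index]


def test_search_result_linking():
    # Given search results for "panda" are shown
    browser = FakeBrowser()
    browser.show_results("panda")

    # When the user clicks the first result link
    first_link = browser.results[0]
    browser.click_result(0)

    # Then the page for the chosen result link is displayed
    assert browser.current_page == first_link
```

Keeping the three comments in the test body makes it easy to trace each block of code back to the step it implements.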

In a way, BDD is an extension of TDD that tests the behaviors the system’s customers care about, in the form of acceptance tests.

Red Green Refactor — Improve Your Test Code

Now that we know the basics, let’s improve our test code using a very effective process called Red-Green-Refactor.

Before understanding the process let’s understand why it works. This process works well for two reasons:

First, we’re working in baby steps, constantly forming hypotheses and checking them. (“The bar should turn red now… now it should turn green… now it should still be green… now it should be red…”). Whenever we make a mistake, we catch it immediately. It’s only been a few lines of code since we made the mistake, which makes it easy to find and fix. And we all know that finding mistakes, not fixing them, is the most expensive part of programming.

The other reason is that we are always thinking about design. Whether we are deciding which test to write next, which interface to create, or how to refactor a certain section, we are designing the system. All of this thought on design is immediately tested by turning it into code, which very quickly shows us whether the design is good or bad.

There are five steps in this process:

  • Think: Figure out what test will best move your code towards completion. (Take as much time as you need. This is the hardest step for beginners.)
  • Red: Write a very small amount of test code. Only a few lines, usually no more than five. Run the tests and watch the new test fail: the test bar should turn red. This should only take about 30 seconds.
  • Green: Write a very small amount of production code. Again, usually no more than five lines of code. Don’t worry about design purity or conceptual elegance. Sometimes you can just hard-code the answer. This is okay because you’ll be refactoring in a moment. Run the tests and watch them pass: the test bar will turn green. This should only take about 30 seconds, too.
  • Refactor: Now that your tests are passing, you can make changes without worrying about breaking anything. Pause for a moment. Take a deep breath if you need to. Then look at the code you’ve written, and ask yourself if you can improve it. Look for duplication and other “code smells.” If you see something that doesn’t look right, but you’re not sure how to fix it, that’s okay. Take a look at it again after you’ve gone through the cycle a few more times. Take as much time as you need on this step. After each little refactoring, run the tests and make sure they still pass.
  • Repeat: Do it again. You’ll repeat this cycle dozens of times in an hour. Typically, you’ll run through several cycles, three to five, very quickly, then find yourself slowing down and spending more time on refactoring. Then you’ll speed up again. 20–40 cycles in an hour is not unreasonable.
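Here is one way the first few cycles might look, sketched in Python with the familiar FizzBuzz exercise. The choice of exercise, the function names, and the tiny `run_tests` helper are assumptions made for this illustration. In real work there would be a single `fizzbuzz` function that changes over time; the three versions are kept side by side here only so the whole progression fits in one runnable snippet.

```python
def fizzbuzz_cycle_1(n):
    # Green for the first test only: hard-coding the answer is acceptable here.
    return "1"


def fizzbuzz_cycle_2(n):
    # A second test (2 -> "2") fails against the hard-coded version,
    # so the code is generalized just enough to pass both tests.
    return str(n)


def fizzbuzz_cycle_3(n):
    # A third test (3 -> "Fizz") forces the next small step.
    if n % 3 == 0:
        return "Fizz"
    return str(n)


def run_tests(fizzbuzz, cases):
    """Return the inputs whose result differs from the expected value."""
    return [n for n, expected in cases if fizzbuzz(n) != expected]


# Test cases in the order they were written, one per cycle.
CASES = [(1, "1"), (2, "2"), (3, "Fizz")]

print(run_tests(fizzbuzz_cycle_1, CASES[:2]))  # red: [2]
print(run_tests(fizzbuzz_cycle_2, CASES[:2]))  # green: []
print(run_tests(fizzbuzz_cycle_2, CASES))      # red: [3]
print(run_tests(fizzbuzz_cycle_3, CASES))      # green: []
```

Each new test case turns the bar red against the previous version, and each new version does only enough to turn it green again. A refactoring step would follow each green result, with the same cases rerun to confirm nothing broke.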

Different Types Of Test

Unit tests are not the only type of test we write. Let’s look at the others.

Test Pyramid

Unit tests are very low level, close to the source of your application. They consist of testing individual methods and functions of the classes, components or modules used by your software. Unit tests are in general quite cheap to automate and can be run very quickly by a continuous integration server.

Integration tests verify that different modules or services used by your application work well together. For example, it can be testing the interaction with the database or making sure that micro-services work together as expected. These types of tests are more expensive to run as they require multiple parts of the application to be up and running.

Functional tests focus on the business requirements of an application. They only verify the output of an action and do not check the intermediate states of the system when performing that action.

There is sometimes confusion between integration tests and functional tests, as they both require multiple components to interact with each other. The difference is that an integration test may simply verify that you can query the database, while a functional test would expect to get a specific value from the database as defined by the product requirements.
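The distinction can be sketched in Python with the standard-library `sqlite3` module. The `products` table, the helper functions, and the price requirement are all invented for this example, and an in-memory SQLite database is used only so the snippet runs anywhere; a real integration test would talk to the database the application actually uses.

```python
import sqlite3


def create_store():
    """Hypothetical product store backed by an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price_cents INTEGER)")
    conn.execute("INSERT INTO products VALUES ('panda plush', 1999)")
    conn.commit()
    return conn


def price_of(conn, name):
    """Return the price in cents, or None if the product is unknown."""
    row = conn.execute(
        "SELECT price_cents FROM products WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


def test_integration_database_can_be_queried():
    # Integration: only checks that the code and the database work together.
    conn = create_store()
    assert conn.execute("SELECT COUNT(*) FROM products").fetchone()[0] >= 0


def test_functional_plush_costs_19_99():
    # Functional: checks the specific value a (made-up) requirement demands.
    conn = create_store()
    assert price_of(conn, "panda plush") == 1999
```

The first test would still pass if the price were wrong; the second would not, because it encodes what the product is supposed to do.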

End-to-end testing replicates user behavior with the software in a complete application environment. It verifies that various user flows work as expected. These flows can be as simple as loading a web page or logging in, or can be much more complex scenarios such as verifying email notifications or online payments.

End-to-end tests are very useful, but they’re expensive to perform and can be hard to maintain when they’re automated. It is recommended to have a few key end-to-end tests and rely more on lower level types of testing (unit and integration tests) to be able to quickly identify breaking changes.

Acceptance tests are formal tests executed to verify if a system satisfies its business requirements. They require the entire application to be up and running and focus on replicating user behaviors. But they can also go further and measure the performance of the system and reject changes if certain goals are not met.

Performance tests check the behavior of the system when it is under significant load. These tests are non-functional and can take various forms to assess the reliability, stability, and availability of the platform. For instance, a performance test can observe response times when executing a high number of requests, or see how the system behaves with a significant amount of data.

Performance tests are by their nature quite costly to implement and run, but they can help you understand if new changes are going to degrade your system.

Smoke tests are basic tests that check basic functionality of the application. They are meant to be quick to execute, and their goal is to give you the assurance that the major features of your system are working as expected.

Smoke tests can be useful right after a new build is made, to decide whether or not you can run more expensive tests, or right after a deployment, to make sure that the application is running properly in the newly deployed environment.

The State

A lot of developers do not want to write tests. Many engineering cultures do not follow Test Driven Development (TDD), Behavior Driven Development (BDD), eXtreme Programming, or any other programming practice that mandates writing tests, because they do not see testing as a productive activity. The comic part is that many of them want to “be Agile”, yet the agile community calls for TDD; they embrace “DevOps”, whose primary objective is to automate operations, yet they hope to achieve it without any automated tests. Irony!

We should write tests. Start with unit tests; they are easy. Adopting any framework immediately is hard, and it takes some time to adapt. If we want to improve the quality of our code and provide some assurance that the system being built works under the agreed criteria, we must set a culture that prioritizes writing tests.

I encourage developers to write tests, and to write them first, because it’s not just an act, but a philosophy!


Michael C. Feathers, Martin Fowler, Robert C. Martin and James Shore