Unraveling the Power of Property-Based Testing in Unveiling our Own Biases

Yoan Thirion

In this article, I would like to address an often-overlooked aspect of our craft: our inherent biases when identifying test cases and, consequently, when writing production code.

As Alberto Brandolini aptly put it:

"It is not the domain experts' knowledge that goes into production; it is the assumption of the developers that goes into production."

This insight strikes a chord, highlighting a truth we often sidestep: we are all biased when identifying test cases, and therefore when implementing production code.

💡
How can we fight these biases?

Different kinds of tests

Before talking about Property-Based Testing, let’s talk about the different testing techniques we use to check the correctness of our implementations.

[Figure: Different kinds of tests]

These techniques are plotted on two axes:

  • Input scope covered: How much of the scope of possible inputs do we cover?
  • Feature compliance: How compliant is the developed feature with what is expected?

Examples are a great way to understand requirements:

Given (x, y, ...)
When I [call the subject under test] with (x, y, ...)
Then I expect this (output)

Let's illustrate the power of examples.

Example-Based Testing on Goat Numerals

Imagine that we need to implement a system that translates numbers into the goat numeral language. Goats learned to count in ancient Rome. They adapted Roman numerals into their own dialect, but the logic is exactly the same:

Number  Roman  Goat
1       I      M
3       III    MMM
4       IV     MBa
5       V      Ba
10      X      Meh
100     C      Meeh
400     CD     MeehBaaa
...     ...    ...

We have worked with some 🐐 business experts and have identified with them the following test list:

1 - M
3 - MMM
4 - MBa
5 - Ba
10 - Meh
13 - MehMMM
50 - Baa
100 - Meeh
500 - Baaa
1000 - 🐐
2499 - 🐐🐐MeehBaaaMehMeehMMeh
0 - None

We will use the test list above to drive our implementation, following the Canon TDD approach. After a few iterations, we may end up with code that looks like this:

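using FluentAssertions;
using Xunit;
// Static imports so that Convert(...) and Empty can be used unqualified.
// Assumes GoatNumeralsConverter lives in the global namespace, as declared below.
using static GoatNumeralsConverter;
using static System.String;

public class GoatNumeralsTest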
{
    [Fact]
    public void GenerateEmptyFor0()
        => Convert(0)
            .Should()
            .Be(Empty);

    [Theory]
    [InlineData(1, "M")]
    [InlineData(3, "MMM")]
    [InlineData(4, "MBa")]
    [InlineData(5, "Ba")]
    [InlineData(10, "Meh")]
    [InlineData(13, "MehMMM")]
    [InlineData(50, "Baa")]
    [InlineData(100, "Meeh")]
    [InlineData(500, "Baaa")]
    [InlineData(1000, "🐐")]
    [InlineData(2499, "🐐🐐MeehBaaaMehMeehMMeh")]
    public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
        => Convert(number)
            .Should()
            .Be(expectedGoatNumeral);
}

using System.Collections.Generic;
using System.Text;

public static class GoatNumeralsConverter
{
    private static readonly Dictionary<int, string> IntToGoatNumerals = new()
    {
        {1000, "🐐"},
        {900, "Meu🐐"},
        {500, "Baaa"},
        {400, "MeehBaaa"},
        {100, "Meeh"},
        {90, "MehMeeh"},
        {50, "Baa"},
        {40, "MehBaa"},
        {10, "Meh"},
        {9, "MMeh"},
        {5, "Ba"},
        {4, "MBa"},
        {1, "M"}
    };

    // We choose to return an empty string for edge cases (0 here)
    public static string Convert(int number)
    {
        if (number != 0)
        {
            var goatNumerals = new StringBuilder();
            var remaining = number;

            foreach (var toGoat in IntToGoatNumerals)
            {
                while (remaining >= toGoat.Key)
                {
                    goatNumerals.Append(toGoat.Value);
                    remaining -= toGoat.Key;
                }
            }

            return goatNumerals.ToString();
        }
        else
        {
            return string.Empty;
        }
    }
}

We are pretty happy with this implementation because we have followed our test list, and our stakeholders are happy as well.

However, we only focused on the input scope we identified, and we may have missed a scenario or misunderstood something...

Now, let's imagine an approach as powerful as example-based testing in terms of being feature compliant while also covering a wider range of inputs. That’s the promise behind Property-Based Testing.


Property-Based Testing to the Rescue

Enter Property-Based Testing (PBT), a paradigm shift from our traditional example-based testing. Instead of specifying a set of inputs and expected outputs, PBT lets us define properties our code must satisfy, then generates a wide range of test cases that challenge our unconscious biases.

A property looks like this:

for all (x, y, ...)
such that property (x, y, ...)
is satisfied

In other words:

  • Describe the input
  • Describe a property of the output
  • Let the test run a lot of random examples and check if it fails

[Figure: Anatomy of a property]
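
To make these three steps concrete, here is a minimal hand-rolled sketch in plain C#, without any library. The `TinyProperty` class and its `FirstCounterExample` method are hypothetical names introduced for illustration only. Real PBT libraries offer much more (richer generators, shrinking of failing values, reporting), so treat this as a toy model of the idea rather than how any library works internally.

```csharp
using System;

public static class TinyProperty
{
    // Hypothetical helper: runs `property` against `runs` random inputs and
    // returns the first input that falsifies it, or null when all inputs pass.
    public static int? FirstCounterExample(
        Func<Random, int> generate,
        Func<int, bool> property,
        int runs = 100,
        int seed = 42)
    {
        var random = new Random(seed);
        for (var i = 0; i < runs; i++)
        {
            var input = generate(random);        // 1. describe the input
            if (!property(input)) return input;  // 2. check a property of the output
        }
        return null;                             // 3. no random example falsified it
    }

    public static void Main()
    {
        // for all n in [1; 3999], doubling n yields an even number
        var evenWhenDoubled = FirstCounterExample(r => r.Next(1, 4000), n => (n * 2) % 2 == 0);
        Console.WriteLine(evenWhenDoubled is null ? "holds" : $"falsified by {evenWhenDoubled}");

        // for all n in [1; 3999], n is odd (a wrong assumption the runner should expose)
        var alwaysOdd = FirstCounterExample(r => r.Next(1, 4000), n => n % 2 == 1);
        Console.WriteLine(alwaysOdd is null ? "holds" : $"falsified by {alwaysOdd}");
    }
}
```

The runner stops at the first falsifying input, which mirrors the "checking is aborted" behaviour of a property-based testing library when a property is falsified.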

Property-Based Testing on Goat Numerals

For instance, when working on Goat Numerals we may identify these properties:

for all (invalidNumbers)
such that n < 1 or n > 3999 holds
convert(n) is empty

for all (validNumbers)
such that n in [1; 3999] holds
convert(n) contains only sounds in "M, Ba, Meh, Baa, Meeh, Baaa, 🐐"

Let's implement those properties using FsCheck, a .NET library written in F# for property-based testing. It lets us define properties that our code should satisfy and automatically generates test cases to verify them.

FsCheck: Random Testing for .NET (official repository)

Inspired by Haskell's QuickCheck, it helps uncover edge cases and bugs by testing code against a broad set of inputs, enhancing the robustness and reliability of software.

We add the NuGet packages:

Install-Package FsCheck
Install-Package FsCheck.Xunit

Create our first Property: only valid Goat Characters

Now that FsCheck is installed we can write a first property:

using System.Text.RegularExpressions;
using FsCheck;
using FsCheck.Xunit;

public partial class GoatNumeralsTest
{
    ...
    
    // We use a Regex to check the result is valid
    [GeneratedRegex("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$")]
    private static partial Regex ValidGoatRegex();

    private static bool AllGoatCharactersAreValid(string goatNumber) 
        => ValidGoatRegex().IsMatch(goatNumber);

    // We define a new Property using this FsCheck Attribute
    [Property]
    public void ReturnsOnlyValidSymbolsForValidNumbers() => 
        // for all(validNumbers) such as n in [1; 3999] holds
        Prop.ForAll(ValidNumbers,
                // We call the Convert method for each n value and return true when the string value is valid regarding the Regex
                n => AllGoatCharactersAreValid(Convert(n)))
            // Glue for failing the tests through xUnit
            .QuickCheckThrowOnFailure();
            
    // We define how our machine can generate valid numbers by using Arb and Gen classes (from FsCheck)
    private static readonly Arbitrary<int> ValidNumbers = Gen.Choose(1, 3999).ToArbitrary();
}

At each run, FsCheck will generate new random data:

Run 1 : 1921, 3420, 292, 897, 52, ...
Run 2 : 1229, 1205, 919, 1243, ...
Run 3 : 3466, 644, 2027, 705, ...

Let's run it:

[Figure: Our property fails at first run]

It fails on the value 2900 after 5 passing values, meaning that the property returned false for that input.
The property is said to be falsified, and checking is aborted.

What does it mean if a property-test fails?

If the framework manages to find an edge case, there are three possibilities:

  • The production code is not correctly implemented
  • We are not testing the parameters / property the right way
  • Our understanding and definition of the property are not correct

Isolate the Edge Case

Once a value falsifies our property, a good practice is to isolate it in a classic Unit Test to investigate.

Let's add this test case to isolate the problem:

[Theory]
...
// We asked our goat expert what is expected for 2900
[InlineData(2900, "🐐🐐Meeh🐐")]
public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
    => Convert(number)
        .Should()
        .Be(expectedGoatNumeral);

Here is the actual result of the test...

[Figure: The failing unit test identifies a bug in our production code]

Well, we have identified a bug in our implementation... After investigating the code, we found a typo in the Dictionary used in the conversion process:

private static readonly Dictionary<int, string> IntToGoatNumerals = new()
{
    {1000, "🐐"},
    {900, "Meu🐐"}, // Should be 1000 - 100: CM in Roman, so Meeh🐐 in Goat
    {100, "Meeh"},
    ...
};

Is this the kind of mistake we often make as developers? 🤔

After this fix, our property is now green 🥳
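
To see why the symbol property could catch this typo, we can feed the regex the two candidate outputs for 2900 by hand. This is a minimal sketch, assuming the same pattern as `ValidGoatRegex`. The `GoatSymbolCheck` class is a hypothetical name, and the regex is built at runtime rather than source-generated only to keep the snippet self-contained.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GoatSymbolCheck
{
    // Same pattern as the article's ValidGoatRegex
    private static readonly Regex ValidGoat =
        new Regex("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$");

    public static bool IsValid(string goatNumeral) => ValidGoat.IsMatch(goatNumeral);

    public static void Main()
    {
        // Output for 2900 with the typo: "Meu" is not a goat sound
        Console.WriteLine(IsValid("🐐🐐Meu🐐"));  // False

        // Output for 2900 once 900 maps to "Meeh🐐"
        Console.WriteLine(IsValid("🐐🐐Meeh🐐")); // True
    }
}
```

Note that this property only checks the alphabet of the result, not its value: a wrong but well-formed numeral would still pass, which is why the example-based tests remain useful next to it.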

Create a second property: empty for invalid numbers

This one is even simpler to write:

// Define what invalid numbers are
private static readonly Arbitrary<int> InvalidNumbers = Arb.Default.Int32().Filter(x => x is <= 0 or > 3999);

[Property]
public void ReturnsEmptyForAnyInvalidNumber()
    // Returns true if convert result is Empty
    => Prop.ForAll(InvalidNumbers, n => Convert(n) == Empty)
                .QuickCheckThrowOnFailure();

As for the first property, this one is quickly falsified (after 2 values):

[Figure: Second property fails for -3]

Indeed, the current implementation does not handle the boundaries of the algorithm:

public static string Convert(int number)
{
    // Check only that number is not 0...
    if (number != 0)
    {
        ...
    }
    else
    {
        return string.Empty;
    }
}

We can easily refactor our code to make it pass our property:

public static class GoatNumeralsConverter
{
    private const int Min = 1;
    private const int Max = 3999;

    private static readonly Dictionary<int, string> IntToGoatNumerals = new()
    {
        {1000, "🐐"},
        {900, "Meeh🐐"},
        {500, "Baaa"},
        {400, "MeehBaaa"},
        {100, "Meeh"},
        {90, "MehMeeh"},
        {50, "Baa"},
        {40, "MehBaa"},
        {10, "Meh"},
        {9, "MMeh"},
        {5, "Ba"},
        {4, "MBa"},
        {1, "M"}
    };

    // Add an `IsInRange` method that checks the boundaries
    public static string Convert(int number)
        => IsInRange(number) ? ConvertSafely(number) : Empty;

    private static bool IsInRange(int number) => number is >= Min and <= Max;

    private static string ConvertSafely(int number)
    {
        var goatNumerals = new StringBuilder();
        var remaining = number;

        foreach (var toGoat in IntToGoatNumerals)
        {
            while (remaining >= toGoat.Key)
            {
                goatNumerals.Append(toGoat.Value);
                remaining -= toGoat.Key;
            }
        }

        return goatNumerals.ToString();
    }
}

We have mixed "classic" Unit Tests and Properties in our tests. We end up with a test class that looks like this:

public partial class GoatNumeralsTest
{
    [Fact]
    public void GenerateEmptyFor0()
        => Convert(0)
            .Should()
            .Be(Empty);

    [Theory]
    [InlineData(1, "M")]
    [InlineData(3, "MMM")]
    [InlineData(4, "MBa")]
    [InlineData(5, "Ba")]
    [InlineData(10, "Meh")]
    [InlineData(13, "MehMMM")]
    [InlineData(50, "Baa")]
    [InlineData(100, "Meeh")]
    [InlineData(500, "Baaa")]
    [InlineData(1000, "🐐")]
    [InlineData(2499, "🐐🐐MeehBaaaMehMeehMMeh")]
    [InlineData(2900, "🐐🐐Meeh🐐")]
    public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
        => Convert(number)
            .Should()
            .Be(expectedGoatNumeral);

    [GeneratedRegex("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$")]
    private static partial Regex ValidGoatRegex();

    private static readonly Arbitrary<int> ValidNumbers = Gen.Choose(1, 3999).ToArbitrary();

    [Property]
    public void ReturnsOnlyValidSymbolsForValidNumbers()
        => Prop.ForAll(ValidNumbers,
                n => AllGoatCharactersAreValid(Convert(n)))
            .QuickCheckThrowOnFailure();

    private static bool AllGoatCharactersAreValid(string goatNumber) => ValidGoatRegex().IsMatch(goatNumber);

    private static readonly Arbitrary<int> InvalidNumbers = Arb.Default.Int32().Filter(x => x is <= 0 or > 3999);

    [Property]
    public void ReturnsEmptyForAnyInvalidNumber()
        => Prop.ForAll(InvalidNumbers, n => Convert(n) == Empty)
            .QuickCheckThrowOnFailure();
}

People often equate example-based tests with properties. In my opinion, they complement each other: one verifies the expected results, and the other validates business properties.

Conclusion

> By using PBT, we quickly identified 2 bugs in our implementation by fighting the biases that limited us while elaborating the test list and writing the implementation. Imagine what the impact of this practice could be on your own code 🤔.

PBT shines in its ability to unearth edge cases that might elude manual testing, thus preventing potential bugs from sneaking into production.

There are lots of use cases for PBT:

  • Idempotence: Verify operations like resetting a Goat's state are idempotent.
    • PBT can automatically check that multiple applications of the operation do not alter its outcome: f(f(x)) = f(x)
    • Examples: UpperCase, Delete
  • Round Tripping: For a given process, PBT can ensure that "serializing" and then "deserializing" a Goat object returns the original state, thus guaranteeing data integrity.
    • from(to(x)) = x
    • Examples: Serialization, Conversion
  • Checking Invariants: Invariants such as "A Goat's age cannot be negative" can be continuously validated across generated test cases, ensuring the model's integrity.
  • Replacing Parameterized Tests: Instead of manually crafting tests for boundary conditions—such as exceptions thrown for invalid Goat ages—PBT systematically explores these scenarios.
  • Refactoring or Rewriting Code: PBT assists in validating the new implementation against the old, ensuring behavioral consistency while refactoring or rewriting sections of the codebase.
    • f(x) = new_f(x)
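
As an illustration of the round-tripping idea applied to Goat Numerals, here is a minimal sketch of `from(to(x)) = x`. It is not taken from the article's repository: `GoatRoundTrip`, `ToGoat`, `FromGoat` and `RoundTripHolds` are hypothetical names, and `FromGoat` is an assumed inverse written only for this example. Because the valid range is small, the sketch checks every number in [1; 3999] instead of sampling randomly.

```csharp
using System;
using System.Linq;
using System.Text;

public static class GoatRoundTrip
{
    // Ordered from the largest value to the smallest
    private static readonly (int Value, string Sound)[] Sounds =
    {
        (1000, "🐐"), (900, "Meeh🐐"), (500, "Baaa"), (400, "MeehBaaa"),
        (100, "Meeh"), (90, "MehMeeh"), (50, "Baa"), (40, "MehBaa"),
        (10, "Meh"), (9, "MMeh"), (5, "Ba"), (4, "MBa"), (1, "M")
    };

    // to(x): same greedy algorithm as the article's ConvertSafely
    public static string ToGoat(int number)
    {
        var result = new StringBuilder();
        foreach (var (value, sound) in Sounds)
        {
            while (number >= value)
            {
                result.Append(sound);
                number -= value;
            }
        }
        return result.ToString();
    }

    // from(x): hypothetical inverse that consumes sounds from left to right
    public static int FromGoat(string goatNumeral)
    {
        var total = 0;
        var position = 0;
        foreach (var (value, sound) in Sounds)
        {
            while (goatNumeral.Substring(position).StartsWith(sound, StringComparison.Ordinal))
            {
                total += value;
                position += sound.Length;
            }
        }
        return total;
    }

    // Round-trip property: for all n in [1; 3999], from(to(n)) == n
    public static bool RoundTripHolds()
        => Enumerable.Range(1, 3999).All(n => FromGoat(ToGoat(n)) == n);

    public static void Main()
        => Console.WriteLine(RoundTripHolds() ? "round trip holds" : "round trip falsified");
}
```

The sketch stores the sounds in an array because the greedy algorithm relies on iterating from the largest value to the smallest, and an array makes that order explicit. With FsCheck, the same check could presumably be expressed by passing `n => FromGoat(Convert(n)) == n` to `Prop.ForAll` with the `ValidNumbers` generator shown earlier.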
💡
What do you think about it?
How could it be useful to you, and when?
Source code of this article: GitHub - ythirion/goat

To Go Further

If you want to go further on these kinds of topics, I invite you to take a look at our Advent of Craft repository. We talk and share about topics related to Test-Driven Development, Clean Testing, Refactoring, Design, and Functional Programming.

Advent of Craft repository: GitHub - advent-of-craft/advent-of-craft


