Mastering Property-Based Testing in .NET with FsCheck
In this article, I would like to address an often-overlooked aspect of our craft: our inherent biases when identifying test cases and, consequently, in the production code we write.
As Alberto Brandolini aptly put it:
"It is not the domain experts' knowledge that goes into production; it is the assumption of the developers that goes into production."
This insight strikes a chord, highlighting a truth we often sidestep: we are all biased when identifying test cases, and therefore when implementing production code.
Different kinds of tests
Before talking about Property-Based Testing, let's talk about the different test techniques we use to check the correctness of our implementations.
These techniques can be placed on two axes:
- Input scope covered: how much of the scope of possible inputs do we cover?
- Feature compliance: how compliant is the developed feature with what is expected?
Examples are a really good way to understand requirements:
Given (x, y, ...)
When I [call the subject under test] with (x, y, ...)
Then I expect this (output)
Let's illustrate the power of examples.
Example-Based Testing on Goat Numerals
Imagine that we need to implement a system that translates numbers into the goat numeral language. Goats learned to count in ancient Rome. They adapted Roman numerals into their own dialect, but the logic is exactly the same:
Number | Roman | Goat |
---|---|---|
1 | I | M |
3 | III | MMM |
4 | IV | MBa |
5 | V | Ba |
10 | X | Meh |
100 | C | Meeh |
400 | CD | MeehBaaa |
... | ... | ... |
We have worked with some 🐐 business experts and have identified with them the following test list:
1 - M
3 - MMM
4 - MBa
5 - Ba
10 - Meh
13 - MehMMM
50 - Baa
100 - Meeh
500 - Baaa
1000 - 🐐
2499 - 🐐🐐MeehBaaaMehMeehMMeh
0 - None
We will use the test list above to drive our implementation, following the Canon T.D.D approach. After a few iterations, we may end up with code that looks like this:
public class GoatNumeralsTest
{
    // Convert and Empty are assumed to be brought into scope through "using static"
    // directives (on GoatNumeralsConverter and System.String).
    [Fact]
    public void GenerateEmptyFor0()
        => Convert(0)
            .Should()
            .Be(Empty);

    [Theory]
    [InlineData(1, "M")]
    [InlineData(3, "MMM")]
    [InlineData(4, "MBa")]
    [InlineData(5, "Ba")]
    [InlineData(10, "Meh")]
    [InlineData(13, "MehMMM")]
    [InlineData(50, "Baa")]
    [InlineData(100, "Meeh")]
    [InlineData(500, "Baaa")]
    [InlineData(1000, "🐐")]
    [InlineData(2499, "🐐🐐MeehBaaaMehMeehMMeh")]
    public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
        => Convert(number)
            .Should()
            .Be(expectedGoatNumeral);
}

public static class GoatNumeralsConverter
{
    private static readonly Dictionary<int, string> IntToGoatNumerals = new()
    {
        {1000, "🐐"},
        {900, "Meu🐐"},
        {500, "Baaa"},
        {400, "MeehBaaa"},
        {100, "Meeh"},
        {90, "MehMeeh"},
        {50, "Baa"},
        {40, "MehBaa"},
        {10, "Meh"},
        {9, "MMeh"},
        {5, "Ba"},
        {4, "MBa"},
        {1, "M"}
    };

    // We choose to return an empty string for edge cases (0 here)
    public static string Convert(int number)
    {
        if (number != 0)
        {
            var goatNumerals = new StringBuilder();
            var remaining = number;

            // Greedy conversion: append the largest sound that still fits, then subtract its value
            foreach (var toGoat in IntToGoatNumerals)
            {
                while (remaining >= toGoat.Key)
                {
                    goatNumerals.Append(toGoat.Value);
                    remaining -= toGoat.Key;
                }
            }

            return goatNumerals.ToString();
        }
        else
        {
            return string.Empty;
        }
    }
}
We are pretty happy with this implementation because we have followed our test list, and our stakeholders are happy as well.
However, we only focused on the input scope we identified, and we may have missed a scenario or misunderstood something...
Now, let's imagine an approach as powerful as example-based testing in terms of being feature compliant while also covering a wider range of inputs. That’s the promise behind Property-Based Testing.
Property-Based Testing to the Rescue
Enter Property-Based Testing (PBT), a paradigm shift from our traditional example-based testing. Instead of specifying a set of inputs and expected outputs, PBT lets us define properties our code must adhere to, and generates a wide range of test cases that challenge our unconscious biases.
A property looks like this:
for all (x, y, ...)
such that property (x, y, ...)
is satisfied
In other words:
- Describe the input
- Describe a property of the output
- Let the test run a lot of random examples and check if it fails
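The three steps above can be sketched without any library. This is only a minimal illustration of the mechanics, not how FsCheck is implemented: `TinyPropertyRunner`, `FindCounterExample`, and their parameters are hypothetical names introduced for this sketch.

```csharp
using System;

// Hypothetical helper, written for illustration only (not part of FsCheck).
public static class TinyPropertyRunner
{
    // Runs `property` against `runs` generated inputs and returns the first
    // input that falsifies it, or null when every generated input passes.
    public static int? FindCounterExample(
        Func<Random, int> generate,
        Func<int, bool> property,
        int runs = 100,
        int seed = 42)
    {
        // A fixed seed makes a failing run reproducible.
        var random = new Random(seed);

        for (var i = 0; i < runs; i++)
        {
            // 1. Describe the input: ask the generator for a new value.
            var input = generate(random);

            // 2. Check the property of the output; stop at the first failure.
            if (!property(input))
            {
                return input;
            }
        }

        // 3. Every generated example passed.
        return null;
    }
}
```

For example, `TinyPropertyRunner.FindCounterExample(r => r.Next(1, 4000), n => n * 2 < 5000)` should come back with some number of at least 2500, while a property that holds for every generated input yields `null`. A real library adds a lot on top of this loop, which is why we rely on one in the rest of the article.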
Property-Based Testing on Goat Numerals
For instance, when working on Goat Numerals we may identify these properties:
for all (invalidNumbers)
such that n < 1 or n > 3999
convert(n) is empty
for all (validNumbers)
such that n in [1; 3999]
convert(n) contains only sounds in "M, Ba, Meh, Baa, Meeh, Baaa, 🐐"
Let's implement these properties using FsCheck.
It is a .NET library written in F# for property-based testing. It allows us to define properties that our code should satisfy and automatically generates test cases to verify them.
Inspired by Haskell's QuickCheck, it helps uncover edge cases and bugs by testing code against a broad set of inputs, which improves the robustness and reliability of the software.
We add the NuGet packages:
Install-Package FsCheck
Install-Package FsCheck.Xunit
Create our first Property: only valid Goat Characters
Now that FsCheck is installed, we can write a first property:
public partial class GoatNumeralsTest
{
    ...
    // We use a Regex to check the result is valid
    [GeneratedRegex("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$")]
    private static partial Regex ValidGoatRegex();

    private static bool AllGoatCharactersAreValid(string goatNumber)
        => ValidGoatRegex().IsMatch(goatNumber);

    // We define a new Property using this FsCheck Attribute
    [Property]
    public void ReturnsOnlyValidSymbolsForValidNumbers() =>
        // for all(validNumbers) such that n in [1; 3999] holds
        Prop.ForAll(ValidNumbers,
                // We call the Convert method for each n value and return true when the string value is valid according to the Regex
                n => AllGoatCharactersAreValid(Convert(n)))
            // Glue for failing the tests through xUnit
            .QuickCheckThrowOnFailure();

    // We define how our machine can generate valid numbers by using Arb and Gen classes (from FsCheck)
    private static readonly Arbitrary<int> ValidNumbers = Gen.Choose(1, 3999).ToArbitrary();
}
At each run, FsCheck will generate new random data:
Run 1 : 1921, 3420, 292, 897, 52, ...
Run 2 : 1229, 1205, 919, 1243, ...
Run 3 : 3466, 644, 2027, 705, ...
Let's run it:
It fails after 5 passing values, on the value 2900, meaning that this value makes the property yield false.
The property is said to be falsified and checking is aborted.
What does it mean if a property-test fails?
If the framework manages to find an edge case, there are three possibilities:
- The production code is not correctly implemented
- We are not testing the parameters / property the right way
- Our understanding and definition of the property are not correct
Isolate the Edge Case
Once a value falsifies our property, a good practice is to isolate it in a classic Unit Test to investigate.
Let's add this test case to isolate the problem:
[Theory]
...
// We asked our goat expert what is expected for 2900
[InlineData(2900, "🐐🐐Meeh🐐")]
public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
    => Convert(number)
        .Should()
        .Be(expectedGoatNumeral);
The test fails: Convert(2900) returns 🐐🐐Meu🐐 instead of the expected 🐐🐐Meeh🐐.
We have identified a bug in our implementation... After investigating the code, we found a typo in the Dictionary used in the conversion process:
private static readonly Dictionary<int, string> IntToGoatNumerals = new()
{
    {1000, "🐐"},
    {900, "Meu🐐"}, // Should be 1000 - 100, CM in Roman and Meeh🐐 in Goat
    {100, "Meeh"},
    ...
};
Is it the kind of stuff we do often as developers? 🤔
After making this fix, our property is now green 🥳
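To see why 2900 falsified the property, the check can be replayed by hand. The sketch below assumes a plain `Regex` with the same pattern as the property (instead of the source-generated one), and `GoatSoundCheck` is a hypothetical name used only here.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper replaying the property's check on hand-picked strings.
public static class GoatSoundCheck
{
    // Same pattern as in the property: a non-empty sequence of known goat sounds.
    private static readonly Regex ValidGoat =
        new("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$");

    public static bool IsValid(string goatNumber) => ValidGoat.IsMatch(goatNumber);
}

// GoatSoundCheck.IsValid("🐐🐐Meu🐐")  is false: "Meu" is not a goat sound
// GoatSoundCheck.IsValid("🐐🐐Meeh🐐") is true
// GoatSoundCheck.IsValid("MMMM")       is true as well
```

The last line is worth noticing: the property only checks the vocabulary, not the value. A string such as "MMMM" satisfies it even though it is not the expected numeral for 4, which is one reason to keep the example-based tests next to the property.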
Create a second property: empty for invalid numbers
This one is even simpler to write:
// Define what invalid numbers are
private static readonly Arbitrary<int> InvalidNumbers = Arb.Default.Int32().Filter(x => x is <= 0 or > 3999);

[Property]
public void ReturnsEmptyForAnyInvalidNumber()
    // Returns true if convert result is Empty
    => Prop.ForAll(InvalidNumbers, n => Convert(n) == Empty)
        .QuickCheckThrowOnFailure();
Like the first property, this one is quickly falsified (after 2 values).
Indeed, the current implementation does not really handle the boundaries of the algorithm:
public static string Convert(int number)
{
    // Check only that number is not 0...
    if (number != 0)
    {
        ...
    }
    else
    {
        return string.Empty;
    }
}
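To make the boundary problem concrete, here is a condensed copy of that unbounded version (with the 900 typo already fixed). It is a sketch for illustration: `UnboundedGoatConverter` is a hypothetical name, and it uses an array of pairs rather than the Dictionary so that the iteration order is explicit.

```csharp
using System.Text;

// Hypothetical condensed copy of the converter before the boundary fix.
public static class UnboundedGoatConverter
{
    private static readonly (int Value, string Sound)[] Sounds =
    {
        (1000, "🐐"), (900, "Meeh🐐"), (500, "Baaa"), (400, "MeehBaaa"),
        (100, "Meeh"), (90, "MehMeeh"), (50, "Baa"), (40, "MehBaa"),
        (10, "Meh"), (9, "MMeh"), (5, "Ba"), (4, "MBa"), (1, "M")
    };

    // Same logic as the version above: only 0 is treated as an edge case.
    public static string Convert(int number)
    {
        if (number == 0) return string.Empty;

        var goatNumerals = new StringBuilder();
        var remaining = number;
        foreach (var (value, sound) in Sounds)
        {
            while (remaining >= value)
            {
                goatNumerals.Append(sound);
                remaining -= value;
            }
        }
        return goatNumerals.ToString();
    }
}

// UnboundedGoatConverter.Convert(-5)   returns "" (no sound fits, so it is empty by accident)
// UnboundedGoatConverter.Convert(4000) returns "🐐🐐🐐🐐" (not empty: this falsifies the property)
```

Negative numbers happen to produce an empty string, so it is the upper boundary that breaks the property: any number above 3999 is converted as if it were valid.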
We can easily refactor our code to make it pass our property:
public static class GoatNumeralsConverter
{
    private const int Min = 1;
    private const int Max = 3999;

    private static readonly Dictionary<int, string> IntToGoatNumerals = new()
    {
        {1000, "🐐"},
        {900, "Meeh🐐"},
        {500, "Baaa"},
        {400, "MeehBaaa"},
        {100, "Meeh"},
        {90, "MehMeeh"},
        {50, "Baa"},
        {40, "MehBaa"},
        {10, "Meh"},
        {9, "MMeh"},
        {5, "Ba"},
        {4, "MBa"},
        {1, "M"}
    };

    // Add an `IsInRange` method that checks the boundaries
    public static string Convert(int number)
        => IsInRange(number) ? ConvertSafely(number) : Empty;

    private static bool IsInRange(int number) => number is >= Min and <= Max;

    private static string ConvertSafely(int number)
    {
        var goatNumerals = new StringBuilder();
        var remaining = number;

        foreach (var toGoat in IntToGoatNumerals)
        {
            while (remaining >= toGoat.Key)
            {
                goatNumerals.Append(toGoat.Value);
                remaining -= toGoat.Key;
            }
        }

        return goatNumerals.ToString();
    }
}
We have mixed "classic" Unit Tests and Properties in our tests. We end up with a test class that looks like this:
public partial class GoatNumeralsTest
{
    [Fact]
    public void GenerateEmptyFor0()
        => Convert(0)
            .Should()
            .Be(Empty);

    [Theory]
    [InlineData(1, "M")]
    [InlineData(3, "MMM")]
    [InlineData(4, "MBa")]
    [InlineData(5, "Ba")]
    [InlineData(10, "Meh")]
    [InlineData(13, "MehMMM")]
    [InlineData(50, "Baa")]
    [InlineData(100, "Meeh")]
    [InlineData(500, "Baaa")]
    [InlineData(1000, "🐐")]
    [InlineData(2499, "🐐🐐MeehBaaaMehMeehMMeh")]
    [InlineData(2900, "🐐🐐Meeh🐐")]
    public void GenerateGoatNumeralsForNumbers(int number, string expectedGoatNumeral)
        => Convert(number)
            .Should()
            .Be(expectedGoatNumeral);

    [GeneratedRegex("^(?:M|Ba|Meh|Baa|Meeh|Baaa|🐐)+$")]
    private static partial Regex ValidGoatRegex();

    private static readonly Arbitrary<int> ValidNumbers = Gen.Choose(1, 3999).ToArbitrary();

    [Property]
    public void ReturnsOnlyValidSymbolsForValidNumbers()
        => Prop.ForAll(ValidNumbers,
                n => AllGoatCharactersAreValid(Convert(n)))
            .QuickCheckThrowOnFailure();

    private static bool AllGoatCharactersAreValid(string goatNumber) => ValidGoatRegex().IsMatch(goatNumber);

    private static readonly Arbitrary<int> InvalidNumbers = Arb.Default.Int32().Filter(x => x is <= 0 or > 3999);

    [Property]
    public void ReturnsEmptyForAnyInvalidNumber()
        => Prop.ForAll(InvalidNumbers, n => Convert(n) == Empty)
            .QuickCheckThrowOnFailure();
}
People often equate example-based tests with properties. In my opinion, they complement each other: the former verify expected results, while the latter validate business properties.
Conclusion
> By using PBT, we quickly identified 2 bugs in our implementation by fighting the biases that limited us while elaborating the test list and writing the implementation. Imagine what the impact of this practice could be on your own code 🤔.
PBT shines in its ability to unearth edge cases that might elude manual testing, thus preventing potential bugs from sneaking into production.
There are lots of use cases for PBT:
- Idempotence: verify that operations like resetting a Goat's state are idempotent.
  - PBT can automatically check that multiple applications of the operation do not alter its outcome: f(f(x)) = f(x)
  - Examples: UpperCase, Delete
- Round Tripping: for a given process, PBT can ensure that "serializing" and then "deserializing" a Goat object returns to the original state, thus guaranteeing data integrity: from(to(x)) = x
  - Examples: Serialization, Conversion
- Checking Invariants: invariants such as "A Goat's age cannot be negative" can be continuously validated across generated test cases, ensuring the model's integrity.
- Replacing Parameterized Tests: instead of manually crafting tests for boundary conditions (such as exceptions thrown for invalid Goat ages), PBT systematically explores these scenarios.
- Refactoring or Rewriting Code: PBT assists in validating the new implementation against the old, ensuring behavioral consistency while refactoring or rewriting sections of the codebase: f(x) = new_f(x)
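As an illustration of the refactoring use case, f(x) = new_f(x), here is a minimal sketch that compares two Goat Numerals implementations. It assumes the same FsCheck API style as the tests in this article (`Gen.Choose`, `ToArbitrary`, `Prop.ForAll`, `QuickCheckThrowOnFailure`); `GoatRefactoringSketch` and `ConvertByDivision` are hypothetical names introduced for this example.

```csharp
using System.Text;
using FsCheck;

// Hypothetical sketch: an "old" and a "new" implementation checked against each other.
public static class GoatRefactoringSketch
{
    private static readonly (int Value, string Sound)[] Sounds =
    {
        (1000, "🐐"), (900, "Meeh🐐"), (500, "Baaa"), (400, "MeehBaaa"),
        (100, "Meeh"), (90, "MehMeeh"), (50, "Baa"), (40, "MehBaa"),
        (10, "Meh"), (9, "MMeh"), (5, "Ba"), (4, "MBa"), (1, "M")
    };

    // "Old" implementation: repeated subtraction, as in the article.
    public static string Convert(int number)
    {
        var result = new StringBuilder();
        var remaining = number;
        foreach (var (value, sound) in Sounds)
        {
            while (remaining >= value)
            {
                result.Append(sound);
                remaining -= value;
            }
        }
        return result.ToString();
    }

    // "New" implementation: division and remainder instead of a while loop.
    public static string ConvertByDivision(int number)
    {
        var result = new StringBuilder();
        var remaining = number;
        foreach (var (value, sound) in Sounds)
        {
            var count = remaining / value;
            for (var i = 0; i < count; i++) result.Append(sound);
            remaining %= value;
        }
        return result.ToString();
    }

    private static readonly Arbitrary<int> ValidNumbers = Gen.Choose(1, 3999).ToArbitrary();

    // for all valid n: Convert(n) = ConvertByDivision(n)
    public static void NewImplementationBehavesLikeTheOld()
        => Prop.ForAll(ValidNumbers, n => Convert(n) == ConvertByDivision(n))
            .QuickCheckThrowOnFailure();
}
```

The property says nothing about what the correct numeral is; it only states that both versions agree. That is usually enough to rewrite code with confidence, as long as the old behavior is the one we want to keep.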
How could it be useful to you, and when?
To Go Further
If you want to go further on this kind of topic, I invite you to take a look at our Advent of Craft repository. We talk and share about topics related to Test-Driven Development, Clean Testing, Refactoring, Design, and Functional Programming.