Entity Framework and Dapper are usually compared to determine which is the best ORM on .NET. Both have their own advantages.
One thing EF is often criticized for is the slowness of doing simple data retrieval queries with LinQ (we won't cover raw SQL queries here, that's another debate).
Since LINQ instantiates the query context, it obviously has a greater impact on performance than raw SQL, but there is a way to reduce this delay from compiled queries.
What is compiled queries for ?
If we take the definition from Microsoft documentation:
LINQ queries are cached automatically. However, you can still use compiled LINQ queries to reduce this cost in later executions and compiled queries can be more efficient than LINQ queries that are automatically cached.
Reformulated, LINQ handles performance improvement through pre-compiled queries.
There are two types of compilation: synchronous with the CompileQuery
function returning a Task<T>
and asynchronous with CompileAsyncQuery
returning an IAsyncEnumerable<T>
.
Both are static methods of the EF
class declaring a function. They take a database context and parameters to populate the query, and can be simplified as follows:
private static readonly Func<DBCONTEXT, [PARAMETERS], Task<RETURNINGTYPE>> name = EF.CompileAsyncQuery((DBCONTEXT, [PARAMETERS]) => .... );
To clarify with an example, let's say we want to get the first goat matching the identifier 23
. This query will return a single Goat
or null if it does not exist. It takes an integer as a parameter and the database context MyDbContext
.
private static readonly Func<MyDbContext, int, Task<Goat?>> s_getFirstGoatById = EF.CompileAsyncQuery((MyDbContext dbContext, int id) => dbContext.Goats.FirstOrDefault(_ => _.Id == id));
The compiled queries are declared and ready to be used:
public async Task<Goat?> GetFirstGoatById_WithCompiledQuery()
{
var dbContext = new AppDbContext();
return await s_getFirstGoatById(dbContext, 56);
}
With a little practice, it becomes really easy to use.
Benchmark
To properly perform this benchmark, the context is a SQLite database containing 100,000 goats.
var goats = new Fixture().Build<Goat>().CreateMany(100000);
var dbContext = new AppDbContext();
dbContext.AddRange(goats);
dbContext.SaveChanges();
The goats are created with AutoFixture to fill in the properties and approximate the actual contents of the database. This will lead to goats with a name like Namec975de5d-c5fb-4c9c-a6dc-0959f77cdd4f
for example. They have 3 fields: an int
it, a string
name and an int
age.
Everything is ready, the benchmarks can start now!
Get one row
The first test will be a FirstOrDefault
getting an id filter set to 56
.
private static readonly Func<AppDbContext, int, Goat?> s_getFirstGoatById = EF.CompileQuery((AppDbContext dbContext, int id) => dbContext.Goats.FirstOrDefault(_ => _.Id == id));
[Benchmark]
public Goat? GetFirstGoatById_WithCompiledQuery()
{
var dbContext = new AppDbContext();
return s_getFirstGoatById(dbContext, 56);
}
[Benchmark]
public Goat? GetFirstGoatById_WithoutCompiledQuery()
{
var dbContext = new AppDbContext();
return dbContext.Goats.FirstOrDefault(_ => _.Id == 56);
}
The compiled query takes the id used for the filter as parameter and returns the first goat matching this id.
Method | Mean | Error | StdDev |
---|---|---|---|
GetFirstGoatById_WithCompiledQuery | 194.7 us | 3.86 us | 6.55 us |
GetFirstGoatById_WithoutCompiledQuery | 219.8 us | 4.39 us | 9.74 us |
This leads to a 10% performance, great!
To improve this benchmark, the query now filters on two fields: the id and a pattern that the name must contains.
private static readonly Func<AppDbContext, int, string, Goat?> s_getFirstGoatByIdAndName = EF.CompileQuery((AppDbContext dbContext, int id, string name) => dbContext.Goats.FirstOrDefault(_ => _.Id == id && _.Name.Contains(name)));
[Benchmark]
public Goat? GetFirstGoatByIdAndName_WithoutCompiledQuery()
{
var dbContext = new AppDbContext();
return dbContext.Goats.FirstOrDefault(_ => _.Id == 56 && _.Name.Contains("41d2"));
}
[Benchmark]
public Goat? GetFirstGoatByIdAndName_WithCompiledQuery()
{
var dbContext = new AppDbContext();
return s_getFirstGoatByIdAndName(dbContext, 56, "41d2");
}
And the result is the same. On average, the compiled query is faster.
Method | Mean | Error | StdDev |
---|---|---|---|
GetFirstGoatByIdAndName_WithoutCompiledQuery | 219.3 us | 4.24 us | 5.51 us |
GetFirstGoatByIdAndName_WithCompiledQuery | 179.0 us | 3.58 us | 5.68 us |
Get multiples rows
Tests are, for the moment, quite conclusive. Even if the percentage of improvment is not as much as one would expect, it remains correct.
Most performance problems are less frequent for queries that only request a value. They are really felt when manipulating large amounts of data, such as lists. This is what we will test.
private static readonly Func<AppDbContext, int, IEnumerable<Goat>> s_getGoatsOlderThanAndNameAsync =
EF.CompileQuery((AppDbContext db, int age) => db.Goats.Where(_ => _.Age > age));
[Benchmark]
public Goat[] GetGoatsOlderThanAndName_WithCompiledQuery()
{
var dbContext = new AppDbContext();
return s_getGoatsOlderThanAndNameAsync(dbContext, 20).ToArray();
}
[Benchmark]
public Goat[] GetGoatsOlderThanAndName_WithoutCompiledQuery()
{
var dbContext = new AppDbContext();
return dbContext.Goats.Where(_ => _.Age > 20).ToArray();
}
To obtain the following result:
Method | Mean | Error | StdDev |
---|---|---|---|
GetGoatsOlderThan_WithCompiledQuery | 457.7 ms | 9.12 ms | 19.63 ms |
GetGoatsOlderThan_WithoutCompiledQuery | 438.5 ms | 8.72 ms | 19.69 ms |
Contrary to expectations, the compiled query did not improve performance in retrieving list items, even after multiple runs.
Using complex queries
The previous benchmarks were quite simple on a single table. To go further, we create a multi-table query with a filter on dates.
A goat now contains a date of birth, and a list of habits.
We create a complex query, which does not necessarily make sense, but which creates a multitude of filters on different tables and different types.
private static readonly Func<AppDbContext, DateTime, int, string, IEnumerable<Goat>> s_complexQuery =
EF.CompileQuery((AppDbContext db, DateTime dateTime, int goatId, string habit) =>
db.Goats.Where(_ => _.BirthDate < dateTime &&
_.Id > goatId &&
_.Habits.Any(x => x.Name.Contains(habit)) &&
_.Habits.Count() == 3));
[Benchmark]
public Goat[] ComplexQuery_WithCompiledQuery()
{
var dbContext = new AppDbContext();
return s_complexQuery(dbContext, new DateTime(2023), 7000, "6b").ToArray();
}
[Benchmark]
public Goat[] ComplexQuery_WithoutCompiledQuery()
{
var dbContext = new AppDbContext();
return dbContext.Goats.Where(_ => _.BirthDate < new DateTime(2023) &&
_.Id > 7000 &&
_.Habits.Any(x => x.Name.Contains("6b")) &&
_.Habits.Count() == 3)
.ToArray();
}
The results show almost equal performance, even after several trials.
Method | Mean | Error | StdDev |
---|---|---|---|
GetGoatsOlderThanAndName_WithCompiledQuery | 12.74 ms | 0.074 ms | 0.069 ms |
GetGoatsOlderThanAndName_WithoutCompiledQuery | 12.86 ms | 0.144 ms | 0.134 ms |
Summary
Compiled queries do not seem to be a miracle solution to optimize EntityFramework. They must be used in specific cases that meet the following rules:
- The query will be executed more than once independently of the parameters used
- The query compilation is an important part of the query, mainly for single request retrieving
- You are not using a LINQ feature like
IEnumerable<T>.Contains()
which won't work withCompiledQuery
For my part, it looks like micro optimization for real specific cases like getting a single value from the database. Otherwise, the performances are not important enough to find a real interest. I also strongly recommend testing every compiled query before implementing it as we have seen that it is not 100% certain that it will have a positive impact.
The more results the query has, the less impact the compiled query will have, and this is not surprising. The compilation is only a small part of the overall constitution of a query. The larger the part, the more impact it will have.
When a query retrieves a large amount of data, especially lists, the execution takes the majority of the query time, not the compilation.
If you really need the performance of the query, it's better to focus on running the query rather than optimizing the compilation. Try to run the full query, and improve it with indexes, fewer joins between tables, retrieving only the necessary values, etc...
To go further, you can read the Microsoft documentation :
Have a goat day 🐐
Join the conversation.