Explore the differences between complex types and JSON columns in .NET development. Make informed decisions on the optimal data storage approach for your .NET applications
Entity Framework Core (EF Core) has revolutionized the way developers interact with databases in .NET applications. One of the key features that make EF Core so powerful is its ability to map complex types to database columns seamlessly.
In this article, we will delve into the ComplexType
attribute and its counterpart, the JsonColumn
, to understand their roles, advantages, and limitations. By the end of this article, you will have a comprehensive understanding of when and how to use these attributes effectively.
What's complex types ?
A complex type in Entity Framework Core (EF Core) is a class that encapsulates a set of related properties, allowing for more modular and maintainable code. Unlike entity types, complex types do not have their own primary keys and are not tracked independently by the context. Instead, they are owned by an entity and their lifecycle is tied to that of the owning entity.
Before the introduction of the ComplexType
attribute in EF Core, managing complex types was cumbersome. Developers had to rely on workarounds like owned types, which, although functional, introduced unnecessary complexity. Owned types are essentially entity types with hidden keys, leading to complications in scenarios where key semantics were not needed. This made the code harder to read, maintain, and extend, ultimately defeating the purpose of using complex types for better modularity and encapsulation.
The ComplexType
attribute was introduced in EF Core 8.0 to address the need for more sophisticated data modeling.
New solution : ComplexType attribute
The ComplexType
attribute allows developers to encapsulate related properties into a single class, making the code more modular and easier to manage. Complex type objects in EF Core:
- Are not identified or tracked by key values.
- Must be defined within an entity type, meaning you cannot have a
DbSet
of a complex type. - Can be either .NET value types or reference types
- Can share the same instance across multiple properties
For instance, consider a scenario where you have a Goat
entity with several properties related to its field. Without ComplexType
, you would have to include all these properties directly in the Goat
class, leading to a cluttered and less readable codebase.
public class Goat
{
public int Id { get; set; }
public string Name { get; set; }
public int Age { get; set; }
public string FieldStreet { get; set; }
public string FieldCity { get; set; }
public double FieldSize { get; set; }
}
This design is functional but not optimal. The field-related properties can be extracted into a nested class for better organization:
public class Goat
{
public int Id { get; set; }
public string Name { get; set; }
public int Age { get; set; }
public required Field Field { get; set; }
}
[ComplexType]
public class Field
{
public string Street { get; set; }
public string City { get; set; }
public double Size { get; set; }
}
The ComplexType
attribute should be applied to the nested class, in this case, Field
. This tells EF Core to treat the Field
class as a complex type, meaning its properties will be mapped to columns in the Goats
table rather than creating a separate table.
When you run the migration, EF Core will generate the following SQL:
migrationBuilder.CreateTable(
name: "Goats",
columns: table => new
{
Id = table.Column<int>(type: "INTEGER", nullable: false)
.Annotation("Sqlite:Autoincrement", true),
Name = table.Column<string>(type: "TEXT", nullable: false),
Age = table.Column<int>(type: "INTEGER", nullable: false),
Field_City = table.Column<string>(type: "TEXT", nullable: false),
Field_Size = table.Column<double>(type: "REAL", nullable: false),
Field_Street = table.Column<string>(type: "TEXT", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_Goats", x => x.Id);
});
The manipulation of the Goat
object is different from its representation in a database, and that's a good thing. On the one hand, we have a table containing all the information, and on the other, complex objects to improve the readability and manipulation of the data.
The alternative : JSON columns
The JsonColumn
attribute was introduced in EF Core 5.0 to provide an alternative way of handling complex types. Instead of mapping each property to a separate column, JsonColumn
serializes the entire object into a single JSON column.
Here's how you can configure a GoatJson
entity to use a JSON column:
public class GoatJsonConfiguration : IEntityTypeConfiguration<GoatJson>
{
public void Configure(EntityTypeBuilder<GoatJson> builder)
{
...
builder.Property(g => g.Field)
.HasConversion(
v => JsonSerializer.Serialize(v, new JsonSerializerOptions()),
v => JsonSerializer.Deserialize<FieldJson>(v, new JsonSerializerOptions()))
.HasColumnType("json");
}
}
When you run the migration, EF Core will generate the following SQL:
migrationBuilder.CreateTable(
name: "GoatsJson",
columns: table => new
{
Id = table.Column<int>(type: "INTEGER", nullable: false)
.Annotation("Sqlite:Autoincrement", true),
Name = table.Column<string>(type: "TEXT", nullable: false),
Age = table.Column<int>(type: "INTEGER", nullable: false),
Field = table.Column<string>(type: "json", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_GoatsJson", x => x.Id);
});
The result is the same as for complex types
Comparing ComplexType and JSON columns
Database representation
The first difference is the structure of the database and the storage of the Field
object. Below are two images: one showing a row inside the database using ComplexType
and the other using JsonColumn
.
The ComplexType
approach results in multiple columns, while the JsonColumn
approach results in a single column containing the serialized JSON object.
Schema Update
Now we can ask ourselves how the schema update is managed in both cases.
Let's update the schema by replacing the Size
property with Hectare
:
[ComplexType]
public class Field
{
public string Street { get; set; }
public string City { get; set; }
public double Hectare { get; set; }
}
public class FieldJson
{
public string Street { get; set; }
public string City { get; set; }
public double Hectare { get; set; }
}
The migration will look like this:
protected override void Up(MigrationBuilder migrationBuilder)
{
migrationBuilder.RenameColumn(
name: "Field_Size",
table: "Goats",
newName: "Field_Hectare");
}
For ComplexType
had an impact, by renaming the column. In the other hand, the migration has no impact for the JSON column.
This is not insignificant: the structure of the database is not strongly linked to its representation, as the content is stored in text format, without the strong constraint of columnar representation.
When we then try to retrieve the data added in the previous tests, we see that the management with the attribute has taken into account the column name change.
The ComplexType
approach will convert the value from Size
to Hectare
, while the JsonColumn
approach will lose the value since it's still related to the Size
column.
It's very important to be aware of this: when migrating a JSON column, it's essential to rework the existing data, otherwise the data will no longer be aligned with the structure of the entities.
Performances
An important metric in the comparison is the difference in performance between the two approaches.
For reading performance, tests are carried out on a database of 1 million goats with 3 types of test:
[Benchmark]
public void QueryAllGoats()
{
var goats = _context.Set<Goat>().ToList();
}
[Benchmark]
public void QueryGoatsWithWhereCondition()
{
var goats = _context.Set<Goat>().Where(g => g.Age % 10 == 0).ToList();
}
[Benchmark]
public void QueryAllGoatsWithOnlyOneProperty()
{
var goats = _context.Set<Goat>().Select(g => new
{
g.Id,
g.Age,
g.Name,
g.Field.Street
}).ToList();
}
AllGoats
to retrieve all itemsWithWhereCondition
with an age filterWithOnlyOneProperty
to retrieve a single element of theField
class
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3737/23H2/2023Update/SunValley3)
13th Gen Intel Core i9-13900H, 1 CPU, 20 logical and 14 physical cores
.NET SDK 8.0.107
[Host] : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX2 [AttachedDebugger]
DefaultJob : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX2
| Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
|------------------------------------- |------------:|----------:|----------:|-----------:|-----------:|----------:|----------:|
| QueryAllGoats | 270.00 ms | 4.013 ms | 3.351 ms | 19500.0000 | - | - | 252.52 MB |
| QueryAllGoatsJson | 288.95 ms | 5.401 ms | 7.393 ms | 19500.0000 | - | - | 252.52 MB |
| QueryGoatsWithWhereCondition | 75.37 ms | 1.204 ms | 1.005 ms | 1857.1429 | - | - | 25.66 MB |
| QueryGoatsJsonWithWhereCondition | 77.40 ms | 1.484 ms | 1.239 ms | 1857.1429 | - | - | 25.66 MB |
| QueryAllGoatsWithOnlyOneProperty | 827.62 ms | 16.525 ms | 42.656 ms | 28000.0000 | 14000.0000 | 1000.0000 | 344 MB |
| QueryAllGoatsJsonWithOnlyOneProperty | 1,706.00 ms | 33.944 ms | 91.189 ms | 59000.0000 | 21000.0000 | 2000.0000 | 738.23 MB |
Performance is identical for all the data, but the gap widens considerably when it comes to not retrieving the entire set. On searches to obtain some of the properties of the nested class, we see a factor of 2 on the response time. For reads, the ComplexType
attribute is either equivalent or more efficient, depending on the case.
For write performance, the test generates 1 million rows in an empty database.
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3737/23H2/2023Update/SunValley3)
13th Gen Intel Core i9-13900H, 1 CPU, 20 logical and 14 physical cores
.NET SDK 8.0.107
[Host] : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX2 [AttachedDebugger]
DefaultJob : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX2
| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|------------------------------------ |--------:|--------:|--------:|------------:|------------:|----------:|
| Write1MGoats | 18.30 s | 0.445 s | 1.304 s | 899000.0000 | 218000.0000 | 10.62 GB |
| Write1MGoatsJson | 16.28 s | 0.322 s | 0.301 s | 565000.0000 | 138000.0000 | 6.72 GB |
Performance is much the same, despite a slight advantage for JSON columns. This is probably due to the fact that JSON is managed using a single column, whereas ComplexType
is managed using 3 columns.
Advantages and Disadvantages
For ComplexType
:
- + Has a similar structure between the database and the entities
- + Simplify schema updates
- + Easier to update individual properties.
- - Can create a lot of NULL values when properties are not filled.
- - Has to be required, cannot be nullable
For JsonColumn
:
- + Enables the structure to be changed without redeployment
- - Not all databases support JSON columns.
- - Painful to use with encryption like Always Encrypted since you need to get the entire object
- - Harder to update individual properties with SQL or stored procedures.
Having used both on my projects, I recommend always prioritising ComplexType, for the simple reason that the objective of using a relational database is to create integrity constraints in the schema. If you don't want to do this, then opt for NoSQL databases.
The interest in JSON columns is mainly focused on complex objects for which there is no interest in querying other than reading. This could be the case for configurations, image representations etc... where the object is complex and requires no modification.
This format also brings its own constraints. If Always Encrypted is used, for example, it will be impossible to search the fields without having to decrypt the entire column, which can cause performance problems.
We tend to see the point of a JSON column at the start of a project, and then see its constraints when we start making changes later that impact the schema contained in these columns...
Conclusion
In summary, EF Core's introduction of the ComplexType
attribute in version 8.0 has greatly simplified the management of related properties within an entity, making code more modular and maintainable.
While the JsonColumn
attribute offers an alternative by storing entire objects as JSON, it comes with its own set of challenges, particularly in updating individual properties and ensuring data alignment during schema migrations.
Each approach has its strengths and is suited to different scenarios. By understanding these tools, developers can make more informed decisions to optimize their data modeling strategies.
Have a goat day 🐐
Join the conversation.