Blogs
Technology

Entity Framework Best Practices

Monday, Jan 08, 2018
Ibrahim Ahmed

Technical Lead

01 Posts

This blog post lists some development behaviors which slow down Entity Framework performance and show how to enhance these points to improve.

This blog post shows some performance aspects that you need to carefully consider when you opt to develop applications using Entity Framework. The objective of the post is to assist developers who need to improve the performance of applications that use the Entity Framework.

I have provided test results to demonstrate the degree of performance that can be achieved. You need to know that the achieved results are not planned or intended as absolute indicators that reflect the performance you will see in your application. For practical purposes, this blog post assumes Entity Framework 6 is run under .NET 4.5.2.

Testing Data

I will use the below diagram of testing database throughout the blog post.

 

1- Disable Change Tracking

 The purpose of the below example is to retrieve all students from database.

using (var dbModel = new EFModel()) 
 { 
    var results = dbModel.Students.ToList(); 
 }

Q1.1

This query retrieves 5000 records in 254 millisecond.

But by applying some changes on above query Q1.1 and adding AsNoTracking() to query.

using (var dbModel = new EFModel()) 
 { 
    var results = dbModel.Students.AsNoTracking().ToList(); 
 }

Q1.2

This query retrieves the same 5000 records in query Q1.1 in 25 millisecond.

So what is AsNoTracking()? How did it reduce query execution time?

By using Entity Framework, your database entities will be mapped to objects. The process of mapping database entities to objects called object materialization. These mapped objects can be derived from EntityObject, anonymous types or DbDataRecord.

By using the MergeOption.NoTracking or AsNoTracking() option, objects tracking will be minimized, so will increase the performance of your system in most situations. Consequently, this means you can't make changes to those entity instances and have those changes persisted by SaveChanges().

2- Batch Update

Most of applications that use Entity Framework perform batch updates "updates set of records". The purpose of the below example is to update all courses duration to be 12 weeks.

using (var dbModel = new EFModel()) 
{
       var courses = dbModel.Courses.AsNoTracking().ToList();
       foreach (var course in courses)
       {
                course.Duration = 12; 
                dbModel.Entry(course).State = System.Data.Entity.EntityState.Modified;
        }
            dbModel.SaveChanges(); 
}

Q2.1

This query updates 5000 records in 343.3 seconds and hits the databases 5001 times, hits means query Q2.1 is mapped to 5001 SQL queries.

Mapped SQL queries:

  1. Get all courses from database.SELECT

    [Extent1].[ID] AS [ID],
    [Extent1].[Name] AS [Name],
    [Extent1].[Duration] AS [Duration]
    FROM [dbo].[Courses] AS [Extent1]
  2. Update courses one by one.

    exec sp_executesql N'UPDATE [dbo].[Courses]
    SET [Name] = @0, [Duration] = @1
    WHERE ([ID] = @2)
    ',N'@0 nvarchar(50),@1 decimal(18,0),@2 int',@0=N'T',@1=12,@2=1

 But by applying some changes on above query Q2.1 and replacing updating mechanism by Update() method;

using (var dbModel = new EFModel()) 
{
   dbModel.Courses.Update(c=>new Course { Duration = 12 });
}


Q2.2

This query updates 5000 records in 2 seconds and hits the databases once.

Mapped SQL queries:

  1. Update courses.

exec sp_executesql N'UPDATE [dbo].[Courses] SET
[Duration] = @p__update__0
FROM [dbo].[Courses] AS j0 INNER JOIN (
SELECT
    [Extent1].[ID] AS [ID]
FROM [dbo].[Courses] AS [Extent1]
) AS j1 ON (j0.[ID] = j1.[ID])',N'@p__update__0 decimal(2,0)',@p__update__0=12

So what is Update()? How did it reduce query execution time?

If you try to write Update() in your query, you will get the below error.

'DbSet<Course>' does not contain a definition for 'Update' and no extension method 'Update' accepting a first argument of type 'DbSet<Course>' could be found (are you missing a using directive or an assembly reference?)

To use Update() you must use entity framework plus library. This library improves Entity Framework performance and overcomes limitations with its features.

As shown in query Update() reduce database hits and round trips between the application and database by updating a set of records using single query.

3- Bulk Delete

You may need to delete bulk of records. The purpose of the below example is to delete all courses with duration less than 10 weeks.

using (var dbModel = new EFModel()) 
{
       var courses = dbModel.Courses.AsNoTracking().ToList();
       foreach (var course in courses)
       {
                if(course.Duration<10) 
                  dbModel.Entry(course).State = System.Data.Entity.EntityState.Deleted;
        }
            dbModel.SaveChanges(); 
}

Q3.1

This query deletes 5000 records in 260 seconds and hits the databases 5001 times.

Mapped SQL queries:

  1. Get all courses from database.SELECT
       [Extent1].[ID] AS [ID],
        [Extent1].[Name] AS [Name],
        [Extent1].[Duration] AS [Duration]
        FROM [dbo].[Courses] AS [Extent1]

  2. Delete courses one by one.

        exec sp_executesql N'DELETE [dbo].[Courses]     WHERE ([ID] = @0)',N'@0 int',@0=1 

N.B: Accordingly, if you want to delete or update N items using Q2.1 or Q3.1 queries, Entity Framework will map it to N SQL queries.

But by applying some changes on above query (Q3.1) and replacing deleting mechanism by Delete() method;

using (var dbModel = new EFModel()) 
{
  dbModel.Courses.Where(c=>c.Duration<10).Delete();
}

This query deletes 5000 records in 2 seconds and hits the databases once.

Mapped SQL queries:

  1. Delete courses.

DELETE [dbo].[Courses]
FROM [dbo].[Courses] AS j0 INNER JOIN (
SELECT
[Extent1].[ID] AS [ID]
FROM [dbo].[Courses] AS [Extent1]
WHERE [Extent1].[Duration] < cast(10 as decimal(18))
) AS j1 ON (j0.[ID] = j1.[ID])

So what is Delete()? How did it reduce query execution time?

Delete() is on of entity framework plus library methods and it is acting like Update() in enhancing the performance.

4- Bulk Insertion

You may need to insert bulk of records. The purpose of the below example is to insert 5000 records in courses database table.

List<Course> coursesList = new List<Course>();
for (int i = 0; i < 5000; i++)
 {
  Course _course = new Course { Name = "EF", Duration = 5 };
  coursesList .Add(_course);
 }

using (var dbModel = new EFModel()) 
{
     dbModel.Courses.AddRange(coursesList);
     dbModel.SaveChanges();
}

Q4.1

This query adds 5000 records in 59.5 seconds and hits the databases 5000 times.

Mapped SQL queries:

  1. Insert all courses in database one by one.

    exec sp_executesql N'INSERT [dbo].[Courses]([Name], [Duration]) VALUES (@0, @1) SELECT [ID] FROM [dbo].[Courses] WHERE @@ROWCOUNT > 0 AND [ID] = scope_identity()',N'@0 nvarchar(50),@1 decimal(18,0)',@0=N'EF',@1=5

N.B: Accordingly, if you want to insert or update or delete N items using Q2.1 or Q3.1 or Q4.1 queries, entity framework will map it to N SQL queries.

But by applying some changes on above query(Q4.1) by changing AutoDetectChangesEnabled configurations;

List<Course> coursesList = new List<Course>();
for (int i = 0; i < 5000; i++)
{
  Course _course = new Course { Name = "EF", Duration = 5 };
  coursesList .Add(_course );
}

using (var dbModel = new EFModel()) 
{
  dbModel.Configuration.AutoDetectChangesEnabled = false;
  dbModel.Courses.AddRange(coursesList);
  dbModel.SaveChanges();
  dbModel.Configuration.AutoDetectChangesEnabled = true;
}

Q4.2

This query adds 5000 records in 55 seconds and hits the databases 5000 times.

Mapped SQL queries:

  1. Insert all courses in database one by one.

exec sp_executesql N'INSERT [dbo].[Courses]([Name], [Duration]) VALUES (@0, @1) SELECT [ID] FROM [dbo].[Courses] WHERE @@ROWCOUNT > 0 AND [ID] = scope_identity()',N'@0 nvarchar(50),@1 decimal(18,0)',@0=N'EF',@1=5

So why the second query performance is better than is first one? How did it reduce query execution time?

Simply the second query Q4.2 performance is better than first one Q4.1 as when you adding a number of objects then saving them, EF will check whether each of these new objects has changed. To avoid this, turn off AutoDetectChangesEnabled before adding, then turn back on again afterwards.

Although the second query Q4.2 is better than the first one Q4.1 the performance is still existent. By applying some change on adding queries Q4.1 and Q4.2 by replacing normal adding queries by BulkInsert().

List<Course> coursesList = new List<Course>();
for (int i = 0; i < 5000; i++)
{
  Course _course = new Course { Name = "EF", Duration = 5 };
  coursesList .Add(_course);
}

using (var dbModel = new EFModel()) 
{
  dbModel.Courses.BulkInsert(coursesList);
}

This query adds 5000 records in 212 milliseconds and hits the databases once.

Mapped SQL queries:

  1. Insert courses.

Insert bulk Courses ([Name] NVarChar(50) COLLATE SQL_Latin1_General_CP1_CI_AS, [Duration] Decimal(18,0)) 

So what is BulkInsert ()? How did it reduce query execution time?

BulkInsert () is one of entity framework plus library It is adding a bulk of records in one round trip between database and application using only single query while disabling changes tracking as well. 

5- Select Required Columns Only

For example, a student has properties (ID, name, age and grade Id). The purpose of the below queries is select student name.

using (var dbModel = new EFModel()) 
{ 
   var results = dbModel.Students.AsNoTracking().ToList(); 
}

Q5.1

This query retrieved 5000 records in 254 millisecond.

But by selection only students names;

using (var dbModel = new EFModel()) 
{ 
  var students = dbModel.Students.AsNoTracking().Select(s => s.Name).ToList();
 }

Q5.2

This query retrieved 5000 records in 38 millisecond.

So why Q5.2 is faster than Q5.1?

As the second query selects only name property while the first one selects all properties.

So avoid fetching inrequired fields from the database. So if you need all students names, select name only and not all properties as selecting not required columns will impact the query performance.

6 - Compiled Queries

In Entity Framework, if you use similar query many times you can increase this query performance by compiling the query.

You can apply compiled queries for Entity Framework versions less than .Net framework 4.5 as starting with version 4.5 LINQ queries are cached automatically.

Accordingly if you are using EF5 or higher version you can’t use compiled queries.

The purpose of below example is select students who their age is greater than 10 years using compiled queries.

static readonly Func<EFModel, int, IQueryable<Student>> s_compiledQuery =CompiledQuery.Compile<EFModel, int, IQueryable<Student>>(
            (dbModel, age) => from student in dbModel.Students
                              where student.Age > age          
   select student);

   using (var dbModel = new EFModel())
            {
                int age  =10;
                IQueryable<Student> students = s_compiledQuery.Invoke(dbModel, age);
            }

The compiled queries improve the performance by 7% than normal queries.

Finally, following the above points will help you in improving your application performance. However, there are bad practices that can slow down your application. So, what are Entity Framework bad practices and how to avoid them will be my next topic.

Comments

No comments yet

Some of our features will not be working properly on IE. We recommend using this website from our supported browsers ex: Google Chrome, Firefox, Opera, Microsoft Edge