Programming Entity Framework: Code First

Chapter 1. Welcome to Code First

Microsoft’s ADO.NET Entity Framework, known widely as EF, introduced out-of-the-box Object Relational Mapping to .NET and Visual Studio. Central to Entity Framework was the Entity Data Model, a conceptual model of your application domain that maps back to the schema of your database. This conceptual model describes the core classes in your application. Entity Framework uses this conceptual model while querying from the database, creating objects from that data and then persisting changes back to the database.

Modeling with EF Before Code First

The first iteration of Entity Framework, which came as part of .NET 3.5 and Visual Studio 2008, gave developers the ability to create this conceptual model by reverse engineering an existing database into an XML file. This XML file used the EDMX extension, and you could use a designer to view and customize the model to better suit your domain. Visual Studio 2010 and .NET 4 brought the second version of Entity Framework, named Entity Framework 4 (EF4), to align with the .NET version. On the modeling side, a new capability called Model First was added. Here you could design your conceptual model in the visual designer and then create the database based on the model.

Model First allows developers working on new projects that do not have legacy databases to benefit from the Entity Framework as well. Developers can start with a focus on their application domain by designing the conceptual model and let the database creation flow from that process.

Whether designing the EDMX by the database-first or model-first way, the next step for creating your domain is to let automatic code generation build classes based on the entities and their relationships that it finds in the model. From here, developers have strongly typed classes representing their domain objects—whether those are customers, baseball cards, or fairy-tale characters—and can go on their merry way developing their software applications around these classes.

Another critical change came in EF4. In .NET 3.5, the only way Entity Framework was able to manage in-memory objects was by requiring classes to inherit from Entity Framework’s EntityObject. The EntityObject communicates its changes to Entity Framework, which in turn keeps track of changes and eventually is able to persist them back to the database. In addition to this functionality, .NET 4 introduced POCO (Plain Old CLR Object) support to enable the Entity Framework to track changes to simpler classes without needing the EntityObject to be involved. This freed up developers to use their own classes, independent of Entity Framework. The EF runtime had a way of being aware of the classes and keeping track of them while in memory.

Inception of Code First

Building upon the pieces that were introduced in EF4, Microsoft was able to create one more path to modeling, which many developers had been requesting since EF’s inception. This new type of modeling is called Code First. Code First lets you define your domain model with code rather than using an XML-based EDMX file. Even though Model First and Database First use code generation to provide classes for you to work with, many developers simply did not want to work with a designer or have their classes generated for them. They just wanted to write code.

In Code First you begin by defining your domain model using POCO classes, which have no dependency on Entity Framework. Code First can infer a lot of information about your model purely from the shape of your classes. You can also provide additional configuration to further describe your model or override what Code First inferred. This configuration is also defined in code: no XML files or designers.

Note

EF4 also has support for POCO classes when working with the designer. The EF team provided a POCO template that would allow POCO classes to be generated for you. These generated classes would be automatically updated as you made changes in the designer. You could also use your own POCO classes rather than having them generated for you. But if you decided to take this approach, you were responsible for keeping your classes and the EDMX file in sync. This meant that any changes had to be made in two places—once in the designer and again in your classes. One of the big advantages of Code First is that your classes become the model. This means any changes to the model only need to be made in one place—your POCO classes.

Code First, Database First, and Model First are all just ways of building an Entity Data Model that can be used with Entity Framework to perform data access. Once the model has been built, the Entity Framework runtime behaves the same, regardless of how you created the model. Whether you choose to go with a designer or to use the code-based modeling is entirely your decision. Figure 1-1 lays out the different options you have for modeling with Entity Framework.

Figure 1-1. Modeling workflow options

Note

Microsoft refers to the Database First, Model First, and Code First options as workflows (e.g., the Code First workflow). That’s because each of those options is really a set of steps, whether you execute the steps yourself or the steps happen automatically. For example, with the Database First workflow, you reverse engineer from a database and then let a code generator create the classes. The Code First workflow begins with you coding your classes and then optionally letting Code First create a database for you.

Getting Code First to Developers in Between .NET Releases

Code First was not ready in time to be released in .NET 4. Rather than waiting for the .NET 5 release to bring Code First to developers, Microsoft made Code First available in an out-of-band release, referred to as Entity Framework 4.1, in April 2011. The version number will increment as subsequent updates are released. Entity Framework 4.2 was released in October 2011 and replaces Entity Framework 4.1 as the release that included Code First. The core Entity Framework API, System.Data.Entity.dll, is still part of the .NET Framework and was untouched by Entity Framework 4.1 and 4.2.

The Entity Framework 4.1 release also included another important feature, called the DbContext API. DbContext is the core of this API, which also contains other dependent classes. DbContext is a lighter-weight version of the Entity Framework’s ObjectContext. It is a wrapper over ObjectContext, and it exposes only those features that Microsoft found were most commonly used by developers working with Entity Framework. The DbContext also provides simpler access to coding patterns that are more complex to achieve with the ObjectContext. DbContext also takes care of a lot of common tasks for you, so that you write less code to achieve the same tasks; this is particularly true when working with Code First. Because Microsoft recommends that you use DbContext with Code First, you will see it used throughout this book. However, a separate book, called Programming Entity Framework: DbContext, will delve more deeply into DbContext, DbSet, Validation API, and other features that arrived alongside the DbContext.

Figure 1-2 helps you to visualize how Code First and DbContext add functionality by building on the core Entity Framework 4 API, rather than modifying it.

Figure 1-2. Code First and DbContext built on EF4

Writing the Code…First

Code First is aptly named: the code comes first, the rest follows. Let’s take a look at the basic default functionality without worrying about all of the possible scenarios you might need to support. The rest of the book is dedicated to that.

Note

We don’t expect you to recreate the sample code shown in this first chapter. The code samples are presented as part of an overview, not as a walkthrough. Beginning with Chapter 2, you will find many walkthroughs. They are described in a way that you can follow along in Visual Studio and try things out yourself if you’d like.

Of course, the first thing you’ll need is some code—classes to describe a business domain. In this case a very small one, patients and patient visits for a veterinarian.

Example 1-1 displays three classes for this domain—Patient, Visit, and AnimalType.

Example 1-1. Domain classes

using System;
using System.Collections.Generic;

namespace ChapterOneProject
{
  class Patient
  {
    public Patient()
    {
      Visits = new List<Visit>();
    }
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime BirthDate { get; set; }
    public AnimalType AnimalType { get; set; }
    public DateTime FirstVisit { get; set; }
    public List<Visit> Visits { get; set; }
  }

  class Visit
  {
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public string ReasonForVisit { get; set; }
    public string Outcome { get; set; }
    public decimal Weight { get; set; }
    public int PatientId { get; set; }
  }

  class AnimalType
  {
    public int Id { get; set; }
    public string TypeName { get; set; }
  }
}

Core to Code First is the concept of conventions—default rules that Code First will use to build a model based on your classes. For example, Entity Framework requires that a class that it will manage has a key property. Code First has a convention that if it finds a property named Id or a property with the combined name of the type name and Id (e.g., PatientId), that property will be automatically configured as the key. If it can’t find a property that matches this convention, it will throw an exception at runtime telling you that there is no key. Other types of conventions determine the default length of a string, or the default table structure that Entity Framework should expect in the database when you have classes that inherit from each other.

This could be very limiting if Code First relied solely on convention to work with your classes. But Code First is not determined to force you to design your classes to meet its needs. Instead, the conventions exist to enable Code First to automatically handle some common scenarios. If your classes happen to follow convention, Code First doesn’t need any more information from you. Entity Framework will be able to work directly with your classes. If they don’t follow convention, you can provide additional information through Code First’s many configuration options to ensure that your classes are interpreted properly by Code First.

In the case of the three classes in Example 1-1, the Id properties in each class meet the convention for keys. We’ll let Code First work with these classes as they are without any additional configurations.
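When a class does not follow the key convention, a small piece of configuration is enough to avoid the missing-key exception. The following is a minimal sketch only: the Clinic class and its ClinicCode property are hypothetical and are not part of the veterinarian sample. It uses the Key Data Annotation, one of the configuration options that later chapters cover in detail.

```csharp
using System.ComponentModel.DataAnnotations;

namespace ChapterOneProject
{
  // Hypothetical class, not part of Example 1-1.
  // It has no property named Id or ClinicId, so the key convention finds nothing.
  class Clinic
  {
    // The Key annotation identifies the key property explicitly.
    [Key]
    public string ClinicCode { get; set; }
    public string Name { get; set; }
  }
}
```

Renaming the property to Id or ClinicId would make the annotation unnecessary, which is the trade-off between following convention and supplying configuration.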

Managing Objects with DbContext

The domain classes described above have nothing to do with the Entity Framework. They have no knowledge of it. That’s the beauty of working with Code First. You get to use your own classes. This is especially beneficial if you have existing domain classes from another project.

To use Code First, you start by defining a class that inherits from DbContext. One of the roles of this class, which we’ll refer to as a context, is to let Code First know about the classes that make up your model. That’s how Entity Framework will be aware of them and be able to keep track of them. This is done by exposing the domain classes through another new class introduced along with DbContext—the DbSet. Just as DbContext is a simpler wrapper around the ObjectContext, DbSet is a wrapper around Entity Framework 4’s ObjectSet, which simplifies coding tasks for which we normally use the ObjectSet.

Example 1-2 shows what this context class might look like. Notice that there are DbSet properties for Patients and Visits. The DbSets will allow you to query against the types. But we don’t anticipate doing a direct query of AnimalTypes, so there’s no need for a DbSet of AnimalTypes. Code First is smart enough to know that Patient makes use of the AnimalType class and will therefore include it in the model.

Example 1-2. VetContext class which derives from DbContext

using System.Data.Entity;

namespace ChapterOneProject
{
  class VetContext:DbContext
  {
    public DbSet<Patient> Patients { get; set; }
    public DbSet<Visit> Visits { get; set; }
  }
}
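The DbSet properties are also the entry point for queries. As a rough sketch, assuming the classes from Examples 1-1 and 1-2 are in the same project and that the database already holds some patients, a LINQ query against the Patients set might look like the following. The PatientQueries class and GetPatientsOfType method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace ChapterOneProject
{
  static class PatientQueries
  {
    // Hypothetical helper: returns the patients of one animal type, ordered by name.
    // Patient and VetContext are the classes from Examples 1-1 and 1-2.
    public static List<Patient> GetPatientsOfType(string typeName)
    {
      using (var context = new VetContext())
      {
        // The filter navigates from Patient to AnimalType even though
        // the context exposes no DbSet for AnimalType.
        return context.Patients
                      .Where(p => p.AnimalType.TypeName == typeName)
                      .OrderBy(p => p.Name)
                      .ToList();
      }
    }
  }
}
```

Calling ToList inside the using block executes the query while the context is still alive.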

Using the Data Layer and Domain Classes

Now here comes what may seem a little surprising. This is all you need for a data layer, assuming that you’re going to rely 100 percent on Code First convention to do the rest of the work.

There’s no database connection string. There is not even a database yet. But still, you are ready to use this data layer. Example 1-3 shows a method that will create a new dog AnimalType along with our first Patient. The method also creates the Patient’s first Visit record and adds it to the Patient.Visits property.

Then we instantiate the context, add the patient to the DbSet<Patient> (Patients) that is defined in the context, and finally call SaveChanges, which is a method of DbContext.

Example 1-3. Adding a patient to the database with the VetContext

private static void CreateNewPatient()
{
  var dog = new AnimalType { TypeName = "Dog" };
  var patient = new Patient
  {
    Name = "Sampson",
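    BirthDate = new DateTime(2008, 1, 28),
    // Set FirstVisit explicitly: an unset DateTime is 0001-01-01,
    // which is outside the range of SQL Server's datetime type
    FirstVisit = new DateTime(2011, 9, 1),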
    AnimalType = dog,
    Visits = new List<Visit>
    {
      new Visit
      {
       Date = new DateTime(2011, 9, 1)
      }
    }
  };

  using(var context = new VetContext())
  {
    context.Patients.Add(patient);
    context.SaveChanges();
  }
}

Remember that there’s no connection string anywhere and no known database. Yet after running this code, we can look in the local SQL Server Express instance and see a new database whose name matches the fully qualified name of the context class, ChapterOneProject.VetContext.

You can see the details of this database’s schema in Figure 1-3.

Figure 1-3. The new database created by Code First

Compare the database schema to the classes defined in Example 1-1. They match almost exactly, table to class and field to property. The only difference is that a foreign key, Patients.AnimalType_Id, was created, even though there was no foreign key property in the Patient class. Code First worked out that because of the relationship expressed in the class (Patient has a reference to AnimalType), a foreign key would be needed in the database to persist that relationship. This is one of many conventions that Code First uses when it’s dealing with relationships. There are many ways to express relationships between classes. Code First conventions are able to interpret many of them. Notice, for example, that the PatientId field, which has an explicit property in the Visit class, is not null, whereas the AnimalType_Id field that Code First inferred from a navigation property is nullable. Again, convention determined the nullability of the foreign keys, but if you want to modify how Code First interprets your classes, you can do so using additional configuration.
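To illustrate how the shape of a class drives this convention, consider a hypothetical variation of Patient that exposes the foreign key as a property, in the same way that Visit exposes PatientId. The AnimalTypeId property below is an assumption made for this sketch and is not part of Example 1-1. Under Code First’s default conventions, a property named after the navigation property plus Id is expected to be picked up as the foreign key, and because an int cannot be null, the relationship would become required.

```csharp
namespace ChapterOneProject
{
  class AnimalType
  {
    public int Id { get; set; }
    public string TypeName { get; set; }
  }

  // Variation of Patient from Example 1-1, trimmed to the relevant properties.
  class Patient
  {
    public int Id { get; set; }
    public string Name { get; set; }

    // Hypothetical addition: follows the <navigation property name>Id pattern,
    // so convention pairs it with the AnimalType navigation property below.
    // A non-nullable int makes the relationship required;
    // declaring it as int? would keep the relationship optional.
    public int AnimalTypeId { get; set; }
    public AnimalType AnimalType { get; set; }
  }
}
```

With this variation, the foreign key column would be expected to take the property’s name, AnimalTypeId, rather than the inferred AnimalType_Id.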

Getting from Classes to a Database

If you have worked with Entity Framework, you are familiar with the model that is expressed in an EDMX file that you work with in a visual designer. You may also be aware of the fact that the EDMX file is in fact an XML file, but the designer makes it much easier to work with. The XML used to describe the model has a very specific schema and working with the raw XML would be mind-boggling without the designer.

What is not as obvious in the designer is that the XML contains more than just the description of the conceptual model that is displayed in the designer. It also has a description of the database schema that the classes map to and one last bit of XML that describes how to get from the classes and properties to the tables and columns in the database. The combination of the model XML, the database schema XML, and the mapping XML is referred to as metadata.

At runtime, the Entity Framework reads the XML that describes all three parts of the metadata and creates an in-memory representation of it. But the in-memory metadata is not the XML; it consists of strongly typed objects such as EntityType, EdmProperty, and AssociationType. Entity Framework interacts with this in-memory representation of the model and schema every time it needs to interact with the database.

Because there is no XML file with Code First, it creates the in-memory metadata from what it can discover in your domain classes. This is where convention and configuration come into play. Code First has a class called the DbModelBuilder. It is the DbModelBuilder that reads the classes and builds the in-memory model to the best of its ability. Since it is also building the portion of the metadata that represents the database schema, it is able to use that to create the database. If you add configurations to help the model builder determine what the model and database schema should look like, it will read those just after it inspects the classes and incorporate that information into its understanding of what the model and database schema should look like.

Figure 1-4 shows how Entity Framework can build the in-memory model from code or from an XML file maintained through the designer. Once the in-memory model is created, Entity Framework doesn’t need to know how the model was created. It can use the in-memory model to determine what the database schema should look like, build queries to access data, translate the results of queries into your objects, and persist changes to those objects back to the database.

Figure 1-4. In-memory metadata created from code or EDMX model
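One way to see the model that Code First built is to write it back out as EDMX. The sketch below assumes the VetContext class from Example 1-2 and uses the EdmxWriter class from the System.Data.Entity.Infrastructure namespace, which arrived with the DbContext API. The ModelInspector class, the WriteModel method, and the file path parameter are hypothetical names chosen for illustration.

```csharp
using System.Data.Entity.Infrastructure;
using System.Xml;

namespace ChapterOneProject
{
  static class ModelInspector
  {
    // Hypothetical helper: writes the model that Code First built for
    // VetContext (Example 1-2) to an EDMX file so it can be inspected.
    public static void WriteModel(string path)
    {
      var settings = new XmlWriterSettings { Indent = true };

      using (var context = new VetContext())
      using (var writer = XmlWriter.Create(path, settings))
      {
        // Serializes the conceptual model, the database schema,
        // and the mappings between them.
        EdmxWriter.WriteEdmx(context, writer);
      }
    }
  }
}
```

The resulting file contains the same three kinds of metadata described above, which makes it a convenient way to check what convention and configuration produced.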

Working with Configuration

In those cases where Code First needs some help understanding your intent, you have two options for performing configuration: Data Annotations and Code First’s Fluent API. Which option you choose is most often based on personal preference and your coding style. There is some advanced configuration that is only possible via the Fluent API.

Code First allows you to configure a great variety of property attributes, relationships, inheritance hierarchies, and database mappings. You’ll get a sneak peek at configuration now, but the bulk of this book is dedicated to explaining the convention and configuration options that are available to you.

Configuring with Data Annotations

One way to apply configuration, which many developers like because it is so simple, is to use Data Annotations. Data Annotations are attributes that you apply directly to the class or properties that you want to affect. These can be found in the System.ComponentModel.DataAnnotations namespace.

For example, if you want to ensure that a property should always have a value, you can use the Required annotation. In Example 1-4, this annotation has been applied to the AnimalType’s TypeName property.

Example 1-4. Using an annotation to mark a property as required

using System.ComponentModel.DataAnnotations;

class AnimalType
{
  public int Id { get; set; }
  // TypeName must always have a value
  [Required]
  public string TypeName { get; set; }
}

This will have two effects. The first is that the TypeName field in the database will become not null. The second is that it will be validated by Entity Framework, thanks to the Validation API that was also introduced in Entity Framework 4.1. By default, when it’s time to SaveChanges, Entity Framework will check to be sure that the property you have flagged as Required is not empty. If it is empty, Entity Framework will throw an exception.
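As a sketch of what that validation looks like from calling code, the method below tries to save a Patient whose AnimalType has no TypeName and reports the validation errors. It assumes the Patient and VetContext classes from Examples 1-1 and 1-2 together with the annotated AnimalType from Example 1-4. The ValidationDemo class and SavePatientWithUnnamedType method are hypothetical names.

```csharp
using System;
using System.Data.Entity.Validation;

namespace ChapterOneProject
{
  static class ValidationDemo
  {
    public static void SavePatientWithUnnamedType()
    {
      var patient = new Patient
      {
        Name = "Sampson",
        BirthDate = new DateTime(2008, 1, 28),
        FirstVisit = new DateTime(2011, 9, 1),
        AnimalType = new AnimalType() // TypeName is left null on purpose
      };

      using (var context = new VetContext())
      {
        // Adding the patient also adds the related AnimalType to the context.
        context.Patients.Add(patient);
        try
        {
          context.SaveChanges();
        }
        catch (DbEntityValidationException ex)
        {
          // One result per invalid entity, one error per failed rule.
          foreach (var result in ex.EntityValidationErrors)
          {
            foreach (var error in result.ValidationErrors)
            {
              Console.WriteLine("{0}: {1}", error.PropertyName, error.ErrorMessage);
            }
          }
        }
      }
    }
  }
}
```

Because validation runs before any command is sent, the invalid AnimalType never reaches the database.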

The Required annotation affects the database column facets and property validation. Some annotations are specific only to database mappings. For example, the Table annotation tells Code First that the class maps to a table of a certain name. The data that you refer to as AnimalType in your application might be stored in a table called Species. The Table annotation allows you to specify this mapping.

Example 1-5. Specifying a table name to map to

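// In this release, TableAttribute is in this namespace. In EF 5 and later
// on .NET 4.5, it is in System.ComponentModel.DataAnnotations.Schema.
using System.ComponentModel.DataAnnotations;

[Table("Species")]
class AnimalType
{
  public int Id { get; set; }
  [Required]
  public string TypeName { get; set; }
}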

Configuring with the Fluent API

While applying configurations with Data Annotations is quite simple to do, specifying metadata within a domain class may not align with your style of development. There is an alternative way to add configurations that uses Code First’s Fluent API. With configuration applied via the Fluent API, your domain classes remain “clean.” Rather than modify the classes, you provide the configuration information to Code First’s model builder in a method exposed by the DbContext. The method is called OnModelCreating. Example 1-6 demonstrates the same configurations that were used above, but now applied using the Fluent API. In each configuration, the code specifies that the model builder should configure the AnimalType.

Example 1-6. Configuring the model using the Fluent API

class VetContext:DbContext
{
  public DbSet<Patient> Patients { get; set; }
  public DbSet<Visit> Visits { get; set; }

  protected override void OnModelCreating
   (DbModelBuilder modelBuilder)
  {
    modelBuilder.Entity<AnimalType>()
                .ToTable("Species");
    modelBuilder.Entity<AnimalType>()
                .Property(p => p.TypeName).IsRequired();
  }
}

The first configuration uses the Fluent API equivalent of the Table Data Annotation, which is the ToTable method, and passes in the name of the table to which the AnimalType class should be mapped. The second configuration uses a lambda expression to identify one of the properties of AnimalType and then appends the IsRequired method to that property.

This is just one way to build fluent configurations. You will learn much more about using both Data Annotations and the Fluent API to configure property attributes, relationships, inheritance hierarchies, and database mappings in the following chapters.
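Another way to organize fluent configuration, shown here only as a sketch, is to move the configuration for a type into its own class that derives from EntityTypeConfiguration and then register that class with the model builder. The AnimalTypeConfiguration class name is hypothetical. The domain classes are assumed to be those from Example 1-1, and VetContext here replaces the version in Example 1-6.

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

namespace ChapterOneProject
{
  // Hypothetical class name: holds the same two configurations as Example 1-6.
  class AnimalTypeConfiguration : EntityTypeConfiguration<AnimalType>
  {
    public AnimalTypeConfiguration()
    {
      ToTable("Species");
      Property(p => p.TypeName).IsRequired();
    }
  }

  class VetContext : DbContext
  {
    public DbSet<Patient> Patients { get; set; }
    public DbSet<Visit> Visits { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
      // Register the configuration class instead of configuring inline.
      modelBuilder.Configurations.Add(new AnimalTypeConfiguration());
    }
  }
}
```

Keeping one configuration class per entity can stop OnModelCreating from growing unwieldy as the model gets larger.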

Creating or Pointing to a Database

Earlier in this chapter, you saw that by default Code First created a SQL Server Express database. Code First’s database connection handling ranges from this completely automated behavior to creating a database for you at a location designated in a connection string. There are also several ways to control whether the database is dropped and recreated when your model changes during development.

You’ll find that Chapter 6 is dedicated entirely to how Code First interacts with your database.

The examples in this book will walk you through how to configure database mappings. These concepts apply equally to generating a database or mapping to an existing database. When generating a database, they affect the schema that is generated for you. When mapping to an existing database, they define the schema that Entity Framework will expect to be there at runtime.

As we explore Code First conventions and configuration in this book, we will be allowing Code First to create a database. This allows you to run the application after each step and observe how the database schema has changed. If you are mapping to an existing database, the only difference is to point Code First at that database. The easiest way to do that is described in Controlling Database Location with a Configuration File (Chapter 6). You will also want to take a look at Reverse Engineer Code First (Chapter 8).
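As a brief preview of those options, the sketch below shows two of them, assuming the domain classes from Example 1-1. The name VetClinic is hypothetical, as are the Startup class and its ConfigureDatabase method. Chapter 6 covers the full range of choices.

```csharp
using System.Data.Entity;

namespace ChapterOneProject
{
  class VetContext : DbContext
  {
    // "VetClinic" is a hypothetical name. DbContext treats the string as a
    // database name or connection string; if the application's config file
    // contains a connection string with this name, that one is used.
    public VetContext() : base("VetClinic")
    {
    }

    public DbSet<Patient> Patients { get; set; }
    public DbSet<Visit> Visits { get; set; }
  }

  static class Startup
  {
    // Call once at application startup, before the context is first used.
    public static void ConfigureDatabase()
    {
      // Development-time choice: drop and recreate the database whenever
      // the model changes. Any existing data in the database is lost.
      Database.SetInitializer(
        new DropCreateDatabaseIfModelChanges<VetContext>());
    }
  }
}
```

An initializer that drops the database suits early development, when the data is disposable, and is not a setting to carry into production.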

What Code First Does Not Support

Code First is a relatively new addition to Entity Framework and there are a few features that it currently does not support. The EF team has indicated that they plan to add support for most of these in future releases.

Database migrations: At the time of writing this book, Code First does not yet support database migrations, or in other words, modifying a database to reflect changes to the model. But work on this feature is well under way and will likely be available shortly after publication. You can read about an early preview of the Migrations support on the team’s blog.
Mapping to views: Code First currently only supports mapping to tables. This unfortunately means that you can’t map Code First directly to stored procedures, views, or other database objects. If you are letting Code First generate a database, there is no way to create these artifacts in the database, other than manually adding them once Code First has created the database. If you are mapping to an existing database, there are some techniques you can use to get data from non-table database artifacts. These techniques are described in Mapping to Nontable Database Objects (Chapter 7).
Schema definition defining queries: Entity Framework includes a DefiningQuery feature that allows you to specify a database query directly in the XML metadata. There is also a Query View feature that allows you to use the conceptual model to define a query that is used to load entities. This allows the query you specify to be database provider–independent. Code First does not support either of these features yet.
Multiple Entity Sets per Type (MEST): Code First does not support the Multiple Entity Sets per Type (MEST) feature. MEST allows you to use the same class in two different sets that map to two different tables. This is a more obscure Entity Framework feature that is rarely used. The EF team has said that, in an effort to keep the Code First API simpler, they do not plan to add support for MEST.
Conditional column mapping: When working with inheritance hierarchies, Code First requires that a property is always mapped to a column with the same name. The ability to map a property to differently named columns, depending on the derived type, is referred to as conditional column mapping. For example, you may have a Person base class with a NationalIdentifier property. American and Australian classes that derive from the Person base class are mapped to separate Americans and Australians tables in the database. When using the designer, you could map the NationalIdentifier property to an SSN column in the Americans table and a PassportNumber column in the Australians table. Code First does not support this scenario. The column that NationalIdentifier maps to must have the same name in every table.

Choosing Code First

Now that you know what Code First is, you may be wondering whether it’s the right modeling workflow for your application development. The good news is that the decision is almost entirely dependent on what development style you, or your team, prefer.

If writing your own POCO classes and then using code to define how they map to a database appeals to you, then Code First is what you are after. As mentioned earlier, Code First can generate a database for you or be used to map to an existing database.

If you prefer to use a designer to define the shape of your classes and how they map to the database, you probably don’t want to use Code First. If you are mapping to an existing database, you will want to use Database First to reverse engineer a model from the database. This entails using Visual Studio’s Entity Data Model Wizard to generate an EDMX based on that database. You can then view and edit the generated model using the designer. If you don’t have a database but want to use a designer, you should consider using Model First to define your model with the designer. You can then create the database based on the model you define. These approaches work well, provided you are happy for EF to generate your classes for you based on the model you create in the designer.

Finally, if you have existing classes that you want to use with EF, you probably want to go with Code First even if your first preference would be for designer-based modeling. If you choose to use the designer, you will need to make any model changes in the designer and in your classes. This is inefficient and error-prone, so you will probably be happier in the long run if you use Code First. In Code First, your classes are your model, so model changes only need to be made in one place and there is no opportunity for things to get out of sync.

Note

A designer tool that the Entity Framework team is working on will provide an additional option—reverse engineering a database into Code First classes and fluent configurations. This tool was created for developers who have an existing database but prefer using Code First to using a designer. You’ll learn more about this tool in Chapter 8.

The decision process for which EF workflow to use can be summarized in the decision tree shown in Figure 1-5.

Figure 1-5. Workflow decision tree

Learning from This Book

This book will focus on building and configuring a model with Code First. It is an extension to Programming Entity Framework (second edition), and you’ll find many references back to that book. This avoids duplicating here its nearly 900 pages of detailed information about how Entity Framework functions; how to query and update; how to use it in a variety of application types and automated tests; and how to handle exceptions, security, database connections, and transactions. Creating a model with Code First is just one more feature of the Entity Framework.

In fact, as you move forward to Chapter 2, you’ll find that the domain model changes from the veterinarian sample used in Chapter 1 to the business model of Programming Entity Framework, software applications built for a company called Break Away Geek Adventures.

Look for a second short book titled Programming Entity Framework: DbContext, which will focus on DbContext, DbSet, Validation API, and using the features that are also part of the Entity Framework NuGet package.


Programming Entity Framework: Code First by Julia Lerman, Rowan Miller
