In Chapter 3, you wrote the same basic query over and over and over again. I hope you’ll agree that this was a great way to get exposure to the many different ways of writing queries against the Entity Data Model.
There is a lot more to querying an EDM, however. You’ll need to learn about the flexibility you have for expressing complex queries, projecting data, combining and nesting queries, and writing parameterized queries. There are also nuances regarding what type of data is returned based on how you construct your queries. Sometimes you will get objects, as you saw in the examples in Chapter 3, but other times you will get unknown objects (anonymous types). It is also possible for Object Services queries to return rows and columns. You’ll need to know when to expect these varied forms of data to be returned.
Covering all of this exhaustively would require hundreds of pages. Therefore, the goal of these next two chapters on LINQ to Entities and Entity SQL is to teach you the critical features and many of the possibilities, focusing on the most typically needed query features. You will learn how to project specific values (rather than entire objects) in queries, how to query across relationships, how to write nested queries and joins, and how to control when trips are made to the database. Along the way, I will introduce and explain additional new concepts to help you truly understand the workings of the Entity Framework.
This chapter will focus on LINQ to Entities and introducing new concepts. The queries you build here will be demonstrated using Entity SQL in Chapter 5.
Throughout the rest of the book, you will see variations on queries that take advantage of even more techniques as we use queries in real-world examples.
A number of resources provide many specific examples of queries. Here you will learn some of the more common query tasks so that you will know enough to write queries without constantly having to search online for the perfect example of what you are trying to accomplish. It is also useful to check out resources such as the 101 LINQ Examples on MSDN (for VB and for C#), the number of great books dedicated to LINQ, and the Entity Framework Samples, which provide a great variety of query examples with helpful commentary.
Due to some syntax differences between VB and C# when creating LINQ expressions, you will see a number of Visual Basic examples in this chapter along with the C# versions when the difference is significant.
Here is a list of terms used in this chapter (and throughout the book) that may be new to you:
- Projection
Selecting specific properties or expressions in a query, rather than the entity being queried. For example:
from c in context.contacts select c.firstname + c.lastname, c.contactID
.- Eager loading
Requesting that related data be returned along with query results from the database. For example: when querying contacts, eager-load their addresses. The contacts and their addresses will be retrieved in a single query.
- Deferred loading
Delaying the loading of related data until you specifically request it. For example: when working with query results for a particular contact, you can make a request to retrieve that contact’s addresses from the database. When deferred loading happens automatically (implicitly), it is called lazy loading.
- Navigating
Moving from an entity to its related data. For example: navigate from a contact to its addresses using the
contact.Addresses
property.
So far, the queries you have seen return an entire object,
comparable to writing a SELECT
SQL
query requesting every column in a table. By returning an entire object
in your query, you will get all of the benefits associated with the
entity classes—the most important of which is the ability to keep track
of changes to an entity class for database updates.
Often in SQL, you will select particular columns to return from a
table (SELECT
Firstname
,
Lastname FROM
Contact
) or from a set of joined tables. This is referred to
as projection. With LINQ or Entity SQL queries you
can shape your results by picking particular properties or expressions
rather than entities. You can also select properties from related
data.
In the Chapter 3 queries, you
returned an entire object but used only the Title
, FirstName
, and LastName
properties. You can rewrite those
queries to return only these three properties. As long as you won’t need
to modify and update these results, a projection will suffice.
To see how projections work, you can continue modifying the
QueryContacts
method that you worked
on in Chapter 3. Replace the latest
version of the query with the query in Example 4-1. The difference
from earlier LINQ queries is that rather than ending with select c
to select the entire contact, you are
selecting only a few properties.
Example 4-1. Simple LINQ to Entities query with projection in VB and C#
VB
Dim contacts = From c In context.Contacts
Where c.FirstName= "Robert" _
Select New With {c.Title, c.LastName, c.FirstName}
C#
var contacts = from c in context.Contacts
where c.FirstName=="Robert"
select new { c.Title, c.FirstName, c.LastName };
Note
Why are we back to using Dim
and var
again? You’ll see the
reason shortly in the section titled Implicitly typed local variables.
You may have noticed the syntax differences between VB and C# projections. This is not particular to LINQ to Entities, but it is common for all implementations of LINQ.
C# requires that you use select new
{...}
when projecting. Visual Basic is more lenient. The
most explicit syntax for VB is Select New
With {...}
as in Example 4-1, though you
could write the Visual Basic query in this simpler format:
From c In context.Contacts _ Where c.FirstName= "Robert" _ Select c.Title, c.LastName, c.FirstName
Note
There are plenty of other nuances to LINQ projections in both languages. For example, you can project into predefined types. In addition, C# projections always create immutable (read-only) results, whereas VB allows the creation of immutable and mutable results. You can learn more about projecting with LINQ in the MSDN Library and from the many great resources that focus on LINQ.
A number of language and compiler features that were added to Visual Basic and C# (in the VB 9 and C# 3.0 versions that were released along with Visual Studio 2008 and .NET 3.5) have made it easier for developers to implement LINQ projections. We’ll examine several of these in this section, including anonymous types and implicitly typed local variables.
If you hover your mouse pointer over the contacts
variable, when the code is not
running, the DataTip will show you what the query returns. It’s an
IQueryable
of an “anonymous type,”
rather than an IQueryable
of
contact types. The anonymous type is a result of the projection in
your query, which returned results that don’t match a defined type.
The DataTips and debuggers in Visual Basic and C# often show different
information. In this case, the difference is interesting, as you can
see in Figures 4-1 and 4-2.
What is this anonymous type that the LINQ to Entities projection is returning?
The anonymous type is a language enhancement that was introduced in Visual Basic 9 and C# 3.0 that allows compilers to work with types that were not previously defined. Anonymous types are generally used for on-the-fly types that won’t be used elsewhere in the application. You cannot even pass them from one method to another. Anonymous types relieve you from having to define a class for every type, even if the type is to be used only briefly. Yet an anonymous type returned by the query is still strongly typed, which means you can easily interact with it in the code following its creation.
The sidebar Wrapping Your Head Around Lambdas includes a link to an article by Anson Horton. The article contains a great introduction to anonymous types. Anonymous types are a powerful feature that you can use throughout .NET, but they have special importance for LINQ queries because of their ability to allow projections that can return anything a developer can dream up.
So, the query in Example 4-1 returned an
anonymous type that doesn’t have a name, but has the properties
Title
, FirstName
, and LastName
. If you are still modifying the
earlier query method, you can see a bit of .NET magic by removing
the Console.WriteLine
method and
retyping it. The anonymous type is strongly typed and recognized by
IntelliSense. Pretty cool!
Another new compiler trick that you have been taking advantage
of in some of the code samples so far is the use of implicitly typed
local variables. In C# you use them with a new keyword, var
, and in VB you use them with the
existing Dim
keyword. It is
possible to declare variables without identifying their types. They
will infer the type based on the value that is being set.
Hasn’t it always seemed redundant to say something like
Dim str as String="this is some
text"
or int MyInt=123
?
With implicitly typed local variables, Dim
str="this is some text"
and var
MyInt=123
are enough. In the case of replacing int
with var
the benefit is not very obvious. Had
that type been MyCustomType<Myothercustomtype<T>>
,
suddenly var
would look pretty
convenient.
Note
This shortcut is not always a good thing, as it removes some
of the explicitness of your code. I wrote a blog post on
DevSource.com titled “How Visual Studio 2008 made me even lazier”
(http://blogs.devsource.com/devlife/content/net_general/how_visual_studio_2008_made_me_even_lazier.html).
There is an interesting discussion in the comments about the pros
and cons of implicit typing. Throughout the book, I will attempt
to declare types explicitly for the sake of clarity. However, in
cases where the type name is quite long, you may find a var
in its place.
Where implicitly typed local variables really shine, however,
is with LINQ query projections, because there’s no way to say “Dim
contacts as a thing with a Title, a FirstName, and a LastName.”
Instead, you can write “Dim contacts (and just look at the other
side of the equals sign to figure out what this is).” In this
context, Dim
in VB and var
in C# essentially translate to
“thing,” or for some readers, “whatever.”
Run the application and you’ll see that, once again, the
results are the same as they were previously. You can modify the
Console.WriteLine
command to
include the Title
property that
is in the newest query.
Note
In Chapter 10, you will learn more about Object Services and all of the functionality it provides to objects returned by queries against the EDM. This will help you better understand the significance of returning anonymous types rather than entire entity objects defined by the EDM.
You can project into anonymous types in a number of ways. For
instance, it is possible to give a name to the returned variable,
such as ContactName
in Example 4-2.
Example 4-2. Naming a projected anonymous type in LINQ in VB and C#
VB
From c In context.Contacts _
Where c.FirstName = "Robert" _
Select ContactName = New With {c.Title, c.LastName, c.FirstName}
C#
from c in context.Contacts
where c.FirstName == "Robert"
let ContactName = new {c.Title, c.LastName, c.FirstName}
select ContactName
C# does not allow naming in the SELECT
statement; it has another operator,
LET
, that can be used for this
purpose.
Note
There are so many ways to do projection and use anonymous types in LINQ queries. Here you are seeing just a small slice of what you can achieve, so be sure to look to the dedicated LINQ resources to expand your understanding.
Naming the anonymous type is more useful if this new type is a property of the projected results. In Example 4-3, a projection is used to project much more than some strings. It creates a new type with another anonymous type as the first property and the addresses of the contact as the second.
Note
I’m projecting the Addresses
property here to highlight the
projection. You’ll learn more about working with related data
later in this chapter.
When you name the anonymous type, the property that results
will have the name specified in the query. Notice that the property
name is used later in the query for the Order By
operator and when working with
the results.
Example 4-3. Anonymous types as properties
var contacts = from c in context.Contacts where c.FirstName == "Robert" let foo= new { ContactName = new {c.Title, c.LastName, c.FirstName}, c.Addresses } orderby foo.ContactName.LastName select foo; foreach (var contact in contacts) { var name = contact.ContactName; Console.WriteLine("{0} {1} {2}: # Addresses {3}", name.Title.Trim(), name.FirstName.Trim(), name.LastName.Trim(),contact.Addresses.Count()); }
Figure 4-3
shows the shape of the new range variable, foo
. The first property is the ContactName
anonymous type.
Note
Unlike the ContactName
anonymous type in this query, the Address
entities that this query returns
will participate in the change tracking and database
updates.
To project using LINQ’s method-based query syntax, you would use
the Select
method and then identify
the properties you want in its parameter. The method-based query
syntax requires the syntax for creating an anonymous type in the
lambda (see Example 4-4).
One of the big benefits that the EDM lends to querying is that the relationships are built into the model and you won’t have to construct joins very often to access related data. Additionally, when using LINQ for the queries, the related data is presented via IntelliSense, which makes it very discoverable.
Using the model, let’s take a look at some more queries, this time digging into associations.
The model has only one association, that which lies between
Contact
and Address
. The association provides two
navigations—one from Contact
to all
of its related addresses and one from Address
to its related contact.
You can easily do projection, drilling into related entities,
although drilling into a collection is different from drilling into a
reference entity. For example, you can’t request Contact.Addresses.Street
in a query. Contact
to Addresses
is a one-to-many relationship and
Addresses
is a collection of Address
entities, not a single entity.
Street
is not a property of the
Addresses EntityCollection
. However,
you could select Address.Contact.LastName
, because you would be
navigating to a single entity. There is only one contact per address;
therefore, there is no question regarding from which entity the query
should retrieve the LastName
.
Recall that navigating to the “one” end of a one-to-one or
many-to-one relationship is referred to as a navigation
reference. The entity you are pointing to is referred to as
an EntityReference
, sometimes
called an EntityRef.
Note
Chapter 19
will drill further into EntityReference
s and EntityCollection
s,
and how they are surfaced as navigation properties.
The LINQ query in Example 4-5 returns an anonymous type containing an address and its related contact.
Example 4-5. Projecting into an EntityRef with LINQ to Entities
var addresses = from a in context.Addresses where a.CountryRegion == "UK" select new { a, a.Contact };
Figure 4-4 displays the anonymous type that results in the debugger, where you can see that one property is the address record and the other is the contact.
When working with the results, you’ll have to drill into the new
type’s properties (the Address
and
the Contact
) and from there you’ll
have to drill into their properties, as shown in Example 4-6.
Example 4-6. Accessing the properties of an anonymous type
foreach (var address in addresses) { Console.WriteLine("{0} {1} {2}", address.Contact.LastName, address.a.Street1, address.a.City); }
Note
The first property is named a
because it is using the variable name
given in the query. If you want to be sure the property is called
Address
you can use that instead
of the simpler a
, or use LINQ
syntax to rename the property:
VB
Select New With {.Address = a, a.Contact}
C#
select new {Address= a, a.Contact };
Then you can work with address.Address
in the data which
results.
Although this may suit many scenarios in your applications, you
may prefer to project individual properties from the reference
navigation. Example 4-7 shows such a
query using LINQ to Entities. This projection returns a new type with
three properties. The first is an Address
entity; the second and third are
strings. Again, the property names are based on the query
defaults—a
, FirstName
, and LastName
.
Example 4-7. Combining properties from related entities
var addresses = from a in context.Addresses where a.CountryRegion == "UK" select new { a, a.Contact.FirstName, a.Contact.LastName }; foreach (var address in addresses) { Console.WriteLine("{0} {1} {2} {3}", address.FirstName, address.LastName, address.a.Street1, address.a.City); }
You can filter and sort based on a property of an EntityReference
whether or not you are
selecting the related data.
For example, you can select all addresses for contacts with a
particular last name. The LINQ to Entities query in Example 4-8 sorts by Contact.LastName
and filters on the Contact.AddDate
field even though AddDate
is not part of the results.
Querying with related data is straightforward when the related
data is a single entity, but what about when the navigation property
is an EntityCollection
such as
Contact.Addresses
?
Let’s start with a simple scenario that you have seen a few
times already in this chapter: returning a contact and its collection
of addresses. To highlight the difference between the original
properties and the results, the EntityCollection
in the new type is given a
random name, as shown in Example 4-9.
Example 4-9. Projecting an EntityCollection with LINQ
var contacts = from c in context.Contacts select new {c, Foos = c.Addresses};
This query creates a new anonymous type with two properties. The
first is the Contact
and the second is Foos, which
is the EntityCollection
of Addresses
related to that Contact
.
You can enumerate through the results, and then, for each
result, enumerate through the collection of the Foos
property, as shown in Example 4-10.
If you wanted to select particular properties such as Street
and City
from each Address
of each Contact
, the method you should use to build
the query depends on what shape you want the results to be.
You could shape the data similar to the previous example, but
instead of a set of complete address entities as the Foos
property, you can project some of the
address properties. This would result in a set of anonymous types,
named StreetsCities
instead of
Foos
, in the second
property.
You can achieve this with a nested query, a feature we’ll look
at more closely later in the chapter. For now, you can see in the
query in Example 4-11
that the third property, StreetsCities
, contains the results of
querying the Contact
’s Addresses
.
Example 4-11. Projecting values from an EntityCollection
from c in context.Contacts select new {c.FirstName, c.LastName, StreetsCities = from a in c.Addresses select new { a.Street1, a.City } }
The anonymous type that is returned has the properties
FirstName
and LastName
, along with a collection of
anonymous types with Street1
and
City
properties. The debugger
screenshot in Figure 4-5
displays the new type.
Another way to project into the addresses is to merely turn the query around. That is, query the addresses and their contact data to flatten the results, as shown in Example 4-12, so that the data is no longer shaped.
Example 4-12. Flattening the related data
var contacts = from a in context.Addresses orderby a.Contact.LastName select new {a.Contact.LastName, a.Contact.FirstName, a.Street1, a.City};
This will result in a single type with four properties, but contacts with multiple addresses will appear multiple times, as you can see in this section of the results. For instance, Katherine Harding and Keith Harris each have two results:
Hanson, John: 825 W 500 S, Bountiful Harding, Katherine: 52560 Free Street, Toronto Harding, Katherine: 25 Flatiron Blvd., Vancouver Harrington, Lucy: 482505 Warm Springs Blvd., Fremont Harris, Keith: 3207 S Grady Way, Renton Harris, Keith: 7943 Walnut Ave., Renton Harui, Roger: 9927 N. Main St., Tooele Hass, Ann: Medford Outlet Center, Medford
Although you can easily use related data in projections or for
filtering, sorting, and other operations, it is important to keep in
mind that when the related data is in a collection, you need to
leverage operations that can be performed on a set of data. For
example, if you want to find contacts with addresses in the United
Kingdom (represented as UK in the database), you can use the ANY
method in LINQ to Entities (see Example 4-13) or the EXISTS
operator in Entity SQL (which you’ll
see in the next chapter) to search the contact’s addresses. The LINQ
query uses a predicate to provide the condition for ANY
.
Aggregates perform calculations on a series of data. Aggregate
methods include Count
, Sum
, Average
, Min
, and Max
. You may not want the entire collection
of addresses, but rather some aggregated information about that
collection.
Aggregating data with LINQ is easyusing one of the aggregate
methods such as Count
; simply
append the method to the collection name. The Count
method will return the count of the
items in the collection (see Example 4-14).
Example 4-14. Using the Count aggregate method in LINQ to Entities
from c in context.Contacts select new {c.LastName, c.Addresses.Count};
Other types of aggregates, such as Max
, require a specific value to
aggregate. You can supply that value using a lambda expression, as
shown in Example 4-15.
Example 4-15. Using an aggregate method with a lambda in LINQ
from c in context.Contacts select new { c.LastName, MaxPC = c.Addresses.Max(a => a.PostalCode)};
It’s important to name the property returned by the aggregate function, because LINQ is unable to derive one based on the method. If you forget to do this, both VB and C# will give a compiler error explaining the problem.
Note
Visual Basic has an Aggregate
operator for LINQ that you can
use in place of FROM
in your
LINQ queries. Check the MSDN Library topic “Aggregate Clause
(Visual Basic)” for more information.
The LINQ aggregates are methods, not query operators. Therefore,
they work very naturally with the LINQ query methods. Example 4-16 uses the
Max
aggregate as one of two
projected values to be returned.
Example 4-16. A LINQ method syntax query using an aggregate
context.Contacts .Select((c) => new { c.LastName, MaxCode = c.Addresses.Max(a => a.PostalCode) });
This query does two interesting things with the lambdas. First
it uses a lambda expression to specify what values should be
projected: LastName
and MaxCode
. Once the variable, c
, has been declared, the function projects
an anonymous type consisting of LastName
as the first property and MaxCode
as the second. MaxCode
is defined by using the Max
aggregate on the Addresses
collection of the
contact.
Although associations in the EDM minimize the need for joins in queries, sometimes a relationship may exist but there is no association to represent the relationship. In these and other cases, you can use nested queries or joins to bring the data together.
LINQ to Entities provides a JOIN
operator as well as GROUPJOIN
. Entity SQL provides a variety of
options in the JOIN FROM
clause,
including inner joins, as well as left, right, and full outer joins. It
also enables joining multiple collections separated by commas.
The vOfficeAddresses
entity
in the current model has all of the contact properties except for the contact’s
Title
. Because there is no
association between vOfficeAddresses
and Contact
, you will need to use JOIN
to combine the vOfficeAddresses
entity properties with
the Title
property.
Note
You could, of course, add the association to the model in this case, but then there would be no lesson here, would there?
Example 4-17 shows the syntax of a
LINQ JOIN
.
Example 4-17. JOIN syntax for LINQ
FROM [variableA] IN collectionA JOIN [variableB] IN collection ON variableA.commonproperty EQUALS variableB.commonProperty SELECT .....
Example 4-18 shows
how to combine data from Contact
entities and vOfficeAddresses
entities using the
JOIN
.
Example 4-18. A LINQ to Entities query using a JOIN
from c in context.Contacts join oa in context.vOfficeAddresses on c.ContactID equals oa.ContactID select new { oa.FirstName, oa.LastName, c.Title, oa.Street1, oa.City, oa.StateProvince };
This provides an inner join where only entities with matching
ContactID
s are returned. Any
contacts with no match in the vOfficeAddresses
will not be returned.
vOfficeAddresses
with no match in
Contacts
will not be returned
either.
Both LINQ and Entity SQL provide the ability to nest queries, and you have already seen some examples of this. When you write a query, anywhere a value is expected you can use another query in its place, as long as that query returns an acceptable type. You can use a nested query in place of an expression or a collection, as you will see in the following examples.
The goal of the previous JOIN
queries was to return properties from a Contact
entity combined with properties from
the vOfficeAddresses
entities where
the ContactID
matches.
Example 4-19 shows
how to express the previous query in LINQ using a nested query
instead of a JOIN
. The query uses
a nested query (highlighted) combined with the FirstOrDefault
method in place of a
projected value to return results from vOfficeAddresses
.
Example 4-19. Nested query in place of a SELECT expression in LINQ
from oa in context.vOfficeAddresses select new { oa.FirstName, oa.LastName, Title = (from c in context.Contacts where c.ContactID == oa.ContactID select c.Title).FirstOrDefault(), oa.Street1, oa.City, oa.StateProvince };
There are a few notable twists to this query. The first should
be familiar: an anonymous type is not able to automatically name the
return from the nested query. Therefore, it is given the name
“Title”. The second twist is that the subquery returns an IQueryable
of
String
, not just a string, which
is why the FirstOrDefault
method
is appended to the query.
You can also use the nested query in place of the collection being queried. The nested query merely returns another collection to be queried.
Let’s start with a basic example. Rather than querying all
vOfficeAddresses
, you could
create a subquery that returns only vOfficeAddresses
in Ontario and then query
against that. Example 4-20 is simplistic
and could easily be expressed without the nested query. The
technique can be useful when you are attempting to express queries
that are much more complex.
Example 4-20. Nested query in place of a target collection in LINQ
var contacts = from add in (from oa in context.vOfficeAddresses where oa.StateProvince == "Ontario" select oa) select ...
You can benefit from using nested queries to help with complicated queries by separating the nested query from the main query.
On its own, this particular example doesn’t seem very useful,
but imagine being able to use subqueries to redefine the universe of
vOfficeAddresses
from which to
query, and then passing that into different methods which will
perform additional queries on that subset.
Example 4-21
ties a subquery to a variable and then uses that variable in another
query. The second query is complex enough, using another nested
query to join vOfficeAddresses
back to Contact
. Breaking up the query makes the
code much more readable. When the query is executed, the Entity
Framework will create a single query from the combined
expressions.
Note
Don’t forget the importance of knowing what is going on at the database level by using some type of profiler, as suggested in Chapter 3.
Example 4-21. Breaking a nested query out of the main query in LINQ
var universe = from oa in context.vOfficeAddresses where oa.StateProvince == "Ontario" select oa; var query = from oa in universe select new { oa, contact = (from c in context.Contacts where c.ContactID == oa.ContactID select c) }; var AddressesWithContacts = query.ToList();
Note
You can’t separate out a nested query that’s inside a projection, as in Example 4-21, because its filter condition is dependent on the main query.
Warning
An Order
operator in a
subquery will be ignored. The main query controls ordering.
Both LINQ and Entity SQL provide operations for grouping data. You can use grouping in connection with aggregates or to shape data.
LINQ to Entities has a Group
operator (literally Group By
in
Visual Basic and Group
in C#) and a
GroupBy
method (with eight
overloads). Entity SQL provides a GROUP
BY
operator and a GroupBy
query builder method.
The results of the grouping can use automatic naming, and in other
cases can be explicitly named. In addition, an INTO GROUP
clause is required in Visual Basic.
C# has an optional INTO
clause.
The constructs for VB and C# are quite different and it’s easiest to explain them with examples. Example 4-22 shows the simplest form of grouping in LINQ for both Visual Basic and C#.
Example 4-22. Simple grouping in LINQ to Entities in VB and C#
VB
From c In context.Contacts Group By c.Title Into Group
C#
from c in context.Contacts group c by c.Title into mygroup select mygroup
The result of this query is an IQueryable
of an Entity Framework class called
Grouping
; more
specifically, System.Data.Objects.ELinq.InitializerMetadata.Grouping<K,T>
.
In our example, it’s a Grouping<string,Contact>
. This is
something like a key/value pair where the key is K
(the string in our example) and the value is
an IEnumerable
of T
(e.g., the group of Contact
types).
The results, therefore, are a set of these key/value pairs. If we select one of the groupings, as you can see in Figure 4-6, VB automatically names the property containing the title as “Title”.
Figure 4-6. The VB result, which contains a Title property and a Group property that contains three contacts
By default, C# uses the word Key as the name for the key of the grouping and doesn’t name the property that contains the grouped records, as you can see in Figure 4-7.
VB allows you to specify the property name rather than use the
default. In Visual Basic, to change the Title
property of the preceding query to
MyTitle
, you would use the syntax
Group By MyTitle=c.Title
.
In VB, the Group
property is
available to access the group. You can rename this as well. For example,
Into MyGroup = Group
renames the
property to MyGroup
.
The optional INTO
clause in
C# allows you to specify a group name, but this is not exposed as a
property. You specify the name with INTO
so that you can perform further
functions on the group. Note that in C#, using the INTO
clause requires that you also use the
SELECT
clause. The Key
property is then accessible as a
property of the group.
With the group specified, it is now possible to explicitly name
the properties in C#. LINQ queries in Visual Basic will imply a
SELECT
statement if it is not used.
In this case, the query will still return Title
and MyGroup
by default without specifying
SELECT
. Of course, you can shape
the data further by specifying your own output with an explicit
SELECT
operator.
Example 4-23 demonstrates these changes to the previous queries.
Visual Basic provides a simple way to use aggregates in grouping
queries, by specifying one or more aggregates in the INTO
clause separated by commas. In Example 4-24, your result will contain
the properties Max
and Count
.
Example 4-24. Chained aggregates in VB LINQ
VB
From c In context.Contacts _
Group By c.Title Into MyGroup = Group, _
Max(c.AddDate), Count()
In C#, you need to explicitly project these properties in the
Select
clause using methods and
predicates, as shown in Example 4-25.
There is so much more that you can do with grouping in LINQ. For now, we’ll take a look at one more variation: filtering on the grouping condition.
The Title
fields in the
sample data contain Mr., Mrs., Ms., Sr., and a few other titles. Also,
some contacts have no title. Perhaps you would like to group on title,
but exclude empty titles. To filter what is being grouped, such as
“only group contacts with something in the Title
field,” you can apply the filter to
the control variable, Title
, to
make sure it contains a value.
You may, however, want to filter on a property of the Group
. With LINQ you can continue to use the
WHERE
operator, as shown in Example 4-26.
Example 4-26. Filtering on a Group property with LINQ
VB
From c In context.Contacts _
Group By c.Title Into MyGroup = Group, Count() _
Where (MyGroup.Count() > 150)
C#
from c in context.Contacts
group c by c.Title into MyGroup
where MyGroup.Count() > 150
select new { MyTitle = MyGroup.Key,
MyGroup,
Count = MyGroup.Count()};
In LINQ, you will also need to be aware of variables going out
of scope, as in the Visual Basic query shown in Example 4-27, which won’t
compile. The a
in Group by
a.CountryRegion
is out of scope because
by this point in the query, you are working with the anonymous type
returned by the Select
statement.
And the Select
does need to go
before the Group By
.
Example 4-27. An out-of-scope variable preventing this query from compiling
VB
From a In context.Addresses _
Select a.Contact.FirstName, a.Contact.LastName, a.CountryRegion _
Group By a.CountryRegion Into MyGroup = Group, Count() _
Where (MyGroup.Count() > 150)
You can avoid this problem by naming the anonymous type, and then grouping by a field within the name, as shown in Example 4-28.
Example 4-28. Naming variables to keep them from going out of scope
VB
From a In context.Addresses _
Select c = New With {a.Contact.FirstName,
a.Contact.LastName, _
a.CountryRegion} _
Group By c.CountryRegion Into MyGroup = Group
C#
from a in context.Addresses
let c= new {a.Contact.FirstName, a.Contact.LastName,
a.CountryRegion} group c by c.CountryRegion
into MyGroup where (MyGroup.Count() > 150)
select MyGroup;
Both the Visual Studio documentation and the ADO.NET Entity
Framework documentation and samples can provide you with an astounding
array of data shaping that you can perform with Group By
/groupby
in LINQ, and even then there are
still many more.
Note
See Finding More Query Samples for links to these resources.
Like everything else this chapter has covered so far, we have
only skimmed the surface of GROUP
BY
in Entity Framework queries. You will see more uses
throughout this book and can find more details (and plenty of rules)
in the documentation. The rest of this chapter will explain some
important concepts that have been exposed by the queries you’ve seen
so far.
Whether you write a query that returns entities, anonymous types,
DbDataRecord
s, or DbDataReader
s, you can return shaped data.
You’ve seen this in several of the previous queries, with a variety of
shaped results. How you use this data depends on how the data is shaped.
Let’s take a further look at the results of some of the earlier
queries.
The LINQ and Object Services queries that returned entities defined in the model are not shaped. They are purely a collection of individual entities.
For instance, Example 4-13 returned an
IQueryable
of Contact
objects. Example 4-9, however, returned
an anonymous type with two properties. The first property was a Contact
entity and the second was a collection
of Address
entities related to that
Contact
. The
code in Example 4-10
enumerated over that data, albeit in a somewhat boring way, to
demonstrate what the data looked like. It showed the contacts and the
addresses but did not truly demonstrate the relationship between the
two.
Example 4-29 executes the
same query and then enumerates through the anonymous types that result.
This time, however, the code accesses the Addresses
as a navigation property of the
Contact
.
LazyLoadingEnabled
is set to
false
to ensure that the Count
method does not impact the
results.
Example 4-29. LINQ query creating shaped results
context.ContextOptions.LazyLoadingEnabled=false; var addressGraphs = from a in context.Addresses where a.CountryRegion == "Canada" select new { a, a.Contact }; foreach (var ag in addressGraphs) { Console.WriteLine("LastName: {0} # Addresses: {1} ", ag.Contact.LastName.Trim(), ag.Contact.Addresses.Count()); foreach (Address address in ag.Contact.Addresses) { Console.WriteLine(".....{0}", address.City); } Console.WriteLine(); }
Note
There’s a simpler way to express this particular query with the
Include
method, which you will see
next. But what differentiates this from Include
is that with it you can take the
projection in Example 4-29 a
step further in a direction that you won’t be able to do with Include
. I’ll discuss this after we look at
the results of this example.
Let’s turn the query around a bit to see how this can work. Imagine you are querying contacts and want to also return their addresses.
The WriteLine
method doesn’t
access the a
property of the
anonymous type, but instead navigates to the addresses through the
Contact
property of the anonymous
type.
As the Contact
and Address
entities are materialized, the Entity
Framework recognizes that they are related to each other and wires them
up so that you can navigate between them. The Address
objects have a Contact
object in their Contact
property and the Contact
objects have Address
objects in their Addresses
property. This is a very high-level
explanation of an important function of the Entity Framework’s Object
Services API, which you will learn plenty about throughout the
book.
There is an interesting thing to be aware of with respect to how
the Entity Framework connects the related entities in the scenario laid
out in Example 4-29. If you
look at the following sample of the output, you can see that two
addresses belong to the contact “Harding.” One is in Toronto and the
other is in Vancouver. But the first instance says that Harding has only
one address. Not until the code has reached the second address is the
contact aware that two addresses exist in its Addresses
navigation collection.
LastName: Garza # Addresses: 1 ....Burnaby LastName: Harding # Addresses: 1 ....Toronto LastName: Harding # Addresses: 2 ....Toronto ....Vancouver LastName: Caprio # Addresses: 1 ....Toronto LastName: Blackwell # Addresses: 1 ....Toronto LastName: Hamilton # Addresses: 1 ....Chalk Riber
The second address isn’t recognized initially because it hasn’t been materialized as an object yet. As the code enumerates through the query results for the first time, the objects are created from the query results as each contact or address is reached. Once the second address is encountered and turned into an object, its relationship to the contact is identified.
Note
I had you disable lazy loading in order to see this because when
you requested Addresses.Count
, lazy
loading would have kicked in and gone to the database to retrieve the
contact’s complete Addresses
collection. For the sake of the demo, I did not want this behavior.
You’ll learn more about lazy loading further on in this chapter, and
later in the book as well.
We will explore the object life cycle more deeply in a later chapter, but this should give you some idea for now about what’s going on in this example.
At the end of the previous example, I mentioned that projections
will allow something that the upcoming Include
method won’t allow. That is the
ability to filter which related data is returned.
If you were querying for contacts with their addresses, a projection would look like this:
var contactGraphs = from c in context.Contacts select new { c, c.Addresses };
You can modify the query to load all of the contacts, but only a subset of their addresses, as in Example 4-30.
Example 4-30. Filtering related data in a query using projections
var contactGraphs = from c in context.Contacts select new { c, MyAddresses = c.Addresses.Where(a=>a.CountryRegion="UK")};
I’ll refer back to this example as we look at other means of loading related data.
So far, all of the queries that involved returning related data have explicitly asked for that data in the query itself. The Entity Framework will only return data that you explicitly ask for. If your query asks only for contacts, the Entity Framework will not make an assumption that just because contacts have addresses, it should return the addresses anytime you query for contacts. Consider a typical model for sales information. A contact is related to a customer; a customer has sales orders; each sales order has line items; each line item relates to a product; each product comes from a vendor and is also related to a category. Can you imagine if you queried for contacts, and without expecting it, the entire contents of the database were returned—because it was all related?
It is possible to get related data after the fact. For example, if you queried for a selection of contacts, as you work with those contacts in code you can request the contacts’ addresses without performing another complete query.
Note
For developers coming from the first version of Entity
Framework, there is a big change to be aware of here. The implicit,
automatic loading of related data, controlled by the ObjectContext.ContextOptions.LazyLoadingEnabled
property, is a new option in the Entity Framework. It is enabled
(i.e., set to true
) by default, for
newly created models. The property will be false
on existing models pulled into .NET 4
to prevent breaking changes in your existing code.
This is referred to as deferred loading or implicit deferred loading, and is most commonly known as lazy loading.
As of .NET 4, Entity Framework performs lazy loading by default.
The LINQ to Entities query in Example 4-31 returns an
Object
Set
of Contact
entities. As the code enumerates
through the results, it also asks for information about the related
Addresses
. But the Addresses
were not returned with the original
query.
Example 4-31. Implicitly loading related data after the fact
var contacts= from c in context.Contacts select c; foreach (var contact in contacts) { Console.WriteLine("{0} #Addresses: {1}", contact.LastName,contact.Addresses.Count()); }
Unlike the filtered projection in Example 4-30, lazy loading has no means of filtering the data being loaded.
However, each time the code hits a request for the address count of the current contact, a new query will be executed on the server to retrieve the addresses for the current contact. You should understand that this means that if there were 10 contacts in the original result, there will be 10 additional trips to the database as you iterate through the 10 contacts.
Lazy loading is surely convenient, but if you are not paying attention, you could be abusing your server resources by unknowingly or even unnecessarily causing repeated trips to the database. You can disable (and reenable) lazy loading as needed in code or modify the default behavior for the context. There are other ways to load related data when you need it even if you are not depending on lazy loading.
Lazy loading can be controlled through the ObjectContext
’s ContextOptions.LazyLoadingEnabled
property:
var context = new SampleEntities(); context.ContextOptions.LazyLoadingEnabled = false;
Once it is disabled, you can still explicitly load related data on demand if needed, or even load the data along with the initial query. These two methods are covered in the next few pages.
In the default generated classes, the constructors for the
ObjectContext
(e.g., SampleEntities
) set
LazyLoadingEnabled
based on an
annotation in the EDMX. The XML annotation was pointed out in the
CSDL EntityContainer
section of
Chapter 2.
Models that are created in Visual Studio 2010 have this
annotation with the value set to true
. Models that were created in Visual
Studio 2008 SP1 do not have the annotation, and therefore, if you
are using an older model, by default, lazy loading will not be
enabled.
The Lazy Loading Enabled setting is exposed in the model’s Properties window in the Designer, in the Code Generation section, where you can change the default behavior for a particular model.
Let’s return to the query in Example 4-31:
var contacts= from c in context.Contacts select c;
When lazy loading is disabled, because the query does not
explicitly request the addresses, the Addresses.Count
for every single contact
will be zero.
But you can explicitly tell the Entity Framework to get the addresses for the current contact, as shown in the Example 4-32.
Example 4-32. Explicitly loading related data with the Load method
foreach (var contact in contacts) { contact.Addresses.Load(); Console.WriteLine(contact.Addresses.Count); }
When Load
is called, Object
Services will execute a query to retrieve all of the addresses for
that contact. In the preceding example, after Load
is called, the value of Count
will be correct and all of the
Address
entities for that contact
will be available.
Using Load
is another case
where you cannot filter the related data being loaded as you can with
the projection in Example 4-30.
In .NET 4, a new method was introduced to load from the context,
not from the navigation property. The method is ObjectContext.LoadProperty
and it was
created as part of the support for classes that do not inherit from
EntityObject
. You’ll learn about
LoadProperty
in Chapter 11.
You can also perform deferred loading for EntityReference
navigation properties—for
example, Address.Contact
.
However, rather than load from the Contact
property, you must load from the
additional property that was created by the code generation:
ContactReference
. The Entity
Framework sees Address.Contact
as
merely a Contact
entity, and the
Contact
class does not have the
Load
method. It is the ContactReference
property that has the
knowledge of how to load the related information. Each EntityReference
navigation property will
have a related property with the word Reference
appended to its name.
Example 4-33
shows how to load Contact
data
for particular addresses after the addresses have already been
queried.
There is a big performance consideration here. Whether you are lazy-loading or explicitly loading the related data for each contact, the code is forcing an extra round trip to the database, something many developers won’t realize unless they are profiling the database activity. This can be extremely inefficient and might also get you into big trouble with the IT pros in your company. With lazy loading disabled, you can have some control over when the extra trip is made.
Load
is a great choice in
cases where you want to inspect the contacts and then load addresses
for only particular contacts. Perhaps you want to list all contacts,
but for contacts that were added after a particular date you need to
see how many addresses are in the database. The code in Example 4-34 demonstrates
this scenario, where you may determine it is more efficient to make
a small number of database trips rather than preloading addresses
for every contact.
Example 4-34. Loading addresses for some of the contacts
foreach (Contact contact in contacts) { Console.WriteLine(contact.LastName); if (contact.AddDate > System.Convert.ToDateTime("1/1/2008")) { contact.Addresses.Load(); } }
With lazy loading enabled, this kind of granular control is a bit more difficult to achieve.
The benefit of having lazy loading enabled is that you won’t have to worry about reporting that there are no addresses for a contact when in fact there are a number of them in the database because you forgot to, or didn’t know that you needed to, explicitly load those related addresses.
In cases where you know you will need all of the addresses up
front, it may be more efficient to retrieve them as part of the
original query. Although you have seen how to do this with projection
by including the addresses in the SELECT
clause, the Include
method is another way to achieve
this and may be preferable for a variety of reasons. The most notable
reason is that the resultant objects will be your entities, rather
than anonymous types with entities as their properties. However,
Include
does not allow you to
filter the related data as you can with a projection.
Include
is a query builder
method and you can apply it to an ObjectQuery
or ObjectSet
(which, as you may recall, derives
from ObjectQuery
). Because context.Contacts
is an ObjectSet
, you can use Include
even within a LINQ query, as shown
in Example 4-35.
Example 4-35. The Include method in a LINQ to Entities query
from c in context.Contacts.Include("Addresses") where c.LastName.StartsWith("J") select c
The argument for Include
is a
string that is the name (or names) of the navigation properties to
bring back along with the contacts. This is referred to as
eager loading or eager
fetching.
You can use Include
only when
returning an ObjectQuery
or
ObjectSet
of a single entity type.
You cannot use it with projections, and if you do project, Include
will be ignored.
In the sample model, there is only one navigation property for
contact
, which is Addresses
. Imagine a sales model with a
number of entities and a variety of navigations. You could query
customers and eager-load the orders and all of the orders’ details by
querying Customers.Include("Orders.OrderDetails")
.
The string is called a query path because it
defines the path that the query should navigate through the model.
This will bring in both the Orders
and OrderDetails
. Additionally, you
could eager-load the orders and the customers’ addresses by chaining
the Include
methods like
this:
Customers.Include("Orders.OrderDetails").Include("Addresses")
Data shaping is one of the interesting benefits of Include
. The previous Contacts.Include("Addresses")
query
returns a set of Contact
entities. This does not have
the same effect as projection, which would have to return DbDataRecord
s.
Figure 4-8
shows the query results in the debugger’s QuickWatch window. You can
see that the results are strictly a set of Contact
entities. Where are the
addresses?
Figure 4-8. The result of the Include with no projections, which returns only the primary entity of the query
Figure 4-9
drills into one of the contacts, and you can see that both of this
contact’s addresses are there. The Include
brings in the related data, and
unlike the issue you saw in the results of Example 4-29 (not all addresses
were being attached to Ms. Harding from Toronto until the addresses
had been enumerated), all of these addresses are present as soon as
you get to the contact.
Figure 4-9. The result of the Include with projections, with lazy loading disabled, which returns the contact’s related Addresses in the query
Note
Lazy loading will still be active when you are inspecting
data in debug windows such as the QuickWatch window in Figure 4-9. I disabled
lazy loading for the context prior to opening the QuickWatch
window. You can also watch a database profiler to ensure that the
Addresses
count you are looking
at is truly a result of eager loading and is not being provided by
way of lazy loading and an extra hit to the database.
You can use the properties of the Include
entities in many of the same ways
you can use properties of any related data when querying.
Example 4-36 uses
the CountryRegion
field of
Address
to limit which contacts
are retrieved. But be sure you are clear on the results. This will
return contacts that happen to have any of their addresses in the
United Kingdom. If a contact has multiple addresses and only one of
them is in the United Kingdom, you will still get all of those
addresses.
Example 4-36. Limiting which contacts are retrieved
from c in context.Contacts.Include("Addresses") where c.Addresses.Any((a) => a.CountryRegion == "UK") select c
Note
Although you can use the properties of the included data in your query, you cannot filter or sort the included data. There’s no way to say “when you return the addresses along with the contacts, please sort the addresses by city.” Additionally, as mentioned before, you can’t filter the included data either.
You have some things to consider when choosing between the
Load
and Include
methods. Although the Load
method may require additional round
trips to the server, the Include
method may result in a large amount of data being streamed back to the
client application and then processed as the data is materialized into
objects. This would be especially problematic if you are doing all of
this work to retrieve related data that may never be used. As is true
with many choices in programming, this is a balancing act that you
need to work out based on your particular scenario.
The documentation also warns that using query paths with
Include
could result in very
complex queries at the data store because of the possible need to use
numerous joins. As the model becomes more complex, the potential for
trouble increases.
You could certainly balance the pros and cons by combining the
two methods. For example, you can load the customers and orders with
Include
and then pull in the order
details on an as-needed basis with Load
.
The correct choice, or combination, will most likely change on a case-by-case basis.
All of the queries so far have returned sets of data. What if you
wanted to retrieve a single entity or a single result? The queries
return IQueryable
s or ObjectQuery
s and you need to dig into those to
get at the actual data, which might be entities, anonymous types, or
DbDataRecord
s.
This is reasonable if you are returning multiple items, but what
about cases where you query for one particular item—for example, the
contact whose ContactID
is 63
—and you don’t want to have an IQueryable
returned, but just the item?
LINQ to Entities has a pair of methods, First
and FirstOrDefault
, which will return the first
item in the result set. Additionally, Single
and SingleOrDefault
are useful when you are
expecting only one item in the result set—for example, if you are
querying for a single contact. These methods are not specific to LINQ to
Entities, but come from LINQ and may be familiar to you
already.
Example 4-37 shows two
techniques for using these methods. In the first technique, a query is
defined and then the Single
method is
called. This will cause the query to be executed and the contact entity
to be returned. The second technique appends the Single
method directly to the query. Even
though Single
is a LINQ method, you
can combine it with the query operator syntax by wrapping the query in
parentheses. In this case, the query is executed immediately and the
contact is returned.
Example 4-37. Querying with the Single method
IQueryable<Contact> contacts = from c in context.Contacts where c.ContactID == 1 select c; Contact contact = contacts.Single(); Console.WriteLine(contact.LastName); Contact singleContact = (from c in context.Contacts where c.ContactID == 2 select c).Single(); Console.WriteLine(singleContact.LastName);
There’s a potential problem here. If there are no items, First
and Single
will throw an InvalidOperationException
with the message
“Sequence contains no elements.” FirstOrDefault
and SingleOrDefault
protect you from the exception
by returning the default, which is generally a null (Nothing
in VB). Additionally, if you use
Single
or SingleOrDefault
but the result set contains
more than one item, an exception will be thrown. In that case, you
should be using First
or FirstOrDefault
.
In Example 4-38,
SingleOrDefault
is used to avoid an
exception being thrown. Contact
in this case will be Nothing
/null
after the query is executed.
Example 4-38. Using SingleOrDefault to avoid an exception
var contact = (from c in context.Contacts where c.ContactID == 7654321 select c).SingleOrDefault();
Another way to use these methods is to pass the predicate directly
to them, rather than using a where
operator.
For example:
var contact = context.Contacts.Single(c => c.ContactID == 1);
The ObjectContext.GetObjectByKey
method and its
counterpart, TryGetObjectByKey
,
provide a way to query for an object without having to construct and
execute a query. However, this has a notable twist. The runtime will
first look in the existing instantiated objects to see whether the
object has already been retrieved. If it is found, this is what will
be returned. If not, the query to the data store will be executed
automatically and the object will be returned.
GetObjectByKey
takes an
EntityKey
type that defines what
object to retrieve based on its EntitySet
, its key property name, and the
value of that property. For example, EntityKey("SampleEntities.Contacts","ContactID",5)
defines an object in the Contacts EntitySet
with a ContactID
value of 5
. Once the EntityKey
has been created, GetObjectByKey(myEntityKey)
will return the
object either from memory or from the database.
TryGetObjectByKey
uses the
.NET Try
pattern to avoid returning
an exception if the object is not found in memory or in the
database.
You will see both of these used many times in later chapters,
and you will learn all about the EntityKey
class in Chapter 10.
Note
There is also a method for retrieving an entity by only
looking in memory and not checking the database, called GetObjectStateEntry
. You’ll learn about
this method in Chapter 10.
This chapter is filled with many queries, but there are so many possibilities for querying with LINQ or Entity SQL that you will certainly benefit from checking these other great resources:
- MSDN’s 101 C# LINQ Samples
- MSDN’s 101 Visual Basic LINQ Samples
- MSDN’s Entity Framework Query Samples
There are also a number of excellent books that are focused on LINQ or that contain LINQ content. Some that I recommend are LINQ Pocket Reference by Joseph Albahari and Ben Albahari (O’Reilly), LINQ in Action by Fabrice Marguerie et al. (Manning Press), and Essential LINQ by Charlie Calvert and Dinesh Kulkarni (Addison-Wesley).
In this chapter, you have learned a variety of ways to use LINQ to
Entities to express more complicated queries. You have used projections,
queried across navigations, and learned how to group. You have also
learned about various ways to load related data, whether through
returning shaped results with the Include
method, retrieving related data after
the fact with lazy loading or explicitly calling a Load
method. With LINQ to Entities, Entity
SQL, Object Services, and EntityClient
, the Entity Framework provides
myriad possibilities for querying data and shaping results. In the next
chapter you will see how many of the queries written in this chapter can
be written with Entity SQL.
Although it would take a few hundred more pages to ensure that you have seen an example of almost any type of query you may want to write, these past two chapters should leave you very prepared to venture forth.
In Chapter 6, you will learn about updating the data you have queried and taking advantage of stored procedures. Then, beginning with Chapter 9, you will start to write some small applications and be able to leverage many of these types of queries.
Get Programming Entity Framework, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.