Chapter 1. Connecting to Data

1.0. Introduction

This chapter shows how to connect to a variety of data sources from ADO.NET; how to handle security-related issues including storing connection strings and using different authentication methods; and how to set up, monitor, and optimize connection pooling.

ADO.NET Overview

ADO.NET is the part of the .NET Framework that connects applications to data sources and lets you retrieve and update the contained data. ADO.NET supports a variety of different data sources, including relational databases such as Microsoft SQL Server, Oracle, and Microsoft Access, as well as other data sources such as Microsoft Excel, Outlook, and text files.

A .NET Framework data provider is used to connect to a data source, execute commands, and retrieve results. The .NET Framework ships with the data providers shown in Table 1-1.

Table 1-1. Data providers included in the .NET Framework

.NET Framework data provider

Data source access

SQL Server

Microsoft SQL Server version 7.0 or later

OLE DB

Data sources using OLE DB

ODBC

Data sources using ODBC

Oracle

Oracle client software version 8.1.7 or later

Other providers are also available; for example, Oracle has developed its own .NET data provider. Data providers also exist for databases such as Sybase and MySQL. Database-specific providers usually access the underlying data store directly and offer the best performance, broadest functionality, and support for database-specific features. Since a data provider needs only to implement a core set of standard interfaces, the capabilities and performance of data providers for the same data source can differ significantly.

In addition to database-specific providers, the OLE DB .NET data provider allows access to most OLE DB data sources through OLE DB providers. Similarly, the ODBC .NET data provider uses the ODBC drivers to access most ODBC data sources. You can also develop your own data provider to access proprietary data sources or to meet special requirements.

ADO.NET is fundamentally different from ADO despite sharing a similar name. ADO.NET is based on a disconnected architecture with tight XML integration and is designed specifically to facilitate development of loosely coupled solutions.

ADO.NET code is forward-compatible—ADO.NET code written using .NET Framework 1.1 or later will run on later versions of the .NET Framework.

ADO.NET has both connected and disconnected classes. The connected classes let you retrieve and update data in underlying data sources. The disconnected classes let you access and manipulate offline the data you retrieved using the connected classes and later synchronize it with the underlying data source using the connected class.

Each data provider is responsible for implementing the connected classes. A brief description of each follows:

Connection

A unique session with the data source. A Connection specifies necessary authentication information needed to connect to a data source. The Connection object is specific to the type of data source—for example, the .NET Framework data provider for SQL Server includes the SqlConnection class.

Command

Issues database commands against the data source using an established Connection. The CommandText property of the Command class contains the SQL statement, stored procedure name, or table name executed at the data source. The Command object is specific to the type of data source—for example, the .NET Framework data provider for SQL Server includes the SqlCommand object.

DataReader

Retrieves a forward-only, read-only data stream from a data source. The DataReader object is specific to the type of data source—for example, the .NET Framework data provider for SQL Server includes the SqlDataReader object.

DataAdapter

Bridges the connected classes with the disconnected classes by retrieving data from a data source and filling a (disconnected) DataSet. The DataAdapter also updates the data source with changes made to a disconnected DataSet. The DataAdapter uses the Connection object to connect the data source and up to four Command objects to retrieve data from and resolve changes (i.e., update, insert, and delete rows) to the data source. The DataAdapter object is specific to the type of data source—for example, the .NET Framework data provider for SQL Server includes the SqlDataAdapter object.

The disconnected classes are part of the ADO.NET classes in the .NET Framework. They provide a consistent programming model regardless of the data source or data provider. The disconnected classes include:

DataSet

An in-memory cache of data retrieved from the data source. The DataSet exhibits similar properties to an in-memory relational database—for example, data is organized into multiple tables using DataTable objects, tables can be related using DataRelation objects, and data integrity can be enforced using the constraint objects UniqueConstraint and ForeignKeyConstraint.

The DataSet retains no information about the source of the data used to fill it with data. It maintains both current and original versions of data allowing the data source to be updated with changes at some future time. Disconnected data classes (DataSet and DataTable) are serializable. This supports transport-independent marshaling between application tiers and across a distributed application. You can also use these classes to persist data independently of a database.

DataTable

A single table of in-memory data that can exist independently or as part of a col-lection of DataTable objects in a DataSet.

DataColumn

The schema of a column in a DataTable.

DataRow

A row of data in the DataTable.

DataView

A data-bindable view of a DataTable used for custom sorting, filtering, searching, editing, and navigation.

DataRelation

A parent/child relationship between two DataTable objects in a DataSet.

Constraint

A constraint on one or more columns in a DataTable used to maintain data integrity. A constraint is either a UniqueConstraint that ensures that a column or collection of Column objects are unique within a DataTable or a ForeignKeyConstraint that represents an action restriction on one or more columns in a relationship in a DataTable when a value or row is either updated or deleted.

ADO.NET and XML converge in .NET. You can save the DataSet as an XML document, or fill it from an XML document. You can access and modify data simultaneously using both the DataSet classes and XML classes.

Connections, Connection Strings, and Connection Pooling

Database connections are a critical and limited resource. Connections must be managed to ensure that an application performs well and is scalable. SQL Server and Oracle data providers provide connection pooling, while the OLE DB and ODBC providers use the pooling provided by OLE DB or ODBC, respectively.

Connections should be opened as late as possible and closed as soon as possible using the Close()method. Alternatively, you can create the connection in a using block to ensure that the system disposes of the connection when the code exits the block. The connection should be used as briefly as possible, meaning that connections should not last longer than a method call. Connections should not be passed between methods—in addition to creating performance problems and limiting scalability, this can lead to security vulnerabilities.

Data providers use a connection string containing a collection of attribute/value pairs to establish the connection with the database. You specify connection strings using ConnectionString property of a Connection object. The DBConnectionStringBuilder class or the strongly typed version for each specific .NET data provider is used to build connection strings programmatically.

Get ADO.NET 3.5 Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.