Chapter 3. Data Mining Concepts and DMX

This chapter provides a detailed overview of Data Mining Extensions to SQL (DMX), the query language Microsoft created for performing data mining operations. To explain what DMX is, why it was created, and how it is used to represent and operate a data mining system, the chapter identifies the parts you use to analyze a problem, on both the conceptual level and the object level. It also discusses the process that any data mining implementation must follow and presents the language constructs (along with many tips and tricks) required for that implementation.

History of DMX

DMX was first introduced in the OLE DB for Data Mining specification authored by Microsoft in conjunction with other vendors in 1999. The goal of this specification was to create a vendor-neutral, programmable interface that leverages concepts already known to the people most able to take advantage of such interfaces—the programmers.

At the time, the target programmer was a database programmer who did application programming in Visual Basic. The prevailing data interface was ActiveX Data Objects (ADO), a front end for OLE DB, and the standard language was Structured Query Language (SQL). The OLAP Services team (where SQL Server Data Mining started) decided to capitalize on this developer knowledge by using OLE DB as the application programming interface (API) and creating a query language as close to SQL as possible, while still ...
