O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Graying MySQL, and MySQL learns a second language (early conference report )

   Print.Print
Email.Email weblog link
Blog this.Blog this

Andy Oram
Apr. 19, 2005 06:27 AM
Permalink

Atom feed for this author. RSS 1.0 feed for this author. RSS 2.0 feed for this author.

URL: http://www.mysqluc.com/...

MySQL is graying in several metaphorical ways. Of course, it is simply getting older--aren't we all? But it is by no means over the hill. More significantly, its adherents are getting less colorful and reflect instead the grayness of the corporate settings it is conquering. Finally, MySQL is graying the distinctions that separated it from Oracle and other heavy-duty database engines. MySQL, in short, is becoming conventional.

The early achievements of this disruptive technology were to bring a high-performance relational database down from the top shelf where only those of means could afford it, and put it in the hands of students, enterpreneurs working out of their homes, and modest web site developers. This was a revolution dubbed situated software by Clay Shirky. Although MySQL was already being used by sites that could afford more expensive databases (and the computer systems and expert administrators who came in tow), these did not drive its initial popularity.

Now MySQL AB has built a formidable marketing machine and carried their product into the database mainstream, following a path similar to Linux. Their trappings are starting to evince familiar themes. They have salespeople in at least a dozen cities around North America. Their new support and update mechanism, MySQL Network, reminds me of a similarly named support system from Red Hat. MySQL's development of an online FAQ called a Knowledge Base, and the slogan "MySQL Everywhere" plastered all around this conference, are reminiscent of another large software vendor.

But MySQL AB has not forgotten the little guys who want a DBMS that runs lean and fast, with near-zero administration. These users will probably continue to be its largest base. Significantly, under the conventional trappings I mentioned, I believe MySQL AB is still structured in a fundamentally different way from a conventional propriety vendor, and is still behaving like a network of brilliant independent software developers. They have always listened closely to their users--you can see that at their conferences, where dozens of developers turn up in distinctive shirts and attract flocks of petitioners for new features--but they now are listening to paying customers in the same intense, investigative manner.

For instance, I saw one of their leading engineers walk around an evening reception recruiting representatives from international customers to sit in on a session about internationalization, just so he could hear their perspective on some problems he had been told by other customers.

It takes a certain financial and time commitment to attend a conference, so for those who pony up the money to do so, the theme at this one is "bigger and better." Sessions on Java interfaces, clustering, scaling, high availability, and replication decorate the calendar for the next few days. One panel is even called "Challenges in the Enterprise."

And what are the newest features MySQL is pushing hardest? There are no breakthroughs here (and I wouldn't expect any, because relational databases are a mature area in a research sense). The announcements focus on things that competitors have had for years: stored procedures, triggers, views. MySQL is not leading the conquest of new territories. Rather, MySQL is catching up. That's something they're proud of, and rightfully so.

I attended one session last night on a feature that will implement a tiny snippet of the SQL standard, XPath support. In effect, MySQL, which has always understood the SQL language, is learning a second language--not a natural language (although MySQL offers increasing support for character sets and other internationalization features) but the complex world of XPath.

I find this feature an odd way to support XML. Most XML users carry out XML/database interaction by using Java or some other programming tool to break down the XML into constituent pieces of test and store those pieces in a database structure that mirrors the XML. But SQL's XPath support buries the XML without alteration into a field in the database.

The idea of XPath support in the database is that you start by storing a string such as <p>Why do <em>you</em> want to represent <em>structured text</em>?</p> bodily in a text column. This text column can be any standard text datatype in SQL (although MySQL will add a special XML tag eventually, to support validation and some optimizations).

In itself, this doesn't help deal with XML. But MySQL will also provide a couple functions such as ExtractValue and UpdateXML that manipulate the XML with XPath queries. You could tell it to extract or change, for example, the second <em> entity in the string just shown. Full text searches can reduce the time it takes to search large collections of XML by two orders of magnitude, in comparison to database queries without indexes.

The design of the XPath support is oddly disconnected from the traditional structures of a relational database. As already shown, the storage model jams all the XML into a single column, so that the XML structure is handled independently from the schema of the table. Furthermore, an XPath query that returns multiple strings from different parts of the XML document concatenates them together, space-separated, in a single row. I would have expected them to be granted individual rows in the results.

There are many uses for XPath support in a database. One could extract and display all the titles of different documents. One could run a traditional SELECT to retrieve data from other columns or tables and join it to XML content. One could find everything within <price> tags and let the database perform some calculations such as averaging. The more XML processing you can do in the database, the less data has to be sent over the wire to the client.

This new MySQL feature--not planned until 5.1 or even later--is probably less useful with data-crunching XML (which has many small pieces of text within multiple tags) than with documents, which are flatter and have a high ratio of content to tagging. However, one participant in last night's BOF suggested the feature could be applied to storing SOAP queries too.

MySQL's turn to the mainstream is being reciprocated by its intended audience. Attendance at yesterday's tutorials was impressive; a couple tutorials sold out, and the halls were filled with people at break time. Today's sessions and exhibitors will draw even more.

Andy Oram is an editor for O'Reilly Media, specializing in Linux and free software books, and a member of Computer Professionals for Social Responsibility. His web site is www.praxagora.com/andyo.

Comments on this weblog
Full Threads Oldest First

Showing messages 1 through 9 of 9.

  • XML support makes sense
    2005-04-20 23:25:04  rpbourret [View]

    Storing XML in a single field whose data type is XML makes sense, just as breaking it into relational pieces makes sense. It really depends on the schema of the XML.

    Lots of data-centric XML maps easily to relational tables. On the other hand, document-centric XML does not map well to relational tables, nor does some data-centric XML, such as when documents can contain arbitrary XML.

    The real question is how the XML data type and XPath processor will be implemented. One assumes that the XML will be heavily indexed and XPath queries executed against those indexes -- parsing the XML documents at run time would be a performance disaster.

    For what it's worth, the other major relational databases (DB2, Oracle, Sybase, SQL Server) have all implemented / are implementing similar functionality, although they are supporting XQuery instead of just XPath.

    And even thought it would be nice for MySQL to natively transfer data between XML documents and relational tables, there are plenty of middleware products to do this.
  • MySQL is VHS to Postgresql's Beta though
    2005-04-19 07:25:17  Randal L. Schwartz | O'Reilly Author [View]

    I'm really sad to see so many people still pushing MySQL, which was definitely the right choice in 2000, and skipping over PostgreSQL, which has become the superior choice in the last few years. {sigh}


    Once again, because of marketing, the marketplace loses.

    • How Vague the "Superior"!
      2005-04-19 08:56:23  chromatic | O'Reilly AuthorO'Reilly Blogger [View]

      Once, I saw someone writing a small, single-user application to manage recipes. He used MySQL. "My word, man!" I said. "You need updatable views, triggers, transaction logs, and hot-swappable disks! What if by some quantum miracle you split into multiple users and try to update the database simultaneously? What if you sell this application to a bank sometime? Don't you care about the data integrity of customer financial transactions?"



      Sadly, he didn't listen.

    • John W. Adams photo MySQL is to DW what PostgreSQL is to OLTP
      2005-04-19 08:05:48  John W. Adams | O'Reilly Blogger [View]

      I agree with you fifty percent, merlyn. PostgreSQL is the clear choice over MySQL for any sort of OLTP. Only now is MySQL adding features which are needed for OLTP.


      However, those same features are of little or no value to MySQL as a DW platform--they're cruft. Why in the world would a properly designed data warehouse require triggers?


      I understand why every RDBMS is now adding such featurs--after all, Oracle has them, so everyone else needs them, too. I wouldn't use Oracle for a DW on a bet, though. It's not capable of scaling to a sufficient size for a large enterprise, and it's not a good vslue for a smaller business. I do understand why MySQL, partnering with SAP, might feel the need to match Oracle feature for feature.


      That's a shame. MySQL is built nicely for high performance in the DW space. Adding views was absolutely necessary, and even updatable views have their place. (Last year, I had to use updatable views to get around an incredibly stupid, incredibly popular etl product's inability to do what we needed.) But most of what MySQL has picked up in database features is unuseful for DW.

      • MySQL is to DW what PostgreSQL is to OLTP
        2005-04-19 11:52:20  danielhanks [View]

        "I wouldn't use Oracle for a DW on a bet, though. It's not capable of scaling to a sufficient size for a large enterprise".

        What???

        Can you elaborate on that statement? What do you mean by scale in that context?
        • John W. Adams photo Probably I'm showing my bias, or my limited experience
          2005-04-19 12:07:35  John W. Adams | O'Reilly Blogger [View]

          I've had the good (or bad) fortune to never work on a data warehouse of under a terabyte. If you want something really big like, oh, Southwestern Bell has, you need DB2 or Teradata (SWB's choice).

          I don't work extensively with Oracle, but in my limited experience, it runs out of horsepower somewhere near a terabyte. Maybe the newer parallelism features will fix this.

          One session I really wanted to attend this year was Multi-Terabyte Data Warehouse and MySQL. Having seen the challenges involved in getting systems of that size to perform efficiently, and thus being a skeptic when someone claims they can do so, I'd really like to hear what he's got to say.
          • Probably I'm showing my bias, or my limited experience
            2005-04-19 13:45:18  danielhanks [View]

            I see. Where I work we have an 11T data warehouse running on Oracle. I don't work directly with it, but it does provide daily reports, so it must be working ok enough.

            Just my $0.02.
            • John W. Adams photo Is this a single installation, or do you use datalink to go from machine to machine?
              2005-04-19 14:20:28  John W. Adams | O'Reilly Blogger [View]

              I can't argue with a working installation, and my Oracle skills are rusty at best.


              But I'm curious--can a user open up SQLPlus and (assuming appropriate privileges) join two arbitrary tables without intervention by an administrator? If not, by my lights it's not a data warehouse.

              • Is this a single installation, or do you use datalink to go from machine to machine?
                2005-12-16 11:00:21  Tabooyah [View]

                I worked as a business intelligence analyst for a number of years where one of my primary duties was to perform not just 2 joins, but up to a dozen joins in order to collect data and format it into meaningful reports. Oracle has a number of great features that allow you to perform adhoc queries against million and billion-row tables very quickly. One such feature is materialized views which allows you to precompute dimensional summaries much like a MOLAP tool and store them in the database which can be access and just like a normal view. Another nifty Oracle feature is table partitioning which allows localized indexes and logic data grouping which cut query times down significantly. Analytic workspaces are another great Oracle DW feature which require more heavy lifting at the application layer.

                Anyone claming that Oracle isn't a well-suited DW application obviously has little experience with the product or isn't using it correctly.

Showing messages 1 through 9 of 9.

Return to weblogs.oreilly.com.



Weblog authors are solely responsible for the content and accuracy of their weblogs, including opinions they express, and O'Reilly Media, Inc., disclaims any and all liabililty for that content, its accuracy, and opinions it may contain.

Creative Commons License This work is licensed under a Creative Commons License.



Advertisement
Sign up today to receive special discounts,
product alerts, and news from O'Reilly.
Privacy Policy >
View Sample Newsletter >
  • Youtube
  • http://www.youtube.com/OreillyMedia
  • Twitter
  • Subscribe
  • View All RSS Feeds >
O'Reilly Media

800-889-8969 or 707-827-7019
Monday-Friday 7:30am-5pm PT
©2011, O'Reilly Media, Inc.
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
  • About O'Reilly
  • Academic Solutions
  • Contacts
  • Customer Service
  • Careers
  • Press Room
  • Privacy Policy
  • Terms of Service
  • Writing for O'Reilly
  • Community
  • Authors
  • Forums
  • Membership
  • Newsletters
  • RSS Feeds
  • User Groups
  • Partner Sites
  • makezine.com
  • makerfaire.com
  • craftzine.com
  • igniteshow.com
  • PayPal Developer Zone
  • O'Reilly Insights on Forbes.com