In this chapter, we move from creating Java source files to creating Java objects. In Chapter 3, you built a framework of objects (compiled source files) that represented your constraints. However, this framework isn’t particularly useful on its own. Just as a DTD isn’t of much use without XML, generated classes aren’t any good without instance data. We take the next logical step in this chapter and work on taking an XML document and generating instance data.
I start out by walking you through the process flow for unmarshalling, which is the technical term for converting an XML document into Java object instances. This will give you the same background as the class generation process flow section did and prepare you to work through the rest of the chapter. From there on, it’s all working code. First, I discuss creating instance documents, XML documents that conform to your constraint set. Once you’ve got your data represented in that format, you’re ready to convert the XML into Java; the result is instances of the classes you generated in the last chapter. Finally, I cover how to take this data, in Java format, and use it within your application. You’ll want to have your XML editor and Java IDE fired up because there is a lot of code in this chapter; let’s get to it.
As in the case of class generation, I want to spend a little time walking through the process flow of unmarshalling XML data into Java objects. This is useful in understanding exactly what happens when you invoke that unmarshal( ) method (or whatever it’s called with your framework). Rather than relying on a black box process, you’ll be able to know exactly what goes on, troubleshoot oddities in your applications, and maybe even help out the framework programmers with a bug here and there.
1. Construct XML data to unmarshal into Java objects.
2. Convert the XML data into instances of generated Java objects.
3. Use the resultant Java object instances.
Each step is detailed here.
First, you need to have some XML data to start with. This probably isn’t any great revelation to you, but it’s worth taking a look at. You’ll need an XML document that matches up with the constraints designed in the class generation process. Additionally, this document must be valid with respect to those constraints. Valid means that the structure and data in the document fulfill the data contract set out by your DTD. I talk in detail about how to validate your documents both before and during data binding later on in this chapter.
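As a rough illustration of what "valid with respect to a DTD" means in practice, the sketch below uses the standard JAXP parser with validation switched on and collects any validity errors it reports. The class name DtdValidityCheck and its validate method are hypothetical helpers, not part of any data binding framework, and the sketch assumes the document names its DTD in a DOCTYPE declaration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: not part of JAXB, Zeus, or any other framework.
class DtdValidityCheck {

    /** Parses the XML text with DTD validation on and returns the validity errors found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Ask the parser to check the document against the DTD in its DOCTYPE
        factory.setValidating(true);

        final List<String> errors = new ArrayList<>();
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // Warnings do not affect validity, so they are ignored here
            }
            public void error(SAXParseException e) {
                // Validity violations are reported as recoverable errors
                errors.add(e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                // The document is not even well-formed, so give up
                throw e;
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

An empty list means the parser found the document valid; anything else is a message describing where the document breaks the contract set out by the DTD.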
There’s not a lot of complexity in this step, so I won’t dwell on it. There are certainly some subtle issues to work through in ensuring that the data in your XML document correctly maps to where it belongs in your Java classes, and I cover that in the more detailed sections of the chapter. For now, though, as long as you’ve got an XML document and have a set of generated classes from the document’s DTD, you’re ready to roll.
The guts of the unmarshalling process is the conversion from XML to Java. This is where the most interesting action takes place in any framework. However, it’s also the place where the process itself varies the most between frameworks. While the starting point (an XML document) and ending point (Java object instances) are the same, the “in-between” is not. Still, basic principles that are important to understand are at work, and these basics apply to all frameworks.
First, you’ll need to convert your XML data into some form of an input stream (usually an InputStream or Reader in Java parlance). This may seem too simple to be worth mentioning, but it turns out to be an important point. It’s a common misconception to think about data binding as a process that takes an XML file and converts it to Java instance data. However, it’s just as likely that the XML data come from a network stream, email message, or some other medium entirely, as opposed to a static file on a hard drive. This opens up all sorts of possibilities and also allows you to think a bit outside of the box. Consider taking a SOAP message, the response to a questionnaire, or an XML shipping manifest, all from a third party. Instead of having to write SAX or DOM code to deal with this information, data binding allows a simple means of interacting with this business data in a business way, which is a very handy option to have available.
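To make the point about streams concrete, here is a minimal sketch showing that XML held in memory, stored in a file, or fetched from a URI all end up as the same InputStream type. The class XmlSources and its methods are hypothetical names chosen for illustration; readAll simply stands in for whatever consumes the stream, such as a framework’s unmarshal step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class: the names here are illustrative only.
class XmlSources {

    /** XML already held in memory, such as the body of a message. */
    static InputStream fromString(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    /** XML stored in a file on disk. */
    static InputStream fromFile(Path path) throws IOException {
        return Files.newInputStream(path);
    }

    /** XML located by a URI, for example a document served over HTTP. */
    static InputStream fromUri(URI uri) throws IOException {
        return uri.toURL().openStream();
    }

    /** Stand-in for the consumer: it sees only a stream, never the source. */
    static String readAll(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Because the consumer accepts only an InputStream, the code that obtains the data can change without touching the code that processes it.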
The actual object that the unmarshal( ) method is invoked on is where variance begins to creep in. For example, using JAXB, generated classes are all concrete; to unmarshal an object, you will have code like this:
    // Get the input stream for the XML
    InputStream inputStream = getXMLInputStream();

    // Unmarshal into an object
    Movies moviesObject = Movies.unmarshal(inputStream);

    // Operate on the instance data
This approach would seem to create a problem with Zeus, though, since Zeus generates interfaces rather than concrete classes. Because unmarshal( ) must be a static method (you don’t have instance data yet, so you can’t work on an instance), it can exist only on the implementation. To get around this issue, Zeus generates an additional class, called [top-level-object]Unmarshaller. Since movies is the top-level object in the movie database XML, this would be MoviesUnmarshaller. Invoke the unmarshal( ) method on this class like this:
    // Get the input stream for the XML
    InputStream inputStream = getXMLInputStream();

    // Unmarshal into an object
    Movies movieObject = MoviesUnmarshaller.unmarshal(inputStream);

    // Operate on instance data
You’ll see similar variances in other frameworks. In all cases, you should get a Java Object back from this method, which is the top-level Java object instance. Depending on the framework, you may have to cast this object to the expected type, as shown here:
    // Get the input stream for the XML
    InputStream inputStream = getXMLInputStream();

    // Unmarshal into an object
    Movies movieObject = (Movies)Unmarshaller.unmarshal(inputStream);

    // Operate on instance data
Still, while these approaches may vary, the basic result is the same: a Java object instance that you can then use to access the XML data without having to work in XML.
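The interface-plus-unmarshaller arrangement described for Zeus can be sketched by hand. The code below is not what any framework generates: TitleList, TitleListImpl, and TitleListUnmarshaller are hypothetical names, the XML shape (a root element containing title elements) is an assumption, and the conversion is done with plain DOM calls purely to show where the static unmarshal( ) method lives and what it returns.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical interface that application code programs against.
interface TitleList {
    List<String> getTitles();
}

// Hypothetical implementation class holding the instance data.
class TitleListImpl implements TitleList {
    private final List<String> titles = new ArrayList<>();

    public List<String> getTitles() {
        return titles;
    }
}

// Hypothetical companion class: the static entry point lives here because
// no instance exists until the XML has been read.
class TitleListUnmarshaller {

    static TitleList unmarshal(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);

        TitleListImpl result = new TitleListImpl();
        // Assumed document shape: <title> elements somewhere under the root
        NodeList nodes = doc.getDocumentElement().getElementsByTagName("title");
        for (int i = 0; i < nodes.getLength(); i++) {
            result.getTitles().add(nodes.item(i).getTextContent());
        }
        // Callers see only the interface type, never the implementation
        return result;
    }
}
```

Calling TitleListUnmarshaller.unmarshal(inputStream) hands back a TitleList, so the calling code never needs to know which implementation class was instantiated.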
Once you’ve performed unmarshalling, you’re left with a set of result object instances. The returned value from the unmarshalling process, as I already mentioned, is the top-level instance of the unmarshalled XML document. This is going to be an instance of the object that corresponds with the root element of your XML document. It’s going to have any references to member objects, as well. Thus, for the movies database shown in the last chapter (Example 3-2), you would end up with an object tree like that shown in Figure 4-1.
Other than understanding this structure, there’s not much else to these result objects. In fact, that’s what is worth emphasizing here: these result objects are normal, ordinary Java object instances. There aren’t any special instructions to use them, gotchas to worry about, or other pitfalls.
Use these objects as you would any others, and don’t worry about them being data bound. And with that (lack of) admonition, you’ve got a handle on the unmarshalling process flow. Figure 4-2 illustrates the entire process.
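To show what "ordinary Java object instances" means in practice, the sketch below walks a small object tree with nothing but getters and a loop. The classes Film and FilmCatalog are hypothetical stand-ins for generated classes, and their members (a title and a cast list) are assumptions made for illustration; the classes your framework generates will have their own names and accessors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated member class.
class Film {
    private final String title;
    private final List<String> cast = new ArrayList<>();

    Film(String title) {
        this.title = title;
    }

    String getTitle() {
        return title;
    }

    List<String> getCast() {
        return cast;
    }
}

// Hypothetical stand-in for the generated top-level class.
class FilmCatalog {
    private final List<Film> films = new ArrayList<>();

    List<Film> getFilmList() {
        return films;
    }
}

// Application code: no XML APIs are involved at this point.
class CatalogReport {

    /** Returns the titles of all films whose cast includes the given actor. */
    static List<String> titlesFeaturing(FilmCatalog catalog, String actor) {
        List<String> titles = new ArrayList<>();
        for (Film film : catalog.getFilmList()) {
            if (film.getCast().contains(actor)) {
                titles.add(film.getTitle());
            }
        }
        return titles;
    }
}
```

Nothing in CatalogReport knows or cares that the catalog came from an XML document, which is the practical benefit of working with the unmarshalled objects.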