Chapter 2. Indexing

When you are indexing with Ferret, you need to know about the following classes:

  • Ferret::Store::Directory

  • Ferret::Index::FieldInfo

  • Ferret::Index::FieldInfos

  • Ferret::Field

  • Ferret::Document

  • Ferret::Index::IndexWriter

  • Ferret::Index::IndexReader

We will discuss each of these classes in this chapter. We’ll begin with index storage, looking at the two implementations of the Directory class: RAMDirectory and FSDirectory. After that, we’ll look at the building blocks of the Ferret index, the Field and Document classes, and we’ll discuss document and field boosting. We’ll then look at setting up the index. This involves using the FieldInfo and FieldInfos classes and is probably the most important topic in this chapter. Finally, we’ll discuss how the actual indexing process works, at which time you’ll learn when to use the IndexReader and IndexWriter classes. Readers should pay special attention to the “Index Locking and Concurrency Issues” section in Chapter 3, as it seems to be the biggest problem area for new Ferret users.

Index Storage

Ferret indexes are stored in a Ferret::Store::Directory. Directory is an abstract class that specifies how an index should be stored. Ferret comes packaged with two implementations of the Directory class: Ferret::Store::FSDirectory for storing the index on the filesystem, and Ferret::Store::RAMDirectory for storing an index in memory. You’ve used both of these classes already: RAMDirectory in Example 1-1 and ...
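To make the idea of an abstract storage class with two interchangeable implementations concrete, here is a minimal sketch in plain Ruby. It is not Ferret’s implementation and does not use the Ferret API: the names SketchDirectory, SketchRAMDirectory, SketchFSDirectory, and store_segment are hypothetical, chosen only to mirror the roles of Directory, RAMDirectory, and FSDirectory described above.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical abstract base class: it defines the storage contract only.
class SketchDirectory
  def write(name, data)
    raise NotImplementedError, "#{self.class} must implement write"
  end

  def read(name)
    raise NotImplementedError, "#{self.class} must implement read"
  end

  def exists?(name)
    raise NotImplementedError, "#{self.class} must implement exists?"
  end
end

# In-memory storage: the contents live only as long as the object does.
class SketchRAMDirectory < SketchDirectory
  def initialize
    @files = {}
  end

  def write(name, data)
    @files[name] = data.dup
  end

  def read(name)
    @files.fetch(name)
  end

  def exists?(name)
    @files.key?(name)
  end
end

# Filesystem storage: the contents persist under the given path.
class SketchFSDirectory < SketchDirectory
  def initialize(path)
    @path = path
    FileUtils.mkdir_p(@path)
  end

  def write(name, data)
    File.binwrite(File.join(@path, name), data)
  end

  def read(name)
    File.binread(File.join(@path, name))
  end

  def exists?(name)
    File.exist?(File.join(@path, name))
  end
end

# Code that stores data can work against either implementation,
# because it relies only on the shared contract.
def store_segment(directory, name, data)
  directory.write(name, data)
  directory.read(name)
end
```

The point of the design is that calling code such as store_segment never needs to know where the data ends up. Swapping the in-memory class for the filesystem class changes persistence, not the calling code.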
