Chapter 6. Ferret in Practice
In Chapter 1, you saw how to index all the text files under a directory. Unfortunately, most of the files in a filesystem aren’t text files, so let’s extend the indexer to handle some different file types. We also want to filter and sort files by different fields, such as their modification date. In this chapter, we’ll implement these extensions and more. We’ll call our application ferretfind.
Indexing Multiple Document Types
module FerretFind
  class Reader
    @@subclasses = []
    @@readers = {}

    def Reader.inherited(subclass)
      @@subclasses << subclass
    end

    def Reader.load_readers(field_infos)
      @@subclasses.each do |subclass|
        reader = subclass.new(field_infos)
        subclass::EXTENSIONS.each do |ext|
          @@readers[ext.downcase] = reader
        end
      end
    end

    def Reader.get_reader(path)
      @@readers[(File.extname(path)[/[^.]+/] || "").downcase]
    end

    def Reader.read(path)
      document = {
        :path => path,
        :accessed => File.atime(path).strftime("%Y%m%d"),
        :modified => File.mtime(path).strftime("%Y%m%d")
      }
      if File.readable?(path) and reader = Reader.get_reader(path)
        document.merge!(reader.read(path))
      end
      return document
    end

    protected
    def initialize(field_infos); end
    def read(path); {} end
    def add_field(field_infos, field, options)
      field_infos.add_field(field, options) unless field_infos[field]
    end
  end
end
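The :accessed and :modified fields in the listing above are stored as %Y%m%d strings. One likely reason, which is an assumption here because the text does not state it, is that fixed-width date strings of this form sort lexicographically in the same order as the dates themselves, so plain string comparison is enough for sorting and range filtering. The following sketch illustrates that property with made-up timestamps; it does not use Ferret at all.

```ruby
# Hypothetical timestamps, standing in for File.mtime results.
times = [Time.utc(2007, 3, 9), Time.utc(2006, 12, 25), Time.utc(2007, 1, 1)]

# Format them the same way Reader.read does.
stamps = times.map { |t| t.strftime("%Y%m%d") }

# Sorting the strings gives the same order as sorting the Time objects.
sorted_stamps = stamps.sort
chronological = times.sort.map { |t| t.strftime("%Y%m%d") }

# A date range becomes a simple string range check.
in_2007 = stamps.select { |s| s >= "20070101" && s <= "20071231" }

puts sorted_stamps.inspect
puts in_2007.inspect
```

Because every stamp has the same width and runs from the most significant unit (year) to the least (day), no date parsing is needed at comparison time.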
The FerretFind::Reader class is ...
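To make the registration mechanism concrete, here is a minimal sketch of the same pattern with the Ferret-specific parts removed. MiniReader, PlainTextReader, and the extension list are hypothetical names chosen for illustration and are not part of ferretfind. The field_infos argument is dropped so the example runs without Ferret installed.

```ruby
# Stripped-down stand-in for the registry pattern used by FerretFind::Reader.
# All class names and extensions here are hypothetical.
class MiniReader
  @@subclasses = []
  @@readers = {}

  # Ruby calls this hook each time a class inherits from MiniReader.
  def self.inherited(subclass)
    super
    @@subclasses << subclass
  end

  # Instantiate each registered subclass once and map its extensions to it.
  def self.load_readers
    @@subclasses.each do |subclass|
      reader = subclass.new
      subclass::EXTENSIONS.each { |ext| @@readers[ext.downcase] = reader }
    end
  end

  # Look up a reader by file extension; returns nil for unknown types.
  def self.get_reader(path)
    @@readers[(File.extname(path)[/[^.]+/] || "").downcase]
  end
end

# Defining the subclass is enough to register it; no explicit call is needed.
class PlainTextReader < MiniReader
  EXTENSIONS = %w[txt TEXT]

  def read(path)
    { :content => File.read(path) }
  end
end

MiniReader.load_readers

puts MiniReader.get_reader("notes.TXT").class
puts MiniReader.get_reader("README").inspect
```

The inherited hook fires when the subclass body is opened, before EXTENSIONS has been assigned. That is why the sketch, like the original listing, only records the subclass in the hook and defers reading EXTENSIONS until load_readers runs. Extensions are downcased on both registration and lookup, so matching is case-insensitive, and a path with no extension falls back to the empty string and finds no reader.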