$fp = fopen('fixed-width-records.txt','r') or die ("can't open file");
while ($s = fgets($fp,1024)) {
    $fields[1] = substr($s,0,10);  // first field: first 10 characters of the line
    $fields[2] = substr($s,10,5);  // second field: next 5 characters of the line
    $fields[3] = substr($s,15,12); // third field: next 12 characters of the line
    // a function to do something with the fields
    process_fields($fields);
}
fclose($fp) or die("can't close file");
$fp = fopen('fixed-width-records.txt','r') or die ("can't open file");
while ($s = fgets($fp,1024)) {
    // an associative array with keys "title", "author", and "publication_year"
    $fields = unpack('A25title/A14author/A4publication_year',$s);
    // a function to do something with the fields
    process_fields($fields);
}
fclose($fp) or die("can't close file");
Data in which each field is allotted a fixed number of characters per line may look like this list of book titles, authors, and publication years:
$booklist=<<<END
Elmer Gantry             Sinclair Lewis1927
The Scarlatti InheritanceRobert Ludlum 1971
The Parsifal Mosaic      Robert Ludlum 1982
Sophie's Choice          William Styron1979
END;
In each line, the title occupies the first 25 characters, the author’s name the next 14 characters, and the publication year the next 4 characters. Knowing those field widths, it’s straightforward to use substr( ) to parse the fields into an array:
$books = explode("\n",$booklist);
for($i = 0, $j = count($books); $i < $j; $i++) {
    $book_array[$i]['title'] = substr($books[$i],0,25);
    $book_array[$i]['author'] = substr($books[$i],25,14);
    $book_array[$i]['publication_year'] = substr($books[$i],39,4);
}
Exploding $booklist into an array of lines makes the looping code the same whether it’s operating over a string or a series of lines read in from a file.
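To illustrate that point, here is a minimal sketch that produces the same array of lines from a file as from a string. The temporary file and the use of str_pad( ) to build the sample records are assumptions made only for this demonstration; file( ) is called with FILE_IGNORE_NEW_LINES so that each element matches what explode( ) returns.

```php
<?php
// Sample records; str_pad() just makes the column widths explicit.
$lines_in = array(
    str_pad('Elmer Gantry', 25) . str_pad('Sinclair Lewis', 14) . '1927',
    str_pad('The Parsifal Mosaic', 25) . str_pad('Robert Ludlum', 14) . '1982',
);
$booklist = implode("\n", $lines_in);

// Source 1: a string split into lines.
$from_string = explode("\n", $booklist);

// Source 2: the same data read back from a temporary, demonstration-only file.
$path = tempnam(sys_get_temp_dir(), 'fw');
file_put_contents($path, $booklist . "\n");
$from_file = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

// Both arrays hold identical lines, so one parsing loop can serve either.
var_dump($from_string === $from_file);
```

Without FILE_IGNORE_NEW_LINES, each element would keep its trailing newline, which matters if the last field runs to the end of the line.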
The loop can be made more flexible by specifying the field names and widths in a separate array that can be passed to a parsing function, as shown in the pc_fixed_width_substr( ) function in Example 1-3.
Example 1-3. pc_fixed_width_substr( )
function pc_fixed_width_substr($fields,$data) {
    $r = array();
    for ($i = 0, $j = count($data); $i < $j; $i++) {
        $line_pos = 0;
        foreach($fields as $field_name => $field_length) {
            $r[$i][$field_name] = rtrim(substr($data[$i],$line_pos,$field_length));
            $line_pos += $field_length;
        }
    }
    return $r;
}

$book_fields = array('title' => 25,
                     'author' => 14,
                     'publication_year' => 4);

$book_array = pc_fixed_width_substr($book_fields,$books);
The variable $line_pos keeps track of the start of each field, and is advanced by the previous field’s width as the code moves through each line. Use rtrim( ) to remove trailing whitespace from each field.
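The running-offset idea can be sketched on its own. The helper below, field_offsets( ), is a hypothetical name introduced only for illustration; it is not part of the recipe. It records where each field starts, and the last two lines show the difference rtrim( ) makes to one extracted field.

```php
<?php
// Hypothetical helper: map each field name to its starting offset in the line.
function field_offsets(array $fields) {
    $offsets = array();
    $line_pos = 0;
    foreach ($fields as $name => $width) {
        $offsets[$name] = $line_pos; // this field starts where the previous one ended
        $line_pos += $width;
    }
    return $offsets;
}

$book_fields = array('title' => 25, 'author' => 14, 'publication_year' => 4);
print_r(field_offsets($book_fields));
// title starts at 0, author at 25, publication_year at 39

$line = str_pad('Elmer Gantry', 25) . 'Sinclair Lewis' . '1927';
var_dump(substr($line, 0, 25));        // the title padded out to 25 characters
var_dump(rtrim(substr($line, 0, 25))); // "Elmer Gantry"
```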
You can use unpack( ) as a substitute for substr( ) to extract fields. Instead of specifying the field names and widths as an associative array, create a format string for unpack( ). A fixed-width field extractor using unpack( ) looks like the pc_fixed_width_unpack( ) function shown in Example 1-4.
Example 1-4. pc_fixed_width_unpack( )
function pc_fixed_width_unpack($format_string,$data) {
    $r = array();
    for ($i = 0, $j = count($data); $i < $j; $i++) {
        $r[$i] = unpack($format_string,$data[$i]);
    }
    return $r;
}

$book_array = pc_fixed_width_unpack('A25title/A14author/A4publication_year',
                                    $books);
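If the field widths already live in an associative array of names and widths, the format string can be generated rather than typed by hand. This is a sketch only: widths_to_unpack_format( ) is a hypothetical helper, and it assumes every field is a space-padded string whose name does not begin with a digit (a leading digit would be read as part of the width).

```php
<?php
// Hypothetical helper: turn array('title' => 25, ...) into 'A25title/...'.
function widths_to_unpack_format(array $fields) {
    $parts = array();
    foreach ($fields as $name => $width) {
        $parts[] = 'A' . $width . $name; // format code, width, then the result key
    }
    return implode('/', $parts);
}

$book_fields = array('title' => 25, 'author' => 14, 'publication_year' => 4);
$format = widths_to_unpack_format($book_fields);
print $format . "\n"; // A25title/A14author/A4publication_year

// Parse one sample line with the generated format string.
$line = str_pad('Elmer Gantry', 25) . 'Sinclair Lewis' . '1927';
$fields = unpack($format, $line);
print_r($fields);
```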
Because the A format to unpack( ) means “space-padded string,” there’s no need to rtrim( ) off the trailing spaces.
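A small comparison makes the effect of the A format visible. The sketch below unpacks the same line twice, once with A and once with the lowercase a format, which leaves trailing spaces in place; the sample line is built with str_pad( ) purely for the demonstration.

```php
<?php
$line = str_pad('The Parsifal Mosaic', 25) . str_pad('Robert Ludlum', 14) . '1982';

// "A": trailing padding is stripped while unpacking.
$stripped = unpack('A25title/A14author', $line);

// "a": the field comes back at its full width, padding included.
$raw = unpack('a25title/a14author', $line);

var_dump($stripped['author']); // "Robert Ludlum", 13 characters
var_dump($raw['author']);      // "Robert Ludlum ", 14 characters with the trailing space
```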
Once the fields have been parsed into $book_array by either function, the data can be printed as an HTML table, for example:
$book_array = pc_fixed_width_unpack('A25title/A14author/A4publication_year',
                                    $books);
print "<table>\n";
// print a header row
print '<tr><td>';
print join('</td><td>',array_keys($book_array[0]));
print "</td></tr>\n";
// print each data row
foreach ($book_array as $row) {
    print '<tr><td>';
    print join('</td><td>',array_values($row));
    print "</td></tr>\n";
}
// double quotes, so that \n is output as a newline rather than literally
print "</table>\n";
Joining data on </td><td> produces a table row that is missing its first <td> and last </td>. We produce a complete table row by printing out <tr><td> before the joined data and </td></tr> after the joined data.
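The intermediate strings are easy to inspect for a single row. The sketch below uses one hand-written record as sample data and builds the row as a string instead of printing it piece by piece; the technique is otherwise the same.

```php
<?php
$row = array('title' => 'Elmer Gantry',
             'author' => 'Sinclair Lewis',
             'publication_year' => '1927');

// The separator goes only *between* cells, so the outer tags are missing.
$joined = join('</td><td>', array_values($row));
print $joined . "\n";
// Elmer Gantry</td><td>Sinclair Lewis</td><td>1927

// Wrapping the joined data supplies the opening and closing tags.
$html_row = '<tr><td>' . $joined . '</td></tr>';
print $html_row . "\n";
// <tr><td>Elmer Gantry</td><td>Sinclair Lewis</td><td>1927</td></tr>
```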
substr( ) and unpack( ) are equally capable when the fixed-width fields are strings, but unpack( ) is the better solution when the fields hold other kinds of data, such as packed binary numbers.
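As a sketch of the non-string case, the record below is an invented layout (not taken from the book data): a 10-character name followed by a 16-bit and a 32-bit big-endian integer. pack( ) is used only to build the sample bytes; the field names are hypothetical.

```php
<?php
// Build a demonstration record: 10-byte padded name, 2-byte integer, 4-byte integer.
$record = pack('A10nN', 'widget', 258, 70000);
var_dump(strlen($record)); // 16 bytes in total

// unpack() decodes the binary fields into PHP integers in one step.
$fields = unpack('A10name/nquantity/Nserial', $record);
print_r($fields);

// substr() alone returns only the raw bytes of the 16-bit field.
var_dump(bin2hex(substr($record, 10, 2))); // "0102", the bytes of 258
```

With substr( ), each numeric field would still need a separate decoding step, which is the work the format codes n and N do here.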
For more information about unpack( ), see Recipe 1.14 and http://www.php.net/unpack; Recipe 4.9 discusses join( ).