Effective awk Programming, 3rd Edition

Chapter 3. Reading Input Files

In the typical awk program, all input is read either from the standard input (by default, this is the keyboard, but often it is a pipe from another command) or from files whose names you specify on the awk command line. If you specify input files, awk reads them in order, processing all the data from one before going on to the next. The name of the current input file can be found in the built-in variable FILENAME (see Section 6.5 in Chapter 6).
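As a minimal sketch of reading named files in order, the following commands create two small sample files and print each line together with the value of FILENAME. The file names first.txt and second.txt and their contents are made up for illustration.

```shell
# Create two sample files (hypothetical names) in a scratch directory.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/first.txt"
printf 'gamma\n' > "$dir/second.txt"

# awk reads first.txt completely before moving on to second.txt;
# FILENAME holds the name of the file the current line came from.
(cd "$dir" && awk '{ print FILENAME ": " $0 }' first.txt second.txt)
# first.txt: alpha
# first.txt: beta
# second.txt: gamma
```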

The input is read in units called records, and is processed by the rules of your program one record at a time. By default, each record is one line. Each record is automatically split into chunks called fields. This makes it more convenient for programs to work on the parts of a record.
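The following sketch illustrates the default splitting, assuming two made-up input lines piped in on standard input. Each line is one record; $1 and $2 refer to its first and second fields, and the built-in variable NF (not introduced in the text above) holds the number of fields in the current record.

```shell
# Two sample records, one per line, each with two whitespace-separated fields.
printf 'Amelia 555-5553\nBroderick 555-0542\n' |
awk '{ print "name=" $1 ", phone=" $2 ", fields=" NF }'
# name=Amelia, phone=555-5553, fields=2
# name=Broderick, phone=555-0542, fields=2
```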

On rare occasions, you may need to use the getline command. It is valuable both because it can read input explicitly from any number of files and because the files used with it do not have to be named on the awk command line (see Section 3.8 later in this chapter).
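As a rough sketch of that second point, the program below reads a file that is never listed as an input file on the awk command line. The file name extra.txt, the variable names extra and line, and the sample data are assumptions made for this example; the path is passed in with -v only so that the program can refer to it.

```shell
# Create a sample file (hypothetical name) that awk is not given as an input file.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/extra.txt"

# getline returns 1 for each record read, 0 at end of file, and -1 on error,
# so the loop runs only while a record was actually read.
awk -v extra="$dir/extra.txt" 'BEGIN {
    while ((getline line < extra) > 0)
        print "read: " line
    close(extra)
}'
# read: one
# read: two
```

Comparing the result with > 0, rather than testing it for truth, keeps the loop from spinning forever if the file cannot be opened.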

How Input Is Split into Records

The awk utility divides the input for your awk program into records and fields. awk keeps track of the number of records that have been read from the current input file. This value is stored in a built-in variable called FNR. It is reset to zero when a new file is started. Another built-in variable, NR, is the total number of input records read so far from all datafiles. It starts at zero, but is never automatically ...
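A small sketch can make the difference between the two counters concrete. It assumes two made-up files, one.txt and two.txt, each holding two one-line records, and prints both variables for every record.

```shell
# Create two sample files (hypothetical names), two records each.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\n' > "$dir/two.txt"

# FNR counts records within the current file; NR counts across all files.
awk '{ print "FNR=" FNR, "NR=" NR, $0 }' "$dir/one.txt" "$dir/two.txt"
# FNR=1 NR=1 a
# FNR=2 NR=2 b
# FNR=1 NR=3 c
# FNR=2 NR=4 d
```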

Effective awk Programming, 3rd Edition by Arnold Robbins
