awk Overview and Common Methods

Introduction to awk

awk is a text processing tool commonly used for data manipulation and generating reports.

The name awk is derived from the initials of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.

awk Working Mode

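awk runs the BEGIN block once before reading any input, then reads the input one record at a time (by default, one line), splits each record into fields, and runs the main block against it. After the last record, it runs the END block once.

[Figure: awk working mode]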

Syntax Format

There are two common forms of awk commands:

Based on file input

awk 'BEGIN{}pattern{commands}END{}' file_name

Based on standard input via pipe

standard_command | awk 'BEGIN{}pattern{commands}END{}'

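The syntax components are as follows: BEGIN{} runs once before any input is read; pattern selects which records the commands apply to (every record if the pattern is omitted); {commands} runs for each selected record; END{} runs once after all input has been read.

[Figure: awk syntax diagram]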
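
As a minimal sketch of how the three parts fit together, the pipeline below feeds three made-up lines to awk. The sample input and the shell variable out are assumptions for illustration, not part of the original examples.

```shell
# Sample input is invented for illustration: three words, one per line.
out=$(printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN { print "start" }         # runs once, before any input is read
/^b/  { print "match:", $0 }    # runs only for records that start with "b"
END   { print "records:", NR }  # runs once, after the last record
')
echo "$out"
```

Only beta satisfies the pattern, while NR in the END block still counts all three records.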

awk Built-in Variables

Here is a reference table of awk's built-in variables:

[Figure: built-in variables reference table]

Below are the most commonly used built-in variables:

$0          : The entire current record (line).
$1 ... $n   : The first to nth field of the current record.
NF          : Number of fields in the current record.
NR          : Number of records (lines) processed so far (cumulative).
FNR         : Number of records processed in the current file (resets per file).
FS          : Input field separator (default: a single space, which splits on runs of spaces and tabs).
RS          : Input record separator (default: newline).
OFS         : Output field separator (default: space).
ORS         : Output record separator (default: newline).
FILENAME    : Name of the current input file.
ARGC        : Number of command-line arguments (counting the awk command itself, but not options or the program text).
ARGV        : Array of command-line arguments.
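
To see several of these variables side by side, here is a small sketch that prints NR, NF, the first field, and the last field for two invented input lines. The input and the variable name out are assumptions for illustration.

```shell
# Two invented input lines with different field counts.
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, $1, $NF }')
echo "$out"
```

The first line has three fields and the second has two, so NF and $NF change per record while NR keeps counting up.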

Examples of Working with Built-in Variables

Print the entire line

awk '{print $0}' passwd


Specify : as field separator and print the first field

awk 'BEGIN{FS=":"}{print $1}' passwd


Default field separator (space/tab), assuming a file named list with the following content:

Hadoop Spark Flume
Java Python Scala
Allen Mike Meggie

Print the first field using space separator

awk 'BEGIN{FS=" "}{print $1}' list


Print the number of fields in each line

awk '{print NF}' list


Print cumulative record number (NR) when processing multiple files

awk '{print NR}' list passwd /etc/fstab


Print file-specific record number (FNR) when processing multiple files

awk '{print FNR}' list /etc/fstab

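
The difference between NR and FNR is easiest to see when both are printed together. The sketch below builds two throwaway files in a temporary directory; the file names first and second and their contents are invented for illustration.

```shell
# Create two small scratch files (names and contents are made up).
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first"
printf 'c\nd\ne\n' > "$dir/second"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(awk '{ print NR, FNR }' "$dir/first" "$dir/second")
echo "$out"

rm -r "$dir"
```

The third output line reads 3 1: it is the third record overall but the first record of the second file.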

More Advanced Examples with Custom Separators

Assume file list has the following content:

Hadoop|Spark:Flume
Java|Python:Scala:Golang
Allen|Mike:Meggie

Use | as field separator and print the second field

awk 'BEGIN{FS="|"}{print $2}' list


Use : as field separator and print the second field

awk 'BEGIN{FS=":"}{print $2}' list

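
The two examples above each split on one character. When FS is longer than a single character, awk treats it as a regular expression, so a bracket expression can split on both characters at once. This is a sketch using the same three sample lines, fed through a pipe instead of the list file; the variable name out is an assumption.

```shell
# Split on either "|" or ":" by giving FS a bracket expression.
out=$(printf 'Hadoop|Spark:Flume\nJava|Python:Scala:Golang\nAllen|Mike:Meggie\n' |
  awk 'BEGIN { FS = "[|:]" } { print $2, NF }')
echo "$out"
```

Each line now yields one field per name, so the second line reports four fields.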

Record Separator (`RS`) and Output Separators

Assume file list has content:

Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie

Specify -- as the record separator and print whole records (gawk and mawk accept a multi-character RS; POSIX leaves its behavior unspecified)

awk 'BEGIN{RS="--"}{print $0}' list


Combine RS and FS to get structured output

awk 'BEGIN{RS="--";FS="|"}{print $3}' list


Use ORS to separate output records with &

awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $3}' list


Print multiple fields with default output separator (space)

awk 'BEGIN{RS="--";FS="|";ORS="&"}{print $1,$3}' list


Use OFS to change output field separator to :

awk 'BEGIN{RS="--";FS="|";ORS="&";OFS=":"}{print $1,$3}' list

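
Two details of the output separators are worth a quick sketch. OFS is applied when print receives comma-separated arguments or when $0 is rebuilt after a field assignment, and ORS replaces the newline that print normally appends. The sample input and the variable names rebuilt and joined are invented for illustration; single-character separators are used so the sketch does not depend on multi-character RS support.

```shell
# Assigning to a field (even to itself) makes awk rebuild $0 using OFS.
rebuilt=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "," } { $1 = $1; print }')

# ORS replaces the newline that print normally appends after each record.
joined=$(printf 'a\nb\nc\n' | awk 'BEGIN { ORS = "&" } { print }')

echo "$rebuilt"
echo "$joined"
```

Because ORS is written after every record, including the last, the joined output ends with a trailing &.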

Printing File Name (`FILENAME`)

awk '{print FILENAME}' list


If the file has multiple lines, FILENAME is printed for each line because awk processes line by line. For instance, with a file list containing:

Hadoop|Spark|Flume--Java|Python|Scala|Golang--Allen|Mike|Meggie
Test File
Line

The output will show the filename three times, once per input line.

Command-Line Argument Count (`ARGC`)

awk '{print ARGC}' list

This prints 2 for every input line (one argument for awk itself and one for list). If you run:

awk '{print ARGC}' list /etc/fstab

The output will be 3 (three arguments).

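
ARGC is usually paired with ARGV. The sketch below prints the count and then each argument from index 1 upward. It uses only a BEGIN block, so awk never opens the named files and they do not need to exist; the variable name out is an assumption.

```shell
# A BEGIN-only program never opens its file arguments, so the names
# list and /etc/fstab are only inspected here, not read.
out=$(awk 'BEGIN { print ARGC; for (i = 1; i < ARGC; i++) print i, ARGV[i] }' list /etc/fstab)
echo "$out"
```

ARGV[0] holds the name of the awk command itself and is skipped here because its exact value can vary between implementations.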

Using `NF` to Access the Last Field

NF gives the total number of fields. Therefore $NF always refers to the last field.

awk 'BEGIN{FS=":"}{print $NF}' passwd

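
Since NF is an ordinary number, it can also be used in arithmetic inside the field reference. A small sketch with one invented passwd-style line (the line and the variable name out are assumptions):

```shell
# $NF is the last field; $(NF-1) is the one before it.
out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' |
  awk 'BEGIN { FS = ":" } { print NF, $NF, $(NF-1) }')
echo "$out"
```

The parentheses matter: $(NF-1) selects the second-to-last field, whereas $NF-1 would subtract 1 from the value of the last field.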

Formatted Output with `printf`

Format Specifiers

Format Specifier	Description
`%s`	String
`%d`	Decimal integer
`%f`	Floating-point number
`%e`	Scientific notation (lowercase)
`%E`	Scientific notation (uppercase)
`%x`	Hexadecimal (lowercase)
`%X`	Hexadecimal (uppercase)
`%o`	Octal
`%%`	Print a literal `%`


Modifiers

Modifier	Meaning
`-`	Left-justify within the field width
`+`	Always print sign for numeric values
`0`	Pad with zeros instead of spaces
width	Minimum field width
.prec	Number of decimal places (for `%f`)

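
The modifiers in the table can be combined with a specifier as sketched below. The square brackets are only there to make the padding visible, and the values 42 and 3.14159 are arbitrary.

```shell
# width 5, left-justified width 5, zero-padded width 5, forced sign, 2 decimals
out=$(awk 'BEGIN { printf "[%5d] [%-5d] [%05d] [%+d] [%.2f]\n", 42, 42, 42, 42, 3.14159 }')
echo "$out"
```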

Examples of `printf`

printf does not append a newline (default behavior)

awk 'BEGIN{FS=":"}{printf "%s",$1}' passwd


Add newline with %s\n

awk 'BEGIN{FS=":"}{printf "%s\n",$1}' passwd


Use placeholders for aligned output (right-aligned by default)

awk 'BEGIN{FS=":"}{printf "%20s %20s\n",$1,$7}' /etc/passwd


Left-align with -

awk 'BEGIN{FS=":"}{printf "%-20s %-20s\n",$1,$7}' /etc/passwd


Print strings

awk 'BEGIN{FS=":"}{printf "%s\n",$7}' passwd


Print decimal integers

awk 'BEGIN{FS=":"}{printf "%d\n",$3}' passwd


Print floating-point with 2 decimal places

awk 'BEGIN{FS=":"}{printf "%0.2f\n",$3}' passwd


Print hexadecimal

awk 'BEGIN{FS=":"}{printf "%x\n",$3}' passwd


Print octal

awk 'BEGIN{FS=":"}{printf "%o\n",$3}' passwd


Print scientific notation

awk 'BEGIN{FS=":"}{printf "%e\n",$3}' passwd

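
The numeric specifiers can be checked without any input file by calling printf from a BEGIN block. This sketch uses fixed, arbitrary values so the result does not depend on the contents of passwd.

```shell
# 255 in lowercase and uppercase hex, 8 in octal, 1234.5 in scientific notation
out=$(awk 'BEGIN { printf "%x %X %o %e %E\n", 255, 255, 8, 1234.5, 1234.5 }')
echo "$out"
```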

Pattern Matching in awk

There are two main ways to perform pattern matching:

Regular Expression Matching
Operator Matching

Reference Table for Pattern Matching

[Figure: pattern matching reference table]

1. Regular Expression Matching

Find lines containing the string "root"

awk 'BEGIN{FS=":"}/root/{print $0}' passwd


Find lines starting with "nginx"

awk '/^nginx/{print $0}' passwd

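
Two points about regex patterns can be sketched with a few invented passwd-style lines: a pattern with no action prints the matching record, and ^nginx matches any line that begins with those characters, not only the user named exactly nginx. The sample data and the variable names by_prefix and exact are assumptions.

```shell
sample='root:x:0
nginx:x:101
nginxlog:x:102'

# Pattern only: the default action is to print the whole record.
by_prefix=$(printf '%s\n' "$sample" | awk '/^nginx/')

# Comparing the first field to a string selects only the exact name.
exact=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = ":" } $1 == "nginx"')

echo "$by_prefix"
echo "$exact"
```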

2. Operator Matching

Available comparison operators:

< less than
> greater than
<= less than or equal
>= greater than or equal
== equal
!= not equal
~ matches regular expression
!~ does not match regular expression

Lines where the third field is less than 50

awk 'BEGIN{FS=":"}$3<50{print $0}' passwd


Lines where the third field is greater than 50

awk 'BEGIN{FS=":"}$3>50{print $0}' passwd


Lines where the seventh field equals /bin/bash

awk 'BEGIN{FS=":"}$7=="/bin/bash"{print $0}' passwd


Lines where the seventh field is NOT /bin/bash

awk 'BEGIN{FS=":"}$7!="/bin/bash"{print $0}' passwd


Lines where the third field contains three or more digits (some older awk implementations do not support the {3,} interval expression)

awk 'BEGIN{FS=":"}$3 ~ /[0-9]{3,}/{print $0}' passwd

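
Where interval expressions are unavailable, the same test can be written by repeating the bracket expression, or as a numeric comparison. This is a sketch over invented passwd-style lines; the user alice and the variable names are hypothetical.

```shell
data='root:x:0
bin:x:1
nginx:x:101
alice:x:1000'

# Regex test: the third field contains three consecutive digits.
by_regex=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":" } $3 ~ /[0-9][0-9][0-9]/ { print $1 }')

# Numeric test: the third field is at least 100.
by_value=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":" } $3 >= 100 { print $1 }')

echo "$by_regex"
echo "$by_value"
```

For non-negative integer UIDs both forms select the same lines; the numeric form states the intent more directly.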

Boolean Operators in Patterns

|| logical OR
&& logical AND
! logical NOT

Lines where the first field is "ftp" OR "mail"

awk 'BEGIN{FS=":"}$1=="ftp" || $1=="mail"{print $0}' passwd


Lines where third field < 50 AND fourth field > 50

awk 'BEGIN{FS=":"}$3<50 && $4>50{print $0}' passwd


Lines starting with "nginx" (using regex)

awk 'BEGIN{FS=":"}/^nginx/{print $0}' passwd


Lines where UID equals 1

awk 'BEGIN{FS=":"}$3==1{print $0}' passwd

Lines where UID (third field) consists of 3 or more digits

awk 'BEGIN{FS=":"}$3~/[0-9]{3,}/{print $0}' passwd


Lines that do NOT contain /sbin/nologin

awk 'BEGIN{FS=":"}$0!~/\/sbin\/nologin/{print $0}' passwd


Lines where UID < 50 AND shell contains /bin/bash

awk 'BEGIN{FS=":"}$3<50 && $7~/\/bin\/bash/ {print $0}' passwd

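
All three boolean operators can appear in one pattern, with parentheses controlling how they group. The sketch below selects accounts whose shell is not nologin and whose UID is either 0 or at least 1000. The three sample lines, including the user alice, are invented for illustration.

```shell
data='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# NOT nologin shell, AND (UID is 0 OR UID is 1000 or higher)
out=$(printf '%s\n' "$data" |
  awk 'BEGIN { FS = ":" } !($7 ~ /nologin/) && ($3 == 0 || $3 >= 1000) { print $1 }')
echo "$out"
```

Without the parentheses around the || part, && would bind more tightly than ||, and the pattern would select a different set of lines.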

Tags: awk Text processing Linux programming

Posted on Wed, 13 May 2026 12:21:36 +0000 by jdashca
