Beginner's Guide to AWK

Overview

awk is a text-processing tool available on Unix/Linux systems. It reads input one line at a time, splits each line into fields, and runs the actions you specify on the lines that match, which makes it well suited to extracting and reshaping columnar data. This guide covers basic to intermediate awk usage, with practical examples for system administrators and developers.

Basics of AWK Syntax

awk 'pattern {action}' file

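pattern: A regular expression or condition that selects lines. If omitted, every line matches.
action: What to do when the pattern matches (e.g., print, modify fields). If omitted, the matching line is printed.
file: One or more input files. If omitted, awk reads standard input, such as the output of another command.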
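
Because either part can be left out, the three forms below are all valid programs. This is a minimal sketch that uses invented inline data (names and numbers are made up) in place of a file:

```shell
# Sample input is invented for illustration; awk reads it from standard input.
data='alice 42
bob 7
carol 99'

# Pattern only: the default action prints the whole matching line.
pattern_only=$(printf '%s\n' "$data" | awk '$2 > 40')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$data" | awk '{print $1}')

# Pattern and action together.
both=$(printf '%s\n' "$data" | awk '$2 > 40 {print $1}')

printf '%s\n' "$pattern_only" "---" "$action_only" "---" "$both"
```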

Examples for Beginners

Print Entire File

Print every line of the file:

awk '{print}' file.txt

Print Specific Field

Print the first field (column) of each line:

awk '{print $1}' file.txt

By default, fields are separated by runs of spaces and/or tabs, and leading or trailing whitespace is ignored.
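
To see the default splitting in action, the sketch below feeds awk one invented line that has leading spaces, a tab, and repeated spaces. NF is the built-in field count and $NF refers to the last field:

```shell
# One made-up line: three leading spaces, a tab, then several spaces.
out=$(printf '   one\ttwo    three\n' | awk '{print NF, $1, $NF}')

# Prints the field count, the first field, and the last field.
printf '%s\n' "$out"
```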

Custom Field Separator

Use -F to set a custom delimiter (e.g., commas). This splits on every comma, so it does not handle quoted CSV fields that contain commas:

awk -F ',' '{print $1}' file.csv
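
The input separator can be paired with OFS, the output field separator, to convert between delimiters. A small sketch with invented CSV rows (no quoted fields, header included):

```shell
# Invented CSV data for illustration only.
csv='name,dept,salary
ana,ops,5200
raj,dev,6100'

# Skip the header (NR > 1), read comma-separated fields, write tab-separated ones.
out=$(printf '%s\n' "$csv" | awk -F ',' -v OFS='\t' 'NR > 1 {print $1, $3}')
printf '%s\n' "$out"
```

The comma between $1 and $3 in print is what inserts OFS; writing print $1 $3 would join the two fields with nothing in between.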

Match Specific Lines

Print lines that match the regular expression "pattern" (matching is case-sensitive):

awk '/pattern/ {print}' file.txt
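
A regular expression can also be tested against a single field with the ~ operator, or negated with !. The sketch below uses invented log-like lines to show both:

```shell
# Invented sample lines: host, level, message.
data='web01 error disk full
web02 info started
db01 info error count reset'

# Match the regex against field 2 only, not the whole line.
field_match=$(printf '%s\n' "$data" | awk '$2 ~ /^error$/ {print $1}')

# Negate a whole-line match: hosts whose line never mentions "error".
no_error=$(printf '%s\n' "$data" | awk '!/error/ {print $1}')

printf '%s\n' "$field_match" "$no_error"
```

Note that db01 is excluded from both results: its level is info, but the word error appears in its message.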

Simple AWK Commands for Practice

Count Lines in a File

awk 'END {print NR}' file.txt

Convert Text to Uppercase

awk '{print toupper($0)}' file.txt

Intermediate-Level AWK Features

Conditional Statements

Print the first two fields of each line whose third field is greater than 50:

awk '{if ($3 > 50) print $1, $2}' file.txt
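
An if can take an else branch as well. The following sketch labels each row of some invented score data; the variable name status is just an illustration:

```shell
# Invented sample data: name, subject, score.
data='ann math 72
ben math 48
cy art 50'

out=$(printf '%s\n' "$data" | awk '{
    if ($3 > 50) {
        status = "above"
    } else {
        status = "at-or-below"
    }
    print $1, status
}')
printf '%s\n' "$out"
```

When there is only one branch, the same test can be written as a pattern instead, for example $3 > 50 {print $1, $2}.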

Built-In Functions

String manipulation (print the first three characters of the first field):

awk '{print substr($1, 1, 3)}' file.txt

Arithmetic (sum the second field; the + 0 prints 0 instead of an empty line when the input is empty):

awk '{sum += $2} END {print sum + 0}' file.txt
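
A few more built-ins are worth knowing: length() returns a string's length and index() returns the position of a substring (0 if absent). The sketch below applies them to invented data and also computes an average from the running sum:

```shell
# Invented sample data: item name and quantity.
data='widget 4
gadget 10
doohickey 1'

# Name, its length, and the position of the first "g" in it.
strings=$(printf '%s\n' "$data" |
    awk '{printf "%s %d %d\n", $1, length($1), index($1, "g")}')

# Average of field 2; the NR > 0 guard avoids dividing by zero on empty input.
avg=$(printf '%s\n' "$data" |
    awk '{sum += $2} END {if (NR > 0) print sum / NR}')

printf '%s\n' "$strings" "average: $avg"
```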

Using Variables

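Count lines with a user-defined variable (variables start out empty, so count + 0 prints 0 when the file has no lines):

awk '{count++} END {print "Total lines:", count + 0}' file.txt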
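
Values from the shell can be passed into an awk program with the -v option, which avoids splicing shell variables into the quoted program text. A sketch, where threshold and min are hypothetical names and the data is invented:

```shell
# Invented sample data: label and value.
data='a 10
b 60
c 75'
threshold=50   # hypothetical shell variable

# -v min=... creates the awk variable "min" before any input is read.
out=$(printf '%s\n' "$data" |
    awk -v min="$threshold" '$2 >= min {n++} END {print "Lines at or above " min ": " (n + 0)}')
printf '%s\n' "$out"
```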

Working with Multiple Files

Process multiple files:

awk '{print FILENAME, $0}' file1.txt file2.txt
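
With several input files, NR keeps counting across all of them while FNR restarts at 1 for each file. The sketch below creates two throwaway files (the names are arbitrary) to show the difference:

```shell
# Create two small temporary files for the demonstration.
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/first.txt"
printf 'b1\n' > "$tmpdir/second.txt"

# Columns: file name, per-file line number, overall line number, line text.
out=$(cd "$tmpdir" && awk '{print FILENAME, FNR, NR, $0}' first.txt second.txt)
printf '%s\n' "$out"

rm -rf "$tmpdir"
```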

Useful AWK Use Cases for System Administrators

Log Analysis

Extract lines containing "error" (the match is case-sensitive, so "ERROR" is not matched):

awk '/error/ {print}' /var/log/syslog

Count the lines that contain a keyword (count + 0 prints 0 rather than an empty line when nothing matches):

awk '/error/ {count++} END {print count + 0}' /var/log/syslog
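
Two common refinements are case-insensitive matching and counting per category with an associative array. The sketch below runs on invented log lines; the assumption that the severity sits in field 5 applies only to this made-up layout, not to syslog in general:

```shell
# Invented log lines; real syslog layouts differ.
log='Jan 1 host app: ERROR disk full
Jan 1 host app: info started
Jan 2 host app: error timeout
Jan 2 host app: Warning slow'

# tolower() gives a case-insensitive match in any POSIX awk.
errors=$(printf '%s\n' "$log" |
    awk 'tolower($0) ~ /error/ {n++} END {print n + 0}')

# Count lines per lowercased level; "for (k in c)" has no fixed order, so sort.
by_level=$(printf '%s\n' "$log" |
    awk '{c[tolower($5)]++} END {for (k in c) print k, c[k]}' | sort)

printf '%s\n' "errors: $errors" "$by_level"
```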

Disk Usage Reports

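Show each filesystem and its usage percentage, skipping the header line (the -P option keeps each filesystem on a single line so the field numbers stay stable):

df -hP | awk 'NR>1 {print $1, $5}'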
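
A report like this becomes more useful with a threshold. The sketch below works on invented df-style text rather than live output, strips the percent sign with sub(), and prints only mount points at or above a limit; limit and pct are illustrative names:

```shell
# Invented df-style output; real columns and values vary by system.
df_out='Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 45G 5G 90% /
/dev/sdb1 100G 20G 80G 20% /data
tmpfs 2G 0 2G 0% /run'

out=$(printf '%s\n' "$df_out" | awk -v limit=80 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                      # strip the percent sign
    if (pct + 0 >= limit) print $6, pct "%"
}')
printf '%s\n' "$out"
```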

Process Monitoring

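List processes using more than 80% CPU, skipping the header line (prints user, PID, %CPU, and the first word of the command):

ps aux | awk 'NR > 1 && $3 > 80 {print $1, $2, $3, $11}'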
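
The same columns can be aggregated instead of filtered. As a sketch, the following totals %CPU per user from invented ps-style text (live ps output would differ on every run):

```shell
# Invented ps-style output for illustration.
ps_out='USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.5 0.1 1000 500 ? Ss 10:00 0:01 /sbin/init
www 210 45.0 2.0 9000 4000 ? S 10:01 1:10 nginx
www 211 40.5 2.0 9000 4000 ? S 10:01 1:05 nginx'

# Sum field 3 per user (field 1), then sort for a stable order.
out=$(printf '%s\n' "$ps_out" |
    awk 'NR > 1 {cpu[$1] += $3} END {for (u in cpu) printf "%s %.1f\n", u, cpu[u]}' | sort)
printf '%s\n' "$out"
```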

User Account Management

List usernames and shells:

awk -F ':' '{print $1, $7}' /etc/passwd
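
Adding a pattern narrows the list, for example to accounts with an interactive login shell. The sketch below uses invented passwd-style entries and assumes that shells ending in nologin or false mark non-interactive accounts, which is a common convention but not a guarantee:

```shell
# Invented entries in /etc/passwd format.
passwd='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
dev:x:1000:1000:Dev User:/home/dev:/bin/zsh'

# Keep entries whose shell (field 7) does not end in nologin or false.
out=$(printf '%s\n' "$passwd" |
    awk -F ':' '$7 !~ /(nologin|false)$/ {print $1, $7}')
printf '%s\n' "$out"
```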

Automating Task Reports

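Generate CSV-like output (OFS sets the comma separator, and NR>1 skips the "total" line that ls -l prints first; file names containing spaces are truncated at the first space):

ls -l | awk 'BEGIN {OFS=","; print "File", "Size", "Owner"} NR>1 {print $9, $5, $3}'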

Conclusion

awk covers a wide range of text-processing and automation tasks on Unix/Linux systems, from analyzing logs and listing user accounts to generating reports. The commands in this guide are starting points: combine patterns, built-in variables, and functions to adapt them to your own data and streamline your work as a system administrator or developer.