Shell Script Code Review Checklist for Linux Teams: Prevent Production Incidents

Last updated on


Target keyword: shell script code review checklist

Monthly keyword cluster: linux shell scripting, bash scripting linux, secure shell scripting, automasi tugas linux

Search intent: Best-practice / Implementation

Most Linux teams have at least one “small Bash script” that quietly became business-critical.

At first, it was just a helper for backups, deploys, or log cleanup. Then it got scheduled by cron or systemd timer. Then more people touched it. Then one day, a tiny change broke production.

This is exactly why you need a shell script code review checklist.

Not to make reviews bureaucratic. Not to slow down delivery. But to catch dangerous mistakes early—before they become 2 AM incidents.

In this guide, you’ll get a practical, production-oriented checklist your team can apply today. It is designed for real environments where scripts touch services, filesystems, secrets, and deployment flows.

Why code review for shell scripts is often ignored (and why that is risky)

Many teams review application code strictly, but shell scripts are treated as “ops glue.”

That mindset is expensive.

Shell scripts usually run with elevated privileges, execute destructive commands (rm, mv, chown, systemctl restart), and interact with production state directly. A single unquoted variable or missing guard can trigger data loss or downtime.

If your automation depends on Bash, then bash scripting linux quality standards should be as serious as backend code standards.

Quick rule: review by failure modes, not by style only

A good review should ask:

  1. What can fail?
  2. How does the script detect failure?
  3. How does it recover or stop safely?
  4. How observable is the failure?

This mindset aligns with reliability and security goals at the same time.


Shell Script Code Review Checklist (Production Edition)

Use this as your pull request template for shell-based automation.

1) Safety baseline is present

Minimum baseline:

#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'

What to check:

  • set -Eeuo pipefail exists (or documented reason if not)
  • script uses predictable IFS
  • script does not rely on ambiguous shell behavior

Why it matters:

Without strict mode, scripts can continue after partial failure and leave systems in inconsistent state.

Related reading: Bash Strict Mode and Safe Automation Checklist for Linux Servers

2) Variables are quoted and validated

Check for unsafe expansions:

# risky
rm -rf $TARGET_DIR/*

# safer
rm -rf "${TARGET_DIR:?}"/*

Review points:

  • variable expansion is quoted ("$var")
  • required variables use fail-fast checks (${VAR:?message})
  • defaults are explicit (VAR="${VAR:-default}")

This single item prevents many incidents in linux shell scripting.

3) Destructive commands are guarded

Any operation with high blast radius needs explicit safeguards.

Commands to inspect carefully:

  • rm -rf
  • mv or cp over critical paths
  • chmod -R, chown -R
  • database CLI calls
  • firewall/service restart operations

Ask reviewers to verify:

  • target path is validated
  • dry-run option exists for risky operation
  • critical commands have logging before/after execution

4) Idempotency is considered

Can this script run twice safely?

If not, is that clearly documented?

Typical idempotent patterns:

  • check-before-create
  • conditional updates
  • lockfile to avoid overlap
  • state markers

Reference: Idempotent Shell Script: Jalankan Berkali-kali Tanpa Berantakan

5) Error handling is explicit and useful

Look for trap and meaningful error logs:

on_error() {
  local code=$?
  echo "[ERROR] line=$1 cmd='$2' exit=$code"
}
trap 'on_error "${LINENO}" "${BASH_COMMAND}"' ERR

Review questions:

  • does the error output include line/command context?
  • is exit code preserved?
  • does the script stop on critical failures?

For deployment scripts, reviewers should also check rollback behavior.

Related: Bash Trap and Rollback Patterns for Safer Linux Deployments

6) Logging is structured for debugging

A script without clear logs is hard to support in production.

Checklist:

  • timestamps included
  • key decision points logged
  • stderr/stdout capture strategy defined
  • sensitive values are masked

Example helper:

log() {
  printf '[%s] %s\n' "$(date '+%F %T')" "$*"
}

If your team runs scheduled jobs, this reduces MTTR dramatically.

7) Concurrency control exists for scheduled jobs

If script can be triggered by timer/cron/webhook, guard against overlap.

exec 9>/tmp/myjob.lock
flock -n 9 || { echo "job already running"; exit 1; }

Reviewers should ask: what happens if two executions run at the same time?

For timer strategy decisions, see: Systemd Timer vs Cron untuk Automasi Linux Production

8) Dependencies and environment are verified

Never assume binary availability or environment variables.

for cmd in curl jq awk; do
  command -v "$cmd" >/dev/null || { echo "missing: $cmd"; exit 1; }
done

: "${APP_ENV:?APP_ENV is required}"

Reviewers should verify that the script fails early with a clear message when dependencies are missing.

9) Security review basics are covered

For secure shell scripting, verify:

  • no hardcoded secrets
  • temporary files are handled safely
  • user input is sanitized
  • no dangerous eval unless unavoidable and reviewed
  • external downloads are verified when possible

For infrastructure-facing scripts, this is non-negotiable.

Security-related follow-up: UFW vs Fail2Ban vs SSH Hardening: Kombinasi Keamanan Server Linux

10) Portability and shell compatibility are explicit

Reviewers should confirm the target shell and runtime assumptions:

  • script is Bash-specific? then use shebang #!/usr/bin/env bash
  • uses GNU options? document Linux distro assumptions
  • avoid relying on local machine behavior only

If portability is not required, state it clearly in comments or README.

11) Test strategy exists (even lightweight)

Not every shell script needs full test suite, but every production script needs some test path.

Minimum acceptable:

  • lint check via shellcheck
  • test run in staging
  • one failure-path simulation

Great baseline:

shellcheck ./script.sh
bash -n ./script.sh

Review PR should include evidence that these checks were run.

12) Runbook and ownership are clear

A script is not production-ready if nobody knows how to operate it.

Reviewers should look for:

  • who owns this script?
  • where is the runbook?
  • what are rollback steps?
  • where are logs located?

This connects code quality with real incident response readiness.


Lightweight PR template you can copy

Use this template in your repo for shell-script pull requests:

  • Strict mode enabled (set -Eeuo pipefail)
  • Variables quoted and required vars validated
  • Destructive commands guarded
  • Idempotency behavior documented
  • Error handling and trap are clear
  • Logging includes timestamps and useful context
  • Concurrency guard (flock/equivalent) considered
  • Dependency checks implemented
  • Security checks (secrets, input, temp files)
  • shellcheck + bash -n passed
  • Rollback/runbook notes included

This is practical enough for daily operations and strict enough to prevent common production failures.

Common review anti-patterns (avoid these)

“Looks fine, approved” without running lint

If reviewers skip shellcheck, easy mistakes survive.

Focusing only on formatting

Style matters, but resilience matters more. Ask failure-mode questions first.

Ignoring scheduler context

Scripts that run fine manually can still fail in cron/systemd due to PATH/env differences.

No post-incident feedback loop

After an incident, checklist should evolve. A static checklist becomes stale quickly.


Final recommendation

If your team already uses shell for deployment and operations, introduce this checklist incrementally:

  1. Start with strict mode + quoting + lint.
  2. Add concurrency and dependency checks.
  3. Add rollback and incident-oriented logging.
  4. Make it part of PR definition of done.

You don’t need “perfect scripts.” You need scripts that fail safely, are easy to debug, and are hard to misuse.

That’s exactly what a solid shell script code review checklist gives you.

FAQ

1) Is shell script code review necessary for small teams?

Yes. Small teams are usually more exposed to operational risk because one person often handles multiple roles. A checklist prevents avoidable mistakes and reduces incident stress.

2) What is the first checklist item with biggest impact?

Enable strict mode (set -Eeuo pipefail) and enforce variable quoting. This combination catches many high-risk issues early.

3) Should every script implement rollback?

Not always full rollback, but every state-changing script should at least define safe stop behavior and recovery steps.

4) Is ShellCheck enough as a review gate?

ShellCheck is essential but not sufficient. You still need logic review, failure-mode review, and runtime context checks.

5) How often should we update the checklist?

Update it after incidents, major platform changes, or every quarter. Treat it as a living operational standard.

Komentar

Real-time

Memuat komentar...

Tulis Komentar

Email tidak akan ditampilkan

0/2000 karakter

Catatan: Komentar akan dimoderasi sebelum ditampilkan. Mohon bersikap sopan dan konstruktif.