Essential Linux Commands for Server Administration
systemctl, journalctl, ps, df, ss, and grep — the commands you reach for when something breaks in production and there's no dashboard installed.
You SSH into a production server and get hit with Active: failed (Result: exit-code). The service is down, nobody knows since when, and there are no dashboards installed. Or a colleague hands you an issue described only as "it's slow" — and you need to figure out what's eating CPU, memory, or disk right now.
This is where Linux admin commands earn their keep. Not the navigation basics — those are covered in the Linux for beginners guide — but the commands you reach for when something is actually wrong in production.
systemctl — service management with systemd
systemctl is the command-line interface to systemd, the default init system on Debian, Ubuntu, RHEL, Fedora, and their derivatives. It starts, stops, restarts, and inspects any service on the system.
# Check service status
systemctl status nginx
# Start, stop, restart
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
# Reload config without downtime (when supported)
systemctl reload nginx
# Enable/disable at boot
systemctl enable nginx
systemctl disable nginx
# List all services and their states
systemctl list-units --type=service
# List only failed units
systemctl --failed
systemctl status deserves special attention — it shows the last few log lines for the service right in the terminal, which often answers the question without opening journalctl:
● nginx.service - A high performance web server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled)
     Active: active (running) since Sat 2026-06-13 10:42:15 UTC; 2h 3min ago
   Main PID: 1234 (nginx)
If Active shows failed, the next step is journalctl.
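The status-then-logs handoff described above can be scripted. This is a minimal sketch, not a standard tool: the function name check_unit and the 20-line default are illustrative choices. It relies on systemctl is-active --quiet, which exits 0 only when the unit is active.

```shell
# Report whether a unit is active; if it is not, print its most recent journal lines.
# check_unit is a hypothetical helper name; the 20-line default is arbitrary.
check_unit() {
    local unit="$1" lines="${2:-20}"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit: active"
    else
        echo "$unit: NOT active, last $lines journal lines:"
        journalctl -u "$unit" -n "$lines" --no-pager
        return 1
    fi
}

# Usage: check_unit nginx 50
```

The non-zero return on failure makes it easy to chain, for example in a cron job that only sends mail when something is down.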
journalctl — systemd logs without the guesswork
journalctl reads from the systemd journal — a binary, indexed log store with powerful filtering. It's not a replacement for /var/log, but for systemd-managed services it's far more practical.
# Logs for a specific service
journalctl -u nginx
# Follow (like tail -f)
journalctl -u nginx -f
# Last 100 lines
journalctl -u nginx -n 100
# Since current boot
journalctl -u nginx -b
# Time range
journalctl -u nginx --since "2026-06-13 10:00" --until "2026-06-13 11:00"
# Errors and above only
journalctl -u nginx -p err
# No pager (useful in scripts)
journalctl -u nginx --no-pager
When the crash was followed by a reboot, the evidence is in the previous boot's log, not the current one:
journalctl -u nginx -b -1
# -b -1 = previous boot (only available if the journal is stored persistently, e.g. under /var/log/journal)
journalctl without -u shows the entire system log — too noisy for everyday use. Always filter.
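Filtering also makes it easy to turn logs into a quick number. The sketch below counts error-level entries for one unit over a recent window. The function name and the one-hour default are assumptions; -o json is used only because it prints exactly one line per journal entry, which makes wc -l a reliable count.

```shell
# Count journal entries at priority err or worse for one unit.
# recent_error_count is a hypothetical helper name; the default window is an arbitrary choice.
recent_error_count() {
    local unit="$1" since="${2:-1 hour ago}"
    # -q drops informational banners; -o json emits one entry per line
    journalctl -u "$unit" -p err --since "$since" --no-pager -q -o json | wc -l
}

# Usage: recent_error_count nginx "30 min ago"
```

A count that jumps right after a deploy is usually a stronger signal than any single log line.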
ps — what's actually running
ps lists processes. Without arguments it's nearly useless: it only shows your own processes attached to the current terminal. With the right flags, it becomes the first command to run when something is consuming unexpected resources.
# Full list of all running processes
ps aux
# Sort by CPU usage (highest first)
ps aux --sort=-%cpu | head -20
# Sort by memory
ps aux --sort=-%mem | head -20
# Process tree (who spawned what)
ps auxf
# Filter by process name (the [n] keeps grep from matching its own command line)
ps aux | grep '[n]ginx'
# Just the PIDs
pgrep nginx
The aux format shows: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND. The STAT field matters:
- R: running or waiting on CPU
- S: sleeping (waiting for an event, normal)
- D: uninterruptible sleep (usually waiting for disk I/O; concerning if there are many)
- Z: zombie (child process that exited but parent hasn't collected the exit code)
Many processes in state D indicate I/O saturation. Many Z processes indicate a bug in the application that doesn't call wait() on its children.
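To see at a glance whether D or Z states are piling up, the STAT column can be tallied by its first letter. A small sketch, assuming the procps version of ps; count_states is an illustrative name, and it reads from stdin so it also works on a saved snapshot.

```shell
# Tally processes by the first letter of their STAT field.
# Reads STAT values on stdin, one per line; prints "STATE COUNT", highest count first.
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' \
        | sort -k2,2nr -k1,1
}

# Usage (procps ps; "stat=" suppresses the header):
#   ps -eo stat= | count_states
# To list the processes currently in D or Z:
#   ps -eo pid,stat,cmd | awk '$2 ~ /^[DZ]/'
```

On a healthy server almost everything is in S, so anything beyond a handful of D or Z entries stands out immediately.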
df and du — disk space
df shows used and available space per mounted filesystem. du shows how much each directory is using. Together they handle "disk full" without needing any external tool.
# Disk usage per partition (human-readable)
df -h
# Including filesystem type
df -hT
# Check inodes (another form of "disk full")
df -i
Example output:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   47G  3.1G  94% /
/dev/sda2       200G   80G  120G  40% /data
tmpfs           1.9G     0  1.9G   0% /dev/shm
When df -h shows 94% and you need to find what's consuming it:
# Top 20 directories by size under /var
du -h /var --max-depth=2 | sort -rh | head -20
# Files larger than 100MB in /var/log
find /var/log -size +100M -type f
Running out of inodes (df -i at 100%) is a less obvious failure mode — you have space in bytes but can't create new files because there are no available inodes. Common on servers with large numbers of small files (mail queues, PHP sessions, object caches).
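Both failure modes, bytes and inodes, can be checked with the same filter, because df -P and df -Pi put the percentage and the mount point in the same columns. A hedged sketch: over_threshold is a made-up name, the 90% default is arbitrary, and mount points containing spaces are not handled.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Expects POSIX-format df output on stdin (df -P or df -Pi).
over_threshold() {
    local limit="${1:-90}"
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage:
#   df -P  | over_threshold 90    # space
#   df -Pi | over_threshold 90    # inodes
```

Empty output means nothing crossed the limit, which makes this convenient inside a monitoring script.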
ss — network connections
ss is the modern replacement for netstat. Faster, more informative, same flag logic.
# Listening TCP and UDP sockets, numeric addresses and ports (-l limits output to listeners)
ss -tuln
# Include the process that's listening
ss -tulnp
# Established connections only
ss -tn state established
# What's on port 80
ss -tlnp sport = :80
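# Count connections by state (NR > 1 skips the header line)
ss -tan | awk 'NR > 1 {print $1}' | sort | uniq -c | sort -rn
The last command is useful for diagnosing overload. Thousands of connections in TIME-WAIT or CLOSE-WAIT (as ss spells them) usually point to connection management issues in the application or TCP kernel settings: TIME-WAIT piles up with many short-lived connections, while CLOSE-WAIT means the application hasn't closed sockets its peer already closed.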
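Once the state counts look suspicious, the next question is usually who is on the other end. The sketch below groups established connections by remote address. It assumes the default ss -tan column layout (state in column 1, peer address:port in column 5); top_peers is an illustrative name.

```shell
# Count established TCP connections per remote address.
# Reads "ss -tan" output on stdin: column 1 is the state, column 5 is peer address:port.
top_peers() {
    awk '$1 == "ESTAB" {
        peer = $5
        sub(/:[^:]*$/, "", peer)   # strip the trailing :port
        count[peer]++
    }
    END { for (p in count) print count[p], p }' | sort -k1,1nr -k2,2
}

# Usage: ss -tan | top_peers | head -10
```

A single address holding most of the connections points to one misbehaving client; an even spread points back to the application or to genuine load.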
For a specific process:
# All open connections by nginx (PID 1234)
ss -tp | grep pid=1234
grep — extracting signal from log noise
grep needs no introduction, but in server administration what matters is knowing which flags to combine to extract signal from the noise in logs.
# Lines logged at error level in the nginx error log (nginx writes levels as [error], [warn], [crit])
grep -F "[error]" /var/log/nginx/error.log
# Case-insensitive
grep -i "error" /var/log/app/app.log
# With 3 lines of context (before and after)
grep -C 3 "Connection refused" /var/log/app/app.log
# Count lines that contain 404 (-c counts matching lines, not individual matches)
grep -c "404" /var/log/nginx/access.log
# Invert (lines that do NOT contain)
grep -v "GET /health" /var/log/nginx/access.log
# Recursive across a directory
grep -r "database connection" /var/log/
# Only the filename (useful with -r)
grep -rl "CRITICAL" /var/log/
# Top IPs in access log
grep -oP '^\d+\.\d+\.\d+\.\d+' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
Combining grep with journalctl is the most common flow in practice:
journalctl -u postgresql --no-pager | grep -i "fatal\|error\|could not"
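One caveat with the grep -c "404" example earlier in this section: a bare 404 also matches response sizes, paths, and anything else containing those digits. A tighter sketch, assuming nginx's default "combined" log format, where the status code follows the closing quote of the request line; count_status is a hypothetical helper name.

```shell
# Count access-log lines with a given HTTP status.
# Assumes the nginx "combined" format: ... "GET /path HTTP/1.1" 404 153 ...
count_status() {
    local status="$1" logfile="$2"
    # Anchor on the closing quote of the request line so sizes and paths do not match
    grep -cE "\" ${status} " "$logfile"
}

# Usage: count_status 404 /var/log/nginx/access.log
```

Note that grep exits non-zero when the count is 0, which matters if the script runs under set -e.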
Putting it together: real-world troubleshooting
Typical scenario: a new deploy just went to production and the service is responding slowly. In sequence:
# 1. Is the service even running?
systemctl status myapp
# 2. Any recent errors in the log?
journalctl -u myapp -n 50 --no-pager | grep -i "error\|warn\|fatal"
# 3. What's the heaviest process?
ps aux --sort=-%cpu | head -5
# 4. Is there disk space?
df -h
# 5. Are connections piling up?
ss -tan | awk 'NR > 1 {print $1}' | sort | uniq -c | sort -rn
Five commands, two minutes. In most cases, one of them already points to the cause.
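If you run this sequence often, it can be wrapped in one function. A rough sketch under stated assumptions: triage is a made-up name, and step 1 uses systemctl is-active instead of status to keep the output short.

```shell
# Run the five checks in order for one unit and label each section.
# "triage" is a hypothetical name; reading another service's journal usually
# needs root or membership in the systemd-journal group.
triage() {
    local unit="$1"
    echo "== 1. service state =="
    systemctl is-active "$unit"
    echo "== 2. recent errors =="
    journalctl -u "$unit" -n 50 --no-pager | grep -iE "error|warn|fatal" || echo "(none in last 50 lines)"
    echo "== 3. heaviest processes =="
    ps aux --sort=-%cpu | head -5
    echo "== 4. disk space =="
    df -h
    echo "== 5. connections by state =="
    ss -tan | awk 'NR > 1 {print $1}' | sort | uniq -c | sort -rn
}

# Usage: triage myapp
```

Keeping the sections labeled means the output can be pasted straight into an incident channel.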
Frequently asked questions
What's the difference between systemctl restart and systemctl reload?
restart stops the service and starts it again, so there's a brief moment of downtime. reload asks the process to re-read its configuration without terminating: no downtime, but not every service supports it. nginx supports it; PostgreSQL does too (pg_reload_conf() is the SQL-level equivalent), though some PostgreSQL settings only take effect after a full restart. When in doubt, use restart.
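For nginx specifically, a reload is only as safe as the config it loads, so it's worth testing the config first. A minimal sketch; reload_nginx_safely is an illustrative name, while nginx -t and systemctl reload are the standard commands.

```shell
# Validate the nginx config before reloading, so a typo never reaches the running server.
# reload_nginx_safely is an illustrative name.
reload_nginx_safely() {
    if nginx -t; then
        systemctl reload nginx
    else
        echo "config test failed; nginx keeps running with the old config" >&2
        return 1
    fi
}

# For units where reload support is unknown, systemd can decide for you:
#   systemctl reload-or-restart myapp
```

reload-or-restart reloads the unit if it supports that and falls back to a restart otherwise.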
How do I see logs for a service that crashed on boot?
journalctl -u service-name -b
-b limits output to the current boot. If the service failed in the previous boot and the server was restarted:
journalctl -u service-name -b -1
-b -1 is the boot immediately prior to the current one.
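One caveat: -b -1 only works if the journal survived the reboot. With journald's default Storage=auto, logs persist only when /var/log/journal exists; otherwise they live in memory and are lost. The sketch below checks how many boots are on record before you go looking. The function names are assumptions; journalctl --list-boots is the real flag.

```shell
# Count the boots the journal still has on record.
# Lines are matched by their leading index (0, -1, -2, ...) so an optional header row is ignored.
boot_count() {
    journalctl --list-boots --no-pager | awk '$1 ~ /^-?[0-9]+$/ { n++ } END { print n + 0 }'
}

# True when logs from at least one earlier boot are available.
has_previous_boot() {
    [ "$(boot_count)" -gt 1 ]
}

# Usage: has_previous_boot && journalctl -u service-name -b -1
```

If only one boot is listed, the previous logs are gone and the fix is to enable persistent storage before the next incident.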
Does ss fully replace netstat?
For most use cases, yes. netstat is no longer installed by default on many distributions; it's part of the net-tools package, which is in maintenance mode. ss reads socket information directly from the kernel over a netlink socket, is faster, and has equivalent flags (ss -tuln ≈ netstat -tuln). The main practical difference is in advanced filter syntax, where ss is more expressive.
How do I figure out what file permissions a web server needs?
One of the most common questions when setting up nginx or Apache. Rather than calculating octal by hand, I use the chmod-calculator — you toggle the permission bits visually, see the result in both octal and symbolic notation, and get a warning when the combination is insecure (like 777).
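To find where an insecure combination like 777 has already crept in, find can list everything under the web root that any user can write to. A small sketch: find_world_writable is a made-up name and /var/www is only an assumed default location.

```shell
# List files and directories under a web root that any user can write to.
# The function name is illustrative; /var/www is an assumed location.
find_world_writable() {
    local root="${1:-/var/www}"
    # -perm -0002 matches entries with the "others: write" bit set; symlinks are skipped
    find "$root" ! -type l -perm -0002 -print
}

# Usage: find_world_writable /var/www/html
```

Symlinks are excluded because their own mode bits are typically 777 regardless of the target.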
Diagnosis is half the job
A broken server isn't the exception — it's part of the work. The difference between resolving it in two minutes versus an hour comes down to knowing which command to run first.
systemctl status to see if the service is up. journalctl -u name -n 100 to see what happened. ps aux --sort=-%cpu to check for runaway processes. df -h to check disk. ss -tuln to see what's listening.
Add grep to filter the output and you have the six commands that cover the majority of issues that come up on a Linux server in production. Everything else is variation: different flags, different pipelines, time filters in journalctl. The core is this.