How to Profile Application Performance on Fedora (perf, strace)

Fedora provides powerful native profiling tools including perf for CPU and hardware-event analysis and strace for system-call tracing, helping you pinpoint where an application spends its time.

You launched the app and the fans screamed

You started your application and the CPU usage jumped to 100%. The system monitor shows the process eating all available cores, but the window is frozen. You don't know if the app is stuck in a tight loop, waiting on a lock, or thrashing the disk. You need to see what the process is actually doing, not just that it's busy.

What's actually happening

Profiling is like putting a black box recorder on a car. The dashboard tells you the speed, but the black box records every gear shift, brake tap, and engine rev. perf records hardware events and kernel scheduling data. It tells you which functions are burning cycles. strace intercepts system calls. It shows every time the application asks the kernel for a file, a network packet, or a new process. Together they answer why the system is slow and what the code is executing.

Install tools and debug symbols

Install the profiler and tracer. Enable debug symbols immediately so the output shows function names instead of hex addresses. Without symbols, the call graph is unreadable.

sudo dnf install perf strace # Install the profiler and syscall tracer
sudo dnf debuginfo-install myapp # Fetch symbols for your specific app
# WHY: perf shows raw addresses without debuginfo.
#      The debuginfo package maps addresses back to function names.
#      Run this before recording to avoid re-running the workload.

Fedora keeps debuginfo packages separate to save disk space. You can enable the debuginfo repositories globally if you profile often, but debuginfo-install is safer for one-off checks: it installs only the symbols you need, and you can remove them later with dnf remove.
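If you do profile often, the global switch is a one-liner with dnf's config-manager plugin. A sketch, assuming DNF 4 syntax and the stock Fedora repo naming; on DNF 5 the subcommand form differs:

```shell
# Enable every *-debuginfo repo globally (DNF 4 syntax).
# On DNF 5 the equivalent is: dnf config-manager setopt '*-debuginfo.enabled=1'
sudo dnf config-manager --set-enabled '*-debuginfo'
# Revert with --set-disabled when you are done profiling.
```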

Enable debuginfo before you record. Hex addresses are impossible to debug.

Profile CPU with perf

Record a profile to see where the CPU time goes. The perf record command samples the program counter while the application runs. It builds a call graph that shows which functions call which other functions.

sudo perf record -g -- myapp --args # Record call graph for a new process
# WHY: -g captures the call stack, not just leaf functions.
#      This lets you see which function called the hot function.
#      The double dash separates perf flags from app arguments.
sudo perf record -g -p $(pgrep myapp) -- sleep 10 # Attach to running PID
# WHY: -p attaches to an existing process.
#      sleep 10 sets a timeout so perf stops automatically.
#      Use pgrep to find the PID if you don't know it.

Open the recorded data in the interactive viewer. The TUI lists functions sorted by overhead. Drill down by pressing Enter on a function to see its callers and callees.

sudo perf report # Open the TUI to browse the call graph
# WHY: perf report reads the perf.data file created by record.
#      Use arrow keys to navigate. Enter to drill down.
#      Press q to quit.

Use perf top for a live view when you don't need to save the data. It updates continuously like the top command.

sudo perf top -p $(pgrep myapp) # Live top-like view of hot functions
# WHY: perf top samples continuously without saving a file.
#      Good for quick checks when you don't need a permanent record.
#      Press q to exit.

Count hardware events to check for cache misses or branch mispredictions. These counters help identify memory-bound bottlenecks that CPU profiling alone might miss.

sudo perf stat -e cache-misses,cache-references,instructions,cycles myapp
# WHY: -e selects specific events.
#      cache-misses vs cycles helps identify memory-bound bottlenecks.
#      A high miss ratio suggests the algorithm doesn't fit in cache.
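The ratio matters more than the raw counts. A quick way to turn the two counters into a percentage; the numbers below are made-up stand-ins for whatever your perf stat run printed:

```shell
# Hypothetical counts copied from a perf stat run.
misses=1200000
refs=48000000
# Print the miss ratio as a percentage of cache references.
awk -v m="$misses" -v r="$refs" 'BEGIN { printf "miss ratio: %.1f%%\n", 100 * m / r }'
# Prints: miss ratio: 2.5%
```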

Record with -g. Without it you see only leaf functions, and the real caller stays hidden.

Trace syscalls with strace

Trace system calls to find I/O bottlenecks or permission errors. strace intercepts every call the application makes to the kernel. It prints the call name, arguments, return value, and duration.

strace -f -o /tmp/trace.log myapp # Trace syscalls and follow children
# WHY: -f follows forked child processes.
#      -o writes to a file so the terminal doesn't flood.
#      Without -o, output scrolls too fast to read.
#      Check /tmp/trace.log after the app finishes.

Summarize syscall counts to spot excessive calls. The -c flag aggregates data into a table showing how many times each syscall was called and the total time spent.

strace -c myapp # Summarize syscall counts and time
# WHY: -c aggregates data into a table.
#      Use this to spot excessive read, write, or stat calls.
#      High counts on stat often indicate directory scanning.
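If you already captured a raw log with -o and forgot -c, standard text tools can still produce a rough per-syscall tally from the file. A sketch; the printf lines fabricate a tiny sample log, and with -f you would first strip the leading PID column:

```shell
# Fabricated sample lines standing in for a real strace log.
printf '%s\n' \
  'stat("/etc/a", 0x7ffd) = 0 <0.000010>' \
  'stat("/etc/b", 0x7ffd) = -1 ENOENT <0.000008>' \
  'read(3, "key=value\n", 4096) = 10 <0.000012>' > /tmp/trace.log
# Take the syscall name before the first "(" and tally occurrences.
cut -d'(' -f1 /tmp/trace.log | sort | uniq -c | sort -rn
# Top of the output here: "2 stat", then "1 read".
```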

Filter by syscall name when the output is too noisy. Focus on file I/O or network calls to reduce the log size.

strace -e trace=open,openat,read,write -c myapp # Filter specific syscalls and summarize
# WHY: -e trace limits output to listed syscalls.
#      Modern glibc opens files via openat, so list it alongside open.
#      This reduces noise when you only care about file I/O.
#      Combine with -c for a focused summary.

Write to a file with -o. Terminal scrolling destroys context.

Interpret the data

The perf report TUI shows an "Overhead" column. This percentage is the fraction of samples attributed to that function; a function with 50% overhead means half of your CPU samples landed there. Drill down to see the call path. If main calls process_data and process_data shows 40%, the bottleneck is inside process_data. If futex appears high, the application is waiting on locks. If read or write appears, the bottleneck is I/O, not CPU.

strace output includes a duration in angle brackets at the end of each line. This is the time spent in the syscall in seconds. Look for calls that take milliseconds or seconds. A read call taking 500ms indicates a slow disk or network. A stat call failing with ENOENT means the app is looking for a missing file.

openat(AT_FDCWD, "/etc/config.conf", O_RDONLY) = 3 <0.000050>
read(3, "key=value\n", 4096) = 10 <0.000012>
# WHY: The number in angle brackets is the duration in seconds.
#      0.000050 is 50 microseconds.
#      Values above 0.001 indicate noticeable latency.
#      Scan the log for large numbers to find slow calls.
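Scanning by eye works for short logs; for longer ones, awk can pull out anything over a threshold. A sketch using portable awk; the printf lines fabricate a two-line trace, with the second simulating a slow read:

```shell
# Fabricated trace lines; the second one simulates a slow read.
printf '%s\n' \
  'read(3, "key=value\n", 4096) = 10 <0.000012>' \
  'read(4, "...", 4096) = 4096 <0.512340>' > /tmp/trace.log
# Split on "<", strip the trailing ">", print calls over 1 ms.
awk -F'<' 'NF > 1 { t = $NF; sub(/>.*/, "", t); if (t + 0 > 0.001) print }' /tmp/trace.log
# Prints only the 0.512340-second read.
```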

Check the duration column. Slow calls hide in plain sight.

Common pitfalls

perf cannot read kernel pointers by default on some configurations. You will see a warning about restricted address maps. The profiler still works for user-space code, but kernel functions show as [unknown].

WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted.
Check /proc/sys/kernel/kptr_restrict.

Disable the restriction temporarily to see kernel symbols. This requires root.

sudo sysctl kernel.kptr_restrict=0 # Allow perf to read kernel pointers
# WHY: kptr_restrict=0 removes the mask on kernel addresses.
#      This lets perf resolve kernel function names.
#      Revert to 1 or 2 after profiling for security.

Debuginfo must match the running kernel. If you upgraded the kernel but didn't reboot, perf may fail to resolve symbols. Check the kernel version and ensure the debuginfo package matches.

rpm -q kernel # Check running kernel version
rpm -q kernel-debuginfo # Check installed debuginfo version
# WHY: Mismatched versions cause symbol resolution failures.
#      perf uses the debuginfo package to map addresses.
#      Reboot if the versions differ.
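A small sketch that does the comparison in one go, assuming rpm and uname are available; the messages are illustrative, not perf output:

```shell
# Compare the running kernel with the installed kernel-debuginfo package.
running="$(uname -r)"
if rpm -q "kernel-debuginfo-$running" >/dev/null 2>&1; then
  echo "debuginfo matches running kernel ($running)"
else
  echo "no matching debuginfo; install kernel-debuginfo for $running"
fi
```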

SELinux denials can mask the real issue. An app might appear slow because it is blocked by policy, not because of code inefficiency. Check the audit log for denials before assuming a performance bug.

sudo journalctl -t setroubleshoot | tail -20 # Check for SELinux denials
# WHY: setroubleshoot translates audit logs into readable messages.
#      Look for "denied" or "avc" entries.
#      Fix the policy before profiling the app logic.

Check kptr_restrict. Kernel pointer masking breaks perf silently.

When to use this vs alternatives

Use perf record when you need to find which functions consume the most CPU time. Use perf stat when you suspect a hardware bottleneck like cache misses or branch mispredictions. Use strace -c when you want a summary of system call frequency without reading raw logs. Use strace -f -o when the application forks children and you need to trace the entire process tree. Use sysprof when you prefer a graphical interface and need to correlate CPU, memory, and I/O in one timeline. Use valgrind when you suspect memory leaks or use-after-free errors rather than performance issues.

Where to go next