How to Debug Boot Failures on Fedora Atomic Desktops

You upgraded and the screen froze

You ran an update on your Fedora Silverblue or Kinoite system and rebooted. The screen hangs on the Fedora logo, or you get dropped to a black terminal with a root prompt. Panic sets in. You worry that the update bricked the laptop or corrupted the filesystem. It almost certainly didn't. Atomic desktops are designed to survive bad updates. The system is likely sitting on a broken deployment while a perfectly good previous deployment waits in the background. Your job is to find the error, switch back to the working state, and fix the root cause.

How Atomic desktops handle updates

Fedora Atomic desktops treat the operating system like a version-controlled repository. The base system under /usr is mounted read-only. You cannot install packages directly onto the base system with dnf. Instead, updates create new "deployments." Think of deployments as snapshots. When you boot, the system loads the latest snapshot. If that snapshot has a broken kernel module, a conflicting package, or a misconfigured service, the boot fails. The previous snapshot remains untouched. You can switch back to it with one command and a reboot.

This architecture provides a safety net that traditional mutable systems lack. On a standard Fedora Workstation, a bad update can leave the system in an inconsistent state that requires manual intervention to fix. On an Atomic desktop, the bad update is isolated in its own deployment. You can roll back to the previous deployment without losing data or configuration.

The failure usually happens in one of three places. The kernel itself may have a bug or a missing module. A service may fail to start and block the boot target. Or a layered package may conflict with the base image. Layered packages are user-installed additions that sit on top of the immutable base. If a layer depends on a specific library version that the base image no longer provides, the layer breaks. When the layer breaks, the boot can fail.

Convention aside: rpm-ostree manages the base image, the deployments, and the layered packages. Run rpm-ostree upgrade for system updates and rpm-ostree install to layer a package. Never run dnf upgrade on the base system. dnf is not how the host is managed, and working around that creates conflicts that break the atomic transaction model.

Diagnose the failure

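Start by looking at the logs from the failed boot. The journal contains the full record of what happened during startup. You need to see the errors that occurred before the system stopped responding. Which boot you ask for depends on where you are. If you reached a terminal inside the failed boot itself, for example by switching to a TTY with Ctrl+Alt+F3 or by landing in the emergency shell, the failed boot is the current one, so use journalctl -b -xe. If you have already rebooted into a working deployment, the failed attempt is the previous boot, and -b -1 selects it.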

Here's how to check the logs from the last failed boot attempt.

# -b -1 selects the previous boot entry.
# -e jumps to the end of the log so you see the most recent errors first.
# -x adds explanatory text for common error codes.
journalctl -b -1 -xe

Look for lines marked Failed or error. Pay attention to service names. If you see Failed to start systemd-udevd.service, the issue is likely related to hardware detection or kernel modules. If you see SELinux: Could not recover process context, the issue is a security policy denial. SELinux denials often block services from starting. Check journalctl -t setroubleshoot for a human-readable summary of SELinux issues.
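
A long journal is easier to read once it is cut down to the lines that look like failures. The sketch below is only an illustration: failed_lines is a hypothetical helper name, and its patterns simply mirror the strings mentioned above (Failed, error, and avc: denied).

```shell
# failed_lines: keep only journal lines that look like failures.
# Hypothetical helper; it reads journal text on stdin, so it also
# works on a log you saved to a file earlier.
failed_lines() {
    grep -Ei 'failed|error|avc: +denied'
}

# On the affected machine, feed it the previous boot's log:
#   journalctl -b -1 --no-pager | failed_lines
```

The filter is deliberately loose. It will catch some harmless lines too, so treat its output as a reading list rather than a diagnosis.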

If the system drops to an emergency shell, you are already in a diagnostic environment. The root filesystem is mounted read-only. You can run journal commands here. If the system hangs completely and you cannot access a terminal, reboot and open the GRUB menu. Hold Shift or press Esc during boot to reveal it. Each deployment has its own menu entry. Select the entry for the previous deployment, which is usually the second one in the list. This gives you a working environment to run commands.

Convention aside: journalctl -xeu <unit> is the muscle-memory command for checking a specific service. Replace <unit> with the service name, like NetworkManager.service. This filters the log to only show lines related to that unit, making it easier to find the root cause.

Recover the system

Once you have identified the error, choose the recovery method that matches the problem. If the error is in the base image, such as a kernel regression or a broken system package, roll back to the previous deployment. If the error is caused by a layered package, you may need to reset the layers. If the OSTree metadata is corrupted, you need to repair the tree state.

Here's how to roll back to the previous deployment.

# This command swaps the boot order.
# The current broken deployment becomes second in line.
# The previous working deployment becomes first.
# Your layers are preserved across the rollback.
rpm-ostree rollback

After running rollback, reboot the system. The system should boot into the previous deployment. If the rollback succeeds, you have a working system again. You can then investigate the update that caused the failure. Check the release notes for the Fedora version. Look for known issues. If the problem is a bug, report it. If the problem is a configuration issue, fix it and try the update again.

If the system still fails to boot after a rollback, the OSTree repository may be corrupted. This can happen after a power loss during an update or a disk error. rpm-ostree has no repair subcommand. The integrity check lives in the underlying ostree tool.

Here's how to check the OSTree repository.

# This command checks the consistency of the OSTree repository.
# It verifies the checksums of the stored objects that deployments are built from.
# It reports corruption. It does not fix anything by itself.
# It does not modify user data or layers.
sudo ostree fsck

Run the check from a deployment that boots. If it reports no errors, the repository is intact and the cause lies elsewhere. If it reports corrupted objects, stay on a deployment that works and read the ostree documentation before deleting anything. If no deployment is usable, you may need to reinstall the system. Back up /var/home first if you can. A reinstall keeps your home directory only when you tell the installer to leave that partition or subvolume alone.

If a layered package broke the boot, rollback may not help. Layers are applied to deployments. If the layer is incompatible with the base image, it can break both the current and previous deployments. In this case, you need to reset the layers. Resetting removes all layered packages and returns the system to the clean base image.

Here's how to reset the system to remove all layers.

# This command removes all layered packages.
# It reverts the system to the base image state.
# Use this when a layer conflicts with the base.
# Your configuration files in /etc are preserved.
rpm-ostree reset

After resetting, reboot. The system will boot without the layers. You can then reinstall the necessary packages one by one to identify the conflicting layer.
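
Before you reset, it helps to write down which packages were layered so that you can add them back one at a time. The sketch below assumes that rpm-ostree status --booted prints a LayeredPackages: line with space-separated names. layered_packages is a hypothetical helper name, and a list long enough to wrap onto further lines is not handled.

```shell
# layered_packages: print one layered package name per line.
# Hypothetical helper; expects `rpm-ostree status --booted` text on stdin.
# Assumption: all names sit on the single "LayeredPackages:" line.
layered_packages() {
    sed -n 's/^ *LayeredPackages: *//p' | tr -s ' ' '\n'
}

# Save the list before the reset, then re-add packages one at a time:
#   rpm-ostree status --booted | layered_packages > ~/layers.txt
#   rpm-ostree install <package>   # reboot and test before adding the next
```

Adding one package per reboot is slow, but it points at the conflicting layer directly instead of leaving you to guess.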

Verify the state

After recovery, verify that the system is in the expected state. Check the deployment list and the boot order. The deployment at the top of the list is the one that boots next, and a bullet marks the deployment you are running now. Together they confirm that the rollback or reset took effect.

Here's how to check the current deployment status.

# This command shows the list of deployments.
# The first deployment listed is the one that will boot next.
# The bullet (●, or * on terminals without UTF-8) marks the currently booted deployment.
# The output also lists any pinned versions and layered packages.
rpm-ostree status

Look for the bullet. If it sits on the first entry, you are running the default deployment and the boot order is stable. If it sits on a lower entry, a different deployment is queued for the next boot. Right after a rollback or reset that is expected, and a reboot completes the switch. Check the LayeredPackages line. Packages listed there are part of that deployment and are re-applied on top of every new base image when you upgrade. If you want to remove one permanently, use rpm-ostree uninstall.

Check the bullet. The bullet tells you exactly what is running. Trust the bullet.
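
The same check can be scripted. This is a rough sketch under stated assumptions rather than a supported interface: it assumes that each deployment's first line in rpm-ostree status starts with ● (or * on terminals without UTF-8) when that deployment is booted and with two spaces otherwise, that detail lines are indented further, and that the first deployment listed is the one that boots next. booted_is_default is a hypothetical helper name.

```shell
# booted_is_default: succeed if the booted deployment is first in the list.
# Hypothetical helper; expects `rpm-ostree status` text on stdin.
# Assumption: deployment header lines start with "● " (or "* ") when booted
# and with exactly two spaces otherwise; detail lines are indented further.
booted_is_default() {
    awk '
        /^(●|\*) /  { n++; if (n == 1) first_booted = 1 }
        /^  [^ ]/   { n++ }
        END         { exit !(n > 0 && first_booted) }
    '
}

# Usage on a running system:
#   rpm-ostree status | booted_is_default && echo "booted deployment is the default"
```

Parsing human-readable output is fragile. For anything beyond a quick check, rpm-ostree status --json is the sturdier input.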

Common errors and pitfalls

Boot failures on Atomic desktops often stem from specific patterns. Understanding these patterns helps you avoid them in the future.

Layer conflicts are the most common user-induced failure. When you layer a package, rpm-ostree checks for dependencies. If a dependency is missing, the transaction fails. However, if a dependency is present but incompatible, the transaction may succeed, and the boot fails. For example, layering a kernel module that targets a different kernel version will break the boot. The module will not load, and the service depending on it will fail.

A dependency failure of this kind looks something like this. The exact wording varies between versions.

error: Transaction test error:
  package kernel-module-extra-1.0-1.fc40.x86_64 requires kernel = 6.6.8, but none of the providers can be installed

This error appears during the layer installation. It tells you exactly what is wrong. The package requires a specific kernel version that is not available. Do not force the installation. Forcing breaks the atomic model and can leave the system in an unrecoverable state.

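Editing files in /usr/lib is another pitfall. On Atomic desktops, /usr is part of the read-only base image, so a normal edit there fails outright. A temporary writable overlay (rpm-ostree usroverlay) lets you change files for testing, but those changes are discarded on the next boot. If you need to modify a configuration file, put your version in /etc. Most services read /etc in preference to the defaults shipped under /usr/lib, and OSTree carries your /etc changes forward into each new deployment. Changes in /etc persist across updates.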

Convention aside: Config files in /etc/ are user-modified. Files in /usr/lib/ ship with the package. Edit /etc/. Never edit /usr/lib/. If you edit /usr/lib/, your changes vanish on reboot, and you waste time debugging a problem that disappears.
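
For systemd-style configuration, the /etc location usually mirrors the /usr/lib one, so the path to edit can be derived mechanically. The sketch below assumes that mirroring, which holds for directories such as systemd units, udev rules, sysctl.d, and tmpfiles.d but not for every package. etc_override_path and the unit name are hypothetical.

```shell
# etc_override_path: map a vendor file under /usr/lib to its /etc counterpart.
# Hypothetical helper; the mapping is a convention of systemd-style
# directories, not a rule that every package follows.
etc_override_path() {
    case "$1" in
        /usr/lib/*) printf '/etc/%s\n' "${1#/usr/lib/}" ;;
        *)          printf '%s\n' "$1" ;;
    esac
}

# Example: copy a unit into /etc before editing it.
#   src=/usr/lib/systemd/system/example.service   # placeholder unit name
#   sudo cp "$src" "$(etc_override_path "$src")"
```

For systemd units specifically, systemctl edit creates a drop-in under /etc for you and avoids copying the whole file.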

SELinux denials can also cause boot failures. If a service is denied access to a resource, it may fail to start. If that service is critical, the boot stops. Check the SELinux logs. Look for avc: denied messages. Use ausearch or journalctl -t setroubleshoot to find the denial. Fix the policy or adjust the service configuration. Do not disable SELinux. Disabling SELinux reduces security and often masks the underlying configuration issue.

Read the error before guessing. The package manager tells you exactly what conflicts. Forcing the install breaks the rollback safety.

Choose the right recovery method

Different symptoms require different tools. Using the wrong tool can waste time or make the problem worse. Match the method to the cause.

Use rpm-ostree rollback when the latest update introduced a regression and you need to return to the previous working state immediately.

Use ostree fsck when you suspect that the OSTree repository itself is damaged, for example when rollback does not help after a power loss during an update.

Use rpm-ostree reset when you layered a package that broke the system and you want to remove all layers and return to the clean base image.

Use the GRUB menu when the system hangs or drops to an emergency shell before you can run commands. Boot the previous deployment's entry from there.

Pick the tool that matches the symptom. Rolling back fixes updates. Repair fixes corruption. Reset fixes layers.

Where to go next

A Fedora Atomic desktop that won't start has usually picked up a bad update, a conflicting layer, or a damaged OSTree repository. In each case the previous deployment is still on disk, so you can return to a working system without losing your data. Once you are back on it, read the journal from the failed boot, fix the cause, and try the update again.