Troubleshooting the Linux boot process


Knowing the boot sequence is important if you are troubleshooting it. There are many good articles in this area, such as ‘Inside the Linux boot process’ published on the IBM website and Joey’s notes on the Red Hat boot process. I shall not bore everyone again with the details. Instead, I will focus on how to troubleshoot the Red Hat boot process.

Stage 1: Power On Self Test (POST)

You press the ‘on’ button and the computer runs its own tests on the RAM, motherboard, disks, etc. Listen for the beeps. If you can’t get past this stage, don’t bother thinking about Linux.

Stage 2: Master Boot Record (MBR)

The MBR is 512 bytes. The first 446 bytes contain boot code that points to a boot loader somewhere on the disk, usually on the /boot partition. The remaining 66 bytes hold the partition table (64 bytes) and a 2-byte signature.
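To make the layout concrete, here is a minimal Python sketch that splits a 512-byte sector into the 446-byte boot code area, the four 16-byte partition table entries and the 2-byte signature. The function name parse_mbr and the returned field names are my own, for illustration only. It follows the standard MBR layout and is not a replacement for tools like fdisk.

```python
import struct

SECTOR_SIZE = 512        # size of the MBR
BOOT_CODE_SIZE = 446     # boot code area at the start of the sector
ENTRY_SIZE = 16          # one partition table entry
SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR


def parse_mbr(sector):
    """Split a 512-byte MBR into boot code, partition entries and signature."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected %d bytes, got %d" % (SECTOR_SIZE, len(sector)))

    boot_code = sector[:BOOT_CODE_SIZE]
    partitions = []
    for index in range(4):
        start = BOOT_CODE_SIZE + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # Fields: status, (CHS start skipped), type, (CHS end skipped),
        # first LBA sector, number of sectors. All little-endian.
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:
            continue  # unused slot
        partitions.append({
            "number": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })

    return {
        "boot_code_present": any(boot_code),  # False if the area is all zeros
        "partitions": partitions,
        "signature_ok": sector[510:512] == SIGNATURE,
    }
```

To try it on a real machine you would read the first 512 bytes of the boot disk as root, for example from /dev/sda (the device name is an assumption and varies between systems), and pass them to parse_mbr. A missing signature or an all-zero boot code area is a strong hint that the problem is at this stage.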

Stage 3: Boot Loader

GRUB is the default boot loader for Red Hat. It reads its configuration from /boot/grub/grub.conf and attempts to load the kernel, which is usually /boot/vmlinuzXX. If GRUB is corrupted, you need to boot into the rescue environment and run ‘grub-install’.
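When the boot loader starts but picks the wrong kernel, it helps to see what grub.conf actually says. The following is a rough Python sketch that lists the menu entries of a GRUB legacy style grub.conf together with the default index. The function name parse_grub_conf is hypothetical, the parser only understands the ‘default’, ‘title’, ‘kernel’ and ‘initrd’ keywords, and the sample text uses made-up version numbers.

```python
import re


def parse_grub_conf(text):
    """Return (default_index, entries) from GRUB legacy grub.conf text."""
    default_index = 0
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # Keyword and value are separated by '=' or whitespace.
        parts = re.split(r"[=\s]+", line, maxsplit=1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "default" and value.isdigit():
            default_index = int(value)
        elif key == "title":
            entries.append({"title": value, "kernel": None, "initrd": None})
        elif key in ("kernel", "initrd") and entries and value:
            # Keep only the file path, not the kernel arguments.
            entries[-1][key] = value.split()[0]
    return default_index, entries


# Illustrative sample only; the version strings are invented.
SAMPLE_GRUB_CONF = """\
default=0
timeout=5
title Red Hat Enterprise Linux (2.6.18-example)
    root (hd0,0)
    kernel /vmlinuz-2.6.18-example ro root=LABEL=/
    initrd /initrd-2.6.18-example.img
"""

default_index, entries = parse_grub_conf(SAMPLE_GRUB_CONF)
for number, entry in enumerate(entries):
    marker = "*" if number == default_index else " "
    print(marker, entry["title"], entry["kernel"], entry["initrd"])
```

The paths in grub.conf are relative to the partition named in the ‘root’ line, so when /boot is a separate partition, /vmlinuz-… in the config corresponds to /boot/vmlinuz-… on the running system. An entry whose kernel or initrd file does not exist is a likely cause of a failure at this stage.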

Stage 4: Kernel

The kernel checks your hardware and tries to configure it by loading modules from the initial ramdisk, usually /boot/initrdXX. ‘dmesg’ is a good way to check the kernel logs. If there is anything wrong with the kernel, you need to install a new one using the rpm command in rescue mode. Similarly, if there is a problem with the initrd, you need to run ‘mkinitrd’ in rescue mode. After loading all the required kernel modules, the kernel mounts the root partition read-only.

Stage 5: /sbin/init

The init process reads the /etc/inittab file, which specifies a number of tasks to be done.

First, /etc/rc.d/rc.sysinit (which does a bunch of things) is executed. rc.sysinit remounts the root partition read-write and reads /etc/fstab to mount the remaining filesystems. Since /etc/inittab already specifies the default run level, init then starts the services in the matching /etc/rc.d/rcX.d directory. After everything is loaded, it runs /etc/rc.local. Six terminals are then spawned and lastly, in run level 5, X Windows is started with the ‘prefdm’ command.
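Each line of /etc/inittab has the form id:runlevels:action:process, and the default run level is the entry whose action is ‘initdefault’. As a small illustration, the Python sketch below parses inittab text and reports the default run level. The function names and the sample content are my own and only resemble a typical Red Hat inittab, so treat this as a sketch rather than a full parser.

```python
def parse_inittab(text):
    """Return a list of dicts, one per id:runlevels:action:process line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split(":", 3)
        if len(parts) != 4:
            continue  # malformed line, init would complain about it too
        keys = ("id", "runlevels", "action", "process")
        entries.append(dict(zip(keys, parts)))
    return entries


def default_runlevel(entries):
    """Return the run level of the initdefault entry, or None if missing."""
    for entry in entries:
        if entry["action"] == "initdefault":
            return entry["runlevels"]
    return None


# Illustrative sample, shortened from what a real file would contain.
SAMPLE_INITTAB = """\
# Default run level
id:5:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l5:5:wait:/etc/rc.d/rc 5
1:2345:respawn:/sbin/mingetty tty1
x:5:respawn:/etc/X11/prefdm -nodaemon
"""

entries = parse_inittab(SAMPLE_INITTAB)
print("default run level:", default_runlevel(entries))
```

If default_runlevel returns None, or a value such as 0 or 6 (halt and reboot), the file itself is the reason the machine never reaches a login prompt.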

Init is a long process and things can go wrong in several places. If there is a problem with the /etc/inittab or /etc/fstab file, you need to boot into emergency mode and edit the file. Note that in emergency mode, the root filesystem is still mounted read-only, so you need to remount it read-write before making any changes. If the problem is related to lost passwords, X Windows or anything after the init process, run level 1 is good enough, and it doesn’t even prompt you for a password!
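A broken /etc/fstab is often just a typo, so a quick sanity check after editing can save another failed boot. The Python sketch below flags lines with the wrong number of fields, a mount point that is not an absolute path, or non-numeric dump and pass fields. The function name check_fstab and the rules it applies are my own simplification. It does not verify that the devices or labels exist.

```python
def check_fstab(text):
    """Return a list of (line_number, message) for suspicious fstab lines."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split()
        # device, mount point, type, options, then optional dump and pass
        if not 4 <= len(fields) <= 6:
            problems.append(
                (number, "expected 4 to 6 fields, found %d" % len(fields)))
            continue
        mount_point, fs_type = fields[1], fields[2]
        if (fs_type != "swap" and mount_point != "none"
                and not mount_point.startswith("/")):
            problems.append(
                (number, "mount point %r is not an absolute path" % mount_point))
        for name, value in zip(("dump", "pass"), fields[4:]):
            if not value.isdigit():
                problems.append(
                    (number, "%s field %r is not a number" % (name, value)))
    return problems


# Illustrative sample with one deliberate mistake on the second line.
SAMPLE_FSTAB = """\
LABEL=/      /      ext3  defaults  1 1
/dev/sda1    boot   ext3  defaults  1 2
/dev/sda3    swap   swap  defaults  0 0
"""

for number, message in check_fstab(SAMPLE_FSTAB):
    print("line %d: %s" % (number, message))
```

Run against the sample, it should flag only the second line, where the leading slash of /boot is missing.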

You might think that troubleshooting is easy, but when several problems are combined into one, it can get complicated. To be good at troubleshooting, practice is the key.
