This page provides a set of tips that you can choose from to improve boot time.
Strip debug symbols from modules
Similar to how debug symbols are stripped from the kernel on a production device, make sure you also strip the debug symbols from modules. Stripping debug symbols from modules helps boot time by reducing the following:
- The time it takes to read the binaries from flash.
- The time it takes to decompress the ramdisk.
- The time it takes to load the modules.
Stripping debug symbols from modules may save several seconds during boot.
Symbol stripping is enabled by default in the Android platform build. To control it explicitly, set BOARD_DO_NOT_STRIP_VENDOR_RAMDISK_MODULES in your device-specific config.
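One way to confirm that a module was actually stripped is to look for DWARF debug sections in its section headers. The following is a minimal sketch, assuming binutils `readelf` (or an equivalent such as `llvm-readelf`) is available on the build host. The function name `has_debug_sections` is hypothetical and isn't part of the Android build.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the section-header listing on stdin
# mentions any DWARF debug section (.debug_info, .debug_line, and so on).
has_debug_sections() {
    grep -q '\.debug_'
}

# Example use on a build host (the module path is a placeholder):
#   readelf -S path/to/module.ko | has_debug_sections && echo "not stripped"
```

The helper reads from stdin so that it can be pointed at any tool that prints section names.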
Use LZ4 compression for kernel and ramdisk
Gzip generates a smaller compressed output compared to LZ4, but LZ4 decompresses faster than Gzip. For the kernel and modules, the absolute storage size reduction from using Gzip isn't that significant compared to the decompression time benefit of LZ4.
Support for LZ4 ramdisk compression has been added to the Android platform build through the BOARD_RAMDISK_USE_LZ4 option. You can set this option in your device-specific config. Kernel compression can be set through kernel defconfig.
Switching to LZ4 should give 500ms to 1000ms faster boot time.
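An LZ4 ramdisk only helps if the kernel can decompress it. As a quick sanity check, the following sketch scans kernel config text for `CONFIG_RD_LZ4=y`, the upstream Kconfig option for LZ4-compressed initial ramdisks. The function name `has_lz4_ramdisk_support` is hypothetical, and where the config text comes from (a `.config` file, or `/proc/config.gz` if your kernel exposes it) depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel config on stdin enables
# decompression of LZ4-compressed initial ramdisks.
has_lz4_ramdisk_support() {
    grep -q '^CONFIG_RD_LZ4=y$'
}

# Example use on a device that exposes its config (an assumption):
#   zcat /proc/config.gz | has_lz4_ramdisk_support || echo "no LZ4 ramdisk support"
```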
Avoid excessive logging in your drivers
In ARM64 and ARM32, function calls that are more than a specific distance from the call site need a jump table (called a procedure linking table, or PLT) to be able to encode the full jump address. Since modules are loaded dynamically, these jump tables need to be fixed up during module load. The calls that need relocation are called relocation entries with explicit addends (RELA entries, for short) in the ELF format.
The Linux kernel does some memory size optimization (such as cache hit optimization) when allocating the PLT. With this upstream commit, the optimization scheme has an O(N^2) complexity, where N is the number of RELAs of type R_AARCH64_JUMP26 or R_AARCH64_CALL26. So having fewer RELAs of these types helps reduce the module load time.
One common coding pattern that increases the number of R_AARCH64_JUMP26 RELAs is excessive logging in a driver. Each call to printk() or any other logging scheme typically adds a JUMP26 RELA entry. In the commit text of the upstream commit, notice that even with the optimization, the six modules take about 250ms to load. That is because those six modules were the six with the most logging.
Reducing logging can save about 100 - 300ms on boot times depending on how excessive the existing logging is.
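To see which modules are most affected, you can count the relevant relocation entries per module. The following is a minimal sketch that counts lines mentioning R_AARCH64_JUMP26 or R_AARCH64_CALL26 in a relocation listing, assuming the listing comes from a tool such as `readelf -rW` that prints the relocation type name on each entry line. The function name `count_plt_relas` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints how many lines of the relocation listing on
# stdin mention an R_AARCH64_JUMP26 or R_AARCH64_CALL26 relocation.
count_plt_relas() {
    grep -cE 'R_AARCH64_(JUMP|CALL)26'
}

# Example use on a build host (the module path is a placeholder):
#   readelf -rW path/to/module.ko | count_plt_relas
```

Comparing the count before and after trimming log statements gives a rough indication of whether the change is worth keeping.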
Enable asynchronous probing, selectively
When a module is loaded, if the device that it supports has already been populated from the DT (devicetree) and added to driver core, then the device probe is done in the context of the module_init() call. When a device probe is done in the context of module_init(), the module can't finish loading until the probe completes. Since module loading is mostly serialized, a device that takes a relatively long time to probe slows the boot time.
To avoid slower boot times, enable asynchronous probing for modules that take a while to probe their devices. Enabling asynchronous probing for all modules might not be beneficial as the time it takes to fork a thread and kick off the probe might be as high as the time it takes to probe the device.
Devices that are connected through a slow bus such as I2C, devices that do firmware loading in their probe function, and devices that do a lot of hardware initialization can lead to the timing issue. The best way to identify when this happens is to collect the probe time for every driver and sort it.
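Collecting and sorting probe times can be sketched as a small log filter. The example below assumes the kernel was booted with the `initcall_debug` parameter and that the log contains lines of the form `probe of <device> returned <ret> after <N> usecs`. That format is an assumption about your kernel version, so adjust the pattern if your log differs. The function name `rank_probe_times` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints
# "<usecs> <device>" for every probe line, slowest probe first.
rank_probe_times() {
    sed -n 's/.*probe of \([^ ]*\) returned [-0-9]* after \([0-9]*\) usecs.*/\2 \1/p' |
        sort -rn
}

# Example use on a device:
#   dmesg | rank_probe_times | head -n 20
```

The devices at the top of the output are the first candidates for asynchronous probing.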
To enable asynchronous probing for a module, it isn't sufficient to only set the asynchronous probing flag in the driver code. For modules, you also need to add module_name.async_probe=1 in the kernel command line or pass async_probe=1 as a module parameter when loading the module.
Enabling asynchronous probing can save about 100 - 500ms on boot times depending on your hardware/drivers.
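When several modules need the command line option, generating the arguments from a list avoids typos. The following sketch builds the `module_name.async_probe=1` arguments described above from a list of module names or `.ko` files. The function name `async_probe_args` and the module names in the usage comment are hypothetical. Converting dashes to underscores is a choice made here to match how module names appear under /sys/module.

```shell
#!/bin/sh
# Hypothetical helper: prints one "<module>.async_probe=1" argument per
# module given, separated by spaces, suitable for the kernel command line.
async_probe_args() {
    out=""
    for mod in "$@"; do
        # Drop any directory and .ko suffix, then normalize dashes.
        name=$(basename "$mod" .ko | sed 's/-/_/g')
        out="${out:+$out }${name}.async_probe=1"
    done
    printf '%s\n' "$out"
}

# Example use (module names are placeholders):
#   async_probe_args touch-panel.ko slow_sensor
# When loading a module by hand, the parameter is passed directly instead:
#   insmod slow_sensor.ko async_probe=1
```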
Probe your CPUfreq driver as early as possible
The earlier your CPUfreq driver probes, the sooner you can scale the CPU
frequency to maximum (or some thermally limited maximum) during boot. The
faster the CPU, the faster the boot. This guideline also applies to
drivers that control the DRAM, memory, and interconnect frequency.
With modules, the load ordering can depend on the initcall level and compile or link order of the drivers. Use MODULE_SOFTDEP() to make sure the cpufreq driver is among the first few modules to load.
Apart from loading the module early, you also need to make sure all the dependencies to probe the CPUfreq driver have also probed. For example, if you need a clock or regulator handle to control the frequency of your CPU, make sure they are probed first. Or you might need thermal drivers to be loaded before the CPUfreq driver if it is possible for your CPUs to get too hot during boot up. So, do what you can to make sure the CPUfreq and relevant devfreq drivers probe as early as possible.
The savings from probing your CPUfreq driver early can be very small to very large depending on how early you can get these to probe and at what frequency the bootloader leaves the CPUs in.
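To check how early the CPUfreq module sits in the load order, you can look up its position in the module load list. The following sketch prints the line number of the first entry in a load list (such as modules.load) whose path contains a given substring. The function name `load_position` and the substring used in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the 1-based position of the first line on
# stdin that contains the substring $1; fails if no line matches.
load_position() {
    awk -v pat="$1" '
        index($0, pat) { print NR; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Example use (file location and substring are placeholders):
#   load_position cpufreq < modules.load
```

A small number means the module loads early. The same check applies to the clock, regulator, or thermal modules it depends on.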
Move modules to second stage init, vendor or vendor_dlkm partition
Because the first stage init process is serialized, there aren't many opportunities to parallelize the boot process. If a module isn't needed for first stage init to finish, move the module to second stage init by placing it in the vendor or vendor_dlkm partition.
First stage init doesn't require probing several devices to get to second stage init. Only console and flash storage functionality are needed for a normal boot flow.
Load the following essential drivers:
For recovery and user space fastbootd mode, first stage init requires more devices to probe (such as USB and display). Keep a copy of these modules in the first stage ramdisk and in the vendor or vendor_dlkm partition. This allows them to be loaded in first stage init for recovery or fastbootd boot flow. However, don't load the recovery mode modules in first stage init during normal boot flow. Recovery mode modules can be deferred to second stage init to decrease the boot time. All other modules that aren't needed in first stage init should be moved to the vendor or vendor_dlkm partition.
Given a list of leaf devices (for example, the UFS or serial), the dev needs.sh script finds all drivers, devices, and modules needed for dependencies or suppliers (for example, clocks, regulators, or gpio) to probe.
Moving modules to second stage init decreases boot times in the following ways:
- Ramdisk size reduction.
  - This yields faster flash reads when the bootloader loads the ramdisk (serialized boot step).
  - This yields faster decompression speeds when the kernel decompresses the ramdisk (serialized boot step).
- Second stage init works in parallel, which hides the module's loading time with the work being done in second stage init.
Moving modules to second stage can save 500 - 1000ms on boot times depending on how many modules you're able to move to second stage init.
Module loading logistics
The latest Android build features board configurations that control which modules copy over to each stage, and which modules load. This section focuses on the following subset:
- BOARD_VENDOR_RAMDISK_KERNEL_MODULES. The list of modules to be copied into the ramdisk.
- BOARD_VENDOR_RAMDISK_KERNEL_MODULES_LOAD. The list of modules to be loaded in first stage init.
- BOARD_VENDOR_RAMDISK_RECOVERY_KERNEL_MODULES_LOAD. The list of modules to be loaded when recovery or fastbootd is selected from the ramdisk.
- BOARD_VENDOR_KERNEL_MODULES. The list of modules to be copied into the vendor or vendor_dlkm partition.
- BOARD_VENDOR_KERNEL_MODULES_LOAD. The list of modules to be loaded in second stage init.
The boot and recovery modules in ramdisk must also be copied to the vendor or vendor_dlkm partition at /vendor/lib/modules. Copying these modules to the vendor partition ensures the modules aren't invisible during second stage init, which is useful for debugging and collecting modinfo for bugreports.
The duplication should cost minimal space on the vendor or vendor_dlkm partition as long as the boot module set is minimized. Make sure that the vendor's modules.list file has a filtered list of modules. The filtered list ensures boot times aren't affected by the modules loading again (which is an expensive process).
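The filtering step can be sketched outside the build system as well. The following example removes modules that were already loaded in first stage init from a full load list while preserving the original load order. It assumes one module path per line on stdin and a file with one boot module file name per line. The function name `second_stage_load_list` and both file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints every module path from stdin whose file name
# isn't listed in the boot module list file $1. Order is preserved because
# load order matters.
second_stage_load_list() {
    boot_list=$1
    while IFS= read -r path; do
        if ! grep -qxF -- "$(basename "$path")" "$boot_list"; then
            printf '%s\n' "$path"
        fi
    done
}

# Example use (file names are placeholders):
#   second_stage_load_list boot_modules.txt < modules.load
```

Matching on the file name rather than the full path mirrors the `%/name` filter patterns used in the board configuration example in this section.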
Ensure that recovery mode modules load as a group. Loading recovery mode modules can be done either in recovery mode, or at the beginning of the second stage init in each boot flow.
You can use the device BoardConfig.mk files to perform these actions as seen in the following example:
    # All kernel modules
    KERNEL_MODULES := $(wildcard $(KERNEL_MODULE_DIR)/*.ko)
    KERNEL_MODULES_LOAD := $(strip $(shell cat $(KERNEL_MODULE_DIR)/modules.load))

    # First stage ramdisk modules
    BOOT_KERNEL_MODULES_FILTER := $(foreach m,$(BOOT_KERNEL_MODULES),%/$(m))

    # Recovery ramdisk modules
    RECOVERY_KERNEL_MODULES_FILTER := $(foreach m,$(RECOVERY_KERNEL_MODULES),%/$(m))

    BOARD_VENDOR_RAMDISK_KERNEL_MODULES += \
        $(filter $(BOOT_KERNEL_MODULES_FILTER) \
            $(RECOVERY_KERNEL_MODULES_FILTER),$(KERNEL_MODULES))

    # ALL modules land in /vendor/lib/modules so they could be rmmod/insmod'd,
    # and modules.list actually limits us to the ones we intend to load.
    BOARD_VENDOR_KERNEL_MODULES := $(KERNEL_MODULES)
    # To limit /vendor/lib/modules to just the ones loaded, use:
    # BOARD_VENDOR_KERNEL_MODULES := $(filter-out \
    #     $(BOOT_KERNEL_MODULES_FILTER),$(KERNEL_MODULES))

    # Group set of /vendor/lib/modules loading order to recovery modules first,
    # then remainder, subtracting both recovery and boot modules which are loaded
    # already.
    BOARD_VENDOR_KERNEL_MODULES_LOAD := \
        $(filter-out $(BOOT_KERNEL_MODULES_FILTER), \
            $(filter $(RECOVERY_KERNEL_MODULES_FILTER),$(KERNEL_MODULES_LOAD)))
    BOARD_VENDOR_KERNEL_MODULES_LOAD += \
        $(filter-out $(BOOT_KERNEL_MODULES_FILTER) \
            $(RECOVERY_KERNEL_MODULES_FILTER),$(KERNEL_MODULES_LOAD))

    # NB: Load order governed by modules.load and not by $(BOOT_KERNEL_MODULES)
    BOARD_VENDOR_RAMDISK_KERNEL_MODULES_LOAD := \
        $(filter $(BOOT_KERNEL_MODULES_FILTER),$(KERNEL_MODULES_LOAD))

    # Group set of /vendor/lib/modules loading order to boot modules first,
    # then the remainder of recovery modules.
    BOARD_VENDOR_RAMDISK_RECOVERY_KERNEL_MODULES_LOAD := \
        $(filter $(BOOT_KERNEL_MODULES_FILTER),$(KERNEL_MODULES_LOAD))
    BOARD_VENDOR_RAMDISK_RECOVERY_KERNEL_MODULES_LOAD += \
        $(filter-out $(BOOT_KERNEL_MODULES_FILTER), \
            $(filter $(RECOVERY_KERNEL_MODULES_FILTER),$(KERNEL_MODULES_LOAD)))
This example showcases an easier-to-manage subset of BOOT_KERNEL_MODULES and RECOVERY_KERNEL_MODULES to be specified locally in the board configuration files. The preceding script finds and fills each of the subset modules from the selected available kernel modules, leaving the remaining modules for second stage init.
For second stage init, we recommend running the module loading as a service so it doesn't block boot flow. Use a shell script to manage the module loading so that other logistics, such as error handling and mitigation, or module load completion, can be reported back (or ignored) if necessary.
You can ignore the load failure of a debug module that isn't present on user builds. To ignore this failure, set the vendor.device.modules.ready property to trigger later stages of the init rc scripting boot flow to continue on to the launch screen. Refer to the following example script:
    #!/vendor/bin/sh
    . . .
    if [ $# -eq 1 ]; then
      cfg_file=$1
    else
      # Set property even if there is no insmod config
      # to unblock early-boot trigger
      setprop vendor.common.modules.ready 1
      setprop vendor.device.modules.ready 1
      exit 1
    fi

    if [ -f "$cfg_file" ]; then
      # Each config line has the form "action|argument".
      while IFS="|" read -r action arg
      do
        case $action in
          "insmod") insmod $arg ;;
          "setprop") setprop $arg 1 ;;
          "enable") echo 1 > $arg ;;
          "modprobe") modprobe -a -d /vendor/lib/modules $arg ;;
          . . .
        esac
      done < "$cfg_file"
    fi
In the hardware rc file, the one shot service could be specified with:

    service insmod-sh /vendor/etc/init.insmod.sh /vendor/etc/init.insmod.<hw>.cfg
        class main
        user root
        group root system
        disabled
        oneshot
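Because the loading script is driven by a config file of `action|argument` lines, it can help to preview what a config would do before putting it on a device. The following is a dry-run sketch, not part of the example script. It prints the command each line would run instead of executing it. The function name `dry_run_cfg`, the handling of empty and `#` lines, and the failure on unknown actions are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads "action|argument" lines on stdin and prints
# the command the loading script would run for each one.
dry_run_cfg() {
    while IFS="|" read -r action arg; do
        case $action in
            insmod)   echo "insmod $arg" ;;
            setprop)  echo "setprop $arg 1" ;;
            enable)   echo "echo 1 > $arg" ;;
            modprobe) echo "modprobe -a -d /vendor/lib/modules $arg" ;;
            ""|"#"*)  ;;  # skip blank and comment lines (this sketch's convention)
            *)        echo "unknown action: $action" >&2; return 1 ;;
        esac
    done
}

# Example use (the config file name is a placeholder):
#   dry_run_cfg < init.insmod.example.cfg
```

Putting a `setprop|vendor.device.modules.ready` line last in the config signals completion only after the earlier load commands have run.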
Additional optimizations can be made after modules move from the first to second stage. You can use the modprobe blocklist feature to split up the second stage boot flow to include deferred module loading of nonessential modules. Loading of modules used exclusively by a specific HAL can be deferred to load the modules only when the HAL is started.
To improve apparent boot times, you can specifically choose modules in the module loading service that are more conducive to loading after the launch screen. For example, you can explicitly late load the modules for the video decoder or Wi-Fi after the init boot flow has been cleared (signaled by an Android property, for example). Make sure the HALs for the late loading modules block long enough when the kernel drivers aren't present.
Alternatively, you can use init's wait <file> [<timeout>] command in the boot flow rc scripting to wait for select sysfs entries to show that driver modules have completed the probe operations. An example of this is waiting for the display driver to complete loading in the background of recovery or fastbootd before presenting menu graphics.
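If the wait has to happen inside a shell script (for example, in the module loading service) rather than in rc scripting, the same idea can be sketched as a polling loop. This is an illustrative equivalent, not init's implementation. The function name `wait_for_path`, the one-second polling interval, and the five-second default timeout are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: polls once per second until the path $1 exists.
# Gives up and fails after $2 seconds (5 if not given).
wait_for_path() {
    path=$1
    remaining=${2:-5}
    while [ ! -e "$path" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example use (the sysfs path is a placeholder):
#   wait_for_path /sys/class/example/display0 10 || echo "display not ready"
```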
Initialize the CPU frequency to a reasonable value in the bootloader
Not all SoCs/products might be able to boot the CPU at the highest frequency due to thermal or power concerns during boot loop tests. However, make sure the bootloader sets the frequency of all the online CPUs to as high as safely possible for a SoC/product. This is very important because, with a fully modular kernel, the init ramdisk decompression takes place before the CPUfreq driver can be loaded. So, if the CPU is left at the lower end of its frequency by the bootloader, the ramdisk decompression time can take longer than a statically compiled kernel (after adjusting for ramdisk size difference) because the CPU frequency would be very low when doing CPU intensive work (decompression). The same applies to memory/interconnect frequency.
Initialize CPU frequency of big CPUs in the bootloader
Before the CPUfreq driver is loaded, the kernel is unaware of the little and big CPU frequencies and doesn't scale the CPUs’ sched capacity for their current frequency. The kernel might migrate threads to the big CPU if the load is sufficiently high on the little CPU.
Make sure the big CPUs are at least as performant as the little CPUs at the frequencies the bootloader leaves them in. For example, if the big CPU is 2x as performant as the little CPU for the same frequency, but the bootloader sets the little CPU’s frequency to 1.5 GHz and the big CPU’s frequency to 300 MHz, then the boot performance is going to drop if the kernel moves a thread to the big CPU. In this example, if it is safe to boot the big CPU at 750 MHz, you should do so even if you do not plan to explicitly use it.
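The arithmetic behind the 750 MHz figure can be written out as a short calculation. The sketch below assumes performance scales linearly with frequency and that the big-to-little performance ratio at the same frequency is a whole number. Both are simplifications, and the function names are hypothetical.

```shell
#!/bin/sh
# Little-core frequency (kHz) that matches a big core running at $1 kHz,
# when the big core is $2 times as performant at the same frequency.
little_equivalent_khz() {
    echo $(( $1 * $2 ))
}

# Lowest big-core frequency (kHz) that matches a little core at $1 kHz.
big_break_even_khz() {
    echo $(( $1 / $2 ))
}

little_equivalent_khz 300000 2   # prints 600000: slower than the 1.5 GHz little core
big_break_even_khz 1500000 2     # prints 750000: the break-even point in the example
```

At 300 MHz the big CPU behaves like a 600 MHz little CPU, so a thread migrated to it runs slower than it did on the 1.5 GHz little CPU. At 750 MHz or above the big CPU is at least as fast.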
Drivers should not load firmware in first stage init
There might be some unavoidable cases where firmware needs to be loaded in first stage init. But in general, drivers should not load any firmware in first stage init, especially in device probe context. Loading firmware in first stage init causes the entire boot process to stall if the firmware is not available in the first stage ramdisk. And even if the firmware is present in the first stage ramdisk, it still causes an unnecessary delay.