Has Google used A/B OTAs on any devices?
Yes. The marketing name for A/B updates is seamless updates.
Pixel and Pixel XL phones from October 2016 shipped with A/B, and
all Chromebooks use the same implementation of A/B. The required
platform code is public in Android 7.1 and higher.
Why are A/B OTAs better?
A/B OTAs provide a better user experience when taking updates. Measurements from monthly security updates show this feature has already proven a success: As of May 2017, 95% of Pixel owners are running the latest security update after a month compared to 87% of Nexus users, and Pixel users update sooner than Nexus users. Failures to update blocks during an OTA no longer result in a device that won't boot; until the new system image has successfully booted, Android retains the ability to fall back to the previous working system image.
How did A/B affect the 2016 Pixel partition sizes?
The following table contains details on the shipping A/B configuration versus the internally-tested non-A/B configuration:
|Pixel partition sizes||A/B||Non-A/B|
A/B updates require an increase of only 320MiB in flash, with a savings of 32MiB from removing the recovery partition and another 100MiB saved by removing the cache partition. These savings balance the cost of the B partitions for the bootloader, the boot partition, and the radio partition. The vendor partition doubled in size (the vast majority of the size increase). Pixel's A/B system image is half the size of the original non-A/B system image.
For the Pixel A/B and non-A/B variants tested internally (only A/B shipped), the space used differed by only 320MiB. On a 32GiB device, this is just under 1%. For a 16GiB device this would be less than 2%, and for an 8GiB device almost 4% (assuming all three devices had the same system image).
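The percentages above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function and constant names are illustrative only, and it assumes the same fixed 320MiB overhead on every device size.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Extra flash used by the A/B layout, as stated in the text above.
AB_OVERHEAD_BYTES = 320 * MIB


def overhead_percent(device_gib: int) -> float:
    """Return the A/B overhead as a percentage of total flash."""
    return 100 * AB_OVERHEAD_BYTES / (device_gib * GIB)


for size_gib in (32, 16, 8):
    print(f"{size_gib}GiB device: {overhead_percent(size_gib):.2f}% overhead")
```

This prints roughly 0.98%, 1.95%, and 3.91%, matching the "just under 1%", "less than 2%", and "almost 4%" figures.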
Why didn't you use SquashFS?
We experimented with SquashFS but weren't able to achieve the performance desired for a high-end device. We don't use or recommend SquashFS for handheld devices.
More specifically, SquashFS provided about 50% size savings on the system partition, but the overwhelming majority of the files that compressed well were the precompiled .odex files. Those files had very high compression ratios (approaching 80%), but the compression ratio for the rest of the system partition was much lower. In addition, SquashFS in Android 7.0 raised the following performance concerns:
- Pixel has very fast flash compared to earlier devices but not a huge number of spare CPU cycles, so reading fewer bytes from flash but needing more CPU for I/O was a potential bottleneck.
- I/O changes that perform well on an artificial benchmark run on an unloaded system sometimes don't work well on real-world use cases under real-world load (such as crypto on Nexus 6).
- Benchmarking showed 85% regressions in some places.
As SquashFS matures and adds features to reduce CPU impact (such as a whitelist of commonly-accessed files that shouldn't be compressed), we will continue to evaluate it and offer recommendations to device manufacturers.
How did you halve the size of the system partition without SquashFS?
Applications are stored in .apk files, which are actually ZIP archives. Each .apk file has inside it one or more .dex files containing portable Dalvik bytecode. An .odex file (optimized .dex) lives separately from the .apk file and can contain machine code specific to the device. If an .odex file is available, Android can run applications at ahead-of-time compiled speeds without having to wait for the code to be compiled each time the application is launched. An .odex file isn't strictly necessary: Android can actually run the .dex code directly via interpretation or Just-In-Time (JIT) compilation, but an .odex file provides the best combination of launch speed and run-time speed if space is available.
Example: For the installed-files.txt from a Nexus 6P running Android 7.1 with a total system image size of 2628MiB (2755792836 bytes), the breakdown of the largest contributors to overall system image size by file type is as follows:
|.so (native C/C++ code)||202162479 bytes||7.3%|
|.oat files/.art images||163892188 bytes||5.9%|
|icu locale data||27468687 bytes||0.9%|
These figures are similar for other devices too, so on Nexus/Pixel
devices, .odex files take up approximately half the system partition. This meant
we could continue to use ext4 but write the .odex files to the B partition
at the factory and then copy them to
/data on first boot. The
actual storage used with ext4 A/B is identical to SquashFS A/B, because if we
had used SquashFS we would have shipped the preopted .odex files on system_a
instead of system_b.
Doesn't copying .odex files to /data mean the space saved on /system is lost on /data?
Not exactly. On Pixel, most of the space taken by .odex files is for apps,
which typically exist on
/data. These apps take Google Play
updates, so the .apk and .odex files on the system image are unused for most of
the life of the device. Such files can be excluded entirely and replaced by
small, profile-driven .odex files when the user actually uses each app (thus
requiring no space for apps the user doesn't use). For details, refer to the
Google I/O 2016 talk The Evolution of ART.
The comparison is difficult for a few key reasons:
- Apps updated by Google Play have always had their .odex files on /data as soon as they receive their first update.
- Apps the user doesn't run don't need an .odex file at all.
- Profile-driven compilation generates smaller .odex files than ahead-of-time compilation (because the former optimizes only performance-critical code).
For details on the tuning options available to OEMs, see Configuring ART.
Aren't there two copies of the .odex files on /data?
It's a little more complicated ... After the new system image has been
written, the new version of dex2oat is run against the new .dex files to
generate the new .odex files. This occurs while the old system is still running,
so the old and new .odex files are both on
/data at the same time.
The code in OtaDexoptService calls
getAvailableSpace before optimizing each package to avoid overfilling
/data. Note that available here is still
conservative: it's the amount of space left before hitting the usual
system low space threshold (measured as both a percentage and a byte count). So
if /data is full, there won't be two copies of every .odex file.
The same code also has a BULK_DELETE_THRESHOLD: If the device gets that close
to filling the available space (as just described), the .odex files belonging to
apps that aren't used are removed. That's another case without two copies of
every .odex file.
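The space checks described above can be sketched as a small simulation. This is not the OtaDexoptService code: the class, the function, the package names, and the "skip when no space is available" rule are all assumptions made for illustration. Only the ideas of a conservative available-space figure and a bulk delete threshold come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class DataPartition:
    """Toy model of /data (hypothetical, for illustration only)."""
    free_bytes: int
    low_space_threshold: int                      # usual system low-space threshold
    old_odex: dict = field(default_factory=dict)  # app name -> bytes used on /data

    def available_space(self) -> int:
        # Conservative: space left before hitting the low-space threshold.
        return self.free_bytes - self.low_space_threshold

    def delete_unused_odex(self, unused_apps) -> None:
        # Reclaim the old .odex files of apps that aren't used.
        for name in list(self.old_odex):
            if name in unused_apps:
                self.free_bytes += self.old_odex.pop(name)


def optimize_packages(data, packages, unused_apps, bulk_delete_threshold):
    """Return the names of packages that got a new .odex file."""
    optimized = []
    for name, new_odex_bytes in packages:
        if data.available_space() < bulk_delete_threshold:
            data.delete_unused_odex(unused_apps)
        if data.available_space() <= 0:
            continue  # assumed behavior: skip rather than overfill /data
        data.free_bytes -= new_odex_bytes
        optimized.append(name)
    return optimized
```

In this model, a package is skipped once the conservative available-space figure reaches zero, so a nearly full /data never holds a second copy of every .odex file.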
In the worst case where
/data is completely full, the update
waits until the device has rebooted into the new system and no longer needs the
old system's .odex files. The PackageManager handles this:
After the new system has successfully booted, the .odex files that were
used by the old system can be removed, returning the
device back to the steady state where there's only one copy.
So, while it is possible that
/data contains two copies of all
the .odex files, (a) this is temporary and (b) only occurs if you had plenty of
free space on
/data anyway. Except during an update, there's only
one copy. And as part of ART's general robustness features, it will never fill
/data with .odex files anyway (because that would be a problem on a
non-A/B system too).
Doesn't all this writing/copying increase flash wear?
Only a small portion of flash is rewritten: a full Pixel system update writes about 2.3GiB. (Apps are also recompiled, but that's true of non-A/B too.) Traditionally, block-based full OTAs wrote a similar amount of data, so flash wear rates should be similar.
Does flashing two system partitions increase factory flashing time?
No. Pixel didn't increase in system image size (it merely divided the space across two partitions).
Doesn't keeping .odex files on B make rebooting after factory data reset slow?
Yes. If you've actually used a device, taken an OTA, and performed a factory
data reset, the first reboot will be slower than it would otherwise be (1m40s vs
40s on a Pixel XL) because the .odex files will have been lost from B after the
first OTA and so can't be copied to
/data. That's the trade-off.
Factory data reset should be a rare operation when compared to regular boot
so the time taken is less important. (This doesn't affect users or reviewers who
get their device from the factory, because in that case the B partition is
available.) Use of the JIT compiler means we don't need to recompile
everything, so it's not as bad as you might think. It's also possible
to mark apps as requiring ahead-of-time compilation using
coreApp="true" in the manifest.
This is currently used by
system_server because it's not allowed to
JIT for security reasons.
Doesn't keeping .odex files on /data rather than /system make rebooting after an OTA slow?
No. As explained above, the new dex2oat is run while the old system image is still running to generate the files that will be needed by the new system. The update isn't considered available until that work has been done.
Can (should) we ship a 32GiB A/B device? 16GiB? 8GiB?
32GiB works well, as proven on Pixel. 320MiB out of 16GiB is a reduction of about 2%, and 320MiB out of 8GiB is a reduction of about 4%. A/B would not be the recommended choice on devices with 4GiB, where the 320MiB overhead is nearly 8% of the total space.
Does AVB2.0 require A/B OTAs?
No. Android Verified Boot has always required block-based updates, but not necessarily A/B updates.
Do A/B OTAs require AVB2.0?
No.
Do A/B OTAs break AVB2.0's rollback protection?
No. There's some confusion here because if an A/B system fails to boot into the new system image it will (after some number of retries determined by your bootloader) automatically revert to the "previous" system image. The key point here though is that "previous" in the A/B sense is actually still the "current" system image. As soon as the device successfully boots a new image, rollback protection kicks in and ensures that you can't go back. But until you've actually successfully booted the new image, rollback protection doesn't consider it to be the current system image.
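The interaction between fallback and rollback protection can be modeled with a toy slot selector. This is a sketch under stated assumptions, not real bootloader, boot control, or AVB code: the Slot and SlotSelector names, the integer version used as a rollback index, and the default retry count are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    version: int              # hypothetical rollback index of the image
    tries_remaining: int = 0  # boot attempts left before giving up


class SlotSelector:
    """Toy bootloader logic; not the real boot control or AVB code."""

    def __init__(self, current: Slot, updated: Slot, max_retries: int = 3):
        self.current = current              # last image that booted successfully
        self.candidate = updated            # newly written, never booted
        self.candidate.tries_remaining = max_retries
        self.min_version = current.version  # rollback-protection floor

    def is_allowed(self, slot: Slot) -> bool:
        # Rollback protection rejects images older than the floor.
        return slot.version >= self.min_version

    def boot(self, candidate_boots_ok: bool) -> Slot:
        cand = self.candidate
        while cand is not None and cand.tries_remaining > 0:
            cand.tries_remaining -= 1
            if candidate_boots_ok:
                # Only a successful boot makes the new image "current"
                # and raises the rollback-protection floor.
                self.current, self.candidate = cand, None
                self.min_version = max(self.min_version, cand.version)
                return cand
        # Retries exhausted: revert to the image that is still "current".
        self.candidate = None
        return self.current
```

If the new image never boots, the floor stays at the old version, so falling back is not a rollback. Once the new image boots successfully, the old image is rejected by is_allowed.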
If you're installing an update while the system is running, isn't that slow?
With non-A/B updates, the aim is to install the update as quickly as possible because the user is waiting and unable to use their device while the update is applied. With A/B updates, the opposite is true; because the user is still using their device, as little impact as possible is the goal, so the update is deliberately slow. Via logic in the Java system update client (which for Google is GmsCore, the core package provided by GMS), Android also attempts to choose a time when the users aren't using their devices at all. The platform supports pausing/resuming the update, and the client can use that to pause the update if the user starts to use the device and resume it when the device is idle again.
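The pause/resume behavior can be illustrated with a minimal sketch. The BackgroundUpdate class and its methods are hypothetical and do not correspond to the GmsCore client or any platform API. They only show the idea of making progress while the device is idle and stopping while the user is active.

```python
class BackgroundUpdate:
    """Toy model of an update that only progresses while the device is idle."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        self.written = 0
        self.paused = False

    @property
    def done(self) -> bool:
        return self.written >= self.total_blocks

    def on_device_state(self, user_active: bool) -> None:
        # Pause while the user is using the device; resume when it is idle.
        self.paused = user_active

    def tick(self, blocks: int = 1) -> None:
        # Write a small number of blocks per tick to keep the impact low.
        if self.paused or self.done:
            return
        self.written = min(self.total_blocks, self.written + blocks)
```

A client built along these lines would call on_device_state whenever the device becomes active or idle, and tick on a slow timer.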
There are two phases while taking an OTA, shown clearly in the UI as Step 1 of 2 and Step 2 of 2 under the progress bar. Step 1 corresponds with writing the data blocks, while step 2 is pre-compiling the .dex files. These two phases are quite different in terms of performance impact. The first phase is simple I/O. This requires little in the way of resources (RAM, CPU, I/O) because it's just slowly copying blocks around.
The second phase runs dex2oat to precompile the new system image. This obviously has less clear bounds on its requirements because it compiles actual apps. And there's obviously much more work involved in compiling a large and complex app than a small and simple app; whereas in phase 1 there are no disk blocks that are larger or more complex than others.
The process is similar to when Google Play installs an app update in the background before showing the "5 apps updated" notification, as has been done for years.
What if a user is actually waiting for the update?
The current implementation in GmsCore doesn't distinguish between background updates and user-initiated updates but may do so in the future. In the case where the user explicitly asked for the update to be installed or is watching the update progress screen, we'll prioritize the update work on the assumption that they're actively waiting for it to finish.
What happens if there's a failure to apply an update?
With non-A/B updates, if an update failed to apply, the user was usually left with an unusable device. The only exception was if the failure occurred before applying the update had even started (because the package failed to verify, say). With A/B updates, a failure to apply an update does not affect the currently running system. The update can simply be retried later.
Which systems on a chip (SoCs) support A/B?
As of 2017-03-15, we have the following information:
|SoC vendor||Android 7.x and earlier||Android 8.x and later|
|Qualcomm||Depending on OEM requests||All chipsets will get support|
|MediaTek||Depending on OEM requests||All chipsets will get support|
For details on schedules, check with your SoC contacts. For SoCs not listed above, reach out to your SoC vendor directly.