The Android build system for Android 13 and lower supports using Clang's profile-guided optimization (PGO) on native Android modules that have Blueprint build rules. This page describes Clang PGO, how to continually generate and update profiles used for PGO, and how to integrate PGO with the build system (with a case study).
NB: This document describes the use of PGO in the Android platform. To learn about using PGO from an Android app, visit this page.
About Clang PGO
Clang can perform profile-guided optimization using two types of profiles:
- Instrumentation-based profiles are generated from an instrumented target program. These profiles are detailed and impose a high runtime overhead.
- Sampling-based profiles are typically produced by sampling hardware counters. They impose a low runtime overhead, and can be collected without any instrumentation or modification to the binary. They are less detailed than instrumentation-based profiles.
All profiles should be generated from a representative workload that exercises the typical behavior of the app. While Clang supports both AST-based (-fprofile-instr-generate) and LLVM IR-based (-fprofile-generate) instrumentation, Android supports only LLVM IR-based instrumentation for instrumentation-based PGO.
The following flags are needed to build for profile collection:
- -fprofile-generate for IR-based instrumentation. With this option, the backend uses a weighted minimal spanning tree approach to reduce the number of instrumentation points and optimize their placement to low-weight edges (use this option for the link step as well). The Clang driver automatically passes the profiling runtime (libclang_rt.profile-arch-android.a) to the linker. This library contains routines to write the profiles to disk upon program exit.
- -gline-tables-only for sampling-based profile collection to generate minimal debug information.
A profile can be used for PGO using -fprofile-use=pathname or -fprofile-sample-use=pathname for instrumentation-based and sampling-based profiles respectively.
Note: As changes are made to the code, if Clang can no longer use the profile data it generates a -Wprofile-instr-out-of-date warning.
Use PGO
Using PGO involves the following steps:
- Build the library/executable with instrumentation by passing -fprofile-generate to the compiler and linker.
- Collect profiles by running a representative workload on the instrumented binary.
- Post-process the profiles using the llvm-profdata utility (for details, see Handle LLVM profile files).
- Use the profiles to apply PGO by passing -fprofile-use=<>.profdata to the compiler and linker.
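The steps above can be sketched for a standalone program built directly with clang, outside the Android build system. This is a minimal sketch under stated assumptions: the names app.c, app, and app.profdata are hypothetical, and the run helper only prints each command unless RUN=1 is set, so the script can be read through without a toolchain installed.

```shell
#!/bin/sh
# Sketch of the four PGO steps for a standalone clang build.
# app.c, app, and app.profdata are hypothetical names.

# Print each command by default; set RUN=1 to execute it instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

pgo_steps() {
    src=$1
    bin=$2
    profdata=$3
    # 1. Instrumented build. One clang invocation compiles and links,
    #    so the flag reaches both steps.
    run clang -O2 -fprofile-generate "$src" -o "$bin.instr"
    # 2. Run a representative workload; LLVM_PROFILE_FILE names the raw profile.
    run env LLVM_PROFILE_FILE="$bin.profraw" "./$bin.instr"
    # 3. Convert the raw profile into the indexed format the compiler reads.
    run llvm-profdata merge -output="$profdata" "$bin.profraw"
    # 4. Optimized build that consumes the profile.
    run clang -O2 -fprofile-use="$profdata" "$src" -o "$bin"
}

pgo_steps app.c app app.profdata
```

In the Android build system these flags are not passed by hand; the pgo property described later on this page makes the build system add them.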
For PGO in Android, profiles should be collected offline and checked in alongside the code to ensure reproducible builds. The profiles can be used as code evolves, but must be regenerated periodically (or whenever Clang warns that the profiles are stale).
Collect profiles
Clang can use profiles collected by running benchmarks using an instrumented build of the library or by sampling hardware counters when the benchmark is run. At this time, Android doesn't support using sampling-based profile collection, so you must collect profiles using an instrumented build:
- Identify a benchmark and the set of libraries collectively exercised by that benchmark.
- Add pgo properties to the benchmark and libraries (details below).
- Produce an Android build with an instrumented copy of these libraries using:
      make ANDROID_PGO_INSTRUMENT=benchmark
  benchmark is a placeholder that identifies the collection of libraries instrumented during build. The actual representative inputs (and possibly another executable that links against a library being benchmarked) aren't specific to PGO and are beyond the scope of this document.
- Flash or sync the instrumented build on a device.
- Run the benchmark to collect profiles.
- Use the llvm-profdata tool (discussed below) to post-process the profiles and make them ready to be checked into the source tree.
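As a rough illustration, the collection steps might look like the following on a development host. This is a sketch, not a prescribed procedure: the benchmark name mybench, the on-device binary path, and the host output directory are all hypothetical, and the exact way to flash or sync depends on the device. The commands are printed rather than executed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the profile collection steps.
# Hypothetical: the benchmark name, device binary path, and output directory.

# Print each command by default; set RUN=1 to execute it instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

collect_profiles() {
    pgo_name=$1
    device_bin=$2
    out_dir=$3
    # Build with the libraries tagged for this benchmark instrumented.
    run make ANDROID_PGO_INSTRUMENT="$pgo_name"
    # Sync the instrumented build to the device (a reboot may be needed afterwards).
    run adb root
    run adb remount
    run adb sync
    # Run the representative workload on the device.
    run adb shell "$device_bin"
    # Raw profiles are written to /data/local/tmp by default; copy them to the host.
    run adb pull /data/local/tmp "$out_dir"
}

collect_profiles mybench /data/local/tmp/mybench ./raw-profiles
```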
Use profiles during build
Check the profiles into toolchain/pgo-profiles in an Android tree. The name should match what is specified in the profile_file sub-property of the pgo property for the library. The build system automatically passes the profile file to Clang when building the library. The ANDROID_PGO_NO_PROFILE_USE environment variable can be set to true to temporarily disable PGO and measure its performance benefit.
To specify additional product-specific profile directories, append them to the PGO_ADDITIONAL_PROFILE_DIRECTORIES make variable in a BoardConfig.mk. If additional paths are specified, profiles in these paths override those in toolchain/pgo-profiles.
When generating a release image using the dist target to make, the build system writes the names of missing profile files to $DIST_DIR/pgo_profile_file_missing.txt. You can check this file to see what profile files were accidentally dropped (which silently disables PGO).
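Because a dropped profile disables PGO silently, a release script can turn the report into a hard failure. The helper below is a minimal sketch; it assumes the report lists one profile file name per line, which this page doesn't specify, and the function name check_missing_profiles is made up for the example.

```shell
#!/bin/sh
# Fail when the dist step reported missing PGO profiles.
# Assumption: the report lists one profile file name per line.

check_missing_profiles() {
    report="$1/pgo_profile_file_missing.txt"
    # No report, or an empty one, means no profile was dropped.
    if [ ! -s "$report" ]; then
        echo "no missing PGO profiles"
        return 0
    fi
    echo "missing PGO profiles:"
    sed 's/^/  /' "$report"
    return 1
}

# Usage: check_missing_profiles "$DIST_DIR"
```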
Enable PGO in Android.bp files
To enable PGO in Android.bp files for native modules, specify the pgo property. This property has the following sub-properties:
Property | Description |
---|---|
instrumentation | Set to true for PGO using instrumentation. Default is false. |
sampling | Set to true for PGO using sampling. Default is false. |
benchmarks | List of strings. This module is built for profiling if any benchmark in the list is specified in the ANDROID_PGO_INSTRUMENT build option. |
profile_file | Profile file (relative to toolchain/pgo-profiles) to use with PGO. The build warns that this file doesn't exist by adding this file to $DIST_DIR/pgo_profile_file_missing.txt unless the enable_profile_use property is set to false OR the ANDROID_PGO_NO_PROFILE_USE build variable is set to true. |
enable_profile_use | Set to false if profiles shouldn't be used during build. Can be used during bootstrap to enable profile collection or to temporarily disable PGO. Default is true. |
cflags | List of additional flags to use during an instrumented build. |
Example of a module with PGO:
    cc_library {
        name: "libexample",
        srcs: [
            "src1.cpp",
            "src2.cpp",
        ],
        static_libs: [
            "libstatic1",
            "libstatic2",
        ],
        shared_libs: [
            "libshared1",
        ],
        pgo: {
            instrumentation: true,
            benchmarks: [
                "benchmark1",
                "benchmark2",
            ],
            profile_file: "example.profdata",
        },
    }
If the benchmarks benchmark1 and benchmark2 exercise representative behavior for libraries libstatic1, libstatic2, or libshared1, the pgo property of these libraries can also include the benchmarks. The defaults module in Android.bp can include a common pgo specification for a set of libraries to avoid repeating the same build rules for several modules.
To select different profile files or selectively disable PGO for an architecture, specify the profile_file, enable_profile_use, and cflags properties per architecture. Example (see the target block):
    cc_library {
        name: "libexample",
        srcs: [
            "src1.cpp",
            "src2.cpp",
        ],
        static_libs: [
            "libstatic1",
            "libstatic2",
        ],
        shared_libs: [
            "libshared1",
        ],
        pgo: {
            instrumentation: true,
            benchmarks: [
                "benchmark1",
                "benchmark2",
            ],
        },
        target: {
            android_arm: {
                pgo: {
                    profile_file: "example_arm.profdata",
                },
            },
            android_arm64: {
                pgo: {
                    profile_file: "example_arm64.profdata",
                },
            },
        },
    }
To resolve references to the profiling runtime library during instrumentation-based profiling, pass the build flag -fprofile-generate to the linker. If a static library is instrumented with PGO, all shared libraries and binaries that directly depend on that static library must also be instrumented for PGO. However, such shared libraries or executables don't need to use PGO profiles, and their enable_profile_use property can be set to false. Outside of this restriction, you can apply PGO to any static library, shared library, or executable.
Handle LLVM profile files
Executing an instrumented library or executable produces a profile file named default_unique_id_0.profraw in /data/local/tmp (where unique_id is a numeric hash that is unique to this library). If this file already exists, the profiling runtime merges the new profile with the old one while writing the profiles. Note that /data/local/tmp isn't accessible to app developers; they should use somewhere like /storage/emulated/0/Android/data/packagename/files instead. To change the location of the profile file, set the LLVM_PROFILE_FILE environment variable at runtime.
The llvm-profdata utility is then used to convert the .profraw file (and possibly merge multiple .profraw files) to a .profdata file:
    llvm-profdata merge -output=profile.profdata <.profraw and/or .profdata files>
profile.profdata can then be checked into the source tree for use during build.
If multiple instrumented binaries/libraries are loaded during a benchmark, each library generates a separate .profraw file with a separate unique ID. Typically, all of these files can be merged to a single .profdata file and used for PGO build. In cases where a library is exercised by another benchmark, that library must be optimized using profiles from both the benchmarks. In this situation, the show option of llvm-profdata is useful:
    llvm-profdata merge -output=default_unique_id.profdata default_unique_id_0.profraw
    llvm-profdata show -all-functions default_unique_id.profdata
To map unique_ids to individual libraries, search the show output for each unique_id for a function name that is unique to the library.
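The search can be scripted when there are many raw profiles. The following is a minimal sketch: the function name find_profile_for_function and the LLVM_PROFDATA override variable are made up for this example, and it relies only on the merge and show -all-functions invocations shown above. It prints every .profraw file whose profile mentions the given function name.

```shell
#!/bin/sh
# Print the raw profiles that contain a function known to live in one library.
# LLVM_PROFDATA (hypothetical variable) selects a specific llvm-profdata binary.

find_profile_for_function() {
    func=$1
    shift
    tool=${LLVM_PROFDATA:-llvm-profdata}
    for raw in "$@"; do
        indexed="${raw%.profraw}.profdata"
        # Convert the raw profile so that show can read it.
        "$tool" merge -output="$indexed" "$raw" || return 1
        # show -all-functions lists every function recorded in the profile.
        if "$tool" show -all-functions "$indexed" | grep -qF -- "$func"; then
            echo "$raw"
        fi
    done
}

# Usage: find_profile_for_function 'SomeUniqueFunction' default_*_0.profraw
```

Pick a function name that appears in only one of the libraries; a name shared across libraries matches several profiles and doesn't identify any of them.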
Case study: PGO for ART
The case study presents ART as a relatable example; however, it isn't an accurate description of the actual set of libraries profiled for ART or their interdependencies.
The dex2oat ahead-of-time compiler in ART depends on libart-compiler.so, which in turn depends on libart.so. The ART runtime is implemented mainly in libart.so. Benchmarks for the compiler and the runtime will be different:
Benchmark | Profiled libraries |
---|---|
dex2oat | dex2oat (executable), libart-compiler.so, libart.so |
art_runtime | libart.so |
- Add the following pgo property to dex2oat and libart-compiler.so:
      pgo: {
          instrumentation: true,
          benchmarks: ["dex2oat",],
          profile_file: "dex2oat.profdata",
      }
- Add the following pgo property to libart.so:
      pgo: {
          instrumentation: true,
          benchmarks: ["art_runtime", "dex2oat",],
          profile_file: "libart.profdata",
      }
- Create instrumented builds for the dex2oat and art_runtime benchmarks using:
      make ANDROID_PGO_INSTRUMENT=dex2oat
      make ANDROID_PGO_INSTRUMENT=art_runtime
- Run the benchmarks exercising dex2oat and art_runtime to obtain:
  - Three .profraw files from dex2oat (converted to dex2oat_exe.profdata, dex2oat_libart-compiler.profdata, and dex2oat_libart.profdata), identified using the method described in Handle LLVM profile files.
  - A single art_runtime_libart.profdata.
- Produce a common profdata file for the dex2oat executable and libart-compiler.so using:
      llvm-profdata merge -output=dex2oat.profdata \
          dex2oat_exe.profdata dex2oat_libart-compiler.profdata
- Obtain the profile for libart.so by merging the profiles from the two benchmarks:
      llvm-profdata merge -output=libart.profdata \
          dex2oat_libart.profdata art_runtime_libart.profdata
  The raw counts for libart.so from the two profiles might be disparate because the benchmarks differ in the number of test cases and the duration for which they run. In this case, you can use a weighted merge:
      llvm-profdata merge -output=libart.profdata \
          -weighted-input=2,dex2oat_libart.profdata \
          -weighted-input=1,art_runtime_libart.profdata
  The above command assigns twice the weight to the profile from dex2oat. The actual weight should be determined based on domain knowledge or experimentation.
- Check the profile files dex2oat.profdata and libart.profdata into toolchain/pgo-profiles for use during build.
Alternatively, create a single instrumented build with all libraries instrumented using:
    make ANDROID_PGO_INSTRUMENT=dex2oat,art_runtime
    (or)
    make ANDROID_PGO_INSTRUMENT=ALL
The second command builds all PGO-enabled modules for profiling.
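To make the effect of the weighted merge in the case study concrete, the arithmetic can be sketched for a single function counter. This assumes, as the llvm-profdata documentation describes, that each input's counts are multiplied by its weight and then summed; the counts below are invented for illustration and the helper name weighted_count is hypothetical.

```shell
#!/bin/sh
# Illustrate how -weighted-input scales one function's count before summing.
# The counts are invented for the example.

weighted_count() {
    # usage: weighted_count WEIGHT_A COUNT_A WEIGHT_B COUNT_B
    echo $(( $1 * $2 + $3 * $4 ))
}

# A function executed 1000 times under dex2oat and 50000 times under art_runtime:
weighted_count 1 1000 1 50000   # plain merge: prints 51000
weighted_count 2 1000 1 50000   # dex2oat weighted 2x: prints 52000
```

With counts this far apart, a weight of 2 barely changes the balance between the two benchmarks, which is one reason the weight is best chosen by experimentation.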