SurfaceFlinger and Hardware Composer

SurfaceFlinger and the Hardware Composer HAL prepare buffers of graphical data for display by performing four key tasks:

  • Accepting buffers
  • Determining the most efficient way to composite buffers
  • Compositing buffers
  • Sending buffers to the display

SurfaceFlinger

SurfaceFlinger accepts buffers of data from multiple sources, composites them, and sends them to the display.

When an app comes to the foreground, the WindowManager service asks SurfaceFlinger for a drawing surface. SurfaceFlinger creates a layer (the primary component of which is a BufferQueue) for which SurfaceFlinger acts as the consumer. A binder object for the producer side is passed through the WindowManager to the app, which can then start sending frames directly to SurfaceFlinger.
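The producer and consumer roles described above can be sketched with a toy model. This is not the real BufferQueue implementation: the class name `ToyBufferQueue`, the slot count of three, and the string "frames" are assumptions made for illustration. The method names mirror the dequeue, queue, acquire, and release steps that a buffer goes through.

```python
from collections import deque


class ToyBufferQueue:
    """Toy stand-in for a BufferQueue: a fixed pool of buffer slots
    that cycle between a producer and a consumer."""

    def __init__(self, slot_count=3):          # slot count is an assumption
        self.free = deque(range(slot_count))   # slots the producer may fill
        self.queued = deque()                  # filled slots awaiting the consumer
        self.contents = {}                     # slot -> frame data

    # Producer side (the app, through the binder object it was handed).
    def dequeue_buffer(self):
        return self.free.popleft() if self.free else None

    def queue_buffer(self, slot, frame):
        self.contents[slot] = frame
        self.queued.append(slot)

    # Consumer side (SurfaceFlinger).
    def acquire_buffer(self):
        return self.queued.popleft() if self.queued else None

    def release_buffer(self, slot):
        self.free.append(slot)


bq = ToyBufferQueue()
slot = bq.dequeue_buffer()          # app obtains an empty slot
bq.queue_buffer(slot, "frame 0")    # app submits a rendered frame
acquired = bq.acquire_buffer()      # SurfaceFlinger takes it for composition
shown = bq.contents[acquired]
bq.release_buffer(acquired)         # slot returns to the free pool
```

Only slot indices move between the two sides in this sketch; the frame data stays in one place, which is the property that makes handing frames directly to SurfaceFlinger cheap.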

Most apps have three layers on screen at a time: the status bar at the top of the screen, the navigation bar at the bottom or side, and the app UI. Some apps have more layers and some have fewer. For example, the default home app has a separate layer for the wallpaper, while a full-screen game might hide the status bar. Each layer can be updated independently. The status and navigation bars are rendered by a system process, while the app layers are rendered by the app, with no coordination between the two.

Device displays refresh at a certain rate, typically 60 frames per second (fps) on phones and tablets. If the display contents are updated mid-refresh, tearing is visible, so it's important to update the contents only between cycles. The system receives a signal from the display when it's safe to update the contents. For historical reasons we'll call this the VSYNC signal.

The refresh rate may vary over time. For example, some mobile devices range from 58 to 62 fps depending on current conditions. For an HDMI-attached television, the rate could theoretically dip to 24 or 48 Hz to match a video. Because we can update the screen only once per refresh cycle, submitting buffers for display at 200 fps is a waste of effort, as most of the frames are never seen. Instead of taking action whenever an app submits a buffer, SurfaceFlinger wakes up when the display is ready for something new.
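To put numbers on the 200 fps example, the following sketch works through the arithmetic for a 60 Hz display. The helper name `frames_displayed` is hypothetical, and the model assumes exactly one buffer can reach the screen per refresh cycle.

```python
def frames_displayed(submit_fps, refresh_hz):
    """At most one buffer can reach the screen per refresh cycle."""
    return min(submit_fps, refresh_hz)


submit_fps, refresh_hz = 200, 60
shown = frames_displayed(submit_fps, refresh_hz)   # frames that reach the screen each second
wasted = submit_fps - shown                        # frames rendered but never seen
frame_budget_ms = 1000 / refresh_hz                # time available per refresh, in milliseconds
```

Under these assumptions, 60 of the 200 frames submitted each second are shown and 140 are discarded, and each refresh leaves roughly 16.7 ms to produce the next frame.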

When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers looking for new buffers. If it finds a new one, it acquires it; if not, it continues to use the previously acquired buffer. SurfaceFlinger must always display something, so it hangs on to one buffer. If no buffers have ever been submitted on a layer, the layer is ignored.
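The per-VSYNC walk described above can be modeled in a few lines. This is a simplified sketch rather than SurfaceFlinger's actual code: the `Layer` class, the `on_vsync` function, and the choice to take the oldest queued buffer when several are waiting are all assumptions made for illustration.

```python
from collections import deque


class Layer:
    def __init__(self, name):
        self.name = name
        self.pending = deque()   # buffers queued by the producer
        self.current = None      # buffer currently held for display

    def latch(self):
        """Acquire a new buffer if one is waiting; otherwise keep the old one."""
        if self.pending:
            self.current = self.pending.popleft()
        return self.current


def on_vsync(layers):
    """Return the (layer name, buffer) pairs to composite for this refresh."""
    frame = []
    for layer in layers:
        buf = layer.latch()
        if buf is not None:      # a layer that never received a buffer is ignored
            frame.append((layer.name, buf))
    return frame


status, app, empty = Layer("status bar"), Layer("app"), Layer("never drawn")
status.pending.append("status#1")
app.pending.extend(["app#1", "app#2"])

first = on_vsync([status, app, empty])
second = on_vsync([status, app, empty])
```

On the second refresh the status bar has nothing new, so its previously acquired buffer is reused, while the app layer advances to its next buffer. The layer that never received a buffer is absent from both frames.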

After SurfaceFlinger has collected all buffers for visible layers, it asks the Hardware Composer how composition should be performed.

Hardware Composer

The Hardware Composer HAL (HWC) determines the most efficient way to composite buffers with the available hardware. As a HAL, its implementation is device-specific and usually done by the display hardware OEM.

The value of this approach is easy to recognize when you consider overlay planes, which composite multiple buffers in the display hardware rather than the GPU. For example, consider a typical Android phone in portrait orientation, with the status bar on top, navigation bar at the bottom, and app content everywhere else. The contents for each layer are in separate buffers. You can handle composition using either of the following methods:

  • Rendering the app content into a scratch buffer, then rendering the status bar over it, the navigation bar on top of that, and finally passing the scratch buffer to the display hardware.
  • Passing all three buffers to the display hardware and instructing it to read data from different buffers for different parts of the screen.

The latter approach can be significantly more efficient.

Display processor capabilities vary significantly. The number of overlays, whether layers can be rotated or blended, and restrictions on positioning and overlap can be difficult to express through an API. To accommodate these options, the HWC performs the following calculations:

  1. SurfaceFlinger provides HWC with a full list of layers and asks, "How do you want to handle this?"
  2. HWC responds by marking each layer as overlay or GLES composition.
  3. SurfaceFlinger takes care of any GLES composition, passing the output buffer to HWC, and lets HWC handle the rest.
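The three steps above can be sketched as a small negotiation. The policy inside `hwc_prepare` is hypothetical: a real HWC decides per device, taking rotation, blending, and overlap limits into account. The function names, the layer names, and the rule of reserving one plane for the GLES output are assumptions for illustration, and the sketch assumes at least one overlay plane.

```python
OVERLAY, GLES = "overlay", "gles"


def hwc_prepare(layers, overlay_planes):
    """Hypothetical policy: give overlays to the first layers, reserving
    one plane for the GLES output buffer when not everything fits."""
    if len(layers) <= overlay_planes:
        return {name: OVERLAY for name in layers}
    direct = overlay_planes - 1   # one plane must carry the GLES result
    return {name: (OVERLAY if i < direct else GLES)
            for i, name in enumerate(layers)}


def compose(layers, overlay_planes):
    plan = hwc_prepare(layers, overlay_planes)        # steps 1 and 2
    gles_layers = [n for n in layers if plan[n] == GLES]
    planes = [n for n in layers if plan[n] == OVERLAY]
    if gles_layers:                                   # step 3
        planes.append("gles(" + "+".join(gles_layers) + ")")
    return plan, planes


# Three layers fit into four planes, so no GLES pass is needed.
plan3, planes3 = compose(["status bar", "nav bar", "app"], 4)

# Five layers do not fit: two are composited with GLES into one buffer.
plan5, planes5 = compose(
    ["wallpaper", "app", "status bar", "nav bar", "dialog"], 4)
```

In the five-layer case, the hardware still scans out only four buffers, but SurfaceFlinger has to run a GLES pass to merge two of the layers first.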

Because hardware vendors can custom-tailor the decision-making code, it's possible to get the best performance out of every device.

Overlay planes may be less efficient than GLES composition when nothing on the screen is changing. This is particularly true when overlay contents have transparent pixels and overlapping layers are blended. In such cases, the HWC can choose to request GLES composition for some or all layers and retain the composited buffer. If SurfaceFlinger comes back asking to composite the same set of buffers, the HWC can continue to show the previously composited scratch buffer. This can improve the battery life of an idle device.
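The idle-screen optimization amounts to caching the composited result, keyed on the exact set of input buffers. The sketch below is a toy model under that assumption: the `CachingComposer` class is hypothetical, and it folds the SurfaceFlinger and HWC roles into one object for brevity.

```python
class CachingComposer:
    """Hypothetical cache keyed on the exact set of input buffers."""

    def __init__(self):
        self.cached_key = None
        self.cached_output = None
        self.gles_passes = 0     # number of times composition actually ran

    def present(self, buffers):
        key = tuple(buffers)
        if key != self.cached_key:
            # Inputs changed: composite again and remember the result.
            self.gles_passes += 1
            self.cached_output = "composited(" + ", ".join(buffers) + ")"
            self.cached_key = key
        return self.cached_output


composer = CachingComposer()
for _ in range(3):                       # idle screen: same buffers every refresh
    idle_output = composer.present(["status#1", "app#7"])
passes_while_idle = composer.gles_passes

composer.present(["status#1", "app#8"])  # the app draws a new frame
passes_after_update = composer.gles_passes
```

Three refreshes of an unchanged screen cost one composition pass in this model; a second pass happens only when one of the input buffers changes.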

Devices running Android 4.4 and higher typically support four overlay planes. Attempting to composite more layers than overlays causes the system to use GLES composition for some of them, meaning the number of layers used by an app can have a measurable impact on power consumption and performance.

Virtual displays

SurfaceFlinger supports a primary display (that is, what's built into your phone or tablet), an external display (such as a television connected through HDMI), and one or more virtual displays that make composited output available within the system. Virtual displays can be used to record the screen or send it over a network.

Virtual displays may share the same set of layers as the main display (the layer stack) or have their own set. There's no VSYNC for a virtual display, so the VSYNC for the primary display is used to trigger composition for all displays.

In older versions of Android, virtual displays were always composited with GLES and the Hardware Composer managed composition for the primary display only. In Android 4.4, the Hardware Composer gained the ability to participate in virtual display composition.

As you might expect, frames generated for a virtual display are written to a BufferQueue.

Case study: screenrecord

The screenrecord command allows you to record everything that appears on the screen as an .mp4 file on disk. To implement this, we have to receive composited frames from SurfaceFlinger, write them to the video encoder, and then write the encoded video data to a file. The video codecs are managed by a separate process (mediaserver), so we have to move large graphics buffers around the system. To make it more challenging, we're trying to record 60 fps video at full resolution. The key to making this work efficiently is BufferQueue.

The MediaCodec class allows an app to provide data as raw bytes in buffers, or through a Surface. When screenrecord requests access to a video encoder, mediaserver creates a BufferQueue, connects itself to the consumer side, then passes the producer side back to screenrecord as a Surface.

The screenrecord utility then asks SurfaceFlinger to create a virtual display that mirrors the main display (that is, it has all of the same layers), and directs it to send output to the Surface that came from mediaserver. In this case, SurfaceFlinger is the producer of buffers rather than the consumer.

After the configuration is complete, screenrecord is triggered whenever encoded data appears. As apps draw, their buffers travel to SurfaceFlinger, which composites them into a single buffer that gets sent directly to the video encoder in mediaserver. The full frames are never even seen by the screenrecord process. Internally, mediaserver has its own way of moving buffers around that also passes data by handle, minimizing overhead.
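The data flow can be sketched as three cooperating stages. This is an illustrative model, not the real screenrecord or mediaserver code: the function names, the `handle-N` naming, and the string stand-ins for pixels and encoded packets are all assumptions.

```python
from collections import deque

buffer_store = {}         # handle -> pixel data (stands in for shared graphics memory)
encoder_input = deque()   # the BufferQueue that mediaserver created


def surfaceflinger_composite(frame_no, app_buffers):
    """Producer: composite app buffers into one buffer and queue only its handle."""
    handle = f"handle-{frame_no}"              # hypothetical handle naming
    buffer_store[handle] = " | ".join(app_buffers)
    encoder_input.append(handle)


def mediaserver_encode():
    """Consumer: drain the queued handles and emit encoded packets."""
    packets = []
    while encoder_input:
        handle = encoder_input.popleft()
        packets.append("enc<" + buffer_store.pop(handle) + ">")
    return packets


def screenrecord_write(packets):
    """screenrecord handles only encoded packets, never the raw frames."""
    return "".join(packets)


for n in range(2):
    surfaceflinger_composite(n, [f"status#{n}", f"app#{n}"])
output = screenrecord_write(mediaserver_encode())
```

Note that `screenrecord_write` never reads from `buffer_store`: only handles cross between SurfaceFlinger and the encoder, and only encoded data reaches the recording process.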

Case study: simulate secondary displays

The WindowManager can ask SurfaceFlinger to create a visible layer for which SurfaceFlinger acts as the BufferQueue consumer. It's also possible to ask SurfaceFlinger to create a virtual display, for which SurfaceFlinger acts as the BufferQueue producer.

If you connect a virtual display to a visible layer, a closed loop is created where the composited screen appears in a window. That window is now part of the composited output, so on the next refresh the composited image inside the window shows the window contents as well. To see this in action, enable Developer options in Settings, select Simulate secondary displays, and enable a window. To watch the loop build up, use screenrecord to capture the act of enabling the display, then play the recording back frame by frame.