Skip to content

android: Make the backend zero-copy#331

Draft
MarijnS95 wants to merge 2 commits intomasterfrom
android-zerocopy
Draft

android: Make the backend zero-copy#331
MarijnS95 wants to merge 2 commits intomasterfrom
android-zerocopy

Conversation

@MarijnS95
Copy link
Member

WIP but tested change that closes #318.

Still need to discuss how to handle MaybeUninit, and the now side-effect presenting in-progress modifications to pixels_mut() if the caller ended up dropping their Buffer half-way through, which is an unfortunate caveat of having a locked buffer for the surface that unlocks and presents on drop. The turning point is that it's not allowed to lock() a surface twice before unlock()(ing) and inherently presenting it.

Are other platforms affected by such a lock-unlock kind of API? As hinted in #318 ASurfaceControl+ASurfaceTransaction+AHardwareBuffer completely obviate this issue, but that has very high Android requirements (the API's are there for a while, but I seem to have been the first one ever using it on the root Surface, the one you get from NativeActivity, and it didn't work until a bug report and fix since Android 15 (API 35)).

@madsmtm
Copy link
Member

madsmtm commented Feb 1, 2026

Are other platforms affected by such a lock-unlock kind of API?

On macOS, the locking/unlocking operation that IOSurface does is expensive, because it tells the MMU to make the region of memory available to other parts of the system. But you can do that over and over again, because nobody should be using that IOSurface yet.

So I don't think there are other platforms where this is a problem, Android is a bit special in "managing" the buffer(s) for us like this.


Maybe an option would be to store the locked buffer on the surface? Something like:

struct AndroidImpl {
    native_window: NativeWindow,
    in_progress_buffer: Option<NativeWindowBufferLockGuard<'static>>, // + unsafe magic
}

fn buffer_mut() {
    if let Some(native_window_buffer) = self.in_progress_buffer {
        return BufferImpl { native_window_buffer };
    }
    
    let native_window_buffer = self.native_window.lock(None)?;
    BufferImpl { native_window_buffer, surface_in_progress_buffer: &mut self.in_progress_buffer }
}

impl Drop for BufferImpl {
    fn drop(&mut self) {
        let buffer = self.native_window_buffer.take();
        *self.surface_in_progress_buffer = Some(buffer);
    }
}

That would allow the following to work the same as on other platforms:

let mut buffer = surface.buffer_mut();
buffer.pixels().fill(Pixel::rgb(0, 0, 0)); // Clear
drop(buffer);
let mut buffer = surface.buffer_mut();
draw(&mut buffer);
buffer.present();

Comment on lines +141 to +167
// TODO: Validate that Android actually initializes (possibly with garbage) the buffer, or
// let softbuffer initialize it, or return MaybeUninit<Pixel>?
// i.e. consider age().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as android is using a swap chain of two or three buffers and we can get the buffer age, it probably makes the most sense for softbuffer to zero-initialize the buffers, if Android doesn't already do that.

It shouldn't really add any meaningful cost to zero each buffer once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can somehow infer the amount of buffers with ANativeWindow_getBuffersDataSpace?

But I'd also be fine with zero-initializing in buffer_mut for now, and coming back to this later, this PR will be an improvement anyhow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://developer.android.com/ndk/reference/group/a-native-window#anativewindow_lock, I guess it would return the whole bounds of the buffer in inOutDirtyBounds if a buffer is freshly allocated?

Though the API seems a bit backwards from how we have softbuffer work. We don't have a damage region to pass until we present.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though the API seems a bit backwards from how we have softbuffer work. We don't have a damage region to pass until we present.

Yeah, that's discussed in a code comment too:

// TODO: Android requires the damage rect _at lock time_
// Since we're faking the backing buffer _anyway_, we could even fake the surface lock
// and lock it here (if it doesn't influence timings).
//
// Android seems to do this because the region can be expanded by the
// system, requesting the user to actually redraw a larger region.
// It's unclear if/when this is used, or if corruption/artifacts occur
// when the enlarged damage region is not re-rendered?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the implementation of this is in Surface::lock defined in https://android.googlesource.com/platform/frameworks/native/+/main/libs/gui/Surface.cpp.

Which also has:

const bool canCopyBack = (frontBuffer != nullptr &&
        backBuffer->width  == frontBuffer->width &&
        backBuffer->height == frontBuffer->height &&
        backBuffer->format == frontBuffer->format);

And then copies the front buffer to the back buffer if it can.

Or otherwise:

// if we can't copy-back anything, modify the user's dirty
// region to make sure they redraw the whole buffer

Maybe a bit of an abuse of the API, but if we pass an empty inOutDirtyBounds region, maybe we could then see if we get an empty region in return. In which case we can treat it as age 1. Otherwise treat it as a new buffer with age 0, which we can also zero-initialize to if we think that's necessary.

Sounds like that would work? And should generally return an age of 1, keeping things very simple for any damage-tracked renderer using softbuffer, since Android is apparently copying the front buffer to the back buffer already.

Copy link
Member Author

@MarijnS95 MarijnS95 Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can somehow infer the amount of buffers with ANativeWindow_getBuffersDataSpace?

That function queries the color space and related info: https://developer.android.com/ndk/reference/group/a-data-space

But I'd also be fine with zero-initializing in buffer_mut for now, and coming back to this later, this PR will be an improvement anyhow.

Okay, we could do that...


It seems that I get the same pixel buffer back on every 3rd lock, so a typical triple-buffered swapchain.

But always tricky to infer if you don't know about internal copies that may change the context relative to what you thought it might have been given a non-zero age.

That all goes away with custom AHardwareBuffer presented through ASurfaceControl/Transaction.

EDIT: And if I remember correctly, I once asked the Android developers to create an AImageWriter synonymous to the existing AImageReader to serve this exact purpose of (de)queueing buffers from a Surface in a more controllable manner (since the underlying APIs exist in the private ANativeWindow structure).


I'm a bit skeptical with those sources since they may change across Android versions. Despite that the documentation is never comprehensive/complete and may "change" to "clarify" scenarios though.

Maybe a bit of an abuse of the API, but if we pass an empty inOutDirtyBounds region, maybe we could then see if we get an empty region in return.

In that case, isn't it expected or allowed that nothing of the buffer is presented, because nothing was communicated or expected to have changed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit skeptical with those sources since they may change across Android versions. Despite that the documentation is never comprehensive/complete and may "change" to "clarify" scenarios though.

Yeah, me too. So I wouldn't expose it in age, at least not unless we can find some definitive docs for things (including the tripple-buffered statement you made above).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also unclear what would happen on a resize (which also changes the format). If the "view" size changes but resize is not called, does it reuse the same buffers with implied upscaling?

(On wgpu/glutin I saw that this format change implicitly limited EGL to only return configs for that format, so it was definitely confusing)

@MarijnS95
Copy link
Member Author

@madsmtm something like that would work. I dropped the Some reassignment in fn drop() since the mutable lifetime that BufferImpl has on AndroidImpl already guarantees unique ownership, so it can already access the Option directly.

Unfortunately that requires various unwraps (like your suggested self.native_window_buffer.take()), but at least should not be susceptible to std::mem::forget() scenarios (which are safe) where unlock is not called and AndroidImpl::in_progess_buffer remains None?

Copy link
Member

@madsmtm madsmtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately that requires various unwraps

I'm totally fine with a bit of unwrapping, especially after #313, where we don't need to make pixels_mut zero-cost.

should not be susceptible to std::mem::forget() scenarios (which are safe) where unlock is not called and AndroidImpl::in_progess_buffer remains None?

I don't think my solution would've caused unsoundness either if you forget the buffer? It'd just cause a panic because you'd try to lock the buffer twice.

Anyhow, forgetting the buffer is definitely unsupported, as long as things are sound I don't care what the behavior of it is - I'll leave that up to whatever is easiest for you to maintain.

@madsmtm madsmtm added enhancement New feature or request DS - Android NDK labels Feb 3, 2026
@MarijnS95
Copy link
Member Author

MarijnS95 commented Feb 7, 2026

Unfortunately that requires various unwraps

I'm totally fine with a bit of unwrapping

I mostly wanted to perhaps have a wrapper function around this accessor, to have one clear place to document that the unwrap() is actually expected to never be reachable, because the Option is only taken on present right where the struct is also dropped (while being a borrow on AndroidImpl of course).

And again, if it wasn't for present() being directly on Buffer, or if BufferImpl took access to the entire AndroidImpl (which would still require an unwrap()), it could have possibly been a borrow of the Some() portion of the Option (though that likely has implications on multiple mutable borrows). Or with your proposed solution to own it and move it back to AndroidImpl in Drop (though see below).

should not be susceptible to std::mem::forget() scenarios (which are safe) where unlock is not called and AndroidImpl::in_progess_buffer remains None?

I don't think my solution would've caused unsoundness either if you forget the buffer? It'd just cause a panic because you'd try to lock the buffer twice.

I didn't particularly say that your suggestion is unsound, I explicitly named them "scenarios (which are safe)" but they will always error in the way you described.

Anyhow, forgetting the buffer is definitely unsupported, as long as things are sound I don't care what the behavior of it is - I'll leave that up to whatever is easiest for you to maintain.

At least with my variant - where BufferImpl does not own the NativeWindowBufferLockGuard with the disadvantage of constantly having to unwrap() the borrowed Option from the parent AndroidImpl - forgetting without Drop and re-accessing the buffer via buffer_mut() has no such issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

DS - Android NDK enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Zero-copying on Android

3 participants