Skip to content

aarch64: fix UB in non-power-of-two reads and writes#2042

Merged
sayantn merged 1 commit intorust-lang:mainfrom
folkertdev:aarch64-non-power-of-two-reads-writes
Feb 27, 2026
Merged

aarch64: fix UB in non-power-of-two reads and writes#2042
sayantn merged 1 commit intorust-lang:mainfrom
folkertdev:aarch64-non-power-of-two-reads-writes

Conversation

@folkertdev
Copy link
Contributor

fixes #2036

I believe this fixes the issue. We should think about how to test that, but also the beta cutoff is soon and so we should try to sync this quickly.

cc @okaneco
r? sayantn

@folkertdev folkertdev marked this pull request as ready for review February 26, 2026 12:30
@sayantn
Copy link
Contributor

sayantn commented Feb 26, 2026

Did you check that this doesn't interfere with the LLVM optimization?

Comment on lines +264 to +268
$crate::ptr::copy_nonoverlapping(
$ptr.cast::<$elem>(),
mem.as_mut_ptr().cast::<$elem>(),
$lanes * 3,
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy_nonoverlapping requires aligned pointers, so shouldn't this use read_unaligned($ptr) before writing to mem?

Same with the store needing to use write_unaligned rather than copy_nonoverlapping because that would require dst to be aligned.

Copy link
Contributor

@sayantn sayantn Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is copying elementwise, it should be ok if the pointer is element-aligned. I am not too familiar with ARM load semantics, is the pointer at least guaranteed to be element-aligned (I guess that's the case just from the type). @folkertdev could you just confirm this please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer should be aligned to the element type, that is assumed by the implementation. You can see that e.g. in the Operation section here:

https://developer.arm.com/architectures/instruction-sets/intrinsics/vld3q_f16

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok in that case it lgtm

@folkertdev
Copy link
Contributor Author

Looks right to me? https://godbolt.org/z/fcbEq19cv

@sayantn sayantn added this pull request to the merge queue Feb 27, 2026
Merged via the queue into rust-lang:main with commit d4a226d Feb 27, 2026
77 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Potential unsoundness in deinterleaving_load macro?

3 participants