[Bug 109007] radeonsi cache format changed, causes mesa crash on startup
b***@freedesktop.org
2018-12-11 07:48:30 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=109007

Bug ID: 109007
Summary: radeonsi cache format changed, causes mesa crash on
startup
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-***@lists.freedesktop.org
Reporter: ***@reactivated.net
QA Contact: dri-***@lists.freedesktop.org

Since Endless OS upgraded from Mesa-17.3 to Mesa-18.1, many users on AMD-based
platforms have reported that the system fails to boot into the UI. I have
reproduced the problem and confirmed that Xorg crashes very early in startup.

Thread 4 "si_shader:0" received signal SIGSEGV, Segmentation fault.
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1533
Backtrace:
#0 __memcpy_ssse3_back ()
at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1533
#1 0x00007fffeeba2038 in memcpy (__len=3221880836, __src=0x7fffe4000e70,
__dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2 read_data (size=3221880836, data=<optimized out>, ptr=0x7fffe4000e70)
at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:95
#3 read_chunk (ptr=0x7fffe4000e70, ***@entry=0x7fffe4000e6c,
data=***@entry=0x7fffe4000998, size=***@entry=0x7fffe4000980)
at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:121
#4 0x00007fffeeba21b3 in si_load_shader_binary (
shader=***@entry=0x7fffe40008c0, binary=***@entry=0x7fffe4000e00)
at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:187
#5 0x00007fffeeba4810 in si_shader_cache_load_shader (shader=0x7fffe40008c0,
ir_binary=0x7fffe4000a50, sscreen=0x555555a393a0)
at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:275
#6 si_init_shader_selector_async (job=***@entry=0x555555b8dfa0,
thread_index=***@entry=0)
at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:1875
#7 0x00007fffee747a55 in util_queue_thread_func (
input=***@entry=0x555555a39fb0) at ../../../src/util/u_queue.c:271
#8 0x00007fffee7476c7 in impl_thrd_routine (p=<optimized out>)
at ../../../include/c11/threads_posix.h:87
#9 0x00007ffff574d494 in start_thread (arg=0x7fffebe06700)

The problem is that the on-disk radeonsi cache format changed, but the code
that loads cached shaders does not account for the change. The affected
codepath is si_load_shader_binary(), which does:

uint32_t size = *ptr++;
uint32_t crc32 = *ptr++;
[...]
ptr = read_data(ptr, &shader->config, sizeof(shader->config));
ptr = read_data(ptr, &shader->info, sizeof(shader->info));
ptr = read_chunk(ptr, (void**)&shader->binary.code,
&shader->binary.code_size);

So, the blob format is: 4 bytes size, 4 bytes CRC, shader config, shader info,
code.

In Mesa-17.3, si_shader_config was 48 bytes in size, but in Mesa-18.1 and
current master it is 52 bytes, because the max_simd_waves field was added.
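
A minimal sketch of how a loader could notice a size change like this one. Only
the 48- and 52-byte sizes come from the report; the struct names, the
format_stamp() helper and the idea of storing a stamp in the blob are
hypothetical and are not how Mesa's cache is known to work.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the two layouts. Only the sizes (48 and 52 bytes) come
 * from the report; the "words" array is a placeholder for the real fields. */
struct config_old { uint32_t words[12]; };
struct config_new { uint32_t words[12]; uint32_t max_simd_waves; };

/* Hypothetical stamp: combines a hand-maintained version number with the
 * struct size, so adding a field changes the stamp even if nobody
 * remembers to bump the version. */
static uint32_t format_stamp(uint32_t version, size_t config_size)
{
    return (version << 16) | (uint32_t)(config_size & 0xffff);
}

/* A loader would compare the stamp stored with the blob against its own
 * and treat any mismatch as a cache miss instead of parsing the payload. */
static bool stamp_matches(uint32_t stored, uint32_t expected)
{
    return stored == expected;
}
```

With such a stamp, a blob written with the 48-byte layout would be rejected by
a reader built with the 52-byte layout before any field is interpreted.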

After upgrading to Mesa-18.1, with shaders compiled and cached by Mesa-17.3,
the code above reads 4 bytes more for the config than the old blob contains,
so everything after it is misparsed. We enter read_chunk() with the offsets
slightly wrong:

*size = *ptr++;
assert(*data == NULL);
if (!*size)
    return ptr;
*data = malloc(*size);
return read_data(ptr, *data, *size);

and when this code executes, *size has the value 3221880836 for a shader that
is only 884 bytes uncompressed. read_data() then tries to memcpy that much
data, which causes the crash.
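
The off-by-one-word effect can be reproduced in isolation. The sketch below
assumes the layout described above (size, CRC, config, info, length-prefixed
code) and uses made-up helpers write_blob() and read_code_size(); INFO_WORDS is
an arbitrary stand-in for sizeof(shader->info). The first code word is set to
0xC00A0004 purely so that the misread matches the number in the backtrace; what
the real blob contained at that offset is not known from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes in 32-bit words. 12 and 13 correspond to the 48- and 52-byte
 * si_shader_config sizes from the report; INFO_WORDS is a made-up value. */
enum { OLD_CONFIG_WORDS = 12, NEW_CONFIG_WORDS = 13, INFO_WORDS = 2 };

/* Lay out a blob as: size, CRC, config, info, then a length-prefixed code
 * chunk. Returns the number of words written. */
static size_t write_blob(uint32_t *blob, size_t capacity, unsigned config_words,
                         uint32_t code_size, uint32_t first_code_word)
{
    size_t n = 0;
    assert(capacity >= 2 + config_words + INFO_WORDS + 2);
    blob[n++] = 0;                   /* total size (unused in this sketch) */
    blob[n++] = 0;                   /* CRC (unused in this sketch) */
    for (unsigned i = 0; i < config_words; i++)
        blob[n++] = 0;               /* config */
    for (unsigned i = 0; i < INFO_WORDS; i++)
        blob[n++] = 0;               /* info */
    blob[n++] = code_size;           /* length prefix of the code chunk */
    blob[n++] = first_code_word;     /* first word of the code itself */
    return n;
}

/* Return the word a reader takes as the code size when it assumes the
 * config occupies config_words words. */
static uint32_t read_code_size(const uint32_t *blob, unsigned config_words)
{
    const uint32_t *ptr = blob + 2;  /* skip size and CRC */
    ptr += config_words;
    ptr += INFO_WORDS;
    return *ptr;
}
```

A blob written with OLD_CONFIG_WORDS and read with NEW_CONFIG_WORDS yields the
first code word instead of the length prefix, which is the kind of value that
ends up being passed to memcpy.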

Besides the fact that existing disk caches were not invalidated when the
on-disk format changed, this code is also suspect because it never verifies
that it stays within the bounds of the cached shader. An attacker could
rewrite the size field read by the read_chunk() code above to be very large,
fix up the CRC and recompress, and thereby cause other apps to crash in this
way.
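
A minimal sketch of a bounds-checked variant of the read_data()/read_chunk()
pair, to illustrate the missing check. The blob_cursor type and the
cursor_read*() names are hypothetical and not part of Mesa, and the sketch
ignores any alignment or padding the real format may use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cursor over a cache blob: current position and end. */
struct blob_cursor {
    const uint8_t *cur;
    const uint8_t *end;
};

/* Copy size bytes out of the blob, failing if fewer bytes remain. */
static bool cursor_read(struct blob_cursor *c, void *dst, size_t size)
{
    if (size > (size_t)(c->end - c->cur))
        return false;
    memcpy(dst, c->cur, size);
    c->cur += size;
    return true;
}

/* Read a 32-bit length prefix followed by that many bytes. The length is
 * checked against the remaining blob before anything is allocated. */
static bool cursor_read_chunk(struct blob_cursor *c, void **data, uint32_t *size)
{
    *data = NULL;
    if (!cursor_read(c, size, sizeof(*size)))
        return false;
    if (*size == 0)
        return true;
    if (*size > (size_t)(c->end - c->cur))
        return false;                /* claimed length runs past the blob */
    *data = malloc(*size);
    if (!*data)
        return false;
    return cursor_read(c, *data, *size);
}
```

With a check like this, a corrupt or stale length such as 3221880836 makes the
load fail cleanly, and the caller can fall back to recompiling the shader. On
failure after allocation the caller would still need to free *data; in this
sketch that path cannot be reached because the length is validated first.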
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freedesktop.org
2018-12-11 07:54:01 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=109007

Daniel Drake <***@reactivated.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com

--- Comment #1 from Daniel Drake <***@reactivated.net> ---
The on-disk cache format change was introduced by
radeonsi: move max_simd_waves computation into a separate function
https://github.com/mesa3d/mesa/commit/c02c9ee550d137fbea3ed105131d621d6af5813b