Discussion:
[PATCH 2/3] drm/qxl: use QXL_GEM_DOMAIN_SURFACE for shadow bo.
Gerd Hoffmann
2018-12-06 10:46:37 UTC
Permalink
The shadow bo is used as qxl surface, so allocate it as
QXL_GEM_DOMAIN_SURFACE. Should usually be allocated in
PRIV ttm domain then, so this reduces VRAM memory pressure.

Signed-off-by: Gerd Hoffmann <***@redhat.com>
---
drivers/gpu/drm/qxl/qxl_display.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index ce0b9c40fc..1dde3b2ecb 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -758,7 +758,7 @@ static int qxl_plane_prepare_fb(struct drm_plane *plane,
user_bo->shadow = old_bo->shadow;
} else {
qxl_bo_create(qdev, user_bo->gem_base.size,
- true, true, QXL_GEM_DOMAIN_VRAM, NULL,
+ true, true, QXL_GEM_DOMAIN_SURFACE, NULL,
&user_bo->shadow);
}
}
--
2.9.3
Gerd Hoffmann
2018-12-06 10:46:38 UTC
Permalink
dumb buffers are used as qxl surfaces, so allocate them as
QXL_GEM_DOMAIN_SURFACE. Should usually be allocated in
PRIV ttm domain then, so this reduces VRAM memory pressure.

Signed-off-by: Gerd Hoffmann <***@redhat.com>
---
drivers/gpu/drm/qxl/qxl_dumb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_dumb.c b/drivers/gpu/drm/qxl/qxl_dumb.c
index e3765739c3..272d19b677 100644
--- a/drivers/gpu/drm/qxl/qxl_dumb.c
+++ b/drivers/gpu/drm/qxl/qxl_dumb.c
@@ -59,7 +59,7 @@ int qxl_mode_dumb_create(struct drm_file *file_priv,
surf.stride = pitch;
surf.format = format;
r = qxl_gem_object_create_with_handle(qdev, file_priv,
- QXL_GEM_DOMAIN_VRAM,
+ QXL_GEM_DOMAIN_SURFACE,
args->size, &surf, &qobj,
&handle);
if (r)
--
2.9.3
Gerd Hoffmann
2018-12-06 10:46:36 UTC
Permalink
qxl surfaces (used for framebuffers and gem objects) can live in both
VRAM and PRIV ttm domains. Update placement setup to include both. Put
PRIV first in the list so it is preferred, so VRAM will have more room
for objects which must be allocated there.

Signed-off-by: Gerd Hoffmann <***@redhat.com>
---
drivers/gpu/drm/qxl/qxl_object.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 91f3bbc73e..f7f9f4f4fe 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -58,10 +58,10 @@ void qxl_ttm_placement_from_domain(struct qxl_bo *qbo, u32 domain, bool pinned)

qbo->placement.placement = qbo->placements;
qbo->placement.busy_placement = qbo->placements;
- if (domain == QXL_GEM_DOMAIN_VRAM)
- qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
if (domain == QXL_GEM_DOMAIN_SURFACE)
qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_PRIV | pflag;
+ if (domain == QXL_GEM_DOMAIN_SURFACE || domain == QXL_GEM_DOMAIN_VRAM)
+ qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
if (domain == QXL_GEM_DOMAIN_CPU)
qbo->placements[c++].flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM | pflag;
if (!c)
--
2.9.3
Frediano Ziglio
2018-12-06 10:55:58 UTC
Permalink
Post by Gerd Hoffmann
qxl surfaces (used for framebuffers and gem objects) can live in both
VRAM and PRIV ttm domains. Update placement setup to include both. Put
PRIV first in the list so it is preferred, so VRAM will have more room
for objects which must be allocated there.
I remember these kind of changes in the past made migration
fails. I proposed similar patches years ago and they were rejected
for these reasons. Why now they are safe?

Looks like we are improving QXL, so that means we are actively working
on it. Should we not then thinking about moving feature in the proper
places (like spice-server for atomic mode setting instead of implementin
work around) ??
Post by Gerd Hoffmann
---
drivers/gpu/drm/qxl/qxl_object.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/qxl/qxl_object.c
b/drivers/gpu/drm/qxl/qxl_object.c
index 91f3bbc73e..f7f9f4f4fe 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -58,10 +58,10 @@ void qxl_ttm_placement_from_domain(struct qxl_bo *qbo,
u32 domain, bool pinned)
qbo->placement.placement = qbo->placements;
qbo->placement.busy_placement = qbo->placements;
- if (domain == QXL_GEM_DOMAIN_VRAM)
- qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
if (domain == QXL_GEM_DOMAIN_SURFACE)
qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_PRIV | pflag;
+ if (domain == QXL_GEM_DOMAIN_SURFACE || domain == QXL_GEM_DOMAIN_VRAM)
+ qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
if (domain == QXL_GEM_DOMAIN_CPU)
qbo->placements[c++].flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM | pflag;
if (!c)
Frediano
Gerd Hoffmann
2018-12-06 11:34:16 UTC
Permalink
Post by Frediano Ziglio
Post by Gerd Hoffmann
qxl surfaces (used for framebuffers and gem objects) can live in both
VRAM and PRIV ttm domains. Update placement setup to include both. Put
PRIV first in the list so it is preferred, so VRAM will have more room
for objects which must be allocated there.
I remember these kind of changes in the past made migration
fails. I proposed similar patches years ago and they were rejected
for these reasons.
Pointer?
Post by Frediano Ziglio
Why now they are safe?
Well, you have to be careful what object you are allocating. Surfaces
can be in both PRIV and VRAM. Everything else (qxl commands, monitors
config, ...) must be in VRAM.
Post by Frediano Ziglio
Looks like we are improving QXL, so that means we are actively working
on it.
Well, I'm just trying make qxl behave better with wayland.
Post by Frediano Ziglio
Should we not then thinking about moving feature in the proper
places (like spice-server for atomic mode setting instead of implementin
work around) ??
Main advantage is that it doesn't need qxl device changes, so it works
on old hosts too. But, yes, we can consider to also modernize spice
protocol and qxl device.

cheers,
Gerd
Frediano Ziglio
2018-12-06 14:39:54 UTC
Permalink
Post by Gerd Hoffmann
Post by Frediano Ziglio
Post by Gerd Hoffmann
qxl surfaces (used for framebuffers and gem objects) can live in both
VRAM and PRIV ttm domains. Update placement setup to include both. Put
PRIV first in the list so it is preferred, so VRAM will have more room
for objects which must be allocated there.
I remember these kind of changes in the past made migration
fails. I proposed similar patches years ago and they were rejected
for these reasons.
Pointer?
I think is this:
https://patchwork.kernel.org/patch/7374931/

I think all started with Windows where we have:
https://gitlab.freedesktop.org/spice/win32/qxl-wddm-dod/commit/0214d5ceda3f0da94de3813fc902150d497c6b26
https://gitlab.freedesktop.org/spice/win32/qxl-wddm-dod/commit/54a719e14f1204143da2c64f8a2aaee4fe5cd7d6
Post by Gerd Hoffmann
Post by Frediano Ziglio
Why now they are safe?
Well, you have to be careful what object you are allocating. Surfaces
can be in both PRIV and VRAM. Everything else (qxl commands, monitors
config, ...) must be in VRAM.
Post by Frediano Ziglio
Looks like we are improving QXL, so that means we are actively working
on it.
Well, I'm just trying make qxl behave better with wayland.
As far as I remember the Linux kernel driver simulates the frame buffer
swap with drawings which cause more memory copies.
Not also sure if this workaround make the server consume more network
bandwidth.
Are we supporting multiple monitors for Wayland?
Post by Gerd Hoffmann
Post by Frediano Ziglio
Should we not then thinking about moving feature in the proper
places (like spice-server for atomic mode setting instead of implementin
work around) ??
Main advantage is that it doesn't need qxl device changes, so it works
on old hosts too. But, yes, we can consider to also modernize spice
protocol and qxl device.
cheers,
Gerd
On the other hand we faced some bugs due these workarounds so we end up
with a solution that is less optimal than before and potentially
with more bugs to fix.
At the end we sell to customer a worst product.

Frediano
Gerd Hoffmann
2018-12-07 08:26:23 UTC
Permalink
Post by Frediano Ziglio
Post by Gerd Hoffmann
Post by Frediano Ziglio
Post by Gerd Hoffmann
qxl surfaces (used for framebuffers and gem objects) can live in both
VRAM and PRIV ttm domains. Update placement setup to include both. Put
PRIV first in the list so it is preferred, so VRAM will have more room
for objects which must be allocated there.
I remember these kind of changes in the past made migration
fails. I proposed similar patches years ago and they were rejected
for these reasons.
Pointer?
https://patchwork.kernel.org/patch/7374931/
Ok, problem is you are doing it for both QXL_GEM_DOMAIN_VRAM and
QXL_GEM_DOMAIN_SURFACE. Surfaces can be in both VRAM and PRIV ttm
domains, so for the later this is fine. Most other allocations must be
in VRAM ttm domain though, so allowing PRIV for QXL_GEM_DOMAIN_VRAM is
*not* ok.

Noticed I need patch 1/2 of that series, otherwise things will break
when we run out of space in PRIV domain and surfaces are allocated in
VRAM.
Post by Frediano Ziglio
Post by Gerd Hoffmann
Well, you have to be careful what object you are allocating. Surfaces
can be in both PRIV and VRAM. Everything else (qxl commands, monitors
config, ...) must be in VRAM.
Post by Frediano Ziglio
Looks like we are improving QXL, so that means we are actively working
on it.
Well, I'm just trying make qxl behave better with wayland.
As far as I remember the Linux kernel driver simulates the frame buffer
swap with drawings which cause more memory copies.
Yes. wayland renders into a dumb gem bo (backed by a qxl surface),
Actual primary surface is a shadow bo though. Display updates are done
by sending a draw command for the primary surface, with the pixel data
coming from the dumb gem bo.

We could maybe optimize that, by having the image chunk point directly
to the dumb gem bo instead of allocating memory and memcpy'ing the pixel
data.

I don't feel like putting too much effort into qxl performance
optimization though. The time is better spent on virtio-gpu I think.

BTW: should I send virtio-gpu kernel patches to spice-devel too?
Post by Frediano Ziglio
Not also sure if this workaround make the server consume more network
bandwidth.
Doesn't make much of a difference I think. In case we improve qxl/spice
to support (a) one surface per monitor and (b) pageflips so we don't
need the shadow bo, then we would still need to send the pixel data.
Post by Frediano Ziglio
Are we supporting multiple monitors for Wayland?
I have patches for that. Which basically extend the shadow logic, so
the shadow bo used as primary surface will be big enough that all
heads/monitors/crtcs fit in. qxl/spice has a single surface then, while
userspace (i.e. wayland) can use one dumb gem bo per head/monitor/crtc.

Guess I should re-send them with spice-devel cc'ed,
Post by Frediano Ziglio
Post by Gerd Hoffmann
Main advantage is that it doesn't need qxl device changes, so it works
on old hosts too. But, yes, we can consider to also modernize spice
protocol and qxl device.
On the other hand we faced some bugs due these workarounds so we end up
with a solution that is less optimal than before and potentially
with more bugs to fix.
Ahem, well, not really.

Adding support for atomic modesetting to qxl caused some regressions
indeed. Those should be fixed meanwhile.

Anything which was done in the qxl driver for wayland (most notably the
shadow logic) improved the situation for wayland. wayland is pretty
much unusable without shadowing. I'm not aware of any xorg regressions
caused by dumb bo shadowing. xorg doesn't use dumb bo's in the first
place, and it composes its own qxl drawing commands instead of expecting
the qxl kms driver handle the update on pageflip.

cheers,
Gerd

Loading...