Discussion:
[PATCH] drivers/base: use a worker for sysfs unbind
Daniel Vetter
2018-12-10 08:46:53 UTC
Permalink
Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
trace:

kernfs_remove_by_name_ns+0x3b/0x80
bus_remove_driver+0x92/0xa0
acpi_video_unregister+0x24/0x40
i915_driver_unload+0x42/0x130 [i915]
i915_pci_remove+0x19/0x30 [i915]
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x185/0x250
unbind_store+0xaf/0x180
kernfs_fop_write+0x104/0x190

I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.

No idea whether this is the correct place to fix this, should at least
get CI happy again. Also not sure whether we should do the same on the
bind side, there we have the additional complication that the current
code forwards the driver load errno.

Note that the bus locking is already done by
device_release_driver_internal (if you give it the parent), so I
dropped that part. Also note that we don't recheck that the device is
still bound by the same driver, but neither does the current code do
that without races. And I figured that's a obscure enough corner case
to not bother.

v2: Use a task work. An entirely async work leads to impressive
fireworks in our CI, notably in the vtcon bind/unbind code. Task work
will be as synchronous as the current code, and so keep all these
preexisting races neatly tugged under the rug.

Cc: Ramalingam C <***@intel.com>
Cc: Greg Kroah-Hartman <***@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <***@kernel.org>
Signed-off-by: Daniel Vetter <***@intel.com>
---
drivers/base/bus.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 8bfd27ec73d6..095c4a140d76 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -17,6 +17,7 @@
#include <linux/string.h>
#include <linux/mutex.h>
#include <linux/sysfs.h>
+#include <linux/task_work.h>
#include "base.h"
#include "power/power.h"

@@ -174,22 +175,44 @@ static const struct kset_uevent_ops bus_uevent_ops = {

static struct kset *bus_kset;

+struct unbind_work {
+ struct callback_head twork;
+ struct device *dev;
+};
+
+void unbind_work_fn(struct callback_head *work)
+{
+ struct unbind_work *unbind_work =
+ container_of(work, struct unbind_work, twork);
+
+ device_release_driver_internal(unbind_work->dev, NULL,
+ unbind_work->dev->parent);
+ put_device(unbind_work->dev);
+ kfree(unbind_work);
+}
+
/* Manually detach a device from its associated driver. */
static ssize_t unbind_store(struct device_driver *drv, const char *buf,
size_t count)
{
struct bus_type *bus = bus_get(drv->bus);
+ struct unbind_work *unbind_work;
struct device *dev;
int err = -ENODEV;

dev = bus_find_device_by_name(bus, NULL, buf);
if (dev && dev->driver == drv) {
- if (dev->parent && dev->bus->need_parent_lock)
- device_lock(dev->parent);
- device_release_driver(dev);
- if (dev->parent && dev->bus->need_parent_lock)
- device_unlock(dev->parent);
- err = count;
+ unbind_work = kmalloc(sizeof(*unbind_work), GFP_KERNEL);
+ if (unbind_work) {
+ unbind_work->dev = dev;
+ get_device(dev);
+ init_task_work(&unbind_work->twork, unbind_work_fn);
+ task_work_add(current, &unbind_work->twork, true);
+
+ err = count;
+ } else {
+ err = -ENOMEM;
+ }
}
put_device(dev);
bus_put(bus);
--
2.20.0.rc1
Greg Kroah-Hartman
2018-12-10 10:06:34 UTC
Permalink
Post by Daniel Vetter
Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
kernfs_remove_by_name_ns+0x3b/0x80
bus_remove_driver+0x92/0xa0
acpi_video_unregister+0x24/0x40
i915_driver_unload+0x42/0x130 [i915]
i915_pci_remove+0x19/0x30 [i915]
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x185/0x250
unbind_store+0xaf/0x180
kernfs_fop_write+0x104/0x190
I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.
This is odd, why wouldn't any driver hit this issue? And why now since
you say this is triggerable today?

I know scsi was doing some strange things like trying to remove the
device itself from a sysfs callback on the device, which requires it to
just call a different kobject function created just for that type of
thing. Would that also make sense to do here instead of your workqueue?

thanks,

greg k-h
Daniel Vetter
2018-12-10 10:18:32 UTC
Permalink
Post by Greg Kroah-Hartman
Post by Daniel Vetter
Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
kernfs_remove_by_name_ns+0x3b/0x80
bus_remove_driver+0x92/0xa0
acpi_video_unregister+0x24/0x40
i915_driver_unload+0x42/0x130 [i915]
i915_pci_remove+0x19/0x30 [i915]
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x185/0x250
unbind_store+0xaf/0x180
kernfs_fop_write+0x104/0x190
I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.
This is odd, why wouldn't any driver hit this issue? And why now since
you say this is triggerable today?
The above backtrace is triggered by unbinding i915 on current upstream
kernels. Note: Will crash later on rather badly in the
fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
around by first unbinding fbcon manually through sysfs).
Post by Greg Kroah-Hartman
I know scsi was doing some strange things like trying to remove the
device itself from a sysfs callback on the device, which requires it to
just call a different kobject function created just for that type of
thing. Would that also make sense to do here instead of your workqueue?
Note how we blow up on unregistering sw device instances supported by i915
in entirely different subsystems. I guess most drivers just have sysfs
files for their own stuff, where this is done as you describe. The problem
is that there's an awful lot of unrelated stuff hanging off i915.

Or maybe acpi_video is busted, and should be using a different function.
You haven't said which one, and I have no idea which one it is ...

And in case the context wasn't clear: This is unbinding the i915 pci
driver which triggers the above lockdep splat recursion.

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Daniel Vetter
2018-12-10 10:20:58 UTC
Permalink
Post by Daniel Vetter
Post by Greg Kroah-Hartman
Post by Daniel Vetter
Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
kernfs_remove_by_name_ns+0x3b/0x80
bus_remove_driver+0x92/0xa0
acpi_video_unregister+0x24/0x40
i915_driver_unload+0x42/0x130 [i915]
i915_pci_remove+0x19/0x30 [i915]
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x185/0x250
unbind_store+0xaf/0x180
kernfs_fop_write+0x104/0x190
I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.
This is odd, why wouldn't any driver hit this issue? And why now since
you say this is triggerable today?
The above backtrace is triggered by unbinding i915 on current upstream
kernels. Note: Will crash later on rather badly in the
fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
around by first unbinding fbcon manually through sysfs).
Post by Greg Kroah-Hartman
I know scsi was doing some strange things like trying to remove the
device itself from a sysfs callback on the device, which requires it to
just call a different kobject function created just for that type of
thing. Would that also make sense to do here instead of your workqueue?
Note how we blow up on unregistering sw device instances supported by i915
in entirely different subsystems. I guess most drivers just have sysfs
files for their own stuff, where this is done as you describe. The problem
is that there's an awful lot of unrelated stuff hanging off i915.
Or maybe acpi_video is busted, and should be using a different function.
You haven't said which one, and I have no idea which one it is ...
And in case the context wasn't clear: This is unbinding the i915 pci
driver which triggers the above lockdep splat recursion.
btw another option for "fixing" this would be to annotate the mutex_lock
in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
might hide some real bugs), but would get the job done since there's not
actually a deadlock here. Just lockdep being annoyed.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Loading...