-
Mikhail Vanyulin authored
Backported from linux-6.7.0, commit a0e026e7b37e997f4fa3fcaa714e5484f3ce9e75 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/phy/phy.c?id=a0e026e7b37e997f4fa3fcaa714e5484f3ce9e75 Since commit 91a7cda1f4b8 ("net: phy: Fix race condition on link status change") all the phy_error() method invocations have been causing the nested-mutex-lock deadlock because it's normally done in the PHY-driver threaded IRQ handlers which since that change have been called with the phydev->lock mutex held. Here is the calls thread: IRQ: phy_interrupt() +-> mutex_lock(&phydev->lock); <--------------------+ drv->handle_interrupt() | Deadlock due +-> ERROR: phy_error() + to the nested +-> phy_process_error() | mutex lock +-> mutex_lock(&phydev->lock); <-+ phydev->state = PHY_ERROR; mutex_unlock(&phydev->lock); mutex_unlock(&phydev->lock); The problem can be easily reproduced just by calling phy_error() from any PHY-device threaded interrupt handler. Fix it by dropping the phydev->lock mutex lock from the phy_process_error() method and printing a nasty error message to the system log if the mutex isn't held in the caller execution context. Note for the fix to work correctly in the PHY-subsystem itself the phydev->lock mutex locking must be added to the phy_error_precise() function. Link: https://lore.kernel.org/netdev/20230816180944.19262-1-fancer.lancer@gmail.com Fixes: 91a7cda1f4b8 ("net: phy: Fix race condition on link status change") Suggested-by:
Andrew Lunn <andrew@lunn.ch> Signed-off-by:
Serge Semin <fancer.lancer@gmail.com> Reviewed-by:
Andrew Lunn <andrew@lunn.ch> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Mikhail Vanyulin <mikhail.vanyulin@rtsoft.de>
d4a5ac11
Code owners
Assign users and groups as approvers for specific file changes. Learn more.