  1. Jun 09, 2022
    • selftests/landlock: Extend tests for minimal valid attribute size · d709e275
      Mickaël Salaün authored
      commit 291865bd7e8bb4b4033d341fa02dafa728e6378c upstream.
      
      This might be useful when the struct landlock_ruleset_attr will get more
      fields.
      
      Cc: Shuah Khan <shuah@kernel.org>
      Link: https://lore.kernel.org/r/20220506160820.524344-4-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests/landlock: Make tests build with old libc · a6d127b8
      Mickaël Salaün authored
      commit 87129ef13603ae46c82bcd09eed948acf0506dbb upstream.
      
      Replace SYS_<syscall> with __NR_<syscall>.  Using the __NR_<syscall>
      notation, provided by UAPI, is useful to build tests on systems without
      the SYS_<syscall> definitions.
      
      Replace SYS_pivot_root with __NR_pivot_root, and SYS_move_mount with
      __NR_move_mount.
      
      Define renameat2() and RENAME_EXCHANGE if they are unknown to old build
      systems.
      
      Cc: Shuah Khan <shuah@kernel.org>
      Link: https://lore.kernel.org/r/20220506160820.524344-3-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • landlock: Fix landlock_add_rule(2) documentation · e42fd077
      Mickaël Salaün authored
      commit a13e248ff90e81e9322406c0e618cf2168702f4e upstream.
      
      It is not mandatory to pass a file descriptor obtained with the O_PATH
      flag.  Also, replace rule's accesses with ruleset's accesses.
      
      Link: https://lore.kernel.org/r/20220506160820.524344-2-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • samples/landlock: Format with clang-format · ef350611
      Mickaël Salaün authored
      commit 81709f3dccacf4104a4bc2daa80bdd767a9c4c54 upstream.
      
      Let's follow a consistent and documented coding style.  Everything may
      not be to our liking but it is better than tacit knowledge.  Moreover,
      this will help maintain style consistency between different developers.
      
      This contains only whitespace changes.
      
      Automatically formatted with:
      clang-format-14 -i samples/landlock/*.[ch]
      
      Link: https://lore.kernel.org/r/20220506160513.523257-8-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • samples/landlock: Add clang-format exceptions · ace62469
      Mickaël Salaün authored
      commit 9805a722db071e1772b80e6e0ff33f35355639ac upstream.
      
      In preparation for a following commit, add clang-format on and
      clang-format off stanzas around constant definitions.  This keeps
      values aligned, which is much more readable than packed
      definitions.
      
      Link: https://lore.kernel.org/r/20220506160513.523257-7-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests/landlock: Format with clang-format · de7a39e8
      Mickaël Salaün authored
      commit 371183fa578a4cf56b3ae12e54b7f01a4249add1 upstream.
      
      Let's follow a consistent and documented coding style.  Everything may
      not be to our liking but it is better than tacit knowledge.  Moreover,
      this will help maintain style consistency between different developers.
      
      This contains only whitespace changes.
      
      Automatically formatted with:
      clang-format-14 -i tools/testing/selftests/landlock/*.[ch]
      
      Link: https://lore.kernel.org/r/20220506160513.523257-6-mic@digikod.net
      Cc: stable@vger.kernel.org
      [mic: Update style according to
      https://lore.kernel.org/r/02494cb8-2aa5-1769-f28d-d7206f284e5a@digikod.net]
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests/landlock: Normalize array assignment · 43c3014c
      Mickaël Salaün authored
      commit 135464f9d29c5b306d7201220f1d00dab30fea89 upstream.
      
      Add a comma after each array value to make clang-format keep the
      current array formatting.  See the following commit.
      
      Automatically modified with:
      sed -i 's/\t\({}\|NULL\)$/\0,/' tools/testing/selftests/landlock/fs_test.c
      
      Link: https://lore.kernel.org/r/20220506160513.523257-5-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests/landlock: Add clang-format exceptions · f5c70d9d
      Mickaël Salaün authored
      commit 4598d9abf4215e1e371a35683350d50122793c80 upstream.
      
      In preparation for a following commit, add clang-format on and
      clang-format off stanzas around constant definitions and the TEST_F_FORK
      macro.  This keeps values aligned, which is much more readable
      than packed definitions.
      
      Add other clang-format exceptions for FIXTURE() and
      FIXTURE_VARIANT_ADD() declarations to force a space before the opening
      brace; checkpatch.pl reports its absence.
      
      Link: https://lore.kernel.org/r/20220506160513.523257-4-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • landlock: Format with clang-format · 695c7c06
      Mickaël Salaün authored
      commit 06a1c40a09a8dded4bf0e7e3ccbda6bddcccd7c8 upstream.
      
      Let's follow a consistent and documented coding style.  Everything may
      not be to our liking but it is better than tacit knowledge.  Moreover,
      this will help maintain style consistency between different developers.
      
      This contains only whitespace changes.
      
      Automatically formatted with:
      clang-format-14 -i security/landlock/*.[ch] include/uapi/linux/landlock.h
      
      Link: https://lore.kernel.org/r/20220506160513.523257-3-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • landlock: Add clang-format exceptions · 58f52ad1
      Mickaël Salaün authored
      commit 6cc2df8e3a3967e7c13a424f87f6efb1d4a62d80 upstream.
      
      In preparation for a following commit, add clang-format on and
      clang-format off stanzas around constant definitions.  This keeps
      values aligned, which is much more readable than packed
      definitions.
      
      Link: https://lore.kernel.org/r/20220506160513.523257-2-mic@digikod.net
      
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: ufs: qcom: Add a readl() to make sure ref_clk gets enabled · 1be49ae1
      Manivannan Sadhasivam authored
      commit 8eecddfca30e1651dc1c74531ed5eef21dcce7e3 upstream.
      
      In ufs_qcom_dev_ref_clk_ctrl(), it was noted that the ref_clk needs to be
      stable for at least 1us. Even though there is wmb() to make sure the write
      gets "completed", there is no guarantee that the write actually reached the
      UFS device. There is a good chance that the write could be stored in a
      Write Buffer (WB). In that case, even though the CPU waits for 1us, the
      ref_clk might not be stable for that period.
      
      So let's do a readl() to make sure that the previous write has reached the
      UFS device before udelay().
      
      Also, the wmb() after writel_relaxed() is not really needed. Both writel()
      and readl() are ordered on all architectures and the CPU won't speculate
      instructions after readl() due to the in-built control dependency with read
      value on weakly ordered architectures. So it can be safely removed.
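      
      The effect of the read-back can be illustrated with a toy Python model.
      This is only a sketch under stated assumptions: PostedWriteDevice and
      enable_ref_clk are hypothetical names, and the model simply assumes that
      a read drains earlier buffered writes, as the message describes.

```python
class PostedWriteDevice:
    """Toy model of a device register behind a posted-write buffer."""

    def __init__(self):
        self.register = 0      # value the device actually sees
        self._pending = []     # writes still sitting in the write buffer

    def write(self, value):
        # A posted write returns before the device has seen the value.
        self._pending.append(value)

    def read(self):
        # A read must return device state, so earlier writes are drained first.
        for value in self._pending:
            self.register = value
        self._pending.clear()
        return self.register


def enable_ref_clk(dev, read_back):
    """Return the register value the device holds when the delay would start."""
    dev.write(1)               # request ref_clk enable
    if read_back:
        dev.read()             # force the write out to the device
    return dev.register        # a udelay(1) would begin here
```

      Without the read-back, the delay in this model starts while the device
      still sees the old value; with it, the device has the new value before
      the delay begins.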
      
      Link: https://lore.kernel.org/r/20220504084212.11605-4-manivannan.sadhasivam@linaro.org
      
      
      Fixes: f06fcc71 ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations")
      Cc: stable@vger.kernel.org
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: dc395x: Fix a missing check on list iterator · a078e6e8
      Xiaomeng Tong authored
      commit 036a45aa587a10fa2abbd50fbd0f6c4cfc44f69f upstream.
      
      The bug is here:
      
      	p->target_id, p->target_lun);
      
      The list iterator 'p' will point to a bogus position containing HEAD if the
      list is empty or no element is found. This case must be checked before any
      use of the iterator, otherwise it will lead to an invalid memory access.
      
      To fix this bug, add a check. Use a new variable 'iter' as the list
      iterator, and use the original variable 'p' as a dedicated pointer to point
      to the found element.
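      
      The same pattern can be sketched in Python as a loose analogy. In C the
      stale iterator points at memory derived from the list head; in Python
      the loop variable merely keeps the last element, which is milder but
      wrong in the same way. The function names and the dictionary layout are
      assumptions made for this illustration.

```python
def find_buggy(devices, target_id):
    # Pre-fix pattern: the loop variable is used after the loop even when
    # no element matched.
    p = None
    for p in devices:
        if p["target_id"] == target_id:
            break
    return p  # the last element (a wrong one) when nothing matched


def find_fixed(devices, target_id):
    # Post-fix pattern: 'it' only walks the list, 'p' is set only on a match.
    p = None
    for it in devices:
        if it["target_id"] == target_id:
            p = it
            break
    return p  # None when nothing matched, so the caller can check it
```

      Keeping the iterator and the result in separate variables makes the
      "not found" case explicit instead of leaving it to whatever the loop
      last touched.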
      
      Link: https://lore.kernel.org/r/20220414040231.2662-1-xiam0nd.tong@gmail.com
      
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable@vger.kernel.org
      Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock · 9c96238f
      Junxiao Bi via Ocfs2-devel authored
      commit 863e0d81b6683c4cbc588ad831f560c90e494bef upstream.
      
      When user_dlm_destroy_lock() failed, it did not clean up the flags it had
      set before exiting.  For USER_LOCK_IN_TEARDOWN, if the function fails
      because the lock is still in use, the next time unlink invokes it, it
      returns success.  unlink then removes the inode and dentry if the lock is
      no longer in use (file closed), but the dlm lock is still linked in the
      dlm lock resource, so when a bast comes in it triggers a panic due to a
      use-after-free.  See the panic call trace below.  To fix this,
      USER_LOCK_IN_TEARDOWN should be reverted on failure.  An error should
      also be returned if USER_LOCK_IN_TEARDOWN is already set, to let the user
      know that unlink failed.
      
      For the case of ocfs2_dlm_unlock() failure, USER_LOCK_BUSY must be
      cleared in addition to USER_LOCK_IN_TEARDOWN.  The spin lock is released
      in between, but USER_LOCK_IN_TEARDOWN is still set.  If every place that
      waits on USER_LOCK_BUSY first checks USER_LOCK_IN_TEARDOWN and bails
      out, no flow waits on the busy flag set by user_dlm_destroy_lock(), and
      USER_LOCK_BUSY can simply be reverted when ocfs2_dlm_unlock() fails.
      Fix user_dlm_cluster_lock(), which is the only function not following
      this.
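      
      The flag handling described above can be modelled in a few lines of
      Python. This is a sketch, not the kernel code: the flag values, the
      LockRes class and the dlm_unlock callable are hypothetical stand-ins,
      and locking and waiting are left out entirely.

```python
import errno

# Hypothetical flag values, chosen only for this sketch.
USER_LOCK_BUSY = 0x1
USER_LOCK_IN_TEARDOWN = 0x2


class LockRes:
    def __init__(self, holders=0):
        self.flags = 0
        self.holders = holders


def destroy_lock(lockres, dlm_unlock):
    """Model of the error handling the message asks for.

    dlm_unlock is a stand-in callable returning 0 on success or a negative
    errno on failure.
    """
    if lockres.flags & USER_LOCK_IN_TEARDOWN:
        return -errno.EBUSY        # report the failure to the caller
    lockres.flags |= USER_LOCK_IN_TEARDOWN
    if lockres.holders:
        # Lock still in use: undo the teardown flag before bailing out.
        lockres.flags &= ~USER_LOCK_IN_TEARDOWN
        return -errno.EBUSY
    lockres.flags |= USER_LOCK_BUSY
    status = dlm_unlock()
    if status:
        # Unlock failed: clear both flags so a later unlink can retry.
        lockres.flags &= ~(USER_LOCK_IN_TEARDOWN | USER_LOCK_BUSY)
    return status
```

      In this model a failed call leaves the flags exactly as it found them,
      so a retry is judged on the current state of the lock rather than on
      leftovers from the earlier attempt.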
      
      [  941.336392] (python,26174,16):dlmfs_unlink:562 ERROR: unlink
      004fb0000060000b5a90b8c847b72e1, error -16 from destroy
      [  989.757536] ------------[ cut here ]------------
      [  989.757709] kernel BUG at fs/ocfs2/dlmfs/userdlm.c:173!
      [  989.757876] invalid opcode: 0000 [#1] SMP
      [  989.758027] Modules linked in: ksplice_2zhuk2jr_ib_ipoib_new(O)
      ksplice_2zhuk2jr(O) mptctl mptbase xen_netback xen_blkback xen_gntalloc
      xen_gntdev xen_evtchn cdc_ether usbnet mii ocfs2 jbd2 rpcsec_gss_krb5
      auth_rpcgss nfsv4 nfsv3 nfs_acl nfs fscache lockd grace ocfs2_dlmfs
      ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc
      fcoe libfcoe libfc scsi_transport_fc sunrpc ipmi_devintf bridge stp llc
      rds_rdma rds bonding ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
      rdma_cm ib_cm iw_cm falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE)
      mlx4_vnic falcon_kal(E) falcon_lsm_pinned_13402(E) mlx4_ib ib_sa ib_mad
      ib_core ib_addr xenfs xen_privcmd dm_multipath iTCO_wdt iTCO_vendor_support
      pcspkr sb_edac edac_core i2c_i801 lpc_ich mfd_core ipmi_ssif i2c_core ipmi_si
      ipmi_msghandler
      [  989.760686]  ioatdma sg ext3 jbd mbcache sd_mod ahci libahci ixgbe dca ptp
      pps_core vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mlx4_core crc32c_intel
      be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio
      libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi wmi
      dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
      ksplice_2zhuk2jr_ib_ipoib_old]
      [  989.761987] CPU: 10 PID: 19102 Comm: dlm_thread Tainted: P           OE
      4.1.12-124.57.1.el6uek.x86_64 #2
      [  989.762290] Hardware name: Oracle Corporation ORACLE SERVER
      X5-2/ASM,MOTHERBOARD,1U, BIOS 30350100 06/17/2021
      [  989.762599] task: ffff880178af6200 ti: ffff88017f7c8000 task.ti:
      ffff88017f7c8000
      [  989.762848] RIP: e030:[<ffffffffc07d4316>]  [<ffffffffc07d4316>]
      __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs]
      [  989.763185] RSP: e02b:ffff88017f7cbcb8  EFLAGS: 00010246
      [  989.763353] RAX: 0000000000000000 RBX: ffff880174d48008 RCX:
      0000000000000003
      [  989.763565] RDX: 0000000000120012 RSI: 0000000000000003 RDI:
      ffff880174d48170
      [  989.763778] RBP: ffff88017f7cbcc8 R08: ffff88021f4293b0 R09:
      0000000000000000
      [  989.763991] R10: ffff880179c8c000 R11: 0000000000000003 R12:
      ffff880174d48008
      [  989.764204] R13: 0000000000000003 R14: ffff880179c8c000 R15:
      ffff88021db7a000
      [  989.764422] FS:  0000000000000000(0000) GS:ffff880247480000(0000)
      knlGS:ffff880247480000
      [  989.764685] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  989.764865] CR2: ffff8000007f6800 CR3: 0000000001ae0000 CR4:
      0000000000042660
      [  989.765081] Stack:
      [  989.765167]  0000000000000003 ffff880174d48040 ffff88017f7cbd18
      ffffffffc07d455f
      [  989.765442]  ffff88017f7cbd88 ffffffff816fb639 ffff88017f7cbd38
      ffff8800361b5600
      [  989.765717]  ffff88021db7a000 ffff88021f429380 0000000000000003
      ffffffffc0453020
      [  989.765991] Call Trace:
      [  989.766093]  [<ffffffffc07d455f>] user_bast+0x5f/0xf0 [ocfs2_dlmfs]
      [  989.766287]  [<ffffffff816fb639>] ? schedule_timeout+0x169/0x2d0
      [  989.766475]  [<ffffffffc0453020>] ? o2dlm_lock_ast_wrapper+0x20/0x20
      [ocfs2_stack_o2cb]
      [  989.766738]  [<ffffffffc045303a>] o2dlm_blocking_ast_wrapper+0x1a/0x20
      [ocfs2_stack_o2cb]
      [  989.767010]  [<ffffffffc0864ec6>] dlm_do_local_bast+0x46/0xe0 [ocfs2_dlm]
      [  989.767217]  [<ffffffffc084f5cc>] ? dlm_lockres_calc_usage+0x4c/0x60
      [ocfs2_dlm]
      [  989.767466]  [<ffffffffc08501f1>] dlm_thread+0xa31/0x1140 [ocfs2_dlm]
      [  989.767662]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.767834]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
      [  989.768006]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.768178]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
      [  989.768349]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.768521]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
      [  989.768693]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.768893]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
      [  989.769067]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.769241]  [<ffffffff810ce4d0>] ? wait_woken+0x90/0x90
      [  989.769411]  [<ffffffffc084f7c0>] ? dlm_kick_thread+0x80/0x80 [ocfs2_dlm]
      [  989.769617]  [<ffffffff810a8bbb>] kthread+0xcb/0xf0
      [  989.769774]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.769945]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
      [  989.770117]  [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180
      [  989.770321]  [<ffffffff816fdaa1>] ret_from_fork+0x61/0x90
      [  989.770492]  [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180
      [  989.770689] Code: d0 00 00 00 f0 45 7d c0 bf 00 20 00 00 48 89 83 c0 00 00
      00 48 89 83 c8 00 00 00 e8 55 c1 8c c0 83 4b 04 10 48 83 c4 08 5b 5d c3 <0f>
      0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83
      [  989.771892] RIP  [<ffffffffc07d4316>]
      __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs]
      [  989.772174]  RSP <ffff88017f7cbcb8>
      [  989.772704] ---[ end trace ebd1e38cebcc93a8 ]---
      [  989.772907] Kernel panic - not syncing: Fatal exception
      [  989.773173] Kernel Offset: disabled
      
      Link: https://lkml.kernel.org/r/20220518235224.87100-2-junxiao.bi@oracle.com
      
      
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dlm: fix missing lkb refcount handling · e70f0582
      Alexander Aring authored
      
      commit 1689c169134f4b5a39156122d799b7dca76d8ddb upstream.
      
      We always call hold_lkb(lkb) if we increment lkb->lkb_wait_count.
      So, we always need to call unhold_lkb(lkb) if we decrement
      lkb->lkb_wait_count. This patch adds the missing unhold_lkb(lkb) calls
      where we decrement lkb->lkb_wait_count. Where lkb->lkb_wait_count is set
      to zero, we need to count down until reaching zero and call
      unhold_lkb(lkb) for each step. The waiters list unhold_lkb(lkb) can be
      removed because it is done for the last lkb_wait_count decrement
      iteration, as in _remove_from_waiters().
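      
      The pairing rule can be sketched with a small Python model. The Lkb
      class and helper names below are hypothetical; 'ref' stands in for the
      lkb reference count and 'wait_count' for lkb->lkb_wait_count.

```python
class Lkb:
    """Toy lock block with a reference count and a wait counter."""

    def __init__(self):
        self.ref = 1
        self.wait_count = 0

    def hold(self):
        self.ref += 1

    def unhold(self):
        assert self.ref > 1, "unhold must never drop the last reference"
        self.ref -= 1


def add_to_waiters(lkb):
    lkb.wait_count += 1
    lkb.hold()                 # every increment takes a reference


def remove_from_waiters(lkb):
    lkb.wait_count -= 1
    lkb.unhold()               # every decrement drops one


def clear_waiters(lkb):
    # Resetting the counter to zero directly would leak references;
    # count down instead so each step is paired with unhold().
    while lkb.wait_count:
        remove_from_waiters(lkb)
```

      With a wait count that never exceeds 1, a single unhold happens to be
      enough, which is consistent with the issue staying hidden until the
      counter went higher.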
      
      This issue was discovered by a dlm gfs2 test case that makes excessive
      use of the dlm_unlock(LKF_CANCEL) feature. The lkb->lkb_wait_count value
      probably never rose above 1 when this feature was not used, so the issue
      was not discovered before.
      
      The test case ended with an rsb on the rsb keep data structure with a
      refcount of 1 but no lkb associated with it, which is itself
      invalid. A side effect was a condition in which the dlm kept
      sending remove messages in a loop. With this patch that has not been
      reproduced.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dlm: uninitialized variable on error in dlm_listen_for_all() · 697b45d5
      Dan Carpenter authored
      
      commit 1f4f10845e14690b02410de50d9ea9684625a4ae upstream.
      
      The "sock" variable is not initialized on this error path.
      
      Cc: stable@vger.kernel.org
      Fixes: 2dc6b115 ("fs: dlm: introduce generic listen")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dlm: fix plock invalid read · acdad5bc
      Alexander Aring authored
      
      commit 42252d0d2aa9b94d168241710a761588b3959019 upstream.
      
      This patch fixes an invalid read shown by KASAN. An unlock allocates a
      "struct plock_op" and a subsequent send_op() appends it to the global
      send_list data structure. In some cases a subsequent dev_read() moves it
      to recv_list, and dev_write() casts it to "struct plock_xop" and accesses
      fields which are only available in that structure. At this point an
      invalid read happens by accessing those fields.
      
      To fix this issue the "callback" field is moved to "struct plock_op" to
      indicate that a cast to "plock_xop" is allowed, and the additional
      "plock_xop" handling is done only if it is set.
      
      Example of the KASAN output which showed the invalid read:
      
      [ 2064.296453] ==================================================================
      [ 2064.304852] BUG: KASAN: slab-out-of-bounds in dev_write+0x52b/0x5a0 [dlm]
      [ 2064.306491] Read of size 8 at addr ffff88800ef227d8 by task dlm_controld/7484
      [ 2064.308168]
      [ 2064.308575] CPU: 0 PID: 7484 Comm: dlm_controld Kdump: loaded Not tainted 5.14.0+ #9
      [ 2064.310292] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 2064.311618] Call Trace:
      [ 2064.312218]  dump_stack_lvl+0x56/0x7b
      [ 2064.313150]  print_address_description.constprop.8+0x21/0x150
      [ 2064.314578]  ? dev_write+0x52b/0x5a0 [dlm]
      [ 2064.315610]  ? dev_write+0x52b/0x5a0 [dlm]
      [ 2064.316595]  kasan_report.cold.14+0x7f/0x11b
      [ 2064.317674]  ? dev_write+0x52b/0x5a0 [dlm]
      [ 2064.318687]  dev_write+0x52b/0x5a0 [dlm]
      [ 2064.319629]  ? dev_read+0x4a0/0x4a0 [dlm]
      [ 2064.320713]  ? bpf_lsm_kernfs_init_security+0x10/0x10
      [ 2064.321926]  vfs_write+0x17e/0x930
      [ 2064.322769]  ? __fget_light+0x1aa/0x220
      [ 2064.323753]  ksys_write+0xf1/0x1c0
      [ 2064.324548]  ? __ia32_sys_read+0xb0/0xb0
      [ 2064.325464]  do_syscall_64+0x3a/0x80
      [ 2064.326387]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 2064.327606] RIP: 0033:0x7f807e4ba96f
      [ 2064.328470] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 39 87 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 7c 87 f8 ff 48
      [ 2064.332902] RSP: 002b:00007ffd50cfe6e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
      [ 2064.334658] RAX: ffffffffffffffda RBX: 000055cc3886eb30 RCX: 00007f807e4ba96f
      [ 2064.336275] RDX: 0000000000000040 RSI: 00007ffd50cfe7e0 RDI: 0000000000000010
      [ 2064.337980] RBP: 00007ffd50cfe7e0 R08: 0000000000000000 R09: 0000000000000001
      [ 2064.339560] R10: 000055cc3886eb30 R11: 0000000000000293 R12: 000055cc3886eb80
      [ 2064.341237] R13: 000055cc3886eb00 R14: 000055cc3886f590 R15: 0000000000000001
      [ 2064.342857]
      [ 2064.343226] Allocated by task 12438:
      [ 2064.344057]  kasan_save_stack+0x1c/0x40
      [ 2064.345079]  __kasan_kmalloc+0x84/0xa0
      [ 2064.345933]  kmem_cache_alloc_trace+0x13b/0x220
      [ 2064.346953]  dlm_posix_unlock+0xec/0x720 [dlm]
      [ 2064.348811]  do_lock_file_wait.part.32+0xca/0x1d0
      [ 2064.351070]  fcntl_setlk+0x281/0xbc0
      [ 2064.352879]  do_fcntl+0x5e4/0xfe0
      [ 2064.354657]  __x64_sys_fcntl+0x11f/0x170
      [ 2064.356550]  do_syscall_64+0x3a/0x80
      [ 2064.358259]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 2064.360745]
      [ 2064.361511] Last potentially related work creation:
      [ 2064.363957]  kasan_save_stack+0x1c/0x40
      [ 2064.365811]  __kasan_record_aux_stack+0xaf/0xc0
      [ 2064.368100]  call_rcu+0x11b/0xf70
      [ 2064.369785]  dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
      [ 2064.372404]  receive_from_sock+0x290/0x770 [dlm]
      [ 2064.374607]  process_recv_sockets+0x32/0x40 [dlm]
      [ 2064.377290]  process_one_work+0x9a8/0x16e0
      [ 2064.379357]  worker_thread+0x87/0xbf0
      [ 2064.381188]  kthread+0x3ac/0x490
      [ 2064.383460]  ret_from_fork+0x22/0x30
      [ 2064.385588]
      [ 2064.386518] Second to last potentially related work creation:
      [ 2064.389219]  kasan_save_stack+0x1c/0x40
      [ 2064.391043]  __kasan_record_aux_stack+0xaf/0xc0
      [ 2064.393303]  call_rcu+0x11b/0xf70
      [ 2064.394885]  dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
      [ 2064.397694]  receive_from_sock+0x290/0x770 [dlm]
      [ 2064.399932]  process_recv_sockets+0x32/0x40 [dlm]
      [ 2064.402180]  process_one_work+0x9a8/0x16e0
      [ 2064.404388]  worker_thread+0x87/0xbf0
      [ 2064.406124]  kthread+0x3ac/0x490
      [ 2064.408021]  ret_from_fork+0x22/0x30
      [ 2064.409834]
      [ 2064.410599] The buggy address belongs to the object at ffff88800ef22780
      [ 2064.410599]  which belongs to the cache kmalloc-96 of size 96
      [ 2064.416495] The buggy address is located 88 bytes inside of
      [ 2064.416495]  96-byte region [ffff88800ef22780, ffff88800ef227e0)
      [ 2064.422045] The buggy address belongs to the page:
      [ 2064.424635] page:00000000b6bef8bc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xef22
      [ 2064.428970] flags: 0xfffffc0000200(slab|node=0|zone=1|lastcpupid=0x1fffff)
      [ 2064.432515] raw: 000fffffc0000200 ffffea0000d68b80 0000001400000014 ffff888001041780
      [ 2064.436110] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
      [ 2064.439813] page dumped because: kasan: bad access detected
      [ 2064.442548]
      [ 2064.443310] Memory state around the buggy address:
      [ 2064.445988]  ffff88800ef22680: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      [ 2064.449444]  ffff88800ef22700: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      [ 2064.452941] >ffff88800ef22780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
      [ 2064.456383]                                                     ^
      [ 2064.459386]  ffff88800ef22800: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
      [ 2064.462788]  ffff88800ef22880: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      [ 2064.466239] ==================================================================
      
      reproducer in python:
      
      import argparse
      import struct
      import fcntl
      import os
      
      parser = argparse.ArgumentParser()
      
      parser.add_argument('-f', '--file',
      		    help='file to use fcntl, must be on dlm lock filesystem e.g. gfs2')
      
      args = parser.parse_args()
      
      f = open(args.file, 'wb+')
      
      lockdata = struct.pack('hhllhh', fcntl.F_WRLCK,0,0,0,0,0)
      fcntl.fcntl(f, fcntl.F_SETLK, lockdata)
      lockdata = struct.pack('hhllhh', fcntl.F_UNLCK,0,0,0,0,0)
      fcntl.fcntl(f, fcntl.F_SETLK, lockdata)
      
      Fixes: 586759f0 ("gfs2: nfs lock support for gfs2")
      Cc: stable@vger.kernel.org
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/stp: clock_delta should be signed · f19e2e1d
      Sven Schnelle authored
      
      commit 5ace65ebb5ce9fe1cc8fdbdd97079fb566ef0ea4 upstream.
      
      clock_delta is declared as unsigned long in various places. However,
      the clock sync delta can be negative. This would add a huge positive
      offset in clock_sync_global where clock_delta is added to clk.eitod
      which is a 72 bit integer. Declare it as signed long to fix this.
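      
      The arithmetic behind this can be checked with a short Python sketch.
      It assumes a 64-bit long and models the wider clock value as an
      arbitrary-precision integer; the helper names are made up for the
      illustration.

```python
MASK64 = (1 << 64) - 1


def as_unsigned64(value):
    """Bit pattern of value as an unsigned 64-bit integer."""
    return value & MASK64


def as_signed64(bits):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def apply_delta(wide_clock, delta_bits, signed):
    # wide_clock models a clock value wider than 64 bits, so the sum is not
    # truncated back to 64 bits and a wrapped delta does not cancel out.
    delta = as_signed64(delta_bits) if signed else delta_bits
    return wide_clock + delta
```

      For a delta of -5, the unsigned interpretation adds 2**64 - 5 to the
      wide value, while the signed interpretation subtracts 5 as intended.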
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/perf: obtain sie_block from the right address · 42b2f5dd
      Nico Boehr authored
      
      commit c9bfb460c3e4da2462e16b0f0b200990b36b1dd2 upstream.
      
      Since commit 1179f170 ("s390: fix fpu restore in entry.S"), the
      sie_block pointer is located at empty1[1], but in sie_block() it was
      taken from empty1[0].
      
      This leads to a random pointer being dereferenced, possibly causing
      a system crash.
      
      This problem can be observed when running a simple guest with an endless
      loop and recording the cpu-clock event:
      
        sudo perf kvm --guestvmlinux=<guestkernel> --guest top -e cpu-clock
      
      With this fix, the correct guest address is shown.
      
      Fixes: 1179f170 ("s390: fix fpu restore in entry.S")
      Cc: stable@vger.kernel.org
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm, compaction: fast_find_migrateblock() should return pfn in the target zone · 20e6ec76
      Rei Yamamoto authored
      commit bbe832b9db2e1ad21522f8f0bf02775fff8a0e0e upstream.
      
      At present, pages not in the target zone are added to cc->migratepages
      list in isolate_migratepages_block().  As a result, pages may migrate
      between nodes unintentionally.
      
      This would be a serious problem for older kernels without commit
      a984226f ("mm: memcontrol: remove the pgdata parameter of
      mem_cgroup_page_lruvec"), because it can corrupt the lru list by
      handling pages in list without holding proper lru_lock.
      
      Avoid returning a pfn outside the target zone in the case that it is
      not aligned with a pageblock boundary.  Otherwise
      isolate_migratepages_block() will handle pages not in the target zone.
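      
      The boundary case can be worked through in Python. This is a sketch
      under assumptions: PAGEBLOCK_NR_PAGES is set to 512 for illustration,
      and migrate_start_pfn is a hypothetical helper, not a kernel function.

```python
PAGEBLOCK_NR_PAGES = 512  # assumption: 512 pages per pageblock


def pageblock_start_pfn(pfn):
    """Round a pfn down to the start of its pageblock."""
    return pfn & ~(PAGEBLOCK_NR_PAGES - 1)


def migrate_start_pfn(free_pfn, zone_start_pfn):
    # Rounding down can step below the zone start when the zone does not
    # begin on a pageblock boundary, so clamp the result to the zone.
    pfn = pageblock_start_pfn(free_pfn)
    return max(pfn, zone_start_pfn)
```

      With a zone starting at pfn 700, rounding pfn 1000 down gives 512, which
      lies outside the zone; the clamp returns 700 instead.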
      
      Link: https://lkml.kernel.org/r/20220511044300.4069-1-yamamoto.rei@jp.fujitsu.com
      
      
      Fixes: 70b44595 ("mm, compaction: use free lists to quickly locate a migration source")
      Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Wonhyuk Yang <vvghjk1234@gmail.com>
      Cc: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: r8188eu: prevent ->Ssid overflow in rtw_wx_set_scan() · ac2eab7d
      Denis Efremov authored
      
      commit bc10916e890948d8927a5c8c40fb5dc44be5e1b8 upstream.
      
      This code has a check to prevent read overflow but it needs another
      check to prevent writing beyond the end of the ->Ssid[] array.
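The two checks can be sketched as follows. This is an illustrative standalone function, not the driver's code: the struct, the 32-byte limit, and the choice to reject (rather than truncate) an oversized length are all assumptions.

```c
#include <stddef.h>
#include <string.h>

#define SSID_MAX_LEN 32	/* assumed capacity of the destination array */

/* Hypothetical stand-in for the driver's SSID container. */
struct ssid {
	size_t len;
	unsigned char data[SSID_MAX_LEN];
};

/*
 * Copy `len` bytes of SSID from a source buffer holding `src_avail`
 * bytes.  Returns 0 on success, -1 if either bound would be exceeded.
 */
static int copy_ssid(struct ssid *dst, const unsigned char *src,
		     size_t src_avail, size_t len)
{
	if (len > src_avail)		/* would read past the source */
		return -1;
	if (len > sizeof(dst->data))	/* would write past the destination */
		return -1;
	memcpy(dst->data, src, len);
	dst->len = len;
	return 0;
}
```

The first test corresponds to the check the commit message says already existed; the second is the kind of check it says was missing.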
      
      Fixes: 2b42bd58 ("staging: r8188eu: introduce new os_dep dir for RTL8188eu driver")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Denis Efremov <denis.e.efremov@oracle.com>
      Link: https://lore.kernel.org/r/20220518070052.108287-1-denis.e.efremov@oracle.com
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac2eab7d
    • Johan Hovold's avatar
      PCI: qcom: Fix unbalanced PHY init on probe errors · a7daaaa8
      Johan Hovold authored
      commit 83013631f0f9961416abd812e228c8efbc2f6069 upstream.
      
      Undo the PHY initialisation (e.g. balance runtime PM) if host
      initialisation fails during probe.
      
      Link: https://lore.kernel.org/r/20220401133854.10421-3-johan+linaro@kernel.org
      
      
      Fixes: 82a82383 ("PCI: qcom: Add Qualcomm PCIe controller driver")
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
      Cc: stable@vger.kernel.org      # 4.5
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7daaaa8
    • Johan Hovold's avatar
      PCI: qcom: Fix runtime PM imbalance on probe errors · 4f9d6407
      Johan Hovold authored
      commit 87d83b96c8d6c6c2d2096bd0bdba73bcf42b8ef0 upstream.
      
      Drop the leftover pm_runtime_disable() calls from the late probe error
      paths that would, for example, prevent runtime PM from being reenabled
      after a probe deferral.
      
      Link: https://lore.kernel.org/r/20220401133854.10421-2-johan+linaro@kernel.org
      
      
      Fixes: 6e5da6f7 ("PCI: qcom: Fix error handling in runtime PM support")
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com>
      Cc: stable@vger.kernel.org      # 4.20
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f9d6407
    • Bjorn Helgaas's avatar
      PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299 · 0db67767
      Bjorn Helgaas authored
      commit 12068bb346db5776d0ec9bb4cd073f8427a1ac92 upstream.
      
      92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") omitted
      braces around the new Elo i2 entry, so it overwrote the existing Gigabyte
      X299 entry.  Add the appropriate braces.
      
      Found by:
      
        $ make W=1 drivers/pci/pci.o
          CC      drivers/pci/pci.o
        drivers/pci/pci.c:2974:12: error: initialized field overwritten [-Werror=override-init]
         2974 |   .ident = "Elo i2",
              |            ^~~~~~~~
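To illustrate why the braces matter, here is a small standalone table using designated initializers. The struct and its field values are hypothetical; only the two entry names come from the commit message.

```c
#include <stddef.h>

/* Hypothetical table entry; not the kernel's DMI structure. */
struct quirk {
	const char *ident;
	const char *vendor;
};

/*
 * Each entry sits inside its own braces, so the two .ident initializers
 * belong to two different array elements.  If the inner braces around
 * the second entry were missing, its .ident and .vendor would be applied
 * to element 0 again and silently replace the first entry's values.
 */
static const struct quirk quirks[] = {
	{
		.ident = "Gigabyte X299",
		.vendor = "Gigabyte",
	},
	{
		.ident = "Elo i2",
		.vendor = "Elo",
	},
};

/* Number of entries the compiler actually created. */
static size_t quirk_count(void)
{
	return sizeof(quirks) / sizeof(quirks[0]);
}
```

With correct bracing the table has two entries. The unbraced variant would compile to a single entry, which is what the override-init warning shown above points out.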
      
      Link: https://lore.kernel.org/r/20220526221258.GA409855@bhelgaas
      
      
      Fixes: 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold")
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: stable@vger.kernel.org  # v5.15+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0db67767
    • Alex Deucher's avatar
      drm/amdgpu: add beige goby PCI ID · 283bda02
      Alex Deucher authored
      
      commit 62e9bd20035b53ff6c679499c08546d96c6c60a7 upstream.
      
      Add a beige goby PCI ID.
      
      Reviewed-by: Guchun Chen <guchun.chen@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      283bda02
    • Gautam Menghani's avatar
      tracing: Initialize integer variable to prevent garbage return value · 4ef5ab53
      Gautam Menghani authored
      commit 154827f8e53d8c492b3fb0cb757fbcadb5d516b5 upstream.
      
      Initialize the integer variable to 0 to fix the clang scan warning:
      Undefined or garbage value returned to caller
      [core.uninitialized.UndefReturn]
              return ret;
      
      Link: https://lkml.kernel.org/r/20220522061826.1751-1-gautammenghani201@gmail.com
      
      
      
      Cc: stable@vger.kernel.org
      Fixes: 8993665a ("tracing/boot: Support multiple handlers for per-event histogram")
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ef5ab53
    • Keita Suzuki's avatar
      tracing: Fix potential double free in create_var_ref() · 37443b35
      Keita Suzuki authored
      commit 99696a2592bca641eb88cc9a80c90e591afebd0f upstream.
      
      In create_var_ref(), init_var_ref() is called to initialize the fields
      of variable ref_field, which is allocated in the previous function call
      to create_hist_field(). Function init_var_ref() allocates the
      corresponding fields such as ref_field->system, but frees these fields
      when the function encounters an error. The caller later calls
      destroy_hist_field() to conduct error handling, which frees the fields
      and the variable itself. This results in double free of the fields which
      are already freed in the previous function.
      
      Fix this by storing NULL to the corresponding fields when they are freed
      in init_var_ref().
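The pattern can be sketched in userspace C. The struct, the helper names, and the `simulate_failure` parameter are hypothetical; the sketch only shows why clearing the pointers makes the caller's later cleanup safe.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields owned by a variable reference. */
struct ref_field {
	char *system;
	char *event_name;
};

/* Duplicate a string with malloc(); returns NULL on allocation failure. */
static char *dup_str(const char *s)
{
	char *p = malloc(strlen(s) + 1);

	if (p)
		strcpy(p, s);
	return p;
}

/*
 * Allocate the fields.  On failure, free them AND store NULL, so that a
 * later cleanup by the caller does not free the same memory twice.
 */
static int init_ref(struct ref_field *f, const char *system,
		    const char *event, int simulate_failure)
{
	f->system = dup_str(system);
	f->event_name = dup_str(event);
	if (!f->system || !f->event_name || simulate_failure) {
		free(f->system);
		free(f->event_name);
		f->system = NULL;
		f->event_name = NULL;
		return -1;
	}
	return 0;
}

/* Caller-side cleanup: frees the fields and the object itself. */
static void destroy_ref(struct ref_field *f)
{
	free(f->system);	/* free(NULL) is a no-op */
	free(f->event_name);
	free(f);
}
```

Without the two NULL stores, calling `destroy_ref()` after a failed `init_ref()` would free both strings a second time.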
      
      Link: https://lkml.kernel.org/r/20220425063739.3859998-1-keitasuzuki.park@sslab.ics.keio.ac.jp
      
      
      
      Fixes: 067fe038 ("tracing: Add variable reference handling to hist triggers")
      CC: stable@vger.kernel.org
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Tom Zanussi <zanussi@kernel.org>
      Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      37443b35
    • Laurent Vivier's avatar
      tty: goldfish: Introduce gf_ioread32()/gf_iowrite32() · 0b011b40
      Laurent Vivier authored
      
      commit 2e2ac4a3327479f7e2744cdd88a5c823f2057bad upstream.
      
      The goldfish TTY device was clearly defined as having little-endian
      registers, but the switch to __raw_{read,write}l() broke its driver
      when running on big-endian kernels (if anyone ever tried this).
      
      The m68k qemu implementation got this wrong, and assumed native-endian
      registers.  While this is a bug in qemu, it is probably impossible to
      fix that since there is no way of knowing which other operating systems
      have started relying on that bug over the years.
      
      Hence revert commit da31de35 ("tty: goldfish: use
      __raw_writel()/__raw_readl()"), and define gf_ioread32()/gf_iowrite32()
      to be able to use accessors defined by the architecture.
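As background for the endianness issue, the following sketch shows what a fixed little-endian 32-bit register access means, independent of the CPU's native byte order. The helper names are hypothetical and this is not the driver's gf_ioread32()/gf_iowrite32() implementation, which the commit says relies on accessors provided by the architecture.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from four bytes. */
static uint32_t le32_load(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as four little-endian bytes. */
static void le32_store(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

A raw native-endian access gives the same result as these helpers only on a little-endian CPU, which is why a driver using raw accessors behaves differently on big-endian kernels.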
      
      Cc: stable@vger.kernel.org # v5.11+
      Fixes: da31de35 ("tty: goldfish: use __raw_writel()/__raw_readl()")
      Signed-off-by: Laurent Vivier <laurent@vivier.eu>
      Link: https://lore.kernel.org/r/20220406201523.243733-2-laurent@vivier.eu
      
      
      [geert: Add rationale based on Arnd's comments]
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b011b40
    • Sakari Ailus's avatar
      ACPI: property: Release subnode properties with data nodes · b3485d2b
      Sakari Ailus authored
      
      commit 3bd561e1572ee02a50cd1a5be339abf1a5b78d56 upstream.
      
      struct acpi_device_properties describes one source of properties present
      on either struct acpi_device or struct acpi_data_node. When properties are
      parsed, both are populated but when released, only those properties that
      are associated with the device node are freed.
      
      Fix this by also releasing memory of the data node properties.
      
      Fixes: 5f5e4890 ("ACPI / property: Allow multiple property compatible _DSD entries")
      Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
      Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b3485d2b
    • Jan Kara's avatar
      ext4: avoid cycles in directory h-tree · 3a3ce941
      Jan Kara authored
      
      commit 3ba733f879c2a88910744647e41edeefbc0d92b2 upstream.
      
      A maliciously corrupted filesystem can contain cycles in the h-tree
      stored inside a directory. That can easily lead to the kernel corrupting
      tree nodes that were already verified under its hands while doing a node
      split and consequently accessing unallocated memory. Fix the problem by
      verifying traversed block numbers are unique.
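One way to picture the uniqueness check is a scan over the block numbers visited on the path from the root to the current node. This is a simplified sketch with a hypothetical function name; how ext4 stores and compares the traversed blocks is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if no block number appears twice among the `n` blocks
 * visited while descending the tree.  A repeated block number means the
 * on-disk tree contains a cycle.
 */
static bool path_blocks_unique(const uint32_t *blocks, size_t n)
{
	for (size_t i = 1; i < n; i++)
		for (size_t j = 0; j < i; j++)
			if (blocks[i] == blocks[j])
				return false;
	return true;
}
```

The quadratic scan is acceptable in a sketch like this because the depth of such a tree is small.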
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220518093332.13986-2-jack@suse.cz
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a3ce941
    • Jan Kara's avatar
      ext4: verify dir block before splitting it · ca17db38
      Jan Kara authored
      
      commit 46c116b920ebec58031f0a78c5ea9599b0d2a371 upstream.
      
      Before splitting a directory block verify its directory entries are sane
      so that the splitting code does not access memory it should not.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220518093332.13986-1-jack@suse.cz
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca17db38
    • Baokun Li's avatar
      ext4: fix bug_on in __es_tree_search · 3c617827
      Baokun Li authored
      
      commit d36f6ed761b53933b0b4126486c10d3da7751e7f upstream.
      
      Hulk Robot reported a BUG_ON:
      ==================================================================
      kernel BUG at fs/ext4/extents_status.c:199!
      [...]
      RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
      RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
      [...]
      Call Trace:
       ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
       ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
       ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
       ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
       ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
       ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
       ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
       ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
       v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
       v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
       vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
       dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
       ext4_quota_enable fs/ext4/super.c:6137 [inline]
       ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
       ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
       mount_bdev+0x2e9/0x3b0 fs/super.c:1158
       mount_fs+0x4b/0x1e4 fs/super.c:1261
      [...]
      ==================================================================
      
      The above issue may happen as follows:
      -------------------------------------
      ext4_fill_super
       ext4_enable_quotas
        ext4_quota_enable
         ext4_iget
          __ext4_iget
           ext4_ext_check_inode
            ext4_ext_check
             __ext4_ext_check
              ext4_valid_extent_entries
               Check for overlapping extents doesn't take effect
         dquot_enable
          vfs_load_quota_inode
           v2_check_quota_file
            v2_read_header
             ext4_quota_read
              ext4_bread
               ext4_getblk
                ext4_map_blocks
                 ext4_ext_map_blocks
                  ext4_find_extent
                   ext4_cache_extents
                    ext4_es_cache_extent
                     ext4_es_cache_extent
                      __es_tree_search
                       ext4_es_end
                        BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
      
      The erroneous ext4 extents are as follows:
      0af3 0300 0400 0000 00000000    extent_header
      00000000 0100 0000 12000000     extent1
      00000000 0100 0000 18000000     extent2
      02000000 0400 0000 14000000     extent3
      
      In the ext4_valid_extent_entries() function,
      if prev is 0, no error is returned even if lblock <= prev.
      This was intended to skip the check on the first extent, but
      in the error image above, prev = 0 + 1 - 1 = 0 when checking the second
      extent, so even though lblock <= prev, the function does not return an
      error.  As a result, the BUG_ON in __es_tree_search() triggers and the
      system panics.
      
      To solve this problem, we only need to check that:
      1. The lblock of the first extent is not less than 0.
      2. The lblock of the next extent is not less than
         the next block of the previous extent.
      The same applies to extent_idx.
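The two rules above can be sketched as a standalone check. The struct and function names are hypothetical, and the extents in the usage note are the (lblock, length) pairs decoded from the dump above: (0, 1), (0, 1), (2, 4).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified extent: first logical block and length in blocks. */
struct extent {
	uint32_t lblk;
	uint16_t len;
};

/*
 * Track the first logical block that is still free ("next") instead of
 * the last block used ("prev").  Starting at 0 makes the first extent
 * go through the same comparison as all later ones, so no special case
 * for prev == 0 is needed.
 */
static bool extents_valid(const struct extent *ex, size_t n)
{
	uint64_t next = 0;

	for (size_t i = 0; i < n; i++) {
		if (ex[i].lblk < next)
			return false;	/* overlaps the previous extent */
		next = (uint64_t)ex[i].lblk + ex[i].len;
	}
	return true;
}
```

For the erroneous image, the second extent starts at block 0 while the first already covers block 0, so the sketch rejects it at the second entry.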
      
      Cc: stable@kernel.org
      Fixes: 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c617827
    • Theodore Ts'o's avatar
      ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state · b99fd734
      Theodore Ts'o authored
      
      commit c878bea3c9d724ddfa05a813f30de3d25a0ba83f upstream.
      
      The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
      we are in the middle of replaying the fast commit journal.  This was
      actually a mistake, since sbi->s_mount_state is initialized from
      es->s_state.  Arguably s_mount_state is misleadingly named, but the
      name is historical --- s_mount_state and s_state date back to ext2.
      
      What should have been used is the ext4_{set,clear,test}_mount_flag()
      inline functions, which set EXT4_MF_* bits in sbi->s_mount_flags.
      
      The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
      superblock could result in EXT4_FC_REPLAY getting set in
      s_mount_state.  This bypasses some sanity checks, and this can trigger
      a BUG() in ext4_es_cache_extent().  As an easy-to-backport fix, filter
      out the EXT4_FC_REPLAY bit for now.  We should eventually transition
      away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.
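Filtering the bit amounts to masking it out when the in-memory state is initialized from the on-disk field. The sketch below assumes a bit value of 0x0020 and uses a hypothetical function name; it is not the ext4 code itself.

```c
#include <stdint.h>

/* Assumed bit value for the replay flag (hypothetical for this sketch). */
#define FC_REPLAY_BIT 0x0020u

/*
 * Derive the in-memory mount state from the on-disk state field while
 * dropping the replay bit, so a corrupted superblock cannot switch the
 * replay mode on.
 */
static uint16_t sanitize_state(uint16_t on_disk_state)
{
	return (uint16_t)(on_disk_state & ~FC_REPLAY_BIT);
}
```

All other bits pass through unchanged, which keeps the change small enough to backport.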
      
      Cc: stable@kernel.org
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Link: https://lore.kernel.org/r/20220420192312.1655305-1-phind.uet@gmail.com
      Link: https://lore.kernel.org/r/20220517174028.942119-1-tytso@mit.edu
      
      
      Reported-by: <syzbot+c7358a3cd05ee786eb31@syzkaller.appspotmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b99fd734
    • Ye Bin's avatar
      ext4: fix bug_on in ext4_writepages · 18a759f7
      Ye Bin authored
      
      commit ef09ed5d37b84d18562b30cf7253e57062d0db05 upstream.
      
      We got the following issue:
      EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
      ------------[ cut here ]------------
      kernel BUG at fs/ext4/inode.c:2708!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
      RIP: 0010:ext4_writepages+0x1977/0x1c10
      RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
      RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
      RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
      R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
      R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
      FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       do_writepages+0x130/0x3a0
       filemap_fdatawrite_wbc+0x83/0xa0
       filemap_flush+0xab/0xe0
       ext4_alloc_da_blocks+0x51/0x120
       __ext4_ioctl+0x1534/0x3210
       __x64_sys_ioctl+0x12c/0x170
       do_syscall_64+0x3b/0x90
      
      It may happen as follows:
      1. write inline_data inode
      vfs_write
        new_sync_write
          ext4_file_write_iter
            ext4_buffered_write_iter
              generic_perform_write
                ext4_da_write_begin
                  ext4_da_write_inline_data_begin -> If the inline data size is
                  too small, a block is allocated for the write, so the mapping
                  will have a dirty page
                      ext4_da_convert_inline_data_to_extent ->clear EXT4_STATE_MAY_INLINE_DATA
      2. fallocate
      do_vfs_ioctl
        ioctl_preallocate
          vfs_fallocate
            ext4_fallocate
              ext4_convert_inline_data
                ext4_convert_inline_data_nolock
                  ext4_map_blocks -> fail will goto restore data
                  ext4_restore_inline_data
                    ext4_create_inline_data
                    ext4_write_inline_data
                    ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA
      3. writepages
      __ext4_ioctl
        ext4_alloc_da_blocks
          filemap_flush
            filemap_fdatawrite_wbc
              do_writepages
                ext4_writepages
                  if (ext4_has_inline_data(inode))
                    BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))
      
      The root cause of this issue is that, in delayed allocation mode, the
      inline data is not destroyed until ext4_writepages() is called, but by
      then the inode may already have been converted from inline to extent
      format.  To solve this issue, we call filemap_flush() first.
      
      Cc: stable@kernel.org
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      18a759f7
    • Ye Bin's avatar
      ext4: fix warning in ext4_handle_inode_extension · b81d2ff6
      Ye Bin authored
      
      commit f4534c9fc94d22383f187b9409abb3f9df2e3db3 upstream.
      
      We got the following issue:
      EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
      EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
      EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
      EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
      Modules linked in:
      CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
      RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
      RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
      RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
      RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
      RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
      R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
      R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
      FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       do_iter_readv_writev+0x2e5/0x360
       do_iter_write+0x112/0x4c0
       do_pwritev+0x1e5/0x390
       __x64_sys_pwritev2+0x7e/0xa0
       do_syscall_64+0x37/0x50
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The above issue may happen as follows:
      Assume
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=4096
      
      step 1: set inode->i_size = 8192
      ext4_setattr
        if (attr->ia_size != inode->i_size)
          EXT4_I(inode)->i_disksize = attr->ia_size;
          rc = ext4_mark_inode_dirty
             ext4_reserve_inode_write
                ext4_get_inode_loc
                  __ext4_get_inode_loc
                    sb_getblk --> return -ENOMEM
         ...
         if (!error)  ->will not update i_size
           i_size_write(inode, attr->ia_size);
      Now:
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=8192
      
      step 2: Direct write 4096 bytes
      ext4_file_write_iter
       ext4_dio_write_iter
         iomap_dio_rw ->return error
       if (extend)
         ext4_handle_inode_extension
           WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
      ->Then trigger warning.
      
      To solve the above issue, restore the old value of
      'EXT4_I(inode)->i_disksize' if marking the inode dirty fails in
      ext4_setattr().
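The rollback can be sketched with a simplified inode holding only the two sizes. The struct, the function name, and the `mark_dirty_result` parameter (which stands in for the outcome of marking the inode dirty) are assumptions for illustration.

```c
/* Simplified inode: in-memory size and on-disk size. */
struct inode_sizes {
	long long i_size;
	long long i_disksize;
};

/*
 * Update both sizes.  i_disksize is written first; if marking the inode
 * dirty then fails, it is set back to its old value so that it never
 * ends up larger than i_size.
 */
static int setattr_size(struct inode_sizes *ino, long long new_size,
			int mark_dirty_result)
{
	long long old_disksize = ino->i_disksize;

	ino->i_disksize = new_size;
	if (mark_dirty_result) {
		ino->i_disksize = old_disksize;	/* roll back on failure */
		return mark_dirty_result;
	}
	ino->i_size = new_size;
	return 0;
}
```

Starting from 4096/4096 as in the scenario above, a failed update leaves both sizes at 4096, so the later check that i_size is not smaller than i_disksize holds.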
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20220326065351.761952-1-yebin10@huawei.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b81d2ff6
    • Baokun Li's avatar
      ext4: fix race condition between ext4_write and ext4_convert_inline_data · 14602353
      Baokun Li authored
      
      commit f87c7a4b084afc13190cbb263538e444cb2b392a upstream.
      
      Hulk Robot reported a BUG_ON:
       ==================================================================
       EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0,
       block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters
       kernel BUG at fs/ext4/ext4_jbd2.c:53!
       invalid opcode: 0000 [#1] SMP KASAN PTI
       CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1
       RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline]
       RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116
       [...]
       Call Trace:
        ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795
        generic_perform_write+0x279/0x3c0 mm/filemap.c:3344
        ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270
        ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520
        do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732
        do_iter_write+0x107/0x430 fs/read_write.c:861
        vfs_writev fs/read_write.c:934 [inline]
        do_pwritev+0x1e5/0x380 fs/read_write.c:1031
       [...]
       ==================================================================
      
      The above issue may happen as follows:
                 cpu1                     cpu2
      __________________________|__________________________
      do_pwritev
       vfs_writev
        do_iter_write
         ext4_file_write_iter
          ext4_buffered_write_iter
           generic_perform_write
            ext4_da_write_begin
                                 vfs_fallocate
                                  ext4_fallocate
                                   ext4_convert_inline_data
                                    ext4_convert_inline_data_nolock
                                     ext4_destroy_inline_data_nolock
                                      clear EXT4_STATE_MAY_INLINE_DATA
                                     ext4_map_blocks
                                      ext4_ext_map_blocks
                                       ext4_mb_new_blocks
                                        ext4_mb_regular_allocator
                                         ext4_mb_good_group_nolock
                                          ext4_mb_init_group
                                           ext4_mb_init_cache
                                            ext4_mb_generate_buddy  --> error
             ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
                                      ext4_restore_inline_data
                                       set EXT4_STATE_MAY_INLINE_DATA
             ext4_block_write_begin
            ext4_da_write_end
             ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
             ext4_write_inline_data_end
              handle=NULL
              ext4_journal_stop(handle)
               __ext4_journal_stop
                ext4_put_nojournal(handle)
                 ref_cnt = (unsigned long)handle
                 BUG_ON(ref_cnt == 0)  ---> BUG_ON
      
      The lock held by ext4_convert_inline_data is xattr_sem, but the lock
      held by generic_perform_write is i_rwsem. Therefore, the two code paths
      can run concurrently.
      
      To solve the above issue, we add inode_lock() for ext4_convert_inline_data().
      At the same time, move ext4_convert_inline_data() in front of
      ext4_punch_hole() and remove similar handling from ext4_punch_hole().
      
      Fixes: 0c8d414f ("ext4: let fallocate handle inline data correctly")
      Cc: stable@vger.kernel.org
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220428134031.4153381-1-libaokun1@huawei.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      14602353
    • Ye Bin's avatar
      ext4: fix use-after-free in ext4_rename_dir_prepare · 364380c0
      Ye Bin authored
      
      commit 0be698ecbe4471fcad80e81ec6a05001421041b3 upstream.
      
      We got the following issue:
      EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue
      ext4_get_first_dir_block: bh->b_data=0xffff88810bee6000 len=34478
      ext4_get_first_dir_block: *parent_de=0xffff88810beee6ae bh->b_data=0xffff88810bee6000
      ext4_rename_dir_prepare: [1] parent_de=0xffff88810beee6ae
      ==================================================================
      BUG: KASAN: use-after-free in ext4_rename_dir_prepare+0x152/0x220
      Read of size 4 at addr ffff88810beee6ae by task rep/1895
      
      CPU: 13 PID: 1895 Comm: rep Not tainted 5.10.0+ #241
      Call Trace:
       dump_stack+0xbe/0xf9
       print_address_description.constprop.0+0x1e/0x220
       kasan_report.cold+0x37/0x7f
       ext4_rename_dir_prepare+0x152/0x220
       ext4_rename+0xf44/0x1ad0
       ext4_rename2+0x11c/0x170
       vfs_rename+0xa84/0x1440
       do_renameat2+0x683/0x8f0
       __x64_sys_renameat+0x53/0x60
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7f45a6fc41c9
      RSP: 002b:00007ffc5a470218 EFLAGS: 00000246 ORIG_RAX: 0000000000000108
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f45a6fc41c9
      RDX: 0000000000000005 RSI: 0000000020000180 RDI: 0000000000000005
      RBP: 00007ffc5a470240 R08: 00007ffc5a470160 R09: 0000000020000080
      R10: 00000000200001c0 R11: 0000000000000246 R12: 0000000000400bb0
      R13: 00007ffc5a470320 R14: 0000000000000000 R15: 0000000000000000
      
      The buggy address belongs to the page:
      page:00000000440015ce refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x10beee
      flags: 0x200000000000000()
      raw: 0200000000000000 ffffea00043ff4c8 ffffea0004325608 0000000000000000
      raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88810beee580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88810beee600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      >ffff88810beee680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                        ^
       ffff88810beee700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88810beee780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      ==================================================================
      Disabling lock debugging due to kernel taint
      ext4_rename_dir_prepare: [2] parent_de->inode=3537895424
      ext4_rename_dir_prepare: [3] dir=0xffff888124170140
      ext4_rename_dir_prepare: [4] ino=2
      ext4_rename_dir_prepare: ent->dir->i_ino=2 parent=-757071872
      
      The reason is that the first directory entry has a 'rec_len' of 34478, so
      an illegal parent entry is computed from it. Currently, the directory entry
      is not checked after the directory block is read in
      'ext4_get_first_dir_block'.
      To solve this issue, check the directory entry in 'ext4_get_first_dir_block'.
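A simplified sketch of what checking a directory entry can involve is shown below. The 8-byte header, the 4-byte alignment rule, and the 4096-byte block size used in the usage note are assumptions of this sketch, and the function names are hypothetical; the real ext4 checks cover more cases.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIRENT_HEADER 8u	/* assumed fixed header size in bytes */

/* Smallest record length that can hold the header plus the name. */
static size_t min_rec_len(uint8_t name_len)
{
	return ((size_t)name_len + DIRENT_HEADER + 3) & ~(size_t)3;
}

/*
 * Check that an entry at `offset` with the given rec_len stays inside
 * the block and is large enough for its own name.
 */
static bool dirent_sane(size_t offset, uint16_t rec_len, uint8_t name_len,
			size_t block_size)
{
	if (rec_len % 4 != 0)
		return false;
	if (rec_len < min_rec_len(name_len))
		return false;
	if (offset > block_size || rec_len > block_size - offset)
		return false;
	return true;
}
```

With an assumed 4096-byte block, a first entry whose rec_len is 34478 fails the check, so the following entry would never be read from outside the block.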
      
      [ Trigger an ext4_error() instead of just warning if the directory is
        missing a '.' or '..' entry.   Also make sure we return an error code
        if the file system is corrupted.  -TYT ]
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220414025223.4113128-1-yebin10@huawei.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: mark group as trimmed only if it was fully scanned · 3e4b684f
      Dmitry Monakhov authored
      
      commit d63c00ea435a5352f486c259665a4ced60399421 upstream.
      
      Otherwise, non-aligned fstrim calls work poorly with iterative
      scanners, for example:
      
      // trim [0,16MB] for group-1, but mark full group as trimmed
      fstrim  -o $((1024*1024*128)) -l $((1024*1024*16)) ./m
      // handle [16MB,32MB] for group-1, do nothing because group already has the flag.
      fstrim  -o $((1024*1024*144)) -l $((1024*1024*16)) ./m
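
      The condition behind the fix can be sketched as a range check: a group
      should only count as trimmed when the requested range covers all of it.
      This is a hedged illustration, not the ext4 implementation; the 128 MiB
      group size and the demo_* names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MIB         (1024ull * 1024ull)
#define DEMO_GROUP_BYTES (128ull * DEMO_MIB) /* assumed block group size */

/*
 * Return true only if the byte range [offset, offset + len) covers the whole
 * of block group `group`. Overflow of offset + len is not handled here.
 */
static bool demo_trim_covers_group(uint64_t offset, uint64_t len, uint64_t group)
{
    uint64_t group_start = group * DEMO_GROUP_BYTES;
    uint64_t group_end = group_start + DEMO_GROUP_BYTES;
    uint64_t range_end = offset + len;

    return offset <= group_start && range_end >= group_end;
}
```

      Under these assumptions, neither of the two 16 MiB calls above covers
      group-1, so neither should set the flag; a single call with offset 128 MiB
      and length 128 MiB would.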
      
      [ Update function documentation for ext4_trim_all_free -- TYT ]
      
      Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
      Link: https://lore.kernel.org/r/1650214995-860245-1-git-send-email-dmtrmonakhov@yandex-team.ru
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bfq: Make sure bfqg for which we are queueing requests is online · 6ee0868b
      Jan Kara authored
      
      commit 075a53b78b815301f8d3dd1ee2cd99554e34f0dd upstream.
      
      Bios queued into BFQ IO scheduler can be associated with a cgroup that
      was already offlined. This may then cause insertion of this bfq_group
      into a service tree. But this bfq_group will get freed as soon as the last
      bio associated with it is completed, leading to use-after-free issues for
      service tree users. Fix the problem by making sure we always operate on
      an online bfq_group. If the bfq_group associated with the bio is not
      online, we pick the first online parent.
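
      The "pick the first online parent" step can be sketched as a walk up the
      group hierarchy. This is a minimal illustration only; struct demo_group and
      demo_first_online() are hypothetical stand-ins, not the BFQ data
      structures, and the locking that the real code relies on is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler group in a cgroup-like hierarchy. */
struct demo_group {
    struct demo_group *parent; /* NULL for the root group */
    bool online;               /* cleared when the group is being torn down */
};

/*
 * Return `g` if it is online, otherwise its nearest online ancestor.
 * Returns NULL only if no group on the path to the root is online.
 */
static struct demo_group *demo_first_online(struct demo_group *g)
{
    while (g != NULL && !g->online)
        g = g->parent;
    return g;
}
```

      For a chain root -> child -> leaf where only the root is still online,
      demo_first_online(&leaf) returns the root, so new requests are never
      queued into a group that is about to be freed.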
      
      CC: stable@vger.kernel.org
      Fixes: e21b7a0b ("block, bfq: add full hierarchical scheduling and cgroups support")
      Tested-by: "yukuai (C)" <yukuai3@huawei.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220401102752.8599-9-jack@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bfq: Get rid of __bio_blkcg() usage · 86defc54
      Jan Kara authored
      
      commit 4e54a2493e582361adc3bfbf06c7d50d19d18837 upstream.
      
      BFQ usage of __bio_blkcg() is a relic from the past. Furthermore, if a bio
      is not associated with any blkcg, the usage of __bio_blkcg() in
      BFQ is prone to races with the task being migrated between cgroups, as
      __bio_blkcg() calls at different places could return different blkcgs.
      
      Convert BFQ to the new situation where bio->bi_blkg is initialized in
      bio_set_dev() and thus practically always valid. This allows us to save
      blkcg_gq lookup and noticeably simplify the code.
      
      CC: stable@vger.kernel.org
      Fixes: 0fe061b9 ("blkcg: fix ref count issue with bio_blkcg() using task_css")
      Tested-by: "yukuai (C)" <yukuai3@huawei.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220401102752.8599-8-jack@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bfq: Track whether bfq_group is still online · 54c08ef2
      Jan Kara authored
      
      commit 09f871868080c33992cd6a9b72a5ca49582578fa upstream.
      
      Track whether bfq_group is still online. We cannot rely on
      blkcg_gq->online because that gets cleared only after all policies are
      offlined, and we need something that is updated under bfqd->lock
      while we are cleaning up our bfq_group, so that we can guarantee
      that when we see an online bfq_group, it will stay online while we are
      holding bfqd->lock.
      
      CC: stable@vger.kernel.org
      Tested-by: "yukuai (C)" <yukuai3@huawei.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220401102752.8599-7-jack@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>