Skip to content
Snippets Groups Projects
  • Filipe Manana's avatar
    Btrfs: fix inode eviction infinite loop after cloning into it · ccccf3d6
    Filipe Manana authored
    
    If we attempt to clone a 0 length region into a file we can end up
    inserting a range in the inode's extent_io tree with a start offset
    that is greater then the end offset, which triggers immediately the
    following warning:
    
    [ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
    [ 3914.620886] BTRFS: end < start 4095 4096
    (...)
    [ 3914.638093] Call Trace:
    [ 3914.638636]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
    [ 3914.639620]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
    [ 3914.640789]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
    [ 3914.642041]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
    [ 3914.643236]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
    [ 3914.644441]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
    [ 3914.645711]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
    [ 3914.646914]  [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
    [ 3914.648058]  [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
    [ 3914.650105]  [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
    [ 3914.651361]  [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
    [ 3914.652761]  [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
    [ 3914.654128]  [<ffffffff811226dd>] ? might_fault+0x58/0xb5
    [ 3914.655320]  [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
    (...)
    [ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---
    
    This later makes the inode eviction handler enter an infinite loop that
    keeps dumping the following warning over and over:
    
    [ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
    [ 3915.119913] BTRFS: end < start 4095 4096
    (...)
    [ 3915.137394] Call Trace:
    [ 3915.137913]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
    [ 3915.139154]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
    [ 3915.140316]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
    [ 3915.141505]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
    [ 3915.142709]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
    [ 3915.143849]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
    [ 3915.145120]  [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
    [ 3915.146352]  [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
    [ 3915.147565]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
    [ 3915.148785]  [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
    [ 3915.149931]  [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
    [ 3915.151154]  [<ffffffff81168904>] evict+0xa0/0x148
    [ 3915.152094]  [<ffffffff811689e5>] dispose_list+0x39/0x43
    [ 3915.153081]  [<ffffffff81169564>] evict_inodes+0xdc/0xeb
    [ 3915.154062]  [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
    [ 3915.155193]  [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
    [ 3915.156274]  [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
    (...)
    [ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---
    
    So just bail out of the clone ioctl if the length of the region to clone
    is zero, without locking any extent range, in order to prevent this issue
    (same behaviour as a pwrite with a 0 length for example).
    
    This is trivial to reproduce. For example, the steps for the test I just
    made for fstests:
    
      mkfs.btrfs -f SCRATCH_DEV
      mount SCRATCH_DEV $SCRATCH_MNT
    
      touch $SCRATCH_MNT/foo
      touch $SCRATCH_MNT/bar
    
      $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      umount $SCRATCH_MNT
    
    A test case for fstests follows soon.
    
    CC: <stable@vger.kernel.org>
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Reviewed-by: default avatarOmar Sandoval <osandov@osandov.com>
    Signed-off-by: default avatarChris Mason <clm@fb.com>
    ccccf3d6
Code owners
Assign users and groups as approvers for specific file changes. Learn more.
ioctl.c 128.33 KiB
/*
 * Copyright (C) 2007 Oracle.  All rights reserved.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public
 * License v2 as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public
 * License along with this program; if not, write to the
 * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 * Boston, MA 021110-1307, USA.
 */

#include <linux/kernel.h>
#include <linux/bio.h>
#include <linux/buffer_head.h>
#include <linux/file.h>
#include <linux/fs.h>
#include <linux/fsnotify.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/time.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/backing-dev.h>
#include <linux/mount.h>
#include <linux/mpage.h>
#include <linux/namei.h>
#include <linux/swap.h>
#include <linux/writeback.h>
#include <linux/statfs.h>
#include <linux/compat.h>
#include <linux/bit_spinlock.h>
#include <linux/security.h>
#include <linux/xattr.h>
#include <linux/vmalloc.h>
#include <linux/slab.h>
#include <linux/blkdev.h>
#include <linux/uuid.h>
#include <linux/btrfs.h>
#include <linux/uaccess.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#include "btrfs_inode.h"
#include "print-tree.h"
#include "volumes.h"
#include "locking.h"
#include "inode-map.h"
#include "backref.h"
#include "rcu-string.h"
#include "send.h"
#include "dev-replace.h"
#include "props.h"
#include "sysfs.h"
#include "qgroup.h"

#ifdef CONFIG_64BIT
/* If we have a 32-bit userspace and 64-bit kernel, then the UAPI
 * structures are incorrect, as the timespec structure from userspace
 * is 4 bytes too small. We define these alternatives here to teach
 * the kernel about the 32-bit struct packing.
 */
struct btrfs_ioctl_timespec_32 {
	__u64 sec;