  1. Oct 17, 2012
  2. Oct 16, 2012
  3. Oct 12, 2012
  4. Oct 11, 2012
  5. Oct 09, 2012
    • RDS: fix rds-ping spinlock recursion · 5175a5e7
      jeff.liu authored
      
      This is the revised patch for fixing rds-ping spinlock recursion
      according to Venkat's suggestions.
      
      The RDS ping/pong over TCP feature has been broken for years (2.6.39 to
      3.6.0) since we have to set TCP cork and call kernel_sendmsg() between
      ping/pong, both of which need to lock "struct sock *sk". However, this
      lock is already held before the rds_tcp_data_ready() callback is
      triggered. As a result, we always face spinlock recursion, which
      results in a system panic.
      
      Given that RDS ping is only used to test the connectivity and not for
      serious performance measurements, we can queue the pong transmit to
      rds_wq as a delayed response.
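
The deferral described above can be sketched in plain userspace C. This is a minimal single-threaded model, not the RDS code: `fake_sock`, `data_ready()`, `run_deferred()` and `demo_ping()` are hypothetical names, and a boolean flag stands in for a work item queued on rds_wq.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket protected by a non-recursive lock. */
struct fake_sock {
    bool locked;
    bool pong_pending;   /* stand-in for a work item queued on rds_wq */
    int pongs_sent;
};

static void sock_lock(struct fake_sock *sk)
{
    /* Taking the lock twice is the recursion the commit describes. */
    assert(!sk->locked);
    sk->locked = true;
}

static void sock_unlock(struct fake_sock *sk)
{
    assert(sk->locked);
    sk->locked = false;
}

/* Sending needs the socket lock, like the cork + kernel_sendmsg() path. */
static void send_pong(struct fake_sock *sk)
{
    sock_lock(sk);
    sk->pongs_sent++;
    sock_unlock(sk);
}

/* Runs with the lock already held, like the data-ready callback. */
static void data_ready(struct fake_sock *sk)
{
    assert(sk->locked);
    sk->pong_pending = true;   /* defer instead of calling send_pong() here */
}

/* Worker: runs later, after the caller has dropped the lock. */
static void run_deferred(struct fake_sock *sk)
{
    if (sk->pong_pending) {
        sk->pong_pending = false;
        send_pong(sk);
    }
}

/* One ping handled end to end; returns the number of pongs sent. */
static int demo_ping(void)
{
    struct fake_sock sk = { false, false, 0 };

    sock_lock(&sk);
    data_ready(&sk);
    sock_unlock(&sk);
    run_deferred(&sk);
    return sk.pongs_sent;
}
```

Calling `send_pong()` directly from `data_ready()` would trip the assertion in `sock_lock()`, which is the model's equivalent of the recursion.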
      
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rbtree: empty nodes have no color · 4c199a93
      Michel Lespinasse authored
      
      Empty nodes have no color.  We can make use of this property to simplify
      the code emitted by the RB_EMPTY_NODE and RB_CLEAR_NODE macros.  Also,
      we can get rid of the rb_init_node function which had been introduced by
      commit 88d19cf3 ("timers: Add rb_init_node() to allow for stack
      allocated rb nodes") to avoid some issue with the empty node's color not
      being initialized.
      
      I'm not sure what the RB_EMPTY_NODE checks in rb_prev() / rb_next() are
      doing there, though.  axboe introduced them in commit 10fd48f2
      ("rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev").  The way I
      see it, the 'empty node' abstraction is only used by rbtree users to
      flag nodes that they haven't inserted in any rbtree, so asking the
      predecessor or successor of such nodes doesn't make any sense.
      
      One final rb_init_node() caller was recently added in sysctl code to
      implement faster sysctl name lookups.  This code doesn't make use of
      RB_EMPTY_NODE at all, and from what I could see it only called
      rb_init_node() under the mistaken assumption that such initialization was
      required before node insertion.
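
One encoding in which an empty node needs no color is to let the node's parent word point at the node itself. The sketch below assumes that encoding; `struct node`, `NODE_CLEAR`, `NODE_EMPTY` and `node_link()` are hypothetical names modelled on the macros named above, not copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct node {
    uintptr_t parent_color;      /* parent pointer, color kept in bit 0 */
    struct node *left, *right;
};

/* An empty node is its own parent, so no color bit has to be set or masked. */
#define NODE_CLEAR(n)  ((n)->parent_color = (uintptr_t)(n))
#define NODE_EMPTY(n)  ((n)->parent_color == (uintptr_t)(n))

/* Linking overwrites the whole word, so no prior initialisation is needed. */
static void node_link(struct node *n, struct node *parent)
{
    assert(((uintptr_t)parent & 1) == 0);   /* bit 0 is free for the color */
    n->parent_color = (uintptr_t)parent;    /* color bit 0: red */
    n->left = n->right = NULL;
}

static bool demo_empty_then_linked(void)
{
    struct node root, child;
    bool was_empty;

    NODE_CLEAR(&child);
    was_empty = NODE_EMPTY(&child);

    node_link(&root, NULL);
    node_link(&child, &root);
    return was_empty && !NODE_EMPTY(&child) && !NODE_EMPTY(&root);
}
```

Because `node_link()` writes every field, a caller that never tests for emptiness has no reason to initialise the node first, which matches the remark about the sysctl caller.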
      
      [sfr@canb.auug.org.au: fix net/ceph/osd_client.c build]
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Acked-by: David Woodhouse <David.Woodhouse@intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Santos <daniel.santos@pobox.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. Oct 08, 2012
    • ipvs: fix ARP resolving for direct routing mode · ad4d3ef8
      Julian Anastasov authored
      
      After the change "Make neigh lookups directly in output packet path"
      (commit a263b309) IPVS can not reach the real server for DR mode
      because we resolve the destination address from IP header, not from
      route neighbour. Use the new FLOWI_FLAG_KNOWN_NH flag to request
      output routes with known nexthop, so that it has preference
      on resolving.
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add FLOWI_FLAG_KNOWN_NH · c92b9655
      Julian Anastasov authored
      
      Add flag to request that output route should be
      returned with known rt_gateway, in case we want to use
      it as nexthop for neighbour resolving.
      
      The returned route can be cached as follows:
      
      - in NH exception: because the cached routes are not shared
        with other destinations
      - in FIB NH: when using gateway because all destinations for
        NH share same gateway
      
      As a last option, to return rt_gateway != 0 we have to
      set DST_NOCACHE.
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: introduce rt_uses_gateway · 155e8336
      Julian Anastasov authored
      
      Add new flag to remember when route is via gateway.
      We will use it to allow rt_gateway to contain address of
      directly connected host for the cases when DST_NOCACHE is
      used or when the NH exception caches per-destination route
      without DST_NOCACHE flag, i.e. when routes are not used for
      other destinations. This way we force neighbour resolution
      to work with the routed destination while still allowing
      a different address in the packet, a feature needed
      for IPVS-DR, where the original packet for the virtual IP is
      routed via the route to the real IP.
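
The split between "address used for neighbour resolution" and "address in the packet" can be sketched as follows. This is an illustrative model under assumptions: `struct route`, its two fields and `neigh_addr()` are hypothetical analogues of rt_gateway and rt_uses_gateway, and addresses are plain host-order integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct route {
    uint32_t gateway;      /* rt_gateway analogue; 0 means "not set" */
    bool uses_gateway;     /* rt_uses_gateway analogue: route is via a router */
};

/* Address to resolve at the link layer for a packet sent over this route. */
static uint32_t neigh_addr(const struct route *rt, uint32_t pkt_daddr)
{
    /* A route flagged as via-gateway must carry a gateway address. */
    assert(!rt->uses_gateway || rt->gateway != 0);

    /* A known next hop wins over the destination in the IP header. */
    return rt->gateway ? rt->gateway : pkt_daddr;
}
```

In the IPVS-DR case the packet keeps the virtual IP as its destination while `gateway` holds the directly connected real server, so `uses_gateway` stays false even though `gateway` is non-zero.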
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: make sure nh_pcpu_rth_output is always allocated · f8a17175
      Julian Anastasov authored
      
      Avoid checking nh_pcpu_rth_output in fast path,
      abort fib_info creation on alloc_percpu failure.
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix forwarding for strict source routes · e0adef0f
      Julian Anastasov authored
      
      After the change "Adjust semantics of rt->rt_gateway"
      (commit f8126f1d) rt_gateway can be 0 but ip_forward() compares
      it directly with nexthop. What we want here is to check if traffic
      is to directly connected nexthop and to fail if using gateway.
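
The difference between the two checks can be illustrated with a small model. The exact kernel condition is not quoted in the commit message, so both functions below are assumptions written from its description; `struct route`, `old_sr_failed()` and `new_sr_failed()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical route: gateway == 0 means the next hop is directly connected. */
struct route {
    uint32_t gateway;
};

/* Old-style test: compare the option's next hop with the gateway field. */
static bool old_sr_failed(const struct route *rt, uint32_t opt_nexthop)
{
    assert(opt_nexthop != 0);
    return opt_nexthop != rt->gateway;
}

/* Test described by the commit: fail only when a gateway is in use. */
static bool new_sr_failed(const struct route *rt)
{
    return rt->gateway != 0;
}
```

With `gateway` left at 0 for a directly connected next hop, the old comparison rejects every strict source route, while the new one rejects only routed ones.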
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix sending of redirects · e81da0e1
      Julian Anastasov authored
      
      After "Cache input routes in fib_info nexthops" (commit
      d2d68ba9) and "Elide fib_validate_source() completely when possible"
      (commit 7a9bc9b8) we can not send ICMP redirects. It seems we
      should not cache the RTCF_DOREDIRECT flag in nh_rth_input because
      the same fib_info can be used for traffic that is not redirected,
      e.g. from other input devices or from sources that are not in the same subnet.
      
      As a result, we have to disable the caching of the RTCF_DOREDIRECT
      flag and force source validation for the case when forwarding
      traffic to the input device. If traffic comes from a directly connected
      source we allow redirection as it was done before both changes.
      
      Avoid setting RTCF_DOREDIRECT if IN_DEV_TX_REDIRECTS
      is disabled; this can avoid source address validation and
      help cache the routes.
      
      After the change "Adjust semantics of rt->rt_gateway"
      (commit f8126f1d) we should make sure our ICMP_REDIR_HOST messages
      contain daddr instead of 0.0.0.0 when the target is directly connected.
      
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: gro: fix IPV6_GRO_CB(skb)->proto problem · 86347245
      Eric Dumazet authored
      
      It seems IPV6_GRO_CB(skb)->proto can be destroyed in skb_gro_receive()
      if a new skb is allocated (to serve as an anchor for frag_list)
      
      We copy NAPI_GRO_CB() only (not the IPV6 specific part) in :
      
      *NAPI_GRO_CB(nskb) = *NAPI_GRO_CB(p);
      
      So we leave IPV6_GRO_CB(nskb)->proto to 0 (fresh skb allocation) instead
      of IPPROTO_TCP (6)
      
      As a result, ipv6_gro_complete() isn't able to call ops->gro_complete()
      [ tcp6_gro_complete() ]
      
      Fix this by moving proto in NAPI_GRO_CB() and getting rid of
      IPV6_GRO_CB
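
The failure mode is a struct-prefix copy, which can be reproduced outside the kernel. In this sketch `generic_cb` plays the role of the NAPI_GRO_CB() area and `proto_cb` the IPv6-specific view layered on the same storage; all type and function names are hypothetical, and a union replaces the skb control buffer.

```c
#include <assert.h>
#include <string.h>

struct generic_cb {             /* NAPI_GRO_CB() analogue */
    int count;
    int flush;
};

struct proto_cb {               /* IPV6_GRO_CB() analogue: generic part first */
    struct generic_cb napi;
    int proto;
};

struct fixed_cb {               /* after the fix: proto lives in the generic part */
    int count;
    int flush;
    int proto;
};

struct pkt {
    union {
        struct generic_cb generic;
        struct proto_cb proto;
        struct fixed_cb fixed;
    } cb;
};

/* Copy only the generic view into a fresh packet, then read proto back. */
static int proto_after_anchor_copy(void)
{
    struct pkt p, nskb;

    memset(&p, 0, sizeof(p));
    memset(&nskb, 0, sizeof(nskb));     /* fresh allocation: all zero */
    p.cb.proto.napi.count = 2;
    p.cb.proto.proto = 6;               /* IPPROTO_TCP */

    nskb.cb.generic = p.cb.generic;     /* the protocol-specific tail is not copied */
    assert(nskb.cb.generic.count == 2);
    return nskb.cb.proto.proto;
}

/* Same copy once proto is part of the structure that gets copied. */
static int proto_after_fix(void)
{
    struct pkt p, nskb;

    memset(&p, 0, sizeof(p));
    memset(&nskb, 0, sizeof(nskb));
    p.cb.fixed.proto = 6;

    nskb.cb.fixed = p.cb.fixed;
    return nskb.cb.fixed.proto;
}
```

The first function returns 0 and the second returns 6, mirroring the before and after behaviour the message describes.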
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vlan: don't deliver frames for unknown vlans to protocols · 48cc32d3
      Florian Zumbiehl authored
      
      6a32e4f9 made the vlan code skip marking
      vlan-tagged frames for not locally configured vlans as PACKET_OTHERHOST if
      there was an rx_handler, as the rx_handler could cause the frame to be received
      on a different (virtual) vlan-capable interface where that vlan might be
      configured.
      
      As rx_handlers do not necessarily return RX_HANDLER_ANOTHER, this could cause
      frames for unknown vlans to be delivered to the protocol stack as if they had
      been received untagged.
      
      For example, if an ipv6 router advertisement that's tagged for a locally not
      configured vlan is received on an interface with macvlan interfaces attached,
      macvlan's rx_handler returns RX_HANDLER_PASS after delivering the frame to the
      macvlan interfaces, which caused it to be passed to the protocol stack, leading
      to ipv6 addresses for the announced prefix being configured even though those
      are completely unusable on the underlying interface.
      
      The fix moves marking as PACKET_OTHERHOST after the rx_handler so the
      rx_handler, if there is one, sees the frame unchanged, but afterwards,
      before the frame is delivered to the protocol stack, it gets marked whether
      there is an rx_handler or not.
      
      Signed-off-by: Florian Zumbiehl <florz@florz.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mac80211: use ieee80211_free_txskb to fix possible skb leaks · c3e7724b
      Felix Fietkau authored
      
      A few places free skbs using dev_kfree_skb even though they're called
      after ieee80211_subif_start_xmit might have cloned it for tracking tx
      status. Use ieee80211_free_txskb here to prevent skb leaks.
      
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • mac80211: call drv_get_tsf() in sleepable context · 55fabefe
      Thomas Pedersen authored
      
      The call to drv_get/set_tsf() was put on the workqueue to perform tsf
      adjustments since that function might sleep. However it ended up inside
      a spinlock, whose critical section must be atomic. Do tsf adjustment
      outside the spinlock instead, and get rid of a warning.
      
      Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • net: gro: selective flush of packets · 2e71a6f8
      Eric Dumazet authored
      
      Current GRO can hold packets in gro_list for almost unlimited
      time, in case napi->poll() handler consumes its budget over and over.
      
      In this case, napi_complete()/napi_gro_flush() are not called.
      
      Another problem is that gro_list is flushed in an unfriendly way:
      we scan the list and complete packets in reverse order
      (youngest packets first, oldest packets last).
      This defeats priorities that the sender could have cooked.
      
      Since GRO currently only stores TCP packets, we don't really notice the
      bug because of retransmits, but this behavior can add unexpected
      latencies, particularly on mice flows clamped by elephant flows.
      
      This patch makes sure no packet can stay more than 1 ms in queue, and
      only in stress situations.
      
      It also completes packets in the right order to minimize latencies.
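
A selective, oldest-first flush can be sketched with a small array-backed queue. This is a model under assumptions rather than the GRO code: `gro_queue`, `gro_hold()` and `gro_flush()` are hypothetical names, time is a millisecond counter, and "old" is taken to mean "not enqueued during the current millisecond".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

struct held_pkt {
    int id;
    unsigned long enqueued_ms;
};

struct gro_queue {
    struct held_pkt pkts[MAX_HELD];   /* kept oldest first */
    size_t len;
};

static void gro_hold(struct gro_queue *q, int id, unsigned long now_ms)
{
    assert(q->len < MAX_HELD);
    q->pkts[q->len].id = id;
    q->pkts[q->len].enqueued_ms = now_ms;
    q->len++;
}

/*
 * Complete packets oldest first and write their ids to out[].
 * With only_old set, packets enqueued during the current millisecond stay.
 * Returns the number of completed packets.
 */
static size_t gro_flush(struct gro_queue *q, unsigned long now_ms,
                        bool only_old, int *out)
{
    size_t done = 0, kept = 0;

    for (size_t i = 0; i < q->len; i++) {
        if (only_old && q->pkts[i].enqueued_ms == now_ms)
            q->pkts[kept++] = q->pkts[i];
        else
            out[done++] = q->pkts[i].id;
    }
    q->len = kept;
    return done;
}

static bool demo_flush_order(void)
{
    struct gro_queue q = { .len = 0 };
    int out[MAX_HELD];
    size_t n;

    gro_hold(&q, 1, 10);
    gro_hold(&q, 2, 10);
    gro_hold(&q, 3, 11);

    n = gro_flush(&q, 11, true, out);
    return n == 2 && out[0] == 1 && out[1] == 2 &&
           q.len == 1 && q.pkts[0].id == 3;
}
```

Calling `gro_flush()` with `only_old` false drains the whole queue, still oldest first, which corresponds to the full flush done when polling completes.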
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Don't report stale pmtu values to userspace · ee9a8f7a
      Steffen Klassert authored
      
      We report cached pmtu values even if they are already expired.
      Change this to not report these values after they are expired
      and fix a race in the expire time calculation, as suggested by
      Eric Dumazet.
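
The expiry check can be sketched as below. The commit message does not spell out what is reported once a value has expired or how the race was closed, so two things here are assumptions: the fallback to the device mtu, and reading the expiry time once into a local so the test and any later use agree. `cached_pmtu`, `tick_after()` and `reported_mtu()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is later than b" for a free-running tick counter. */
static bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

struct cached_pmtu {
    uint32_t pmtu;
    uint32_t expires;    /* tick at which pmtu goes stale; 0 means no value */
};

/* Value to report to userspace at tick 'now'. */
static uint32_t reported_mtu(const struct cached_pmtu *c, uint32_t dev_mtu,
                             uint32_t now)
{
    uint32_t expires = c->expires;   /* single read: check and use agree */

    assert(dev_mtu > 0);
    if (expires == 0 || !tick_after(expires, now))
        return dev_mtu;              /* assumed fallback for stale values */
    return c->pmtu;
}
```

The signed subtraction in `tick_after()` keeps the comparison correct when the counter wraps, as long as the two ticks are less than half the counter range apart.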
      
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Don't create nh exception when the device mtu is smaller than the reported pmtu · 7f92d334
      Steffen Klassert authored
      
      When a local tool like tracepath tries to send packets bigger than
      the device mtu, we create a nh exception and set the pmtu to device
      mtu. The device mtu does not expire, so check if the device mtu is
      smaller than the reported pmtu and don't create a nh exception in
      that case.
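
Read literally, the rule above reduces to one comparison. The helper below is a hypothetical sketch of that rule only; `want_nh_exception()` is not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a reported path mtu is worth recording as an exception. */
static bool want_nh_exception(uint32_t reported_pmtu, uint32_t dev_mtu)
{
    assert(reported_pmtu > 0 && dev_mtu > 0);

    /* The device mtu already bounds the path, and it never expires. */
    if (dev_mtu < reported_pmtu)
        return false;
    return true;
}
```

A reported value of 9000 on a 1500-byte device is ignored, while a reported 1400 is still recorded.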
      
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>