  1. Jan 03, 2013
  2. Dec 14, 2012
    • inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock · e337e24d
      Christoph Paasch authored
      
      If in either of the above functions inet_csk_route_child_sock() or
      __inet_inherit_port() fails, the newsk will not be freed:
      
      unreferenced object 0xffff88022e8a92c0 (size 1592):
        comm "softirq", pid 0, jiffies 4294946244 (age 726.160s)
        hex dump (first 32 bytes):
          0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00  ................
          02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e
          [<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5
          [<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd
          [<ffffffff8149b784>] sk_clone_lock+0x16/0x21e
          [<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b
          [<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481
          [<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b
          [<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416
          [<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc
          [<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701
          [<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4
          [<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f
          [<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233
          [<ffffffff814cee68>] ip_rcv+0x217/0x267
          [<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553
          [<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82
      
      This happens because sk_clone_lock() initializes sk_refcnt to 2, and thus
      a single sock_put() is not enough to free the memory. Additionally, things
      like xfrm, memcg, cookie_values,... may have been initialized.
      We have to free them properly.
      
      This is fixed by forcing a call to tcp_done(), ending up in
      inet_csk_destroy_sock(), which does the final sock_put(). tcp_done() is
      necessary because it ends up doing all the cleanup on xfrm, memcg,
      cookie_values,...
      
      Before calling tcp_done(), we have to set the socket to SOCK_DEAD, to
      force it to enter inet_csk_destroy_sock(). To avoid the warning in
      inet_csk_destroy_sock(), inet_num has to be set to 0.
      As inet_csk_destroy_sock() decrements orphan_count, we first have to
      increment it.
      
      Calling tcp_done() allows us to remove the calls to
      tcp_clear_xmit_timer() and tcp_cleanup_congestion_control().
      
      A similar approach is taken for dccp by calling dccp_done().
      
      This bug has been in the kernel since 093d2823 (tproxy: fix hash locking
      issue when using port redirection in __inet_inherit_port()), and thus
      since version 2.6.37.
      
      Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e337e24d
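The reference-count problem described in the commit above can be illustrated with a small userspace model. This is only a sketch: `toy_sock`, `toy_clone`, `toy_put`, and `toy_released_after` are hypothetical names, not kernel APIs. The model mirrors just one fact stated in the message, namely that the cloned object starts with a reference count of 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct toy_sock {
    int refcnt;
    bool freed;
};

/* Models the clone step: the new object starts with two references. */
static void toy_clone(struct toy_sock *newsk)
{
    newsk->refcnt = 2;
    newsk->freed = false;
}

/* Models a put: drops one reference and frees on the last one. */
static void toy_put(struct toy_sock *sk)
{
    assert(sk->refcnt > 0);
    if (--sk->refcnt == 0)
        sk->freed = true;
}

/* Returns true if the object is released after `puts` put calls. */
static bool toy_released_after(int puts)
{
    struct toy_sock sk;

    toy_clone(&sk);
    for (int i = 0; i < puts; i++)
        toy_put(&sk);
    return sk.freed;
}
```

In this model a single put leaves the object allocated, which corresponds to the leak that kmemleak reports; only the second put releases it.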
    • ipv6: Change skb->data before using icmpv6_notify() to propagate redirect · 093d04d4
      Duan Jiong authored
      
      In function ndisc_redirect_rcv(), skb->data points to the transport
      header, but function icmpv6_notify() needs skb->data to point to the
      inner IP packet. So before using icmpv6_notify() to propagate the redirect,
      change skb->data to point to the inner IP packet that triggered the sending
      of the Redirect, and introduce struct rd_msg to make this easy.
      
      Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      093d04d4
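As a rough illustration of what "pointing at the inner IP packet" involves, the sketch below walks the options of a Redirect message as laid out in RFC 4861 and returns the offset of the packet carried in the Redirected Header option. It is a userspace approximation, not the kernel code: `inner_packet_offset` and the macro names are hypothetical, and it works on a plain byte buffer rather than an skb or struct rd_msg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RD_FIXED_LEN     40 /* ICMPv6 header (8) + target (16) + destination (16) */
#define OPT_REDIRECT_HDR 4  /* Redirected Header option type, RFC 4861 */
#define OPT_RH_PREFIX    8  /* type (1) + length (1) + reserved (6) */

/*
 * Returns the offset of the inner IP packet inside a Redirect message,
 * or -1 if no well-formed Redirected Header option is present.
 */
static long inner_packet_offset(const uint8_t *msg, size_t len)
{
    size_t off = RD_FIXED_LEN;

    assert(msg != NULL);
    while (off + 2 <= len) {
        uint8_t type = msg[off];
        size_t optlen = (size_t)msg[off + 1] * 8; /* length is in units of 8 octets */

        if (optlen == 0 || off + optlen > len)
            return -1;
        if (type == OPT_REDIRECT_HDR)
            return (long)(off + OPT_RH_PREFIX);
        off += optlen;
    }
    return -1;
}
```

A caller would add the returned offset to the start of the message to obtain a pointer to the inner packet, after checking for -1.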
  3. Dec 12, 2012
    • ndisc: Unexport ndisc_{build,send}_skb(). · fd0ea7db
      YOSHIFUJI Hideaki authored
      
      These symbols were exported for bonding device by commit 305d552a
      ("bonding: send IPv6 neighbor advertisement on failover").
      
      They became obsolete with commit 7c899432 ("bonding, ipv4, ipv6, vlan: Handle
      NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS") and their user was removed
      by commit 4f5762ec ("bonding: Remove obsolete source file 'bond_ipv6.c'").
      
      Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd0ea7db
    • pkt_sched: avoid requeues if possible · 1abbe139
      Eric Dumazet authored
      
      With BQL being deployed, we are more likely to see the following behavior:
      
      We dequeue a packet from the qdisc in dequeue_skb(), then we realize the
      target tx queue is in XOFF state in sch_direct_xmit(), and we have to hold
      the skb in gso_skb for later.
      
      This shows in stats (tc -s qdisc dev eth0) as requeues.
      
      The problem with these requeues is that high priority packets cannot be
      dequeued as long as this (possibly low prio and big TSO) packet is not
      removed from gso_skb.
      
      At 1Gbps speed, a full size TSO packet is 500 us of extra latency.
      
      In some cases, we know that all packets dequeued from a qdisc are
      for a particular and known txq:
      
      - If the device is not multiqueue
      - For all MQ/MQPRIO slave qdiscs
      
      This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE, to mark
      this capability, so that dequeue_skb() is allowed to dequeue a packet
      only if the associated txq is not stopped.
      
      This indeed reduces latencies for high prio packets (or improves fairness
      with sfq/fq_codel), and almost removes qdisc 'requeues'.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1abbe139
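The rule this commit introduces, "dequeue only if the single associated txq is not stopped", can be sketched in userspace as follows. Everything here is a simplified assumption: `toy_qdisc`, `toy_dequeue`, and `TOY_F_ONETXQUEUE` are hypothetical stand-ins for the kernel's qdisc, dequeue_skb(), and TCQ_F_ONETXQUEUE, and packets are reduced to integer ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_F_ONETXQUEUE 0x1 /* hypothetical stand-in for TCQ_F_ONETXQUEUE */

struct toy_qdisc {
    unsigned int flags;
    int queue[8];      /* packet ids, oldest at index head */
    size_t head, tail;
    bool txq_stopped;  /* state of the one tx queue this qdisc feeds */
};

/* Returns true and stores a packet id in *pkt if one was dequeued. */
static bool toy_dequeue(struct toy_qdisc *q, int *pkt)
{
    assert(q != NULL && pkt != NULL);

    /* All packets go to one known txq: if it is stopped, leave them queued. */
    if ((q->flags & TOY_F_ONETXQUEUE) && q->txq_stopped)
        return false;
    if (q->head == q->tail)
        return false;
    *pkt = q->queue[q->head++];
    return true;
}
```

Because the packet stays in the queue while the txq is stopped, a higher priority packet that arrives in the meantime can still be chosen first once transmission resumes, instead of waiting behind a held-back skb.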
  4. Dec 11, 2012
  5. Dec 07, 2012
    • tcp: bug fix Fast Open client retransmission · 93b174ad
      Yuchung Cheng authored
      
      If the SYN-ACK partially acks the SYN-data, the client retransmits the
      remaining data by tcp_retransmit_skb(). This increments loss recovery
      state variables like tp->retrans_out in Open state. If loss recovery
      happens before the retransmission is acked, it triggers the WARN_ON
      check in tcp_fastretrans_alert(). For example: the client sends
      SYN-data, gets a SYN-ACK acking only the ISN, retransmits data, sends
      another 4 data packets and gets 3 dupacks.
      
      Since the retransmission is not caused by a network drop, it should not
      update the recovery state variables. Further, the server may return a
      smaller MSS than the cached MSS used for SYN-data, so the retransmission
      needs a loop. Otherwise some data will not be retransmitted until timeout
      or other loss recovery events.
      
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93b174ad
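The reason the retransmission needs a loop can be shown with a little arithmetic. The function below is a hypothetical userspace helper, not kernel code: it counts how many segments are needed to resend the unacknowledged SYN-data when the server's MSS is smaller than the cached one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts the segments needed to resend `remaining` bytes when each
 * segment carries at most `mss` bytes.
 */
static size_t toy_resend_segments(size_t remaining, size_t mss)
{
    size_t segments = 0;

    assert(mss > 0);
    while (remaining > 0) {
        size_t chunk = remaining < mss ? remaining : mss;

        remaining -= chunk; /* a real sender would transmit `chunk` bytes here */
        segments++;
    }
    return segments;
}
```

For example, 1400 bytes of SYN-data fit into one segment with an MSS of 1460, but need three segments if the server returns an MSS of 536. A single retransmission call would leave the rest unsent.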
    • sctp: Add RCU protection to assoc->transport_addr_list · 45122ca2
      Thomas Graf authored
      
      peer.transport_addr_list is currently only protected by sk_lock,
      which is impractical to acquire for procfs dumping purposes.
      
      This patch adds RCU protection allowing for the procfs readers to
      enter RCU read-side critical sections.
      
      Modification of the list continues to be serialized via sk_lock.
      
      V2: Use list_del_rcu() in sctp_association_free() to be safe
          Skip transports marked dead when dumping for procfs
      
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      45122ca2
  6. Dec 05, 2012
  7. Dec 04, 2012
  8. Dec 03, 2012
  9. Dec 01, 2012
  10. Nov 30, 2012
    • net: move inet_dport/inet_num in sock_common · ce43b03e
      Eric Dumazet authored
      
      Commit 68835aba (net: optimize INET input path further)
      moved some fields used for tcp/udp socket lookup into the first cache
      line of struct sock_common.
      
      This patch moves inet_dport/inet_num as well, filling a 32-bit hole
      on 64-bit arches and reducing the number of cache line misses in lookups.
      
      Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
      before the addresses match, as this check is more discriminating.
      
      Remove the hash check from the MATCH() macros because we don't need to
      revalidate the hash value after taking a refcount on the socket, and
      use likely/unlikely compiler hints, as the sk_hash/hash check
      makes the following conditional tests 100% predicted by the cpu.
      
      Introduce skc_addrpair/skc_portpair pair values to better
      document the alignment requirements of the port/addr pairs
      used in the various MATCH() macros, and remove some casts.
      
      The namespace check can also be done last.
      
      This slightly improves TCP/UDP lookup times.
      
      IP/TCP early demux needs inet->rx_dst_ifindex and
      TCP needs inet->min_ttl, so let's group them together in the same cache line.
      
      With help from Ben Hutchings & Joe Perches.
      
      The idea for this patch came after Ling Ma's proposal to move skc_hash
      to the beginning of struct sock_common, and it should allow him
      to submit a final version of his patch. My tests show an improvement
      doing so.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ling Ma <ling.ma.program@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce43b03e
    • rtnetlink: remove unused parameter from rtnl_create_link(). · c0713563
      Rami Rosen authored
      
      This patch removes an unused parameter (src_net) from the rtnl_create_link()
      method and from the method's single invocation, in veth.
      This parameter was used in the past when calling
      ops->get_tx_queues(src_net, tb) in rtnl_create_link().
      The get_tx_queues() member of rtnl_link_ops was replaced by two methods,
      get_num_tx_queues() and get_num_rx_queues(), which do not take any
      parameters. This was done in commit d40156aa by
      Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count").
      
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0713563
    • cfg80211: fix BSS struct IE access races · 9caf0364
      Johannes Berg authored
      
      When a BSS struct is updated, the IEs are currently
      overwritten or freed. This can lead to races if some
      other CPU is accessing the BSS struct and using the
      IEs concurrently.
      
      Fix this by always allocating the IEs in a new struct
      that holds the data and length and protecting access
      to this new struct with RCU.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      9caf0364
    • mac80211: remove probe response temporary buffer allocation · b9a9ada1
      Johannes Berg authored
      
      Instead of allocating a temporary buffer to build IEs,
      build them right into the SKB.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      b9a9ada1
  11. Nov 27, 2012
  12. Nov 26, 2012
    • mac80211: support VHT rates in TX info · 8bc83c24
      Johannes Berg authored
      
      To achieve this, limit the number of retries to
      31 (instead of 255) and use the three bits that
      are then free for VHT flags.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      8bc83c24
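Limiting the retry count to 31 means it fits in 5 bits, so an 8-bit counter gives up three bits that can carry flags. The sketch below shows that arithmetic with a hypothetical single-byte layout; `toy_pack`, `toy_count`, and `toy_flags` are illustrative names, and the actual field layout used by mac80211 is not shown in the commit message and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_COUNT_BITS 5
#define TOY_COUNT_MAX  ((1u << TOY_COUNT_BITS) - 1) /* 31 */

/* Packs a retry count (low 5 bits) and flags (upper 3 bits) into one byte. */
static uint8_t toy_pack(unsigned int count, unsigned int flags)
{
    assert(flags < 8);
    if (count > TOY_COUNT_MAX)
        count = TOY_COUNT_MAX; /* clamp to the 5-bit limit */
    return (uint8_t)((flags << TOY_COUNT_BITS) | count);
}

/* Extracts the retry count from a packed byte. */
static unsigned int toy_count(uint8_t v)
{
    return v & TOY_COUNT_MAX;
}

/* Extracts the three flag bits from a packed byte. */
static unsigned int toy_flags(uint8_t v)
{
    return v >> TOY_COUNT_BITS;
}
```

A request for 255 retries is clamped to 31 in this model, which is the trade-off the commit accepts in exchange for the extra flag bits.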
    • mac80211: support drivers reporting VHT RX · 5614618e
      Johannes Berg authored
      
      Add support to mac80211 for having drivers report
      received VHT MCS information.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      5614618e
    • nl80211/cfg80211: add VHT MCS support · db9c64cf
      Johannes Berg authored
      
      Add support for reporting and calculating VHT MCSes.
      
      Note that I'm not completely sure that the bitrate
      calculations are correct, nor that they can't be
      simplified.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      db9c64cf
    • mac80211: convert to channel definition struct · 4bf88530
      Johannes Berg authored
      
      Convert mac80211 (and where necessary, some drivers a
      little bit) to the new channel definition struct.
      
      This will allow extending mac80211 for VHT, which is
      currently restricted to channel contexts since there
      are no drivers using that which makes it easier. As
      I also don't care about VHT for drivers not using the
      channel context API, I won't convert the previous API
      to VHT support.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      4bf88530
    • nl80211/cfg80211: support VHT channel configuration · 3d9d1d66
      Johannes Berg authored
      
      Change nl80211 to support specifying a VHT (or HT) channel
      using the control channel frequency (as before) and
      new attributes for the channel width and first and
      second center frequency. The old channel type is of
      course still supported for HT.
      
      Also change the cfg80211 channel definition struct
      to support these by adding the relevant fields to
      it (and removing the _type field.)
      
      This also adds new helper functions:
       - cfg80211_chandef_create to create a channel def
         struct given the control channel and channel type,
       - cfg80211_chandef_identical to check if two channel
         definitions are identical
       - cfg80211_chandef_compatible to check if the given
         channel definitions are compatible, and return the
         wider of the two
      
      This isn't entirely complete, but that doesn't matter
      until we have a driver using it. In particular, it's
      missing
       - regulatory checks on the usable bandwidth (if that
         even makes sense)
       - regulatory TX power (database can't deal with it)
       - a proper channel compatibility calculation for the
         new channel types
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      3d9d1d66
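A channel definition described by a control channel frequency, a width, and two center frequencies can be modeled as a small struct, with an equality helper in the spirit of cfg80211_chandef_identical. The sketch is a userspace assumption: `toy_chandef`, `toy_width`, and `toy_chandef_identical` are hypothetical and do not reproduce the cfg80211 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel widths; not the nl80211 enum. */
enum toy_width { TOY_WIDTH_20, TOY_WIDTH_40, TOY_WIDTH_80, TOY_WIDTH_80P80, TOY_WIDTH_160 };

/* Hypothetical channel definition; frequencies are in MHz. */
struct toy_chandef {
    int control_freq;  /* control channel frequency */
    enum toy_width width;
    int center_freq1;  /* center of the first segment */
    int center_freq2;  /* center of the second segment, 0 if unused */
};

/* Two definitions are identical only if every field agrees. */
static bool toy_chandef_identical(const struct toy_chandef *a,
                                  const struct toy_chandef *b)
{
    assert(a != NULL && b != NULL);
    return a->control_freq == b->control_freq &&
           a->width == b->width &&
           a->center_freq1 == b->center_freq1 &&
           a->center_freq2 == b->center_freq2;
}
```

Comparing all fields matters because two definitions can share a control channel and width while occupying different spectrum.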
    • cfg80211: pass a channel definition struct · 683b6d3b
      Johannes Berg authored
      
      Instead of passing a channel pointer and channel type
      to all functions and driver methods, pass a new channel
      definition struct. Right now, this struct contains just
      the control channel and channel type, but for VHT this
      will change.
      
      Also, add a small inline cfg80211_get_chandef_type() so
      that drivers don't need to use the _type field of the
      new structure all the time, which will change.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      683b6d3b
    • cfg80211: remove remain-on-channel channel type · 42d97a59
      Johannes Berg authored
      
      As mwifiex (and mac80211 in the software case) are the
      only drivers actually implementing remain-on-channel
      with channel type, userspace can't be relying on it.
      This is the case, as it's used only for P2P operations
      right now.
      
      Rather than adding a flag to tell userspace whether or
      not it can actually rely on it, simplify all the code
      by removing the ability to use different channel types.
      Leave only the validation of the attribute, so that if
      we extend it again later (with the needed capability
      flag), it can't break userspace sending invalid data.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      42d97a59
    • cfg80211: change function signature of cfg80211_get_p2p_attr() · c216e641
      Arend van Spriel authored
      
      The function cfg80211_get_p2p_attr() can fail and return
      a negative error code. However, the return type is unsigned
      int. The largest positive return value is bounded by the desired_len
      variable in the function, which is u16. So change the return
      type to int to allow easy error checking. Also change the type
      of the attribute to an enum for improved type checking.
      
      Signed-off-by: Arend van Spriel <arend@broadcom.com>
      [fix indentation, don't use u8 attr variable]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      c216e641
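The motivation for the signature change is that a caller's `ret < 0` test can never be true when the value is held as unsigned int. The sketch below shows the int-returning pattern with a hypothetical helper; `toy_get_attr` is not cfg80211_get_p2p_attr(), and its parameters are assumptions made for illustration. Since the length is a 16-bit value, every valid result fits in a non-negative int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies up to bufsize bytes of an attribute into buf and returns the
 * full attribute length, or a negative errno value on failure.
 */
static int toy_get_attr(const uint8_t *attr, uint16_t attr_len,
                        uint8_t *buf, size_t bufsize)
{
    size_t n;

    assert(buf != NULL);
    if (attr == NULL || attr_len == 0)
        return -ENOENT; /* attribute not present */

    n = attr_len < bufsize ? attr_len : bufsize;
    memcpy(buf, attr, n);
    return attr_len; /* at most 65535, so it always fits in an int */
}
```

A caller can then write `if (ret < 0)` to detect failure and otherwise treat `ret` as a length.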
  13. Nov 23, 2012