- May 04, 2022
-
-
Vladimir Oltean authored
Bring this driver-specific selftest output in line with the other selftests. Before: Testing VLAN pop.. OK Testing VLAN push.. OK Testing ingress VLAN modification.. OK Testing egress VLAN modification.. OK Testing frame prioritization.. OK After: TEST: VLAN pop [ OK ] TEST: VLAN push [ OK ] TEST: Ingress VLAN modification [ OK ] TEST: Egress VLAN modification [ OK ] TEST: Frame prioritization [ OK ] Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Joachim Wiberg authored
Extend tcpdump_start() & C:o to handle multiple instances. Useful when observing bridge operation, e.g., unicast learning/flooding, and any case of multicast distribution (to these ports but not that one ...). This means the interface argument is now a mandatory argument to all tcpdump_*() functions, hence the changes to the ocelot flower test. Signed-off-by:
Joachim Wiberg <troglobit@gmail.com> Reviewed-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 6182c5c5) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Joachim Wiberg authored
For some use-cases we may want to change the tcpdump flags used in tcpdump_start(). For instance, observing interfaces without the PROMISC flag, e.g. to see what's really being forwarded to the bridge interface. Signed-off-by:
Joachim Wiberg <troglobit@gmail.com> Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit fe32dffd) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Joachim Wiberg authored
Boiler plate for testing static mdb entries. This first test verifies adding and removing host mdb entries for all supported types: IPv4, IPv6, and MAC multicast. Signed-off-by:
Joachim Wiberg <troglobit@gmail.com> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> (cherry picked from commit 50fe062c) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Conflicts in tools/testing/selftests/net/forwarding/Makefile with commit b2b681a4 ("selftests: forwarding: tests of locked port feature") which we did not backport.
-
Vladimir Oltean authored
As discussed here with Ido Schimmel: https://patchwork.kernel.org/project/netdevbpf/patch/20220224102908.5255-2-jianbol@nvidia.com/ the default conform-exceed action is "reclassify", for a reason we don't really understand. The point is that hardware can't offload that police action, so not specifying "conform-exceed" was always wrong, even though the command used to work in hardware (but not in software) until the kernel started adding validation for it. Fix the command used by the selftest by making the policer drop on exceed, and pass the packet to the next action (goto) on conform. Fixes: 8cd6b020 ("selftests: ocelot: add some example VCAP IS1, IS2 and ES0 tc offloads") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by:
Ido Schimmel <idosch@nvidia.com>
-
Vladimir Oltean authored
This has been validated on the Ocelot/Felix switch family (NXP LS1028A) and should be relevant to any switch driver that offloads the tc-flower and/or tc-matchall actions trap, drop, accept, mirred, for which DSA has operations. TEST: gact drop and ok (skip_hw) [ OK ] TEST: mirred egress flower redirect (skip_hw) [ OK ] TEST: mirred egress flower mirror (skip_hw) [ OK ] TEST: mirred egress matchall mirror (skip_hw) [ OK ] TEST: mirred_egress_to_ingress (skip_hw) [ OK ] TEST: gact drop and ok (skip_sw) [ OK ] TEST: mirred egress flower redirect (skip_sw) [ OK ] TEST: mirred egress flower mirror (skip_sw) [ OK ] TEST: mirred egress matchall mirror (skip_sw) [ OK ] TEST: trap (skip_sw) [ OK ] TEST: mirred_egress_to_ingress (skip_sw) [ OK ] Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The host interfaces $h1 and $h2 don't have to be switchdev interfaces, but due to the fact that we pass $tcflags which may have the value of "skip_sw", we force $h2 to offload a drop rule for dst_ip, something which it may not be able to do. The selftest only wants to verify the hit count of this rule as a means of figuring out whether the packet was received, so remove the $tcflags for it and let it be done in software. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The "ok" tc action is useful when placed in front of a more generic filter to exclude some more specific rules from matching it. The ocelot switches can offload this tc action by creating an empty action vector (no _ENA fields set to 1). This makes sense for all of VCAP IS1, IS2 and ES0 (but not for PSFP). Add support for this action. Note that this makes the gact_drop_and_ok_test() selftest pass, where "action ok" is used in front of an "action drop" rule, both offloaded to VCAP IS2. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
OCELOT_POLICER_DISCARD helps "kill dropped packets dead" since a PERMIT/DENY mask mode with a port mask of 0 isn't enough to stop the CPU port from receiving packets removed from the forwarding path. The hardcoded initialization done for it in ocelot_vcap_init() is confusing. All we need from it is to have a rate and a burst size of 0. Reuse qos_policer_conf_set() for that purpose. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The "port" argument is used for nothing else except printing on the error path. Print errors on behalf of the policer index, which is less confusing anyway. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Unify the code paths for adding to an empty list and to a list with elements by keeping a "pos" list_head element that indicates where to insert. Initialize "pos" with the list head itself in case list_for_each_entry() doesn't iterate over any element. Note that list_for_each_safe() isn't needed because no element is removed from the list while iterating. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
This makes no functional difference but helps in minimizing the delta for a future change. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
list_add(..., pos->prev) and list_add_tail(..., pos) are equivalent, use the later form to unify with the case where the list is empty later. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Given the following order of operations: (1) we add filter A using tc-flower (2) we send a packet that matches it (3) we read the filter's statistics to find a hit count of 1 (4) we add a second filter B with a higher preference than A, and A moves one position to the right to make room in the TCAM for it (5) we send another packet, and this matches the second filter B (6) we read the filter statistics again. When this happens, the hit count of filter A is 2 and of filter B is 1, despite a single packet having matched each filter. Furthermore, in an alternate history, reading the filter stats a second time between steps (3) and (4) makes the hit count of filter A remain at 1 after step (6), as expected. The reason why this happens has to do with the filter->stats.pkts field, which is written to hardware through the call path below: vcap_entry_set / | \ / | \ / | \ / | \ es0_entry_set is1_entry_set is2_entry_set \ | / \ | / \ | / vcap_data_set(data.counter, ...) The primary role of filter->stats.pkts is to transport the filter hit counters from the last readout all the way from vcap_entry_get() -> ocelot_vcap_filter_stats_update() -> ocelot_cls_flower_stats(). The reason why vcap_entry_set() writes it to hardware is so that the counters (saturating and having a limited bit width) are cleared after each user space readout. The writing of filter->stats.pkts to hardware during the TCAM entry movement procedure is an unintentional consequence of the code design, because the hit count isn't up to date at this point. So at step (4), when filter A is moved by ocelot_vcap_filter_add() to make room for filter B, the hardware hit count is 0 (no packet matched on it in the meantime), but filter->stats.pkts is 1, because the last readout saw the earlier packet. The movement procedure programs the old hit count back to hardware, so this creates the impression to user space that more packets have been matched than they really were. The bug can be seen when running the gact_drop_and_ok_test() from the tc_actions.sh selftest. Fix the issue by reading back the hit count to tmp->stats.pkts before migrating the VCAP filter. Sure, this is a best-effort technique, since the packets that hit the rule between vcap_entry_get() and vcap_entry_set() won't be counted, but at least it allows the counters to be reliably used for selftests where the traffic is under control. The vcap_entry_get() name is a bit unintuitive, but it only reads back the counter portion of the TCAM entry, not the entire entry. The index from which we retrieve the counter is also a bit unintuitive (i - 1 during add, i + 1 during del), but this is the way in which TCAM entry movement works. The "entry index" isn't a stored integer for a TCAM filter, instead it is dynamically computed by ocelot_vcap_block_get_filter_index() based on the entry's position in the &block->rules list. That position (as well as block->count) is automatically updated by ocelot_vcap_filter_add_to_block() on add, and by ocelot_vcap_block_remove_filter() on del. So "i" is the new filter index, and "i - 1" or "i + 1" respectively are the old addresses of that TCAM entry (we only support installing/deleting one filter at a time). Fixes: b5962294 ("net: mscc: ocelot: Add support for tcam") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Once the CPU port was added to the destination port mask of a packet, it can never be cleared, so even packets marked as dropped by the MASK_MODE of a VCAP IS2 filter will still reach it. This is why we need the OCELOT_POLICER_DISCARD to "kill dropped packets dead" and make software stop seeing them. We disallow policer rules from being put on any other chain than the one for the first lookup, but we don't do this for "drop" rules, although we should. This change is merely ascertaining that the rules dont't (completely) work and letting the user know. The blamed commit is the one that introduced the multi-chain architecture in ocelot. Prior to that, we should have always offloaded the filters to VCAP IS2 lookup 0, where they did work. Fixes: 1397a2eb ("net: mscc: ocelot: create TCAM skeleton from tc filter chains") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The VCAP IS2 TCAM is looked up twice per packet, and each filter can be configured to only match during the first, second lookup, or both, or none. The blamed commit wrote the code for making VCAP IS2 filters match only on the given lookup. But right below that code, there was another line that explicitly made the lookup a "don't care", and this is overwriting the lookup we've selected. So the code had no effect. Some of the more noticeable effects of having filters match on both lookups: - in "tc -s filter show dev swp0 ingress", we see each packet matching a VCAP IS2 filter counted twice. This throws off scripts such as tools/testing/selftests/net/forwarding/tc_actions.sh and makes them fail. - a "tc-drop" action offloaded to VCAP IS2 needs a policer as well, because once the CPU port becomes a member of the destination port mask of a packet, nothing removes it, not even a PERMIT/DENY mask mode with a port mask of 0. But VCAP IS2 rules with the POLICE_ENA bit in the action vector can only appear in the first lookup. What happens when a filter matches both lookups is that the action vector is combined, and this makes the POLICE_ENA bit ineffective, since the last lookup in which it has appeared is the second one. In other words, "tc-drop" actions do not drop packets for the CPU port, dropped packets are still seen by software unless there was an FDB entry that directed those packets to some other place different from the CPU. The last bit used to work, because in the initial commit b5962294 ("net: mscc: ocelot: Add support for tcam"), we were writing the FIRST field of the VCAP IS2 half key with a 1, not with a "don't care". The change to "don't care" was made inadvertently by me in commit c1c3993e ("net: mscc: ocelot: generalize existing code for VCAP"), which I just realized, and which needs a separate fix from this one, for "stable" kernels that lack the commit blamed below. Fixes: 226e9cd8 ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
ocelot_vcap_filter_del() works by moving the next filters over the current one, and then deleting the last filter by calling vcap_entry_set() with a del_filter which was specially created by memsetting its memory to zeroes. vcap_entry_set() then programs this to the TCAM and action RAM via the cache registers. The problem is that vcap_entry_set() is a dispatch function which looks at del_filter->block_id. But since del_filter is zeroized memory, the block_id is 0, or otherwise said, VCAP_ES0. So practically, what we do is delete the entry at the same TCAM index from VCAP ES0 instead of IS1 or IS2. The code was not always like this. vcap_entry_set() used to simply be is2_entry_set(), and then, the logic used to work. Restore the functionality by populating the block_id of the del_filter based on the VCAP block of the filter that we're deleting. This makes vcap_entry_set() know what to do. Fixes: 1397a2eb ("net: mscc: ocelot: create TCAM skeleton from tc filter chains") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The error path of ocelot_flower_parse() removes a VCAP filter from ocelot->traps, but the main deletion path - ocelot_vcap_filter_del() - does not. So functions such as felix_update_trapping_destinations() can still access the freed VCAP filter via ocelot->traps. Fix this bug by removing the filter from ocelot->traps when it gets deleted. Fixes: e42bd4ed ("net: mscc: ocelot: keep traps in a list") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Since the blamed commit, VCAP filters can appear on more than one list. If their action is "trap", they are chained on ocelot->traps via filter->trap_list. Consequently, when we free a VCAP filter, we must remove it from all lists it is a member of, including ocelot->traps. Normally, conditionally removing an element from a list (depending on whether it is present or not) involves traversing the list, but we already have a reference to the element, so that isn't really necessary. Moreover, the operation "list_del(&filter->trap_list)" operation is fundamentally the same regardless of whether we've iterated through the list or just happened to have the element. So I thought it would be ok to check whether the element has been added to a list by calling list_empty(). However, this does not do the correct thing. list_empty() checks whether "head->next == head", but in our case, head->next == head->prev == NULL, and head != NULL. This makes us proceed to call list_del(), which modifies the prev pointer of the next element, and the next pointer of the prev element. But the next and prev elements are NULL, so we dereference those pointers and die. It would appear that list_empty() is not the function to use to detect that condition. But if we had previously called INIT_LIST_HEAD() on &filter->trap_list, then we could use list_empty() to denote whether we are members of a list (any list). Although the more "natural" thing seems to be to iterate through ocelot->traps and only remove the filter from the list if it was a member of it, it seems pointless to do that. So fix the bug by calling INIT_LIST_HEAD() on the non-head element. Fixes: e42bd4ed ("net: mscc: ocelot: keep traps in a list") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Gain support for port mirroring using tc-matchall by forwarding the calls to the ocelot switch library. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 5e497497) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Drivers might have error messages to propagate to user space, most common being that they support a single mirror port. Propagate the netlink extack so that they can inform user space in a verbal way of their limitations. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 0148bb50) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Per-flow mirroring with the VCAP IS2 TCAM (in itself handled as an offload for tc-flower) is done by setting the MIRROR_ENA bit from the action vector of the filter. The packet is mirrored to the port mask configured in the ANA:ANA:MIRRORPORTS register (the same port mask as the destinations for port-based mirroring). Functionality was tested with: tc qdisc add dev swp3 clsact tc filter add dev swp3 ingress protocol ip \ flower skip_sw ip_proto icmp \ action mirred egress mirror dev swp1 and pinging through swp3, while seeing that the ICMP replies are mirrored towards swp1. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit f2a0e216) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Some VCAP filters utilize resources which are global to the switch, like for example VCAP IS2 policers take an index into a global policer pool. In commit c9a7fe12 ("net: mscc: ocelot: add action of police on vcap_is2"), Xiaoliang expressed this by hooking into the low-level ocelot_vcap_filter_add_to_block() and ocelot_vcap_block_remove_filter() functions, and allocating/freeing the policers from there. Evaluating the code, there probably isn't a better place, but we'll need to do something similar for the mirror ports, and the code will start to look even more hacked up than it is right now. Create two ocelot_vcap_filter_{add,del}_aux_resources() functions to contain the madness, and pollute less the body of other functions such as ocelot_vcap_filter_add_to_block(). Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit c3d427ea) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Ocelot switches perform port-based ingress mirroring if ANA:PORT:PORT_CFG field SRC_MIRROR_ENA is set, and egress mirroring if the port is in ANA:ANA:EMIRRORPORTS. Both ingress-mirrored and egress-mirrored frames are copied to the port mask from ANA:ANA:MIRRORPORTS. So the choice of limiting to a single mirror port via ocelot_mirror_get() and ocelot_mirror_put() may seem bizarre, but the hardware model doesn't map very well to the user space model. If the user wants to mirror the ingress of swp1 towards swp2 and the ingress of swp3 towards swp4, we'd have to program ANA:ANA:MIRRORPORTS with BIT(2) | BIT(4), and that would make swp1 be mirrored towards swp4 too, and swp3 towards swp2. But there are no tc-matchall rules to describe those actions. Now, we could offload a matchall rule with multiple mirred actions, one per desired mirror port, and force the user to stick to the multi-action rule format for subsequent matchall filters. But both DSA and ocelot have the flow_offload_has_one_action() check for the matchall offload, plus the fact that it will get cumbersome to cross-check matchall mirrors with flower mirrors (which will be added in the next patch). As a result, we limit the configuration to a single mirror port, with the possibility of lifting the restriction in the future. Frames injected from the CPU don't get egress-mirrored, since they are sent with the BYPASS bit in the injection frame header, and this bypasses the analyzer module (effectively also the mirroring logic). I don't know what to do/say about this. Functionality was tested with: tc qdisc add dev swp3 clsact tc filter add dev swp3 ingress \ matchall skip_sw \ action mirred egress mirror dev swp1 and pinging through swp3, while seeing that the ICMP replies are mirrored towards swp1. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit ccb6ed42) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
In preparation for adding port mirroring support to the ocelot driver, the dispatching function ocelot_setup_tc_cls_matchall() must be free of action-specific code. Move port policer creation and deletion to separate functions. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 4fa72108) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Jianbo Liu authored
As more police parameters are passed to flow_offload, driver can check them to make sure hardware handles packets in the way indicated by tc. The conform-exceed control should be drop/pipe or drop/ok. Besides, for drop/ok, the police should be the last action. As hardware can't configure peakrate/avrate/overhead, offload should not be supported if any of them is configured. Signed-off-by:
Jianbo Liu <jianbol@nvidia.com> Reviewed-by:
Ido Schimmel <idosch@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit d97b4b10) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Conflicts in drivers/net/ethernet/netronome/nfp/flower/qos_conf.c which we did not update.
-
Jianbo Liu authored
The current police offload action entry is missing exceed/notexceed actions and parameters that can be configured by tc police action. Add the missing parameters as a pre-step for offloading police actions to hardware. Signed-off-by:
Jianbo Liu <jianbol@nvidia.com> Signed-off-by:
Roi Dayan <roid@nvidia.com> Reviewed-by:
Ido Schimmel <idosch@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit b8cd5831) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
Use flow_indr_dev_register/flow_indr_dev_setup_offload to offload tc action. We need to call tc_cleanup_flow_action to clean up tc action entry since in tc_setup_action, some actions may hold dev refcnt, especially the mirror action. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Louis Peens <louis.peens@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 8cbfe939) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
Add a new ops to tc_action_ops for flow action setup. Refactor function tc_setup_flow_action to use this new ops. We make this change to facilitate to add standalone action module. We will also use this ops to offload action independent of filter in following patch. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit c54e1d92) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
To improves readability, we rename offload functions with offload instead of flow. The term flow is related to exact matches, so we rename these functions with offload. We make this change to facilitate single action offload functions naming. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 9c1c0e12) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
Add index to flow_action_entry structure and delete index from police and gate child structure. We make this change to offload tc action for driver to identify a tc action. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 5a995900) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
A follow-up patch will allow users to offload tc actions independent of classifier in the software datapath. In preparation for this, teach all drivers that support offload of the flow tables to reject such configuration as currently none of them support it. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Acked-by:
Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 144d4c9e) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Baowen Zheng authored
Fill flags to action structure to allow user control if the action should be offloaded to hardware or not. Signed-off-by:
Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by:
Louis Peens <louis.peens@corigine.com> Signed-off-by:
Simon Horman <simon.horman@corigine.com> Acked-by:
Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 40bd094d) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Test basic (port-default, VLAN PCP and IP DSCP) QoS classification for Ocelot switches. Advanced QoS classification using tc filters is covered by tc_flower_chains.sh in the same directory. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Amit Cohen authored
Currently `ping_do()` and `ping6_do()` send 10 packets. There are cases that it is not possible to catch only the interesting packets using tc rule, so then, it is possible to send many packets and verify that at least this amount of packets hit the rule. Add `PING_COUNT` variable, which is set to 10 by default, to allow tests sending more than 10 packets using the existing ping API. Signed-off-by:
Amit Cohen <amcohen@nvidia.com> Reviewed-by:
Petr Machata <petrm@nvidia.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 0cd0b1f7) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
IEEE_8021QAZ_MAX_TCS is defined in include/uapi/linux/dcbnl.h, which is included by net/dcbnl.h. Then, linux/netdevice.h conditionally includes net/dcbnl.h if CONFIG_DCB is enabled. Therefore, when CONFIG_DCB is disabled, this indirect dependency is broken. There isn't a good reason to include net/dcbnl.h headers into the ocelot switch library which exports low-level hardware API, so replace IEEE_8021QAZ_MAX_TCS with OCELOT_NUM_TC which has the same value. Fixes: 978777d0 ("net: dsa: felix: configure default-prio and dscp priorities") Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220315131215.273450-1-vladimir.oltean@nxp.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 72f56fdb) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Follow the established programming model for this driver and provide shims in the felix DSA driver which call the implementations from the ocelot switch lib. The ocelot switchdev driver wasn't integrated with dcbnl due to lack of hardware availability. The switch doesn't have any fancy QoS classification enabled by default. The provided getters will create a default-prio app table entry of 0, and no dscp entry. However, the getters have been made to actually retrieve the hardware configuration rather than static values, to be future proof in case DSA will need this information from more call paths. For default-prio, there is a single field per port, in ANA_PORT_QOS_CFG, called QOS_DEFAULT_VAL. DSCP classification is enabled per-port, again via ANA_PORT_QOS_CFG (field QOS_DSCP_ENA), and individual DSCP values are configured as trusted or not through register ANA_DSCP_CFG (replicated 64 times). An untrusted DSCP value falls back to other QoS classification methods. If trusted, the selected ANA_DSCP_CFG register also holds the QoS class in the QOS_DSCP_VAL field. The hardware also supports DSCP remapping (DSCP value X is translated to DSCP value Y before the QoS class is determined based on the app table entry for Y) and DSCP packet rewriting. The dcbnl framework, for being so flexible in other useless areas, doesn't appear to support this. So this functionality has been left out. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 978777d0) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
Similar to the port-based default priority, IEEE 802.1Q-2018 allows the Application Priority Table to define QoS classes (0 to 7) per IP DSCP value (0 to 63). In the absence of an app table entry for a packet with DSCP value X, QoS classification for that packet falls back to other methods (VLAN PCP or port-based default). The presence of an app table for DSCP value X with priority Y makes the hardware classify the packet to QoS class Y. As opposed to the default-prio where DSA exposes only a "set" in dsa_switch_ops (because the port-based default is the fallback, it always exists, either implicitly or explicitly), for DSCP priorities we expose an "add" and a "del". The addition of a DSCP entry means trusting that DSCP priority, the deletion means ignoring it. Drivers that already trust (at least some) DSCP values can describe their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is called for each DSCP value from 0 to 63. Again, there can be more than one dcbnl app table entry for the same DSCP value, DSA chooses the one with the largest configured priority. Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 47d75f78) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Vladimir Oltean authored
The port-based default QoS class is assigned to packets that lack a VLAN PCP (or the port is configured to not trust the VLAN PCP), an IP DSCP (or the port is configured to not trust IP DSCP), and packets on which no tc-skbedit action has matched. Similar to other drivers, this can be exposed to user space using the DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table D-8 - Sel field values that when the Selector is 1, the Protocol ID value of 0 denotes the "Default application priority. For use when application priority is not otherwise specified." The way in which the dcbnl integration in DSA has been designed has to do with its requirements. Andrew Lunn explains that SOHO switches are expected to come with some sort of pre-configured QoS profile, and that it is desirable for this to come pre-loaded into the DSA slave interfaces' DCB application priority table. In the dcbnl design, this is possible because calls to dcb_ieee_setapp() can be initiated by anyone including being self-initiated by this device driver. However, what makes this challenging to implement in DSA is that the DSA core manages the net_devices (effectively hiding them from drivers), while drivers manage the hardware. The DSA core has no knowledge of what individual drivers' QoS policies are. DSA could export to drivers a wrapper over dcb_ieee_setapp() and these could call that function to pre-populate the app priority table, however drivers don't have a good moment in time to do this. The dsa_switch_ops :: setup() method gets called before the net_devices are created (dsa_slave_create), and so is dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops :: port_enable(), but this gets called upon each ndo_open. If we add app table entries on every open, we'd need to remove them on close, to avoid duplicate entry errors. But if we delete app priority entries on close, what we delete may not be the initial, driver pre-populated entries, but rather user-added entries. So it is clear that letting drivers choose the timing of the dcb_ieee_setapp() call is inappropriate. The alternative which was chosen is to introduce hardware-specific ops in dsa_switch_ops, and effectively hide dcbnl details from drivers as well. For pre-populating the application table, dsa_slave_dcbnl_init() will call ds->ops->port_get_default_prio() which is supposed to read from hardware. If the operation succeeds, DSA creates a default-prio app table entry. The method is called as soon as the slave_dev is registered, but before we release the rtnl_mutex. This is done such that user space sees the app table entries as soon as it sees the interface being registered. The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is NULL. Because there are still dcbnl-unaware DSA drivers even if they have dcbnl_ops populated, the way to restore the behavior is to make all dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific dsa_switch_ops method. The dcbnl framework absurdly allows there to be more than one app table entry for the same selector and protocol (in other words, more than one port-based default priority). In the iproute2 dcb program, there is a "replace" syntactical sugar command which performs an "add" and a "del" to hide this away. But we choose the largest configured priority when we call ds->ops->port_set_default_prio(), using __fls(). When there is no default-prio app table entry left, the port-default priority is restored to 0. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/ Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit d538eca8) Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-
Xiaoliang Yang authored
VSC9959 has TSN capabilities on hardware. Using tsntool netlink interface to configure the following TSN features: - IEEE 802.1Qbv - IEEE 802.1Qbu/802.3br - IEEE 802.1Qci - IEEE 802.1Qav - IEEE 802.1CB This patch is based on netlink adaptation layer in net/tsn/*. Enable CONFIG_MSCC_FELIX_SWITCH_TSN config to add the TSN support. Signed-off-by:
Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by:
Vladimir Oltean <vladimir.oltean@nxp.com>
-