mirror-linux/include/trace
Paolo Abeni d2000361e4 mptcp: better mptcp-level RTT estimator
The current MPTCP-level RTT estimator has several issues. On high speed
links, the MPTCP-level receive buffer auto-tuning happens with a
frequency well above the TCP-level one. That in turn can cause
excessive/unneeded receive buffer increases.

On such links, the initial rtt_us value is considerably higher than the
actual delay, and the current mptcp_rcv_space_adjust() updates
msk->rcvq_space.rtt_us with a period equal to that field's previous
value. If the initial rtt_us is 40ms, its first update will happen after
40ms, even if the subflows see an actual RTT orders of magnitude lower.

Additionally:
- setting the msk RTT to the maximum among all the subflows RTTs makes
  DRS constantly overshoot the rcvbuf size when a subflow has
  considerably higher latency than the other(s).

- during unidirectional bulk transfers with multiple active subflows,
  the TCP-level RTT estimator occasionally sees considerably higher
  values than the real link delay, e.g. when the packet scheduler reacts
  to an incoming ACK on a given subflow by pushing data on a different
  subflow.

- currently inactive but still open subflows (e.g. switched to backup
  mode) are always considered when computing the msk-level RTT.

Address all the issues above with a more accurate RTT estimation
strategy: the MPTCP-level RTT is set to the minimum of all the subflows
actually feeding data into the MPTCP receive buffer, using a small
sliding window.
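
The sliding-window minimum described above can be sketched in plain
user-space C as follows. This is only an illustration, not the code from
the patch: the window length, the struct layout and every name
(rtt_window, rtt_window_add(), rtt_window_min(), RTT_WIN_LEN) are
hypothetical. Each sample is assumed to be the RTT of a subflow that has
just fed data into the MPTCP receive buffer, so idle subflows never
contribute.

```c
#include <stddef.h>
#include <stdint.h>

#define RTT_WIN_LEN 4           /* hypothetical window size */
#define RTT_NONE    UINT32_MAX  /* marks an empty slot */

/* Hypothetical ring of the most recent per-subflow RTT samples. */
struct rtt_window {
	uint32_t sample_us[RTT_WIN_LEN];
	size_t next;            /* slot overwritten by the next sample */
};

static void rtt_window_init(struct rtt_window *w)
{
	for (size_t i = 0; i < RTT_WIN_LEN; i++)
		w->sample_us[i] = RTT_NONE;
	w->next = 0;
}

/* Record the RTT of a subflow that just delivered data; the oldest
 * sample is evicted once the ring is full.
 */
static void rtt_window_add(struct rtt_window *w, uint32_t rtt_us)
{
	w->sample_us[w->next] = rtt_us;
	w->next = (w->next + 1) % RTT_WIN_LEN;
}

/* Minimum over the window, or RTT_NONE if no sample was recorded. */
static uint32_t rtt_window_min(const struct rtt_window *w)
{
	uint32_t min = RTT_NONE;

	for (size_t i = 0; i < RTT_WIN_LEN; i++)
		if (w->sample_us[i] < min)
			min = w->sample_us[i];
	return min;
}
```

With such a window, a stale 40ms initial estimate stops influencing the
result as soon as it is evicted or a lower sample arrives, and a single
high-latency subflow cannot inflate the estimate while a faster one is
also delivering data.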

While at it, also use EWMA to compute the msk-level scaling_ratio, so
that MPTCP can avoid traversing the subflow list in
mptcp_rcv_space_adjust().
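
As a rough illustration of the EWMA idea, the sketch below folds each
new subflow scaling_ratio sample into a running msk-level average, so
no list walk is needed when the value is read. The 1/8 weight, the
8-bit type and the names (EWMA_SHIFT, scaling_ratio_ewma()) are
assumptions made for this example and are not taken from the patch.

```c
#include <stdint.h>

#define EWMA_SHIFT 3    /* hypothetical weight: new sample counts 1/8 */

/* avg' = avg - avg/8 + sample/8, computed in integer arithmetic.
 * The result always fits in 8 bits because it lies between the old
 * average and the sample.
 */
static uint8_t scaling_ratio_ewma(uint8_t avg, uint8_t sample)
{
	uint32_t acc = ((uint32_t)avg << EWMA_SHIFT) - avg + sample;

	return (uint8_t)(acc >> EWMA_SHIFT);
}
```

A caller would update the average once per incoming sample, for
instance avg = scaling_ratio_ewma(avg, sample), and read avg directly
wherever the aggregate ratio is needed.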

Use some care to avoid updating msk and ssk level fields too often.

Fixes: a6b118febb ("mptcp: add receive buffer auto-tuning")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260407-net-next-mptcp-reduce-rbuf-v2-1-0d1d135bf6f6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08 19:32:00 -07:00
events mptcp: better mptcp-level RTT estimator 2026-04-08 19:32:00 -07:00
misc NFSD: Remove NFSERR_EAGAIN 2026-01-02 13:43:41 -05:00
stages tracing: Add bitmask-list option for human-readable bitmask display 2026-01-26 17:00:50 -05:00
bpf_probe.h tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix 2025-05-14 11:19:32 -04:00
define_custom_trace.h
define_trace.h tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix 2025-05-14 11:19:32 -04:00
perf.h tracing: perf: Have perf tracepoint callbacks always disable preemption 2026-01-30 10:43:35 -05:00
syscall.h tracing: Display some syscall arrays as strings 2025-10-28 20:10:58 -04:00
trace_custom_events.h
trace_events.h tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast 2026-01-30 10:44:11 -05:00