[ Upstream commit 6319685bdc ]
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igb devices can use as low as 64 descriptors.
This change will unify igb with other drivers.
Based on commit 7b1be1987c ("e1000e: lower ring minimum size to 64")
Fixes: 9d5c824399 ("igb: PCI-Express 82575 Gigabit Ethernet driver")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8360717524 ]
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
This change will unify igbvf with other drivers.
Based on commit 7b1be1987c ("e1000e: lower ring minimum size to 64")
Fixes: d4e0fe01a3 ("igbvf: add new driver to support 82576 virtual functions")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5aa4827971 ]
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igc devices can use as low as 64 descriptors.
This change will unify igc with other drivers.
Based on commit 7b1be1987c ("e1000e: lower ring minimum size to 64")
Fixes: 0507ef8a03 ("igc: Add transmit and receive fastpath and interrupt handlers")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 29fe7a1b62 ]
The smq value used in the CN10K NIX AQ instruction enqueue mailbox
handler was truncated to 9-bit value from 10-bit value because of
typecasting the CN10K mbox request structure to the CN9K structure.
Though this hasn't caused any problems when programming the NIX SQ
context to the HW because the context structure is the same size.
However, this causes a problem when accessing the structure parameters.
This patch reads the right smq value for each platform.
Fixes: 30077d210c ("octeontx2-af: cn10k: Update NIX/NPA context structure")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ad40b36cd ]
kcm_exit_net() should call mutex_destroy() on knet->mutex. This is especially
needed if CONFIG_DEBUG_MUTEXES is enabled.
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://lore.kernel.org/r/20230902170708.1727999-1-syoshida@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b192812905 ]
As with sk->sk_shutdown shown in the previous patch, sk->sk_err can be
read locklessly by unix_dgram_sendmsg().
Let's use READ_ONCE() for sk_err as well.
Note that the writer side is marked by commit cc04410af7 ("af_unix:
annotate lockless accesses to sk->sk_err").
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a454d84ee2 ]
There is a race where skb's from the sk_psock_backlog can be referenced
after userspace side has already skb_consumed() the sk_buff and its refcnt
dropped to zer0 causing use after free.
The flow is the following:
while ((skb = skb_peek(&psock->ingress_skb))
sk_psock_handle_Skb(psock, skb, ..., ingress)
if (!ingress) ...
sk_psock_skb_ingress
sk_psock_skb_ingress_enqueue(skb)
msg->skb = skb
sk_psock_queue_msg(psock, msg)
skb_dequeue(&psock->ingress_skb)
The sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is
what the application reads when recvmsg() is called. An application can
read this anytime after the msg is placed on the queue. The recvmsg hook
will also read msg->skb and then after user space reads the msg will call
consume_skb(skb) on it effectively free'ing it.
But, the race is in above where backlog queue still has a reference to
the skb and calls skb_dequeue(). If the skb_dequeue happens after the
user reads and free's the skb we have a use after free.
The !ingress case does not suffer from this problem because it uses
sendmsg_*(sk, msg) which does not pass the sk_buff further down the
stack.
The following splat was observed with 'test_progs -t sockmap_listen':
[ 1022.710250][ T2556] general protection fault, ...
[...]
[ 1022.712830][ T2556] Workqueue: events sk_psock_backlog
[ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80
[ 1022.713653][ T2556] Code: ...
[...]
[ 1022.720699][ T2556] Call Trace:
[ 1022.720984][ T2556] <TASK>
[ 1022.721254][ T2556] ? die_addr+0x32/0x80^M
[ 1022.721589][ T2556] ? exc_general_protection+0x25a/0x4b0
[ 1022.722026][ T2556] ? asm_exc_general_protection+0x22/0x30
[ 1022.722489][ T2556] ? skb_dequeue+0x4c/0x80
[ 1022.722854][ T2556] sk_psock_backlog+0x27a/0x300
[ 1022.723243][ T2556] process_one_work+0x2a7/0x5b0
[ 1022.723633][ T2556] worker_thread+0x4f/0x3a0
[ 1022.723998][ T2556] ? __pfx_worker_thread+0x10/0x10
[ 1022.724386][ T2556] kthread+0xfd/0x130
[ 1022.724709][ T2556] ? __pfx_kthread+0x10/0x10
[ 1022.725066][ T2556] ret_from_fork+0x2d/0x50
[ 1022.725409][ T2556] ? __pfx_kthread+0x10/0x10
[ 1022.725799][ T2556] ret_from_fork_asm+0x1b/0x30
[ 1022.726201][ T2556] </TASK>
To fix we add an skb_get() before passing the skb to be enqueued in the
engress queue. This bumps the skb->users refcnt so that consume_skb()
and kfree_skb will not immediately free the sk_buff. With this we can
be sure the skb is still around when we do the dequeue. Then we just
need to decrement the refcnt or free the skb in the backlog case which
we do by calling kfree_skb() on the ingress case as well as the sendmsg
case.
Before locking change from fixes tag we had the sock locked so we
couldn't race with user and there was no issue here.
Fixes: 799aa7f98d ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Xu Kuohai <xukuohai@huawei.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230901202137.214666-1-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 719c5e37e9 ]
Previously, the defines for phy_device flags in the Micrel driver were
ambiguous in their representation. They were intended to be bit masks
but were mistakenly defined as bit positions. This led to the following
issues:
- MICREL_KSZ8_P1_ERRATA, designated for KSZ88xx switches, overlapped
with MICREL_PHY_FXEN and MICREL_PHY_50MHZ_CLK.
- Due to this overlap, the code path for MICREL_PHY_FXEN, tailored for
the KSZ8041 PHY, was not executed for KSZ88xx PHYs.
- Similarly, the code associated with MICREL_PHY_50MHZ_CLK wasn't
triggered for KSZ88xx.
To rectify this, all three flags have now been explicitly converted to
use the `BIT()` macro, ensuring they are defined as bit masks and
preventing potential overlaps in the future.
Fixes: 49011e0c15 ("net: phy: micrel: ksz886x/ksz8081: add cabletest support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f31867d0d9 ]
The existing code incorrectly casted a negative value (the result of a
subtraction) to an unsigned value without checking. For example, if
/proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred
lifetime would jump to 4 billion seconds. On my machine and network the
shortest lifetime that avoided underflow was 3 seconds.
Fixes: 76506a986d ("IPv6: fix DESYNC_FACTOR")
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 151e887d8f ]
The veth_xmit function returns NETDEV_TX_OK even when packets are dropped.
This behavior leads to incorrect calculations of statistics counts, as
well as things like txq->trans_start updates.
Fixes: e314dbdc1c ("[NET]: Virtual ethernet device driver.")
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 817c7cd204 ]
gve_rx_append_frags() is able to build skbs chained with frag_list,
like GRO engine.
Problem is that shinfo->frag_list should only be used
for the head of the chain.
All other links should use skb->next pointer.
Otherwise, built skbs are not valid and can cause crashes.
Equivalent code in GRO (skb_gro_receive()) is:
if (NAPI_GRO_CB(p)->last == p)
skb_shinfo(p)->frag_list = skb;
else
NAPI_GRO_CB(p)->last->next = skb;
NAPI_GRO_CB(p)->last = skb;
Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bailey Forrest <bcf@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Catherine Sullivan <csully@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa09bc40b2 ]
Disable virtualization features on 82580 just as on i210/i211.
This avoids that virt functions are acidentally called on 82850.
Fixes: 55cac248ca ("igb: Add full support for 82580 devices")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8423be8926 ]
Route hints when the nexthop is part of a multipath group causes packets
in the same receive batch to be sent to the same nexthop irrespective of
the multipath hash of the packet. So, do not extract route hint for
packets whose destination is part of a multipath group.
A new SKB flag IP6SKB_MULTIPATH is introduced for this purpose, set the
flag when route is looked up in fib6_select_path() and use it in
ip6_can_use_hint() to check for the existence of the flag.
Fixes: 197dbf24e3 ("ipv6: introduce and uses route look hints for list input.")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ac66cb03a ]
Route hints when the nexthop is part of a multipath group causes packets
in the same receive batch to be sent to the same nexthop irrespective of
the multipath hash of the packet. So, do not extract route hint for
packets whose destination is part of a multipath group.
A new SKB flag IPSKB_MULTIPATH is introduced for this purpose, set the
flag when route is looked up in ip_mkroute_input() and use it in
ip_extract_route_hint() to check for the existence of the flag.
Fixes: 02b2494161 ("ipv4: use dst hint for ipv4 list receive")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9531e4a83f ]
msk->rmem_fwd_alloc can be read locklessly.
Add mptcp_rmem_fwd_alloc_add(), similar to sk_forward_alloc_add(),
and appropriate READ_ONCE()/WRITE_ONCE() annotations.
Fixes: 6511882cdd ("mptcp: allocate fwd memory separately on the rx and tx path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e6300e7b3 ]
Every time sk->sk_forward_alloc is read locklessly,
add a READ_ONCE().
Add sk_forward_alloc_add() helper to centralize updates,
to reduce number of WRITE_ONCE().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66d58f046c ]
inet_sk_diag_fill() has been changed to use sk_forward_alloc_get(),
but sk_get_meminfo() was forgotten.
Fixes: 292e6077b0 ("net: introduce sk_forward_alloc_get()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a90c367e5a ]
Drop intel_vgpu_reset_gtt() as it no longer has any callers. In addition
to eliminating dead code, this eliminates the last possible scenario where
__kvmgt_protect_table_find() can be reached without holding vgpu_lock.
Requiring vgpu_lock to be held when calling __kvmgt_protect_table_find()
will allow a protecting the gfn hash with vgpu_lock without too much fuss.
No functional change intended.
Fixes: ba25d97757 ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 708e49583d ]
Put the struct page reference acquired by gfn_to_pfn(), KVM's API is that
the caller is ultimately responsible for dropping any reference.
Note, kvm_release_pfn_clean() ensures the pfn is actually a refcounted
struct page before trying to put any references.
Fixes: b901b252b6 ("drm/i915/gvt: Add 2M huge gtt support")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f046923af7 ]
Check that the pfn found by gfn_to_pfn() is actually backed by "struct
page" memory prior to retrieving and dereferencing the page. KVM
supports backing guest memory with VM_PFNMAP, VM_IO, etc., and so
there is no guarantee the pfn returned by gfn_to_pfn() has an associated
"struct page".
Fixes: b901b252b6 ("drm/i915/gvt: Add 2M huge gtt support")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e019d8a05 ]
Fix a use-after-free error that is possible if the xsk_diag interface
is used after the socket has been unbound from the device. This can
happen either due to the socket being closed or the device
disappearing. In the early days of AF_XDP, the way we tested that a
socket was not bound to a device was to simply check if the netdevice
pointer in the xsk socket structure was NULL. Later, a better system
was introduced by having an explicit state variable in the xsk socket
struct. For example, the state of a socket that is on the way to being
closed and has been unbound from the device is XSK_UNBOUND.
The commit in the Fixes tag below deleted the old way of signalling
that a socket is unbound, setting dev to NULL. This in the belief that
all code using the old way had been exterminated. That was
unfortunately not true as the xsk diagnostics code was still using the
old way and thus does not work as intended when a socket is going
down. Fix this by introducing a test against the state variable. If
the socket is in the state XSK_UNBOUND, simply abort the diagnostic's
netlink operation.
Fixes: 18b1ab7aa7 ("xsk: Fix race at socket teardown")
Reported-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230831100119.17408-1-magnus.karlsson@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8aae7625ff ]
New skbs allocated via nf_send_reset() have skb->dev == NULL.
fib*_rules_early_flow_dissect helpers already have a 'struct net'
argument but its not passed down to the flow dissector core, which
will then WARN as it can't derive a net namespace to use:
WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
[..]
ip_route_me_harder+0x143/0x330
nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4]
nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet]
nft_do_chain+0x198/0x5d0 [nf_tables]
nft_do_chain_inet+0xa4/0x110 [nf_tables]
nf_hook_slow+0x41/0xc0
ip_local_deliver+0xce/0x110
..
Cc: Stanislav Fomichev <sdf@google.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: Ido Schimmel <idosch@nvidia.com>
Fixes: 812fa71f0d ("netfilter: Dissect flow after packet mangling")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efc0b0bcff ]
In addition to the EINVAL, there may be an ENOMEM.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 70431bfd82 ("cifs: Support fscache indexing rewrite")
Signed-off-by: Katya Orlova <e.orlova@ispras.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9fd29a5bae ]
Use filesystem context support to handle dfs links.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Stable-dep-of: efc0b0bcff ("smb: propagate error code of extract_sharename()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eead005664 ]
Currently, 'carryover_ios/bytes' is not handled in throtl_trim_slice(),
for consequence, 'carryover_ios/bytes' will be used to throttle bio
multiple times, for example:
1) set iops limit to 100, and slice start is 0, slice end is 100ms;
2) current time is 0, and 10 ios are dispatched, those io won't be
throttled and io_disp is 10;
3) still at current time 0, update iops limit to 1000, carryover_ios is
updated to (0 - 10) = -10;
4) in this slice(0 - 100ms), io_allowed = 100 + (-10) = 90, which means
only 90 ios can be dispatched without waiting;
5) assume that io is throttled in slice(0 - 100ms), and
throtl_trim_slice() update silce to (100ms - 200ms). In this case,
'carryover_ios/bytes' is not cleared and still only 90 ios can be
dispatched between 100ms - 200ms.
Fix this problem by updating 'carryover_ios/bytes' in
throtl_trim_slice().
Fixes: a880ae93e5 ("blk-throttle: fix io hung due to configuration updates")
Reported-by: zhuxiaohui <zhuxiaohui.400@bytedance.com>
Link: https://lore.kernel.org/all/20230812072116.42321-1-zhuxiaohui.400@bytedance.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230816012708.1193747-5-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5eefc5307c ]
References to i915_requests may be trapped by userspace inside a
sync_file or dmabuf (dma-resv) and held indefinitely across different
proceses. To counter-act the memory leaks, we try to not to keep
references from the request past their completion.
On the other side on fence release we need to know if rq->engine
is valid and points to hw engine (true for non-virtual requests).
To make it possible extra bit has been added to rq->execution_mask,
for marking virtual engines.
Fixes: bcb9aa45d5 ("Revert "drm/i915: Hold reference to intel_context over life of i915_request"")
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230821153035.3903006-1-andrzej.hajda@intel.com
(cherry picked from commit 280410677a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a84260e314 ]
It has system-wide test and cpu-list test but the cpu-list test fails
sometimes. It runs sleep command on CPU1 and measure both user.slice
and system.slice cgroups by default (on systemd-based systems).
But if the system was idle enough, sometime the system.slice gets no
count and it makes the test failing. Maybe that's because it only looks
at the CPU1, let's add CPU0 to increase the chance it finds some tasks.
Fixes: 7901086014 ("perf test: Add a new test for perf stat cgroup BPF counter")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0dd1f81554 ]
Running shellcheck on lock_contention.sh generates below warning:
In stat_bpf_counters_cgrp.sh line 28:
if [ -d /sys/fs/cgroup/system.slice -a -d /sys/fs/cgroup/user.slice ]; then
^-- SC2166 (warning): Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
In stat_bpf_counters_cgrp.sh line 34:
local self_cgrp=$(grep perf_event /proc/self/cgroup | cut -d: -f3)
^-------------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
^-------^ SC2155 (warning): Declare and assign separately to avoid masking return values.
^-- SC2046 (warning): Quote this to prevent word splitting.
In stat_bpf_counters_cgrp.sh line 51:
local output
^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
In stat_bpf_counters_cgrp.sh line 65:
local output
^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
Fixed above warnings by:
- Changing the expression [p -a q] to [p] && [q].
- Fixing shellcheck warnings for local usage, by prefixing
function name to the variable.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230709182800.53002-6-atrajeev@linux.vnet.ibm.com
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: a84260e314 ("perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4aae44f658 ]
Because LPC32xx PWM controllers have only a single output which is
registered as the only PWM device/channel per controller, it is known in
advance that pwm->hwpwm value is always 0. On basis of this fact
simplify the code by removing operations with pwm->hwpwm, there is no
controls which require channel number as input.
Even though I wasn't aware at the time when I forward ported that patch,
this fixes a null pointer dereference as lpc32xx->chip.pwms is NULL
before devm_pwmchip_add() is called.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 3d2813fb17 ("pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered")
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf38e7691c ]
When built with CONFIG_INTEL_MID_WATCHDOG=m, currently the driver
needs to be loaded manually, for the lack of module alias.
This causes unintended resets in cases where watchdog timer is
set-up by bootloader and the driver is not explicitly loaded.
Add MODULE_ALIAS() to load the driver automatically at boot and
avoid this issue.
Fixes: 87a1ef8058 ("watchdog: add Intel MID watchdog driver support")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230811120220.31578-1-raag.jadav@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef23cb5933 ]
While debugging a segfault on 'perf lock contention' without an
available perf.data file I noticed that it was basically calling:
perf_session__delete(ERR_PTR(-1))
Resulting in:
(gdb) run lock contention
Starting program: /root/bin/perf lock contention
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
failed to open perf.data: No such file or directory (try 'perf record' first)
Initializing perf session failed
Program received signal SIGSEGV, Segmentation fault.
0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
2858 if (!session->auxtrace)
(gdb) p session
$1 = (struct perf_session *) 0xffffffffffffffff
(gdb) bt
#0 0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
#1 0x000000000057bb4d in perf_session__delete (session=0xffffffffffffffff) at util/session.c:300
#2 0x000000000047c421 in __cmd_contention (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2161
#3 0x000000000047dc95 in cmd_lock (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2604
#4 0x0000000000501466 in run_builtin (p=0xe597a8 <commands+552>, argc=2, argv=0x7fffffffe200) at perf.c:322
#5 0x00000000005016d5 in handle_internal_command (argc=2, argv=0x7fffffffe200) at perf.c:375
#6 0x0000000000501824 in run_argv (argcp=0x7fffffffe02c, argv=0x7fffffffe020) at perf.c:419
#7 0x0000000000501b11 in main (argc=2, argv=0x7fffffffe200) at perf.c:535
(gdb)
So just set it to NULL after using PTR_ERR(session) to decode the error
as perf_session__delete(NULL) is supported.
The same problem was found in 'perf top' after an audit of all
perf_session__new() failure handling.
Fixes: 6ef81c55a2 ("perf session: Return error code for perf_session__new() function on failure")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Shawn Landden <shawn@git.icu>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: https://lore.kernel.org/lkml/ZN4Q2rxxsL08A8rd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5df8ecfe36 ]
Drop the explicit check on the extended CPUID level in cpu_has_svm(), the
kernel's cached CPUID info will leave the entire SVM leaf unset if said
leaf is not supported by hardware. Prior to using cached information,
the check was needed to avoid false positives due to Intel's rather crazy
CPUID behavior of returning the values of the maximum supported leaf if
the specified leaf is unsupported.
Fixes: 682a810887 ("x86/kvm/svm: Simplify cpu_has_svm()")
Link: https://lore.kernel.org/r/20230721201859.2307736-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 979e9c9fc9 ]
In 616b14b47a ("perf build: Conditionally define NDEBUG") we
started using NDEBUG=1 when DEBUG=1 isn't present, so code that is
enclosed with assert() is not called.
In dd317df072 ("perf build: Make binutil libraries opt in") we
stopped linking against binutils-devel, for licensing reasons.
Recently people asked me why annotation of BPF programs wasn't working,
i.e. this:
$ perf annotate bpf_prog_5280546344e3f45c_kfree_skb
was returning:
case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF:
scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation");
This was on a fedora rpm, so its new enough that I had to try to test by
rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me.
This combination made this libopcode function not to be called:
assert(bfd_check_format(bfdf, bfd_object));
Changing it to:
if (!bfd_check_format(bfdf, bfd_object))
abort();
Made it work, looking at this "check" function made me realize it
changes the 'bfdf' internal state, i.e. we better call it.
So stop using assert() on it, just call it and abort if it fails.
Probably it is better to propagate the error, etc, but it seems it is
unlikely to fail from the usage done so far and we really need to stop
using libopcodes, so do the quick fix above and move on.
With it we have BPF annotation back working when built with
BUILD_NONDISTRO=1:
⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb | head
No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found
Samples: 12 of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period]
bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb
Percent int kfree_skb(struct trace_event_raw_kfree_skb *args) {
nop
33.33 xchg %ax,%ax
push %rbp
mov %rsp,%rbp
sub $0x180,%rsp
push %rbx
push %r13
⬢[acme@toolbox perf-tools-next]$
Fixes: 6987561c9e ("perf annotate: Enable annotation of BPF programs")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mohamed Mahmoud <mmahmoud@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/lkml/ZMrMzoQBe0yqMek1@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc141c35af ]
The driver has been switched to use IRQF_NO_AUTOEN, but in the error
unwinding and remove paths calls to enable_irq() were left in place, which
will lead to an incorrect enable counter value.
Fixes: bcd9730a04 ("Input: move to use request_irq by IRQF_NO_AUTOEN flag")
Link: https://lore.kernel.org/r/20230724053024.352054-3-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 687fe7dfb7 ]
Remove option having i2c client contain raw gpio number instead of proper
IRQ number. There are no users of this facility in mainline and it will
allow cleaning up the driver code with regard to wakeup handling, etc.
Link: https://lore.kernel.org/r/20230724053024.352054-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Stable-dep-of: cc141c35af ("Input: tca6416-keypad - fix interrupt enable disbalance")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe1328b5b2 ]
So, let's drop output GPIO direction check and only check GPIO value to set
the initial power state.
Fixes: 706dc68102 ("backlight: gpio: Explicitly set the direction of the GPIO")
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230721093342.1532531-1-victor.liu@nxp.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0323e8fedd ]
Allocate driver data as first resource in the probe function. This way it
can be used during allocation of the other resources (instead of assigning
these to local variables first and update driver data only when it's
allocated). Also as driver data is allocated using a devm function this
should happen first to have the order of freeing resources in the error
path and the remove function in reverse.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Stable-dep-of: c11622324c ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
Signed-off-by: Sasha Levin <sashal@kernel.org>