Now that we can rely on errors in the normal control flow there is no
need for this flag, remove it.
Link: https://lore.kernel.org/r/20190814075928.23766-9-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor the main copy page to vram routine out into a helper that acts
on a single page and which doesn't require the nouveau_dmem_migrate
structure for argument passing. As an added benefit the new version
only allocates the dma address array once and reuses it for each
subsequent chunk of work.
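Illustratively, the result has this shape (helper name and chunking are
simplified here, not the exact nouveau code):

    dma_addr_t *dma_addrs;
    unsigned long i;

    /* the dma address array is allocated once ... */
    dma_addrs = kmalloc_array(npages, sizeof(*dma_addrs), GFP_KERNEL);
    if (!dma_addrs)
            return -ENOMEM;

    /* ... and reused while the helper migrates one page per call */
    for (i = 0; i < npages; i++)
            dst_pfns[i] = copy_one_page_to_vram(drm, src_pfns[i],
                                                &dma_addrs[i]);

    kfree(dma_addrs);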
Link: https://lore.kernel.org/r/20190814075928.23766-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor the main copy page to ram routine out into a helper that acts on
a single page and which doesn't require the nouveau_dmem_fault
structure for argument passing. Also remove the loop over multiple
pages as we only handle one at the moment, although the structure of
the main worker function makes it relatively easy to add multi page
support back if needed in the future. But at least for now this avoids
the need to dynamically allocate memory for the dma addresses in
what is essentially the page fault path.
Link: https://lore.kernel.org/r/20190814075928.23766-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
nouveau_dmem_migrate_vma and nouveau_dmem_convert_pfn are only called
when CONFIG_DRM_NOUVEAU_SVM is enabled, so there is no need to provide
!CONFIG_DRM_NOUVEAU_SVM stubs for them.
Link: https://lore.kernel.org/r/20190814075928.23766-6-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor out the end of fencing logic from the two migration routines.
Link: https://lore.kernel.org/r/20190814075928.23766-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor out the repeated device memory address calculation into
a helper.
Link: https://lore.kernel.org/r/20190814075928.23766-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When we start a new batch of dma_map operations we need to reset dma_nr,
as we start filling a newly allocated array.
Link: https://lore.kernel.org/r/20190814075928.23766-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There isn't any good reason to pass callbacks to migrate_vma. Instead
we can just export the three steps done by this function to drivers and
let them sequence the operation without callbacks. This removes a lot
of boilerplate code as-is, and will allow the drivers to drastically
improve code flow and error handling further on.
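The driver-side flow then looks roughly like this (a sketch assuming the
three exported steps migrate_vma_setup(), migrate_vma_pages() and
migrate_vma_finalize()):

    struct migrate_vma args = {
            .vma    = vma,
            .start  = start,
            .end    = end,
            .src    = src_pfns,
            .dst    = dst_pfns,
    };

    if (migrate_vma_setup(&args))
            return -EINVAL;

    /* allocate destination memory and copy over the data,
     * filling args.dst with MIGRATE_PFN_* encoded pfns
     */

    migrate_vma_pages(&args);
    /* update device page tables, wait for copy fences, ... */
    migrate_vma_finalize(&args);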
Link: https://lore.kernel.org/r/20190814075928.23766-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Jason Gunthorpe says:
====================
Add mmu_notifier_get/put for managing mmu notifier registrations
This series introduces a new registration flow for mmu_notifiers based on
the idea that the user would like to get a single refcounted piece of
memory for a mm, keyed to its use.
For instance many users of mmu_notifiers use an interval tree or similar
to dispatch notifications to some object. There are many objects but only
one notifier subscription per mm holding the tree.
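A minimal sketch of the intended usage (struct and function names are
illustrative; alloc_notifier/free_notifier are the ops the new API calls
back into):

    struct obj_notifier {
            struct mmu_notifier mn;
            /* per-mm state, e.g. the interval tree root, lives here */
    };

    static struct mmu_notifier *obj_alloc_notifier(struct mm_struct *mm)
    {
            struct obj_notifier *on = kzalloc(sizeof(*on), GFP_KERNEL);

            return on ? &on->mn : ERR_PTR(-ENOMEM);
    }

    static void obj_free_notifier(struct mmu_notifier *mn)
    {
            kfree(container_of(mn, struct obj_notifier, mn));
    }

    static const struct mmu_notifier_ops obj_notifier_ops = {
            .alloc_notifier = obj_alloc_notifier,
            .free_notifier  = obj_free_notifier,
            /* .invalidate_range_start = ..., etc. */
    };

    /* every object simply shares the one registration for its mm: */
    mn = mmu_notifier_get(&obj_notifier_ops, current->mm);
    if (IS_ERR(mn))
            return PTR_ERR(mn);
    ...
    mmu_notifier_put(mn);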
Of the 12 places that call mmu_notifier_register:
- 7 are maintaining some kind of obvious mapping of mm_struct to
mmu_notifier registration, ie in some linked list or hash table. Of
the 7 this series converts 4 (gru, hmm, RDMA, radeon)
- 3 (hfi1, gntdev, vhost) are registering multiple notifiers, but each
one immediately does some VA range filtering, ie with an interval tree.
These would be better with a global subsystem-wide range filter and
could convert to this API.
- 2 (kvm, amd_iommu) are deliberately using a single mm at a time, and
really can't use this API. One of the intel-svm's modes is also in this
list
The 3/7 unconverted drivers are:
- intel-svm
This driver tracks mm's in a global linked list 'global_svm_list'
and would benefit from this API.
Its flow is a bit complex, since it also wants a set of non-shared
notifiers.
- i915_gem_usrptr
This driver tracks mm's in a per-device hash
table (dev_priv->mm_structs), but only has an optional use of
mmu_notifiers. Since it still seems to need the hash table it is
difficult to convert.
- amdkfd/kfd_process
This driver is using a global SRCU hash table to track mm's
The control flow here is very complicated and the driver is relying on
this hash table to be fast on the ioctl syscall path.
It would definitely benefit, but only if the ioctl path didn't need to
do the search so often.
====================
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The sequence of mmu_notifier_unregister_no_release(),
mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
free_notifier callback.
As this is the last user of those APIs, converting it means we can drop
them.
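A rough before/after of the pattern (object names are illustrative):

    /* before */
    mmu_notifier_unregister_no_release(&obj->mn, mm);
    mmu_notifier_call_srcu(&obj->rcu, obj_free_rcu);

    /* after: the obj_free_rcu logic moves into ops->free_notifier */
    mmu_notifier_put(&obj->mn);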
Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When using mmu_notifier_unregister_no_release() the caller must ensure
there is a SRCU synchronize before the mn memory is freed, otherwise use
after free races are possible, for instance:
    CPU0                                          CPU1
                                                  invalidate_range_start
                                                     hlist_for_each_entry_rcu(..)
    mmu_notifier_unregister_no_release(&p->mn)
    kfree(mn)
                                                     if (mn->ops->invalidate_range_end)
The error unwind in amdkfd misses the SRCU synchronization.
amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.
The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.
This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.
Along the way this also fixes various other error unwind bugs in the flow.
Fixes: 45102048f7 ("amdkfd: Add process queue manager module")
Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
radeon is using a device global hash table to track what mmu_notifiers
have been registered on struct mm. This is better served with the new
get/put scheme instead.
radeon has a bug where it was not blocking notifier release() until all
the BOs had been invalidated. This could result in a use after free of
the pages backing the BOs. This is tied into a second bug where radeon left the
notifiers running endlessly even once the interval tree became
empty. This could result in a use after free with module unload.
Both are fixed by changing the lifetime model, the BOs exist in the
interval tree with their natural lifetimes independent of the mm_struct
lifetime using the get/put scheme. The release runs synchronously and just
does invalidate_start across the entire interval tree to create the
required DMA fence.
Additions to the interval tree after release are already impossible as
only current->mm is used during the add.
Link: https://lore.kernel.org/r/20190806231548.25242-9-jgg@ziepe.ca
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is a significant simplification, it eliminates all the remaining
'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
paths, and takes away all the ugly locking and abuse of page_table_lock.
mmu_notifier_get() provides the single struct hmm per struct mm which
eliminates mm->hmm.
It also directly guarantees that no mmu_notifier op callback is callable
while concurrent free is possible, this eliminates all the krefs inside
the mmu_notifier callbacks.
The remaining krefs in the range code were overly cautious, drivers are
already not permitted to free the mirror while a range exists.
Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The number of CS lines is mentioned as 2 in the spi-controller binding,
but in the example 4 cs-gpios are used. Hence fix the binding to
mention 4.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20190820115000.32041-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
We prohibit unaligned access in kernel mode, but some special NIC
drivers need to support kernel-mode unaligned access. For example,
when the bus does not support unaligned access, IP header parsing
causes unaligned accesses, and the driver does not re-copy the skb
buffer to DMA memory for performance reasons.
Add kernel_enable & user_enable to control unaligned access fixups and
kernel_count & user_count to collect statistics on unaligned accesses.
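A minimal sketch of how the knobs gate the fixup path (only the four
names come from this patch; the handler shape and wiring are assumptions):

    static int kernel_enable = 1, user_enable = 1;
    static unsigned long kernel_count, user_count;

    void handle_unaligned_access(struct pt_regs *regs)
    {
            if (user_mode(regs)) {
                    if (!user_enable)
                            goto sigbus;    /* fault as before */
                    user_count++;
            } else {
                    if (!kernel_enable)
                            goto sigbus;
                    kernel_count++;
            }
            /* ... emulate the unaligned load/store ... */
            return;
    sigbus:
            force_sig(SIGBUS);
    }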
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Use devm_spi_register_controller to fix missing spi_unregister_controller
when unload module.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lore.kernel.org/r/20190818095113.2397-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When transitioning to the suspend state, uniphier_aio_dai_suspend() is
called and asserts the reset lines and disables the clocks.
However, if there are two or more DAIs, uniphier_aio_dai_suspend() is
called multiple times, which causes a double reset assertion.
This patch adds a counter that initially holds the number of DAIs. Every
time uniphier_aio_dai_suspend() is called, it decrements the counter, and
only when the counter reaches zero does it assert the reset lines and
disable the clocks.
In the same way, whenever uniphier_aio_dai_resume() is called, it
increments the counter after deasserting the reset lines and enabling
the clocks.
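A sketch of the counting scheme (the counter field and chip accessor are
assumptions, not the exact driver code):

    static int uniphier_aio_dai_suspend(struct snd_soc_dai *dai)
    {
            struct uniphier_aio_chip *chip = to_chip(dai);  /* hypothetical accessor */

            chip->num_wake_dais--;                          /* hypothetical counter */
            if (chip->num_wake_dais == 0) {
                    reset_control_assert(chip->rst);
                    clk_disable_unprepare(chip->clk);
            }
            return 0;
    }

    /* uniphier_aio_dai_resume() mirrors this: it re-enables the clocks and
     * deasserts the resets (presumably only while the counter is zero) and
     * then increments the counter. */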
Fixes: 139a342002 ("ASoC: uniphier: add support for UniPhier AIO CPU DAI driver")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Link: https://lore.kernel.org/r/1566281764-14059-1-git-send-email-hayashi.kunihiko@socionext.com
Signed-off-by: Mark Brown <broonie@kernel.org>
We use a deferred cache flush mechanism to improve the performance of
the 610, but the implementation is wrong. Fix it up and update the
mechanism:
- The zero page needn't be flushed.
- If a page is a file mapping and untouched in user space, defer the flush.
- If a page is an anonymous mapping or a dirty file mapping, flush
  immediately.
- In update_mmu_cache(), finish the deferred flush with flush_dcache_page().
For the 610 we need to take care of the dcache aliasing issue:
- VIPT cache with 8K bytes per way at 4K page granularity.
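A simplified sketch of the policy (using PG_dcache_clean as the defer
marker; helper names are illustrative, not the exact csky code):

    void flush_dcache_page(struct page *page)
    {
            struct address_space *mapping;

            if (page == ZERO_PAGE(0))
                    return;                 /* zero page: nothing to flush */

            mapping = page_mapping_file(page);
            if (mapping && !page_mapcount(page)) {
                    /* untouched file mapping: defer the flush */
                    clear_bit(PG_dcache_clean, &page->flags);
            } else {
                    /* anon or dirty file mapping: flush now */
                    dcache_wbinv_all();     /* flush helper, name illustrative */
                    set_bit(PG_dcache_clean, &page->flags);
            }
    }

    /* update_mmu_cache() later calls flush_dcache_page() (once the page is
     * mapped and page_mapcount() is non-zero) to finish a deferred flush. */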
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Arnd Bergmann <arnd@arndb.de>
As part of the grand SMMU driver refactoring effort, the I/O register
accessors were moved into 'arm-smmu.h' in commit 6d7dff62af
("iommu/arm-smmu: Move Secure access quirk to implementation").
On 32-bit architectures (such as ARM), the 64-bit accessors are defined
in 'linux/io-64-nonatomic-hi-lo.h', so include this header to fix the
build.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
If devm_request_irq() fails to disable all interrupts, no cleanup is
performed before returning the error. To fix this issue, invoke
omap_dma_free() to do the cleanup.
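The fix amounts to something like this in the probe path (the IRQ
arguments shown are illustrative):

    rc = devm_request_irq(&pdev->dev, irq, omap_dma_irq,
                          IRQF_SHARED, "omap-dma-engine", od);
    if (rc) {
            omap_dma_free(od);      /* undo the earlier setup */
            return rc;
    }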
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/1565938570-7528-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In ti_dra7_xbar_probe(), 'rsv_events' is allocated through kcalloc(). Then
of_property_read_u32_array() is invoked to search for the property.
However, if this process fails, 'rsv_events' is not deallocated, leading to
a memory leak bug. To fix this issue, free 'rsv_events' before returning
the error.
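The fix follows the usual pattern (the property name variable pname is
illustrative):

    rsv_events = kcalloc(nelm * 2, sizeof(*rsv_events), GFP_KERNEL);
    if (!rsv_events)
            return -ENOMEM;

    ret = of_property_read_u32_array(node, pname, (u32 *)rsv_events,
                                     nelm * 2);
    if (ret) {
            kfree(rsv_events);      /* don't leak on error */
            return ret;
    }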
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/1565938136-7249-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There is no need to duplicate what the SPI core already does, i.e. mapping buffers
for DMA capable transfers. This patch removes all related pieces of code.
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The MCP2515 datasheet clearly describes a level-triggered interrupt pin.
Therefore the receiving interrupt controller must also be configured for
level-triggered operation otherwise there is a danger of a missed
interrupt condition blocking all subsequent interrupts. The ONESHOT
flag ensures that the interrupt is masked until the threaded interrupt
handler exits.
Rather than change the flags globally (they must have worked for at
least one user), keep the old behavior for non-DT devices. DT based
devices specify the flags in their corresponding DT node.
See: https://github.com/raspberrypi/linux/issues/2175
     https://github.com/raspberrypi/linux/issues/2263
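The resulting request looks roughly like this (a sketch; the exact
handler and name arguments follow the existing driver code):

    unsigned long flags = IRQF_ONESHOT;

    if (!spi->dev.of_node)
            flags |= IRQF_TRIGGER_FALLING;  /* old behavior for non-DT users */
    /* DT users get the trigger type from their interrupt specifier */

    ret = request_threaded_irq(spi->irq, NULL, mcp251x_can_ist,
                               flags, DEVICE_NAME, priv);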
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Passing the driver name as the name during request_threaded_irq() results
in all CAN IRQs having the same name. This does not help much to easily
identify which IRQ belongs to which CAN instance. Therefore pass dev_name()
during request_threaded_irq() so that a better identifiable name is listed
for the CAN devices in the cat /proc/interrupts output.
Output of cat /proc/interrupts

Before this patch:
  253:          2  gpio-mxc  13 Edge      mcp251x
  259:          2  gpio-mxc  19 Edge      mcp251x

After this patch:
  253:          2  gpio-mxc  13 Edge      spi1.1
  259:          2  gpio-mxc  19 Edge      spi1.2
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Some boards take longer than 5ms to power up after a reset, so allow
a few retry attempts before giving up.
Fixes: ff06d611a3 ("can: mcp251x: Improve mcp251x_hw_reset()")
Cc: linux-stable <stable@vger.kernel.org>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch changes all the uint8_t in the arguments in several function
to u8.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch fixes the print format strings in the driver.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch removes unnecessary blank lines, so that checkpatch doesn't
complain anymore.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch converts all block comments to network subsystem style block
comments.
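For reference, the network subsystem style is:

    /* Block comments in networking style start the text
     * right after the opening marker, like this.
     */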
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds the missing error handling in m_can_plat_probe() if
mcan_class is NULL.
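The added check is along these lines (sketch):

    mcan_class = m_can_class_allocate_dev(&pdev->dev);
    if (!mcan_class)
            return -ENOMEM;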
Fixes: f524f829b7 ("can: m_can: Create a m_can platform framework")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The struct m_can_classdev::device_data is a void pointer, so there's no
need to cast it to struct m_can_plat_priv *, when assigning the struct
m_can_plat_priv pointer.
This patch removes the not needed casts from the m_can_platform driver.
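i.e. assignments of the form (variable names illustrative):

    /* before */
    struct m_can_plat_priv *priv =
            (struct m_can_plat_priv *)mcan_class->device_data;

    /* after: a void pointer converts implicitly */
    struct m_can_plat_priv *priv = mcan_class->device_data;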
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In regmap_spi_gather_write() the "addr" is prepared. The chip expects
the number of 32 bit words to write in the lower 8 bits of addr. However
the number of bytes to write is shifted right by 3 (i.e. divided by 8).
The function tcan4x5x_regmap_write() is called with a data buffer, which
holds the register information in the first 32 bits, followed by the
actual data. tcan4x5x_regmap_write() calls regmap_spi_gather_write()
with the val pointer pointing to the actual data (i.e. the original
pointer is incremented by 4 bytes), but without decrementing the count.
If the regmap framework only calls tcan4x5x_regmap_write() to write
single 32 bit registers, these two bugs cancel each other out.
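Worked example of the cancellation for a single 32 bit register write
(following the description above):

    regmap hands tcan4x5x_regmap_write() an 8 byte buffer: 4 bytes of
    register address followed by 4 bytes of data. The val pointer is
    advanced by 4 bytes, but the length stays 8, and
    regmap_spi_gather_write() encodes 8 / 8 = 1 into the lower 8 bits of
    addr -- by accident the correct word count for one 32 bit register.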
This patch fixes the code.
Fixes: 5443c226ba ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds the missing error handling in tcan4x5x_can_probe() if
mcan_class is NULL.
Fixes: 5443c226ba ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The struct m_can_classdev::device_data is a void pointer, so there's no
need to cast it to struct tcan4x5x_priv *, when assigning the struct
tcan4x5x_priv pointer.
This patch removes the not needed casts from the tcan4x5x driver.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The mutex struct tcan4x5x_priv::tcan4x5x_lock is unused in the driver,
so this patch removes the variable from the driver.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
There is no need to duplicate what SPI core already does, i.e. mapping buffers
for DMA capable transfers. This patch removes all related pices of code.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Static structure peak_pciec_i2c_bit_ops, of type i2c_algo_bit_data, is
not used except to be copied into another variable. Hence make it const
to protect it from modification.
Issue found with Coccinelle.
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
All R-Car platforms use DT for describing CAN controllers. R-Car CAN
platform data support was never used in any upstream kernel.
Move the Clock Select Register settings enum into the driver, and remove
platform data support and the corresponding header file.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The alignment of the arguments of mux_configure32() and
fsl_edma_chan_mux() needs to be adjusted: continuation lines must start
precisely at the first column after the opening parenthesis of the first
line.
Fixes: 9d831528a6 ("dmaengine: fsl-edma: extract common fsl-edma code (no changes in behavior intended)")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Link: https://lore.kernel.org/r/20190814072105.144107-3-maowenan@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There is one sparse warning in drivers/dma/fsl-edma-common.c:
drivers/dma/fsl-edma-common.c:93:6: warning: symbol 'mux_configure32'
was not declared. Should it be static?
Fix it by making mux_configure32() static.
Fixes: 232a7f18cf ("dmaengine: fsl-edma: add i.mx7ulp edma2 version support")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Link: https://lore.kernel.org/r/20190814072105.144107-2-maowenan@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Clang produces the following warning
drivers/dma/mv_xor_v2.c:264:40: warning: shifting a negative signed value
is undefined [-Wshift-negative-value]
reg &= (~MV_XOR_V2_DMA_IMSG_THRD_MASK <<
MV_XOR_V2_DMA_IMSG_THRD_SHIFT);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
drivers/dma/mv_xor_v2.c:271:46: warning: shifting a negative signed value
is undefined [-Wshift-negative-value]
reg &= (~MV_XOR_V2_DMA_IMSG_TIMER_THRD_MASK <<
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Upon further investigation MV_XOR_V2_DMA_IMSG_THRD_SHIFT and
MV_XOR_V2_DMA_IMSG_TIMER_THRD_SHIFT are both 0. Since shifting by 0 does
nothing, these shifts can be removed.
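The change implied by this is essentially (sketch):

    /* before */
    reg &= (~MV_XOR_V2_DMA_IMSG_THRD_MASK << MV_XOR_V2_DMA_IMSG_THRD_SHIFT);

    /* after: the shift count is 0, so it can simply be dropped */
    reg &= ~MV_XOR_V2_DMA_IMSG_THRD_MASK;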
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/521
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20190813173448.109859-1-nhuck@google.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
After a partition migration, pseries_devicetree_update() processes
changes to the device tree communicated from the platform to
Linux. This is a relatively heavyweight operation, with multiple
device tree searches, memory allocations, and conversations with
partition firmware.
There are a few levels of nested loops which are bounded only by
decisions made by the platform, outside of Linux's control, and indeed
we have seen RCU stalls on large systems while executing this call
graph. Use cond_resched() in these loops so that the cpu is yielded
when needed.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-4-nathanl@linux.ibm.com
rtas_cpu_state_change_mask() potentially operates on scores of cpus,
so explicitly allow rescheduling in the loop body.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-3-nathanl@linux.ibm.com
The LPAR migration implementation and userspace-initiated cpu hotplug
can interleave their executions like so:
1. Set cpu 7 offline via sysfs.
2. Begin a partition migration, whose implementation requires the OS
to ensure all present cpus are online; cpu 7 is onlined:
rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up
This sets cpu 7 online in all respects except for the cpu's
corresponding struct device; dev->offline remains true.
3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is
already online and returns success. The driver core (device_online)
sets dev->offline = false.
4. The migration completes and restores cpu 7 to offline state:
rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down
This leaves cpu 7 in a state where the driver core considers the cpu
device online, but in all other respects it is offline and
unused. Attempts to online the cpu via sysfs appear to succeed but the
driver core actually does not pass the request to the lower-level
cpuhp support code. This makes the cpu unusable until the cpu device
is manually set offline and then online again via sysfs.
Instead of directly calling cpu_up/cpu_down, the migration code should
use the higher-level device core APIs to maintain consistent state and
serialize operations.
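A sketch of the direction (using the driver core helpers instead of
cpu_up()/cpu_down()):

    struct device *dev = get_cpu_device(cpu);

    lock_device_hotplug();
    rc = device_online(dev);    /* or device_offline(dev) on the way back */
    unlock_device_hotplug();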
Fixes: 120496ac2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
If a page is already mapped RW without the DIRTY flag, the DIRTY
flag is never set and a TLB store miss exception is taken forever.
This is easily reproduced with the following app:
#include <sys/mman.h>

void main(void)
{
        volatile char *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        *ptr = *ptr;
}
When DIRTY flag is not set, bail out of TLB miss handler and take
a minor page fault which will set the DIRTY flag.
Fixes: f8b58c64ea ("powerpc/603: let's handle PAGE_DIRTY directly")
Cc: stable@vger.kernel.org # v5.1+
Reported-by: Doug Crawford <doug.crawford@intelight-its.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/80432f71194d7ee75b2f5043ecf1501cf1cca1f3.1566196646.git.christophe.leroy@c-s.fr
pfn_pte is never given a pte above the addressable physical memory
limit, so the masking is redundant. In case of a software bug, it
is not obviously better to silently truncate the pfn than to corrupt
the pte (either one will result in memory corruption or crashes),
so there is no reason to add this to the fast path.
Add VM_BUG_ON to catch cases where the pfn is invalid. These would
catch the create_section_mapping bug fixed by a previous commit.
[16885.256466] ------------[ cut here ]------------
[16885.256492] kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612!
cpu 0x0: Vector: 700 (Program Check) at [c0000000ee0a36d0]
pc: c000000000080738: __map_kernel_page+0x248/0x6f0
lr: c000000000080ac0: __map_kernel_page+0x5d0/0x6f0
sp: c0000000ee0a3960
msr: 9000000000029033
current = 0xc0000000ec63b400
paca = 0xc0000000017f0000 irqmask: 0x03 irq_happened: 0x01
pid = 85, comm = sh
kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612!
Linux version 5.3.0-rc1-00001-g0fe93e5f3394
enter ? for help
[c0000000ee0a3a00] c000000000d37378 create_physical_mapping+0x260/0x360
[c0000000ee0a3b10] c000000000d370bc create_section_mapping+0x1c/0x3c
[c0000000ee0a3b30] c000000000071f54 arch_add_memory+0x74/0x130
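A sketch of what the check looks like (the exact mask names in the
powerpc headers may differ):

    static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
    {
            VM_BUG_ON(pfn >> (64 - PAGE_SHIFT));
            VM_BUG_ON((pfn << PAGE_SHIFT) & ~PTE_RPN_MASK);

            return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot));
    }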
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190724084638.24982-5-npiggin@gmail.com
Ensure __va is given a physical address below PAGE_OFFSET, and __pa is
given a virtual address above PAGE_OFFSET.
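e.g. something like (a sketch; the real macros keep the existing address
arithmetic and only add the checks):

    #define __va(x)                                                     \
    ({                                                                  \
            VM_BUG_ON((unsigned long)(x) >= PAGE_OFFSET);               \
            (void *)(unsigned long)((phys_addr_t)(x) | PAGE_OFFSET);    \
    })

    #define __pa(x)                                                     \
    ({                                                                  \
            VM_BUG_ON((unsigned long)(x) < PAGE_OFFSET);                \
            (unsigned long)(x) & 0x0fffffffffffffffUL;                  \
    })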
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190724084638.24982-4-npiggin@gmail.com
The alloc_pages_node return value should be tested for failure
before being passed to page_address.
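i.e. the usual pattern (sketch):

    page = alloc_pages_node(nid, GFP_KERNEL, 0);
    if (!page)
            return -ENOMEM;
    addr = page_address(page);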
Tested-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190724084638.24982-3-npiggin@gmail.com