mirror-linux/arch/s390/include/asm
Minchan Kim 99baac21e4 mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
Nadav reported that parallel MADV_DONTNEED on the same range has a stale
TLB problem; Mel fixed it[1] and found the same problem with MADV_FREE[2].

Quote from Mel Gorman:
 "The race in question is CPU 0 running madv_free and updating some PTEs
  while CPU 1 is also running madv_free and looking at the same PTEs.
  CPU 1 may have writable TLB entries for a page but fail the pte_dirty
  check (because CPU 0 has updated it already) and potentially fail to
  flush.

  Hence, when madv_free on CPU 1 returns, there are still potentially
  writable TLB entries and the underlying PTE is still present so that a
  subsequent write does not necessarily propagate the dirty bit to the
  underlying PTE any more. Reclaim at some unknown time at the future
  may then see that the PTE is still clean and discard the page even
  though a write has happened in the meantime. I think this is possible
  but I could have missed some protection in madv_free that prevents it
  happening."

This patch aims to solve both problems at once and also prepares for
another problem involving KSM, MADV_FREE and soft-dirty[3].

The TLB batch API (tlb_[gather|finish]_mmu) uses [inc|dec]_tlb_flush_pending
and mmu_tlb_flush_pending so that, when tlb_finish_mmu is called, we can
detect that parallel threads are operating on the same mm.  In that case,
flush the TLB forcefully to prevent the user from accessing memory via a
stale TLB entry, even if this thread failed to gather any page table entry.
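
The counting scheme described above can be modelled in user space. The
following is only a minimal sketch, not the kernel code: struct mm_model,
struct gather_model, model_gather and model_finish are hypothetical names
invented for illustration, and the "flush" is just a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm state; not a kernel structure. */
struct mm_model {
	atomic_int tlb_flush_pending;	/* batches currently in flight */
	int flushes;			/* number of TLB flushes issued */
};

/* Hypothetical stand-in for one TLB gather batch. */
struct gather_model {
	struct mm_model *mm;
	bool need_flush;		/* true if this batch gathered PTEs */
};

/* Start a batch: announce that a flush may be pending on this mm. */
static void model_gather(struct gather_model *tlb, struct mm_model *mm)
{
	tlb->mm = mm;
	tlb->need_flush = false;
	atomic_fetch_add(&mm->tlb_flush_pending, 1);
}

/* Finish a batch: flush if we gathered entries or another batch is live. */
static void model_finish(struct gather_model *tlb)
{
	int pending = atomic_load(&tlb->mm->tlb_flush_pending);
	bool force;

	assert(pending >= 1);		/* our own batch is still counted */
	force = pending > 1;		/* a parallel batch is in flight */

	if (tlb->need_flush || force)
		tlb->mm->flushes++;

	atomic_fetch_sub(&tlb->mm->tlb_flush_pending, 1);
}
```

In this model, a batch that gathered nothing still flushes when it finishes
while a second batch is open on the same mm, which is the forced flush the
paragraph above describes. A batch that runs alone and gathers nothing does
not flush.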

I confirmed this patch works with the test program [4] Nadav gave, so this
patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
v2" in current mmotm.

NOTE:

This patch modifies the arch-specific TLB gathering interface (x86, ia64,
s390, sh, um).  Most architectures seem straightforward, but s390 needs
care because tlb_flush_mmu works only if mm->context.flush_mm is set to
non-zero, which happens only when a pte entry is really cleared by
ptep_get_and_clear and friends.  However, this problem never changes the
pte entries but still needs a flush to prevent memory access via a stale
TLB entry.
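
The s390 concern can be illustrated with a small user-space model. This is a
sketch under stated assumptions, not the s390 implementation: struct
s390_mm_model, s390_model_flush_mmu and s390_model_finish are hypothetical
names, and setting the flag on a forced finish is only one possible way to
make the flush take effect, not necessarily what this patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant per-mm state; not a kernel struct. */
struct s390_mm_model {
	int flush_mm;	/* models mm->context.flush_mm */
	int flushes;	/* number of TLB flushes actually performed */
};

/* Models a flush routine that is a no-op unless flush_mm is set. */
static void s390_model_flush_mmu(struct s390_mm_model *mm)
{
	assert(mm);
	if (!mm->flush_mm)
		return;		/* no pte was cleared, nothing is flushed */
	mm->flushes++;
	mm->flush_mm = 0;
}

/*
 * Models finishing a batch. On a forced finish no pte was cleared, so
 * flush_mm is zero; the flag is set here (an assumption for illustration)
 * so that the flush is not silently skipped.
 */
static void s390_model_finish(struct s390_mm_model *mm, bool force)
{
	if (force)
		mm->flush_mm = 1;
	s390_model_flush_mmu(mm);
}
```

Without the forced path, a finish with flush_mm equal to zero performs no
flush in this model, which is exactly the case the note above warns about.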

[1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
[2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
[3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
[4] https://patchwork.kernel.org/patch/9861621/

[minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
  Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-10 15:54:07 -07:00
..
fpu s390/fpu: improve kernel_fpu_[begin|end] 2016-08-29 11:05:01 +02:00
trace s390/zcrypt: tracepoint definitions for zcrypt device driver. 2016-12-14 16:33:40 +01:00
Kbuild s390: use two more generic header files 2017-06-12 16:25:57 +02:00
airq.h
appldata.h
archrandom.h s390/crypto: Provide s390 specific arch random functionality. 2017-04-26 13:41:35 +02:00
asm-prototypes.h s390/kbuild: enable modversions for symbols exported from asm 2016-12-20 15:22:56 +01:00
atomic.h s390/atomic: refactor atomic primitives 2016-11-11 16:37:33 +01:00
atomic_ops.h s390/spinlock: use atomic primitives for spinlocks 2017-04-12 08:43:33 +02:00
barrier.h
bitops.h s390/bitops: remove outdated comment 2017-03-22 08:29:05 +01:00
bug.h debug: Fix WARN_ON_ONCE() for modules 2017-07-20 12:31:04 +02:00
bugs.h
cache.h s390: use __section macro everywhere 2016-06-13 15:58:23 +02:00
ccwdev.h
ccwgroup.h
checksum.h Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
chpid.h
cio.h vfio: ccw: introduce support for ccw0 2017-03-31 12:55:12 +02:00
clp.h s390/pci: add ioctl interface for CLP 2016-03-07 16:54:32 +01:00
cmb.h
cmpxchg.h
compat.h take compat_sys_old_getrlimit() to native syscall 2017-05-27 15:38:06 -04:00
cpacf.h Merge branch 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features 2017-04-27 07:34:07 +02:00
cpcmd.h
cpu.h
cpu_mf.h s390/cpu_mf: remove register variable in __ecctr() 2017-03-31 07:53:34 +02:00
cpufeature.h
cputime.h s390/cputime: provide archicture specific cputime_to_nsecs 2017-03-01 09:59:27 +01:00
crw.h
css_chars.h
ctl_reg.h KVM: s390: implement instruction execution protection for emulated 2017-06-22 12:41:06 +02:00
current.h
debug.h s390: convert debug_info.ref_count from atomic_t to refcount_t 2017-05-11 16:35:32 +02:00
delay.h
diag.h s390/diag: add diag26c support 2017-06-20 15:44:15 -04:00
dis.h s390/uprobes: fix compile for !KPROBES 2017-05-03 09:08:57 +02:00
dma-mapping.h s390: implement ->mapping_error 2017-06-28 06:54:31 -07:00
dma.h
eadm.h block: introduce new block status code type 2017-06-09 09:27:32 -06:00
ebcdic.h
elf.h s390: reduce ELF_ET_DYN_BASE 2017-07-10 16:32:36 -07:00
exec.h
extable.h s390: switch to extable.h 2017-03-28 18:23:55 -04:00
extmem.h
facility.h s390/facilities: get rid of __ASSEMBLY__ in facility header file 2017-03-22 08:29:18 +01:00
fcx.h s390: use canonical include guard style 2016-06-13 15:58:17 +02:00
ftrace.h s390/dumpstack: get rid of return_address again 2016-10-17 14:44:33 +02:00
futex.h
gmap.h KVM: s390: backup the currently enabled gmap when scheduled out 2016-06-20 09:55:24 +02:00
hardirq.h
hugetlb.h mm/hugetlb: allow architectures to override huge_pte_clear() 2017-07-06 16:24:34 -07:00
hw_irq.h
idals.h Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
idle.h
io.h s390: provide default ioremap and iounmap declaration 2017-06-12 16:26:00 +02:00
ipl.h s390: fix initrd corruptions with gcov/kcov instrumented kernels 2016-12-12 12:11:20 +01:00
irq.h s390: use SPARSE_IRQ 2016-06-13 15:58:24 +02:00
irqflags.h s390/irqflags: optimize irq restore 2016-01-19 12:14:01 +01:00
isc.h vfio: ccw: basic implementation for vfio_ccw driver 2017-03-31 12:55:04 +02:00
itcw.h
jump_label.h s390: add explicit <linux/stringify.h> for jump label 2016-06-13 15:58:16 +02:00
kdebug.h
kexec.h s390/crash: Remove unused KEXEC_NOTE_BYTES 2017-07-05 07:35:29 +02:00
kprobes.h s390/uprobes: fix compile for !KPROBES 2017-05-03 09:08:57 +02:00
kvm_host.h PPC: 2017-07-06 18:38:31 -07:00
kvm_para.h
linkage.h
livepatch.h s390: Audit and remove any remaining unnecessary uses of module.h 2017-02-17 07:40:41 +01:00
lowcore.h s390: add a system call for guarded storage 2017-03-22 08:14:25 +01:00
mman.h s390/mm: make TASK_SIZE independent from the number of page table levels 2017-04-25 07:47:32 +02:00
mmu.h s390/kvm: Add use_cmma field to mm_context_t 2017-04-20 13:33:09 +02:00
mmu_context.h s390/kvm: avoid global config of vm.alloc_pgste=1 2017-06-13 13:03:41 +02:00
mmzone.h
module.h
nmi.h KVM: s390: Inject machine check into the guest 2017-06-28 12:42:32 +02:00
numa.h
os_info.h
page-states.h s390/kvm: Add PGSTE manipulation functions 2017-04-20 13:33:08 +02:00
page.h s390/mm: implement 5 level pages tables 2017-06-12 16:25:54 +02:00
pci.h s390/pci: fix handling of PEC 306 2017-06-28 07:32:13 +02:00
pci_clp.h s390/pci: use proper endianness annotations 2017-01-16 07:27:53 +01:00
pci_debug.h
pci_dma.h
pci_insn.h s390/pci: improve error handling during interrupt deregistration 2017-06-28 07:32:08 +02:00
pci_io.h s390/pci: improve ZPCI_* macros 2016-01-26 12:45:49 +01:00
percpu.h s390/percpu: remove this_cpu_cmpxchg_double_4 2016-03-02 06:44:30 -06:00
perf_event.h s390/cpum_cf: update counter numbers to ecctr limits 2017-03-31 07:53:26 +02:00
pgalloc.h s390/mm: implement 5 level pages tables 2017-06-12 16:25:54 +02:00
pgtable.h s390/mm: add p?d_folded() helper functions 2017-06-12 16:26:00 +02:00
pkey.h s390/pkey: Introduce new API for secure key verification 2017-03-22 08:29:13 +01:00
preempt.h s390/preempt: move preempt_count to the lowcore 2016-11-11 16:37:40 +01:00
processor.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2017-07-03 15:39:36 -07:00
ptrace.h s390/kvm: avoid global config of vm.alloc_pgste=1 2017-06-13 13:03:41 +02:00
qdio.h
reset.h
runtime_instr.h
rwsem.h locking/rwsem: Remove rwsem_atomic_add() and rwsem_atomic_update() 2016-06-08 15:16:59 +02:00
schid.h
sclp.h s390/sclp: Detect KSS facility 2017-04-21 11:08:04 +02:00
scsw.h s390/dasd: channel path aware error recovery 2016-12-12 12:05:03 +01:00
seccomp.h s390/seccomp: include generic seccomp header file 2016-04-01 17:20:55 +02:00
sections.h mm: fix section name for .data..ro_after_init 2017-03-31 17:13:30 -07:00
segment.h
serial.h
set_memory.h treewide: move set_memory_* functions away from cacheflush.h 2017-05-08 17:15:13 -07:00
setup.h s390/spinlock: remove compare and delay instruction 2017-04-12 08:43:33 +02:00
shmparam.h
signal.h
sigp.h s390/smp: use sigp condition code define 2017-06-12 16:25:58 +02:00
smp.h s390/smp: initialize cpu_present_mask in setup_arch 2016-12-07 07:23:07 +01:00
sparsemem.h s390: make MAX_PHYSMEM_BITS configurable 2017-03-28 16:55:10 +02:00
spinlock.h s390/spinlock: use atomic primitives for spinlocks 2017-04-12 08:43:33 +02:00
spinlock_types.h s390/spinlock: use atomic primitives for spinlocks 2017-04-12 08:43:33 +02:00
stp.h s390/time: remove ETR support 2016-06-13 15:58:21 +02:00
string.h s390/lib: add missing memory barriers to string inline assemblies 2016-12-14 16:33:41 +01:00
switch_to.h s390: add a system call for guarded storage 2017-03-22 08:14:25 +01:00
syscall.h s390/syscalls: Fix out of bounds arguments access 2017-07-05 07:35:30 +02:00
sysinfo.h S390/sysinfo: use uuid_is_null instead of opencoding it 2017-06-05 16:59:06 +02:00
termios.h
thread_info.h s390/kvm: avoid global config of vm.alloc_pgste=1 2017-06-13 13:03:41 +02:00
timex.h s390/timex: micro optimization for tod_to_ns 2017-03-01 09:59:28 +01:00
tlb.h mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem 2017-08-10 15:54:07 -07:00
tlbflush.h s390/mm,kvm: flush gmap address space with IDTE 2016-08-24 09:23:55 +02:00
topology.h s390/numa: establish cpu to node mapping early 2016-12-07 07:23:25 +01:00
types.h
uaccess.h Merge branch 'work.uaccess-unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-15 11:17:52 -07:00
unaligned.h
unistd.h s390: ignore pkey system calls 2016-10-17 11:25:25 +02:00
uprobes.h uprobes: remove function declarations from arch/{mips,s390} 2016-10-07 18:46:30 -07:00
user.h
vdso.h s390/time: steer clocksource on STP sync events 2016-10-28 10:09:02 +02:00
vga.h
vtime.h
vtimer.h
vx-insn.h RAID/s390: add SIMD implementation for raid6 gen/xor 2016-08-29 11:05:04 +02:00
xor.h s390/xor: optimized xor routing using the XC instruction 2016-02-23 08:56:17 +01:00