commit 160f4124ea8b4cd6c86867e111fa55e266345a16
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Jul 11 06:31:05 2023 +0200

    Linux 6.4.3
    
    Link: https://lore.kernel.org/r/20230709111345.297026264@linuxfoundation.org
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Link: https://lore.kernel.org/r/20230709203826.141774942@linuxfoundation.org
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Chris Paterson (CIP) <chris.paterson2@renesas.com>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 036666b4163d320282a627075934f1ab0de12f8b
Author: Suren Baghdasaryan <surenb@google.com>
Date:   Sat Jul 8 12:12:12 2023 -0700

    fork: lock VMAs of the parent process when forking
    
    commit fb49c455323ff8319a123dd312be9082c49a23a5 upstream.
    
    When forking a child process, the parent write-protects anonymous pages
    and COW-shares them with the child being forked using copy_present_pte().
    
    We must not take any concurrent page faults on the source vmas while they
    are being processed, as we expect both the vma and the ptes behind it
    to be stable.  For example, anon_vma_fork() expects the parent's
    vma->anon_vma not to change during the vma copy.
    
    A concurrent page fault on a page newly marked read-only by the page
    copy might trigger wp_page_copy() and an anon_vma_prepare(vma) on the
    source vma, defeating the anon_vma_clone() that wasn't done because the
    parent vma originally didn't have an anon_vma, but we now might end up
    copying a pte entry for a page that has one.
    
    Before the per-vma lock based changes, the mmap_lock guaranteed
    exclusion with concurrent page faults.  But now we need to do a
    vma_start_write() to make sure no concurrent faults happen on this vma
    while it is being processed.
    
    This fix can potentially regress some fork-heavy workloads.  Kernel
    build time did not show a noticeable regression on a 56-core machine,
    while a stress test mapping 10000 VMAs and forking 5000 times in a tight
    loop showed a ~5% regression.  If such a fork-time regression is
    unacceptable, disabling CONFIG_PER_VMA_LOCK should restore the previous
    performance.  Further optimizations are possible if this regression
    proves to be problematic.
    
    Suggested-by: David Hildenbrand <david@redhat.com>
    Reported-by: Jiri Slaby <jirislaby@kernel.org>
    Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
    Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
    Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
    Reported-by: Jacob Young <jacobly.alt@gmail.com>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
    Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
    Cc: stable@vger.kernel.org
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
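
    For illustration, the resulting dup_mmap() loop in kernel/fork.c looks
    roughly like this (a simplified sketch; child-VMA setup and error
    handling are omitted):

        for_each_vma(vmi, mpnt) {
                /*
                 * Write-lock the source VMA so that no page fault can be
                 * handled on it under the per-VMA lock while its ptes are
                 * copied; such faults fall back to taking the mmap_lock.
                 */
                vma_start_write(mpnt);

                /* ... allocate and set up the child VMA (tmp) from mpnt ... */

                /* COW-share the parent's pages with the child. */
                if (!(tmp->vm_flags & VM_WIPEONFORK))
                        retval = copy_page_range(tmp, mpnt);
        }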

commit 890ba5c464c2a9aeb26d0e873962e5b7d401df6b
Author: Liu Shixin <liushixin2@huawei.com>
Date:   Tue Jul 4 18:19:42 2023 +0800

    bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
    
    commit 028725e73375a1ff080bbdf9fb503306d0116f28 upstream.
    
    commit dd0ff4d12dd2 ("bootmem: remove the vmemmap pages from kmemleak in
    put_page_bootmem") fixed an existing kmemleak overlap problem.  But the
    problem still exists when HAVE_BOOTMEM_INFO_NODE is disabled, because in
    that case free_bootmem_page() calls free_reserved_page() directly.
    
    Fix the problem by also calling kmemleak_free_part() in free_bootmem_page()
    when HAVE_BOOTMEM_INFO_NODE is disabled.
    
    Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com
    Fixes: f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page")
    Signed-off-by: Liu Shixin <liushixin2@huawei.com>
    Acked-by: Muchun Song <songmuchun@bytedance.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
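
    With that, the !HAVE_BOOTMEM_INFO_NODE variant of free_bootmem_page()
    looks roughly like this (a sketch, not a verbatim copy of
    include/linux/bootmem_info.h):

        static inline void free_bootmem_page(struct page *page)
        {
                /* Inform kmemleak before the page goes back to the buddy. */
                kmemleak_free_part(page_to_virt(page), PAGE_SIZE);
                free_reserved_page(page);
        }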

commit e83e62fb1f386ee0e3e0da327660d4a4bcc2af2e
Author: Peter Collingbourne <pcc@google.com>
Date:   Mon May 22 17:43:08 2023 -0700

    mm: call arch_swap_restore() from do_swap_page()
    
    commit 6dca4ac6fc91fd41ea4d6c4511838d37f4e0eab2 upstream.
    
    Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
    the call to swap_free() before the call to set_pte_at(), which meant that
    the MTE tags could end up being freed before set_pte_at() had a chance to
    restore them.  Fix it by adding a call to the arch_swap_restore() hook
    before the call to swap_free().
    
    Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
    Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
    Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
    Signed-off-by: Peter Collingbourne <pcc@google.com>
    Reported-by: Qun-wei Lin <Qun-wei.Lin@mediatek.com>
    Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com/
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Steven Price <steven.price@arm.com>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Cc: <stable@vger.kernel.org>    [6.1+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
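
    The ordering that matters, sketched from the description above
    (a simplified excerpt of do_swap_page() in mm/memory.c; the hook takes
    the swap entry and the folio being swapped in):

        /*
         * Restore arch-specific metadata (e.g. arm64 MTE tags) indexed by
         * the swap entry; this must happen before swap_free() releases it.
         */
        arch_swap_restore(entry, folio);

        swap_free(entry);
        ...
        set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);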

commit 18822d84fd0931825e9640d757a47a93df1c97bb
Author: Hugh Dickins <hughd@google.com>
Date:   Sat Jul 8 16:04:00 2023 -0700

    mm: lock newly mapped VMA with corrected ordering
    
    commit 1c7873e3364570ec89343ff4877e0f27a7b21a61 upstream.
    
    Lockdep is certainly right to complain about
    
      (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
                     but task is already holding lock:
      (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db
    
    Invert those to the usual ordering.
    
    Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible")
    Cc: stable@vger.kernel.org
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Tested-by: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
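
    In other words, the new VMA is now write-locked before i_mmap_rwsem is
    taken in mmap_region() (a simplified sketch of the resulting order):

        /* Lock the VMA before any other lock involved in the insertion. */
        vma_start_write(vma);

        if (vma->vm_file)
                i_mmap_lock_write(vma->vm_file->f_mapping);

        vma_iter_store(&vmi, vma);
        /* ... link into the file's interval tree, then unlock i_mmap ... */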

commit 406815be903b5d7edeffff9594eabf0db2035fe1
Author: Suren Baghdasaryan <surenb@google.com>
Date:   Sat Jul 8 12:12:11 2023 -0700

    mm: lock newly mapped VMA which can be modified after it becomes visible
    
    commit 33313a747e81af9f31d0d45de78c9397fa3655eb upstream.
    
    mmap_region() adds a newly created VMA into the VMA tree and might modify
    it afterwards before dropping the mmap_lock.  This poses a problem for
    page faults handled under per-VMA locks because they don't take the
    mmap_lock and can stumble on this VMA while it's still being modified.
    Currently this is not an issue, since post-addition modifications are
    done only for file-backed VMAs, which are not handled under per-VMA
    locks.  However, once support for handling file-backed page faults with
    per-VMA locks is added, this will become a race.
    
    Fix this by write-locking the VMA before inserting it into the VMA tree.
    Other places where a new VMA is added into the VMA tree do not modify it
    after the insertion, so they do not need the same locking.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
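
    The write lock matters because the per-VMA-lock fault path only proceeds
    if it can read-lock the VMA; a rough sketch of the arch fault-handler
    pattern (simplified from the 6.4 x86 code):

        vma = lock_vma_under_rcu(mm, address);
        if (!vma)
                goto lock_mmap;   /* fall back to the mmap_lock path */

        fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
        vma_end_read(vma);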

commit 10bef9542ad3f38d6fb0919a44c413ddd3222814
Author: Suren Baghdasaryan <surenb@google.com>
Date:   Sat Jul 8 12:12:10 2023 -0700

    mm: lock a vma before stack expansion
    
    commit c137381f71aec755fbf47cd4e9bd4dce752c054c upstream.
    
    With recent changes requiring the mmap_lock to be held for write while
    expanding a stack, per-VMA locks should follow the same rule and be
    write-locked to prevent page faults into the VMA being expanded.  Add
    the necessary locking.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
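
    A rough sketch of the change as described, applied in both
    expand_upwards() and expand_downwards() in mm/mmap.c:

        /* We must make sure the anon_vma is allocated. */
        if (unlikely(anon_vma_prepare(vma)))
                return -ENOMEM;

        /* Block per-VMA-lock page faults on this VMA while it is resized. */
        vma_start_write(vma);

        anon_vma_lock_write(vma->anon_vma);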