Welcome to mirror list, hosted at ThFree Co, Russian Federation.

github.com/torvalds/linux.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJunaid Shahid <junaids@google.com>2022-08-11 01:49:39 +0300
committerPaolo Bonzini <pbonzini@redhat.com>2022-08-19 14:38:03 +0300
commitb64d740ea7ddc929d97b28de4c0665f7d5db9e2a (patch)
tree671bb2795093290203d8473e28a900967525712b /arch/x86/kvm/x86.c
parent1441ca1494d74a2c9d6c9bd8dd633d9ebff1daba (diff)
kvm: x86: mmu: Always flush TLBs when enabling dirty logging
When A/D bits are not available, KVM uses a software access tracking mechanism, which involves making the SPTEs inaccessible. However, the clear_young() MMU notifier does not flush TLBs. So it is possible that there may still be stale, potentially writable, TLB entries. This is usually fine, but can be problematic when enabling dirty logging, because it currently only does a TLB flush if any SPTEs were modified. But if all SPTEs are in access-tracked state, then there won't be a TLB flush, which means that the guest could still possibly write to memory and not have it reflected in the dirty bitmap. So just unconditionally flush the TLBs when enabling dirty logging. As an alternative, KVM could explicitly check the MMU-Writable bit when write-protecting SPTEs to decide if a flush is needed (instead of checking the Writable bit), but given that a flush almost always happens anyway, so just making it unconditional seems simpler. Signed-off-by: Junaid Shahid <junaids@google.com> Message-Id: <20220810224939.2611160-1-junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'arch/x86/kvm/x86.c')
-rw-r--r--arch/x86/kvm/x86.c44
1 files changed, 44 insertions, 0 deletions
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 205ebdc2b11b..d7374d768296 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12473,6 +12473,50 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
} else {
kvm_mmu_slot_remove_write_access(kvm, new, PG_LEVEL_4K);
}
+
+ /*
+ * Unconditionally flush the TLBs after enabling dirty logging.
+ * A flush is almost always going to be necessary (see below),
+ * and unconditionally flushing allows the helpers to omit
+ * the subtly complex checks when removing write access.
+ *
+ * Do the flush outside of mmu_lock to reduce the amount of
+ * time mmu_lock is held. Flushing after dropping mmu_lock is
+ * safe as KVM only needs to guarantee the slot is fully
+ * write-protected before returning to userspace, i.e. before
+ * userspace can consume the dirty status.
+ *
+ * Flushing outside of mmu_lock requires KVM to be careful when
+ * making decisions based on writable status of an SPTE, e.g. a
+ * !writable SPTE doesn't guarantee a CPU can't perform writes.
+ *
+ * Specifically, KVM also write-protects guest page tables to
+ * monitor changes when using shadow paging, and must guarantee
+ * no CPUs can write to those page before mmu_lock is dropped.
+ * Because CPUs may have stale TLB entries at this point, a
+ * !writable SPTE doesn't guarantee CPUs can't perform writes.
+ *
+ * KVM also allows making SPTES writable outside of mmu_lock,
+ * e.g. to allow dirty logging without taking mmu_lock.
+ *
+ * To handle these scenarios, KVM uses a separate software-only
+ * bit (MMU-writable) to track if a SPTE is !writable due to
+ * a guest page table being write-protected (KVM clears the
+ * MMU-writable flag when write-protecting for shadow paging).
+ *
+ * The use of MMU-writable is also the primary motivation for
+ * the unconditional flush. Because KVM must guarantee that a
+ * CPU doesn't contain stale, writable TLB entries for a
+ * !MMU-writable SPTE, KVM must flush if it encounters any
+ * MMU-writable SPTE regardless of whether the actual hardware
+ * writable bit was set. I.e. KVM is almost guaranteed to need
+ * to flush, while unconditionally flushing allows the "remove
+ * write access" helpers to ignore MMU-writable entirely.
+ *
+ * See is_writable_pte() for more details (the case involving
+ * access-tracked SPTEs is particularly relevant).
+ */
+ kvm_arch_flush_remote_tlbs_memslot(kvm, new);
}
}