Welcome to mirror list, hosted at ThFree Co, Russian Federation.

github.com/torvalds/linux.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-09KVM: SVM: Only dump VMSA to klog at KERN_DEBUG levelPeter Gonda
Explicitly print the VMSA dump at KERN_DEBUG log level, KERN_CONT uses KERNEL_DEFAULT if the previous log line has a newline, i.e. if there's nothing to continuing, and as a result the VMSA gets dumped when it shouldn't. The KERN_CONT documentation says it defaults back to KERNL_DEFAULT if the previous log line has a newline. So switch from KERN_CONT to print_hex_dump_debug(). Jarkko pointed this out in reference to the original patch. See: https://lore.kernel.org/all/YuPMeWX4uuR1Tz3M@kernel.org/ print_hex_dump(KERN_DEBUG, ...) was pointed out there, but print_hex_dump_debug() should similar. Fixes: 6fac42f127b8 ("KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog") Signed-off-by: Peter Gonda <pgonda@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Harald Hoyer <harald@profian.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Message-Id: <20221104142220.469452-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09KVM: SVM: do not allocate struct svm_cpu_data dynamicallyPaolo Bonzini
The svm_data percpu variable is a pointer, but it is allocated via svm_hardware_setup() when KVM is loaded. Unlike hardware_enable() this means that it is never NULL for the whole lifetime of KVM, and static allocation does not waste any memory compared to the status quo. It is also more efficient and more easily handled from assembly code, so do it and don't look back. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-10KVM: SVM: Disable SEV-ES support if MMIO caching is disableSean Christopherson
Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs generating #NPF(RSVD), which are reflected by the CPU into the guest as a #VC. With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access to the guest instruction stream or register state and so can't directly emulate in response to a #NPF on an emulated MMIO GPA. Disabling MMIO caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT), and those flavors of #NPF cause automatic VM-Exits, not #VC. Adjust KVM's MMIO masks to account for the C-bit location prior to doing SEV(-ES) setup, and document that dependency between adjusting the MMIO SPTE mask and SEV(-ES) setup. Fixes: b09763da4dd8 ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)") Reported-by: Michael Roth <michael.roth@amd.com> Tested-by: Michael Roth <michael.roth@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220803224957.1285926-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-01Merge remote-tracking branch 'kvm/next' into kvm-next-5.20Paolo Bonzini
KVM/s390, KVM/x86 and common infrastructure changes for 5.20 x86: * Permit guests to ignore single-bit ECC errors * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Cleanups for MCE MSR emulation s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes Generic: * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple x86: * Use try_cmpxchg64 instead of cmpxchg64 * Bugfixes * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * x86/MMU: Allow NX huge pages to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs x86 cleanups: * Use separate namespaces for guest PTEs and shadow PTEs bitmasks * PIO emulation * Reorganize rmap API, mostly around rmap destruction * Do not workaround very old KVM bugs for L0 that runs with nesting enabled * new selftests API for CPUID
2022-07-28KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klogJarkko Sakkinen
As Virtual Machine Save Area (VMSA) is essential in troubleshooting attestation, dump it to the klog with the KERN_DEBUG level of priority. Cc: Jarkko Sakkinen <jarkko@kernel.org> Suggested-by: Harald Hoyer <harald@profian.com> Signed-off-by: Jarkko Sakkinen <jarkko@profian.com> Message-Id: <20220728050919.24113-1-jarkko@profian.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: SEV: Init target VMCBs in sev_migrate_fromPeter Gonda
The target VMCBs during an intra-host migration need to correctly setup for running SEV and SEV-ES guests. Add sev_init_vmcb() function and make sev_es_init_vmcb() static. sev_init_vmcb() uses the now private function to init SEV-ES guests VMCBs when needed. Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration") Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20220623173406.744645-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/svm: add __GFP_ACCOUNT to __sev_dbg_{en,de}crypt_user()Mingwei Zhang
Adding the accounting flag when allocating pages within the SEV function, since these memory pages should belong to individual VM. No functional change intended. Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220623171858.2083637-1-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: SVM: Hide SEV migration lockdep goo behind CONFIG_PROVE_LOCKINGSean Christopherson
Wrap the manipulation of @role and the manual mutex_{release,acquire}() invocations in CONFIG_PROVE_LOCKING=y to squash a clang-15 warning. When building with -Wunused-but-set-parameter and CONFIG_DEBUG_LOCK_ALLOC=n, clang-15 seees there's no usage of @role in mutex_lock_killable_nested() and yells. PROVE_LOCKING selects DEBUG_LOCK_ALLOC, and the only reason KVM manipulates @role is to make PROVE_LOCKING happy. To avoid true ugliness, use "i" and "j" to detect the first pass in the loops; the "idx" field that's used by kvm_for_each_vcpu() is guaranteed to be '0' on the first pass as it's simply the first entry in the vCPUs XArray, which is fully KVM controlled. kvm_for_each_vcpu() passes '0' for xa_for_each_range()'s "start", and xa_for_each_range() will not enter the loop if there's no entry at '0'. Fixes: 0c2c7c069285 ("KVM: SEV: Mark nested locking of vcpu->lock") Reported-by: kernel test robot <lkp@intel.com> Cc: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613214237.2538266-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...
2022-05-25KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leakAshish Kalra
For some sev ioctl interfaces, the length parameter that is passed maybe less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data that PSP firmware returns. In this case, kmalloc will allocate memory that is the size of the input rather than the size of the data. Since PSP firmware doesn't fully overwrite the allocated buffer, these sev ioctl interface may return uninitialized kernel slab memory. Reported-by: Andy Nguyen <theflow@google.com> Suggested-by: David Rientjes <rientjes@google.com> Suggested-by: Peter Gonda <pgonda@google.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Fixes: eaf78265a4ab3 ("KVM: SVM: Move SEV code to separate file") Fixes: 2c07ded06427d ("KVM: SVM: add support for SEV attestation command") Fixes: 4cfdd47d6d95a ("KVM: SVM: Add KVM_SEV SEND_START command") Fixes: d3d1af85e2c75 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Fixes: eba04b20e4861 ("KVM: x86: Account a variety of miscellaneous allocations") Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Peter Gonda <pgonda@google.com> Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25Merge tag 'kvm-riscv-5.19-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 5.19 - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support
2022-05-25Merge tag 'kvmarm-5.19' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated from 4 to 6. - Paolo]
2022-05-24Merge tag 'x86_sev_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD SEV-SNP support from Borislav Petkov: "The third AMD confidential computing feature called Secure Nested Paging. Add to confidential guests the necessary memory integrity protection against malicious hypervisor-based attacks like data replay, memory remapping and others, thus achieving a stronger isolation from the hypervisor. At the core of the functionality is a new structure called a reverse map table (RMP) with which the guest has a say in which pages get assigned to it and gets notified when a page which it owns, gets accessed/modified under the covers so that the guest can take an appropriate action. In addition, add support for the whole machinery needed to launch a SNP guest, details of which is properly explained in each patch. And last but not least, the series refactors and improves parts of the previous SEV support so that the new code is accomodated properly and not just bolted on" * tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/entry: Fixup objtool/ibt validation x86/sev: Mark the code returning to user space as syscall gap x86/sev: Annotate stack change in the #VC handler x86/sev: Remove duplicated assignment to variable info x86/sev: Fix address space sparse warning x86/sev: Get the AP jump table address from secrets page x86/sev: Add missing __init annotations to SEV init routines virt: sevguest: Rename the sevguest dir and files to sev-guest virt: sevguest: Change driver name to reflect generic SEV support x86/boot: Put globals that are accessed early into the .data section x86/boot: Add an efi.h header for the decompressor virt: sevguest: Fix bool function returning negative value virt: sevguest: Fix return value check in alloc_shared_pages() x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate() virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement virt: sevguest: Add support to get extended report virt: sevguest: Add support to derive key virt: Add SEV-SNP guest driver x86/sev: Register SEV-SNP guest request platform device x86/sev: Provide support for SNP guest request NAEs ...
2022-05-06KVM: SEV: Mark nested locking of vcpu->lockPeter Gonda
svm_vm_migrate_from() uses sev_lock_vcpus_for_migration() to lock all source and target vcpu->locks. Unfortunately there is an 8 subclass limit, so a new subclass cannot be used for each vCPU. Instead maintain ownership of the first vcpu's mutex.dep_map using a role specific subclass: source vs target. Release the other vcpu's mutex.dep_maps. Fixes: b56639318bb2b ("KVM: SEV: Add support for SEV intra host migration") Reported-by: John Sperbeck<jsperbeck@google.com> Suggested-by: David Rientjes <rientjes@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Gonda <pgonda@google.com> Message-Id: <20220502165807.529624-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-29KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX interceptsBabu Moger
The TSC_AUX virtualization feature allows AMD SEV-ES guests to securely use TSC_AUX (auxiliary time stamp counter data) in the RDTSCP and RDPID instructions. The TSC_AUX value is set using the WRMSR instruction to the TSC_AUX MSR (0xC0000103). It is read by the RDMSR, RDTSCP and RDPID instructions. If the read/write of the TSC_AUX MSR is intercepted, then RDTSCP and RDPID must also be intercepted when TSC_AUX virtualization is present. However, the RDPID instruction can't be intercepted. This means that when TSC_AUX virtualization is present, RDTSCP and TSC_AUX MSR read/write must not be intercepted for SEV-ES (or SEV-SNP) guests. Signed-off-by: Babu Moger <babu.moger@amd.com> Message-Id: <165040164424.1399644.13833277687385156344.stgit@bmoger-ubuntu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-29Merge branch 'kvm-fixes-for-5.18-rc5' into HEADPaolo Bonzini
Fixes for (relatively) old bugs, to be merged in both the -rc and next development trees. The merge reconciles the ABI fixes for KVM_EXIT_SYSTEM_EVENT between 5.18 and commit c24a950ec7d6 ("KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES", 2022-04-13).
2022-04-21KVM: SEV: add cache flush to solve SEV cache incoherency issuesMingwei Zhang
Flush the CPU caches when memory is reclaimed from an SEV guest (where reclaim also includes it being unmapped from KVM's memslots). Due to lack of coherency for SEV encrypted memory, failure to flush results in silent data corruption if userspace is malicious/broken and doesn't ensure SEV guest memory is properly pinned and unpinned. Cache coherency is not enforced across the VM boundary in SEV (AMD APM vol.2 Section 15.34.7). Confidential cachelines, generated by confidential VM guests have to be explicitly flushed on the host side. If a memory page containing dirty confidential cachelines was released by VM and reallocated to another user, the cachelines may corrupt the new user at a later time. KVM takes a shortcut by assuming all confidential memory remain pinned until the end of VM lifetime. Therefore, KVM does not flush cache at mmu_notifier invalidation events. Because of this incorrect assumption and the lack of cache flushing, malicous userspace can crash the host kernel: creating a malicious VM and continuously allocates/releases unpinned confidential memory pages when the VM is running. Add cache flush operations to mmu_notifier operations to ensure that any physical memory leaving the guest VM get flushed. In particular, hook mmu_notifier_invalidate_range_start and mmu_notifier_release events and flush cache accordingly. The hook after releasing the mmu lock to avoid contention with other vCPUs. Cc: stable@vger.kernel.org Suggested-by: Sean Christpherson <seanjc@google.com> Reported-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220421031407.2516575-4-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUsMingwei Zhang
Use clflush_cache_range() to flush the confidential memory when SME_COHERENT is supported in AMD CPU. Cache flush is still needed since SME_COHERENT only support cache invalidation at CPU side. All confidential cache lines are still incoherent with DMA devices. Cc: stable@vger.kerel.org Fixes: add5e2f04541 ("KVM: SVM: Add support for the SEV-ES VMSA") Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220421031407.2516575-3-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21KVM: SVM: Simplify and harden helper to flush SEV guest page(s)Sean Christopherson
Rework sev_flush_guest_memory() to explicitly handle only a single page, and harden it to fall back to WBINVD if VM_PAGE_FLUSH fails. Per-page flushing is currently used only to flush the VMSA, and in its current form, the helper is completely broken with respect to flushing actual guest memory, i.e. won't work correctly for an arbitrary memory range. VM_PAGE_FLUSH takes a host virtual address, and is subject to normal page walks, i.e. will fault if the address is not present in the host page tables or does not have the correct permissions. Current AMD CPUs also do not honor SMAP overrides (undocumented in kernel versions of the APM), so passing in a userspace address is completely out of the question. In other words, KVM would need to manually walk the host page tables to get the pfn, ensure the pfn is stable, and then use the direct map to invoke VM_PAGE_FLUSH. And the latter might not even work, e.g. if userspace is particularly evil/clever and backs the guest with Secret Memory (which unmaps memory from the direct map). Signed-off-by: Sean Christopherson <seanjc@google.com> Fixes: add5e2f04541 ("KVM: SVM: Add support for the SEV-ES VMSA") Reported-by: Mingwei Zhang <mizhang@google.com> Cc: stable@vger.kernel.org Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220421031407.2516575-2-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-13KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ESPeter Gonda
If an SEV-ES guest requests termination, exit to userspace with KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL so that userspace can take appropriate action. See AMD's GHCB spec section '4.1.13 Termination Request' for more details. Suggested-by: Sean Christopherson <seanjc@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Gonda <pgonda@google.com> Reported-by: kernel test robot <lkp@intel.com> Message-Id: <20220407210233.782250-1-pgonda@google.com> [Add documentatino. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-11KVM: SVM: Do not activate AVIC for SEV-enabled guestSuravee Suthikulpanit
Since current AVIC implementation cannot support encrypted memory, inhibit AVIC for SEV-enabled guest. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220408133710.54275-1-suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06KVM: SVM: Create a separate mapping for the SEV-ES save areaTom Lendacky
The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is different from the save area of a non SEV-ES/SEV-SNP guest. This is the first step in defining the multiple save areas to keep them separate and ensuring proper operation amongst the different types of guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new save area definition where needed. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Link: https://lore.kernel.org/r/20220405182743.308853-1-brijesh.singh@amd.com
2022-04-05KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()Peter Gonda
Add resched to avoid warning from sev_clflush_pages() with large number of pages. Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20220330164306.2376085-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-01KVM: SVM: Exit to userspace on ENOMEM/EFAULT GHCB errorsSean Christopherson
Exit to userspace if setup_vmgexit_scratch() fails due to OOM or because copying data from guest (userspace) memory failed/faulted. The OOM scenario is clearcut, it's userspace's decision as to whether it should terminate the guest, free memory, etc... As for -EFAULT, arguably, any guest issue is a violation of the guest's contract with userspace, and thus userspace needs to decide how to proceed. E.g. userspace defines what is RAM vs. MMIO and communicates that directly to the guest, KVM is not involved in deciding what is/isn't RAM nor in communicating that information to the guest. If the scratch GPA doesn't resolve to a memslot, then the guest is not honoring the memory configuration as defined by userspace. And if userspace unmaps an hva for whatever reason, then exiting to userspace with -EFAULT is absolutely the right thing to do. KVM's ABI currently sucks and doesn't provide enough information to act on the -EFAULT, but that will hopefully be remedied in the future as there are multiple use cases, e.g. uffd and virtiofs truncation, that shouldn't require any work in KVM beyond returning -EFAULT with a small amount of metadata. KVM could define its ABI such that failure to access the scratch area is reflected into the guest, i.e. establish a contract with userspace, but that's undesirable as it limits KVM's options in the future, e.g. in the potential uffd case any failure on a uaccess needs to kick out to userspace. KVM does have several cases where it reflects these errors into the guest, e.g. kvm_pv_clock_pairing() and Hyper-V emulation, but KVM would preferably "fix" those instead of propagating the falsehood that any memory failure is the guest's fault. Lastly, returning a boolean as an "error" for that a helper that isn't named accordingly never works out well. Fixes: ad5b353240c8 ("KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure") Cc: Alper Gun <alpergun@google.com> Cc: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220225205209.3881130-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-18KVM: SEV: Allow SEV intra-host migration of VM with mirrorsPeter Gonda
For SEV-ES VMs with mirrors to be intra-host migrated they need to be able to migrate with the mirror. This is due to that fact that all VMSAs need to be added into the VM with LAUNCH_UPDATE_VMSA before lAUNCH_FINISH. Allowing migration with mirrors allows users of SEV-ES to keep the mirror VMs VMSAs during migration. Adds a list of mirror VMs for the original VM iterate through during its migration. During the iteration the owner pointers can be updated from the source to the destination. This fixes the ASID leaking issue which caused the blocking of migration of VMs with mirrors. Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20220211193634.3183388-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: SVM: Rename hook implementations to conform to kvm_x86_ops' namesSean Christopherson
Massage SVM's implementation names that still diverge from kvm_x86_ops to allow for wiring up all SVM-defined functions via kvm-x86-ops.h. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-22-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: SVM: Rename SEV implemenations to conform to kvm_x86_ops hooksSean Christopherson
Rename svm_vm_copy_asid_from() and svm_vm_migrate_from() to conform to the names used by kvm_x86_ops, and opportunistically use "sev" instead of "svm" to more precisely identify the role of the hooks. svm_vm_copy_asid_from() in particular was poorly named as the function does much more than simply copy the ASID. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-21-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: x86: Use more verbose names for mem encrypt kvm_x86_ops hooksSean Christopherson
Use slightly more verbose names for the so called "memory encrypt", a.k.a. "mem enc", kvm_x86_ops hooks to bridge the gap between the current super short kvm_x86_ops names and SVM's more verbose, but non-conforming names. This is a step toward using kvm-x86-ops.h with KVM_X86_CVM_OP() to fill svm_x86_ops. Opportunistically rename mem_enc_op() to mem_enc_ioctl() to better reflect its true nature, as it really is a full fledged ioctl() of its own. Ideally, the hook would be named confidential_vm_ioctl() or so, as the ioctl() is a gateway to more than just memory encryption, and because its underlying purpose to support Confidential VMs, which can be provided without memory encryption, e.g. if the TCB of the guest includes the host kernel but not host userspace, or by isolation in hardware without encrypting memory. But, diverging from KVM_MEMORY_ENCRYPT_OP even further is undeseriable, and short of creating alises for all related ioctl()s, which introduces a different flavor of divergence, KVM is stuck with the nomenclature. Defer renaming SVM's functions to a future commit as there are additional changes needed to make SVM fully conforming and to match reality (looking at you, svm_vm_copy_asid_from()). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-20-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: SVM: improve split between svm_prepare_guest_switch and ↵Paolo Bonzini
sev_es_prepare_guest_switch KVM performs the VMSAVE to the host save area for both regular and SEV-ES guests, so hoist it up to svm_prepare_guest_switch. And because sev_es_prepare_guest_switch does not really need to know the details of struct svm_cpu_data *, just pass it the pointer to the host save area inside the HSAVE page. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Explicitly require DECODEASSISTS to enable SEV supportSean Christopherson
Add a sanity check on DECODEASSIST being support if SEV is supported, as KVM cannot read guest private memory and thus relies on the CPU to provide the instruction byte stream on #NPF for emulation. The intent of the check is to document the dependency, it should never fail in practice as producing hardware that supports SEV but not DECODEASSISTS would be non-sensical. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-19Merge branch 'kvm-pi-raw-spinlock' into HEADPaolo Bonzini
Bring in fix for VT-d posted interrupts before further changing the code in 5.17. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: SEV: Mark nested locking of kvm->lockWanpeng Li
Both source and dest vms' kvm->locks are held in sev_lock_two_vms. Mark one with a different subtype to avoid false positives from lockdep. Fixes: c9d61dcb0bc26 (KVM: SEV: accept signals in sev_lock_two_vms) Reported-by: Yiru Xu <xyru1999@gmail.com> Tested-by: Jinrong Liang <cloudliang@tencent.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1641364863-26331-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s indexMarc Zyngier
Everywhere we use kvm_for_each_vpcu(), we use an int as the vcpu index. Unfortunately, we're about to move rework the iterator, which requires this to be upgrade to an unsigned long. Let's bite the bullet and repaint all of it in one go. Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20211116160403.4074052-7-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-05KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failureTom Lendacky
Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT exit code or exit parameters fails. The VMGEXIT instruction can be issued from userspace, even though userspace (likely) can't update the GHCB. To prevent userspace from being able to kill the guest, return an error through the GHCB when validation fails rather than terminating the guest. For cases where the GHCB can't be updated (e.g. the GHCB can't be mapped, etc.), just return back to the guest. The new error codes are documented in the lasest update to the GHCB specification. Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <b57280b5562893e2616257ac9c2d4525a9aeeb42.1638471124.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-05KVM: SEV: Fall back to vmalloc for SEV-ES scratch area if necessarySean Christopherson
Use kvzalloc() to allocate KVM's buffer for SEV-ES's GHCB scratch area so that KVM falls back to __vmalloc() if physically contiguous memory isn't available. The buffer is purely a KVM software construct, i.e. there's no need for it to be physically contiguous. Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109222350.2266045-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-05KVM: SEV: Return appropriate error codes if SEV-ES scratch setup failsSean Christopherson
Return appropriate error codes if setting up the GHCB scratch area for an SEV-ES guest fails. In particular, returning -EINVAL instead of -ENOMEM when allocating the kernel buffer could be confusing as userspace would likely suspect a guest issue. Fixes: 8f423a80d299 ("KVM: SVM: Support MMIO for an SEV-ES guest") Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109222350.2266045-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: accept signals in sev_lock_two_vmsPaolo Bonzini
Generally, kvm->lock is not taken for a long time, but sev_lock_two_vms is different: it takes vCPU locks inside, so userspace can hold it back just by calling a vCPU ioctl. Play it safe and use mutex_lock_killable. Message-Id: <20211123005036.2954379-13-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: do not take kvm->lock when destroyingPaolo Bonzini
Taking the lock is useless since there are no other references, and there are already accesses (e.g. to sev->enc_context_owner) that do not take it. So get rid of it. Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-12-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: Prohibit migration of a VM that has mirrorsPaolo Bonzini
VMs that mirror an encryption context rely on the owner to keep the ASID allocated. Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM would cause a dangling ASID: 1. copy context from A to B (gets ref to A) 2. move context from A to L (moves ASID from A to L) 3. close L (releases ASID from L, B still references it) The right way to do the handoff instead is to create a fresh mirror VM on the destination first: 1. copy context from A to B (gets ref to A) [later] 2. close B (releases ref to A) 3. move context from A to L (moves ASID from A to L) 4. copy context from L to M So, catch the situation by adding a count of how many VMs are mirroring this one's encryption context. Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration") Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs lockedPaolo Bonzini
Now that we have a facility to lock two VMs with deadlock protection, use it for the creation of mirror VMs as well. One of COPY_ENC_CONTEXT_FROM(dst, src) and COPY_ENC_CONTEXT_FROM(src, dst) would always fail, so the combination is nonsensical and it is okay to return -EBUSY if it is attempted. This sidesteps the question of what happens if a VM is MOVE_ENC_CONTEXT_FROM'd at the same time as it is COPY_ENC_CONTEXT_FROM'd: the locking prevents that from happening. Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-10-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROMPaolo Bonzini
Allow intra-host migration of a mirror VM; the destination VM will be a mirror of the same ASID as the source. Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-8-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: initialize regions_list of a mirror VMPaolo Bonzini
This was broken before the introduction of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM, but technically harmless because the region list was unused for a mirror VM. However, it is untidy and it now causes a NULL pointer access when attempting to move the encryption context of a mirror VM. Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context") Message-Id: <20211123005036.2954379-7-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROMPaolo Bonzini
Encapsulate the handling of the migration_in_progress flag for both VMs in two functions sev_lock_two_vms and sev_unlock_two_vms. It does not matter if KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM locks the destination struct kvm a bit later, and this change 1) keeps the cleanup chain of labels smaller 2) makes it possible for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM to reuse the logic. Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: do not use list_replace_init on an empty listPaolo Bonzini
list_replace_init cannot be used if the source is an empty list, because "new->next->prev = new" will overwrite "old->next": new old prev = new, next = new prev = old, next = old new->next = old->next prev = new, next = old prev = old, next = old new->next->prev = new prev = new, next = old prev = old, next = new new->prev = old->prev prev = old, next = old prev = old, next = old new->next->prev = new prev = old, next = old prev = new, next = new The desired outcome instead would be to leave both old and new the same as they were (two empty circular lists). Use list_cut_before, which already has the necessary check and is documented to discard the previous contents of the list that will hold the result. Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18Merge branch 'kvm-5.16-fixes' into kvm-masterPaolo Bonzini
* Fixes for Xen emulation * Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache * Fixes for migration of 32-bit nested guests on 64-bit hypervisor * Compilation fixes * More SEV cleanups
2021-11-18KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()Sean Christopherson
Rename cmd_allowed_from_miror() to is_cmd_allowed_from_mirror(), fixing a typo and making it obvious that the result is a boolean where false means "not allowed". No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109215101.2211373-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18KVM: SEV: Drop a redundant setting of sev->asid during initializationSean Christopherson
Remove a fully redundant write to sev->asid during SEV/SEV-ES guest initialization. The ASID is set a few lines earlier prior to the call to sev_platform_init(), which doesn't take "sev" as a param, i.e. can't muck with the ASID barring some truly magical behind-the-scenes code. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109215101.2211373-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()Sean Christopherson
Set sev_info.active during SEV/SEV-ES activation before calling any code that can potentially consume sev_info.es_active, e.g. set "active" and "es_active" as a pair immediately after the initial sanity checks. KVM generally expects that es_active can be true if and only if active is true, e.g. sev_asid_new() deliberately avoids sev_es_guest() so that it doesn't get a false negative. This will allow WARNing in sev_es_guest() if the VM is tagged as SEV-ES but not SEV. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109215101.2211373-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUsSean Christopherson
Reject COPY_ENC_CONTEXT_FROM if the destination VM has created vCPUs. KVM relies on SEV activation to occur before vCPUs are created, e.g. to set VMCB flags and intercepts correctly. Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Nathan Tempelman <natet@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211109215101.2211373-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-12KVM: SEV: unify cgroup cleanup code for svm_vm_migrate_fromPaolo Bonzini
Use the same cleanup code independent of whether the cgroup to be uncharged and unref'd is the source or the destination cgroup. Use a bool to track whether the destination cgroup has been charged, which also fixes a bug in the error case: the destination cgroup must be uncharged only if it does not match the source. Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>