-
Andre Przywara authored
This adds support for the AMD Phenom/Barcelona's SSE4a instructions. Those include insertq and extrq, which are doing shift and mask on XMM registers, in two versions (immediate shift/length values and stored in another XMM register). Additionally it implements movntss, movntsd, which are scalar non-temporal stores (avoiding cache trashing). These are implemented as normal stores, though. SSE4a is guarded by the SSE4A CPUID bit (Fn8000_0001:ECX[6]). Signed-off-by:
Andre Przywara <andre.przywara@amd.com> Signed-off-by:
Aurelien Jarno <aurelien@aurel32.net>
Andre Przywara authoredThis adds support for the AMD Phenom/Barcelona's SSE4a instructions. Those include insertq and extrq, which are doing shift and mask on XMM registers, in two versions (immediate shift/length values and stored in another XMM register). Additionally it implements movntss, movntsd, which are scalar non-temporal stores (avoiding cache trashing). These are implemented as normal stores, though. SSE4a is guarded by the SSE4A CPUID bit (Fn8000_0001:ECX[6]). Signed-off-by:
Andre Przywara <andre.przywara@amd.com> Signed-off-by:
Aurelien Jarno <aurelien@aurel32.net>