| author | jduncanator <1518948+jduncanator@users.noreply.github.com> | 2020-03-05 11:41:33 +1100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2020-03-05 11:41:33 +1100 |
| commit | 68e15c1a7471e4b2844fc0d3c7385523e595521d (patch) | |
| tree | 3783af4216d1e4b31135d8055ea5bcd44a69276e /ARMeilleure/CodeGen | |
| parent | d9ed827696700ef5b9b777031bab451f23fb837c (diff) | |
Implement Fast Paths for most A32 SIMD instructions (#952)
* Begin work on A32 SIMD Intrinsics
* More instructions, some cleanup.
* Intrinsics for Move instructions (zip etc)
These pass the existing tests.
* Intrinsics for some of Cvt
While doing this I noticed that the conversion for int/fp was incorrect
in the slow path. I'll fix this in the original repo.
* Intrinsics for more Arithmetic instructions.
* Intrinsics for Vext
* Fix VEXT Intrinsic for double words.
* Use InsertPs to move scalar values.
* Cleanup, fix VPADD.f32 and VMIN signed integer.
* Cleanup, add SSE2 support for scalar insert.
Works similarly to the IR scalar insert, but obviously this one works
directly on V128.
* Minor cleanup.
* Enable intrinsic for FP64 to integer conversion.
* Address feedback apart from splitting out intrinsic float abs
Also: treat bad VREV encodings as undefined rather than throwing in translation.
* Move float abs to helper, fix bug with cvt
* Rename opc2 & 3 to match A32 docs, use ArgumentOutOfRangeException appropriately.
* Get name of variable at compilation rather than string literal.
* Use correct double sign mask.
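The entries "Move float abs to helper" and "Use correct double sign mask" both concern clearing the IEEE 754 sign bit. The helper itself is not part of the diff shown on this page, so the following is only a scalar sketch of the idea, not the project's code: the class name `FloatAbsSketch` and its members are hypothetical. It illustrates why the double mask has to target bit 63 while the single-precision mask targets bit 31.

```csharp
using System;

// Hypothetical illustration only; not taken from ARMeilleure.
static class FloatAbsSketch
{
    // Sign bit of an IEEE 754 binary64 value (bit 63).
    const long DoubleSignMask = unchecked((long)0x8000000000000000UL);

    // Sign bit of an IEEE 754 binary32 value (bit 31).
    const int SingleSignMask = unchecked((int)0x80000000U);

    // Absolute value of a double, computed by clearing bit 63.
    public static double Abs(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        return BitConverter.Int64BitsToDouble(bits & ~DoubleSignMask);
    }

    // Absolute value of a float, computed by clearing bit 31.
    public static float Abs(float value)
    {
        int bits = BitConverter.SingleToInt32Bits(value);
        return BitConverter.Int32BitsToSingle(bits & ~SingleSignMask);
    }
}
```

A vectorized fast path would apply the same mask to every lane with a bitwise AND-NOT; using the 32-bit mask on 64-bit lanes would clear the wrong bits, which is the kind of mistake the last commit entry appears to address.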
Diffstat (limited to 'ARMeilleure/CodeGen')
| -rw-r--r-- | ARMeilleure/CodeGen/X86/IntrinsicTable.cs | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/ARMeilleure/CodeGen/X86/IntrinsicTable.cs b/ARMeilleure/CodeGen/X86/IntrinsicTable.cs
index fd3b691d..c003eff3 100644
--- a/ARMeilleure/CodeGen/X86/IntrinsicTable.cs
+++ b/ARMeilleure/CodeGen/X86/IntrinsicTable.cs
@@ -52,6 +52,7 @@ namespace ARMeilleure.CodeGen.X86
             Add(Intrinsic.X86Divss, new IntrinsicInfo(X86Instruction.Divss, IntrinsicType.Binary));
             Add(Intrinsic.X86Haddpd, new IntrinsicInfo(X86Instruction.Haddpd, IntrinsicType.Binary));
             Add(Intrinsic.X86Haddps, new IntrinsicInfo(X86Instruction.Haddps, IntrinsicType.Binary));
+            Add(Intrinsic.X86Insertps, new IntrinsicInfo(X86Instruction.Insertps, IntrinsicType.TernaryImm));
             Add(Intrinsic.X86Maxpd, new IntrinsicInfo(X86Instruction.Maxpd, IntrinsicType.Binary));
             Add(Intrinsic.X86Maxps, new IntrinsicInfo(X86Instruction.Maxps, IntrinsicType.Binary));
             Add(Intrinsic.X86Maxsd, new IntrinsicInfo(X86Instruction.Maxsd, IntrinsicType.Binary));
@@ -62,6 +63,7 @@ namespace ARMeilleure.CodeGen.X86
             Add(Intrinsic.X86Minss, new IntrinsicInfo(X86Instruction.Minss, IntrinsicType.Binary));
             Add(Intrinsic.X86Movhlps, new IntrinsicInfo(X86Instruction.Movhlps, IntrinsicType.Binary));
             Add(Intrinsic.X86Movlhps, new IntrinsicInfo(X86Instruction.Movlhps, IntrinsicType.Binary));
+            Add(Intrinsic.X86Movss, new IntrinsicInfo(X86Instruction.Movss, IntrinsicType.Binary));
             Add(Intrinsic.X86Mulpd, new IntrinsicInfo(X86Instruction.Mulpd, IntrinsicType.Binary));
             Add(Intrinsic.X86Mulps, new IntrinsicInfo(X86Instruction.Mulps, IntrinsicType.Binary));
             Add(Intrinsic.X86Mulsd, new IntrinsicInfo(X86Instruction.Mulsd, IntrinsicType.Binary));
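The diff registers `Insertps` as `IntrinsicType.TernaryImm`, unlike its `Binary` neighbours. That fits the x86 instruction, which takes a destination vector, a source vector, and an 8-bit immediate. As a rough aid for reading the commit's "Use InsertPs to move scalar values" entry, the sketch below emulates the register form of `INSERTPS` on plain arrays, following the instruction's documented immediate layout. It is not ARMeilleure code; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical illustration only; emulates x86 INSERTPS (register source form).
static class InsertpsSketch
{
    // dst and src are 4-lane single-precision vectors; lane 0 is the lowest element.
    public static float[] Insertps(float[] dst, float[] src, byte imm8)
    {
        if (dst.Length != 4 || src.Length != 4)
        {
            throw new ArgumentException("Both vectors must have exactly 4 lanes.");
        }

        int srcLane  = (imm8 >> 6) & 3; // imm8[7:6]: lane read from src
        int dstLane  = (imm8 >> 4) & 3; // imm8[5:4]: lane overwritten in dst
        int zeroMask = imm8 & 0xF;      // imm8[3:0]: lanes forced to +0.0 afterwards

        float[] result = (float[])dst.Clone();
        result[dstLane] = src[srcLane];

        // The zero mask is applied after the insert, so it can also clear the inserted lane.
        for (int lane = 0; lane < 4; lane++)
        {
            if ((zeroMask & (1 << lane)) != 0)
            {
                result[lane] = 0f;
            }
        }

        return result;
    }
}
```

For example, an immediate of `0x10` copies lane 0 of the source into lane 1 of the destination and leaves the other lanes untouched, which is the shape of operation needed to place one scalar into a chosen lane of a 128-bit vector.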
