Extending NaplesPU for 64-bit support
Latest revision as of 17:23, 21 June 2019
The NaplesPU toolchain can be extended to support 64-bit operations. A git branch with full 64-bit support is provided: to build the toolchain with this extension, check out the llvm-7-64b branch.
The changes affect both the frontend and the backend.
NaplesPU Frontend Modifications
The NaplesPU frontend abstracts target information through the TargetInfo class, which is extended by the NaplesPUTargetInfo implementation.
Since 64-bit operations require support for double-width integer and double-precision floating-point formats, the following changes and additions are required in the target definition:
class LLVM_LIBRARY_VISIBILITY NaplesPUTargetInfo : public TargetInfo {
  ...
public:
  NaplesPUTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
      : TargetInfo(Triple) {
    ...
    resetDataLayout("e-m:e-p:32:32-i64:64:64-i32:32:32-f32:32:32-f64:64:64");
    LongDoubleWidth = 64;
    LongDoubleAlign = 64;
    DoubleWidth = 64;
    DoubleAlign = 64;
    LongWidth = 64;
    LongAlign = 64;
    LongLongWidth = 64;
    LongLongAlign = 64;
  }
NaplesPU Backend Modifications
This section describes the backend modification to be applied for 64-bit support.
Registers Definition
The 64-bit register support relies on the LLVM sub-register ("Sub-Reg") mechanism. Since NaplesPU registers are 32 bits wide, a 64-bit variable is split into two parts:
- The upper 32 bits are placed in the S[i] register;
- The lower 32 bits are placed in the S[i+1] register.
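The split described above can be sketched in plain C++. This is only an illustrative model: the names RegPair, splitU64 and joinU64 are hypothetical and do not appear in the NaplesPU sources, and the sketch does not model which physical register receives which half.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit value held in two 32-bit registers.
struct RegPair {
    std::uint32_t upper; // upper 32 bits of the 64-bit value
    std::uint32_t lower; // lower 32 bits of the 64-bit value
};

// Split a 64-bit value into its two 32-bit halves.
constexpr RegPair splitU64(std::uint64_t v) {
    return RegPair{static_cast<std::uint32_t>(v >> 32),
                   static_cast<std::uint32_t>(v & 0xFFFFFFFFu)};
}

// Rebuild the 64-bit value from the two halves.
constexpr std::uint64_t joinU64(RegPair p) {
    return (static_cast<std::uint64_t>(p.upper) << 32) | p.lower;
}
```

Joining the result of a split always gives back the original value, which is the property the sub-register pairing relies on.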
The following class is declared in NaplesPURegisterInfo.td.
class NaplesPU64GPRReg<bits<16> Enc, string n, list<Register> subregs>
    : NaplesPURegWithSubRegs<Enc, n, subregs> {
  let SubRegIndices = [sub_even, sub_odd];
  let CoveredBySubRegs = 1;
}
The register instantiation is realized as follows:
foreach i = 0-28 in {
  def S#!shl(i, 1)#_S#!add(!shl(i, 1), 1) : NaplesPU64GPRReg<!shl(i, 1), "s"#!shl(i, 1)#_64,
      [!cast<NaplesPUGPRReg>("S"#!shl(i, 1)),
       !cast<NaplesPUGPRReg>("S"#!add(!shl(i, 1), 1))]>;
}
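To make the effect of the foreach loop concrete, the following C++ sketch enumerates the register pairs it would define, assuming the TableGen range 0-28 is inclusive. The PairReg struct and the enumeratePairRegs function are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one 64-bit register pair.
struct PairReg {
    std::string name;  // e.g. "S0_S1"
    unsigned encoding; // mirrors !shl(i, 1)
    std::string even;  // first sub-register, e.g. "S0"
    std::string odd;   // second sub-register, e.g. "S1"
};

// Mirror of: foreach i = 0-28 in { def S<2i>_S<2i+1> ... }
std::vector<PairReg> enumeratePairRegs() {
    std::vector<PairReg> regs;
    for (unsigned i = 0; i <= 28; ++i) {
        const unsigned even = i << 1;   // !shl(i, 1)
        const unsigned odd = even + 1;  // !add(!shl(i, 1), 1)
        const std::string evenName = "S" + std::to_string(even);
        const std::string oddName = "S" + std::to_string(odd);
        regs.push_back({evenName + "_" + oddName, even, evenName, oddName});
    }
    return regs;
}
```

Under that assumption the loop yields 29 pairs, from S0_S1 up to S56_S57, each encoded with the index of its even sub-register.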
With 64-bit support in place, a 512-bit vector register can also be managed as eight elements, each 64 bits wide.
def VR512L : RegisterClass<"NaplesPU", [v8i64, v8f64, v8i8, v8i16, v8i32], 512, (sequence "V%u", 0, 63)>;
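The lane arithmetic behind this register class can be checked with a small C++ sketch. The type aliases and the lanes helper are hypothetical and only model the element counts; they are not part of the NaplesPU backend.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Width of one NaplesPU vector register, taken from the RegisterClass above.
constexpr std::size_t kVectorBits = 512;

// Number of elements of the given width that fit in one vector register.
constexpr std::size_t lanes(std::size_t elementBits) {
    return kVectorBits / elementBits;
}

// Hypothetical host-side views of the same 512 bits.
using Vec8i64 = std::array<std::int64_t, lanes(64)>;  // eight 64-bit elements
using Vec16i32 = std::array<std::int32_t, lanes(32)>; // sixteen 32-bit elements

static_assert(sizeof(Vec8i64) * 8 == kVectorBits, "v8i64 fills one register");
static_assert(sizeof(Vec16i32) * 8 == kVectorBits, "v16i32 fills one register");
```

The same physical register therefore holds either sixteen 32-bit elements or eight 64-bit elements.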
Calling Convention
The NaplesPU calling conventions are modified to support the 64-bit registers. The 32-bit Calling Convention is extended with the following lines:
CCIfType<[v8i8, v8i16, v8i32], CCPromoteToType<v8i64>>,
CCIfNotVarArg<CCIfType<[i64, f64], CCAssignToReg<[S0_S1, S2_S3, S4_S5, S6_S7, S8_S9, S10_S11, S12_S13, S14_S15]>>>,
CCIfNotVarArg<CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToReg<[V0, V1, V2, V3, V4, V5, V6, V7]>>>,
The lines above describe how parameters are passed in registers. Stack assignment is described by the following lines:
CCIfType<[i64, f64], CCAssignToStack<8, 8>>,
CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToStack<64, 64>>
The return value calling convention is extended as follows:
CCIfType<[i64, f64], CCAssignToReg<[S0_S1, S2_S3, S4_S5, S6_S7, S8_S9, S10_S11]>>,
CCIfType<[v8i8, v8i16, v8i32], CCPromoteToType<v8i64>>,
CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToReg<[V0, V1, V2, V3, V4, V5]>>
ISA Support
Building on the 32-bit ISA Support, 64-bit instructions are supported by extending the NaplesPU instruction hierarchy with a bit that marks an instruction as 64-bit.
As an example, in the following code the FR_TwoOp_Unmasked_64 class is derived from FR_TwoOp_Unmasked, setting the sixth parameter to 1. As described in the 32-bit ISA Support, this parameter indicates whether or not the instruction has 64-bit behaviour.
class FR_TwoOp_Unmasked_64<dag outs, dag ins, string asmstr, list<dag> pattern, bits<6> opcode, Fmt fmt2, Fmt fmt1, Fmt fmt0>
    : FR_TwoOp_Unmasked<outs, ins, asmstr, pattern, opcode, 1, fmt2, fmt1, fmt0> {}
Instruction Lowering
Since new instructions are defined, some of them need custom lowering, so NaplesPUISelLowering is extended accordingly. For instance, LoadI64 is a pseudo-instruction and must be expanded as follows:
case NaplesPU::LoadI64:
  return EmitLoadI64(&MI, BB);
...
MachineBasicBlock *
NaplesPUTargetLowering::EmitLoadI64(MachineInstr *MI,
                                    MachineBasicBlock *BB) const {
  DebugLoc DL = MI->getDebugLoc();
  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
  MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();

  // Read the 64-bit destination register and the immediate to load
  unsigned DstReg = MI->getOperand(0).getReg();
  int64_t ImmOp = MI->getOperand(1).getImm();

  // Upper 32 bits of the immediate go to the sub_odd sub-register,
  // written as two 16-bit halves (high part first, then low part)
  unsigned Immediate = ((ImmOp >> 32) & 0xffffffff);
  BuildMI(*BB, MI, DL, TII->get(NaplesPU::MOVEIHSI))
      .addReg(DstReg, RegState::Define, NaplesPU::sub_odd)
      .addImm(((Immediate >> 16) & 0xFFFF));
  BuildMI(*BB, MI, DL, TII->get(NaplesPU::MOVEILSI))
      .addReg(DstReg, 0, NaplesPU::sub_odd)
      .addImm((Immediate & 0xFFFF));

  // Lower 32 bits of the immediate go to the sub_even sub-register
  Immediate = (ImmOp & 0xffffffff);
  BuildMI(*BB, MI, DL, TII->get(NaplesPU::MOVEIHSI))
      .addReg(DstReg, 0, NaplesPU::sub_even)
      .addImm(((Immediate >> 16) & 0xFFFF));
  BuildMI(*BB, MI, DL, TII->get(NaplesPU::MOVEILSI))
      .addReg(DstReg, 0, NaplesPU::sub_even)
      .addImm((Immediate & 0xFFFF));

  // The pseudo-instruction has been fully expanded and can be removed
  MI->eraseFromParent();
  return BB;
}
As the code shows, a 64-bit immediate is loaded by taking a NaplesPU64GPR register and writing the two 32-bit halves of the value, 16 bits at a time, to its sub-registers.
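The bit manipulation performed by the expansion can be reproduced outside LLVM. The sketch below is a standalone model of the same masks and shifts; Imm64Chunks, splitImm64 and joinImm64 are hypothetical names, and the field names follow the sub_odd and sub_even sub-register indices used in the lowering code.

```cpp
#include <cstdint>

// Hypothetical container for the four 16-bit immediates emitted by the expansion.
struct Imm64Chunks {
    std::uint16_t oddHigh;  // bits 63..48, first move into sub_odd
    std::uint16_t oddLow;   // bits 47..32, second move into sub_odd
    std::uint16_t evenHigh; // bits 31..16, first move into sub_even
    std::uint16_t evenLow;  // bits 15..0, second move into sub_even
};

// Same masks and shifts as the lowering code, applied to a plain integer.
Imm64Chunks splitImm64(std::int64_t imm) {
    const std::uint32_t upper = static_cast<std::uint32_t>((imm >> 32) & 0xffffffff);
    const std::uint32_t lower = static_cast<std::uint32_t>(imm & 0xffffffff);
    return {static_cast<std::uint16_t>((upper >> 16) & 0xFFFF),
            static_cast<std::uint16_t>(upper & 0xFFFF),
            static_cast<std::uint16_t>((lower >> 16) & 0xFFFF),
            static_cast<std::uint16_t>(lower & 0xFFFF)};
}

// Reassemble the chunks to check that no bits were lost.
std::uint64_t joinImm64(const Imm64Chunks &c) {
    return (static_cast<std::uint64_t>(c.oddHigh) << 48) |
           (static_cast<std::uint64_t>(c.oddLow) << 32) |
           (static_cast<std::uint64_t>(c.evenHigh) << 16) |
           static_cast<std::uint64_t>(c.evenLow);
}
```

For the immediate 0x0123456789ABCDEF the four chunks are 0x0123, 0x4567, 0x89AB and 0xCDEF, in the order in which the move instructions are emitted.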
Disassembler Support
Since new register classes are introduced, matching decode methods are implemented in NaplesPUDisassembler: DecodeGPR64RegisterClasses and DecodeVR512LRegisterClasses are added so that code using these classes can be disassembled.