Extending NaplesPU for 64-bit support

From NaplesPU Documentation
Revision as of 14:36, 14 May 2019 by Francesco (talk | contribs) (nu+ Backend Modifications)

The nu+ toolchain can be extended to support 64-bit operations. A git branch with full 64-bit support is provided: to build the toolchain with this extension, check out the llvm-7-64b branch.

The changes affect both the frontend and the backend.

nu+ Frontend Modifications

The nu+ frontend abstracts target information through the TargetInfo class, which is extended by the NuPlusTargetInfo implementation.

Since 64-bit operations require support for 64-bit integer and double-precision floating-point formats, the following changes and additions are required in the target definition:

 class LLVM_LIBRARY_VISIBILITY NuPlusTargetInfo : public TargetInfo {
   ...
 public:
   NuPlusTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
     : TargetInfo(Triple) {
     ...
     resetDataLayout("e-m:e-p:32:32-i64:64:64-i32:32:32-f32:32:32-f64:64:64");
     LongDoubleWidth = 64;
     LongDoubleAlign = 64;
     DoubleWidth = 64;
     DoubleAlign = 64;
     LongWidth = 64;
     LongAlign = 64;
     LongLongWidth = 64;
     LongLongAlign = 64;
   }
   ...
 };

nu+ Backend Modifications

This section describes the backend modifications required for 64-bit support.

Registers Definition

64-bit register support relies on the sub-register ("Sub-Reg") mechanism. Since nu+ registers are 32 bits wide, a 64-bit value is split across a pair of consecutive registers, S[i] and S[i+1] with i even:

  • The upper 32 bits are placed in the S[i] register;
  • The lower 32 bits are placed in the S[i+1] register.

The following class is declared in NuPlusRegisterInfo.td.

 class NuPlus64GPRReg<bits<16> Enc, string n, list<Register> subregs>
   : NuPlusRegWithSubRegs<Enc, n, subregs> {
   let SubRegIndices = [sub_even, sub_odd];
   let CoveredBySubRegs = 1;
 }

The 64-bit registers are instantiated as follows:

 foreach i = 0-28 in {
   def S#!shl(i, 1)#_S#!add(!shl(i, 1), 1) : NuPlus64GPRReg<!shl(i, 1), "s"#!shl(i, 1)#_64,
                [!cast<NuPlusGPRReg>("S"#!shl(i, 1)),
                 !cast<NuPlusGPRReg>("S"#!add(!shl(i, 1), 1))]>;
 }
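The names and encodings produced by this loop can be checked with a small standalone C++ sketch. It only mirrors the arithmetic of the TableGen expressions (!shl(i, 1) and !add(!shl(i, 1), 1)); the helper names pairName, pairEncoding and allPairs are hypothetical and are not part of the nu+ backend.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: name of the 64-bit pair generated for loop index i,
// mirroring S#!shl(i, 1)#_S#!add(!shl(i, 1), 1).
std::string pairName(unsigned i) {
    unsigned even = i << 1;   // !shl(i, 1)
    unsigned odd = even + 1;  // !add(!shl(i, 1), 1)
    return "S" + std::to_string(even) + "_S" + std::to_string(odd);
}

// Hypothetical helper: value passed as the Enc argument of NuPlus64GPRReg,
// which is the index of the even sub-register.
unsigned pairEncoding(unsigned i) { return i << 1; }

// Enumerate every pair produced by `foreach i = 0-28` (bounds are inclusive).
std::vector<std::string> allPairs() {
    std::vector<std::string> names;
    for (unsigned i = 0; i <= 28; ++i)
        names.push_back(pairName(i));
    return names;
}
```

Under these assumptions the loop yields 29 pairs, from S0_S1 (encoding 0) to S56_S57 (encoding 56).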

With 64-bit support in place, a 512-bit vector register can also be handled as eight lanes, each 64 bits wide.

 def VR512L : RegisterClass<"NuPlus", [v8i64, v8f64, v8i8, v8i16, v8i32], 512, (sequence "V%u", 0, 63)>;

Calling Convention

The nu+ calling conventions are modified to support the 64-bit registers. The 32-bit calling convention is extended by adding the following lines:

 CCIfType<[v8i8, v8i16, v8i32], CCPromoteToType<v8i64>>,
 
 CCIfNotVarArg<CCIfType<[i64, f64], CCAssignToReg<[S0_S1, S2_S3, S4_S5, S6_S7, S8_S9, S10_S11, S12_S13, S14_S15]>>>,
 
 CCIfNotVarArg<CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToReg<[V0, V1, V2, V3, V4, V5, V6, V7]>>>,

The lines above describe how parameters are passed in registers. Arguments that do not fit in registers are assigned to the stack as follows:

   CCIfType<[i64, f64], CCAssignToStack<8, 8>>,
   CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToStack<64, 64>>
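The combined effect of the register and stack rules for scalar 64-bit arguments can be sketched with a small standalone C++ model. It is not LLVM code: ArgLocation and assignI64Args are hypothetical names, and the model assumes a non-variadic call whose arguments are all i64 or f64, so that nothing else competes for registers or stack space and stack offsets start at 0.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of where one i64/f64 argument ends up.
struct ArgLocation {
    bool inRegister;
    std::string reg;       // meaningful when inRegister is true
    unsigned stackOffset;  // meaningful when inRegister is false
};

// Assign `count` i64/f64 arguments of a non-variadic call.
// Assumption: no other argument types are present.
std::vector<ArgLocation> assignI64Args(unsigned count) {
    // Register pairs listed in the CCAssignToReg rule for [i64, f64].
    static const char *const Pairs[] = {"S0_S1", "S2_S3",   "S4_S5",   "S6_S7",
                                        "S8_S9", "S10_S11", "S12_S13", "S14_S15"};
    const unsigned NumPairs = sizeof(Pairs) / sizeof(Pairs[0]);

    std::vector<ArgLocation> locs;
    unsigned nextOffset = 0;
    for (unsigned i = 0; i < count; ++i) {
        if (i < NumPairs) {
            locs.push_back({true, Pairs[i], 0});
        } else {
            // CCAssignToStack<8, 8>: 8-byte slot with 8-byte alignment.
            nextOffset = (nextOffset + 7u) & ~7u;
            locs.push_back({false, "", nextOffset});
            nextOffset += 8;
        }
    }
    return locs;
}
```

In this model the first eight arguments land in S0_S1 through S14_S15, and the ninth and tenth are placed at stack offsets 0 and 8.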

The return-value calling convention is extended as follows:

 CCIfType<[i64, f64], CCAssignToReg<[S0_S1, S2_S3, S4_S5, S6_S7, S8_S9, S10_S11]>>,
 CCIfType<[v8i8, v8i16, v8i32], CCPromoteToType<v8i64>>,
 CCIfType<[v16i32, v16f32, v8i64, v8f64], CCAssignToReg<[V0, V1, V2, V3, V4, V5]>>

ISA Support

As with the 32-bit ISA Support, 64-bit instructions are supported by extending the nu+ instruction hierarchy, adding a bit that marks an instruction as 64-bit.

As an example, in the following code the FR_TwoOp_Unmasked_64 class is derived from FR_TwoOp_Unmasked, setting the sixth parameter to 1. As described in the 32-bit ISA Support, this parameter indicates whether the instruction operates on 64-bit values.

 class FR_TwoOp_Unmasked_64<dag outs, dag ins, string asmstr, list<dag> pattern, bits<6> opcode, Fmt fmt2, Fmt fmt1, Fmt fmt0>
   : FR_TwoOp_Unmasked<outs, ins, asmstr, pattern, opcode, 1, fmt2, fmt1, fmt0> {}

Instruction Lowering

Since new instructions are defined, they often need custom lowering, so NuPlusISelLowering is extended in several places. For instance, since LoadI64 is a pseudo-instruction, it must be expanded as follows:

 case NuPlus::LoadI64:
   return EmitLoadI64(&MI, BB);
 ...
 MachineBasicBlock *
 NuPlusTargetLowering::EmitLoadI64(MachineInstr *MI,
                                   MachineBasicBlock *BB) const {
 
   DebugLoc DL = MI->getDebugLoc();
   const TargetInstrInfo *TII = Subtarget.getInstrInfo();
   MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
 
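   // Read the destination 64-bit register and the 64-bit immediate operand
   unsigned DstReg = MI->getOperand(0).getReg();
   int64_t ImmOp = MI->getOperand(1).getImm();
 
   // Upper 32 bits of the immediate, written through sub_odd below
   unsigned Immediate = ((ImmOp >> 32) & 0xffffffff);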
 
   BuildMI(*BB, MI, DL, TII->get(NuPlus::MOVEIHSI))
               .addReg(DstReg, RegState::Define, NuPlus::sub_odd)
               .addImm(((Immediate >> 16) & 0xFFFF));
   BuildMI(*BB, MI, DL, TII->get(NuPlus::MOVEILSI))
               .addReg(DstReg, 0, NuPlus::sub_odd)
               .addImm((Immediate & 0xFFFF));
 
   // Lower 32 bits of the immediate, written through sub_even below
   Immediate = (ImmOp & 0xffffffff);

   BuildMI(*BB, MI, DL, TII->get(NuPlus::MOVEIHSI))
               .addReg(DstReg, 0, NuPlus::sub_even)
               .addImm(((Immediate >> 16) & 0xFFFF));
   BuildMI(*BB, MI, DL, TII->get(NuPlus::MOVEILSI))
               .addReg(DstReg, 0, NuPlus::sub_even)
               .addImm((Immediate & 0xFFFF));
 
   MI->eraseFromParent();
 
   return BB;
 }

As the listing shows, a 64-bit immediate is loaded by taking a NuPlus64GPR register and writing the two 32-bit halves of the value to its sub-registers. Each half is written by a MOVEIHSI/MOVEILSI pair, which receive its upper and lower 16 bits respectively.
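The immediate arithmetic used by the expansion can be reproduced outside LLVM. The following standalone C++ sketch applies the same shifts and masks as the listing to obtain the four 16-bit values passed to the MOVEIHSI and MOVEILSI instructions. Imm64Chunks, splitImm64 and joinImm64 are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the four 16-bit immediates of the expansion.
struct Imm64Chunks {
    uint16_t upperHigh;  // bits 63..48, MOVEIHSI operand for the upper half
    uint16_t upperLow;   // bits 47..32, MOVEILSI operand for the upper half
    uint16_t lowerHigh;  // bits 31..16, MOVEIHSI operand for the lower half
    uint16_t lowerLow;   // bits 15..0,  MOVEILSI operand for the lower half
};

// Split a 64-bit immediate with the same shifts and masks as the listing.
Imm64Chunks splitImm64(int64_t imm) {
    uint64_t bits = static_cast<uint64_t>(imm);
    uint32_t upper = static_cast<uint32_t>((bits >> 32) & 0xffffffffu);
    uint32_t lower = static_cast<uint32_t>(bits & 0xffffffffu);

    Imm64Chunks c;
    c.upperHigh = static_cast<uint16_t>((upper >> 16) & 0xFFFF);
    c.upperLow = static_cast<uint16_t>(upper & 0xFFFF);
    c.lowerHigh = static_cast<uint16_t>((lower >> 16) & 0xFFFF);
    c.lowerLow = static_cast<uint16_t>(lower & 0xFFFF);
    return c;
}

// Reassemble the chunks, to check that no bits are lost in the split.
uint64_t joinImm64(const Imm64Chunks &c) {
    return (static_cast<uint64_t>(c.upperHigh) << 48) |
           (static_cast<uint64_t>(c.upperLow) << 32) |
           (static_cast<uint64_t>(c.lowerHigh) << 16) |
           static_cast<uint64_t>(c.lowerLow);
}
```

For the immediate 0x1122334455667788, the four chunks are 0x1122, 0x3344, 0x5566 and 0x7788.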

Disassembler Support

Since new register classes are introduced, the corresponding decode methods are implemented in NuPlusDisassembler: DecodeGPR64RegisterClasses and DecodeVR512LRegisterClasses are added so that code using these registers can be disassembled.