NaplesPU LLVM Documentation
The main task of the backend is to generate nu+ assembly code from the LLVM IR produced by the Clang frontend. It also provides the object representation of the classes needed to build the assembler and the disassembler. The nu+ backend is contained in the NuPlus folder under the "compiler/lib/Target" directory. It contains several files, each implementing a specific class of the LLVM framework.
An LLVM backend is built from two types of files: C++ and TableGen source files. Refer to the TableGen section for a detailed explanation of the latter.
Required reading
Before working on LLVM, you should be familiar with the following compiler concepts:
- Basic Blocks
- SSA (Static Single Assignment) form
- AST (Abstract Syntax Tree)
- DAG (Directed Acyclic Graph)
In addition to general aspects about compilers, it is recommended to review the following topics:
For further information, see the textbooks Getting Started with LLVM Core Libraries and LLVM Cookbook.
See also this article to get an overview of the main CodeGenerator phases.
TableGen
TableGen is a record-oriented language used to describe target-specific information. It was developed by the LLVM team to simplify backend development and to avoid code redundancy. For example, if some feature of the target register file changes, you do not need to modify every file in which the register appears; you only need to modify the .td file that contains its definition. TableGen is used to define instruction formats, instructions, registers, pattern-matching DAGs, instruction selection matching order, calling conventions, and target platform properties.
For further information, see the TableGen documentation.
Backend Description
This section shows how the backend support for nu+ is implemented within LLVM.
Target Definition
The target-specific information is described in TableGen files. The custom target is defined by creating a new NuPlus.td file, in which the target itself is described. This file contains the implementation of the target-independent interfaces provided by Target.td. The implementations rely on the TableGen class inheritance mechanism.
The code below is the Target class definition, provided by Target.td, that NuPlus.td should implement.
class Target {
  InstrInfo InstructionSet;
  list<AsmParser> AssemblyParsers = [DefaultAsmParser];
  list<AsmParserVariant> AssemblyParserVariants = [DefaultAsmParserVariant];
  list<AsmWriter> AssemblyWriters = [DefaultAsmWriter];
}
This file should also include the other defined .td target-related files. The target definition is done as follows:
def : Processor<"nuplus", NoItineraries, []>;
where Processor is a class defined in Target.td:
class Processor<string n, ProcessorItineraries pi, list<SubtargetFeature> f>
where:
- n is the chipset name, used in the command line option -mcpu to determine the appropriate chip.
- pi is the processor itinerary, as described in the theoretical description of the LLVM instruction scheduling phase. NoItineraries means that no itinerary is defined.
- f is a list of target features.
Registers Definition
The target-specific registers are defined in NuPlusRegisterInfo.td. LLVM provides two ways to define a register, both of them declared in Target.td. The first one is used to define a simple scalar register and follows the declaration below:
class Register<string n, list<string> altNames = []>
where n is the register name, while altNames is a list of register alternative names.
The second method to define a register is to inherit from the class declared below:
class RegisterWithSubRegs<string n, list<Register> subregs>
This second way is used to define a register that is a collection of sub-registers. There is also a third way to define registers. It consists of defining a super-register, that is, a pseudo-register resulting from the combination of other sub-registers. It is useful when there is no architectural support for larger registers:
class RegisterTuples<list<SubRegIndex> Indices, list<dag> Regs>
For example, the following code declares a class for 32-bit registers:
class MyTargetGPRReg<string n> : Register<n>;
At this point, registers can be instantiated as follows:
foreach i = 0-57 in {
  def S#i : MyTargetGPRReg<"s"#i>, DwarfRegNum<[i]>;
}
...
def SP_REG : MyTargetGPRReg<"sp">, DwarfRegNum<[61]>; // stack pointer
...
foreach i = 0-63 in {
  def V#i : MyTargetGPRReg<"v"#i>, DwarfRegNum<[!add(i, 64)]>;
}
The instantiation reflects the hardware architecture of the custom target, with a set of scalar registers and a set of vector registers. Each register also inherits from DwarfRegNum, which assigns it an incremental number. This number is used to identify registers internally, consistently with the DWARF standard.
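The numbering used in the listing above can be checked with a few lines of C++. This is only a sketch of the arithmetic: the helper names (scalarDwarfNum, vectorDwarfNum, dwarfNumbersAreUnique) are hypothetical and are not part of the backend, and only the registers visible in the listing are covered, since the elided definitions are unknown.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: scalar register Si (i = 0..57) gets DWARF number i,
// as in DwarfRegNum<[i]>.
int scalarDwarfNum(int i) {
    assert(i >= 0 && i <= 57);
    return i;
}

// Hypothetical helper: vector register Vi (i = 0..63) gets DWARF number
// 64 + i, as in DwarfRegNum<[!add(i, 64)]>.
int vectorDwarfNum(int i) {
    assert(i >= 0 && i <= 63);
    return 64 + i;
}

// DWARF number of the stack pointer, taken from the SP_REG definition.
constexpr int kStackPointerDwarfNum = 61;

// Returns true when all registers shown in the listing have distinct numbers.
bool dwarfNumbersAreUnique() {
    std::set<int> seen;
    for (int i = 0; i <= 57; ++i)
        if (!seen.insert(scalarDwarfNum(i)).second) return false;
    if (!seen.insert(kStackPointerDwarfNum).second) return false;
    for (int i = 0; i <= 63; ++i)
        if (!seen.insert(vectorDwarfNum(i)).second) return false;
    return true;
}
```

With this scheme the scalar registers occupy the numbers below 64 and the vector registers the range 64..127, so the two sets cannot collide.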
Once the registers are defined, they must be grouped into register classes, which are used to drive their allocation.
LLVM provides the RegisterClass class, defined as follows:
class RegisterClass<string namespace, list<ValueType> regTypes, int alignment, dag regList>
in which:
- namespace is the namespace associated with the class;
- regTypes is a list of ValueType values indicating the types of the variables that can be allocated to the registers of the class;
- alignment is the alignment associated with the registers in the RegisterClass;
- regList is the list of registers that belong to the class. Since the parameter type is dag, TableGen provides set operators, such as add and sequence, to build the register list.
For example, to match our target features, two classes are necessary:
- GPR32, the abstraction of the 32-bit wide scalar registers;
- VR512W, the abstraction of the 512-bit wide vector registers.
def GPR32 : RegisterClass<"MyTarget", [i32, f32, i64, f64], 32, (add (sequence "S%u", 0, 57),
TR_REG, MR_REG, FP_REG, SP_REG, RA_REG, PC_REG)>;
def VR512W : RegisterClass<"MyTarget", [v16i32, v16f32, v16i8, v16i16], 512, (sequence "V%u", 0, 63)>;
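To make the effect of the sequence and add operators concrete, the following C++ sketch expands the two register lists above into plain lists of names. The functions expandSequence, unionOf, gpr32Members and vr512wMembers are hypothetical illustrations, not TableGen or LLVM APIs, and the model assumes that add behaves as an ordered union.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Models (sequence "<prefix>%u", first, last): one name per index, inclusive.
std::vector<std::string> expandSequence(const std::string& prefix,
                                        unsigned first, unsigned last) {
    std::vector<std::string> regs;
    for (unsigned i = first; i <= last; ++i)
        regs.push_back(prefix + std::to_string(i));
    return regs;
}

// Models (add a, b) as an ordered union: members of a, then the members of b
// that are not already present (assumption made for this sketch).
std::vector<std::string> unionOf(std::vector<std::string> lhs,
                                 const std::vector<std::string>& rhs) {
    for (const auto& reg : rhs)
        if (std::find(lhs.begin(), lhs.end(), reg) == lhs.end())
            lhs.push_back(reg);
    return lhs;
}

// Members of GPR32 as written in the listing above.
std::vector<std::string> gpr32Members() {
    const std::vector<std::string> special = {"TR_REG", "MR_REG", "FP_REG",
                                              "SP_REG", "RA_REG", "PC_REG"};
    return unionOf(expandSequence("S", 0, 57), special);
}

// Members of VR512W as written in the listing above.
std::vector<std::string> vr512wMembers() {
    return expandSequence("V", 0, 63);
}
```

Under these assumptions GPR32 contains the 58 registers S0..S57 followed by the six special-purpose registers, and VR512W contains V0..V63.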
Calling Convention
This section shows how to define the calling convention, that is, how parameters are passed to sub-functions and how the return value is sent back to the caller.
The calling convention is defined in NuPlusCallingConv.td by using the classes defined in the TargetCallingConv.td file.
For our purposes, the calling convention must be defined in terms of the registers used to pass the arguments to the callee. LLVM provides the CallingConv class defined below:
class CallingConv<list<CCAction> actions>
This class requires a list of CCAction. TargetCallingConv.td contains a set of CCAction derived classes that must be used to define the sub-function calling behaviour.
The calling convention for the custom target device passes the first eight parameters in registers and the remaining ones on the stack. This means that 32-bit variables are passed to the callee in the registers Si, where i = 0..7. The same scheme is adopted for vector variables, which use the registers Vi.
The calling convention should also handle the passing of types that are not natively supported by the target. In this case, the solution adopted is type promotion. The mechanism is simple: it consists of promoting the unsupported type to a supported one. It can be implemented using only the CCAction classes provided by LLVM. With this approach, i1, i8 and i16 are promoted to i32, while v16i8 and v16i16 are promoted to v16i32.
def CC_NuPlus : CallingConv<[
  CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,
  CCIfType<[v16i8, v16i16], CCPromoteToType<v16i32>>,
  CCIfNotVarArg<CCIfType<[i32, f32], CCAssignToReg<[S0, S1, S2, S3, S4, S5, S6, S7]>>>,
  CCIfNotVarArg<CCIfType<[v16i32, v16f32], CCAssignToReg<[V0, V1, V2, V3, V4, V5, V6, V7]>>>,
  CCIfType<[i32, f32], CCAssignToStack<4, 4>>,
  CCIfType<[v16i32, v16f32], CCAssignToStack<64, 64>>
]>;
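The rules in CC_NuPlus are applied in order to each argument. The C++ sketch below models that behaviour for illustration only: it is not LLVM code, the names ValueType, ArgLocation, promote and assignArguments are hypothetical, and stack offsets are assumed to start at 0 and to be aligned to the slot size given in CCAssignToStack.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ValueType { i1, i8, i16, i32, f32, v16i8, v16i16, v16i32, v16f32 };

struct ArgLocation {
    bool inRegister;       // true: register, false: stack slot
    std::string reg;       // register name when inRegister is true
    unsigned stackOffset;  // byte offset in the argument area otherwise
};

// Mirrors the two CCPromoteToType rules.
ValueType promote(ValueType t) {
    switch (t) {
    case ValueType::i1: case ValueType::i8: case ValueType::i16:
        return ValueType::i32;
    case ValueType::v16i8: case ValueType::v16i16:
        return ValueType::v16i32;
    default:
        return t;
    }
}

// After promotion only v16i32 and v16f32 are vector types.
bool isVector(ValueType t) {
    return t == ValueType::v16i32 || t == ValueType::v16f32;
}

// Assigns each argument to S0..S7 / V0..V7 when the function is not variadic,
// and to a 4-byte or 64-byte aligned stack slot otherwise.
std::vector<ArgLocation> assignArguments(const std::vector<ValueType>& args,
                                         bool isVarArg) {
    std::vector<ArgLocation> locs;
    unsigned nextScalar = 0, nextVector = 0, stackSize = 0;
    for (ValueType raw : args) {
        const ValueType t = promote(raw);
        const bool vec = isVector(t);
        unsigned& nextReg = vec ? nextVector : nextScalar;
        if (!isVarArg && nextReg < 8) {
            locs.push_back({true, std::string(vec ? "V" : "S") +
                                      std::to_string(nextReg), 0u});
            ++nextReg;
            continue;
        }
        const unsigned size = vec ? 64u : 4u;
        stackSize = (stackSize + size - 1) / size * size;  // align the slot
        locs.push_back({false, std::string(), stackSize});
        stackSize += size;
    }
    return locs;
}
```

For a non-variadic call with ten i32 arguments, this model places the first eight in S0..S7 and the last two at stack offsets 0 and 4. Scalar and vector arguments consume registers independently, and for a variadic function every argument goes to the stack because both register rules are guarded by CCIfNotVarArg.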
For return values, the nu+ calling convention passes the results back to the caller in the first six registers.
The return type may also not correspond to any native target type. As for the arguments, the solution is type promotion.
def RetCC_NuPlus32 : CallingConv<[
  CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,
  CCIfType<[i32, f32], CCAssignToReg<[S0, S1, S2, S3, S4, S5]>>
]>;
LLVM also provides a mechanism to define the registers that the callee must save before starting the function execution. The mechanism consists of defining an instance of the class CalleeSavedRegs:
class CalleeSavedRegs<dag saves>
where saves is the list of registers to be saved.
For example, in nu+, the callee-saved registers are defined as follows:
def MyTargetCSR : CalleeSavedRegs<(add MR_REG, FP_REG, RA_REG)>;
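As a rough illustration of how such a list is typically used, the C++ sketch below selects the registers a function would have to spill in its prologue and reload in its epilogue, assuming that only the callee-saved registers the function actually modifies need to be preserved. The names kCalleeSaved and registersToSave are hypothetical and do not correspond to LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Callee-saved list copied from the CalleeSavedRegs definition above.
const std::vector<std::string> kCalleeSaved = {"MR_REG", "FP_REG", "RA_REG"};

// Returns the callee-saved registers that appear in the set of registers
// modified by the function, in the order of the callee-saved list.
std::vector<std::string> registersToSave(
        const std::vector<std::string>& modified) {
    std::vector<std::string> toSave;
    for (const auto& reg : kCalleeSaved)
        if (std::find(modified.begin(), modified.end(), reg) != modified.end())
            toSave.push_back(reg);
    return toSave;
}
```

In this model, a function that modifies S0, RA_REG and FP_REG would save FP_REG and RA_REG, while a function that only touches S3 and V1 would save nothing.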