Difference between revisions of "NaplesPU LLVM Documentation"
Revision as of 19:08, 30 March 2019
The main task of the backend is to generate nu+ assembly code from the LLVM IR obtained by the Clang frontend. It also handles object representation of classes needed to create the assembler and the disassembler. The nu+ backend is contained in the NuPlus folder under "compiler/lib/Target" directory. It contains several files, each implementing a specific class of the LLVM Framework.
An LLVM backend is built from two types of source files: C++ and TableGen. Refer to the TableGen section for a detailed explanation of the latter.
Contents
Required reading
Before working on LLVM, you should be familiar with some basic compiler concepts. In particular:
- Basic Blocks
- SSA (Static Single Assignment) form
- AST (Abstract Syntax Tree)
- DAG (Directed Acyclic Graph)
In addition to general aspects about compilers, it is recommended to review the following topics:
For further information, see the textbooks Getting Started with LLVM Core Libraries and LLVM Cookbook.
See also this article to get an overview of the main CodeGenerator phases.
TableGen
TableGen is a record-oriented language used to describe target-specific information. It was written by the LLVM team to simplify backend development and to avoid code redundancy. For example, if some feature of the target-specific register file changes, you do not need to modify every file in which the register appears; you only need to modify the .td file that contains its definition. TableGen is used to define instruction formats, instructions, registers, pattern-matching DAGs, instruction selection matching order, calling conventions, and target platform properties.
For further information, check the TableGen Documentation.
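As a minimal sketch of the record-oriented style described above, the fragment below declares a class and derives two records from it. The names DemoReg, R0 and R1 are hypothetical and are not part of the nu+ backend.

```tablegen
// Hypothetical records for illustration only; not part of the nu+ backend.
class DemoReg<string n, int width> {
  string Name = n;    // field filled from the first template argument
  int Width = width;  // field filled from the second template argument
}

// Each def instantiates the class into a concrete record.
def R0 : DemoReg<"r0", 32>;
def R1 : DemoReg<"r1", 32>;
```

Because both records take their fields from DemoReg, a change to the class is picked up by every record derived from it, which is the redundancy-avoiding property the paragraph above refers to.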
Backend Description
This section shows how the backend support for nu+ is implemented within LLVM.
Target Definition
The target-specific information is described in TableGen files. The custom target is defined by creating a new NuPlus.td file, in which the target itself is described. This file contains the implementation of the target-independent interfaces provided by Target.td. These interfaces are implemented through the class inheritance mechanism.
The code below is the Target class definition that should be implemented in NuPlus.td.
class Target {
InstrInfo InstructionSet;
list<AsmParser> AssemblyParsers = [DefaultAsmParser];
list<AsmParserVariant> AssemblyParserVariants = [DefaultAsmParserVariant];
list<AsmWriter> AssemblyWriters = [DefaultAsmWriter];
}
This file should also include the other defined .td target-related files. The target definition is done as follows:
def : Processor<"nuplus", NoItineraries, []>;
where Processor is a class defined in Target.td:
class Processor<string n, ProcessorItineraries pi, list<SubtargetFeature> f>
where:
- n is the chipset name, used in the command line option -mcpu to select the appropriate chip.
- pi is the processor itinerary, as described in the theoretical description of the LLVM instruction scheduling phase. NoItineraries means that no itinerary is defined.
- f is a list of target features.
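Putting the pieces above together, a top-level NuPlus.td could look roughly like the sketch below. The record names NuPlusInstrInfo and NuPlus and the file name NuPlusInstrInfo.td are assumptions made for illustration; only NuPlusRegisterInfo.td and the Processor definition come from this page.

```tablegen
// Sketch of a top-level NuPlus.td; names not quoted in the text are assumptions.
include "llvm/Target/Target.td"

// Other target-related .td files (NuPlusInstrInfo.td is an assumed file name).
include "NuPlusRegisterInfo.td"
include "NuPlusInstrInfo.td"

// Instruction set record referenced by the Target (assumed name).
def NuPlusInstrInfo : InstrInfo;

// Processor selected with -mcpu=nuplus: no itineraries, no subtarget features.
def : Processor<"nuplus", NoItineraries, []>;

// The target inherits from Target and overrides only the fields it needs;
// the parser and writer lists keep their defaults from the class definition.
def NuPlus : Target {
  let InstructionSet = NuPlusInstrInfo;
}
```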
Registers Definition
The target-specific registers are defined in NuPlusRegisterInfo.td. LLVM provides two ways to define a register, both of them declared in Target.td. The first one is used to define a simple scalar register and follows the declaration below:
class Register<string n, list<string> altNames = []>
where n is the register name, while altNames is a list of register alternative names.
The second method to define a register is to inherit from the class declared below:
class RegisterWithSubRegs<string n, list<Register> subregs>
This second way is used when a register must be defined as a collection of sub-registers, listed in subregs. There is also a third way to define registers. It consists of defining a super-register, that is, a pseudo-register resulting from the combination of other sub-registers. It is useful when there is no architectural support for larger registers:
class RegisterTuples<list<SubRegIndex> Indices, list<dag> Regs>
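The following sketch illustrates the second and third forms on a hypothetical set of six 32-bit registers. All names (DemoReg, sub_lo, sub_hi, D2, SPairs) are assumptions chosen for the example and do not come from the nu+ sources.

```tablegen
include "llvm/Target/Target.td"

// Hypothetical 32-bit scalar registers S0..S5.
class DemoReg<string n> : Register<n>;
foreach i = 0-5 in {
  def S#i : DemoReg<"s"#i>;
}

// Sub-register indices: size in bits, then offset in bits.
def sub_lo : SubRegIndex<32>;      // bits 0-31
def sub_hi : SubRegIndex<32, 32>;  // bits 32-63

// RegisterWithSubRegs: one 64-bit register explicitly composed of S4 and S5.
def D2 : RegisterWithSubRegs<"d2", [S4, S5]> {
  let SubRegIndices = [sub_lo, sub_hi];
}

// RegisterTuples: synthesizes the pairs (S0, S1) and (S2, S3) from two
// parallel lists, one list per sub-register index.
def SPairs : RegisterTuples<[sub_lo, sub_hi],
                            [(add S0, S2), (add S1, S3)]>;
```

With RegisterWithSubRegs each combined register is written out by hand, whereas RegisterTuples generates one super-register per position in the parallel lists.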
For example, the following code declares a 32-bit register class:
class NuPlus32GPRReg<string n> : Register <n>;
At this point, registers can be instantiated as follows (in this listing the register class is referred to by the generic name MyTargetGPRReg):
foreach i = 0-57 in {
def S#i : MyTargetGPRReg<"s"#i>, DwarfRegNum<[i]>;
}
...
def SP_REG : MyTargetGPRReg<"sp">, DwarfRegNum<[61]>; //stack pointer
...
foreach i = 0-63 in {
def V#i : MyTargetGPRReg<"v"#i>, DwarfRegNum<[!add(i, 64)]>;
}
The instantiation reflects the custom target hardware architecture, with a set of scalar registers and a set of vector registers. Each register also inherits from the DwarfRegNum class, which assigns it a unique number: 0 to 57 for the scalar registers S0 to S57, and 64 to 127 for the vector registers V0 to V63. These numbers identify the registers in debug information, consistent with the DWARF standard.
Now that the registers are defined, they must be grouped into classes, which the register allocator uses to decide which registers can hold which values.
LLVM provides the RegisterClass class, declared as follows:
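To make the numbering concrete, the fragment below writes out by hand the first records that the foreach loops produce. The definition of MyTargetGPRReg is an assumption that mirrors the NuPlus32GPRReg declaration shown earlier.

```tablegen
include "llvm/Target/Target.td"

// Assumed definition, mirroring the NuPlus32GPRReg declaration above.
class MyTargetGPRReg<string n> : Register<n>;

// Hand-expanded equivalent of the first iterations of the scalar loop.
def S0 : MyTargetGPRReg<"s0">, DwarfRegNum<[0]>;
def S1 : MyTargetGPRReg<"s1">, DwarfRegNum<[1]>;
// ... up to S57 with DwarfRegNum<[57]>

// Hand-expanded equivalent of the first iterations of the vector loop.
def V0 : MyTargetGPRReg<"v0">, DwarfRegNum<[64]>;  // !add(0, 64)
def V1 : MyTargetGPRReg<"v1">, DwarfRegNum<[65]>;  // !add(1, 64)
// ... up to V63 with DwarfRegNum<[127]>
```

The offset of 64 keeps the vector register numbers from colliding with the numbers used by the scalar and special-purpose registers.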
class RegisterClass<string namespace, list<ValueType> regTypes, int alignment, dag regList>
in which:
- namespace is the namespace associated with the class;
- regTypes is a list of ValueType values indicating the types of values that can be allocated to the registers of the class;
- alignment is the alignment, in bits, required for the registers in the RegisterClass when they are stored to or loaded from memory;
- regList is the list of registers that belong to the class. Since the parameter type is dag, TableGen provides set operators, such as add and sequence, to build the register list.
For example, to match our target features it is necessary to define two classes:
- GPR32, the abstraction of the 32-bit wide scalar registers;
- VR512W, the abstraction of the 512-bit wide vector registers.
def GPR32 : RegisterClass<"MyTarget", [i32, f32, i64, f64], 32, (add (sequence "S%u", 0, 57),
TR_REG, MR_REG, FP_REG, SP_REG, RA_REG, PC_REG)>;
def VR512W : RegisterClass<"MyTarget", [v16i32, v16f32, v16i8, v16i16], 512, (sequence "V%u", 0, 63)>;
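The set operators used in regList can be seen in isolation in the sketch below, which builds three classes over a hypothetical four-register target. The names DemoReg, DemoSeq, DemoList and DemoAll and the "Demo" namespace are assumptions for illustration only.

```tablegen
include "llvm/Target/Target.td"

// Hypothetical registers S0..S3 plus one named register.
class DemoReg<string n> : Register<n>;
foreach i = 0-3 in {
  def S#i : DemoReg<"s"#i>;
}
def SP_REG : DemoReg<"sp">;

// sequence generates the register names S0, S1, S2, S3 from a format string.
def DemoSeq  : RegisterClass<"Demo", [i32], 32, (sequence "S%u", 0, 3)>;

// The same set of registers, listed explicitly with add.
def DemoList : RegisterClass<"Demo", [i32], 32, (add S0, S1, S2, S3)>;

// add also concatenates sets: the sequence followed by one named register,
// which is the shape used by GPR32 above.
def DemoAll  : RegisterClass<"Demo", [i32], 32,
                             (add (sequence "S%u", 0, 3), SP_REG)>;
```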