NaplesPUInstrFormats.td

From NaplesPU Documentation
Revision as of 17:52, 31 October 2017 by Catello (talk | contribs)

NuPlusInstrFormats.td and NuPlusInstrInfo.td describe the nu+ instructions and the patterns used to transform LLVM IR into machine code. NuPlusInstrFormats.td contains the classes that describe the nu+ instruction formats, support classes that facilitate instruction definition, and the node definitions that make pattern recognition easier.

The files "compiler/include/llvm/Target/Target.td" and "compiler/include/llvm/Target/TargetSelectionDAG.td" contain the TableGen classes used for the description.

Instruction Formats

An instruction is specified in TableGen by the class Instruction (compiler/include/llvm/Target/Target.td), which contains, among others, the following fields:

  • Output operands (dag OutOperandList;), this contains the output value(s) defined by the instruction as a result of its computation;
  • Input operands (dag InOperandList;), this holds all the input value(s) used by the instruction as its input operands;
  • Assembly string (string AsmString = "";), this stores the string that is recognized by the assembler or that is printed by the disassembler;
  • DAG pattern (list<dag> Pattern;), this is the DAG pattern of machine-independent SelectionDAG nodes that is matched by the instruction selector to produce an instance of the corresponding target-specific instruction.

The class also provides flags that capture information about the high-level semantics of the instruction. The ones used in the nu+ back-end are:

  • isBranch, equal to 1 if the instruction is a branch.
  • isIndirectBranch, equal to 1 if the instruction is an indirect branch.
  • isBarrier, equal to 1 if control flow cannot fall through the instruction (for example, an unconditional branch).
  • isTerminator, equal to 1 if the instruction is part of the terminator of a basic block.
  • isPseudo, equal to 1 if the instruction is a pseudo instruction.
  • isCodeGenOnly, equal to 1 if the instruction is a pseudo instruction used for codegen modeling purposes. The intended use is:
    • isPseudo: Does not have encoding information and should be expanded, at the latest, during lowering to MCInst.
    • isCodeGenOnly: Does have encoding information and can go through to the CodeEmitter unchanged, but duplicates a canonical instruction definition's encoding and should be ignored when constructing the assembler match tables.

  • isAsmParserOnly, equal to 1 if the instruction is a pseudo instruction for use by the assembler parser. The disassembler then does not use the asm string of the instruction. This is useful when two or more instructions share the same encoding and would otherwise generate a disassembler conflict.
  • isReturn, equal to 1 if the instruction is a return.
  • isCall, equal to 1 if the instruction is a function call.

An instruction can also specify if it reads or writes non-operand registers, by specifying the registers in the Uses and Defs fields. The former contains a list of registers that the instruction uses (reads), the latter contains a list of registers that the instruction defines (writes).

To handle the complexity of the nu+ ISA, a hierarchy of classes has been created. Each level of the hierarchy refines one aspect of the nu+ instruction formats. For example, the FR_TwoOp_Unmasked_32 class refines the FR class, providing an easy way to define unmasked instructions of type R that take two 32-bit operands.

The instruction format classes are then used to create instruction multiclasses, so that all the possible variants are generated from a single instruction definition. An example is the FArithInt_TwoOp multiclass, which is used for arithmetic instructions with two integer operands. When a FArithInt_TwoOp instruction is defined, TableGen automatically instantiates all the possible variations according to the classes contained in the multiclass definition.

However, there is also a Pseudo class which can be used for nodes that cannot be translated into machine nodes through a pattern but require other transformations.

Pattern Fragments

The file also defines custom pattern fragments (the default ones are in TargetSelectionDAG.td) that help LLVM match LLVM IR patterns. A pattern fragment, represented by the PatFrag class, can match anything in the DAG, from a single node to several nested fragments. It is defined by its input operands, the DAG fragment to match, an optional predicate that the match must satisfy (none by default), and an optional transformation applied through an SDNodeXForm (NOOP_SDNodeXForm by default).


class PatFrag<dag ops, dag frag, code pred = [{}],
              SDNodeXForm xform = NOOP_SDNodeXForm> : SDPatternOperator {
  dag Operands = ops;
  dag Fragment = frag;
  code PredicateCode = pred;
  code ImmediateCode = [{}];
  SDNodeXForm OperandTransform = xform;
}

As an example, let's consider the nu+ load-store pattern fragments. Since nu+ has two address spaces, the main memory and the scratchpad memory, these pattern fragments are used to detect where loads and stores are directed. This is done by specifying a predicate (written in C++) that checks the associated address space.

def MemStore : PatFrag<(ops node:$val, node:$ptr),
                       (store node:$val, node:$ptr), [{
               if(cast<StoreSDNode>(N)->getAddressSpace() != 77)
                  return !cast<StoreSDNode>(N)->isTruncatingStore();
               else
                  return false;}]>;

def ScratchpadStore : PatFrag<(ops node:$val, node:$ptr),
                              (store node:$val, node:$ptr), [{
               if(cast<StoreSDNode>(N)->getAddressSpace() == 77)
                 return !cast<StoreSDNode>(N)->isTruncatingStore();
               else
                 return false;}]>;
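
The logic of the two predicates can be tried out without LLVM. The following is a minimal sketch in plain C++, where StoreInfo, matchesMemStore and matchesScratchpadStore are hypothetical names introduced only for illustration and are not part of the nu+ backend. The only value taken from the fragments above is the address space number 77.

```cpp
// Address space number tested by the predicates above.
constexpr unsigned kScratchpadAddressSpace = 77;

// Hypothetical stand-in for the two StoreSDNode properties the predicates read.
struct StoreInfo {
  unsigned addressSpace;  // what getAddressSpace() would return
  bool truncating;        // what isTruncatingStore() would return
};

// Same condition as MemStore: a non-truncating store outside address space 77.
constexpr bool matchesMemStore(StoreInfo s) {
  return s.addressSpace != kScratchpadAddressSpace && !s.truncating;
}

// Same condition as ScratchpadStore: a non-truncating store to address space 77.
constexpr bool matchesScratchpadStore(StoreInfo s) {
  return s.addressSpace == kScratchpadAddressSpace && !s.truncating;
}

// A non-truncating store is accepted by exactly one of the two fragments.
static_assert(matchesMemStore(StoreInfo{0, false}), "main memory store");
static_assert(!matchesScratchpadStore(StoreInfo{0, false}), "not a scratchpad store");
static_assert(matchesScratchpadStore(StoreInfo{77, false}), "scratchpad store");
static_assert(!matchesMemStore(StoreInfo{77, false}), "not a main memory store");

// Truncating stores are rejected by both fragments.
static_assert(!matchesMemStore(StoreInfo{0, true}), "truncating store rejected");
static_assert(!matchesScratchpadStore(StoreInfo{77, true}), "truncating store rejected");
```

Since the two conditions differ only in the address space test, no store can match both fragments, and truncating stores match neither.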

Other useful classes are derived from PatFrag, such as the OutPatFrag class and the PatLeaf class. OutPatFrag is a pattern fragment that has no predicates or transforms and is used to avoid repeated subexpressions in output patterns. PatLeaf is a pattern fragment that has no operands and is used as a helper.

class OutPatFrag<dag ops, dag frag>
        : PatFrag<ops, frag, [{}], NOOP_SDNodeXForm>;

class PatLeaf<dag frag, code pred = [{}], SDNodeXForm xform = NOOP_SDNodeXForm>
        : PatFrag<(ops), frag, pred, xform>;

In the nu+ backend the OutPatFrag class is used to help with the extraction and insertion of sub-registers.

def GetEvenReg: OutPatFrag<(ops node:$Rs),
                           (EXTRACT_SUBREG (i64 $Rs), sub_even)>;

def GetOddReg: OutPatFrag<(ops node:$Rs),
                          (EXTRACT_SUBREG (i64 $Rs), sub_odd)>;

def SetEvenReg: OutPatFrag<(ops node:$Rs),
                           (i64 (SUBREG_TO_REG (i64 0), (i32 $Rs), sub_even))>;

def SetOddReg: OutPatFrag<(ops node:$Rs),
                           (i64 (SUBREG_TO_REG (i64 0), (i32 $Rs), sub_odd))>;

In the nu+ backend PatLeaf is used to define 16-bit and 9-bit immediates.

def simm16 : PatLeaf<(imm), [{ return isInt<16>(N->getSExtValue()); }]>;

def simm9 : PatLeaf<(imm), [{ return isInt<9>(N->getSExtValue()); }]>;
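
To see which immediates these two leaves accept, the range check can be written out by hand. The sketch below assumes that isInt<N> tests whether a value is representable as an N-bit signed (two's complement) integer. fitsSigned is a hypothetical re-implementation written for this page, not the LLVM function itself.

```cpp
#include <cstdint>

// Returns true when x is representable as an N-bit two's complement integer,
// that is, when -2^(N-1) <= x < 2^(N-1).
template <unsigned N>
constexpr bool fitsSigned(std::int64_t x) {
  static_assert(N > 0 && N < 64, "N must be in the range [1, 63]");
  return -(static_cast<std::int64_t>(1) << (N - 1)) <= x &&
         x < (static_cast<std::int64_t>(1) << (N - 1));
}

// Range accepted under this assumption by simm16: [-32768, 32767].
static_assert(fitsSigned<16>(32767) && !fitsSigned<16>(32768), "simm16 upper bound");
static_assert(fitsSigned<16>(-32768) && !fitsSigned<16>(-32769), "simm16 lower bound");

// Range accepted under this assumption by simm9: [-256, 255].
static_assert(fitsSigned<9>(255) && !fitsSigned<9>(256), "simm9 upper bound");
static_assert(fitsSigned<9>(-256) && !fitsSigned<9>(-257), "simm9 lower bound");
```

An immediate outside these ranges does not match the leaf, so an instruction pattern that uses simm16 or simm9 is not selected for it.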

The SDNodeXForm class, mentioned above, is provided to manipulate nodes in the output DAG once a match has been formed, and is typically used to manipulate immediate values. As an example, the LO32I transformation node takes the 32 least significant bits of a 64-bit integer immediate.

def LO32I : SDNodeXForm<imm, [{
            return CurDAG->getTargetConstant((unsigned)N->getAPIntValue().getLoBits(32).getZExtValue(), SDLoc(N), MVT::i32);}]>;
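
The arithmetic performed by LO32I can be checked in isolation. The following sketch reproduces only the bit manipulation on a plain 64-bit integer. lo32 is a hypothetical helper name, and the creation of the target constant node through CurDAG is deliberately left out.

```cpp
#include <cstdint>

// Keeps the 32 least significant bits of a 64-bit immediate and drops the rest.
constexpr std::uint32_t lo32(std::uint64_t imm) {
  return static_cast<std::uint32_t>(imm & 0xFFFFFFFFull);
}

// The upper half of the immediate does not influence the result.
static_assert(lo32(0x123456789ABCDEF0ull) == 0x9ABCDEF0u, "upper 32 bits dropped");
static_assert(lo32(0xFFFFFFFF00000000ull) == 0u, "only upper bits set");

// Values that already fit in 32 bits are unchanged.
static_assert(lo32(42) == 42u, "small value preserved");
```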

For more complex patterns that require pattern-matching code in C++, LLVM provides the ComplexPattern class. It takes a value type, the number of operands returned by the select function, the name of the function used to match the maximal pattern (usually defined in the TargetNameDAGToDAGISel class), the list of possible root nodes of the sub-DAGs to match, and a list of node properties.

class ComplexPattern<ValueType ty, int numops, string fn,
                     list<SDNode> roots = [], list<SDNodeProperty> props = []> {
  ValueType Ty = ty;
  int NumOperands = numops;
  string SelectFunc = fn;
  list<SDNode> RootNodes = roots;
  list<SDNodeProperty> Properties = props;
}


In the nu+ backend it is used for the addressing modes. The SelectADDRri function is defined in the NuPlusDAGToDAGISel class.

def ADDRri : ComplexPattern<iPTR, 2, "SelectADDRri", [frameindex], []>;
def V16ADDRri : ComplexPattern<v16i32, 2, "SelectADDRri", [], []>;
def V8ADDRri : ComplexPattern<v8i64, 2, "SelectADDRri", [], []>;
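
Each of these patterns declares two returned operands. The real SelectADDRri is not reproduced on this page, so the following is only a toy model of what a select function of this kind typically does, assuming the two operands are a base and an immediate offset. Node, Kind, Address and selectAddrRI are hypothetical names, the node type is a simplified stand-in for SDValue, and checks such as the immediate range are omitted.

```cpp
#include <cstdint>

// Simplified stand-in for the kinds of DAG nodes an address can be built from.
enum class Kind { FrameIndex, Constant, Add, Other };

struct Node {
  Kind kind;
  std::int64_t value;  // constant value or frame index number, 0 otherwise
  const Node *lhs;     // operands, used only by Add nodes
  const Node *rhs;
};

// The two operands returned by the select function in this model.
struct Address {
  const Node *base;
  std::int64_t offset;
};

// True for the shape (add base, constant).
constexpr bool isAddOfConstant(const Node *n) {
  return n->kind == Kind::Add && n->rhs != nullptr &&
         n->rhs->kind == Kind::Constant;
}

// Splits (add base, constant) into base and offset; any other address
// is used as the base itself, with a zero offset.
constexpr Address selectAddrRI(const Node *addr) {
  return isAddOfConstant(addr) ? Address{addr->lhs, addr->rhs->value}
                               : Address{addr, 0};
}

// Example: the address "frame index 3 plus 8".
constexpr Node exBase{Kind::FrameIndex, 3, nullptr, nullptr};
constexpr Node exOffset{Kind::Constant, 8, nullptr, nullptr};
constexpr Node exSum{Kind::Add, 0, &exBase, &exOffset};

static_assert(selectAddrRI(&exSum).base == &exBase, "base is the left operand");
static_assert(selectAddrRI(&exSum).offset == 8, "offset is the constant");
static_assert(selectAddrRI(&exBase).offset == 0, "bare base gets a zero offset");
```

In this model the caller always receives two operands, which corresponds to the value 2 passed as the operand count in the ComplexPattern definitions above.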