Node Structure


Have a look at the C definition of a node:

typedef struct _cNODE {
  long OpCode; // the code of operation
  union {
    struct {// when the node is a command
      unsigned long next;
      union {
        unsigned long pNode;// node id of the node
        long lLongValue;
        double dDoubleValue;
        unsigned long szStringValue;
        }Argument;
      }CommandArgument;
    struct {//when the node is an operation
      unsigned long Argument;//node id of the node list head
      }Arguments;
    union {// when the node is a constant
      double dValue;
      long   lValue;
      unsigned long sValue; // serial value of the string from the string table
      }Constant;
    struct {// when the node is a variable
      unsigned long Serial;// the serial number of the variable
      }Variable;
    struct {// when node is a user functions
      unsigned long NodeId; // the entry point of the function
      unsigned long Argument; // node id of the node list head
      }UserFunction;
    struct {// when the node is a node list head
      unsigned long actualm; //car
      unsigned long rest;    //cdr
      }NodeList;
    }Parameter;
  } cNODE,*pcNODE;

The field OpCode is the same as the code used in the lexer or the syntax analyzer. In case of an IF statement it is CMD_IF. This field can be, should be, and is used to identify which member of the union Parameter is to be used.

The individual lines of the BASIC program that create code are chained into a list. Each line has a head node. The OpCode of the head nodes is eNTYPE_LST. This type of node contains the NodeList structure. The field NodeList.actualm contains the index of the first node of the actual line and the field NodeList.rest contains the index of the next head node.

This type of node is also used to gather expression lists into a linked list.

Note that the first head node, where execution is to be started, is usually not the first node in the byte-code. The nodes generated from a line are created before the head node is allocated in the syntax analyzer, and the head node thus gets a larger serial number. The builder uses the serial numbers counted by the syntax analyzer and does not rearrange the nodes.
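The chain of head nodes can be followed with a short loop. The following is only a sketch and not code from the interpreter: the opcode values are placeholders, the cNODE shown is trimmed to the members used here, and the helper node_at assumes that node index n is stored in array slot n-1 so that zero can mean "no node". The example deliberately places each head node after the command node of its line, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder opcode values; the real ones come from the interpreter headers. */
enum { eNTYPE_LST = 1, CMD_IF = 100, CMD_PRINT = 101 };

/* Trimmed copy of cNODE holding only the members this sketch touches. */
typedef struct _cNODE {
  long OpCode;
  union {
    struct {
      unsigned long actualm; /* car: first node of the line */
      unsigned long rest;    /* cdr: next head node, 0 at the end */
    } NodeList;
  } Parameter;
} cNODE;

/* Assumption: node index n lives in slot n-1, and index 0 means "none". */
const cNODE *node_at(const cNODE *table, unsigned long id) {
  assert(id != 0);
  return &table[id - 1];
}

/* Follow the head nodes starting at 'start' and store the opcode of the
   command of each line. Returns the number of lines visited (at most max). */
size_t collect_line_opcodes(const cNODE *table, unsigned long start,
                            long *out, size_t max) {
  size_t n = 0;
  unsigned long id = start;
  while (id != 0 && n < max) {
    const cNODE *head = node_at(table, id);
    assert(head->OpCode == eNTYPE_LST);
    out[n++] = node_at(table, head->Parameter.NodeList.actualm)->OpCode;
    id = head->Parameter.NodeList.rest;
  }
  return n;
}
```

Called with the index of the first head node, the function visits the lines in program order even though the head nodes are not the first nodes in the array.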

The command node that the field NodeList.actualm "points" to contains the opcode of the command. For example, if the actual command is IF then the OpCode is CMD_IF.

In case of command nodes the member of Parameter to use is CommandArgument. If the command has only a single argument, the field next is zero. Otherwise this field contains the node index of the node holding the next argument.

The Parameter.CommandArgument.Argument union contains the actual argument of the command. There is no indication in the data structure what type the argument is. The command has to know what kind of arguments it gets, and should not interpret the union differently.

The field pNode is the node index of the parameter. This is the case for example when the parameter is an expression or a label to jump to.

The fields lLongValue, dDoubleValue and szStringValue contain the constant values in case the argument is a constant. However, the field szStringValue does not store the string itself but the index into the string table where the string starts. (Yes, there is some inconsistency in naming.)

Strings are stored in a string table where each string is stored one after the other. Each string is terminated with a zero character and each string is preceded by a long value that indicates the length of the string. The zero character termination eases the use of the string constants when they have to be passed to the operating system avoiding the need to copy the strings in some cases.
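The layout of the string table can be illustrated with a small sketch. It is not taken from the interpreter source: the function names table_append and table_length are made up for this example, and it assumes that the index stored in a node refers to the first character of the string, with the length kept in the sizeof(long) bytes immediately before it in native byte order.

```c
#include <assert.h>
#include <string.h>

/* Append one string to the table: length prefix, bytes, terminating zero.
   Returns the index of the first character, which is the value a field
   such as szStringValue would hold under the assumption stated above. */
unsigned long table_append(char *table, unsigned long cap, unsigned long *used,
                           const char *s, long len) {
  assert(len >= 0 && *used + sizeof len + (unsigned long)len + 1 <= cap);
  memcpy(table + *used, &len, sizeof len);      /* length prefix */
  unsigned long start = *used + sizeof len;
  memcpy(table + start, s, (size_t)len);        /* string bytes */
  table[start + (unsigned long)len] = '\0';     /* zero terminator */
  *used = start + (unsigned long)len + 1;
  return start;
}

/* Read back the length stored in front of the string at 'index'. */
long table_length(const char *table, unsigned long index) {
  long len;
  assert(index >= sizeof len);
  memcpy(&len, table + index - sizeof len, sizeof len);
  return len;
}
```

Because of the terminating zero, the pointer table + index can be handed directly to functions that expect a C string, which is the convenience the paragraph above describes.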

The field Parameter.CommandArgument.next is zero in case the command has no more arguments; otherwise it is the index of the node containing the next argument. The OpCode field of the following argument nodes is eNTYPE_CRG.
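As a sketch of how the arguments of a command might be reached, the function below steps along the next fields. The opcode values and the command CMD_EXAMPLE are placeholders invented for this example, the cNODE is trimmed to the CommandArgument member, and node index n is assumed to live in array slot n-1. Which member of the Argument union is valid for each position is not checked, because only the command itself knows that.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values; CMD_EXAMPLE is a hypothetical command. */
enum { eNTYPE_CRG = 2, CMD_EXAMPLE = 100 };

/* Trimmed copy of cNODE holding only the CommandArgument member. */
typedef struct _cNODE {
  long OpCode;
  union {
    struct {
      unsigned long next;
      union {
        unsigned long pNode;
        long lLongValue;
        double dDoubleValue;
        unsigned long szStringValue;
      } Argument;
    } CommandArgument;
  } Parameter;
} cNODE;

/* Assumption: node index n lives in slot n-1, and index 0 means "none". */
const cNODE *node_at(const cNODE *table, unsigned long id) {
  assert(id != 0);
  return &table[id - 1];
}

/* Return the node holding argument number 'which' (counted from zero) of the
   command node 'cmd', or NULL when the command has fewer arguments. */
const cNODE *command_argument(const cNODE *table, unsigned long cmd,
                              size_t which) {
  const cNODE *n = node_at(table, cmd);
  while (which-- > 0) {
    unsigned long next = n->Parameter.CommandArgument.next;
    if (next == 0)
      return NULL;
    n = node_at(table, next);
    assert(n->OpCode == eNTYPE_CRG); /* continuation nodes carry eNTYPE_CRG */
  }
  return n;
}
```

The first argument sits in the command node itself, so asking for argument zero returns the command node without following any link.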

When the node is part of an expression and represents an operation or the call of a built-in function then the Arguments structure of the Parameter union is to be used. This simply contains the field Argument that "points" to the head of a list of "list" nodes, each of which refers to one argument. In this case the OpCode is the code of the built-in function or operation.

When the node represents a string or a numeric constant the Constant union field of the union Parameter should be used. This stores the constant value similarly to the Argument union of CommandArgument, except that it can only be a long, a double or a string. In case of a constant node the OpCode is eNTYPE_DBL for a double, eNTYPE_LNG for a long and eNTYPE_STR for a string.
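To show how operation nodes, list nodes and constant nodes fit together, here is a toy evaluator for expressions made of long constants and a single operation. It is a sketch under several assumptions and not the way the interpreter evaluates expressions: OP_ADD is a hypothetical operation code, the other opcode values are placeholders, the cNODE is trimmed to three members, and node index n is assumed to live in array slot n-1.

```c
#include <assert.h>

/* Placeholder values; OP_ADD is a hypothetical stand-in for an operation code. */
enum { eNTYPE_LST = 1, eNTYPE_LNG = 2, OP_ADD = 200 };

/* Trimmed copy of cNODE holding only the members this sketch touches. */
typedef struct _cNODE {
  long OpCode;
  union {
    struct { unsigned long Argument; } Arguments;
    union { double dValue; long lValue; unsigned long sValue; } Constant;
    struct { unsigned long actualm; unsigned long rest; } NodeList;
  } Parameter;
} cNODE;

/* Assumption: node index n lives in slot n-1, and index 0 means "none". */
const cNODE *node_at(const cNODE *table, unsigned long id) {
  assert(id != 0);
  return &table[id - 1];
}

/* Evaluate the expression rooted at node 'id'. The OpCode decides which
   member of Parameter is read: Constant for eNTYPE_LNG, Arguments for OP_ADD. */
long eval_long(const cNODE *table, unsigned long id) {
  const cNODE *n = node_at(table, id);
  if (n->OpCode == eNTYPE_LNG)
    return n->Parameter.Constant.lValue;
  assert(n->OpCode == OP_ADD);
  long sum = 0;
  unsigned long item_id = n->Parameter.Arguments.Argument;
  while (item_id != 0) {
    const cNODE *item = node_at(table, item_id);
    assert(item->OpCode == eNTYPE_LST);
    sum += eval_long(table, item->Parameter.NodeList.actualm);
    item_id = item->Parameter.NodeList.rest;
  }
  return sum;
}
```

Each list node contributes one argument through actualm, and the recursion handles arguments that are themselves operations.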

When the node represents a variable the field Variable has to be used. In this case the field Serial contains the serial number of the variable. To distinguish between local and global variables the OpCode is either eNTYPE_LVR for local variables or eNTYPE_GVR for global variables.

When the node is a user defined function call the field UserFunction is used. Note that this is not the node that is generated from the line sub/function myfunc but rather the node generated where the function or subroutine is called. The OpCode is eNTYPE_FUN.

The field NodeId is the index of the node where the function or subroutine starts. The field Argument is the index of the list node that starts the list of the argument expressions.
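A call node can be inspected along the same lines. The sketch below reads the entry point and counts the argument expressions of a call. The function names are invented for this example, the opcode values are placeholders, the cNODE is trimmed to two members, node index n is assumed to live in array slot n-1, and a call without arguments is assumed to carry zero in the field Argument.

```c
#include <assert.h>

/* Placeholder opcode values; the real ones come from the interpreter headers. */
enum { eNTYPE_LST = 1, eNTYPE_FUN = 2 };

/* Trimmed copy of cNODE holding only the members this sketch touches. */
typedef struct _cNODE {
  long OpCode;
  union {
    struct {
      unsigned long NodeId;   /* entry point of the function */
      unsigned long Argument; /* node id of the node list head */
    } UserFunction;
    struct { unsigned long actualm; unsigned long rest; } NodeList;
  } Parameter;
} cNODE;

/* Assumption: node index n lives in slot n-1, and index 0 means "none". */
const cNODE *node_at(const cNODE *table, unsigned long id) {
  assert(id != 0);
  return &table[id - 1];
}

/* Node index where the called function or subroutine starts. */
unsigned long call_entry_point(const cNODE *table, unsigned long call) {
  const cNODE *n = node_at(table, call);
  assert(n->OpCode == eNTYPE_FUN);
  return n->Parameter.UserFunction.NodeId;
}

/* Number of argument expressions passed by the call node 'call'. */
unsigned long call_argument_count(const cNODE *table, unsigned long call) {
  const cNODE *n = node_at(table, call);
  assert(n->OpCode == eNTYPE_FUN);
  unsigned long count = 0;
  unsigned long item_id = n->Parameter.UserFunction.Argument;
  while (item_id != 0) {
    const cNODE *item = node_at(table, item_id);
    assert(item->OpCode == eNTYPE_LST);
    count++;
    item_id = item->Parameter.NodeList.rest;
  }
  return count;
}
```

The argument list has the same shape as the argument list of a built-in function or operation, so the same list-walking loop applies to both.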

