Binary File Format

[<<<] [>>>]

The built code is usually saved to a cache file and this file is used later to load the already compiled code into memory for subsequent execution. The format of this file can be read from the function build_SaveCode in file `builder.c'. This function is quite linear it just saves several structures and is well commented so you should not have problem to understand it. However here I also give some description on the format of the binary file.

The binary file may or may not start with a textual line. This line is the usual UNIX #! /usr/bin/scriba string telling the operating system how to execute the text file. Altough we are talking now about a binary file, from the operating system point of view this is just a file, like a ScriptBasic source file, a Perl script or a bash script. The operating system starts to read the file and if the start of the file is something like

#! /usr/bin/scriba\n

with a new-line character at the end then it can be executed from the command line if some other permission related constraints are met.

When ScriptBasic saves the binary format file it uses the same executable path that was given in the source file. If the source file starts with the line #! /usr/bin/mypath/scriba and the basic progam `myprog.bas' was started using the command line

/usr/bin/scriba -o myprog.bbf myprog.bas

then the file `myprog.bbf' will start with the line #! /usr/bin/mypath/scriba.

The user's guide lists a small BASIC program that reads and writes the binary file and alters this line.

Having this line on the first place in the binary format BASIC file makes it possible to deliver programs in compiled format. For example you may develop a CGI application and deliver it as compiled format to protect your program from the customer. You can convert your source issuing the command line

/usr/bin/scriba -o outputdirectory/myprog.bas myprog.bas

and deliver the binary `myprog.bas' to the customer. ScriptBasic does not care the file extension and does not expect a file with the extension .bas to be source BASIC. It automatically recognizes binary format BASIC programs and thus you need no alter even the URLs that refer to CGI BASIC programs.

The next byte in the file following this optional opening line is the size of a long on the machine the code was created. The binary code is not necessarily portable from one machine to another. It depends on pointer and long size as well as byte ordering. We experienced Windows NT and Linux to create the same binary file but this is not a must, may change.

The size of a long is stored in a single character as sizeof(long)+0x30 so the ASCII character is either '4' or '8' on 32 and 64 bit machines.

This byte is followed by the version information. This is a struct:

  unsigned long MagicCode;
  unsigned long VersionHigh, VersionLow;
  unsigned long MyVersionHigh,MyVersionLow;
  unsigned long Build;
  unsigned long Date;
  unsigned char Variation[9];

The MagicCode is 0x1A534142. On DOS based system this is the characters 'BAS' and ^Z which means end of text file. Thus if you issue the command

C:\> type mybinaryprogram.bbf

you will get

4BAS

without scrambling your screen. If you use UNIX system then be clever enough not to cat a binary program to the terminal.

The values VersionHigh and VersionLow are the version number of ScriptBasic core code. This is currently 1 and 0. The fields MyVersionHigh and MyVersionLow are reserved for developers who develop a variation of ScriptBasic. The variation may alter some features and still is based on the same version of the core code. These two version fields are reserved here to distinguish between different variation versions based on the same core ScriptBasic code. To maintain these version numbers is essential for those who embed ScriptBasic into an application, especially if the different versions of the variations alter the binary file format which I doubt is really needed.

The field Build is the build of the core ScriptBasic code.

The Date is date when the file `builder.c' was compiled. The date is stored in a long in a tricky way that ensures that no two days result the same long number. In case you want to track how this is coded see the function build_MagicCode in file `builder.c'. This is really tricky.

The final field is Variation which is and should be an exactly 8 character long string and a zero character.

If you want to compile a different variation then alter the #define directives in the file `builder.c'

#define VERSION_HIGH 0x00000001
#define VERSION_LOW  0x00000000
#define MYVERSION_HIGH 0x00000000
#define MYVERSION_LOW  0x00000000
#define VARIATION "STANDARD"

To successfully load a binary format file to run in ScriptBasic the long size, the magic code, version information including the build and the variation string should match. Date may be different.

The following foru long numbers in the binary file define

This is followed by the nodes themselves and the stringtable.

This is the last point that has to exist in a binary format file of a BASIC program. The following bytes are optional and may not be present in the file.

The optional part contains the size of the function table defined on a long and the function table. After this the size of the global variable table is stored in a long and the global variable table.

The global variable and function symbol table are list of elements, each containing a long followed by the zero character terminated symbolic name. The long stores the serial number of the variable or the entry point of the function (the node index where the function starts).

These two tables are not used by ScriptBasic by itself, ScriptBasic does not need any symbolic information to execute a BASIC program. Programmers embedding ScriptBasic however demanded access global variables by name and the ability to execute individual functions from a BASIC program. If this last part is missing from a binary format BASIC program you will not be able to use in an application that uses these features.


[<<<] [>>>]