Introduction

This Wiki page defines the binary file format used by MiniJoe.

Details

A MiniJoe binary file consists of a stream of bytes. 16-bit, 32-bit and 64-bit values are stored as the required number of bytes in big-endian format; this is the same format as supported by the Java java.io.DataInputStream and java.io.DataOutputStream classes.

A MiniJoe binary file consists of a magic number followed by one or more blocks. A block starts with an identifying type byte followed by the block data. Whilst most blocks have a uint16 length immediately following the type byte, the interpretation of that length is block dependent and is described below. Some blocks may contain other blocks, which in turn may contain more blocks.

| MiniJoe | Magic Block |
| Magic | uint32 |
| Block | Type Data ... |
| Type | uint8 |
| Data | uint8 ... |
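As a minimal sketch of the big-endian encoding described above, the following Java snippet writes a 16-bit and a 32-bit value with java.io.DataOutputStream and reads the 16-bit value back as an unsigned integer. The class and method names (BigEndianDemo, encode, decodeUnsignedShort, toHex) are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of MiniJoe.
class BigEndianDemo {
    /** Encodes a 16-bit and a 32-bit value, most significant byte first. */
    static byte[] encode(int shortValue, int intValue) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeShort(shortValue); // 2 bytes, big-endian
            out.writeInt(intValue);     // 4 bytes, big-endian
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the leading two bytes as an unsigned 16-bit value (uint16). */
    static int decodeUnsignedShort(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String toHex(byte[] data) {
        StringBuilder hex = new StringBuilder();
        for (byte b : data) {
            if (hex.length() > 0) hex.append(' ');
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }
}
```

With these helpers, `BigEndianDemo.toHex(BigEndianDemo.encode(0x0102, 0x0A0B0C0D))` yields `01 02 0A 0B 0C 0D`. Because Java's short is signed, a reader of uint16 fields would typically use readUnsignedShort rather than readShort.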
The following blocks are currently defined:

| Type | Description |
| 0x00 | Comment |
| 0x10 | String Table |
| 0x20 | Double Literals |
| 0x30 | String Literals |
| 0x40 | Regex Literals |
| 0x50 | Function Literals |
| 0x60 | Variable Names |
| 0x80 | Byte code |
| 0xE0 | Line number data |
| 0xF0 | Debug data |
| 0xFF | End Marker |
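A reader might represent the type bytes from the table above as constants. The enum below is one possible sketch; the name BlockType and its members are assumptions made for this example, and only the numeric values come from the table.

```java
// Hypothetical enum; only the numeric tags are taken from the table above.
enum BlockType {
    COMMENT(0x00),
    STRING_TABLE(0x10),
    DOUBLE_LITERALS(0x20),
    STRING_LITERALS(0x30),
    REGEX_LITERALS(0x40),
    FUNCTION_LITERALS(0x50),
    VARIABLE_NAMES(0x60),
    BYTE_CODE(0x80),
    LINE_NUMBERS(0xE0),
    DEBUG_DATA(0xF0),
    END_MARKER(0xFF);

    final int tag;

    BlockType(int tag) {
        this.tag = tag;
    }

    /** Returns the block type for an unsigned type byte, or null if it is not defined. */
    static BlockType fromTag(int tag) {
        for (BlockType type : values()) {
            if (type.tag == tag) {
                return type;
            }
        }
        return null;
    }
}
```

Since several tags are above 0x7F, the type byte would need to be read with DataInputStream.readUnsignedByte (or masked with 0xFF) before being passed to fromTag.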
Magic

Magic := "M" "i" "n" "i" "J" "o" "e" Version
Version := uint8

Comment Block

CommentBlock := 0x00 Length Data
Length := uint16
Data := uint8[Length]

This block is used for free-form comments about the MiniJoe binary; its contents are ignored (and possibly discarded) by the client-side runtime. The length field gives the number of data bytes. (The intention is that the length and data are read by a single call to java.io.DataInputStream.readUTF.)

Global String Table Block

StringTableBlock := 0x10 Count String[Count]
Count := uint16
String := Length uint8[Length]
Length := uint16

The global string table contains the UTF-8 data for all strings within this program, and is referenced by the String Literal block, the Regex Literal block, the Variable Name block and possibly the Debug Data block. The count field is the number of string entries; each string entry has a uint16 length followed by the string data, as if written by java.io.DataOutputStream.writeUTF (which uses Java's modified UTF-8 encoding). The strings in this table are implicitly numbered according to their position within this block.

Double Literal Block

DoubleLiteralBlock := 0x20 Count Double[Count]
Count := uint16
Double := uint64

String Literal Block

StringLiteralBlock := 0x30 Count Index[Count]
Count := uint16
Index := uint16

The String Literal block contains indexes into the global string table.

Regex Literal Block

RegexLiteralBlock := 0x40 Count Index[Count]
Count := uint16
Index := uint16

The Regex Literal block contains indexes into the global string table.

Function Literal Block

FunctionLiteralBlock := 0x50 Count FunctionLiteral[Count]
Count := uint16
FunctionLiteral := Block ... 0xFF

A Function Literal block contains one or more Function Literals. Each Function Literal contains one or more other blocks and is terminated by an End Marker.

Variable Name Block

VariableNameBlock := 0x60 Count Index[Count]
Count := uint16
Index := uint16

The Variable Name Block contains indexes into the global string table and gives a mapping from local variable names to local variable slots. The variable names are implicitly numbered according to their position within this block.

Code Block

ByteCodeBlock := 0x80 Locals Parameters Flags Length Code[Length]
Locals := uint16 // number of local variable slots required (including parameters)
Parameters := uint16 // number of declared parameters
Flags := uint8 // flags, see below
Length := uint16 // number of code bytes
Code := uint8

Flag Bits

| Bit | Description |
| 0 | if set, the code does not contain closures or with statements, and local variables may be allocated on the stack |
| 1..7 | reserved |
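The Code Block layout can be sketched as a small Java parser. This is an illustration under stated assumptions, not the MiniJoe implementation: the class name ByteCodeBlockSketch and its members are hypothetical, the 0x80 type byte is assumed to have been consumed already, and bit 0 is assumed to be the least significant bit of the flags byte.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical holder for the fields of a Code Block.
final class ByteCodeBlockSketch {
    // Assumption: "bit 0" means the least significant bit of the flags byte.
    static final int FLAG_STACK_LOCALS = 0x01;

    final int locals;      // local variable slots, including parameters
    final int parameters;  // declared parameters
    final int flags;
    final byte[] code;

    private ByteCodeBlockSketch(int locals, int parameters, int flags, byte[] code) {
        this.locals = locals;
        this.parameters = parameters;
        this.flags = flags;
        this.code = code;
    }

    /** Reads the block body; the 0x80 type byte must already have been consumed. */
    static ByteCodeBlockSketch read(DataInputStream in) throws IOException {
        int locals = in.readUnsignedShort();
        int parameters = in.readUnsignedShort();
        int flags = in.readUnsignedByte();
        int length = in.readUnsignedShort(); // number of code bytes
        if (parameters > locals) {
            // Locals is documented as including the parameters.
            throw new IllegalArgumentException("more parameters than local slots");
        }
        byte[] code = new byte[length];
        in.readFully(code);
        return new ByteCodeBlockSketch(locals, parameters, flags, code);
    }

    /** Convenience wrapper for parsing a block body held in memory. */
    static ByteCodeBlockSketch parse(byte[] body) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(body)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if flag bit 0 is set, i.e. locals may be allocated on the stack. */
    boolean localsOnStack() {
        return (flags & FLAG_STACK_LOCALS) != 0;
    }
}
```

For example, parsing the nine bytes 00 03 00 01 01 00 02 10 20 gives three local slots, one parameter, flag bit 0 set, and two code bytes.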
If there is a Variable Name Block corresponding to this Code Block, then the number of local variables must be at least equal to the number of variable name mappings (it may be larger, in case the compiler wishes to use local variable slots to store intermediate values, e.g. the results of common subexpressions). The parameters are mapped to the first N local variable slots, where N is the number of declared parameters.

We should consider a flag that copies the global string table to the local string table (consider a JSON data transfer). -Stefan Haustein 10/17/07 4:47 PM

Line Number Table

LineNumberBlock := 0xE0 Length [ProgramCounter, LineNumber]*
Length := uint16
ProgramCounter := uint16
LineNumber := uint16

The Length field gives the number of (ProgramCounter, LineNumber) pairs in this block. The pairs occur in strictly increasing ProgramCounter order.

End Marker

EndMarker := 0xFF

Marks the end of a function literal or of the file.

Scoping

A MiniJoe binary file has two levels of scope: 'program scope' blocks are at the top level, and 'function scope' blocks are within a Function Literal. Note that a Function Literal may contain a Function Literal Block, which will contain more Function Literals. At each level no more than one block of each type is allowed. Except where described below, the order of blocks within each scope doesn't matter; however, blocks would normally appear in the order in which they are described above.

At the program level, the following blocks are valid:
- Comment Block - if present this block must be first
- Global String Table - optional, but MUST be present if there are any String Literal Blocks or Regex Literal Blocks in the file. This block must also come before any String or Regex Literal Block at the program level, and would normally be the first block within a file (or the second if there is a Comment Block).
- Double Literal Block
- String Literal Block
- Regex Literal Block
- Function Literal Block
- Byte Code Block
- Line Number Block
- Debug Block
- End Marker - MUST be present, at the end of the file
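The program-level rules above can be expressed as a small check over the sequence of block type bytes found at the top level. The sketch below is only an illustration; the class name ProgramScopeCheck, the check method and the error strings are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical validator for the order of top-level block types.
final class ProgramScopeCheck {
    // Type bytes listed as valid at program level (0x60, Variable Names, is not).
    private static final Set<Integer> ALLOWED = new HashSet<>(Arrays.asList(
            0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x80, 0xE0, 0xF0, 0xFF));

    /** Returns null if the sequence is valid at program scope, otherwise a reason. */
    static String check(int... types) {
        if (types.length == 0 || types[types.length - 1] != 0xFF) {
            return "missing end marker";
        }
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < types.length; i++) {
            int type = types[i];
            if (!ALLOWED.contains(type)) {
                return "block not valid at program scope";
            }
            if (!seen.add(type)) {
                return "duplicate block";
            }
            if (type == 0x00 && i != 0) {
                return "comment block must be first";
            }
            if ((type == 0x30 || type == 0x40) && !seen.contains(0x10)) {
                return "string table must precede literals";
            }
            if (type == 0xFF && i != types.length - 1) {
                return "end marker must be last";
            }
        }
        return null;
    }
}
```

For instance, `ProgramScopeCheck.check(0x00, 0x10, 0x30, 0x80, 0xFF)` returns null, while `ProgramScopeCheck.check(0x30, 0x10, 0xFF)` reports that the string table must precede the literals. The check does not look inside Function Literals, so string or regex literals that appear only at function scope are not covered by this sketch.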
At the function level, the following blocks are valid:
- Comment Block - if present this block must be the first block in a function literal
- Double Literal Block
- String Literal Block
- Regex Literal Block
- Function Literal Block
- Variable Name Block
- Byte Code Block
- Debug Block
- End Marker - MUST be present, at the end of the function literal
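To tie the pieces together, the following sketch writes a minimal file consisting of the magic, a Global String Table and the end marker, and then reads it back. It is a hedged illustration only: the class MiniJoeRoundTrip, its methods and the version value 1 are assumptions (this page does not state a version number), and the reader handles no block other than the string table.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical round-trip helper, not part of MiniJoe.
final class MiniJoeRoundTrip {
    static final int VERSION = 1; // assumed value, for illustration only

    /** Writes magic, one Global String Table block and the end marker. */
    static byte[] write(String... strings) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeBytes("MiniJoe");       // seven ASCII magic bytes
            out.writeByte(VERSION);          // Version := uint8
            out.writeByte(0x10);             // String Table block type
            out.writeShort(strings.length);  // Count := uint16
            for (String s : strings) {
                out.writeUTF(s);             // uint16 length + modified UTF-8 data
            }
            out.writeByte(0xFF);             // End Marker
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the string table back; any other block type is rejected. */
    static List<String> read(byte[] file) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
            byte[] magic = new byte[7];
            in.readFully(magic);
            if (!"MiniJoe".equals(new String(magic, StandardCharsets.US_ASCII))) {
                throw new IllegalArgumentException("bad magic");
            }
            in.readUnsignedByte(); // version, not checked in this sketch
            List<String> strings = new ArrayList<>();
            int type;
            while ((type = in.readUnsignedByte()) != 0xFF) {
                if (type != 0x10) {
                    throw new IllegalArgumentException("block not handled by this sketch");
                }
                int count = in.readUnsignedShort();
                for (int i = 0; i < count; i++) {
                    strings.add(in.readUTF());
                }
            }
            return strings;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as `MiniJoeRoundTrip.read(MiniJoeRoundTrip.write("a", "foo"))` returns the two strings in table order, which is also their implicit numbering.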