BigEd wrote: Could you sketch out the memory map for the two cases? (For example, is there a distinction between user program and user variables? Does it use all sideways RAM or just some? How much of the 64k in the copro gets used?)
Sure, thanks for giving me the opportunity to waffle some more.
The following keeps talking about 'modules'. You probably know what they are, but just for the benefit of any other readers, a module is what you get by compiling a PLASMA source file. It's a bit like an object file in a traditional compiled language like C, although it's really a bit more like a shared object as they are linked together dynamically at run time. A module includes a list of dependencies on other modules, so taking the ROGUE game as an example:
- The top-level module ROGUE, which is what you load to run the game, has dependencies on CMDSYS (the built-in module containing some basic functions), ROGUEMAP, ROGUECOMBAT and ROGUEIO.
- ROGUEMAP has dependencies on CMDSYS and ROGUEIO
- ROGUEIO has dependencies on CMDSYS and ACORNOS
- ROGUECOMBAT has dependencies on CMDSYS and ROGUEMAP
When you tell the VM to load ROGUE by typing '+ROGUE', it loads it and then recursively loads all the dependencies so all the necessary code and data is present in memory.
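To make the recursive loading concrete, here's a rough Python sketch using the ROGUE dependency lists above. This is only an illustration, not the VM's actual code: the function name load_module, the "already loaded" check and the exact load order are my assumptions, and I've assumed ACORNOS has no dependencies of its own.

```python
# Dependency lists copied from the ROGUE example in the post.
DEPENDENCIES = {
    "ROGUE": ["CMDSYS", "ROGUEMAP", "ROGUECOMBAT", "ROGUEIO"],
    "ROGUEMAP": ["CMDSYS", "ROGUEIO"],
    "ROGUEIO": ["CMDSYS", "ACORNOS"],
    "ROGUECOMBAT": ["CMDSYS", "ROGUEMAP"],
    "CMDSYS": [],   # built in to the VM, so nothing further to load
    "ACORNOS": [],  # assumption: no dependencies of its own
}


def load_module(name, loaded=None):
    """Return the module names in the order they become resident.

    Hypothetical helper: loads the module itself, then recursively
    loads each dependency that is not already present.
    """
    if loaded is None:
        loaded = []
    if name in loaded:
        return loaded          # already resident, don't load it twice
    loaded.append(name)        # "load" the module itself first
    for dep in DEPENDENCIES[name]:
        load_module(dep, loaded)
    return loaded


order = load_module("ROGUE")
print(order)
```

The point is just that typing '+ROGUE' ends up with all six modules resident, each one exactly once, even though CMDSYS and ROGUEIO are named as dependencies by several modules.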
A module contains assembly language code (for 'asm myfunction' definitions in the source code), PLASMA bytecode (for 'def myfunction' definitions in the source code), global variables and information on the symbols exported and imported by that module, which the VM uses to perform the dynamic linking. I'll note in passing that assembly language code in a module doesn't get relocated, so it must be written to be position independent (e.g. using Bxx branches instead of JMP).
With that out of the way, let's talk about the memory maps.
The 'PLASMA' executable runs on an unexpanded machine or on a second processor. It has a flat memory map, which looks roughly like this.
- OSHWM to HEAPSTART=OSHWM+6K - the PLASMA VM itself (6K is a rough figure, and I just made the name HEAPSTART up)
- HEAPSTART to HIMEM - user programs and data. HIMEM varies with mode on an unexpanded machine as you'd expect; on a second processor it's always &F800. This area is further broken down into:
  - HEAPSTART to HEAPEND - the heap, which grows up from HEAPSTART towards higher addresses. Modules are loaded and executed here, program global variables live here along with the program code, and user programs can allocate space using heapalloc() and release it with heaprelease(). User program heap allocations are automatically freed on program exit. (There is support for user modules remaining resident after execution, and my brief experiments suggest this works fine, but I haven't played with it much yet.)
  - IFP to HIMEM - the frame stack, which grows down from HIMEM towards lower addresses. Used to hold local variables for functions which are executing.
On this model, whatever memory is free is available for program code and user data as required; you can have a small program operating on a large amount of data, or vice versa. On a Master with no second processor in mode 7, there's 21K available for programs and data at the PLASMA prompt; on a second processor there's 53K available.
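As a sanity check on those figures, here's the arithmetic as a small Python sketch. The OSHWM values (&0E00 on a Master, &0800 on a 6502 second processor) and the mode 7 HIMEM of &7C00 are my assumptions about typical machines rather than something stated above, and the 6K VM size is only the rough figure from the memory map.

```python
VM_SIZE = 6 * 1024  # rough size of the PLASMA VM, as quoted in the memory map


def free_bytes(oshwm, himem, vm_size=VM_SIZE):
    """Bytes shared by the heap (growing up) and frame stack (growing down)."""
    heap_start = oshwm + vm_size   # HEAPSTART = OSHWM + size of the VM
    return himem - heap_start


# Assumed addresses, not taken from the post:
master_mode7 = free_bytes(oshwm=0x0E00, himem=0x7C00)   # Master, mode 7, no copro
second_proc = free_bytes(oshwm=0x0800, himem=0xF800)    # 6502 second processor

print(master_mode7 / 1024, second_proc / 1024)
```

With those assumptions this gives about 21.5K and 54K, which is in the same ballpark as the 21K and 53K quoted above; the difference is down to the VM not being exactly 6K.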
As noted above, the PLASMA language allows you to write functions in either assembly language ('asm myfunction') or PLASMA ('def myfunction'), the latter being compiled into PLASMA bytecodes. Under the 'PLASMA' executable, both of these types of functions are loaded onto the heap and executed from there. 'PLAS128' uses up to 64K of sideways RAM to hold PLASMA bytecodes, but the VM, assembly language functions and all user data (globals, heap allocations, local variables on the frame stack) live in main RAM exactly as in the 'PLASMA' executable.
The PLAS128 VM is a bit larger due to the extra complexity of handling the sideways RAM paging, so you get a bit less main RAM free - which means you have a bit less space available for globals, data on the heap, assembly language code and local variables - in return for not using main RAM for the bytecodes which hopefully make up the bulk of your program. PLAS128 benefits from shadow RAM just as PLASMA does, because shadow RAM raises HIMEM, but it's not mandatory.
There are a few extra subtleties to mention regarding PLAS128:
- The first 9 bytes of each 16K bank are deliberately left untouched to avoid the risk of them accidentally looking like valid ROMs to the OS
- A single module can't span two 16K sideways RAM banks, so if you load a module with 12K of bytecode and then another module with 5K of bytecode, the last 4K in the first sideways RAM bank will be wasted. This also means you can't load a module which has more than 15.xK of bytecode.
- Every bytecode function, including those loaded into sideways RAM, needs an entry in the definition table in main RAM, which takes about 5 bytes per function, and exported functions additionally need a symbol table entry, which takes about len(function_name)+2 bytes.
- Constant strings in PLASMA bytecode need to be copied into main RAM so they can be accessed; the frame stack is used to contain a per-function string pool in addition to the local variables for that function. This obviously uses up RAM while the function is executing, but it doesn't count against the 255 byte limit on a function's local variables.
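The first three points above boil down to some simple arithmetic, sketched here in Python. This is a model of the rules as described, not the PLAS128 loader itself: the names place_modules and main_ram_overhead are made up, the "never go back and fill a gap in an earlier bank" behaviour is my assumption, and the 5 bytes per function is only the approximate figure given above.

```python
BANK_SIZE = 16 * 1024   # one sideways RAM bank
RESERVED = 9            # bytes left untouched at the start of each bank
MAX_BANKS = 4           # PLAS128 uses up to 64K of sideways RAM


def place_modules(bytecode_sizes):
    """Return ([(bank, offset), ...], wasted_bytes) for modules loaded in order.

    Assumption: a module that doesn't fit in the rest of the current bank
    starts in the next bank, and the gap left behind is never reused.
    """
    placements = []
    bank, offset, wasted = 0, RESERVED, 0
    for size in bytecode_sizes:
        if size > BANK_SIZE - RESERVED:
            raise ValueError("module bytecode won't fit in a single bank")
        if offset + size > BANK_SIZE:
            wasted += BANK_SIZE - offset   # tail of this bank is lost
            bank, offset = bank + 1, RESERVED
        if bank >= MAX_BANKS:
            raise MemoryError("out of sideways RAM")
        placements.append((bank, offset))
        offset += size
    return placements, wasted


def main_ram_overhead(function_names, exported_names):
    """Approximate main RAM used by the definition and symbol tables."""
    definition_table = 5 * len(function_names)               # ~5 bytes each
    symbol_table = sum(len(n) + 2 for n in exported_names)   # len(name)+2 each
    return definition_table + symbol_table


placements, wasted = place_modules([12 * 1024, 5 * 1024])
print(placements, wasted)
```

For the 12K-then-5K example, the second module lands at the start of the second bank and just under 4K of the first bank is lost (slightly under because of the 9 reserved bytes).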
Eventually I intend to produce a third version, a "normal" Acorn OS language ROM which runs from a single sideways ROM or RAM bank. This would get the VM code out of main RAM (making use of modes 0-2 more practical without shadow RAM) and I expect it to be possible to allow the remaining 10K or so of the sideways RAM bank to be used for bytecodes if running from sideways RAM instead of sideways ROM. (That 10K could *almost* be used for heap, but problems would occur if a user program made a heap allocation, happened to get an address in that 10K of sideways RAM and passed it to an OS call.)
It has occurred to me that as long as no module exceeds 64K, there is very little practical limitation on the total size of the PLASMA bytecodes forming a program. It would therefore probably be possible to produce a version for the Matchbox co-pro with 1MB of banked RAM which uses (say) 6-7K for the VM, a 16K window onto 1MB of bytecode space and the remaining 37K or so for the heap (assembly language code and user data) and frame stack. Similarly, I think it would be possible to produce a 'PLAS256' which uses more than 64K of sideways RAM for bytecodes, but given that even PLAS128 hasn't had a serious hammering yet I have resisted the temptation to produce something so unnecessary.
(It would be sort of cool in a very very geeky way, but part of the fun of something like this IMO is building something that actually works well, even if it works at something of no practical use, and without a huge program needing the space provided by these hypothetical versions they'd never get enough testing to be satisfyingly proved to work.)