423 lines
16 KiB
Plaintext
423 lines
16 KiB
Plaintext
=====================
|
|
vCPU's SYS extensions
|
|
=====================
|
|
|
|
SYS extensions, or SYS functions, are a method to mix vCPU code
|
|
with native Gigatron code, mostly for acceleration purposes. The
|
|
SYS function will be in ROM, because that's from where native code
|
|
executes.
|
|
|
|
The release ROMs come with a set of predefined SYS functions. Any
|
|
application in GT1 format can use those freely, provided they do a
|
|
version check first.
|
|
|
|
You can also write your own application-specific SYS functions, but
|
|
then you better publish your program as a ROM file, not GT1.
|
|
|
|
SYS extensions are called with the `SYS' instruction from vCPU.
|
|
Their ROM address must be put in `sysFn' in zero page. Optional
|
|
arguments are in sysArgs[0:7]. It is also common to pass an input
|
|
and return value through vAC.
|
|
|
|
SYS functions are free to do whatever they want, as long as they
|
|
play well with the rules of the timing game. For simplicity of
|
|
implementation, vCPU measures elapsed time in integer number of
|
|
"ticks". 1 tick equals two hardware CPU cycles (1 cycle is 160 ns,
|
|
so 1 tick is 320 ns). All vCPU instructions and SYS functions
|
|
therefore execute in an even number of cycles. A `nop' must be
|
|
inserted if the cycle count would be odd without it.
|
|
|
|
Rule 1
|
|
------
|
|
|
|
SYS functions must complete and reenter the vCPU within their maximum
|
|
declared time. This maximum duration is passed to vCPU through the
|
|
operand D of the SYS call. Although the *caller* must feed this
|
|
maximum value into vCPU, the caller may only pass a value that
|
|
matches with the SYS function's own declaration. vCPU in turn uses
|
|
this value to decide if the function can complete in the remainder
|
|
of the current timeslice, or if it must wait for the start of the
|
|
next one. The total allowed time is 28-2*D clocks or cycles, with
|
|
D<=0. This includes all overhead including that of the SYS vCPU
|
|
instruction itself.
|
|
|
|
Reserving an insufficient number of ticks results in broken video
|
|
signals. It's OK for the caller to "ask" for, or "reserve", more
|
|
time than will be used. This may only result in getting dispatched
|
|
in a later timeslice than strictly necessary. Still, some of the
|
|
more complex built-in SYS functions declare a higher value than
|
|
their implementation needs. This is to allow for future bug fixing
|
|
without changing the caller in existing programs.
|
|
|
|
[Side note: One way to see the raw value of D is as the *excess*
|
|
number of ticks that vCPU must accommodate for, but represented as
|
|
a number <=0. By default vCPU already accounts for 28 cycles for
|
|
standard instructions.]
|
|
|
|
Rule 2
|
|
------
|
|
|
|
On return, SYS functions always report back, in hardware register
|
|
AC, how much time they have *actually* consumed. This returned value
|
|
is the negated number of whole ticks spent. Miscounting results in
|
|
broken video signals. vCPU uses this to know how much time has
|
|
actually passed. This may be less than reserved, but never more.
|
|
|
|
These rules restrict SYS calls to relatively simple operations,
|
|
because they must fit in the time between two horizontal pulses,
|
|
including overhead. This translates to roughly ~130 CPU cycles at
|
|
most. In reality they typically replace a small sequence of vCPU
|
|
instructions.
|
|
|
|
Loops are normally not possible. Small loops are usually unrolled
|
|
to reduce overhead. Still, some SYS functions implement iteration
|
|
by setting back vPC just before returning. This causes them to be
|
|
dispatched again, possibly in the next timeslice.
|
|
|
|
[Side note: It's best to study the cycle count annotations in the
|
|
ROM source code for concrete examples on how things are typically
|
|
done: how the return value is passed back to vCPU, how branches are
|
|
annotated and how sometimes `nop' instructions are inserted to make
|
|
the total cycle count even.]
|
|
|
|
Naming
|
|
------
|
|
|
|
The naming convention is: SYS_<CamelCase>[_v<V>]_<N>
|
|
|
|
With <N> the maximum number of clocks or cycles (not ticks!) the
|
|
function will need, counted from NEXT to NEXT. This duration <N>
|
|
must be passed to the 'SYS' vCPU instruction as operand, represented
|
|
as `270-max(14,<N>/2)'. In GCL you can just type `<N>!!', and the
|
|
compiler translates the operand for you. The gt1dump tool disassembles
|
|
and displays <N>.
|
|
|
|
If a SYS extension was introduced after ROM v1, the version number
|
|
of introduction is included in the name as v<V>. This helps the
|
|
programmer to be reminded to verify the actual ROM version. A program
|
|
must fail gracefully on older ROMs, or at least not crash. See also
|
|
Docs/GT1-files.txt on using `romType' for this purpose.
|
|
|
|
Function details
|
|
----------------
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Exec_88
|
|
----------------------------------------------------------------------
|
|
|
|
Load code from ROM into memory and execute it
|
|
|
|
This loads the vCPU code with consideration of the current vSP
|
|
Used during reset, but also for switching between applications or for
|
|
loading data from ROM from within an application (overlays).
|
|
|
|
ROM stream format is [<addrH> <addrL> <n&255> n*<byte>]* 0
|
|
on top of lookup tables.
|
|
|
|
Variables:
|
|
sysArgs[0:1] ROM pointer (in)
|
|
sysArgs[2:3] RAM pointer (changed)
|
|
sysArgs[4] State counter (changed)
|
|
vLR vCPU continues here (out)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Out_22
|
|
----------------------------------------------------------------------
|
|
|
|
Send byte to output port
|
|
|
|
Variables:
|
|
vAC
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_In_24
|
|
----------------------------------------------------------------------
|
|
|
|
Read a byte from the input port
|
|
|
|
Variables:
|
|
vAC
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Random_34
|
|
----------------------------------------------------------------------
|
|
|
|
Update entropy and copy to vAC
|
|
|
|
This same algorithm runs automatically once per vertical blank.
|
|
Use this function to get numbers at a higher rate.
|
|
|
|
Variables:
|
|
vAC
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Unpack_56
|
|
----------------------------------------------------------------------
|
|
|
|
Unpack 3 bytes into 4 pixels
|
|
|
|
Variables:
|
|
sysArgs[0:2] Packed bytes (in)
|
|
sysArgs[0:3] Pixels (out)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Draw4_30
|
|
----------------------------------------------------------------------
|
|
|
|
Draw 4 pixels on screen, horizontally next to each other
|
|
|
|
Variables:
|
|
sysArgs[0:3] Pixels (in)
|
|
sysArgs[4:5] Position on screen (in)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_VDrawBits_134
|
|
----------------------------------------------------------------------
|
|
|
|
Draw slice of a character, 8 pixels vertical
|
|
|
|
Variables:
|
|
sysArgs[0] Color 0 "background" (in)
|
|
sysArgs[1] Color 1 "pen" (in)
|
|
sysArgs[2] 8 bits, highest bit first (changed)
|
|
sysArgs[4:5] Position on screen (in)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_LSRW1_48
|
|
Extension SYS_LSRW2_52
|
|
Extension SYS_LSRW3_52
|
|
Extension SYS_LSRW4_50
|
|
Extension SYS_LSRW5_50
|
|
Extension SYS_LSRW6_48
|
|
Extension SYS_LSRW7_30
|
|
Extension SYS_LSRW8_24
|
|
----------------------------------------------------------------------
|
|
|
|
Shift right by 1..8 bits
|
|
|
|
Variables:
|
|
vAC
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_LSLW4_46
|
|
Extension SYS_LSLW8_24
|
|
----------------------------------------------------------------------
|
|
|
|
Shift left by 4 or 8 bits
|
|
|
|
Variables:
|
|
vAC
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Read3_40
|
|
----------------------------------------------------------------------
|
|
|
|
Read 3 consecutive bytes from ROM
|
|
|
|
Note: This function a bit obsolete, as it has very limited use. It's
|
|
effectively an application-specific SYS function for the Pictures
|
|
application from ROM v1. It requires the ROM data be organized
|
|
with trampoline3a and trampoline3b fragments, and their address
|
|
in ROM to be known. Better avoid using this.
|
|
|
|
Variables:
|
|
sysArgs[0:2] Bytes (out)
|
|
sysArgs[6:7] ROM pointer (in)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_SetMode_v2_80
|
|
----------------------------------------------------------------------
|
|
|
|
Set video mode to 0 to 3 black scanlines per pixel line.
|
|
|
|
Mainly for making the MODE command available in Tiny BASIC, so that
|
|
the user can experiment. It's adviced to refrain from using
|
|
SYS_SetMode_v2_80 in regular applications. Video mode is a deeply
|
|
personal preference, and the programmer shouldn't overrule the user
|
|
in that choice. The Gigatron philisophy is that the end user has
|
|
the final say on what happens on the system, not the application,
|
|
even if that implies a degraded performance. This doesn't mean that
|
|
all applications must work well in all video modes: mode 1 is still
|
|
the default. If an application really doesn't work at all in that
|
|
mode, it's acceptable to change mode once after loading.
|
|
|
|
There's no "SYS_GetMode" function.
|
|
|
|
Variables:
|
|
vAC bit 0:1 Mode:
|
|
0 "ABCD" -> Full mode (slowest)
|
|
1 "ABC-" -> Default mode after reset
|
|
2 "A-C-" -> at67's mode
|
|
3 "A---" -> HGM's mode
|
|
vAC bit 2:15 Ignored bits and should be 0
|
|
|
|
Special values (ROM v4):
|
|
vAC = 1975 Zombie mode (no video signals, no input,
|
|
no blinkenlights).
|
|
vAC = -1 Leave zombie mode and restore previous mode.
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_SetMemory_v2_54
|
|
----------------------------------------------------------------------
|
|
|
|
SYS function for setting 1..256 bytes
|
|
|
|
Variables:
|
|
sysArgs[0] Copy count (in, changed)
|
|
sysArgs[1] Copy value (in)
|
|
sysArgs[2:3] Destination address (in, changed)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_SendSerial1_v3_80
|
|
----------------------------------------------------------------------
|
|
|
|
SYS function for sending data over serial controller port using
|
|
pulse width modulation of the vertical sync signal.
|
|
|
|
Variables:
|
|
sysArgs[0:1] Source address (in, changed)
|
|
sysArgs[2] Start bit mask (typically 1) (in, changed)
|
|
sysArgs[3] Number of send frames X (in, changed)
|
|
|
|
The sending will abort if input data is detected on the serial port.
|
|
Returns 0 in case of all bits sent, or <>0 in case of abort
|
|
|
|
This modulates the next upcoming X vertical pulses with the supplied
|
|
data. A zero becomes a 7 line vPulse, a one will be 9 lines.
|
|
After that, the vPulse width falls back to 8 lines (idle).
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_ExpanderControl_v4_40
|
|
----------------------------------------------------------------------
|
|
|
|
Sets the I/O and RAM expander's control register
|
|
|
|
Variables:
|
|
vAC bit 2 Device enable /SS0
|
|
bit 3 Device enable /SS1
|
|
bit 4 Device enable /SS2
|
|
bit 5 Device enable /SS3
|
|
bit 6 Banking B0
|
|
bit 7 Banking B1
|
|
bit 15 Data out MOSI
|
|
sysArgs[7] Cache for control state (written to)
|
|
|
|
Intended for prototyping, and probably too low-level for most applications
|
|
Still there's a safeguard: it's not possible to disable RAM using this
|
|
|
|
Intended for prototyping, and probably too low-level for most applications
|
|
Still there's a safeguard: it's not possible to disable RAM using this
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Run6502_v4_80
|
|
----------------------------------------------------------------------
|
|
|
|
Transfer control to v6502
|
|
|
|
Calling 6502 code from vCPU goes (only) through this SYS function.
|
|
Directly modifying the vCPUselect variable is unreliable. The
|
|
control transfer is immediate, without waiting for the current
|
|
time slice to end or first returning to vCPU.
|
|
|
|
vCPU code and v6502 code can interoperate without much hassle:
|
|
- The v6502 program counter is vLR, and v6502 doesn't touch vPC
|
|
- Returning to vCPU is with the BRK instruction
|
|
- BRK doesn't dump process state on the stack
|
|
- vCPU can save/restore the vLR with PUSH/POP
|
|
- Stacks are shared, vAC is shared
|
|
- vAC can indicate what the v6502 code wants. vAC+1 will be cleared
|
|
- Alternative is to leave a word in sysArgs[6:7] (v6502 X and Y registers)
|
|
- Another way is to set vPC before BRK, and vCPU will continue there(+2)
|
|
|
|
Calling v6502 code from vCPU looks like this:
|
|
LDWI SYS_Run6502_v4_80
|
|
STW sysFn
|
|
LDWI $6502_start_address
|
|
STW vLR
|
|
SYS 80
|
|
|
|
Variables:
|
|
vAC Accumulator
|
|
vLR Program Counter
|
|
vSP Stack Pointer (+1)
|
|
sysArgs[6] Index Register X
|
|
sysArgs[7] Index Register Y
|
|
For info:
|
|
sysArgs[0:1] Address Register, free to clobber
|
|
sysArgs[2] Instruction Register, free to clobber
|
|
sysArgs[3:5] Flags, don't touch
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_ResetWaveforms_v4_50
|
|
----------------------------------------------------------------------
|
|
|
|
Setup the default sound waveforms in page 7
|
|
|
|
soundTable[4x+0] = sawtooth, to be modified into metallic/noise
|
|
soundTable[4x+1] = pulse
|
|
soundTable[4x+2] = triangle
|
|
soundTable[4x+3] = sawtooth, also useful to right shift 2 bits
|
|
|
|
Used by cold and warm reset
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_ShuffleNoise_v4_46
|
|
----------------------------------------------------------------------
|
|
|
|
Use simple 6-bits variation of RC4 to permutate waveform 0 in soundTable
|
|
|
|
Used by cold and warm reset
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_SpiExchangeBytes_v4_134
|
|
----------------------------------------------------------------------
|
|
|
|
Send and receive 1..256 bytes over SPI interface
|
|
|
|
Variables:
|
|
sysArgs[0] Page index start, for both send/receive (in, changed)
|
|
sysArgs[1] Memory page for send data (in)
|
|
sysArgs[2] Page index stop (in)
|
|
sysArgs[3] Memory page for receive data (in)
|
|
sysArgs[4] Scratch (changed)
|
|
|
|
----------------------------------------------------------------------
|
|
Extension SYS_Sprite6_v3_64
|
|
Extension SYS_Sprite6x_v3_64
|
|
Extension SYS_Sprite6y_v3_64
|
|
Extension SYS_Sprite6xy_v3_64
|
|
----------------------------------------------------------------------
|
|
|
|
Blit sprite in screen memory
|
|
|
|
Variables:
|
|
vAC Destination address in screen
|
|
sysArgs[0:1] Source address of 6xY pixels (colors 0..63) terminated
|
|
by negative byte value N (typically N = -Y)
|
|
sysArgs[2:7] Scratch (user as copy buffer)
|
|
|
|
This SYS function draws a sprite of 6 pixels wide and Y pixels high.
|
|
The pixel data is read sequentually from RAM, in horizontal chunks
|
|
of 6 pixels at a time, and then written to the screen through the
|
|
destination pointer (each chunk underneath the previous), thus
|
|
drawing a 6xY stripe. Pixel values should be non-negative. The first
|
|
negative byte N after a chunk signals the end of the sprite data.
|
|
So the sprite's height Y is determined by the source data and is
|
|
therefore flexible. This negative byte value, typically N == -Y,
|
|
is then used to adjust the destination pointer's high byte, to make
|
|
it easier to draw sprites wider than 6 pixels: just repeat the SYS
|
|
call for as many 6-pixel wide stripes you need. All arguments are
|
|
already left in place to facilitate this. After one call, the source
|
|
pointer will point past that source data, effectively:
|
|
src += Y * 6 + 1
|
|
The destination pointer will have been adjusted as:
|
|
dst += (Y + N) * 256 + 6
|
|
(With arithmetic wrapping around on the same memory page)
|
|
|
|
Y is only limited by source memory, not by CPU cycles. The
|
|
implementation is such that the SYS function self-repeats, each
|
|
time drawing the next 6-pixel chunk. It can typically draw 12
|
|
pixels per scanline this way.
|
|
|
|
-- End of document --
|