=====================
vCPU's SYS extensions
=====================
SYS extensions, or SYS functions, are a method to mix vCPU code
with native Gigatron code, mostly for acceleration purposes. A
SYS function must reside in ROM, because that is where native code
executes from.
The release ROMs come with a set of predefined SYS functions. Any
application in GT1 format can use those freely, provided they do a
version check first.
You can also write your own application-specific SYS functions, but
then you had better publish your program as a ROM file, not as GT1.
SYS extensions are called with the `SYS' instruction from vCPU.
Their ROM address must be put in `sysFn' in zero page. Optional
arguments are in sysArgs[0:7]. It is also common to pass an input
and return value through vAC.
SYS functions are free to do whatever they want, as long as they
play well with the rules of the timing game. For simplicity of
implementation, vCPU measures elapsed time in integer number of
"ticks". 1 tick equals two hardware CPU cycles (1 cycle is 160 ns,
so 1 tick is 320 ns). All vCPU instructions and SYS functions
therefore execute in an even number of cycles. A `nop' must be
inserted if the cycle count would be odd without it.
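
As a rough illustration of the tick arithmetic described above, here
is a minimal Python sketch. The helper names are hypothetical and are
not part of the ROM or its tools; only the constants (160 ns per
cycle, 2 cycles per tick) come from the text.

```python
CYCLE_NS = 160         # one hardware CPU cycle, as stated in the text
CYCLES_PER_TICK = 2    # one vCPU tick is two cycles

def pad_to_even(cycles):
    """Return (padded_cycles, nops) so that the cycle count is even."""
    nops = cycles % 2              # one `nop' if the count is odd
    return cycles + nops, nops

def cycles_to_ticks(cycles):
    """Convert an even cycle count to whole ticks."""
    if cycles % 2:
        raise ValueError("odd cycle count: insert a nop first")
    return cycles // CYCLES_PER_TICK

def ticks_to_ns(ticks):
    """Duration of a number of ticks in nanoseconds."""
    return ticks * CYCLES_PER_TICK * CYCLE_NS
```

For example, a 21-cycle code path needs one `nop' to become 22 cycles,
which is 11 ticks.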
Rule 1
------
SYS functions must complete and reenter the vCPU within their maximum
declared time. This maximum duration is passed to vCPU through the
operand D of the SYS call. Although it is the *caller* that feeds this
maximum value into vCPU, the caller may only pass a value that
matches the SYS function's own declaration. vCPU in turn uses
this value to decide if the function can complete in the remainder
of the current timeslice, or if it must wait for the start of the
next one. The total allowed time is 28-2*D clocks or cycles, with
D<=0. This includes all overhead including that of the SYS vCPU
instruction itself.
Reserving an insufficient number of ticks results in broken video
signals. It's OK for the caller to "ask" for, or "reserve", more
time than will be used. The only cost is that the call may get
dispatched in a later timeslice than strictly necessary. In fact,
some of the more complex built-in SYS functions deliberately declare
a higher value than their implementation needs. This is to allow for
future bug fixing without changing the caller in existing programs.
[Side note: One way to see the raw value of D is as the *excess*
number of ticks that vCPU must accommodate, but represented as
a number <=0. By default vCPU already accounts for 28 cycles for
standard instructions.]
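
The relation between D and the allowed time can be sketched in Python
as follows. This only restates the formula 28-2*D from Rule 1; the
function names are hypothetical, and rounding an odd cycle count up
to the next even number is an assumption based on the even-cycle
requirement.

```python
def max_cycles(d):
    """Total cycles a SYS call may take for reservation D (D <= 0)."""
    if d > 0:
        raise ValueError("D must be <= 0")
    return 28 - 2 * d

def d_for_cycles(n):
    """Reservation D closest to 0 that still covers n cycles."""
    if n % 2:
        n += 1                     # cycle counts are even; round up
    return min(0, (28 - n) // 2)   # nothing to reserve below 28 cycles
```

With these definitions an 80-cycle function needs D = -26, and any
function of 28 cycles or less needs D = 0.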
Rule 2
------
On return, SYS functions always report back, in hardware register
AC, how much time they have *actually* consumed. This returned value
is the negated number of whole ticks spent. Miscounting results in
broken video signals. vCPU uses this to know how much time has
actually passed. This may be less than reserved, but never more.
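
A small Python sketch of the value a SYS function would leave in AC
under Rule 2. It assumes AC is treated as an 8-bit two's complement
value; the function name and the explicit check against the reserved
time are illustrative only.

```python
def ac_return_value(cycles_used, cycles_reserved):
    """Negated whole ticks actually spent, as an 8-bit value for AC."""
    if cycles_used % 2:
        raise ValueError("cycle count must be even")
    if cycles_used > cycles_reserved:
        raise ValueError("used more time than was reserved")
    ticks = cycles_used // 2
    return (-ticks) & 0xFF         # 8-bit two's complement of -ticks
```

For example, a function that reserved 28 cycles but finished in 22
would report -11, which is 245 when viewed as an unsigned byte.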
These rules restrict SYS calls to relatively simple operations,
because they must fit in the time between two horizontal pulses,
including overhead. This translates to roughly 130 CPU cycles at
most. In reality they typically replace a small sequence of vCPU
instructions.
Loops are normally not possible. Small loops are usually unrolled
to reduce overhead. Still, some SYS functions implement iteration
by setting back vPC just before returning. This causes them to be
dispatched again, possibly in the next timeslice.
[Side note: It's best to study the cycle count annotations in the
ROM source code for concrete examples on how things are typically
done: how the return value is passed back to vCPU, how branches are
annotated and how sometimes `nop' instructions are inserted to make
the total cycle count even.]
Naming
------
The naming convention is: SYS_<CamelCase>[_v<V>]_<N>
Here <N> is the maximum number of clocks or cycles (not ticks!) the
function will need, counted from NEXT to NEXT. This duration <N>
must be passed to the 'SYS' vCPU instruction as operand, represented
as `270-max(14,<N>/2)'. In GCL you can just type `<N>!!', and the
compiler translates the operand for you. The gt1dump tool disassembles
and displays <N>.
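
The naming convention and the operand formula above can be tried out
with the following Python sketch. The helper names are hypothetical.
Two assumptions are made: a name without a v<V> part is taken to mean
ROM v1, and the result is masked to one byte because the SYS operand
is a single byte (so the value 256 that the formula gives for small
<N> becomes 0).

```python
import re

# SYS_<CamelCase>[_v<V>]_<N>
_SYS_NAME = re.compile(r"^SYS_([A-Za-z0-9]+)(?:_v(\d+))?_(\d+)$")

def parse_sys_name(name):
    """Split a SYS function name into (base, rom_version, max_cycles)."""
    m = _SYS_NAME.match(name)
    if m is None:
        raise ValueError("not a SYS function name: %r" % name)
    base, version, cycles = m.groups()
    return base, int(version) if version else 1, int(cycles)

def sys_operand(max_cycles):
    """Operand byte for the SYS instruction: 270-max(14,<N>/2)."""
    if max_cycles % 2:
        raise ValueError("<N> must be even")
    return (270 - max(14, max_cycles // 2)) & 0xFF
```

For instance, `SYS_SetMode_v2_80' parses to ("SetMode", 2, 80) and
its operand byte is 230, which is -26 as a signed byte.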
If a SYS extension was introduced after ROM v1, the version number
of introduction is included in the name as v<V>. This helps the
programmer to be reminded to verify the actual ROM version. A program
must fail gracefully on older ROMs, or at least not crash. See also
Docs/GT1-files.txt on using `romType' for this purpose.
Function details
----------------
----------------------------------------------------------------------
Extension SYS_Exec_88
----------------------------------------------------------------------
Load code from ROM into memory and execute it
This loads the vCPU code with consideration of the current vSP
Used during reset, but also for switching between applications or for
loading data from ROM from within an application (overlays).
ROM stream format is [<addrH> <addrL> <n&255> n*<byte>]* 0
on top of lookup tables.
Variables:
sysArgs[0:1] ROM pointer (in)
sysArgs[2:3] RAM pointer (changed)
sysArgs[4] State counter (changed)
vLR vCPU continues here (out)
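
To make the stream format concrete, here is a minimal Python sketch
that splits such a byte stream into segments. It is not the ROM
implementation. It assumes that a stored length byte of 0 stands for
256 bytes (suggested by the <n&255> notation), and it treats any zero
<addrH> as the terminator, so segments in page 0 are not handled.

```python
def parse_exec_stream(data):
    """Return (segments, bytes_consumed) for a stream of the form
    [<addrH> <addrL> <n&255> n*<byte>]* 0
    where each segment is an (address, payload) tuple."""
    segments = []
    i = 0
    while True:
        if i >= len(data):
            raise ValueError("missing terminator")
        addr_hi = data[i]
        if addr_hi == 0:               # terminating 0 ends the list
            break
        if i + 3 > len(data):
            raise ValueError("truncated segment header")
        addr_lo = data[i + 1]
        n = data[i + 2] or 256         # assumption: 0 encodes 256 bytes
        payload = bytes(data[i + 3 : i + 3 + n])
        if len(payload) != n:
            raise ValueError("truncated segment")
        segments.append(((addr_hi << 8) | addr_lo, payload))
        i += 3 + n
    return segments, i + 1             # +1 for the terminator itself
```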
----------------------------------------------------------------------
Extension SYS_Out_22
----------------------------------------------------------------------
Send byte to output port
Variables:
vAC
----------------------------------------------------------------------
Extension SYS_In_24
----------------------------------------------------------------------
Read a byte from the input port
Variables:
vAC
----------------------------------------------------------------------
Extension SYS_Random_34
----------------------------------------------------------------------
Update entropy and copy to vAC
This same algorithm runs automatically once per vertical blank.
Use this function to get numbers at a higher rate.
Variables:
vAC
----------------------------------------------------------------------
Extension SYS_Unpack_56
----------------------------------------------------------------------
Unpack 3 bytes into 4 pixels
Variables:
sysArgs[0:2] Packed bytes (in)
sysArgs[0:3] Pixels (out)
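
A possible Python model of the unpacking, for illustration only. The
bit layout is an assumption that this document does not spell out:
the three bytes are read as one 24-bit little-endian word holding
four 6-bit pixels, with pixel 0 in the lowest bits. Check the ROM
source before relying on it.

```python
def unpack_3_to_4(b0, b1, b2):
    """Unpack three bytes into four 6-bit pixel values (assumed layout)."""
    word = b0 | (b1 << 8) | (b2 << 16)
    return [(word >> (6 * k)) & 0x3F for k in range(4)]

def pack_4_to_3(pixels):
    """Inverse of unpack_3_to_4, useful for preparing packed data."""
    word = 0
    for k, p in enumerate(pixels):
        if not 0 <= p <= 63:
            raise ValueError("pixel values must be 0..63")
        word |= p << (6 * k)
    return word & 0xFF, (word >> 8) & 0xFF, (word >> 16) & 0xFF
```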
----------------------------------------------------------------------
Extension SYS_Draw4_30
----------------------------------------------------------------------
Draw 4 pixels on screen, horizontally next to each other
Variables:
sysArgs[0:3] Pixels (in)
sysArgs[4:5] Position on screen (in)
----------------------------------------------------------------------
Extension SYS_VDrawBits_134
----------------------------------------------------------------------
Draw slice of a character, 8 pixels vertical
Variables:
sysArgs[0] Color 0 "background" (in)
sysArgs[1] Color 1 "pen" (in)
sysArgs[2] 8 bits, highest bit first (changed)
sysArgs[4:5] Position on screen (in)
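
The effect of this function can be modelled in Python as below. This
is a sketch under assumptions: the position is taken as a 16-bit
address whose high byte selects the row, so that vertically adjacent
pixels are 256 bytes apart, and the bit pattern is shifted out while
drawing (the table marks sysArgs[2] as changed). The function name
and the bytearray memory model are hypothetical.

```python
def vdraw_bits(memory, background, pen, bits, position):
    """Draw 8 vertical pixels from `bits', highest bit first.

    memory is a 64K bytearray; returns the shifted-out bit pattern."""
    for row in range(8):
        color = pen if bits & 0x80 else background
        memory[(position + row * 256) & 0xFFFF] = color
        bits = (bits << 1) & 0xFF      # next bit moves to the top
    return bits
```

Calling this once per column of a character bitmap, with the position
advanced by one each time, would draw a complete 8-pixel high glyph.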
----------------------------------------------------------------------
Extension SYS_LSRW1_48
Extension SYS_LSRW2_52
Extension SYS_LSRW3_52
Extension SYS_LSRW4_50
Extension SYS_LSRW5_50
Extension SYS_LSRW6_48
Extension SYS_LSRW7_30
Extension SYS_LSRW8_24
----------------------------------------------------------------------
Shift right by 1..8 bits
Variables:
vAC
----------------------------------------------------------------------
Extension SYS_LSLW4_46
Extension SYS_LSLW8_24
----------------------------------------------------------------------
Shift left by 4 or 8 bits
Variables:
vAC
----------------------------------------------------------------------
Extension SYS_Read3_40
----------------------------------------------------------------------
Read 3 consecutive bytes from ROM
Note: This function is a bit obsolete, as it has very limited use. It's
effectively an application-specific SYS function for the Pictures
application from ROM v1. It requires the ROM data be organized
with trampoline3a and trampoline3b fragments, and their address
in ROM to be known. Better avoid using this.
Variables:
sysArgs[0:2] Bytes (out)
sysArgs[6:7] ROM pointer (in)
----------------------------------------------------------------------
Extension SYS_SetMode_v2_80
----------------------------------------------------------------------
Set video mode to 0 to 3 black scanlines per pixel line.
Mainly for making the MODE command available in Tiny BASIC, so that
the user can experiment. It's advised to refrain from using
SYS_SetMode_v2_80 in regular applications. Video mode is a deeply
personal preference, and the programmer shouldn't overrule the user
in that choice. The Gigatron philosophy is that the end user has
the final say on what happens on the system, not the application,
even if that implies a degraded performance. This doesn't mean that
all applications must work well in all video modes: mode 1 is still
the default. If an application really doesn't work at all in that
mode, it's acceptable to change mode once after loading.
There's no "SYS_GetMode" function.
Variables:
        vAC bit 0:1     Mode:
                          0  "ABCD" -> Full mode (slowest)
                          1  "ABC-" -> Default mode after reset
                          2  "A-C-" -> at67's mode
                          3  "A---" -> HGM's mode
        vAC bit 2:15    Ignored, but should be 0
Special values (ROM v4):
        vAC = 1975      Zombie mode (no video signals, no input,
                        no blinkenlights).
        vAC = -1        Leave zombie mode and restore previous mode.
----------------------------------------------------------------------
Extension SYS_SetMemory_v2_54
----------------------------------------------------------------------
SYS function for setting 1..256 bytes
Variables:
sysArgs[0] Copy count (in, changed)
sysArgs[1] Copy value (in)
sysArgs[2:3] Destination address (in, changed)
----------------------------------------------------------------------
Extension SYS_SendSerial1_v3_80
----------------------------------------------------------------------
SYS function for sending data over serial controller port using
pulse width modulation of the vertical sync signal.
Variables:
sysArgs[0:1] Source address (in, changed)
sysArgs[2] Start bit mask (typically 1) (in, changed)
sysArgs[3] Number of send frames X (in, changed)
The sending will abort if input data is detected on the serial port.
Returns 0 in case of all bits sent, or <>0 in case of abort
This modulates the next upcoming X vertical pulses with the supplied
data. A zero becomes a 7 line vPulse, a one will be 9 lines.
After that, the vPulse width falls back to 8 lines (idle).
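
The pulse width encoding can be illustrated with a short Python
sketch. The 7/9 line widths come from the text above. The bit order
is an assumption: bits are taken starting at the start bit mask,
moving towards the high bit and then on to the next byte. The
function name is hypothetical.

```python
IDLE_LINES = 8     # vPulse width when nothing is being sent

def vpulse_widths(data, start_mask, frames):
    """Return the vPulse width in lines for each of the next frames."""
    widths = []
    index, mask = 0, start_mask
    for _ in range(frames):
        bit = data[index] & mask
        widths.append(9 if bit else 7)     # one -> 9 lines, zero -> 7
        mask = (mask << 1) & 0xFF
        if mask == 0:                      # byte done, go to the next
            mask = 1
            index += 1
    return widths
```

Sending one byte takes 8 frames in this model, so the throughput is
one bit per video frame.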
----------------------------------------------------------------------
Extension SYS_ExpanderControl_v4_40
----------------------------------------------------------------------
Sets the I/O and RAM expander's control register
Variables:
        vAC bit 2       Device enable /SS0
            bit 3       Device enable /SS1
            bit 4       Device enable /SS2
            bit 5       Device enable /SS3
            bit 6       Banking B0
            bit 7       Banking B1
            bit 15      Data out MOSI
        sysArgs[7]      Cache for control state (written to)
Intended for prototyping, and probably too low-level for most applications
Still there's a safeguard: it's not possible to disable RAM using this
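
As an illustration of the bit layout, this Python sketch assembles a
control word for vAC. It assumes that the "/" prefix on /SS0../SS3
means active-low, so a device is selected by clearing its bit, and
that B0 and B1 together form a 2-bit bank number. The function and
parameter names are hypothetical.

```python
def expander_control(bank, device=None, mosi=0):
    """Build a control word: bits 2..5 /SS0../SS3, 6..7 bank, 15 MOSI."""
    if bank not in range(4):
        raise ValueError("bank must be 0..3")
    word = 0b00111100                  # all /SS lines high: none selected
    if device is not None:
        if device not in range(4):
            raise ValueError("device must be 0..3")
        word &= ~(1 << (2 + device))   # assumed active-low select
    word |= bank << 6                  # B0 in bit 6, B1 in bit 7
    word |= (mosi & 1) << 15
    return word
```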
----------------------------------------------------------------------
Extension SYS_Run6502_v4_80
----------------------------------------------------------------------
Transfer control to v6502
Calling 6502 code from vCPU goes (only) through this SYS function.
Directly modifying the vCPUselect variable is unreliable. The
control transfer is immediate, without waiting for the current
time slice to end or first returning to vCPU.
vCPU code and v6502 code can interoperate without much hassle:
- The v6502 program counter is vLR, and v6502 doesn't touch vPC
- Returning to vCPU is with the BRK instruction
- BRK doesn't dump process state on the stack
- vCPU can save/restore the vLR with PUSH/POP
- Stacks are shared, vAC is shared
- vAC can indicate what the v6502 code wants. vAC+1 will be cleared
- Alternative is to leave a word in sysArgs[6:7] (v6502 X and Y registers)
- Another way is to set vPC before BRK, and vCPU will continue there(+2)
Calling v6502 code from vCPU looks like this:
        LDWI SYS_Run6502_v4_80
        STW  sysFn
        LDWI $6502_start_address
        STW  vLR
        SYS  80
Variables:
vAC Accumulator
vLR Program Counter
vSP Stack Pointer (+1)
sysArgs[6] Index Register X
sysArgs[7] Index Register Y
For info:
sysArgs[0:1] Address Register, free to clobber
sysArgs[2] Instruction Register, free to clobber
sysArgs[3:5] Flags, don't touch
----------------------------------------------------------------------
Extension SYS_ResetWaveforms_v4_50
----------------------------------------------------------------------
Setup the default sound waveforms in page 7
soundTable[4x+0] = sawtooth, to be modified into metallic/noise
soundTable[4x+1] = pulse
soundTable[4x+2] = triangle
soundTable[4x+3] = sawtooth, also useful to right shift 2 bits
Used by cold and warm reset
----------------------------------------------------------------------
Extension SYS_ShuffleNoise_v4_46
----------------------------------------------------------------------
Use a simple 6-bit variation of RC4 to permute waveform 0 in soundTable
Used by cold and warm reset
----------------------------------------------------------------------
Extension SYS_SpiExchangeBytes_v4_134
----------------------------------------------------------------------
Send and receive 1..256 bytes over SPI interface
Variables:
sysArgs[0] Page index start, for both send/receive (in, changed)
sysArgs[1] Memory page for send data (in)
sysArgs[2] Page index stop (in)
sysArgs[3] Memory page for receive data (in)
sysArgs[4] Scratch (changed)
----------------------------------------------------------------------
Extension SYS_Sprite6_v3_64
Extension SYS_Sprite6x_v3_64
Extension SYS_Sprite6y_v3_64
Extension SYS_Sprite6xy_v3_64
----------------------------------------------------------------------
Blit sprite in screen memory
Variables:
vAC Destination address in screen
sysArgs[0:1] Source address of 6xY pixels (colors 0..63) terminated
by negative byte value N (typically N = -Y)
sysArgs[2:7] Scratch (used as copy buffer)
This SYS function draws a sprite of 6 pixels wide and Y pixels high.
The pixel data is read sequentially from RAM, in horizontal chunks
of 6 pixels at a time, and then written to the screen through the
destination pointer (each chunk underneath the previous), thus
drawing a 6xY stripe. Pixel values should be non-negative. The first
negative byte N after a chunk signals the end of the sprite data.
So the sprite's height Y is determined by the source data and is
therefore flexible. This negative byte value, typically N == -Y,
is then used to adjust the destination pointer's high byte, to make
it easier to draw sprites wider than 6 pixels: just repeat the SYS
call for as many 6-pixel wide stripes as you need. All arguments are
already left in place to facilitate this. After one call, the source
pointer will point past that source data, effectively:
src += Y * 6 + 1
The destination pointer will have been adjusted as:
dst += (Y + N) * 256 + 6
(With arithmetic wrapping around on the same memory page)
Y is only limited by source memory, not by CPU cycles. The
implementation is such that the SYS function self-repeats, each
time drawing the next 6-pixel chunk. It can typically draw 12
pixels per scanline this way.
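
The pointer arithmetic above can be checked with a small Python model
of one complete call (all self-repeats included). It is a sketch, not
the ROM code: it covers only the plain SYS_Sprite6 behaviour as
described, ignores the x/y variants, and uses a hypothetical 64K
bytearray as memory.

```python
def sprite6(memory, src, dst):
    """Draw one 6-pixel wide stripe; return the updated (src, dst)."""
    def signed(b):
        return b - 256 if b >= 128 else b

    while signed(memory[src]) >= 0:            # negative byte ends data
        for x in range(6):
            lo = (dst + x) & 0xFF              # wrap within the page
            memory[(dst & 0xFF00) | lo] = memory[src + x]
        src += 6
        dst = (dst + 0x100) & 0xFFFF           # next row, one page down
    n = signed(memory[src])                    # terminator N
    src += 1                                   # src += Y * 6 + 1 in total
    hi = ((dst >> 8) + n) & 0xFF               # high byte moved by Y + N
    lo = (dst + 6) & 0xFF                      # low byte moved by 6
    return src, (hi << 8) | lo
```

With N = -Y the destination ends up 6 pixels to the right of where it
started, so the call can be repeated directly for the next stripe.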
-- End of document --