* cpu/drcbec.cpp: Don't clear carry flag on a zero-bit rotate through
carry.
* cpu/drcbex86.cpp: Don't clear carry flag on a word-sized zero-bit
rotate through carry (64-bit case is more involved).
* cpu/drcbex64.cpp: Removed code for another special case of ROLAND that
the simplifier deals with.
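The zero-bit case above can be illustrated with a minimal sketch of a 32-bit rotate left through carry. The names rolc_result and rolc32, and the masking of the count to five bits, are assumptions made for illustration and are not taken from the back-end code.

```cpp
#include <cstdint>

// Hypothetical helper type: result of a 32-bit rotate left through carry.
struct rolc_result
{
    uint32_t value;
    bool carry;
};

// Rotates one bit at a time through the carry flag. With a count of zero
// the loop body never runs, so the incoming carry is returned unchanged
// rather than being cleared.
inline rolc_result rolc32(uint32_t value, unsigned count, bool carry)
{
    count &= 31; // assumption: shift counts are masked to five bits
    for (unsigned i = 0; i < count; i++)
    {
        bool const out = (value >> 31) & 1;       // bit leaving the register
        value = (value << 1) | (carry ? 1U : 0U); // old carry enters at bit 0
        carry = out;
    }
    return { value, carry };
}
```

The point of the sketch is only that the carry is an input as well as an output, so a zero count has to pass it through.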
-konami/ksys573.cpp, bus/pccard/linflash.cpp: Corrected "Gacha Gachamp".
* Actually take a voltage snapshot when R or C change. This was being attempted, but didn't work because set_target_v would exit early if the target V was not changing. Made the snapshotting more explicit.
* Consider the EG done based on elapsed time, instead of proximity to target value. Some low volume DMX sounds were affected by this.
* video/mb_vcu.cpp:
- Implement device_palette_interface for palette functionality.
- Use an address space finder to access the host address space.
- Use logmacro.h helpers for configurable logging.
- Added a VRAM addressing helper.
- Suppress side effects for debugger reads.
- Cleaned up 2bpp graphics drawing and screen update function.
* stern/mazerbla.cpp:
- Reduced run-time tag lookups and preprocessor macros.
- Reduced duplication and unnecessary trampolines.
- Updated comments.
* Implement device_palette_interface for color palette functionality.
* Added some missing members to save states, and use fixed-size integer types for members that need to be saved.
* Moved many internal functions into protected: and private: sections.
* Use more appropriate integer types, made many local variables const.
* machine/ie15_kbd.cpp: Reassigned keys on the IE15 keyboard to match the layout of a VT52 keypad.
* ussr/ms0515.cpp, ussr/dvk_ksm.cpp: Removed keyboard serial speed workaround.
-emu/diexec.cpp: If a shorter input line pulse overlaps a longer pulse, don't shorten the pulse.
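The overlapping pulse rule above can be sketched as keeping the later of the two release times. This is a simplified model under stated assumptions: the class pulsed_line, its members and the tick-based time values are hypothetical and do not reflect the actual emu/diexec.cpp interfaces.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of a pulsed input line; times are in arbitrary ticks.
class pulsed_line
{
public:
    // Assert the line from 'now' for 'duration' ticks. If an earlier pulse
    // is still active and ends later, its release time is kept.
    void pulse(int64_t now, int64_t duration)
    {
        m_end = std::max(m_end, now + duration);
    }

    // The line is asserted until the recorded release time is reached.
    bool asserted(int64_t now) const { return now < m_end; }

private:
    int64_t m_end = 0; // time at which the line is released
};
```

With this rule, a 20-tick pulse issued during a 100-tick pulse leaves the line asserted until the longer pulse ends.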
-cpu/e132xs: Added named input line number constants.
-video/sprite.cpp: Got rid of simple_list and fixed_allocator.
This was "working" on x86-64 due to the back-end treating shift/rotate
instructions with zero immediate bit count as a no-op even if the source
and destination registers aren't the same. Fixing the bug in the
back-end caused it to break the same way on x86-64 as it does on the
other three back-ends that didn't have this bug.
* Fixed many cases that could cause the upper bits of a register not to
be cleared following a 32-bit operation.
* Added more simplifications.
* Allow many simplifications when flag updates are requested.
* Fixed various bits of unreachable code.
* cpu/drcbearm64.cpp, cpu/drcbex64.cpp: Removed code for special-casing
some situations the simplifier can now take care of consistently.
-cpu/drcbex64.cpp: Fixed a bug causing some shifts to be treated as a
no-op when the destination and source are not the same.
-cpu/drcbearm64.cpp, cpu/drcbex64.cpp: Added a special case for
comparing something to itself.
-cpu/e132xs: Use the CARRY instruction rather than a right shift to set
up carry in.
- Working RAM through PPI 8155 internal RAM and handlers.
- Extended the PPI 8155 to support the 14-bit timer and 2-bit control.
- Hooked the i8257 DMA controller.
- Demuxed the digital inputs.
- Adjusted screen visible area.
- Reworked the DMA support so that registers are addressed correctly.
- Hooked the analog inputs.
- Added inputs for two players.
- Added DIP switches for coinage, difficulty, and lives.
- Added and demuxed spinner controls.
- Added NVRAM support.
- Sound support.
- Adjusted the spinner parameters to general-purpose values.
- Sound level control circuitry.
- Wired player lamps.
- Added technical notes.
- Rewrote the enhanced undocumented i8085 RDEL & DSUB
instructions and their flags.
Systems promoted to working
---------------------------
Paracaidista [Roberto Fresca, Grull Osgo]
* sound/c6280.cpp: Improved accuracy of volume control and LFO.
* video/huc6260.cpp: Suppress side effects for debugger reads, fixed save state issues.
* video/huc6270.cpp: Suppress side effects for debugger reads.
* Chose better types for member variables, made more local variables const, reformatted code.
* There are no byte enable or write strobe signals for I/O, and there's
only a single operand size, so word addresses make more sense.
* Also changed STBS/STWS to allow any valid signed or unsigned value of
the applicable size. This allows vamphalf attract mode to work as
well as the storage test.
-misc/limenko.cpp: Better input types for spotty.
* Only a single I/O access is generated for an I/O word read/write. The
upper half just disappears if the pins aren't present. This fixes
"phantom" I/O accesses, allowing address maps to be cleaned up a bit.
* Reduced I/O address width for models with 16-bit external bus to match
hardware.
* Made addressing consistent between interpreter and recompiler for I/O
double-word accesses.
* Implemented power down via internal I/O write for E1-X and later
cores (none of the games I tested actually use it).
-misc/pasha2.cpp: Enabled the recompiler for Zooty Drum - it gets just
as far as the interpreter now.
* Got package option (T, N or B suffix) out of device type.
* Enabled 4x PLL clock multiplier for GMS30C2216/GMS30C2232.
* Implemented entering power down mode via MCR for E1 and E1-X cores.
* Marginally better code generation for a few instructions.
* Log available bus/memory configuration options for different cores.
* Added post load handler for E1-XS and E1-XSR cores to install SDRAM
mode/configuration handlers if necessary.
* Improved comment about different Hynix and Hyperstone CPU models.
-cpu/drcbearm64.cpp: Don't update flags that aren't requested in a few
places.
* Assume ROL sets the V and C flags the same way as SHL, and that MOVI
clears the V flag.
-cpu/drcbex64.cpp: Optimise SUB x,0,y to a NEG instruction (gets down to
one instruction from two or three a lot of the time). This had been a
TODO comment for ages.
-cpu/drcbex86.cpp: Got rid of unnecessary std::function use. This
substantially reduces the code size and reduces allocations during code
generation.
-cpu/drcbearm64.cpp, cpu/drcbex64.cpp, cpu/drcbex86.cpp: Got rid of the
intermediate tables in favour of big switch statements. This improves
startup time, reduces code size, and gives the compiler more
optimisation opportunities.
-cpu/drcbearm64.cpp, cpu/drcbex64.cpp, cpu/drcbex86.cpp: Got rid of
asmjit namespace qualifiers left over from when the class declarations
were in headers and hence outside the scope of the using namespace
statements.
* Fixed behaviour of delayed branches, trace exceptions, and saved PC
calculation for error exceptions in delay slots for the interpreter.
All instructions in delay slots, branching instructions that can raise
exceptions and tracing should now (mis)behave properly for the
interpreter, including things the manual says you shouldn't do.
* Fixed and optimised flag updates for left shifts for the recompiler.
* Optimised ROL instruction for the recompiler and made flag calculation
equivalent to the interpreter both with and without the "Missioncraft
flags" compile-time option.
* Only block interrupts for one instruction following a delayed branch.
* Optimised the SOFTWARE instruction a little for the recompiler.
* Added more SDRAM configuration logging and cleaned up code a bit.
-cpu/drcbearm64.cpp: Apply the change from 7efe37938f to OR and
XOR instructions as well, and fix some cases where a 32-bit logical
operation would fail to clear the upper bits of a register.
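The invariant behind the upper-bits fix above can be sketched in plain C++: a 32-bit logical operation on a 64-bit register uses only the low halves of its sources and leaves the upper half of the destination clear. The function or32 is a hypothetical model written for illustration, not generated or back-end code.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit OR on 64-bit registers: only the low
// halves of the sources take part, and the result is zero-extended so the
// upper 32 bits of the destination end up clear.
inline uint64_t or32(uint64_t src1, uint64_t src2)
{
    uint32_t const result = uint32_t(src1) | uint32_t(src2);
    return uint64_t(result);
}
```

A back-end that instead performed the operation at full register width would let stale upper bits from either source leak into the destination.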
-cpu/drcbex64.cpp: Avoid more conditional branches on conditional MOV.
* Fixed behaviour of exceptions in delay slots, and fixed recompiler not
updating ILC and P for some exceptions.
* Implemented privilege error exception on setting L in user mode for
interpreter.
-emu/debug, osd/modules/debugger: Added an option to show
exceptionpoints in breakpoints windows.
* Implement MMC5 sound emulation
- Heavily based on devices/sound/nes_apu.cpp, adjusted for the differences between the NES APU and MMC5.
* bus/nes/mmc5.cpp: Fix save state support, Implement MMC5 sound
* bus/nes/nes_slot.h: Fix save state support
* sound/nes_defs.h: Fix save state support
Emulate pointer error exception on load/store and range error exception
on store signed byte/half-word.
Further optimised code generation for MOV and MOVI. These are very hot,
so this alone gains a further 2% performance or so in the dgPIX games.
Also some other miscellaneous cleanup.
* Implemented pointer error exceptions on attempting to use a zero
address register (other than SR) in the recompiler.
* Also optimised load/store instructions a bit and reduced copy/paste.
* Fixed a couple of disassembler issues.
-misc/dgpix.cpp: Demoted The X-Files to not working with unemulated
protection.
* Aligned the operand field in disassembly.
* Calculate results of immediate values against the PC to make
position-independent code easier to read without constantly using a
calculator (e.g. this shows destinations for call Rd, PC, imm).
* Added more symbols to the UML helper to make logged generated code
more readable.
* Made single-instruction-per-sequence mode configurable rather than a
compile-time option.
* Got rid of a criminal amount of copy/paste in the disassembler, and
got rid of all the deprecated strcpy calls.
* Got rid of some duplicated constants, changed some constants from
macros to enumerated values or constexpr globals.
* Reduced the amount of stuff in headers that doesn't need to be there.
-cpu/drcbex64.cpp: Don't construct std::function objects during code
generation - they require allocation.
-eolith/eolith.cpp: Turned single-instruction-per-sequence mode on for
now until someone works out why turning it off causes Raccoon World to
generate so much code it's unplayably slow.
* Optimised double word shifts.
* Optimised the most common PC-relative operations to treat PC as
constant when possible, including:
- addi PC,imm (long relative branch)
- add PC,Rs (computed goto)
- sum Rd,PC,imm (calculate PC-relative address)
- add Rd,PC (calculate PC-relative address)
- ldw.d PC,Rs,imm (PC-relative load)
- stw.d PC,Rs,imm (PC-relative store)
* Changed template parameters to LlamaCase to make them more visibly
different to constants/macros.
* Disabled single-instruction-per-block mode.
* Don't bother with delay slot checks where it's unnecessary.
* Try to generate a specialised copy of the delay slot instruction
followed by a direct branch if possible.
* Use the pre-decoded instruction length for updating the PC.
* Specialised versions of the CHK instruction that always or never
raise exceptions.
* sound/s_dsp.cpp: Fix pitch modulation emulation, Fix save state support
reference: https://snes.nesdev.org/wiki/SNESdev_Wiki
* sound/s_dsp.cpp: Fix indent
* s_dsp.cpp: Reduce unnecessary lines, Fix typenames
* sound/s_dsp.cpp: More std::clamp uses, Use BIT for single bit flags
* sound/s_dsp.cpp: Fix input clock, Fix indent, Use lowercase hexadecimal values, Use reference for voice state
reference: https://snes.nesdev.org/wiki/S-SMP
* sound/s_dsp.cpp: Use logmacro.h for logging, Use BIT for single bit flags
* Fixed failing to call the debugger instruction hook for the first
instruction following an interrupt, exception or trap.
* Use UML branches to emulate non-delayed intra-block branches, avoiding
the expensive "hash jump".
* Re-worked the instruction description code:
- Calculate static branch targets for more instructions.
- Flag instructions that may cause mode changes.
- Don't be so eager to end an instruction sequence.
- Removed the local register input/output flags - FP may not be the
same when executing the code as when describing instructions.
* Fixed interpreter incorrectly setting ILC when an interrupt
immediately follows a RET instruction.
* Fixed recompiler flag calculation regressions, and optimised a little.
* Fixes interrupts not being serviced while tracing.
* Further improves recompiler performance.
* Fixes recompiler interrupt check function calling itself recursively.
* Also added debugger exception hook calls to interpreter and recompiler.
* Fixed XM (index move) instructions failing to update the destination
register on range error for interpreter and recompiler.
* Fixed double-word stores when the source indicates SR (both stored
words are zero) for interpreter and recompiler.
* Fixed recompiler failing to set ILC and P on range error and frame
error exceptions.
* Optimised recompiled code for word size shifts.
* Pushed more recompiler logic from run-time to code generation time and
simplified delay slot PC check and trace check logic.
* Use MOV rather than LOAD where possible in recompiler to improve code
generation performance and symbolic memory location names in
disassembled UML.
* Updated TODO list in header comment, reduced copy/paste some more.
-cpu/drcbex64.cpp: Avoid some more unnecessary register copies for
ROLAND.
cpu/e132xs: Implemented supervisor and trace modes as recompiler modes.
This eliminates or simplifies a lot of run-time checks. In particular,
the trace checks on every instruction are not generated when not
tracing, and simplified to just checking the P bit when tracing.
cpu/e132xs: Optimised code generation for RET, avoid a redundant load
when checking for an overflow trap, use the exception parameter for
exception codes rather than generating one function for each possible
code. Also simplified interpreter code for RET.
cpu/e132xs: Implemented SUMS for the recompiler.
cpu/e132xs: Implemented privilege check for setting L (interrupt
lockout) for recompiler. Not implemented for interpreter.
cpu/e132xs: Partially fixed tracing. P flag should be set by all
instructions except RET. Trace exceptions are not triggered for
branches when using the recompiler.
cpu/e132xs: Fixed ILC being set incorrectly for RET.
cpu/drcbex64.cpp: Avoid unnecessary expensive operations when a shift
operation requests the zero and/or sign flags but not the carry flag.
* Also avoid a redundant load when checking if trace is active.
* Reduces generated native instruction count by about 24% on x86-64 and
gives an overall performance improvement of about 3.5% in -bench
scores.
* Made interrupt check function generate far more compact code (about
85% reduction in number of native instructions on x86-64).
* Optimised out-of-cycles check.
* Applied prior optimisation for trap/interrupt checks to static
exception checks as well (code is still copy/pasted).
* Cleaned up and commented code for generating an exception, reducing
about nine memory accesses to update SR to two.
* Implemented NEGS, and fixed ADDS and SUBS not setting exception handler
address.
* Optimised code to update Z flag on logic operations to avoid branches.
* Reduced copy/paste a bit more.
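The branch-free Z flag update mentioned above can be illustrated in C++ as follows. This is a sketch of the general idea only: the recompiler emits UML rather than C++, the helper update_z is hypothetical, and placing Z at bit 1 of SR is an assumption made for the example.

```cpp
#include <cstdint>

constexpr uint32_t Z_MASK = 0x00000002; // assumption: Z is bit 1 of SR

// Hypothetical helper: sets or clears Z in 'sr' according to 'result'.
// The comparison is used as a 0/1 value and scaled to the flag position,
// so no conditional jump is needed to choose between set and clear.
inline uint32_t update_z(uint32_t sr, uint32_t result)
{
    uint32_t const z = uint32_t(result == 0) * Z_MASK;
    return (sr & ~Z_MASK) | z;
}
```

The other flag bits in SR pass through untouched, which is what lets the update be folded into a single mask-and-merge.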
cpu/e132xs.cpp: Refactored code generation to improve performance and
fixed some issues:
* Moved a considerable amount of logic from execution time to code
generation time.
* Fixed some cases where add/subtract carry was being interpreted
incorrectly.
* Fixed a case where a load double instruction was incorrectly writing
the same register twice.
* Use UML flags to generate condition codes for addition/subtraction.
* Use UML carry flag for carry-/borrow-in.
* Reduced UML register pressure (improves performance for hosts with
fewer callee-saved CPU registers).
* Moved more logic to helper functions to simplify maintenance.
cpu/drcbex64.cpp: Fixed upper bits of UML registers being cleared when
used as address offset for LOAD/STORE.
cpu/drcbex64.cpp: Don't do expensive zero/sign flag update for shift
operations if only carry flag will be used.
cpu/drcbex64.cpp: Reduced copy/paste in READ[M]/WRITE[M] generators.
Sega Rally has an instruction that calculates d += p and loads a value into d at the same time; it is the loaded value that should be used, not the result of the ALU operation.
Also only test the d register when performing an ALU operation.
cpu/drcbex64.cpp: Avoid a lot of unnecessary flag manipulation on
shift/rotate operations. Don't calculate flags when not requested.
Don't preserve carry in for operations that don't use it as an input.
cpu/drcbex64.cpp: Avoid loading CL when ECX can be used. Loading CL
doesn't clear the upper bits, so it depends on the previous value of
RCX, causing pipeline dependencies. Loading ECX can grab a fresh rename
register.
cpu/drcbearm64.cpp: Attempt more optimisation on one more load immediate
operation.
cpu/e132xs: Get rid of a redundant TEST - ROLAND can set the Z flag.
* cpu/uml.cpp: Handle some more cases where ROLAND can be turned into
AND in the simplifier.
* cpu/drcbearm64.cpp, cpu/drcbex64.cpp: Fixed a number of cases where
4-byte operations wouldn't clear the upper half of the destination
(there are plenty more of these caused by the simplifier that will be
harder to fix).
* cpu/drcbearm64.cpp: Fixed some cases where a conditional MOV could
unexpectedly clear the upper bits of the destination.
* cpu/drcbex64.cpp: Improved code generation for various arithmetic and
logical operations. More AND/OR/XOR/ADD/ADDC operand combinations are
optimised. Special cases of ROLAND/ROLINS are optimised.
* cpu/drcbex64.cpp: Don't treat operands to FADD/FMUL as commutative.
This isn't true when one is a NaN.
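For the ROLAND simplification noted in this list, one obvious case can be sketched in C++, assuming ROLAND computes a left rotate of the source followed by an AND with the mask. The helpers rol32 and roland32 are hypothetical and exist only for this illustration; which additional cases the simplifier actually handles is not stated here.

```cpp
#include <cstdint>

// Rotate a 32-bit value left; the zero case is handled separately to
// avoid shifting by the full register width.
inline uint32_t rol32(uint32_t value, unsigned shift)
{
    shift &= 31;
    return shift ? ((value << shift) | (value >> (32 - shift))) : value;
}

// Assumed ROLAND semantics: rotate left, then mask.
inline uint32_t roland32(uint32_t src, unsigned shift, uint32_t mask)
{
    return rol32(src, shift) & mask;
}
```

When the shift is a known zero the rotate is the identity, so the whole operation reduces to a plain AND of the source with the mask.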
-cpu/e132xs: Use osd_printf_error for diagnostic output, and make more
local variables const.