The free-for-all on labels in software lists is not working. There's no
consistency, labels are getting excessively long, people are starting to
use non-ASCII characters in labels making it harder for others to type
them when manipulating files on the command line, and there's too much
markup being put in labels.

The length limit is 127 characters, same as for labels in MAME itself.
This should be long enough to be descriptive. Remember that the Win32
path limit is 260 characters, and many applications and frameworks have
issues with longer paths, including Windows Explorer and the .NET
framework. Labels are used as filenames, so concessions need to be
made for this.

I have not abbreviated excessively long labels myself - they're
currently causing 135 validity errors. Someone else can fix them.

Printable ASCII characters are allowed, with a few exceptions. The
exceptions are limited to characters most likely to cause issues for
interactive shells and scripts:
* ! - csh event substitution (very difficult to escape properly)
* $ - sh variable expansion
* % - csh job control, cmd variable expansion
* / - UNIX directory separator
* : - sh path separator, Windows drive qualifier
* \ - sh escape, Windows directory separator
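
As a rough illustration of the rules above (not the actual validation
code, which lives in MAME itself), a label check could be sketched in
Python along these lines. The function name and the reading of
"printable ASCII" as 0x20 through 0x7E are assumptions made for this
sketch.

```python
# Illustrative sketch only; this is not MAME's validation code.
MAX_LABEL_LENGTH = 127
FORBIDDEN_CHARS = set('!$%/:\\')


def label_problems(label):
    """Return a list of problems with a label; an empty list means it passes."""
    problems = []
    if len(label) > MAX_LABEL_LENGTH:
        problems.append(f"too long ({len(label)} > {MAX_LABEL_LENGTH})")
    # Assumption: "printable ASCII" means 0x20 (space) through 0x7E (~).
    bad = sorted({ch for ch in label if not 0x20 <= ord(ch) <= 0x7E})
    if bad:
        problems.append("non-printable or non-ASCII: " + " ".join(repr(ch) for ch in bad))
    banned = sorted({ch for ch in label if ch in FORBIDDEN_CHARS})
    if banned:
        problems.append("forbidden: " + " ".join(banned))
    return problems
```

A label passes when the returned list is empty. Otherwise each entry
names one rule that was broken.
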
Most of the labels that had to be edited were using ! for markup, or
using ! and % for titles in labels. Strangely, titles in labels are
often forced to lower case, despite this never being enforced for
software lists. There are also various other edits to titles used for
labels, such as moving articles to the end (with or without a comma),
or replacing spaces with underscores. As I already said, there's no
consistency at all.

There is far too much markup in labels. They're even being used for
notes in some cases (e.g. at least one case where a dumper's name is in
the label). The XML schema supports metadata - use it. For example,
you can use part_id for an unrestricted display name for a software
part. You can also use XML comments for notes.
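
To make the split between label and metadata concrete, here is a
small Python sketch that reads a software list fragment and pulls out
the part_id next to the labels. The fragment is hypothetical and
trimmed (made-up names and title, no hashes), and it assumes the
part_id is stored as a feature element on the part and that the label
is the name attribute of a rom element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed fragment: every name and value here is made up.
FRAGMENT = """
<software name="example">
    <description>Example Title!</description>
    <!-- Notes such as who dumped the tape can go in a comment like this. -->
    <part name="cass1" interface="example_cass">
        <feature name="part_id" value="Example Title! (Side A)"/>
        <dataarea name="cass" size="1024">
            <rom name="example title (side a).tap" size="1024"/>
        </dataarea>
    </part>
</software>
"""


def part_display_names(xml_text):
    """Map each part name to (part_id display name or None, list of labels)."""
    root = ET.fromstring(xml_text)
    result = {}
    for part in root.iter("part"):
        part_id = None
        for feature in part.findall("feature"):
            if feature.get("name") == "part_id":
                part_id = feature.get("value")
        labels = [rom.get("name") for rom in part.iter("rom")]
        result[part.get("name")] = (part_id, labels)
    return result
```

The display name keeps the "!" that the label has to drop, and the
note about the dumper stays out of the filename entirely.
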
And while on the topic of metadata, vgmplay.xml is putting the same
thing in the part_id as well as the label. The part_id should have
the actual title, not the title mangled to make it more suitable for
use as a filename. Addressing this would be a lot of work, given how
large the file is.

For now, empty data areas in software lists cause a verbose message
rather than a validation warning. There are thousands of software
lists using empty data areas to indicate the size/width of cartridge
RAM/EEPROM/etc.
* Example data tidying
This is an example of what I'd like to do to the existing cassette dumps:
* Conform to 8.3 filename structure, for both the software name and ROM names
* Give credit to whoever dumped it, which also makes it easier to track these down
* Optimise the image with tapclean, which changes the file's hashes (but not the payload's magic CRC32, as the target data itself is unaltered)
Optimising the image makes tapclean report an overall PASS instead of FAIL. I think this is as close as we can get to canon tape dumps.
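
For anyone wanting to check names against the 8.3 structure
mechanically, a minimal Python sketch follows. The allowed character
set (lower-case letters, digits, underscore) is an assumption made for
this sketch, and the extension is treated as optional so the same check
works for software names, which have none.

```python
import re

# Assumption: 1-8 base characters, optionally followed by a dot and
# 1-3 extension characters, all from lower-case letters, digits and "_".
EIGHT_DOT_THREE = re.compile(r"[a-z0-9_]{1,8}(\.[a-z0-9_]{1,3})?")


def is_8_3(name):
    """Return True if the whole name fits the 8.3 pattern above."""
    return EIGHT_DOT_THREE.fullmatch(name) is not None
```
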
* Give more examples of tidying up The Ultimate Tape Archive data
* Add a partial tape, B-side only
It's incomplete, but a start.
* Add a full tape, Hacker II
* Add partial dump of James Pond 2: RoboCod
* Add partial dump of Kettle
I'm only adding dumps that completely passed tapclean's inspection,
after optimisation. As this optimisation reduces wow and flutter
and other arbitrary timing aspects, it's reasonable to assume that
other people dumping the same tapes will be able to verify the present
sides and fill in the missing ones, which are oftentimes duplicates
anyway.
* Empty commit; note
Actually, that last commit message is incorrect. Identical dumps
have matching magic CRC32s (payload only), but not matching overall
CRC32s. There's still arbitrary data in there.
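
A toy Python illustration of the difference, for the record. This is
not the TAP format and not how tapclean computes its magic CRC32 (that
isn't described here). It only shows why a checksum over the payload
alone can match while a checksum over the whole file does not. The
container layout and helper names are hypothetical.

```python
import zlib


def toy_dump(timing_noise, payload):
    """Toy container: arbitrary leading bytes, then the payload."""
    return timing_noise + payload


def overall_crc(dump):
    """CRC32 over every byte of the file."""
    return zlib.crc32(dump)


def payload_crc(dump, noise_length):
    """CRC32 over the payload only, skipping the arbitrary leading bytes."""
    return zlib.crc32(dump[noise_length:])


payload = b"the program data"
# Two dumps of the same tape, differing in one arbitrary byte.
dump_a = toy_dump(b"\x2f\x30\x2f", payload)
dump_b = toy_dump(b"\x2f\x31\x2f", payload)

same_payload = payload_crc(dump_a, 3) == payload_crc(dump_b, 3)
same_file = overall_crc(dump_a) == overall_crc(dump_b)
```

Here same_payload is true and same_file is false, which is the
situation described above.
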
* These programs were rereleased by Hi Tec in 1989
...as per the inlay cards.
* Add full rerelease of Summer Camp
* Add Tetris
* Add The Greed Monster
* Fix part names; give credit
Get rid of a couple of copies of the CC0 text. Add header comment to
CC0 files to remind people editing them what the terms are. Also add
some missing XML headers. The header comments in layouts won't bloat
the binary - they get stripped out before compressing, same as any other
comments.
* Add all 1985 tapes from INPUT 64
* Add all 1986 tapes/disks from INPUT 64
* Add all 1987 tapes/disks from INPUT 64
* Add all 1988 disks from INPUT 64
* Hit Squad ➡️ The Hit Squad
This is a brand/label of Ocean. See any of their packaging to verify it's "The Hit Squad".
* Tentatively add more C64 tape dumps
* Add some more UK C64 tapes
It's a start...
* Hewson (Rack IT) ➡️ Rack It
It looks like the publisher should take the form "Label" rather than "Company (Label)", judging by "The Hit Squad" (Ocean), "Mastertronic Added Dimension" (Mastertronic), etc., so let's be consistent about that.
Also, it's "Rack It", not "Rack IT". See e.g. the scan-in at https://archive.org/25/items/uta_Steel_1988_Hewson_Rack_IT_7197/uta_Steel_1988_Hewson_Rack_IT_7197_screenshot.jpg which shows the label name for both the copyright and address.
* Use labels consistently
Gremlin Graphics (GBH) ➡️ GBH
Grandslam (Bug Byte) ➡️ Bug Byte
CDS Software (Blue Ribbon) ➡️ Blue Ribbon
* Add tape