SPARC International, Inc.

Errata for "The SPARC Architecture Manual, Version 9"

The following is a list of corrections for known errors in "The SPARC Architecture Manual, Version 9" book. Page number references are taken from the 1st (1994) printing, R1.4.2 (revision 1.4.2, identified by the text "SAV09R1429309" inside the front cover), unless otherwise indicated.

[1] Page 212 (second page of the Read State Register description), 4th paragraph from the top is printed as:

"RDFPRS waits for all pending FPops to complete before reading the FPRS register."

It *should* read:

"RDFPRS waits for all pending FPops **and loads of floating-point registers** to complete before reading the FPRS register."

[2] Page 234 (Tagged Add):
The "op3" column is incorrect in the Opcode table; the low-order bit should be "0" for all Tagged-Add instructions. The table should read:

Opcode	op3	Operation
TADDcc	10 0000 ...
TADDccTV	10 0010 ...
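
As an illustration only (not part of the manual), the corrected op3 values can be checked mechanically. The sketch below assumes the usual format-3 layout, with op3 in bits 24..19 of the instruction word; the helper names are hypothetical.

```python
# Corrected op3 encodings from the table above (spaces removed).
TAGGED_ADD_OP3 = {
    "TADDcc": 0b100000,
    "TADDccTV": 0b100010,
}

def encode_format3_reg(op3: int, rd: int, rs1: int, rs2: int) -> int:
    """Build a register-register format-3 word (op=2, i=0). Illustrative helper."""
    return (2 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | rs2

def op3_field(insn: int) -> int:
    """Extract op3, assumed to occupy bits 24..19 of the instruction word."""
    return (insn >> 19) & 0x3F

# The erratum: the low-order op3 bit is 0 for every Tagged-Add opcode.
for name, op3 in TAGGED_ADD_OP3.items():
    assert (op3 & 1) == 0, name
```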

[3] Page 80 (subsection 6.3.6.4, RESTORED description):
In the last line of 6.3.6.4, change:
CLEANWIN < NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))

[4] Page 216 (RESTORED):
Third paragraph, last sentence, change
CLEANWIN != NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))
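
A minimal sketch of why the bound matters, assuming (as in the manual's RESTORED description) that the condition guards an increment of CLEANWIN. The function name is hypothetical, and only the CLEANWIN update is modeled.

```python
def restored_cleanwin(cleanwin: int, nwindows: int) -> int:
    """Return CLEANWIN after RESTORED, using the corrected guard."""
    if cleanwin < nwindows - 1:
        return cleanwin + 1
    # CLEANWIN is already at its maximum legal value, NWINDOWS-1.
    return cleanwin
```

With NWINDOWS = 8, a CLEANWIN of 7 stays at 7. Under the uncorrected guards it would have been incremented to 8, outside the range 0..NWINDOWS-1.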

[5] Page 76, Section 6.3.4.[12] (branches):
A *taken* conditional branch (not just a conditional branch) should have been referred to in the last sentences of two subsections.

Change the last sentence in 6.3.4.1, "Conditional Branches", to:

Note that the annul behavior of a taken conditional branch is different from that of an unconditional branch.

And change the last sentence in 6.3.4.2, "Unconditional Branches" to:

Note that the annul behavior of an unconditional branch is different from that of a taken conditional branch.

[6] Page 290, Section G, Table 43:
In the table entries for "cas", "casl", "casx", and "casxl", the built-in constant names beginning with "ASI" should all be preceded by "#" (as they were correctly specified on p.286).

[7] Page 242, Write State Register page:
In the Exceptions section:
"WRASR with rs1=16..31"
should read:

"WRASR with rd=16..31".

[8] Page 57, subsection 5.2.10 (Register-Window State Registers):
A clarification has been added to Section 5, to allow an implementation with 16 or fewer register windows the option to implement the CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers with fewer than 5 bits each, if desired. The following text was added:

IMPL. DEP. #126: Privileged registers CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain values in the range 0..NWINDOWS-1. The effect of writing a value greater than NWINDOWS-1 to any of these registers is undefined. Although the width of each of these five registers is nominally 5 bits, the width is implementation-dependent and shall be between ceil(log2(NWINDOWS)) and 5 bits, inclusive. If fewer than 5 bits are implemented, the unimplemented upper bits shall read as 0 and writes to them shall have no effect. All five registers shall be the same width.
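
The width rule can be sketched as follows. This is an illustration under stated assumptions, not text from the manual: the function names are hypothetical, and writes of values greater than NWINDOWS-1 (whose effect is undefined) are not modeled.

```python
def min_window_reg_width(nwindows: int) -> int:
    """Smallest permitted width, ceil(log2(NWINDOWS)), using integer arithmetic."""
    return (nwindows - 1).bit_length()

def permitted_widths(nwindows: int) -> range:
    """All widths allowed by the rule: ceil(log2(NWINDOWS)) through 5, inclusive."""
    return range(min_window_reg_width(nwindows), 6)

def read_back(written: int, width: int) -> int:
    """Value read after a write when only `width` low-order bits are implemented.

    Unimplemented upper bits read as 0, and writes to them have no effect.
    """
    return written & ((1 << width) - 1)
```

For example, an implementation with NWINDOWS = 8 may implement each of the five registers with 3, 4, or 5 bits.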

[9] Page 268, Table 32:
As a privileged instruction, "RDPR" should be listed with a trailing superscript "P".

[10] Pages 58-59, subsection 5.2.10 (Register-Window State Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers that the effect of writing a value to them greater than NWINDOWS-1 is undefined.

[11] Page 81:
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".

[12] Page 81:
In section 6.3.9, a sentence was added stating that FSR.cexc and FSR.ftt are cleared by FMOVcc and FMOVr whether or not the move occurs.

[13] Page 171, Annex A: Sentence added specifying that LDFSR does not affect the upper 32 bits of FSR.

[14] Page 220(r142)/A.49(r142), third paragraph:
The words "the" and "and" were transposed in the implementation dependency description. It now reads: "The location of the SIR_enable control flag and the means of accessing the SIR_enable control flag..."

[15] Page 229, paragraph beginning "Store integer...": "...used for the load..." changed to "...used for the store...".

[16] Page 231, Annex A: Corrected SWAP deprecation note to recommend use of "CASA" or "CASXA" (not "CASX") in place of SWAP.

[17] Page 258, D.3.3., rule (1): The text was clarified, to read "(1) The execution of Y is conditional on X, and S(Y) is true."

[18] Page 312, Annex I:
Missing word "not" added to Compatibility Note: "The coprocessor opcodes were eliminated because they have not been used in SPARC-V7 and SPARC-V8, ..."

[19] Page 195, Annex A:
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

"movre" and "movrz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRZ.

"movrne" and "movrnz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRNZ.

[20] Page 228, Annex A:

Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

[21] Page 241, Annex A:
Added footnote to Suggested Assembly Language Syntax table, noting that the suggested syntax for WRASR with rd=16..31 may vary, citing reference to implementation dependency #48.

(Suggested Assembly Language Syntax is just that -- *suggested* -- so isn't part of the architecture specification anyway, but this change makes it clearer that if bits are interpreted differently in the instruction, one should expect its assembly-language syntax to change, as well)

[22] Page 40, Table 7:
Changed leftmost column text as follows:
"Single" to "Single f.p. or 32-bit integer"
"Double" to "Double f.p. or 64-bit integer"
"Quad" to "Quad f.p."

Corrections 22-57 were incorporated into R1.4.5, Dec 1999, which was to be used for the 2nd printing of the book. R1.4.5 (revision 1.4.5) can be identified by the text "SAV09R1429309" inside the front cover of the book. These corrections also appear in all subsequent revisions.

[23] p.13, subsection 2.57, definition of "reserved":

Wording:
"...intended to run on future version of"
was corrected to read:
"...intended to run on future versions of".

The sentence beginning "Reserved register fields" was amended to read: "Reserved register fields should always be written by software with values of those fields previously read from that register, or with zeroes; they should read as zero in hardware."

[24] p.21(r142), Editor's Notes:
Added Les Kohn's name to the Acknowledgements.

[25] p.28(r142), Tables 3,4,5:
Made the use of hyphens and dashes consistent and easier to read.

[26] p.30(r142), paragraph just above subsection 5.1:
Changed end of sentence to read:

"...should be written with the values of those bits previously read from that register, or with zeroes."

[27] p.40(r142), Table 7:
Added lines for 32-bit and 64-bit signed integers in f.p. registers, for clarity.

[28] p.51, Figure 17:
Added bits 11..10 to the figure, so it looks like:

PID1	PID0	CLE	TLE	MM	RED	PEF	AM	PRIV	IE	AG
11	10	9	8	7:6	5	4	3	2	1	0

\_________/ (changed here: PID1 and PID0)

[29] p.52(r142), inserted new subsection 5.2.1.1 before old one:
"IMPL. DEP. #127: The presence and semantics of PSTATE.PID1 and PSTATE.PID0 are implementation-dependent. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes. See also TSTATE bits 19..18."

[30] p.55(r142), Figure 22, (TSTATE register):
Extended the "saved PSTATE" field up through bit 19 of TSTATE; changed the diagram to look like:

...	ASI from TL=x	---	PSTATE from TL=x	...

		\________/ (changed here)

[31] p.56(r142):
Added a new paragraph to the end of subsection 5.2.6:

"TSTATE bits 19 and 18 are implementation-dependent. IMPL. DEP. #127: If PSTATE bit 11 (10) is implemented, TSTATE bit 19 (18) shall be implemented and contain the state of PSTATE bit 11 (10) from the previous trap level. If PSTATE bit 11 (10) is not implemented, TSTATE bit 19 (18) shall read as zero. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes."

[32] p.57(r142), subsection 5.2.10 (Register-Window State Registers): Added implementation dependency #126:

IMPL. DEP. #126: Privileged registers CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain values in the range 0..NWINDOWS-1. The effect of writing a value greater than NWINDOWS-1 to any of these registers is undefined. Although the width of each of these five registers is nominally 5 bits, the width is implementation-dependent and shall be between ceil(log2(NWINDOWS)) and 5 bits, inclusive. If fewer than 5 bits are implemented, the unimplemented upper bits shall read as 0, and writes to them shall have no effect. All five registers should have the same width.

[33] pp.58-9(r142), subsection 5.2.10 (Register-Window State Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN registers that the effect of writing a value to them greater than NWINDOWS-1 is undefined.

[34] p.76, Section 6:
Last sentence in 6.3.4.1, "Conditional Branches" changed to:

Note that the annul behavior of a taken conditional branch is different from that of an unconditional branch.

And the last sentence in 6.3.4.2, "Unconditional Branches" changed to:

Note that the annul behavior of an unconditional branch is different from that of a taken conditional branch.

[35] p.80(r142), 6.3.6.4(r142), RESTORED:
Corrected the equation with CLEANWIN to read
"(CLEANWIN < (NWINDOWS-1))",
and corrected the text above it.

[36] p.81(r141/r142):
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".

[37] p.81(r141/r142):
In section 6.3.9, a sentence was added stating that the conditional moves FMOVcc and FMOVr clear FSR.cexc and FSR.ftt:

FMOVcc and FMOVr instructions clear these FSR fields regardless of the value of the conditional predicate.

[38] p.121(r141/r142):
An index entry for "non-faulting loads" was fixed in section 8.3.

[39] p.151(r142), A.9(r142), Compare and Swap page:
Added mention of CASL and CASXL to the Programming Note:

Compare and Swap Little (CASL) and Compare and Swap Extended Little (CASXL) synthetic instructions are available for "little endian" memory accesses.

[40] p.171, Annex A, "Load Floating-Point":
Sentence added:
The upper 32 bits of FSR are unaffected by LDFSR.

[41] p.181(r141/r142):
Section number "A.31" was fixed so it now increments to A.32. All following section numbers and odd page headers in Annex A have changed.

[42] p.191(r141/r142):
Misspelling corrected in page heading: "Conditon" --> "Condition"

[43] p.195(r141/r142), "Move Integer Register on Register Condition (MOVR)":
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

"movre" and "movrz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRZ.

"movrne" and "movrnz", as the assembly-language mnemonic and its synonym, were exchanged to correspond with the instruction name of MOVRNZ.

[44] p.212(r14[123]) A.43(r14[12])/A.44(r144), second page of the Read State Register instruction description, 4th paragraph SHOULD read:

"RDFPRS waits for all pending FPops **and loads of floating-point registers** to complete before reading the FPRS register."

[45] p.216(r142), A.46(r142), RESTORED page:
Correct the equation with CLEANWIN to read

" (CLEANWIN < (NWINDOWS-1))".

[46] p.220(r142)/A.49(r142), third paragraph:
The words "the" and "and" were transposed in the implementation dependency description. It now reads:

"The location of the SIR_enable control flag and the means of accessing the SIR_enable control flag..."

[47] p.228(r141/r142):
Order of instructions in Suggested Assembly Language Syntax was rearranged to correspond to order of the instructions in the Opcode/op3/Operation table above it.

[48] (duplicate of erratum #15)

[49] p.231(r142)/233(r144), Annex A:
Corrected SWAP deprecation note to recommend use of "CASA" or "CASXA" (not "CASX") in place of SWAP.

[50] p.234, A.58(r14[12])/A.59(r144), Tagged Add:
op3 opcodes are wrong. Both should have "0" for low-order bit (as is correctly specified in Appendix E).

[51] p.241(r142), A.62(r142), Write State Register page:

Added footnote to Suggested Assembly Language Syntax table, noting that the suggested syntax for WRASR with rd=16..31 may vary, citing reference to implementation dependency #48. (Suggested Assembly Language Syntax is just that -- *suggested* -- so isn't a normative part of the architecture specification anyway, but this makes it clearer that if bits are interpreted differently in the instruction, one should expect its assembly-language syntax to change, as well.)

[52] p.242(r142), A.62(r142), Write State Register page:
In the Exceptions section, "WRASR with rs1=16..31" now reads "WRASR with rd=16..31".

[53] p.253(r142), Annex C:
Fixed 6 incorrect index entries.

[54] p.253(r142), Annex C:
Added a new Implementation Dependency:

#	Cat	Def/Ref	Description
127	f	52, 56	The presence and semantics of PSTATE.PID1 and PSTATE.PID0 are implementation-dependent. The presence of TSTATE bits 19 and 18 is implementation-dependent. If PSTATE bit 11 (10) is implemented, TSTATE bit 19 (18) shall be implemented and contain the state of PSTATE bit 11 (10) from the previous trap level. If PSTATE bit 11 (10) is not implemented, TSTATE bit 19 (18) shall read as zero. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes.
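
The relationship between PSTATE bits 11..10 and TSTATE bits 19..18 described in implementation dependency #127 can be sketched as below. This is an illustration, not part of the manual: it assumes the saved-PSTATE field is simply PSTATE shifted left by 8 (so that PSTATE bit 11 lands in TSTATE bit 19) and that PSTATE bits 9..0 are always saved. The function and parameter names are hypothetical.

```python
PSTATE_PID1 = 1 << 11
PSTATE_PID0 = 1 << 10
PSTATE_LOW_MASK = 0x3FF      # PSTATE bits 9..0 (assumed always implemented here)
TSTATE_PSTATE_SHIFT = 8      # PSTATE bit 11 (10) maps to TSTATE bit 19 (18)

def saved_pstate_field(pstate: int, pid1_impl: bool, pid0_impl: bool) -> int:
    """Return TSTATE bits 19..8 as saved from the previous trap level's PSTATE.

    A TSTATE bit whose PSTATE counterpart is unimplemented reads as zero.
    """
    mask = PSTATE_LOW_MASK
    if pid1_impl:
        mask |= PSTATE_PID1
    if pid0_impl:
        mask |= PSTATE_PID0
    return (pstate & mask) << TSTATE_PSTATE_SHIFT
```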

[55] p.255(r142), Annex C:
Added implementation dependency #126.
(see correction #32 above for the text of implementation dependency #126)

[56] p.258(r142), D.3.3., rule (1):
The text was clarified, to read:
"(1) The execution of Y is conditional on X, and S(Y) is true."

[57] p.268(r142), Table 32:
As a privileged instruction, "RDPR" should be listed with a superscripted suffix "P".

[58] p.290(r142), Section G, Table 43:
Insert "#" before the "ASI" in the compare-and-swap synthetic instruction entries.

Corrections 58-on have been incorporated into R1.4.6, which has not yet been published.

[59] In Figure 3 in Chapter 6 (p.62), the 4th format description from the bottom of the page (op,rd,op3,rs1,i=0,--,rs2) contains an error; "i=0" should read "i=1".

[60] In section 6.3.1, "Memory Access Instructions", on p.67,
"and CAS accesses words or doublewords."
should be amended to read:
"CASA accesses words, and CASXA accesses doublewords."

[61] In section 7.7, the async_data_error exception description should be updated to read as follows:

async_data_error [tt = 0x040] (Precise, Deferred, or Disrupting) -- An implementation-dependent exception (impl. dep. #31) that indicates that one or more unrecoverable or uncorrectable but recoverable errors have been detected in the processor. This may include errors detected in the architectural registers (general-purpose registers, floating-point registers, ASRs, or ASI registers) and other core processor hardware. A single async_data_error exception may indicate multiple errors and may occur asynchronously to instruction execution. An async_data_error exception may cause a precise, deferred, or disrupting trap. When async_data_error causes a disrupting trap, the TPC and TNPC stacked by the trap do not necessarily indicate the instruction or data access that caused the error.

[62] The following text should be added to the second paragraph of section A.27 (p.176), to clarify the behavior of a little-endian doubleword load (LDD):

With respect to little endian memory, an LDD instruction behaves as if it is composed of two 32-bit loads, each of which is byte swapped independently before being written into each destination register.
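
A minimal sketch of this behavior, for illustration only. It assumes the word at the effective address goes to the even-numbered register and the word at address+4 to the odd-numbered register; the function name is hypothetical.

```python
from typing import Tuple

def ldd_little_endian(mem: bytes, addr: int) -> Tuple[int, int]:
    """Model a little-endian LDD as two independently byte-swapped 32-bit loads.

    Returns (even_register, odd_register).
    """
    even = int.from_bytes(mem[addr:addr + 4], "little")
    odd = int.from_bytes(mem[addr + 4:addr + 8], "little")
    return even, odd

mem = bytes([0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])
even, odd = ldd_little_endian(mem, 0)
```

Here `even` is 0x03020100 and `odd` is 0x07060504. Byte-swapping the whole doubleword as one 64-bit quantity would instead give 0x0706050403020100, whose upper word is 0x07060504, so the two words would end up exchanged relative to the result above.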

[63] The following text should be added to the second paragraph of section A.28 (p.178), to clarify the behavior of a little-endian doubleword load from alternate space (LDDA):

With respect to little endian memory, an LDDA instruction behaves as if it is composed of two 32-bit loads, each of which is byte swapped independently before being written into each destination register.

[64] In the Index, p.354, the "signal monitor instruction" index entry should instead read "software initiated reset (SIR) instruction".

[65] There is an error in the definition of CLEANWIN (p.59) and the SAVE instruction: in some cases, the locals of the "invalid" window are not cleaned (zeroed) when that window is allocated by a SAVE instruction.

A software workaround (used in the Solaris operating system and perhaps others), to keep user registers clean of kernel data, involves the use of an extra %wstate value. When the kernel returns to user code, it sets %wstate to the new value. The new trap table entry for spills with that %wstate value spills the window as usual but also backs up a window and performs the missing "clean" operation. The spill handler then sets %wstate back to the default value for a user process.

[66] In Chapter 7, "Traps", it is implied (but not explicitly stated) that the value of PSTATE.TLE is preserved during traps that cause entry into RED_state and during XIR, WDR, and SIR resets. However, PSTATE.TLE may be left in an undefined state by one of those events. The correction, which applies to sections 7.6.2.1 (p.106), 7.6.2.3 (p.108), 7.6.2.4 (p.109), and 7.6.2.5 (p.110), is to change the little-endian mode settings from:

PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
to:
PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
PSTATE.TLE <-- undefined

[67] In Chapter 5, section 5.1.7.9 (p.48), the last sentence of the
third paragraph is inaccurate. The entire third paragraph should
be replaced with:

Floating-point operations which cause an overflow or underflow condition may also cause an "inexact" condition. For overflow and underflow conditions, FSR.cexc bits are set and trapping occurs as follows:

o If an IEEE 754 overflow condition occurs:

-- if OFM=0 and NXM=0, the cexc.ofc and cexc.nxc bits are both set to 1, the other three bits of cexc are set to 0, and an IEEE_754_exception trap does *not* occur.

-- if OFM=0 and NXM=1, the cexc.nxc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

-- if OFM=1, the cexc.ofc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

o If an IEEE 754 underflow condition occurs:

-- if UFM=0 and NXM=0, the cexc.ufc and cexc.nxc bits are both set to 1, the other three bits of cexc are set to 0, and an IEEE_754_exception trap does *not* occur.

-- if UFM=0 and NXM=1, the cexc.nxc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

-- if UFM=1, the cexc.ufc bit is set to 1, the other four bits of cexc are set to 0, and an IEEE_754_exception trap *does* occur.

The above behavior is summarized in the following table
(x = don't-care):

Columns: of, uf, nx = exception(s) detected in the f.p. operation; OFM, UFM, NXM = trap enable mask bits (in FSR.TEM); Trap = whether an IEEE_754_exception trap occurs; ofc, ufc, nxc = current exception bits (in FSR.cexc).

of	uf	nx	OFM	UFM	NXM	Trap	ofc	ufc	nxc	Notes
--	--	--	---	---	---	----	---	---	---	-----
-	-	-	x	x	x	no	0	0	0
-	-	*	x	x	0	no	0	0	1
-	*	*	x	0	0	no	0	1	1	(1)
*	-	*	0	x	0	no	1	0	1	(2)

-	-	*	x	x	1	yes	0	0	1
-	*	*	x	0	1	yes	0	0	1
-	*	-	x	1	x	yes	0	1	0
-	*	*	x	1	x	yes	0	1	0
*	-	*	1	x	x	yes	1	0	0	(2)
*	-	*	0	x	1	yes	0	0	1	(2)

(1) When the underflow trap is disabled (UFM=0), underflow is always accompanied by inexact.

(2) Overflow is always accompanied by inexact.
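
The rules and table above can be restated as a small decision function. This is an illustrative sketch, not part of the manual: the function name is hypothetical, only the overflow, underflow, and inexact conditions are modeled, and combinations the table does not list are rejected.

```python
def overflow_underflow_cexc(of, uf, nx, OFM, UFM, NXM):
    """Return (trap_occurs, ofc, ufc, nxc) for the cases listed in the table.

    of, uf, nx: conditions detected by the operation (truthy = detected).
    OFM, UFM, NXM: trap enable mask bits from FSR.TEM (0 or 1).
    """
    if of and uf:
        raise ValueError("overflow together with underflow is not in the table")
    if of:
        # Note (2): overflow is always accompanied by inexact.
        if OFM:
            return (True, 1, 0, 0)
        if NXM:
            return (True, 0, 0, 1)
        return (False, 1, 0, 1)
    if uf:
        if UFM:
            return (True, 0, 1, 0)
        if not nx:
            # Note (1): with UFM=0, underflow is always accompanied by inexact.
            raise ValueError("underflow without inexact requires UFM=1")
        if NXM:
            return (True, 0, 0, 1)
        return (False, 0, 1, 1)
    if nx:
        return (bool(NXM), 0, 0, 1)
    return (False, 0, 0, 0)
```

For example, an overflow with OFM=0 and NXM=1 yields a trap with only nxc set, matching the last row of the table.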

[68] In Appendix B, section B.3 (p.245), the first paragraph:

"Underflow occurs if the exact unrounded result has magnitude
between zero and the smallest normalized number in the
destination format."

should be replaced by the following two paragraphs:

"On an implementation that detects tininess before rounding, trapped underflow occurs when the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format.

On an implementation that detects tininess after rounding, trapped underflow occurs when the result, if it was rounded to a hypothetical format having the same precision as the destination but of unbounded range, would have magnitude between zero and the smallest normalized number in the actual destination format."

[69] In Appendix B, section B.4 (p.245), the first two paragraphs:

The first paragraph:

"Underflow occurs if the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format, *and* the correctly rounded result in the destination format is inexact."

should be replaced by the following paragraph:

"On an implementation that detects tininess before rounding, untrapped underflow occurs when the exact unrounded result has magnitude between zero and the smallest normalized number in the destination format, *and* the correctly-rounded result in the destination format is inexact."

And the beginning of the second paragraph:

"Table 28 summarizes what happens when an exact ..."
should be modified to read:
"Table 28 summarizes what happens on an implementation that detects tininess before rounding, when an exact ..."

[70] In Appendix B, Table 28, "Untrapped Floating-Point Underflow" (p.245): Table 28 (and its footnote) should be replaced by the following revised table and text:

Table 28: Untrapped Floating-Point Underflow (Tininess Detected Before Rounding)

	Underflow trap mask:	UFM=1	UFM=0	UFM=0
	Inexact trap mask:	NXM=x	NXM=1	NXM=0
u = r	r is minimum normal	none	none	none
	r is subnormal	UF	none	none
	r is zero	none	none	none
u != r	r is minimum normal	UF	NX	uf nx
	r is subnormal	UF	NX	uf nx
	r is zero	UF	NX	uf nx

UF = IEEE_754_exception trap with cexc.ufc=1

NX = IEEE_754_exception trap with cexc.nxc=1

uf = cexc.ufc=1, aexc.ufa=1, no IEEE_754_exception trap

nx = cexc.nxc=1, aexc.nxa=1, no IEEE_754_exception trap

In an implementation that detects tininess after rounding, Table 28 applies to a narrower range of values of the exact unrounded result u. The precise bounds depend on the rounding direction specified in FSR.RD, as follows:

o Let m denote the smallest normalized number and e the absolute difference between 1 and the next larger representable number in the destination format. Then the bounds on u for which Table 28 applies are:

FSR.RD	Rounding Toward	Range of Values of u
------	---------------	--------------------
0	nearest	|u| < m(1 - e/4)
1	0	|u| < m
2	+infinity	-m < u <= m(1 - e/2)
3	-infinity	-m(1 - e/2) <= u < m

o When u lies outside these ranges, underflow does not occur,
although an inexact exception still occurs when u != r, the rounded value.
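
The range test can be sketched with exact rational arithmetic. This is an illustration under stated assumptions: the function name is hypothetical, and the bound for rounding toward +infinity is taken as m(1 - e/2), by symmetry with the -infinity row. The single-precision values of m and e below are the usual IEEE 754 ones.

```python
from fractions import Fraction

def table28_applies(u: Fraction, m: Fraction, e: Fraction, rd: int) -> bool:
    """True if u lies in the range for which Table 28 applies (tininess after rounding).

    m: smallest normalized number; e: gap between 1 and the next larger number;
    rd: the FSR.RD rounding-direction value (0..3).
    """
    if rd == 0:   # round to nearest
        return abs(u) < m * (1 - e / 4)
    if rd == 1:   # round toward 0
        return abs(u) < m
    if rd == 2:   # round toward +infinity
        return -m < u <= m * (1 - e / 2)
    if rd == 3:   # round toward -infinity
        return -m * (1 - e / 2) <= u < m
    raise ValueError("FSR.RD must be 0..3")

# Single precision: m = 2^-126, e = 2^-23.
M_SINGLE = Fraction(1, 2**126)
E_SINGLE = Fraction(1, 2**23)
# A value just below m that rounds up to m in nearest mode.
u_near_m = M_SINGLE * (1 - E_SINGLE / 8)
```

With these values, `u_near_m` is inside the range for rounding toward 0 and toward -infinity, but outside it for round-to-nearest and rounding toward +infinity.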

[71] In Appendix A, section A.40, "No Operation" (p.204):
For clarity, in the instruction format diagram the term "op" should be replaced by five zeroes.

[72] In Appendix A, section A.53, "Store Integer" (p.227):
The following paragraph should be added near the end of the Description subsection, prior to the Programming Note, to clarify the behavior of a little-endian doubleword store (STD):

"With respect to little-endian memory, a STD instruction behaves as if it is composed of two 32-bit stores, each of which is byte-swapped independently before being written into its respective destination memory word."

[73] In Appendix A, section A.54, "Store Integer Into Alternate Space" (p.229):
The following paragraph should be added near the end of the Description subsection, prior to the Programming Note, to clarify the behavior of a little-endian doubleword store to alternate space (STDA):

"With respect to little-endian memory, a STDA instruction behaves as if it is composed of two 32-bit stores, each of which is byte-swapped independently before being written into its respective destination memory word."

[74] In Chapter 7, reference is made in two places to a range of trap priorities, with 0 as the highest priority and 31 as the lowest.
Architecturally, there are no absolute trap priorities (only relative trap priorities) and there is no specific limit to trap priority numbers. Trap priorities are only used by a processor to choose which exception will cause a trap at any given time; a trap priority is an ordinal number which need not be stored anywhere. Therefore, the following changes should be noted:

Caption above Table 15, p.101:
     Change:
                   "0 = Highest; 31 = Lowest"
      to:
                    "0 = Highest"

Text of first paragraph of section 7.5.3 on p.102:
     Change:
                    "Priority 0 is highest, priority 31 is lowest; that is, if......."
      to:
                    "A trap priority is an ordinal number, with 0 indicating
                     the highest priority and greater priority numbers
                     indicating decreasing priority; that is, if......"
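
As an illustration of priorities being purely ordinal, the sketch below selects which of several pending exceptions causes a trap. The exception names and priority numbers are hypothetical and are not taken from Table 15.

```python
from typing import Dict

def select_trap(pending: Dict[str, int]) -> str:
    """Return the pending exception with the numerically smallest priority.

    Only the relative order of the numbers matters; there is no upper limit.
    """
    if not pending:
        raise ValueError("no pending exceptions")
    return min(pending, key=lambda name: pending[name])

# Hypothetical exceptions; a priority number above 31 is perfectly usable.
chosen = select_trap({"exception_a": 40, "exception_b": 2, "exception_c": 11})
```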