Sun Fire[tm] 4800/4810 Server: Repair Procedures
This section provides the most common repair procedures for the
Sun Fire 4800/4810, which are included in the Sun
Fire 6800/4810/4800/3800 Systems Service Manual (805-7363).
Clustered Hardware Note: When repairing a clustered system,
replace server components in this order: switch the data services over
to the functioning server, halt the host to be serviced, power down the
host, and then perform the hardware procedure to replace the component. After
the procedure, switch the logical hosts back to their default masters.
Beginning with firmware revision 5.15.0, the Auto
Diagnosis and Recovery feature is provided to help diagnose "System Error
Pause" conditions when they occur. Fault codes and resolution information are
provided here.
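
Whether the Auto Diagnosis and Recovery feature applies depends on comparing the installed firmware revision against 5.15.0. The following is a minimal Python sketch of that comparison, assuming the revision is already available as a dotted string such as "5.13.2". The function and constant names are hypothetical, and obtaining the revision from the system is outside the scope of the sketch.

```python
# Hypothetical helper names; only the 5.15.0 threshold comes from the text above.

def parse_firmware_revision(text):
    """Turn a dotted revision string such as '5.15.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# First firmware revision stated to include Auto Diagnosis and Recovery.
AUTO_DIAGNOSIS_MIN_REVISION = (5, 15, 0)


def has_auto_diagnosis(revision_text):
    """Return True if the given revision is 5.15.0 or later."""
    return parse_firmware_revision(revision_text) >= AUTO_DIAGNOSIS_MIN_REVISION


if __name__ == "__main__":
    for rev in ("5.13.2", "5.15.0", "5.20.1"):
        print(rev, has_auto_diagnosis(rev))
```

Comparing integer tuples rather than raw strings avoids the ordering mistake where "5.9.0" would sort after "5.15.0" as text.
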
The following table provides the page numbers for each procedure.
| Component | Remove (page) | Install (page) | Tips |
|---|---|---|---|
| AC Input Box | 8-14 | 8-15 | |
| Air Intake Screen 4800 | 13-6 | 13-6 | - Periodic maintenance requires that the air intake screen be inspected and/or cleaned once every 3 months. Keep spare air intake screens onsite so that replacements are available when needed for cleaning. |
| Air Intake Screen 4810 | 13-4 | 13-4 | |
| Centerplane (Power) 4810 | 10-13 | 10-17 | |
| Centerplane (System) 4800 | 10-19 | 10-28 | - Eight centerplanes in Sun Fire 4800 systems built within a specific date range could experience a thermal event (FCO A0227-1). |
| Centerplane (System) 4810 | 10-13 | 10-17 | |
| Compact PCI (cPCI) I/O Assembly | 5-5 | 5-6 | - Sun Fire servers (3800/4800/4810/6800) may panic during Dynamic Reconfiguration (DR) operations on PCI and cPCI I/O boards (FIN I0840-1). |
| | | | - Sun Fire x800 systems are subject to a "panic" problem at boot time when a cPCI Dual FC Network Adapter (FRU or X-option) is installed for the first time (FIN I0708-1). |
| Compact PCI (cPCI) Card | 5-7 | 5-8 | - Sun Fire x800 systems are subject to a "panic" problem at boot time when a cPCI Dual FC Network Adapter (FRU or X-option) is installed for the first time (FIN I0708-1). |
| CPU/Memory Board | 3-7 | 3-10 | - The Vcore setting for UltraSPARC IV 1350MHz boards should be 1.25V, but some earlier firmware releases power on and operate the board at an incorrect voltage (FIN I1166-1). |
| | | | - CPU/Memory Board FRUs for Sun Fire 12K/15K and Sun Fire 3800-6800 systems are not enabled for Capacity On Demand (COD) use (FIN I0912-1). |
| | | | - Guidelines for understanding and diagnosing UltraSPARC III Level 2 (L2) SRAM cache memory errors (FIN I0887-1). |
| | | | - UltraSPARC III and III+ based platforms could be susceptible to UCC errors that may cause system panics (FIN I0856-1). |
| | | | - Issue with diagnosing "send mondo" panics (FIN I0765-1). |
| | | | - 900MHz CPUs operating at 750MHz (FIN I0759-1). |
| | | | - Under certain conditions, the flashupdate -u command can create incompatible firmware versions between boards, rendering the system unusable until the problem is corrected (FIN I0731-1). |
| | | | - Loose EMI spring fingers on the chassis may damage CPU/Memory Boards (FIN I0720-1). |
| | | | - A limited set of Sun Fire System Boards may be vulnerable to Uncorrectable Errors in L2 SRAM (FCO A0248-1). |
| CPU/Memory Board EMI Spring Finger Clip | 3-18 | 3-20 | - Loose EMI spring fingers on the chassis may damage CPU/Memory Boards (FIN I0720-1). |
| Disk Drives | - | - | - Improved firmware for Seagate 10K.6 disk drives will reduce the incidence of unexpected outages due to a spindle motor issue (FIN I1136-1). |
| Fan Tray | 9-5 | 9-6 | |
| ID Board (4800) | 10-47 | 10-47 | |
| ID Board (4810) | 10-45 | 10-47 | |
| Memory | 3-13 | 3-14 | - A Best Practices Guide for diagnosing UltraSPARC III memory errors is now available (FIN I1018-1). |
| | | | - Diagnosing main memory errors versus L2 SRAM errors on UltraSPARC III and UltraSPARC III Cu systems (FIN I0954-1). |
| | | | - Too many memory DIMMs are being unnecessarily replaced on the UltraSPARC II, III, III+, IIIi, and IV families of systems, increasing customers' service actions and related system downtime (FIN I0760-2). |
| | | | - Systems containing 256MB Samsung B-die DIMMs with a module date code between 0115 and 0127 (built between weeks 15 and 27 of 2001) may experience Uncorrectable Memory Errors (UE). This can lead to system panics (FCO A0223-1). |
| PCI I/O Assembly | 4-7 | 4-9 | - The QFE network interface reports excessive input packet errors when running back-to-back stress tests (FIN I1138-1). |
| PCI Card | 4-10 | 4-11 | |
| Power Supply (4800) | 8-9 | 8-10 | |
| Power Supply (4810) | 8-7 | 8-8 | |
| Repeater Board | 7-7 | 7-9 | |
| System Controller Board | 6-9 | 6-12 | - Sun Fire System Controllers with 5.11.x firmware could experience a loss of network settings during a firmware upgrade (FIN I0891-1). |
| | | | - Sun Fire 3800/4800/4810/6800 domains can hang when System Controller firmware is downgraded from 5.13.0, 5.13.1, or 5.13.2 to 5.12.7 or lower (FIN I0890-1). |
| | | | - Systems with redundant SCs may experience failed domains after clock failure (FIN I0762-1). |
| | | | - Unexpected behavior from SCs due to downrev firmware (FIN I0756-1). |
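
Two of the tips in the table above state conditions precise enough to check mechanically: the DIMM date code window from FCO A0223-1 and the System Controller downgrade path from FIN I0890-1. The following Python sketch illustrates both checks. It assumes date codes are four-digit YYWW strings, as the "weeks 15 and 27 of 2001" wording suggests, and treats both window boundaries as inclusive, which the source does not state explicitly. All function names are hypothetical. A date code match alone does not identify an affected part, since the FCO applies only to 256MB Samsung B-die DIMMs.

```python
# Hypothetical helpers; the numeric ranges come from the table above.

def parse_date_code(code):
    """Split a YYWW date code such as '0120' into (year, week) integers."""
    code = code.strip()
    if len(code) != 4 or not code.isdigit():
        raise ValueError("expected a four-digit YYWW date code, got %r" % code)
    return int(code[:2]), int(code[2:])


# Window from FCO A0223-1: date codes 0115 through 0127.
# Inclusive boundaries are an assumption.
AFFECTED_FIRST = (1, 15)
AFFECTED_LAST = (1, 27)


def in_affected_window(code):
    """Return True if the date code falls inside the FCO A0223-1 window."""
    return AFFECTED_FIRST <= parse_date_code(code) <= AFFECTED_LAST


def parse_revision(text):
    """Turn a dotted firmware revision such as '5.13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Downgrade path from FIN I0890-1: from 5.13.0, 5.13.1, or 5.13.2
# down to 5.12.7 or lower.
RISKY_SOURCE_REVISIONS = {(5, 13, 0), (5, 13, 1), (5, 13, 2)}
RISKY_TARGET_MAX = (5, 12, 7)


def downgrade_may_hang(current, target):
    """Return True if the downgrade matches the path described in FIN I0890-1."""
    return (parse_revision(current) in RISKY_SOURCE_REVISIONS
            and parse_revision(target) <= RISKY_TARGET_MAX)


if __name__ == "__main__":
    print(in_affected_window("0120"))
    print(downgrade_may_hang("5.13.1", "5.12.7"))
```
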