Octane: PCI Bridge Error
#21
RE: Octane: PCI Bridge Error
(03-14-2019, 01:30 AM)gijoe77 Wrote:  oh hey now what's this?  I upgraded the Tezro with the quad 1ghz nodeboard so I might have to look into this.  I have 6.5.30 installed - was this something that came with the OS like how the o2 PROM gets flashed with newer IRIX releases?

It seems Irinikus had VRM errors from his L1 when upgrading his Tezro (guess the 1GHz has different voltages too). Here's the thread http://forums.irix.cc/thread-75.html
So I don't think this is the problem. Does "l1cmd env" see the three extra fans?

The main PROM is automatically updated with IRIX and should be at 6.210 at IRIX 6.5.30 (6.211 if you installed relevant post 6.5.30 patches) but the L1 isn't.

Upgrading the L1 can be tricky. Toby Jennings wrote up a thread about it on Nekochan, which turned into a wiki article that seems to have been mirrored here: https://wiki.preterhuman.net/L1_Controller_Updates  Basically, there's a dependency between the flashsc utility and the L1 firmware versions it can flash safely. You also cannot upgrade from a very old L1 to the latest version in one step. Do it wrong and it may flash the wrong firmware into your L1. This is mostly relevant to earlier generations like the Fuel and O300, but beware!
jan-jaap
SGI Collector

Posts: 305
Threads: 12
Joined: Jun 2018
03-14-2019, 08:33 AM
#22
RE: Octane: PCI Bridge Error
here is what I got:

hinv:

Code:
-bash-4.3# hinv    
4 1.0 GHZ IP35 Processors
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
Main memory size: 8192 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 16 Mbytes
Integral SCSI controller 8: Version SAS/SATA LS1068
 Disk drive: unit 1 on SCSI controller 8 (XVM Local Disk) (primary path)
 Disk drive: unit 2 on SCSI controller 8 (XVM Local Disk) (primary path)
Integral SCSI controller 9: Version SAS/SATA LS1068
 Disk drive: unit 1 on SCSI controller 9 (XVM Local Disk) (primary path)
 Disk drive: unit 2 on SCSI controller 9 (XVM Local Disk) (primary path)
Integral SCSI controller 2: Version IDE (ATA/ATAPI) IOC4
Integral SCSI controller 0: Version QL12160, low voltage differential
 Disk drive: unit 1 on SCSI controller 0
 Disk drive: unit 2 on SCSI controller 0
Integral SCSI controller 1: Version QL12160, low voltage differential
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3/IOC4 serial port: tty7
IOC3/IOC4 serial port: tty8
Graphics board: V12
Integral Gigabit Ethernet: tg0, module 001c01, PCI bus 1 slot 4
Iris Audio Processor: version MAD revision 1, number 1
Iris Audio Processor: version RAD revision 13.0, number 2
XT-DIGVID Multi-standard Digital Video: controller 0, unit 0, version 0x0
IOC3/IOC4 external interrupts: 1
Dual Channel Display
USB controller: type OHCI
USB controller: type OHCI
-bash-4.3#


hinv -mv

Code:
-bash-4.3# hinv -mv
Location: /hw/module/001c01/node
      IP59_4CPU Board: barcode RBH308     part 030-1989-003 rev -C
Location: /hw/module/001c01/IXbrick/xtalk/11
      WS_INT_53 Board: barcode MTM289     part 030-1881-006 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/12
     ODY128B1_2 Board: barcode NRY124     part 030-1884-005 rev -B
Location: /hw/module/001c01/IXbrick/xtalk/13
      XT-DIGVID Board: barcode NBM190     part 030-1927-003 rev  B
Location: /hw/module/001c01/IXbrick/xtalk/15
      WS_INT_53 Board: barcode MTM289     part 030-1881-006 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/15/pci-x/0/1/ioc4
            IO9 Board: barcode MTS421     part 030-1771-005 rev -A
Location: /hw/module/001c01/IXbrick/xtalk/15/pci-x/1/2
    PCI_SIO_UFC Board: barcode MRJ325     part 030-1657-003 rev  A
4 1.0 GHZ IP35 Processors
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
CPU 0 at Module 001c01/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
 Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 1 at Module 001c01/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
 Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 2 at Module 001c01/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
 Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
CPU 3 at Module 001c01/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
 Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz  Tap 0x15
Main memory size: 8192 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 16 Mbytes
Memory at Module 001c01/Slot 0: 8192 MB (enabled)
 Bank 0 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 1 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 2 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 3 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 4 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 5 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 6 contains 1024 MB (Premium) DIMMS (enabled)
 Bank 7 contains 1024 MB (Premium) DIMMS (enabled)
Integral SCSI controller 8: Version SAS/SATA LS1068
 Disk drive: unit 1 on SCSI controller 8 (unit 1) (XVM Local Disk) (primary path)
 Disk drive: unit 2 on SCSI controller 8 (unit 2) (XVM Local Disk) (primary path)
Integral SCSI controller 9: Version SAS/SATA LS1068
 Disk drive: unit 1 on SCSI controller 9 (unit 1) (XVM Local Disk) (primary path)
 Disk drive: unit 2 on SCSI controller 9 (unit 2) (XVM Local Disk) (primary path)
Integral SCSI controller 2: Version IDE (ATA/ATAPI) IOC4
Integral SCSI controller 0: Version QL12160, low voltage differential
 Disk drive: unit 1 on SCSI controller 0 (unit 1)
 Disk drive: unit 2 on SCSI controller 0 (unit 2)
Integral SCSI controller 1: Version QL12160, low voltage differential
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3/IOC4 serial port: tty7
IOC3/IOC4 serial port: tty8
Graphics board: V12
Integral Gigabit Ethernet: tg0, module 001c01, PCI bus 1 slot 4
Iris Audio Processor: version MAD revision 1, number 1
Iris Audio Processor: version RAD revision 13.0, number 2
 PCI Adapter ID (vendor 0x1000, device 0x0054) PCI slot 1
 PCI Adapter ID (vendor 0x1000, device 0x0054) PCI slot 2
 PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
 PCI Adapter ID (vendor 0x104c, device 0xac28) PCI slot 2
 PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
 PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
 PCI Adapter ID (vendor 0x1412, device 0x1724) PCI slot 2
 PCI Adapter ID (vendor 0x10a9, device 0x0005) PCI slot 1
 PCI Adapter ID (vendor 0x10a9, device 0x0003) PCI slot 2
 PCI Adapter ID (vendor 0x1033, device 0x0035) PCI slot 3
 PCI Adapter ID (vendor 0x1033, device 0x0035) PCI slot 3
 PCI Adapter ID (vendor 0x1033, device 0x00e0) PCI slot 3
XT-DIGVID Multi-standard Digital Video: controller 0, unit 0, version 0x0
IOC4 firmware revision 83
IOC3/IOC4 external interrupts: 1
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
Dual Channel Display
IP35prom in Module 001c01/Slot n0: Revision 6.210
USB controller: type OHCI
USB controller: type OHCI
-bash-4.3#
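
The "PCI Adapter ID" lines in the hinv -mv output only give numeric IDs. A minimal sketch of how they could be decoded, assuming the vendor names from the public PCI ID registry (the table, function names and sample subset below are illustrative, not part of any IRIX tool):

```python
import re
from collections import Counter

# Vendor names as listed in the public PCI ID registry.
# Only the vendor IDs that appear in the hinv output above are included.
PCI_VENDORS = {
    0x1000: "LSI Logic / Symbios Logic",
    0x1033: "NEC",
    0x104C: "Texas Instruments",
    0x1077: "QLogic",
    0x10A9: "Silicon Graphics",
    0x1412: "VIA Technologies (IC Ensemble)",
    0x14E4: "Broadcom",
}

ADAPTER_RE = re.compile(
    r"PCI Adapter ID \(vendor (0x[0-9a-fA-F]+), device (0x[0-9a-fA-F]+)\) PCI slot (\d+)"
)


def parse_pci_adapters(hinv_text):
    """Return (vendor_id, device_id, slot) tuples found in hinv output."""
    return [
        (int(vendor, 16), int(device, 16), int(slot))
        for vendor, device, slot in ADAPTER_RE.findall(hinv_text)
    ]


def vendor_summary(adapters):
    """Count adapters per vendor name; unlisted vendor IDs become 'unknown'."""
    return Counter(PCI_VENDORS.get(vendor, "unknown") for vendor, _, _ in adapters)


# A subset of the lines posted above, copied verbatim.
SAMPLE = """\
 PCI Adapter ID (vendor 0x1000, device 0x0054) PCI slot 1
 PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
 PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
 PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
"""

if __name__ == "__main__":
    summary = vendor_summary(parse_pci_adapters(SAMPLE))
    for name, count in sorted(summary.items()):
        print(f"{count} x {name}")
```

Feeding it the full output instead of SAMPLE should work the same way, since the regular expression ignores every line that is not a "PCI Adapter ID" entry.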

l1cmd env

Code:
-bash-4.3# /usr/sbin/l1cmd --scdev /hw/module/001c01/L1/controller env
Environmental monitoring is enabled and running.

Description    State       Warning Limits     Fault Limits       Current
-------------- ----------  -----------------  -----------------  -------
         1.8V    Enabled  10%   1.62/  1.98  20%   1.44/  2.16    1.720
          12V    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   11.938
       12V #2    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.125
         3.3V    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
         2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.509
       12V IO    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   11.875
       5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    4.888
     3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.268
           5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    4.992
 XIO 12V BIAS    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   11.750
       XIO 5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    4.992
     XIO 2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.457
 XIO 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
IP59 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.302
  IP59 5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    4.888
     IP59 12V    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   11.875
    IP59 VCPU    Enabled  10%   1.14/  1.40  20%   1.02/  1.52    1.269
    IP59 SRAM    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.457
    IP59 1.5V    Enabled  10%   1.35/  1.65  20%   1.20/  1.80    1.495

Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0   NODE 1    Enabled         1800         2860
FAN  1   NODE 2    Enabled         1800         2860
FAN  2   NODE 3    Enabled         1800         2986
FAN  3    PCI 1    Enabled         1350         1454
FAN  4    PCI 2    Enabled         1350         1454
FAN  5       HD    Enabled         1620         3068
FAN  6    ODY 1    Enabled         1300         2481
FAN  7    ODY 2    Enabled         1300         2376
FAN  8  N0 LEFT    Enabled         1800         2702
FAN  9  N0 CNTR    Enabled         1800         2678
FAN 10 N0 RIGHT    Enabled         1800         2803

                             Advisory   Critical   Fault      Current      
Description       State       Temp       Temp       Temp       Temp      
----------------- ----------  ---------  ---------  ---------  ---------  
0 INTERFACE 0       Enabled    [Autofan Control]    76C/168F   40C/104F
1 INTERFACE 1       Enabled    [Autofan Control]    76C/168F   34C/ 93F
2 INTERFACE 2       Enabled    [Autofan Control]    76C/168F   34C/ 93F
3 INTERFACE 3       Enabled    [Autofan Control]    76C/168F   41C/105F
4 ODYSSEY           Enabled    [Autofan Control]    76C/168F   45C/113F
5 NODE              Enabled    [Autofan Control]    76C/168F   50C/122F
6 BEDROCK           Enabled    [Autofan Control]    85C/185F   47C/116F

                    Zone Temp     Target    Current   Zone Fan   Curr/Min
Zone Name  State     Sensors       Average   Average   Index      Fan %
---------  --------  ------------  --------  --------  ---------  ---------
NODE        Enabled           5,6  53C/127F  48C/118F          0   62%/ 46%
BOOST       Enabled           N/A       N/A       N/A     8,9,10   64%/ 50%
HD          Enabled           N/A       N/A       N/A          5   56%/ 38%
PCI         Enabled       0,1,2,3  45C/113F  37C/ 98F        3,4   57%/ 57%
ODY         Enabled             4  50C/122F  45C/113F          6   82%/ 64%

-bash-4.3#
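
For anyone who wants to check the fan table above without reading it by eye, here is a rough sketch. It assumes that the "three extra fans" are the ones labelled N0 LEFT, N0 CNTR and N0 RIGHT, and it uses an arbitrary 10% margin over the warning RPM as the cut-off; both are assumptions of this example, not something l1cmd defines.

```python
import re

# Matches one fan row: index, name, state, warning RPM, current RPM.
FAN_RE = re.compile(r"FAN\s+(\d+)\s+(.+?)\s+(\w+)\s+(\d+)\s+(\d+)$")


def parse_fans(env_text):
    """Parse the fan rows of 'l1cmd env' output into a list of dicts."""
    fans = []
    for line in env_text.splitlines():
        match = FAN_RE.match(line.strip())
        if match:
            index, name, state, warning, current = match.groups()
            fans.append({
                "index": int(index),
                "name": name,
                "state": state,
                "warning_rpm": int(warning),
                "current_rpm": int(current),
            })
    return fans


def low_margin(fans, min_ratio=1.10):
    """Fans running at less than min_ratio times their warning RPM.

    min_ratio is a hypothetical threshold picked for this example.
    """
    return [f for f in fans if f["current_rpm"] < f["warning_rpm"] * min_ratio]


# Fan rows copied from the output above.
SAMPLE = """\
FAN  0   NODE 1    Enabled         1800         2860
FAN  1   NODE 2    Enabled         1800         2860
FAN  2   NODE 3    Enabled         1800         2986
FAN  3    PCI 1    Enabled         1350         1454
FAN  4    PCI 2    Enabled         1350         1454
FAN  5       HD    Enabled         1620         3068
FAN  6    ODY 1    Enabled         1300         2481
FAN  7    ODY 2    Enabled         1300         2376
FAN  8  N0 LEFT    Enabled         1800         2702
FAN  9  N0 CNTR    Enabled         1800         2678
FAN 10 N0 RIGHT    Enabled         1800         2803
"""

if __name__ == "__main__":
    fans = parse_fans(SAMPLE)
    n0_fans = [f["name"] for f in fans if f["name"].startswith("N0")]
    print("N0 fans seen:", n0_fans)
    for fan in low_margin(fans):
        print(f"FAN {fan['index']} ({fan['name']}) is close to its warning RPM")
```

With the numbers posted here, the three N0 fans are all present, and only the two PCI fans sit within 10% of their warning limit.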


l1cmd serial all

Code:
-bash-4.3# /usr/sbin/l1cmd --scdev /hw/module/001c01/L1/controller serial all

Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM         P1003384
Reference System Serial Number  Attached L2   P1003384
Local Brick Serial Number       EEPROM        MTM289
Reference Brick Serial Number   NVRAM         MTM289


EEPROM      Product Name    Serial         Part Number           Rev  T/W    
----------  --------------  -------------  --------------------  ---  ------
INTERFACE   WS_INT_53       MTM289         030_1881_006          A    00    
IO9         IO9             MTS421         030_1771_005          A    00    
ODYSSEY     ODY128B1_2      NRY124         030_1884_005          B    00    
SNOWBALL    SNOWBALL_EDGE   NBM190         030_1927_003          B    00    
NODE        IP59_4CPU       RBH308         030_1989_003          C    00    
IO DGHTR    CHWS_IO_DAUG    NFV805         030_1875_003          A    00    

EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI    
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE0000000000000028F32D00 M3 46L2820BT1-CA0   0B    8.0  N/A    
DIMM 2     CE0000000000000028EB2D00 M3 46L2820BT1-CA0   0B    8.0  N/A    
DIMM 4     CE000000000000000C28E600 M3 46L2820ET3-CA0   3E   10.0  N/A    
DIMM 6     CE000000000000000C3BE800 M3 46L2820ET3-CA0   3E   10.0  N/A    
DIMM 1     CE0000000000000028F42D00 M3 46L2820BT1-CA0   0B    8.0  N/A    
DIMM 3     CE0000000000000028FC2D00 M3 46L2820BT1-CA0   0B    8.0  N/A    
DIMM 5     CE000000000000000CC3E700 M3 46L2820ET3-CA0   3E   10.0  N/A    
DIMM 7     CE000000000000000CCBE700 M3 46L2820ET3-CA0   3E   10.0  N/A    

-bash-4.3#

next time I get messages about temp I'll post them here
(This post was last modified: 03-30-2019, 09:24 AM by gijoe77.)
gijoe77
Tezro

Posts: 604
Threads: 33
Joined: Jun 2018
03-30-2019, 09:16 AM
#23
RE: Octane: PCI Bridge Error
(03-12-2019, 09:21 AM)mrthinlysliced Wrote:  Did you follow the troubleshooting section of the octane owners manual?

http://www.sgistuff.net/hardware/systems...octane.pdf

Page 262 onwards kind of thing.

There are a bunch of diagnostic lights under the front cover too - but I couldn't find where it explains what those mean - might be worth looking into as well.

Aha - found the explanation:


Code:
Meaning of the seven green LEDs that are behind the frontpanel.

Perhaps you have seen the seven small holes near the undocumented connector on the front of your Octane. These are link status lights and show which ports of the Crossbar have items connected to.

BaseIO X
 QA    X  X  PCI Expansion
 QD    X  X      QB
 QC    X  X    Heart

My Octane has *only* the QA (V6) and Heart LEDs lit. The rest are off. At https://wiki.preterhuman.net/SGI_Octane#Troubleshooting 
it states both BaseIO and Heart are always lit.

What is BaseIO?


After I reseated the system module, the LEDs changed: BaseIO and Heart are on, the rest are off now. Don't know if this means my V6 is not usable right now.
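
The comparison between what is lit and what the wiki says should be lit can be written down as a small check. This is only a sketch: the LED names come from the layout quoted above, the "always lit" pair comes from the wiki statement, and the function name and return format are made up for illustration.

```python
# Front-panel link LEDs as named in the quoted layout.
LED_NAMES = {"BaseIO", "QA", "QB", "QC", "QD", "PCI Expansion", "Heart"}

# Per the wiki page cited above, these two are expected to be lit at all times.
ALWAYS_LIT = {"BaseIO", "Heart"}


def check_leds(lit):
    """Compare the lit LEDs against the expected always-on pair.

    Returns the expected LEDs that are dark ('missing') and any other
    crossbar links that report a connection ('extra_links').
    """
    lit = set(lit)
    unknown = lit - LED_NAMES
    if unknown:
        raise ValueError(f"unknown LED name(s): {sorted(unknown)}")
    return {
        "missing": sorted(ALWAYS_LIT - lit),
        "extra_links": sorted(lit - ALWAYS_LIT),
    }


if __name__ == "__main__":
    # Before reseating the system module: only QA and Heart were lit.
    print(check_leds(["QA", "Heart"]))
    # After reseating: BaseIO and Heart are lit, QA is dark.
    print(check_leds(["BaseIO", "Heart"]))
```

In the first state BaseIO is reported as missing; in the second nothing is missing, but no graphics link (QA) shows up either.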


What I have tried so far:

* reseated the system module with and w/o DIMMS multiple times - NOK
* swapped memory and CPU with known-working parts (double-checked in a working Octane) - NOK
* powering on only with memory, no CPU - NOK
* powering on only with CPU, no memory - NOK
* powering on with no CPU and no memory - NOK
* all of the above with keyboard + mouse + vga (all three working and having been used earlier with the same machine) - NOK
* all of the above with only power cable and null modem cable connected to port 1 - NOK


NOK means I still have the original symptoms:

* Machine powers on as soon as the power cable is plugged in, and only then.
* It won't react to the power button, reset button, or NMI button.
* Both fans on the rear spin (upper one and PSU).
* HDD spins up and is read at power-on (when the power cable is plugged in).
* No video output
* No console output
* Lightbar is always off
* Frontside status LEDs lit: BaseIO and Heart



Good. Ahm... I bought another Octane to do some of the above checks and thought it was a good idea to also swap the other way: take the CPU (400MHz) + RAM (1.5GB) from the failing Octane and try them in this new Octane (250MHz + 128MB IIRC).

This happened:

The new system immediately displayed the typical screens, but the lightbar started flashing red. It then loaded IRIX fine and I could log in as usual. I looked up what a flashing red lightbar means: a memory failure. I checked using hinv and in fact only 1GB of the 1.5GB was recognized. Some minutes later the same thing happened as with the other Octane: the system froze completely and didn't respond to anything. No power, reset or NMI button, no keyboard, no mouse, nothing. I had to unplug it, and now every time I plug it in again I have exactly the same symptoms as with the other machine! So, my dear SGI enthusiasts, I now have two unusable machines with exactly the same problems.


Ok, I started looking at the system module of the first Octane and have seen 4 different spots with jumpers, labeled:

1. Chassis GND Jumper (top and side)
2. Enable Passwd (top and side)
3. NIC (top and side)
4. Select Flash PROM (top and side)

My question to the pros here: Can/Should I do something using any of these jumpers? And how do I use them?


Any help is very much appreciated.
(This post was last modified: 03-31-2019, 03:20 PM by TruHobbyist.)
TruHobbyist
Octane

Posts: 74
Threads: 9
Joined: May 2018
03-31-2019, 08:22 AM
#24
RE: Octane: PCI Bridge Error
(03-31-2019, 08:22 AM)TruHobbyist Wrote:  

Any luck? I seem to be having a very similar problem. As soon as I plug it in, it kicks on. The power button does nothing. I have done pretty much the same troubleshooting.
maxx_electro
O2

Posts: 1
Threads: 0
Joined: Aug 2018
05-24-2019, 12:17 AM

