Emergency Maintenance

Emergency Maintenance: Veale-M1-E1; Module 3, WS-X6548-GE-TX; Bus Asic #0 transient Pb error. Recovered. (0x0002, 0x0000): Module needs troubleshooting or TAC

Problem:   Veale-M1-E1; Module 3, WS-X6548-GE-TX; Bus Asic #0 transient Pb error.  Recovered. (0x0002, 0x0000): Module needs troubleshooting or TAC
Cause:     unknown
Affects:   Up to forty-eight direct local-area-network data connections
Started:   11/06/2009 02:00 PM
Resolved:  11/06/2009 03:00 PM

Notes:

1. Install a WS-X6148-45AF module in the vacant Slot 2.
2. Move all patches from Module 3 to Module 2.
3. The forty-eight directly connected end users may experience a brief service interruption, possibly only a few minutes, while the patches are being moved this afternoon.
4. We apologize for any inconvenience this brief network outage may cause.
5. Of the forty-eight direct network connections, forty-three are for general use and five serve the five vending machines.


Created: 11/06/2009 14:39:03 by euw

Updates:


Emergency Maintenance: Enrollment Active Campus application unavailable

Problem:   Enrollment Active Campus application unavailable
Cause:     RAID battery needs to be replaced
Affects:   Active Campus web site users
Started:   11/09/2009 03:00 AM
Resolved:  11/09/2009 06:00 AM

Notes:

It is necessary to shut down the Active Campus application and database servers to replace the onboard RAID battery in each server. Enrollment Management agreed to the scheduling of this outage.


Created: 11/05/2009 09:56:36 by rak7

Updates:


Emergency Maintenance: DB7 (non-ERP Oracle production) downtime

Problem:   DB7 (non-ERP Oracle production) downtime
Cause:     hardware repair
Affects:   all users of non-ERP Oracle databases
Started:   11/06/2009 03:00 AM
Resolved:  11/06/2009 06:00 AM

Notes:

RE-SCHEDULED: The repair has been rescheduled for the morning of Friday, Nov. 6, during the regular 3-6am maintenance window.

This outage is being scheduled to replace the hardware that failed on Friday 10/30.

Affected applications include (but are not limited to):
Ad Astra
BlackBoard (Course mgmt. system)
Dental School Clinic
Identity Management (user account/password changes)
IP address/host name management
E-mail lists
MyCase portal
Pinnacle (phone billing)
University Library web sites (not including EuclidPLUS/on-line catalog system)
DARS
ProSam (Financial Aid)
Encyclopedia of Cleveland History
Internal IT applications: Appworx, ChangeMan, Mashups, VirtualCenter

Done at 5:25. Databases should be back online.


Created: 11/02/2009 14:10:53 by jan3

Updates: 11/02/2009 19:24:14 by jan3, 11/04/2009 14:53:42 by jan3, 11/06/2009 05:24:51 by rfw


Emergency Maintenance: Development ERP Windows server needs to be rebooted.

Problem:   Development ERP Windows server needs to be rebooted.
Cause:     Equipment maintenance
Affects:   Unknown
Started:   10/12/2009 12:00 PM
Resolved:  10/12/2009 12:30 PM

Notes:

The development Peoplesoft server Akita was having issues with the Symantec

Problem Report

Problem Report: backup system down

Problem:   backup system down
Cause:     primary server crashed (root cause unknown)
Affects:   users of the Legato Networker system
Started:   11/06/2009 11:45 PM
Resolved:  11/07/2009 02:05 PM

Notes:

system rebooted.


Created: 11/07/2009 14:17:58 by jan3

Updates:


Problem Report: Database error on LDAP replica (ldap-replica7)

Problem:   Database error on LDAP replica (ldap-replica7)
Cause:     Unknown
Affects:   Direct users of LDAP and downstream applications such as Single Sign On
Started:   11/07/2009 12:10 AM
Resolved:  11/07/2009 10:15 AM

Notes:

The LDAP replica was complaining about problems with its internal user record database and was not serving information to any application querying it. One of the affected applications was the Single Sign On (SSO) service which would hang waiting for information that would never come. Stopping and restarting the replica caused it to rebuild its internal database which appears to have corrected the problem, in addition to clearing up the downstream issues (SSO).

We will continue to monitor the system throughout the weekend to make sure the issue does not recur.


Created: 11/07/2009 10:51:58 by dak

Updates:


Problem Report: Google Mail Issue

Problem:   Google Mail Issue
Cause:     Google Minor Service Outage
Affects:   May not have affected anyone on Case campus
Started:   11/01/2009 01:15 PM
Resolved:  11/01/2009 08:30 PM

Notes:

Google experienced a service disruption that affected less than 0.001% of Gmail users. While the disruption was occurring, affected users were unable to access their mail.

While we have received no reports of outages for the Case campus, we are posting this report to notify our clients that a VERY minor disruption was experienced by Google.


Created: 11/02/2009 07:01:21 by dak

Updates:


Problem Report: Campus Unable to Login

Problem:   Campus Unable to Login
Cause:     unknown
Affects:   Users unable to access any systems
Started:   11/01/2009 02:11 AM
Resolved:  11/01/2009 02:40 AM

Notes:

[11/01/2009 9:30 AM] - By the time we had a chance to look into the issue, it had resolved itself. We are guessing that it had something to do with the time change and computers' clocks not being quite in sync. Verified with the Help Desk that the issue was resolved (they were able to reach the Case-provided tools again).

The Help Desk called to notify me that users across campus were receiving "Your session has expired." errors on all systems. Notified Dave Kovacic, who is currently looking into the issue.


Created: 11/01/2009 02:13:11 by jxo63

Updates: 11/01/2009 09:36:21 by dak


Problem Report: Database Server (DB7) is down

Problem:   Database Server (DB7) is down
Cause:     It looks like another bad memory module
Affects:   Several databases and the services that use them, see below
Started:   10/30/2009 11:55 AM
Resolved:  10/30/2009 04:30 PM

Notes:

The databases are back up and most services are back as well.

The system has been rebooted and is on its way back up; however, it may take some time to bring all the databases back up.

Affected applications include:

Advp, Apexp, appworxp, astrap, aurorap, blackboard, Degree Audit Reporting System, dashboard, dental school, IP Self Registration for computers, isisp, KSL Oracle Database, Mail Group, New My Case Portal, Oracle Internet Directory, onbasep (Financial Aid Application), oncorep, onregp, personp, Pinnacle Telephone Application, prosamp, prtlp, sccp, Serena Change Man, t2Parking, Unified Messaging, and WebEvent.

Because the Oracle Internet Directory is on DB7, any database connections using LDAP may also be down.
This includes:
   Password change/reset
   Account activation
   System registration tools
   Mail list manager (Sympa)


Created: 10/30/2009 12:23:53 by dak

Updates: 10/30/2009 16:24:48 by dak


Problem Report: Intermittent access to some services

Problem:   Intermittent access to some services
Cause:     Unknown - possible data center or network problem?
Affects:   Users of services including ERP - all services in Crawford data center seem intermittently affected
Started:   10/23/2009 04:30 PM
Resolved:  10/23/2009 06:10 PM

Notes:

[10/23/09 7:22PM] - The problems we were seeing seemed to clear up around 6:10 PM. Network engineers are continuing to dig into the issue to see if they can pinpoint the cause and prevent recurrences.

The issues started about 4:30PM as far as we can tell - the outages are so short that we may have missed them earlier in the day. Things seem to become unresponsive for about 30-60 seconds and then come back. Only certain parts of each service seem affected, for instance you cannot connect to calendar.case.edu, but the server is still reachable to log into....

A network engineer is looking into the problem. We will post updates as they become available.


Created: 10/23/2009 18:49:08 by dak

Updates: 10/23/2009 19:21:54 by dak


Problem Report: MyCase Down

Problem:   MyCase Down
Cause:     DB7 Maintenance
Affects:   All users of the MyCase portal
Started:   10/22/2009 03:00 AM
Resolved:  10/22/2009 07:15 AM

Notes:

Following the db7 maintenance the MyCase portal needed a gentle nudge to come back on line.


Created: 10/22/2009 07:22:00 by jms20

Updates:


Problem Report: Novell server PULITZER down

Problem:   Novell server PULITZER down
Cause:     unknown
Affects:   all users of PULITZER
Started:   10/20/2009 05:50 PM
Resolved:  10/20/2009 06:30 PM

Notes:

fixed by 6:30pm

requires a hard reboot, expect to be back on-line by about 6:45pm.


Created: 10/20/2009 18:07:10 by jan3

Updates: 10/20/2009 18:56:36 by jan3


Problem Report: Tomlinson network switch power failure

Problem:   Tomlinson network switch power failure
Cause:     lack of power
Affects:   No one, I believe
Started:   10/20/2009 04:42 PM
Resolved:  10/21/2009 07:42 AM

Notes:

I am guessing that the construction power work failed the UPS over to battery, and the UPS did not come back alive when the power was restored. First the batteries drained; then the switch died.
Reviving the UPS kick-started the switch.
Batteries are recharging.


Created: 10/20/2009 17:45:23 by jhm

Updates: 10/20/2009 19:27:17 by euw, 10/21/2009 07:32:34 by euw


Problem Report: help.case.edu down

Problem:   help.case.edu down
Cause:     Amazon Cloud Service Virtual Machine died
Affects:   All users of help.case.edu
Started:   10/20/2009 02:00 PM
Resolved:  10/20/2009 02:24 PM

Notes:

Service has been restored

Actual start time may have been earlier...this is when I got the report.


Created: 10/20/2009 14:25:53 by jms20

Updates: 10/20/2009 14:42:47 by man27


Problem Report: Case gadgets in webstart are not loading

Problem:   Case gadgets in webstart are not loading
Cause:     not sure yet
Affects:   Users of webstart.case.edu
Started:   10/20/2009 08:27 AM
Resolved:  10/20/2009 10:03 AM

Notes:

The issue has been resolved. Looks like it was an issue on Google's end and they fixed it.
Sorry for the inconvenience.

Update:
A case has been opened at Google.

The gadgets we created specifically for the Case community are not loading correctly on Webstart.
We are looking into the issue and will be notifying Google to see if it is a problem on their end.


Created: 10/20/2009 08:30:24 by gsr9

Updates: 10/20/2009 09:26:09 by gsr9, 10/20/2009 10:06:27 by gsr9


Problem Report: CDC CRAC 2 ALARM

Problem:   CDC CRAC 2 ALARM
Cause:     unknown
Affects:   CDC
Started:   10/19/2009 10:00 AM
Resolved:  

Notes:

Facility Maintenance was contacted this morning.


Created: 10/19/2009 13:28:02 by euw

Updates:


Problem Report: DB7 rebooted and file systems did not come back up

Problem:   DB7 rebooted and file systems did not come back up
Cause:     Unknown
Affects:   All production Databases on DB7 - see list below
Started:   10/19/2009 06:54 AM
Resolved:  10/19/2009 09:52 AM

Notes:

The server DB7 rebooted at 6:54 am this morning and none of the file systems came back up after the reboot. This affects the following applications: Apex, Advp, blackboard, aurorap, dashboard, dental school, KSLP, onbasep, oncorep, pinnap, prosamp, prtlp, sccp, serenap, t2Parking, umsgp, and webevntp.

Because the Oracle Internet Directory is on DB7, any database connections using LDAP will also be down.

UPDATE: All file systems back on line and the databases were up and running at 9:52. The database mailgrpp had some issues and was not back until 10:28.

Server Engineering has been contacted and is looking into the problem. There is no ETA at this time.


Created: 10/19/2009 08:22:03 by rxg263

Updates: 10/19/2009 10:25:16 by rxg263


Problem Report: Single Sign On (SSO) partially unavailable

Problem:   Single Sign On (SSO) partially unavailable
Cause:     One of the SSO servers was in a confused state
Affects:   Everyone using SSO
Started:   10/19/2009 07:00 AM
Resolved:  10/19/2009 07:15 AM

Notes:

We are unsure why the server started having problems, but it suddenly began claiming all logins were invalid. Restarting the server fixed the problem.


Created: 10/19/2009 07:53:34 by dak

Updates:


Problem Report: Master Authentication Database (Kerberos KDC) was in a confused state

Problem:   Master Authentication Database (Kerberos KDC) was in a confused state
Cause:     Unknown at this time
Affects:   People trying to activate accounts, change passwords, some internal tools
Started:   10/14/2009 09:50 AM
Resolved:  10/14/2009 12:17 PM

Notes:

The authentication program that handles writing information to our user ID/password database was in a state where the program could not access the database. Restarting the application fixed the problem. We are continuing to investigate why the issue occurred in the first place.


Created: 10/14/2009 12:59:06 by dak

Updates:


Problem Report: ERP FS & HCM down

Problem:   ERP FS & HCM down
Cause:     Oracle cluster off-line
Affects:   all users of PeopleSoft ERP Financials or HCM modules
Started:   10/14/2009 10:25 AM
Resolved:  10/14/2009 11:45 AM

Notes:

While the secondary cluster node was being repaired, the primary node went off-line as well. The primary has been restarted, and the repair of the secondary is now complete.


Created: 10/14/2009 12:10:46 by jan3

Updates:


Problem Report: ERP (HCM, Financials) test/development database server down

Problem:   ERP (HCM, Financials) test/development database server down
Cause:     unknown
Affects:   all test/development users of HCM and Financials applications
Started:   10/13/2009 02:55 PM
Resolved:  10/14/2009 01:15 PM

Notes:

Final update: system has been repaired & returned to service. Databases & apps are expected to be restarted within an hour.


UPDATE: hardware fault isolated. System is running on 1/2 CPU & RAM, pending vendor replacement of failed part(s).


Databases are NOT being restarted yet, pending a determination of time-to-repair. Plans are in place to re-start critical development databases on production system if needed before repairs can be arranged on Weds. morning.



Preliminary investigation suggests a hardware failure. Working to diagnose failed part. Best case ETA is technician will arrive with a replacement part Wednesday at 9am.



Engineer is investigating. Server has spontaneously rebooted (and appears to be stuck in a loop).


Created: 10/13/2009 15:15:04 by jan3

Updates: 10/13/2009 16:55:16 by bsc4, 10/13/2009 21:41:24 by jan3, 10/14/2009 12:14:32 by jan3


Problem Report: Degraded Internet connectivity

Problem:   Degraded Internet connectivity 
Cause:     Internet routing problem in the Global Crossing network
Affects:   All Internet services, including web browsing and email
Started:   10/01/2009 02:24 PM
Resolved:  

Notes:

UPDATE: 10/1 19:16 ISP able to resolve routing issue at their end. Internet Connectivity restored. Engineers will continue to monitor the Case network.

UPDATE: 10/1 18:41 The ISP is continuing to work on the problem. End users may have problems reaching the Internet. Off-campus users may also have problems reaching the Case network. No ETA at this point.

The problem is beyond our Internet provider; it appears to be on the Global Crossing network.

Our ISP has a ticket open with Global Crossing for this issue.

Local Case engineers are monitoring this issue.
Degraded access to CNN and MSNBC has been noted.

We will advise as more information becomes available.


Created: 10/01/2009 14:30:11 by lxc152

Updates: 10/01/2009 18:45:30 by wxc16, 10/01/2009 18:48:46 by wxc16, 10/01/2009 19:20:35 by wxc16


Problem Report: Degraded Internet Connectivity

Problem:   Degraded Internet Connectivity
Cause:     Unexpected ISP maintenance issue
Affects:   Case Network Connectivity to the Internet
Started:   09/30/2009 04:16 PM
Resolved:  

Notes:

UPDATE: 10/1 19:16 ISP able to resolve routing issue at their end. Internet Connectivity restored. Engineers will continue to monitor the Case network.

10/1 14:25 - The Case network is experiencing another intermittent Internet outage due to a problem with our ISP. The ISP is aware of the issue and is working on resolving it. Engineers will continue to monitor.

9/30 17:01 - Our ISP informed us that the Internet connectivity issue has been partially resolved. The problem was caused by ISP maintenance which resulted in improper routing of Case network traffic. Engineers will continue to monitor this issue.


9/30 16:37 - The problem appears to be beyond our ISP. Engineers continue to monitor the issue.

Engineers are aware of an intermittent Internet connectivity issue. No problem was found with network connections inside campus. End users may experience problems connecting to the Internet. The problem appears to be at the ISP's end. Engineers are contacting ISP support.


Created: 09/30/2009 16:18:55 by wxc16

Updates: 09/30/2009 16:41:29 by wxc16, 09/30/2009 17:05:38 by wxc16, 10/01/2009 14:20:46 by lxc152, 10/01/2009 14:30:16 by wxc16, 10/01/2009 19:20:43 by wxc16


Problem Report: VPN Unavailable

Problem:   VPN Unavailable
Cause:     unknown
Affects:   any VPN User
Started:   09/25/2009 07:37 AM
Resolved:  

Notes:

Engineers are investigating


Created: 09/25/2009 07:38:37 by man27

Updates:


Problem Report: Wireless Network Outage

Problem:   Wireless Network Outage
Cause:     Unknown
Affects:   Campus Wireless Network
Started:   08/31/2009 09:00 AM
Resolved:  

Notes:

Engineers received reports of a wireless outage throughout campus. An engineer is investigating.


Created: 08/31/2009 13:48:57 by wxc16

Updates:


Problem Report: Fiji House

Problem:   Fiji House
Cause:     Power event
Affects:   data, telephony
Started:   08/11/2009 03:20 AM
Resolved:  

Notes:

VG248 did not come back after the power was cycled.
Switch is up, but the switch config is lost. No one is in the house yet.
Engineering got the analog phones working. Data is still down.


Created: 08/11/2009 10:21:57 by jhm

Updates: 08/11/2009 11:00:54 by jhm


Problem Report: Network packet loss

Problem:   Network packet loss
Cause:     Unknown
Affects:   All Internet Traffic
Started:   07/30/2009 09:23 AM
Resolved:  

Notes:

Engineers are investigating the problem
Users are experiencing intermittent degradation in internet connection speed.


Created: 07/30/2009 09:31:06 by lxc152

Updates: 07/30/2009 09:37:35 by man27


Problem Report: Case VPN Services - zero network connectivity after VPN session is established.

Problem:   Case VPN Services - zero network connectivity after VPN session is established.
Cause:     Unknown
Affects:   Case VPN Services
Started:   07/20/2009 09:00 AM
Resolved:  

Notes:

Users may experience zero network connectivity after establishing a Case VPN session. We suspect the VPN server's threat detection erroneously dropped the return traffic to the user. The VPN server's threat detection has been restarted. Engineers continue to monitor.

Workaround:

Disconnect VPN session and reconnect.


Created: 07/24/2009 18:11:27 by wxc16

Updates:


Problem Report: Case Voicemail Performance Degraded

Problem:   Case Voicemail Performance Degraded
Cause:     Unknown
Affects:   Degraded voicemail services
Started:   07/20/2009 09:00 AM
Resolved:  08/03/2009 05:00 PM

Notes:

The voicemail system's hardware has been replaced, and its software has been upgraded to version 2.4. The delayed LDAP response is no longer seen in the new version of the software.

The voicemail system has returned to normal performance. Engineers will continue to monitor the system.

Users may experience a delay when retrieving voicemail messages via the telephone: up to 30 seconds may pass after the passcode is entered before the system responds and plays the user's voicemail messages.

Workaround:
1) Hang up and retry
2) Retrieve voicemail via your email.

Sorry for the inconvenience. Engineer is working on resolving this issue.


Created: 07/24/2009 17:33:39 by wxc16

Updates: 07/30/2009 17:14:39 by wxc16, 08/07/2009 22:19:50 by wxc16


Problem Report: Sympa Mailing List Server is down

Problem:   Sympa Mailing List Server is down
Cause:     Database server is down
Affects:   All mailing lists and admin aliases
Started:   07/15/2009 10:50 AM
Resolved:  07/15/2009 03:30 PM

Notes:

The mailing list server is down because the database server crashed and Sympa can not run without a database.

The database server group is working on restoring the database server.

Update: The DB was restored and mail flowing again at 2:15pm. All queued messages were delivered by 3:30pm. We are now running normally.


Created: 07/15/2009 12:06:49 by emr

Updates: 07/15/2009 15:47:12 by emr


Problem Report: Unix Server DB7 is having problems

Problem:   Unix Server DB7 is having problems
Cause:     Unknown Hardware Problem
Affects:   All Non PeopleSoft Production Oracle Databases
Started:   07/15/2009 10:45 AM
Resolved:  07/15/2009 02:15 PM

Notes:

The Unix Server DB7 is currently experiencing problems and rebooted at approx. 10:45 am. This currently affects 40 Oracle databases including Blackboard, Advanced Contributions, All APEX Systems, APPWORX, Degree Audit, Dental School, KSL, ISIS, Pinnacle, Portal, MyCaseP, ONBASE, ONCORE, TeamTrack, T2Park, Serena. Server Engineering is currently working on the issue.

The server was rebooted and all disk file systems had to be manually mounted. All of the databases were up at 2:15 pm. Diagnostic files have been sent to the vendor.

The server vendor has determined that a catastrophic memory failure in the system's main memory caused the CPU panic and subsequent crash. We will schedule an emergency maintenance outage during the maintenance window as soon as we are in contact with the vendor's server engineer.
   


Created: 07/15/2009 11:41:13 by rxg263

Updates: 07/15/2009 12:04:11 by rxg263, 07/15/2009 12:10:27 by rxg263, 07/15/2009 15:10:23 by dxw134


Problem Report: KSL Data Center's CRAC DC-5 Humidity reading = 48%, setting = 44%

Problem:   KSL Data Center's CRAC DC-5 Humidity reading = 48%, setting = 44%
Cause:     unknown
Affects:   KSL Data Center
Started:   07/07/2009 05:30 AM
Resolved:  

Notes:

Facility Maintenance has been notified and apprised of the situation.


Created: 07/07/2009 11:55:30 by euw

Updates:


Problem Report: Pathology-p3-e1 cooling issue

Problem:   Pathology-p3-e1 cooling issue
Cause:     HVAC issues
Affects:   network equipment
Started:   06/22/2009 02:58 PM
Resolved:  07/21/2009 09:27 AM

Notes:

Resolved.

According to Facility, HVAC to the SER had to be shut down to fix flooding in the building.
There is minimal impact at the moment, but prolonged high temperature in the SER will affect the network equipment, potentially causing outages to wired, wireless, phone, and security panels in Pathology.


Created: 06/23/2009 09:04:48 by roo

Updates: 07/21/2009 09:27:12 by roo


Problem Report: Bingham Hub

Problem:   Bingham Hub
Cause:     Cooling problem in Hub
Affects:   Wired, wireless, phones, security panels for several buildings on south side
Started:   06/20/2009 02:48 AM
Resolved:  07/21/2009 10:09 AM

Notes:

Resolved and closed

June 24, 2009, 08:15 AM: We restored the back-up link between the Bingham Hub and the KSL Data Center.

June 24, 2009, 06:30 AM: We restored the main link between the Bingham Hub and the Crawford Data Center, which restored all network connectivity for the parts of South Campus served by the Bingham Hub. However, the back-up link between the Bingham Hub and the KSL Data Center is still down and needs further investigation.

June 24, 04:45 AM Lost several building networks because the AC failed again. Facility on the way.

June 22, 03:00 PM Hub experiencing cooling issues again. Plant services have been called. The SER AC keeps shutting down.

11:52PM: The line cards in the hubs have started experiencing temperature failures again. Called Plant Services to look into it. Looks like the AC keeps tripping off.

03:30 am Update: Plant Services is currently working on the SER cooling. The switch line cards are recovering from temperature failure.
Jun 20 03:33:45 EDT: %C6KENV-SP-4-MINORTEMPALARMRECOVER: module 9 outlet temperature crossed threshold #1(=60C). It has returned to normal operating temperature range.
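
The recovery message above appears to follow the usual Cisco system message layout (`%FACILITY-SEVERITY-MNEMONIC: text`, here with an `SP` sub-facility). As a minimal sketch, assuming other temperature messages are shaped like this one sample, the Python below pulls out the module number and threshold. The function name `parse_temp_alarm` and both regular expressions are illustrative only and are not part of any Case tooling.

```python
import re

# Layout assumed from the one sample line in this report:
#   %FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: message text
MESSAGE_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+(?:-[A-Z0-9_]+)*?)"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+): (?P<text>.*)"
)

# Wording of the temperature message, again taken from the sample only.
TEMP_RE = re.compile(
    r"module (?P<module>\d+) (?P<sensor>\w+) temperature crossed "
    r"threshold #(?P<threshold>\d+)\(=(?P<limit_c>\d+)C\)"
)


def parse_temp_alarm(line):
    """Return a dict of fields, or None if the line does not match."""
    msg = MESSAGE_RE.search(line)
    if msg is None:
        return None
    temp = TEMP_RE.search(msg.group("text"))
    if temp is None:
        return None
    return {
        "facility": msg.group("facility"),
        "severity": int(msg.group("severity")),
        "mnemonic": msg.group("mnemonic"),
        "module": int(temp.group("module")),
        "sensor": temp.group("sensor"),
        "threshold": int(temp.group("threshold")),
        "limit_c": int(temp.group("limit_c")),
    }


# The sample line, copied from the report above.
SAMPLE = (
    "Jun 20 03:33:45 EDT: %C6KENV-SP-4-MINORTEMPALARMRECOVER: module 9 "
    "outlet temperature crossed threshold #1(=60C). It has returned to "
    "normal operating temperature range."
)

if __name__ == "__main__":
    print(parse_temp_alarm(SAMPLE))
```

Lines that do not match either pattern return `None`, so the function can be run over a mixed log without raising.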

Services are coming back online.
Still monitoring.

Investigating.
Called Plant services to check cooling.
Suspect failed cooling in SER.
bingham-h0-e1>show mod
Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1 0009.11f7.e830 to 0009.11f7.e83f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  2 000c.ceb5.a900 to 000c.ceb5.a90f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  3 000c.ceb5.aa40 to 000c.ceb5.aa4f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  4 0003.feac.7772 to 0003.feac.7779   2.0    7.2(1)       3.5(1)       Ok
  5 000c.ce63.e864 to 000c.ce63.e867   2.1    7.7(1)       12.2(18)SXF1 Ok
  9 000d.6550.b866 to 000d.6550.b869   1.1    12.2(14r)S5  12.2(18)SXF1 MinFail

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
   1 Distributed Forwarding Card WS-F6K-DFC3A       SAD072004CL 1.0     MinFail
   2 Distributed Forwarding Card WS-F6K-DFC3A       SAD072300XR 1.0     MinFail
   3 Distributed Forwarding Card WS-F6K-DFC3A       SAD072004BU 1.0     MinFail
   5 Policy Feature Card 3       WS-F6K-PFC3A       SAD072100G1 1.1     Ok
   5 MSFC3 Daughterboard         WS-SUP720          SAD072100JS 1.2     Ok
   9 Distributed Forwarding Card WS-F6700-DFC3A     SAD074805CH 1.0     MinFail

bingham-h0-e1>
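
When several hubs need checking during a cooling event, reading `show mod` output by eye gets slow. The sketch below shows one possible way to list the modules whose status is not `Ok`, assuming the output keeps the row shape seen in the capture above (module number, MAC range, three version fields, status). The function name `failing_modules` is hypothetical, and the sample text is copied from this report rather than read from a live switch.

```python
import re

# First table of the "show mod" capture above, copied from this report.
SHOW_MOD = """\
Mod MAC addresses Hw Fw Sw Status
--- ---------------------------------- ------ ------------ ------------ -------
   1 0009.11f7.e830 to 0009.11f7.e83f 1.0 7.2(1) 8.5(0.46)RFW MinFail
   2 000c.ceb5.a900 to 000c.ceb5.a90f 1.0 7.2(1) 8.5(0.46)RFW MinFail
   3 000c.ceb5.aa40 to 000c.ceb5.aa4f 1.0 7.2(1) 8.5(0.46)RFW MinFail
   4 0003.feac.7772 to 0003.feac.7779 2.0 7.2(1) 3.5(1) Ok
   5 000c.ce63.e864 to 000c.ce63.e867 2.1 7.7(1) 12.2(18)SXF1 Ok
   9 000d.6550.b866 to 000d.6550.b869 1.1 12.2(14r)S5 12.2(18)SXF1 MinFail
"""

# A module row: number, "MAC to MAC", then Hw, Fw, Sw, Status.
# Assumption: no field contains internal whitespace, as in the capture.
ROW_RE = re.compile(
    r"^\s*(?P<mod>\d+)\s+"
    r"[0-9a-f.]+\s+to\s+[0-9a-f.]+\s+"
    r"\S+\s+\S+\s+\S+\s+"
    r"(?P<status>\S+)\s*$"
)


def failing_modules(show_mod_text):
    """Return the module numbers whose Status column is not 'Ok'."""
    failing = []
    for line in show_mod_text.splitlines():
        row = ROW_RE.match(line)
        if row and row.group("status") != "Ok":
            failing.append(int(row.group("mod")))
    return failing


if __name__ == "__main__":
    print(failing_modules(SHOW_MOD))
```

Header, separator, and sub-module rows do not match the MAC-range pattern, so they are skipped without special handling.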


Created: 06/20/2009 02:52:45 by roo

Updates: 06/20/2009 03:37:16 by roo, 06/20/2009 23:51:59 by roo, 06/22/2009 15:38:35 by roo, 06/24/2009 06:09:17 by roo, 06/24/2009 06:54:11 by euw, 06/24/2009 08:19:44 by euw, 07/21/2009 10:09:21 by roo


Problem Report: ERP Financials server outage

Problem:   ERP Financials server outage 
Cause:     Hardware error
Affects:   Provided Financial ERP services
Started:   06/15/2009 08:00 AM
Resolved:  

Notes:

Received a report that the financials process server was unavailable. Server Engineering staff responded onsite and found the server hung and reporting a hard disk error.

Power-cycled the server and the services became available after the reboot completed. Services ran in a degraded state as the rebuild of the disk was running.

The rebuild of the hard disk failed. The hard drive was replaced and the rebuild was attempted again, but it also failed.

Continuing to troubleshoot the issue.


Created: 06/17/2009 11:57:04 by rak7

Updates:


Problem Report: Scholars house segmented from CCN

Problem:   Scholars house segmented from CCN
Cause:     Suspecting hardware failure.  
Affects:   No users are in this building for the summer.  
Started:   06/08/2009 04:39 PM
Resolved:  

Notes:

Suspecting supervisor module failure.

Investigating.


Created: 06/08/2009 16:57:59 by jhm

Updates: 06/08/2009 18:31:38 by roo, 06/09/2009 08:47:52 by euw


Problem Report: LDAP Replica (ldap-replica7) is down

Problem:   LDAP Replica (ldap-replica7) is down
Cause:     Seems to be a disk problem
Affects:   Only people pointed directly at ldap-replica7.cwru.edu
Started:   05/22/2009 11:35 AM
Resolved:  

Notes:

[5/22/09 1:30 PM] - A reboot seems to have brought the confused disk back so the LDAP replica is back in operation. We have called the vendor to have a look at the system however and are leaving it out of any access paths for the moment in case the system dies permanently. We will close this problem report when we have a definitive closure to the problem.

The disk on which the LDAP server binaries are stored appears to no longer be mounted on the system (this is a local disk). Server Engineering is looking into the issue.

The system is one of several redundant LDAP replicas. The only applications affected will be those who are pointed directly at this particular LDAP replica.


Created: 05/22/2009 12:24:57 by dak

Updates: 05/22/2009 13:27:28 by dak


Scheduled Maintenance

Scheduled Maintenance: Veale-M1-E1; Module 3, WS-X6548-GE-TX; Bus Asic #0 transient Pb error. Recovered. (0x0002, 0x0000): Module needs troubleshooting or TAC

Problem:   Veale-M1-E1; Module 3, WS-X6548-GE-TX; Bus Asic #0 transient Pb error.  Recovered. (0x0002, 0x0000): Module needs troubleshooting or TAC
Cause:     unknown
Affects:   Up to forty-eight direct local-area-network data connections
Started:   11/09/2009 08:00 AM
Resolved:  11/09/2009 09:00 AM

Notes:

1. Install a WS-X6148V-GE-TX module in the vacant Slot 3.
2. Move all patches from Module 2 to Module 3.
3. The forty-eight directly connected end users may experience a brief service interruption, possibly only a few minutes, while the patches are being moved this afternoon.
4. We apologize for any inconvenience this brief network outage may cause.
5. Of the forty-eight direct network connections, forty-three are for general use and five serve the five vending machines.


Created: 11/06/2009 14:44:05 by euw

Updates:

