Emergency Maintenance
Emergency Maintenance: Development ERP Windows server needs to be rebooted.
Problem: Development ERP Windows server needs to be rebooted.
Cause: Equipment maintenance
Affects: Unknown
Started: 10/12/2009 12:00 PM
Resolved: 10/12/2009 12:30 PM
Notes:
The development PeopleSoft server Akita was having issues with the Symantec
Problem Report
Problem Report: Network Outage - Buildings on Bellflower Road and E.117th.
Problem: Network Outage - Buildings on Bellflower Road and E.117th.
Cause: Power Outage
Affects: see notes
Started: 05/26/2010 04:04 PM
Resolved:
Notes:
Power outage along Bellflower Road and E.117th. Affected buildings include AXO, ZBT, PGD, Bellflower House, Wolstein Hall, and all rental apartments on East 117th Street.
Security and Plant Services have been notified. No ETA at this point.
Created: 05/26/2010 16:09:44 by wxc16
Updates:
Problem Report: Self registration of computers on the wired network is down (setup.case.edu)
Problem: Self registration of computers on the wired network is down (setup.case.edu)
Cause: unknown
Affects: Unregistered computers on wired network
Started: 02/12/2010 02:14 PM
Resolved:
Notes:
Until the fix is completed, please call the help desk at 216-368-HELP and ask the analyst to register the computer for you.
Created: 02/12/2010 14:15:42 by man27
Updates:
Problem Report: Spam filter machine (mpspam3) for local mailboxes has died
Problem: Spam filter machine (mpspam3) for local mailboxes has died
Cause: We suspect a failed disk
Affects: About 1/4 of those people still not migrated to Google Apps
Started: 01/13/2010 07:15 AM
Resolved: 01/13/2010 10:00 AM
Notes:
[01/13/2010 10:30AM] - We have done the necessary work to remove the failed hardware from the mail delivery path. At this point mail that was queued on other systems waiting for the failed hardware has been delivered.
Mail is currently being delivered into the local (iPlanet) mail system without the benefit of any spam filtering other than the most basic for those people on the failed server (about 1700 people), who had spam filtering turned on at spamcontrol.case.edu. Connection to spamcontrol.case.edu is no longer possible for those individuals as well. Individuals who did not have spam filtering turned on will not notice any difference in the amount of spam received.
The failed equipment is no longer supported by the vendor so it is unlikely that we will be able to replace the failed hardware. For that reason, people are encouraged to migrate fully to their Google Apps mailbox immediately. Note that mail migration is required for everyone before February 2010 in any event.
One of the four machines that filter spam for the local (iPlanet) mailboxes has died. This machine only serves 1/4 of the people who have not yet migrated to Google Apps (about 1700 people).
We suspect a failed disk as the culprit. A mail administrator is on the way to see if the machine can be revived now.
While the system is down, no mail will be lost as it will just queue on other systems.
Created: 01/13/2010 07:29:18 by dak
Updates: 01/13/2010 10:34:39 by dak
Problem Report: Millis Science Center network is down.
Problem: Millis Science Center network is down.
Cause: Unknown
Affects: Network connectivity in Millis Science Center. Data and voice.
Started: 01/10/2010 12:59 AM
Resolved:
Notes:
An engineer is investigating the issue. No ETA at this point.
The engineer found that the network switches in the equipment room had hung. Rebooting the switches restored network connectivity.
Created: 01/10/2010 13:01:35 by wxc16
Updates: 01/10/2010 15:07:50 by wxc16
Problem Report: In Wickenden Hall, SER 1, both Cisco switches lost power at about ten o'clock this morning.
Problem: In Wickenden Hall, SER 1, both Cisco switches lost power at about ten o'clock this morning.
Cause: Normal mode failed on both Emerson Liebert UPSs at about ten o'clock this morning.
Affects: Data and voice, wired and wireless
Started: 12/16/2009 10:00 AM
Resolved:
Notes:
As of about eleven o'clock this morning, both Cisco switches are running off both Emerson Liebert UPSs, which are now running in bypass mode. Both internal batteries are no longer actively part of the electrical power circuits.
Created: 12/16/2009 11:06:18 by euw
Updates:
Problem Report: WRB Hall, Room 1-336, SER 3, UPS 1, WRB-P3-U1: its web page is not loading.
Problem: WRB Hall, Room 1-336, SER 3, UPS 1, WRB-P3-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 1-336, SER 3, UPS 1, WRB-P3-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT5000R-208.
Serial No. 030-700-300-6BW-502.
Created: 12/15/2009 13:53:56 by euw
Updates:
Problem Report: WRB Hall, Room 2-406, SER 5, UPS 1, WRB-P5-U1: its web page is not loading.
Problem: WRB Hall, Room 2-406, SER 5, UPS 1, WRB-P5-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 2-406, SER 5, UPS 1, WRB-P5-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT5000R-208.
Serial No. 030-700-302-0BW-502.
Created: 12/15/2009 13:43:16 by euw
Updates:
Problem Report: WRB Hall, Room 3-411, SER 6, UPS 1, WRB-P6-U1: its web page is not loading.
Problem: WRB Hall, Room 3-411, SER 6, UPS 1, WRB-P6-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 3-411, SER 6, UPS 1, WRB-P6-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT5000R-208.
Serial No. 030-700-302-2BW-502.
Created: 12/15/2009 13:35:47 by euw
Updates:
Problem Report: WRB Hall, Room 3-406, SER 7, UPS 1, WRB-P7-U1: its web page is not loading.
Problem: WRB Hall, Room 3-406, SER 7, UPS 1, WRB-P7-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 3-406, SER 7, UPS 1, WRB-P7-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT5000R-208.
Serial No. 030-700-301-9BW-502.
Created: 12/15/2009 13:28:06 by euw
Updates:
Problem Report: WRB Hall, Room 6-411, SER 12, UPS 1, WRB-P12-U1: its web page is not loading.
Problem: WRB Hall, Room 6-411, SER 12, UPS 1, WRB-P12-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 6-411, SER 12, UPS 1, WRB-P12-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT2-6000RT208.
Serial No. 082-90R-004-1BW-571.
Created: 12/15/2009 13:08:17 by euw
Updates:
Problem Report: WRB Hall, Room 6-406, SER 13, UPS 1, WRB-P13-U1: its web page is not loading.
Problem: WRB Hall, Room 6-406, SER 13, UPS 1, WRB-P13-U1: its web page is not loading.
Cause: Its web card may need to be reprogrammed and reseated.
Affects: Its SNMP card cannot be accessed remotely.
Started: 12/15/2009 10:33 AM
Resolved:
Notes:
WRB Hall, Room 6-406, SER 13, UPS 1, WRB-P13-U1: its web page is not loading.
Its web card may need to be reprogrammed and reseated.
Its SNMP card cannot be accessed remotely.
Model No. GXT2-6000RT208.
Serial No. 051-310-007-6BW-572.
Created: 12/15/2009 12:34:04 by euw
Updates:
Problem Report: Reported BOTNET infected system on the wireless network
Problem: Reported BOTNET infected system on the wireless network
Cause: Infected computer system on the wireless network
Affects: All wireless users that DO NOT VPN into the university from the wireless network
Started: 12/14/2009 10:21 AM
Resolved:
Notes:
Greetings,
The host(s) listed at the bottom of this message have been identified as likely bot infected. The specific type of bot infection may or may not be known.
If a source port is identified below, this is the source port used by the infected machine to contact a miscreant server.
Please examine this machine for signs of break-in. Should you feel you've received this report in error, please let us know.
Wireless users should VPN back into the university if they find websites that are not responding or are responding slowly.
Engineers are investigating now.
We currently do not have an ETA for repair.
All times are -0000 (UTC)
IP Address Timestamp
----------------------------------------
192.5.109.49 2009-12-13.02:35:28-0000 SrcPort:TCP/61700 MalwareType:Torpig
Created: 12/14/2009 10:26:12 by lxc152
Updates:
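The host entry in the botnet report above follows a fixed layout: IP address, UTC timestamp, source port, and malware type. Below is a minimal Python sketch for splitting such an entry into fields. It assumes every entry looks like the single example line in this report; the function name `parse_bot_report_line` and the regular expression are illustrative and are not part of any tool named in the report.

```python
import re
from datetime import datetime, timezone

# Assumed layout, taken from the one example entry in the report:
#   <IPv4> <YYYY-MM-DD.HH:MM:SS>-0000 SrcPort:<PROTO>/<port> MalwareType:<name>
LINE_RE = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2}\.\d{2}:\d{2}:\d{2})-0000\s+"
    r"SrcPort:(?P<proto>[A-Z]+)/(?P<port>\d+)\s+"
    r"MalwareType:(?P<malware>\S+)$"
)


def parse_bot_report_line(line):
    """Return a dict of fields for one report entry, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # The report states that all times are UTC (-0000).
    timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d.%H:%M:%S")
    return {
        "ip": match.group("ip"),
        "timestamp": timestamp.replace(tzinfo=timezone.utc),
        "protocol": match.group("proto"),
        "port": int(match.group("port")),
        "malware": match.group("malware"),
    }


if __name__ == "__main__":
    entry = "192.5.109.49 2009-12-13.02:35:28-0000 SrcPort:TCP/61700 MalwareType:Torpig"
    print(parse_bot_report_line(entry))
```

Lines that do not match the assumed layout return `None` rather than raising, so header and separator lines can be passed through the same function and skipped.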
Problem Report: Backup DHCP server for VoIP offline
Problem: Backup DHCP server for VoIP offline
Cause: Disk Failure
Affects: No one
Started: 11/20/2009 03:48 PM
Resolved:
Notes:
The Backup Server Roo (VoIP) suffered a boot disk failure.
Server Engineers are aware of the problem and looking at the server.
Created: 11/20/2009 16:09:42 by dnd
Updates: 11/20/2009 16:12:28 by dnd
Problem Report: CDC CRAC 2 ALARM
Problem: CDC CRAC 2 ALARM
Cause: unknown
Affects: CDC
Started: 10/19/2009 10:00 AM
Resolved:
Notes:
Facility Maintenance was contacted this morning.
Created: 10/19/2009 13:28:02 by euw
Updates:
Problem Report: Degraded Internet connectivity
Problem: Degraded Internet connectivity
Cause: Internet routing problem in the Global Crossing network
Affects: All Internet services (web browsing, email)
Started: 10/01/2009 02:24 PM
Resolved:
Notes:
UPDATE: 10/1 19:16 ISP was able to resolve the routing issue at their end. Internet connectivity restored. Engineers will continue to monitor the Case network.
UPDATE: 10/1 18:41 ISP is continuing to work on the problem. End users may have problems reaching the Internet. Off-campus users may also have problems reaching the Case network. No ETA at this point.
The problem is beyond our Internet provider.
The problem appears to be on the Global Crossing network.
Our ISP has a ticket open with Global Crossing for this issue.
Local Case engineers are monitoring this issue.
Degraded access to CNN and MSNBC has been noted.
We will advise as more information becomes available.
Created: 10/01/2009 14:30:11 by lxc152
Updates: 10/01/2009 18:45:30 by wxc16, 10/01/2009 18:48:46 by wxc16, 10/01/2009 19:20:35 by wxc16
Problem Report: Degraded Internet Connectivity
Problem: Degraded Internet Connectivity
Cause: Unexpected ISP maintenance issue
Affects: Case network connectivity to the Internet
Started: 09/30/2009 04:16 PM
Resolved:
Notes:
UPDATE: 10/1 19:16 ISP was able to resolve the routing issue at their end. Internet connectivity restored. Engineers will continue to monitor the Case network.
10/1 14:25 9/30 17:01 9/30 16:37 Engineers are aware of an intermittent Internet connectivity issue. No problem was found with network connections inside campus. End users may experience problems connecting to the Internet. The problem appears to be at the ISP's end. Engineers are contacting ISP support.
Created: 09/30/2009 16:18:55 by wxc16
Updates: 09/30/2009 16:41:29 by wxc16, 09/30/2009 17:05:38 by wxc16, 10/01/2009 14:20:46 by lxc152, 10/01/2009 14:30:16 by wxc16, 10/01/2009 19:20:43 by wxc16
Engineers are investigating.
Created: 09/25/2009 07:38:37 by man27
Updates:
An engineer received reports of a wireless outage throughout campus and is investigating.
Created: 08/31/2009 13:48:57 by wxc16
Updates:
The VG248 did not come back after a power cycle.
Created: 08/11/2009 10:21:57 by jhm
Updates: 08/11/2009 11:00:54 by jhm
Engineers are investigating the problem.
Created: 07/30/2009 09:31:06 by lxc152
Updates: 07/30/2009 09:37:35 by man27
Users may experience zero network connectivity after establishing a Case VPN session. We suspect the VPN server's threat detection erroneously dropped the returning traffic to the user. The VPN server's threat detection was restarted. An engineer continues to monitor.
Created: 07/24/2009 18:11:27 by wxc16
Updates:
The voicemail system's hardware has been replaced. The voicemail system's software has been upgraded to version 2.4. Delayed LDAP responses are no longer seen in the new version of the software.
The voicemail system has returned to normal performance. An engineer will continue to monitor the system.
Users may experience a delay when trying to retrieve voicemail messages via the telephone: up to 30 seconds may pass after entering the passcode before the system responds or plays the user's voicemail messages.
Created: 07/24/2009 17:33:39 by wxc16
Updates: 07/30/2009 17:14:39 by wxc16, 08/07/2009 22:19:50 by wxc16
The mailing list server is down because the database server crashed and Sympa cannot run without a database.
Created: 07/15/2009 12:06:49 by emr
Updates: 07/15/2009 15:47:12 by emr
The Unix Server DB7 is currently experiencing problems and rebooted at approx. 10:45 am. This currently affects 40 Oracle databases including Blackboard, Advanced Contributions, All APEX Systems, APPWORX, Degree Audit, Dental School, KSL, ISIS, Pinnacle, Portal, MyCaseP, ONBASE, ONCORE, TeamTrack, T2Park, Serena. Server Engineering is currently working on the issue.
Created: 07/15/2009 11:41:13 by rxg263
Updates: 07/15/2009 12:04:11 by rxg263, 07/15/2009 12:10:27 by rxg263, 07/15/2009 15:10:23 by dxw134
Facility Maintenance has been notified, and
Created: 07/07/2009 11:55:30 by euw
Updates:
Resolved.
According to Facility, HVAC to the SER had to be shut down to fix flooding in the building.
Created: 06/23/2009 09:04:48 by roo
Updates: 07/21/2009 09:27:12 by roo
Resolved and closed
2009, June 24, 08:15 AM, we restored the back-up link
2009, June 24, 06:30 AM, we restored the main link
June 24, 04:45 AM Lost several building networks because the AC failed again. Facility on the way.
June 22, 03:00 PM Hub experiencing cooling issues again. Plant services have been called. The SER AC keeps shutting down.
11:52 PM: The line cards in the hubs have started experiencing temperature failures again. Called Plant Services to look into it. It looks like the AC keeps tripping off.
03:30 AM Update: Plant Services is currently working on the SER cooling. The switch line cards are recovering from the temperature failure.
Investigating.
Created: 06/20/2009 02:52:45 by roo
Updates: 06/20/2009 03:37:16 by roo, 06/20/2009 23:51:59 by roo, 06/22/2009 15:38:35 by roo, 06/24/2009 06:09:17 by roo, 06/24/2009 06:54:11 by euw, 06/24/2009 08:19:44 by euw, 07/21/2009 10:09:21 by roo
Received a report that the financials process server was unavailable. Server Engineering staff responded onsite; the server was reporting a hard disk error and had hung.
Created: 06/17/2009 11:57:04 by rak7
Updates:
Suspecting supervisor module failure.
Created: 06/08/2009 16:57:59 by jhm
Updates: 06/08/2009 18:31:38 by roo, 06/09/2009 08:47:52 by euw
[5/22/09 1:30 PM] - A reboot seems to have brought the confused disk back, so the LDAP replica is back in operation. However, we have called the vendor to have a look at the system and are leaving it out of any access paths for the moment in case the system dies permanently. We will close this problem report when we have a definitive closure to the problem.
Created: 05/22/2009 12:24:57 by dak Updates: 05/22/2009 13:27:28 by dak
Problem Report: VPN Unavailable
Problem: VPN Unavailable
Cause: unknown
Affects: any VPN User
Started: 09/25/2009 07:37 AM
Resolved:
Notes:
Problem Report: Wireless Network Outage
Problem: Wireless Network Outage
Cause: Unknown
Affects: Campus Wireless Network
Started: 08/31/2009 09:00 AM
Resolved:
Notes:
Problem Report: Fiji House
Problem: Fiji House
Cause: Power event
Affects: data, telephony
Started: 08/11/2009 03:20 AM
Resolved:
Notes:
Switch is up. Switch config is lost. No one is in the house yet.
Engineering got the analog phones working. Data is still down.
Problem Report: Network packet loss
Problem: Network packet loss
Cause: Unknown
Affects: All Internet Traffic
Started: 07/30/2009 09:23 AM
Resolved:
Notes:
Users are experiencing intermittent degradation in internet connection speed.
Problem Report: Case VPN Services - zero network connectivity after VPN session is established.
Problem: Case VPN Services - zero network connectivity after VPN session is established.
Cause: Unknown
Affects: Case VPN Services
Started: 07/20/2009 09:00 AM
Resolved:
Notes:
Workaround:
Disconnect VPN session and reconnect.
Problem Report: Case Voicemail Performance Degraded
Problem: Case Voicemail Performance Degraded
Cause: Unknown
Affects: Degraded voicemail services
Started: 07/20/2009 09:00 AM
Resolved: 08/03/2009 05:00 PM
Notes:
Workaround:
1) Hang up and retry
2) Retrieve voicemail via your email.
Sorry for the inconvenience. Engineer is working on resolving this issue.
Problem Report: Sympa Mailing List Server is down
Problem: Sympa Mailing List Server is down
Cause: Database server is down
Affects: All mailing lists and admin aliases
Started: 07/15/2009 10:50 AM
Resolved: 07/15/2009 03:30 PM
Notes:
The database server group is working on restoring the database server.
Update: The DB was restored and mail flowing again at 2:15pm. All queued messages were delivered by 3:30pm. We are now running normally.
Problem Report: Unix Server DB7 is having problems
Problem: Unix Server DB7 is having problems
Cause: Unknown Hardware Problem
Affects: All Non PeopleSoft Production Oracle Databases
Started: 07/15/2009 10:45 AM
Resolved: 07/15/2009 02:15 PM
Notes:
The server was rebooted and all disk file systems had to be manually mounted. All of the databases were up at 2:15 pm. Diagnostic files have been sent to the vendor.
The server vendor has determined that a catastrophic memory failure in the system's main memory caused the CPU panic and subsequent crash. We will schedule an emergency maintenance outage during the maintenance window as soon as we are in contact with the vendor's server engineer.
Problem Report: KSL Data Center's CRAC DC-5 Humidity reading = 48%, setting = 44%
Problem: KSL Data Center's CRAC DC-5 Humidity reading = 48%, setting = 44%
Cause: unknown
Affects: KSL Data Center
Started: 07/07/2009 05:30 AM
Resolved:
Notes:
apprised of the situation.
Problem Report: Pathology-p3-e1 cooling issue
Problem: Pathology-p3-e1 cooling issue
Cause: HVAC issues
Affects: network equipment
Started: 06/22/2009 02:58 PM
Resolved: 07/21/2009 09:27 AM
Notes:
There is minimal impact at the moment, but prolonged high temperature in the SER will affect the network equipment, potentially causing an outage of wired, wireless, phone, and security panel services in Pathology.
Problem Report: Bingham Hub
Problem: Bingham Hub
Cause: Cooling problem in Hub
Affects: Wired, wireless, phones, security panels for several buildings on south side
Started: 06/20/2009 02:48 AM
Resolved: 07/21/2009 10:09 AM
Notes:
between the Bingham Hub and the KSL Data Center.
between the Bingham Hub and the Crawford Data Center,
which restored all network Connections and Connectivity,
for the South Campus, with regards to the Bingham Hub;
however, the back-up link between the Bingham Hub and
the KSL Data Center, it is still down, and it will need
to be further investigated.
Jun 20 03:33:45 EDT: %C6KENV-SP-4-MINORTEMPALARMRECOVER: module 9 outlet temperature crossed threshold #1(=60C). It has returned to normal operating temperature range.
Services are coming back online.
Still monitoring.
Called Plant services to check cooling.
Suspect failed cooling in SER.
bingham-h0-e1>show mod
Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1 0009.11f7.e830 to 0009.11f7.e83f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  2 000c.ceb5.a900 to 000c.ceb5.a90f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  3 000c.ceb5.aa40 to 000c.ceb5.aa4f   1.0    7.2(1)       8.5(0.46)RFW MinFail
  4 0003.feac.7772 to 0003.feac.7779   2.0    7.2(1)       3.5(1)       Ok
  5 000c.ce63.e864 to 000c.ce63.e867   2.1    7.7(1)       12.2(18)SXF1 Ok
  9 000d.6550.b866 to 000d.6550.b869   1.1    12.2(14r)S5  12.2(18)SXF1 MinFail
Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
   1 Distributed Forwarding Card WS-F6K-DFC3A       SAD072004CL 1.0     MinFail
   2 Distributed Forwarding Card WS-F6K-DFC3A       SAD072300XR 1.0     MinFail
   3 Distributed Forwarding Card WS-F6K-DFC3A       SAD072004BU 1.0     MinFail
   5 Policy Feature Card 3       WS-F6K-PFC3A       SAD072100G1 1.1     Ok
   5 MSFC3 Daughterboard         WS-SUP720          SAD072100JS 1.2     Ok
   9 Distributed Forwarding Card WS-F6700-DFC3A     SAD074805CH 1.0     MinFail
bingham-h0-e1>
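The `show mod` output above can be scanned for modules in a given state. Below is a small Python sketch. It assumes each data row starts with the module number and ends with the status word, as in the pasted output; `modules_with_status` is a hypothetical helper written for this log, not a Cisco tool.

```python
def modules_with_status(show_mod_output, status="MinFail"):
    """Return the sorted module numbers whose rows end with the given status.

    Assumption: a data row begins with a numeric module id and its last
    whitespace-separated token is the status. Header, separator, and
    prompt lines fail that test and are skipped.
    """
    found = set()
    for line in show_mod_output.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0].isdigit() and tokens[-1] == status:
            found.add(int(tokens[0]))
    return sorted(found)


if __name__ == "__main__":
    sample = (
        "Mod MAC addresses Hw Fw Sw Status\n"
        "1 0009.11f7.e830 to 0009.11f7.e83f 1.0 7.2(1) 8.5(0.46)RFW MinFail\n"
        "4 0003.feac.7772 to 0003.feac.7779 2.0 7.2(1) 3.5(1) Ok\n"
    )
    print(modules_with_status(sample))
```

Because the check splits on whitespace, it works whether or not the columns are aligned, and a module that appears in both the module table and the sub-module table is reported once.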
Problem Report: ERP Financials server outage
Problem: ERP Financials server outage
Cause: Hardware error
Affects: Provided Financial ERP services
Started: 06/15/2009 08:00 AM
Resolved:
Notes:
Power-cycled the server and the services became available after the reboot completed. Services ran in a degraded state as the rebuild of the disk was running.
The rebuild of the hard disk failed, the hard drive was replaced, the rebuild was attempted again and failed.
Continuing to troubleshoot the issue.
Problem Report: Scholars house segmented from CCN
Problem: Scholars house segmented from CCN
Cause: Suspecting hardware failure.
Affects: No users are in this building for the summer.
Started: 06/08/2009 04:39 PM
Resolved:
Notes:
Investigating.
Problem Report: LDAP Replica (ldap-replica7) is down
Problem: LDAP Replica (ldap-replica7) is down
Cause: Seems to be a disk problem
Affects: Only people pointed directly at ldap-replica7.cwru.edu
Started: 05/22/2009 11:35 AM
Resolved:
Notes:
The disk on which the LDAP server binaries are stored appears to no longer be mounted on the system (this is a local disk). Server Engineering is looking into the issue.
The system is one of several redundant LDAP replicas. The only applications affected will be those that are pointed directly at this particular LDAP replica.
Scheduled Maintenance
