Category Archives: File Server

24 Hour Faculty/Staff Novell Hardware Upgrade Outage

Info Systems proposes to install new hardware for the Faculty/Staff Novell server (FS) on Tuesday, 10/21/2003 during the Fall Semester break. This process will begin with a full quiescent backup (i.e. all users logged out of the server). All users should be logged off the Faculty/Staff Novell server by 6:00pm on Monday evening, October 20.

The installation process will take most of the day on Tuesday, 10/21. Users should NOT expect to be able to access the FS network disk space (i.e. faculty/staff P: G: Z: drives) during normal working hours on Tuesday, 10/21.

While the actual hardware upgrade involves only the FS server, a precautionary full Novell Systems backup will be performed between 6:00am and 9:00am on Tuesday, 10/21. During this time ALL Novell servers will be off-line, which means that the EMU email and calendar systems will also be unavailable (email and calendar access authorization services are provided via LDAP through the Novell NDS service).

– – – – – – – – – – – – – SUMMARY – – – – – – – – – – – – –

F/S Novell Server
UNAVAILABLE 6:00pm Mon, 10/20 thru 11:30pm Tue, 10/21

ALL Novell Servers
UNAVAILABLE 6:00am thru 12:00pm Tue, 10/21

EMail and Oracle Calendar Servers
UNAVAILABLE 6:00am thru 12:00pm Tue, 10/21

– – – – – – – – – – – – – – – – – – – – – – – – – – – – – –
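The length of each window in the summary above can be checked with a short calculation. This is only a sketch: the start and end times are copied from the summary, while the helper names `outage_hours` and `is_down` are made up for illustration.

```python
from datetime import datetime

# Outage windows copied from the summary above (local time, October 2003).
WINDOWS = {
    "F/S Novell Server": (datetime(2003, 10, 20, 18, 0), datetime(2003, 10, 21, 23, 30)),
    "ALL Novell Servers": (datetime(2003, 10, 21, 6, 0), datetime(2003, 10, 21, 12, 0)),
    "EMail and Oracle Calendar Servers": (datetime(2003, 10, 21, 6, 0), datetime(2003, 10, 21, 12, 0)),
}


def outage_hours(start, end):
    """Return the length of an outage window in hours."""
    return (end - start).total_seconds() / 3600


def is_down(service, when):
    """Return True if `when` falls inside the planned window for `service`."""
    start, end = WINDOWS[service]
    return start <= when < end


for name, (start, end) in WINDOWS.items():
    print(f"{name}: {outage_hours(start, end):.1f} hours")
```

By this calculation the F/S window as listed runs 29.5 hours, and the other two windows run 6 hours each.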

IMPORTANT NOTE: A more detailed description of the outage timetable is available on a web page here. You are urged to click the link to see the detailed information.

===========================================================
Outage Follow Up Summary:

Delays were encountered from the start on Tues morning (10/21) due to problems with the overnight backup procedures plus an unexplained crash of the FSAPPS server. By noon the FS migration process was at least 3 hours behind schedule.

The migration process was started about 12:30pm and finished about 6:30pm. While doing various validation and maintenance procedures, a significant error was discovered about 7:30pm (-608, Volume ID number conflict). Research was done and a few procedures attempted without success. Jeremy Good was called and joined the effort about 9:00pm. This led eventually to placing a support call to Novell (available only because Jeremy has a CNE designation). By 10:15pm a Novell engineer was available and soon identified the problem: a known flaw in the architecture of the Migration Wizard. He provided a procedure which successfully corrected the problem.

By 11:30pm the system was stable, but it was the end of a very long day, particularly for Dan Marple Jr. Sophos and backup software were then installed and a server image was made. The final procedure was to launch a one-time full backup to tape.

One user item needed to be recreated (Brenda Fairweather) because her user record was purposely deleted as part of the testing procedure in the effort to find a fix to the -608 error. Voicemail was left with her alerting her to the need to use our assigned password and then go change it to one of her liking.

— Management summary by Jack Rutt —

FS hardware migration outage summary

Outage Follow Up Summary:

The backup for FS didn’t occur as expected. Seems Novastor virtual expects weekly tapes if the job is defined as a weekly, no matter what day the job occurs on. Got FSAPPS running again (crashed for some reason), and finally got a full DSREPAIR to work on it. Downed all servers but NS, imaged the sys partitions to NS. Downed NS and imaged it to FSAPPS.

Brought all servers back online, started a “real” tape backup of FS. After it completed, started the migration process. The migration process seemed to go relatively smoothly. After it completed around 6:30, post migration procedures were started. DSREPAIR failed with a -608 error when “checking volume objects and trustees”. This occurred because of an object ID conflict between the volume and another object. Of course this was the case for ALL the volumes. One was in conflict with a container, the others were in conflict with users. After various procedures and research on Novell’s site failed to correct the problem, Jeremy Good came in and we contacted Novell. Trevor stated the “error” was known and the best way to resolve the problem was by deleting the user objects and recreating them. (yeah, right). This was tried with one user – it worked. He then found a utility to change the object ID of a volume, “voleid.nlm”. This was tried and worked on the other volumes. Using this utility, we changed the volume object IDs one at a time, then ran DSREPAIR (check volume IDs and trustees), then repeated the cycle for the rest of the volumes. It was stated that this NLM doesn’t appear to work for NSS volumes – luckily we have traditional volumes. This process was completed around 11:30PM.

Novastor, pwrchute, and Sophos were added to the FS server and appeared to run fine. Changes were made to the autoexec.ncf to allow for automatic loading of these and a few other Novell NLMs. An image was made to FSAPPS and a full backup was started to “real” tape. What had yet to be done was to go through the “Performance tuning” TID and check a few other parameters.

— technical summary by Dan Marple Jr. —

NOVELL SERVERS: 10 Minute Outage for REBOOT, 5:30pm, Today

A problem has been identified following the Novell software upgrade of Saturday that will now require a reboot of all the Novell servers.

Info Systems will reboot the Novell servers at 5:30pm today. A message to save any open documents and log off Novell will be broadcast 10 minutes before the servers go down. The outage will last about 10 minutes.

This only affects network disks, network printers and authentication to email and calendar.

If users do NOT log off Novell by 5:30pm, it would be best for them to reboot their computers in order to properly reconnect to the Novell servers after the outage.

Novastor backup problems

The Novell backups failed after the SP6 Novell upgrade.

The Novastor “nnadmin” was unable to see the ultrium tape or the other Novell servers.

After searching the Novell and Novastor websites a TID was found that stated: In Novastor, communication between Novell servers may fail after the installation of SP6. If this occurs you must install the latest TCP stack drivers found on Novell’s website.

These were installed and seemed to take care of the Novell communication problem. HOWEVER, we still had the problem with the ultrium tape drive.

After much digging and trying different/updated drivers we found that in “NNADMIN” the wrong/corrupted SCSI driver was being referenced for the ultrium tape drive. This led us to find out the correct driver was not being loaded due to an “invalid slot” failure. When loading the driver manually, it stated which slots must be referenced. After correcting the slot value, all seemed to work well.

NOTE:
During this testing we found out that if you change the load order of drivers in the “startup.ncf” file, NNADMIN assigns a different value to those drivers. This will cause errors in the backup because the device you selected to back up to will no longer exist.
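The effect described in this note can be pictured with a toy model. This is only an illustration under a loud assumption: it supposes device values are handed out by position in the load order, which is a guess at the behavior we observed and not documented NNADMIN internals. The driver names and the function `assign_device_ids` are hypothetical.

```python
def assign_device_ids(load_order):
    """Number drivers by their position in the load order (assumed model only)."""
    return {driver: index for index, driver in enumerate(load_order)}


# Hypothetical driver names; they stand in for entries in startup.ncf.
original_order = ["scsi_a.ham", "scsi_b.ham"]
reordered = ["scsi_b.ham", "scsi_a.ham"]

# A backup job remembers the value assigned to its target at definition time.
ids_before = assign_device_ids(original_order)
saved_target = ids_before["scsi_b.ham"]

# After the load order changes, the same stored value points somewhere else.
ids_after = assign_device_ids(reordered)
resolved = {index: driver for driver, index in ids_after.items()}.get(saved_target)

print(f"job stored value {saved_target}, which now resolves to {resolved}")
```

In this model the job defined against the second driver ends up pointing at the first one after the reorder, which matches the symptom of the selected backup device no longer being found.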

Currently the adpt160m.ham driver is the only updated driver. If there are backup problems, this will need to be rolled back.

Novell Server Service Pack 6 Upgrade

Novell Service Pack 6 needs to be installed prior to the hardware upgrade migration process planned for late October or early November.

The planned outage for this Service Pack upgrade is Saturday morning, 10/04/03.

Systems that will be UNAVAILABLE:
Novell Network Disks/Printers (FS, ST, FSAPPS, STAPPS)
Novell NDS Authentication Services:
– EMail (all clients)
– Calendar

Systems that WILL REMAIN available:
EMU Web Server (www.emu.edu)
Blackboard (http://bb.emu.edu)
Sadie (Library Catalog System)
AS400 (CampusWeb)

The installation is expected to take 6 to 9 hours. The process will begin about 6am.
——————————

Close-out Note as of 1:00pm, Saturday 10/04/03:

This upgrade process went very well. All servers returned to service by 11:30am, Saturday, 10/4/03.

Thank you, Dan Marple Jr, for doing a superb job of planning and implementation of this service pack upgrade that involved 6 production Novell servers. This is probably a first to have *NO GLITCHES*!

— Jack Rutt —

= = = = = = = Follow up Comments = = = = = = = =

In preparing for the SP6 upgrade I ran DSREPAIR. This gave me errors on 4 servers. This error was a -771 in reference to “error initializing schema cache”. No matter what I did I couldn’t resolve the error. Following TID 10063329 I did the following on the NS server:

Downed it,
restarted using “server -ndb”,
ran a local repair,
downed it again,
restarted normally.

This seemed to resolve the errors. I decided not to do anything further till Sat morning. On Sat morning I bounced all servers and ran a DSREPAIR on all of them. No errors were reported. Why – I’m not sure.

What was accomplished this outage:

1. Install SP6 on the following servers: NS, LD, FSAPPS, FS, ST, STAPPS.
2. Image all servers before and after upgrade.
3. No drivers were changed during the upgrade.
4. No files were backed up using the SP backup routine.
5. FSAPPS was upgraded to 2GB memory.

— Dan Marple Jr. —

Novell server FSAPPS was crashing

The FSAPPS file server was behaving erratically. When trying to run DSREPAIR and do an unattended repair, the NLM hung.

Other access seemed to be normal, so nothing was initially done. After an hour or so, things started to slow down, bindery connections were not permitted, and other “funky” things happened.

All users of goldmine were notified to log off the system.

The server was NOT able to be brought down gracefully. Instead, the server was powered down, all volumes were “vrepaired”, and the server was restarted.

All seemed normal after that.

The reason for this could have been because of:
1. Not enough memory,
2. Memory incompatibility – two 512 chips and two 128 chips – Novell recommends all chips be the same.

— Dan Marple Jr. —

Faculty/Staff/Student Novell Server NIC Upgrades

All Novell servers will be down while new gigabit NIC cards are installed.

Added the gig card driver to the FSAPPS server. Removed the old 10/100 card. Everything went fine. The FS server did not have the driver loaded. The card was added and the driver wouldn’t load. Tried a different card in a different slot and it still wouldn’t load. Added the old 10/100 card back into the machine and went back to using it. Will research why the drivers for the gig card will not load.

Due to the time needed for the FS server, the Student servers were not changed and were brought back on line around 8:00AM.