System Daily Maintenance
Author: Thai Tran
Exchange server
Physical Environmental Checks
- Verify that environmental conditions are tracked and maintained.
- Check temperature and humidity to ensure that environmental systems such as heating and air conditioning are set within acceptable ranges and function within the hardware manufacturer's specifications.
- Ensure that your physical network and related hardware such as routers, switches, hubs, physical cables, and connectors are operational.
Check Backups
- Make sure that the recommended minimum backup strategy of a daily online backup is completed.
- Verify that the previous backup operation completed.
- Analyze and respond to errors and warnings during the backup operation.
- Verify that the transaction logs were successfully purged (if your backup type purges logs).
Performance
- % Processor Time.
- Available MBytes.
- % Committed Bytes in Use.
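One way to act on these three counters is to compare each sampled value against a threshold. The sketch below assumes the samples have already been collected (for example, exported from System Monitor). The threshold values and the function name check_counters are illustrative assumptions, not recommendations from this document.

```python
# Hypothetical thresholds; tune them against your own baseline.
THRESHOLDS = {
    "% Processor Time": ("max", 85.0),          # alert when above the limit
    "Available MBytes": ("min", 100.0),         # alert when below the limit
    "% Committed Bytes in Use": ("max", 80.0),  # alert when above the limit
}


def check_counters(samples, thresholds=THRESHOLDS):
    """Return (counter, value, limit) tuples for every breached threshold."""
    breaches = []
    for name, value in samples.items():
        if name not in thresholds:
            continue  # no rule defined for this counter
        kind, limit = thresholds[name]
        too_high = kind == "max" and value > limit
        too_low = kind == "min" and value < limit
        if too_high or too_low:
            breaches.append((name, value, limit))
    return breaches
```

An empty result means every sampled counter is inside its limit. Anything returned is a candidate for follow-up during the daily check.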
Event Logs
- Filter application and system logs on the Exchange server to see all errors.
- Filter application and system logs on the Exchange server to see all warnings.
- Note repetitive warning and error logs.
- Respond to discovered failures and problems.
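The filtering and the "note repetitive entries" step can be sketched as follows. This assumes the log entries have already been exported into a list of dictionaries. The field names level, source, and event_id and the function name summarize_events are assumptions made for illustration.

```python
from collections import Counter


def summarize_events(events, min_repeats=2):
    """Keep errors and warnings, and count how often each one repeats.

    events: iterable of dicts with 'level', 'source', and 'event_id' keys
    (a hypothetical export format).
    Returns (relevant_events, repeated), where repeated maps
    (level, source, event_id) to its count when it occurs min_repeats
    times or more.
    """
    relevant = [e for e in events if e["level"] in ("Error", "Warning")]
    counts = Counter((e["level"], e["source"], e["event_id"]) for e in relevant)
    repeated = {key: n for key, n in counts.items() if n >= min_repeats}
    return relevant, repeated
```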
Exchange Database
- Check the number of transaction logs generated since the last check. Is the number increasing at the “usual” rate?
- Verify that databases are mounted.
- Make sure that public folder replication is up-to-date.
- If full-text indexing is enabled, verify that indexes are up-to-date.
- Using a test mailbox, verify logon to each database and the send/receive capabilities.
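The transaction log check above can be turned into a small comparison against the previous day's count. The sketch below assumes the logs are *.log files in a single directory and that a "usual" daily increase is already known from earlier observations. The function names and the 50% tolerance are hypothetical. Note that a backup that purges logs will reset the count, so compare only counts taken between purges.

```python
from pathlib import Path


def count_transaction_logs(log_dir):
    """Count the *.log files directly inside log_dir."""
    return sum(1 for _ in Path(log_dir).glob("*.log"))


def rate_is_usual(previous_count, current_count, usual_delta, tolerance=0.5):
    """Return True when the growth since the last check is near usual_delta.

    tolerance is the allowed relative deviation (0.5 means +/- 50%).
    """
    delta = current_count - previous_count
    low = usual_delta * (1 - tolerance)
    high = usual_delta * (1 + tolerance)
    return low <= delta <= high
```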
MAPI Client Performance and server availability
- Examine System Monitor counters.
- Examine Event Viewer logs.
- Verify that a test account can log on to the Exchange server and has send/receive capabilities.
- Verify the Performance Monitor RPC counters against a baseline: RPC average latency, RPC requests, and RPC operations.
Check Queue viewer
- Check queues for each server using the Queue Viewer tool in the Exchange Management Console.
- Record queue size.
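Recording the queue size is easier to keep up if each reading is appended to a running file. The sketch below is one possible approach, assuming the sizes are read off the Queue Viewer by hand and passed in as a dictionary. The CSV layout, the function name record_queue_sizes, and the queue names in the usage note are assumptions.

```python
import csv
from datetime import datetime


def record_queue_sizes(csv_path, sizes, when=None):
    """Append one row per queue: timestamp, queue name, message count."""
    when = when or datetime.now()
    stamp = when.isoformat(timespec="seconds")
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        for queue, size in sorted(sizes.items()):
            writer.writerow([stamp, queue, size])
```

For example, record_queue_sizes("queues.csv", {"Submission": 3, "Unreachable": 0}) adds two rows stamped with the current time, which builds up a history to compare against.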
Message Paths and Mail flow
- Send messages between internal servers using test accounts.
- Check and verify that messages deliver successfully.
- Send outgoing messages to non-local accounts.
- Check and verify that outgoing messages deliver successfully. With the test account on the external host, verify that mail comes in.
- Verify successful message transfer across connectors and routes.
Security Logs
- Mail Essential and Mail Security for Exchange (these licenses expire on 12/31/2011).
- View the security event log on Event Viewer and match security changes to known, authorized configuration changes.
- Investigate unauthorized security changes discovered in security event log.
- Check security news for the latest viruses, worms, and vulnerabilities.
- Update and fix discovered security problems and vulnerabilities.
- Verify that SMTP does not relay anonymously, or that relaying is locked down to the specific servers that require it.
- Verify that SSL is functioning for configured secure channels.
- Update virus signatures daily.
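The anonymous relay check can be probed from a host outside the trusted network. The sketch below uses Python's standard smtplib and is only a rough first test: it assumes the server rejects relay attempts at the RCPT stage, which is common but not universal, so a "not refused" result should be confirmed by hand. The function names and all addresses passed in are hypothetical. No message body is sent.

```python
import smtplib


def relay_refused(rcpt_code):
    """A 4xx or 5xx reply to RCPT means the recipient was rejected."""
    return 400 <= rcpt_code < 600


def probe_relay(host, sender, external_rcpt, port=25, timeout=10):
    """Try an unauthenticated MAIL/RCPT exchange and report the outcome.

    Returns True when the server refuses the attempt. The transaction is
    reset before any data is sent, so no mail is delivered.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.ehlo_or_helo_if_needed()
        mail_code, _ = smtp.mail(sender)
        if mail_code >= 400:
            return True  # the sender was already rejected
        rcpt_code, _ = smtp.rcpt(external_rcpt)
        smtp.rset()  # abandon the transaction without sending data
        return relay_refused(rcpt_code)
```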
Note: All the backups sync to the local hard drive.
CRM, OnContact Server
- Verify that SQL Services are running (SQL Agent).
- Verify that SQL Agent jobs succeeded.
- Verify that spindles have free space.
- Verify that data and log files for each database have free space.
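The volume-level part of the free space checks can be scripted with the standard library. The sketch below assumes a 15% minimum free fraction, which is an illustrative figure rather than a value from this document, and the function name free_space_report is hypothetical. It reports free space on the volume only; free space inside the SQL data and log files themselves has to be checked in SQL Server.

```python
import shutil


def free_space_report(paths, min_free_fraction=0.15):
    """Return (path, free_fraction, ok) for the volume holding each path."""
    report = []
    for path in paths:
        usage = shutil.disk_usage(path)
        # Guard against a zero-sized volume to avoid dividing by zero.
        free_fraction = usage.free / usage.total if usage.total else 0.0
        report.append((path, free_fraction, free_fraction >= min_free_fraction))
    return report
```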
Check Backups
- Make sure that the recommended minimum backup strategy of a daily online backup is completed.
- Verify that the previous backup operation completed.
- Verify that full backups succeeded.
- Verify that transaction log backups succeeded.
- Analyze and respond to errors and warnings during the backup operation.
- Verify that the transaction logs were successfully purged (if your backup type purges logs).
Performance
- % Processor Time.
- Available MBytes.
- % Committed Bytes in Use.
Event Logs
- Filter application and system logs on the SQL server to see all errors.
- Filter application and system logs on the SQL server to see all warnings.
- Note repetitive warning and error logs.
- Respond to discovered failures and problems.
Note: All the backups sync to the local hard drive.
Infusion server
- Verify that SQL Services are running (e.g. SQL Agent).
- Verify that SQL Agent jobs succeeded.
- Verify that spindles have free space.
- Verify that data and log files for each database have free space.
Check Backups
- Make sure that the recommended minimum backup strategy of a daily online backup is completed.
- Verify that the previous backup operation completed.
- Verify that full backups succeeded.
- Verify that transaction log backups succeeded.
- Analyze and respond to errors and warnings during the backup operation.
- Verify that the transaction logs were successfully purged (if your backup type purges logs).
Performance
- % Processor Time.
- Available MBytes.
- % Committed Bytes in Use.
Event Logs
- Filter application and system logs on the SQL server to see all errors.
- Filter application and system logs on the SQL server to see all warnings.
- Note repetitive warning and error logs.
- Respond to discovered failures and problems.
Note: All the backups sync to the local hard drive.
Time Clock Server
- Clock communication – general items.
- Clock communication – error messages.
- Clock communication – error situational problems.
- Make sure that the recommended minimum backup strategy of a daily online backup is completed.
- Verify that the previous backup operation completed.
- Verify that full backups succeeded.
- Verify that transaction log backups succeeded.
- Analyze and respond to errors and warnings during the backup operation.
- Verify that the transaction logs were successfully purged (if your backup type purges logs).
Note: All the backups sync to the local hard drive.
File Server
- Check application and system logs on the server to see all errors.
- Check application and system logs on the server to see all warnings.
- Note repetitive warning and error logs.
- Respond to discovered failures and problems.
- Use daily data from the event log and System Monitor.
- Check on disk usage.
- Check on memory and CPU usage.
- Check uptime and availability.
- List the top generated, resolved, and pending incidents.
- Create solutions for unresolved incidents.
- Check that anti-virus definitions are updated in a timely manner.
- Check server and network status for the overall organization and segments.
- Check organizational performance and availability.
- Check risk analysis and evaluation including upcoming changes.
- Check capacity, availability, and performance reviews.
- Review items that have not met target objectives.
Note: Backups on this server sync to the NAS.
Spark Server
- Check disk space availability.
- Check status of backups.
- Check that the pmon process is running.
- Verify that there are no unexpected changes to /etc/passwd, /etc/shadow, /etc/hosts, and /etc/group.
- Check the latest entries in the logs.
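Checking the four /etc files for changes is easier when each day's content is compared against stored digests. The sketch below is one possible approach using SHA-256. The function names and the idea of keeping a baseline dictionary are assumptions, and reading /etc/shadow normally requires root privileges.

```python
import hashlib
from pathlib import Path

# Files named in the checklist above.
WATCHED = ["/etc/passwd", "/etc/shadow", "/etc/hosts", "/etc/group"]


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_baseline(paths):
    """Record the current digest of every path."""
    return {str(p): file_digest(p) for p in paths}


def changed_files(paths, baseline):
    """Return the paths whose digest differs from, or is missing in, the baseline."""
    return [p for p in paths if baseline.get(str(p)) != file_digest(p)]
```

A typical routine would be to call make_baseline(WATCHED) once after an authorized change, store the result, and run changed_files(WATCHED, baseline) during each daily check. Any path it returns needs to be matched to a known change.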
Note: Back up users/groups manually from the web GUI.
SWdev Server (Software Development)
- Check disk space availability.
- Check status of backups.
- Check that the pmon process is running.
- Verify that there are no unexpected changes to /etc/passwd, /etc/shadow, /etc/hosts, and /etc/group.
- Check the latest entries in the logs.
Web Server (www)
- Check disk space availability.
- Check status of backups.
- Check that the pmon process is running.
- Verify that there are no unexpected changes to /etc/passwd, /etc/shadow, /etc/hosts, and /etc/group.
- Check the latest entries in the logs.
Note: Backup sync/mirror to the internal drive and NAS.
Emulator Server (BoSanova)
- User manual and installation procedures: \\NAS\public\Software.apps\ES.server.Bosanova\DOCS
- Check that the emulator server services are running.
- Check user connectivity.
Router/Switches/Firewall gateway
- Check system monitor, CPU usage, uptime, disk usage, system load, and performance.
- Check web security, black list, custom sites, and policies.
- Check and monitor remote user/VPN settings and logs.
- Assign and adjust network configuration settings so that the assigned IP addresses are applied correctly.
- Check system logs, error messages, and system diagnostics to analyze network connectivity.
Suggestions
- Need to redesign the network infrastructure to improve productivity and connectivity and to eliminate downtime and points of failure.
- All production servers need to be replaced at least once every five years.
- Need to replace all the home-built servers: Infusion, Oncontact, and Timeclock. These servers lack the hardware redundancy needed for a production environment.
- Need to rebuild and replace the file server because of hardware failure and because it is running out of space.
- Need to rebuild and upgrade the Exchange server to Exchange 2010 with backup and restore software licenses.
- Need a new gateway router that can monitor Audina bandwidth, productivity, and threats from the outside world.
- Need new network switches.
- Need to re-wire the whole network infrastructure.
- Need to install a patch panel.
- Eliminate all the small network switches; they cause slowness and bottlenecks on the network.
- Need to replace all QC computers except Sherry’s computer.
- Need more Internet bandwidth for better productivity.
NOTE: I have been making these suggestions since my first day. Keep in mind that my intention here is to protect Audina’s data. — Thai Tran 2011/10/28 12:19