For a customer, quite a few Linux servers had to be monitored. During the roll-out of the OM12 SP1 Agents to these Linux systems, several errors popped up. Thanks to a highly experienced Linux guru working for this customer, these issues were sorted out pretty fast. Based on this experience I have made a table of the most common errors, their possible causes and their fixes.
|Issue||Cause & Resolution|
|DNS configuration error||01: Faulty reverse DNS lookup zone. Once fixed, all went just fine. 02: The Linux system had multiple names, all registered in DNS. After a couple of retries the Agent landed properly. 03: The system resided in an old network segment which didn’t have a zone on the new DNS servers. Once fixed, all went just fine.|
|Failed during SSH Discovery||01: SSH was locked down to root only. Once SSH access was also allowed for the account OM12 SP1 uses on the Linux systems, all went just fine. 02: An outdated version of SSH which isn’t compatible with the .NET SSH implementation Microsoft uses on the OM12 SP1 side. SSH requires an update. 03: An outdated version of SSH which doesn’t accept certain SSH calls. SSH requires an update.|
|Failed to install kit||01: The home folder of the OM12 SP1 Linux account was missing. After adding this folder, all went just fine. 02: Certain files were locked. When the installation of the OM12 SP1 Agent was retried some hours later, all went just fine.|
|Installation hangs||On some systems the installation of the OM12 SP1 Linux Agent just hung. I had to force-close the OM12 SP1 Console, after which a second attempt went just fine.|
|Unexpected Discovery Result||01: Reason unknown. A second attempt (some hours later) ran just fine. 02: Fixed by restarting the OM12 SP1 services on the OM12 SP1 Management Server (MS) running the Discovery (be careful though): http://www.opsman.co.za/?p=50|
|WinRM cannot complete the operation||A firewall was blocking the WinRM traffic. After the port (TCP 1270) was opened it still didn’t work. See this posting to get it working: http://blogs.technet.com/b/chandanbharti/archive/2011/12/21/linux-agent-install-issue.aspx|
|Agent verification failed||Multiple DNS issues: 01: The Linux system’s hostname differs from its FQDN. Correct either one (hostname or FQDN) and all is just fine. 02: The DNS record isn’t present. Add the record and all is just fine.|
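Since most of the issues in the table come down to DNS or a blocked port, a quick pre-flight check before running the Discovery can save a few retries. The sketch below is not part of OM12 and is only one way to do it: it checks that the forward and reverse DNS lookups agree for a given FQDN, and that the SSH port and TCP 1270 can be reached. The function names are my own, and port 22 is an assumption (the SSH default); adjust it when SSH listens elsewhere.

```python
import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of DNS problems for fqdn (an empty list means it looks consistent).

    The forward and reverse parameters are only there so the lookups can be
    swapped out (for example in a test); by default the system resolver is used.
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [f"no forward (A) record for {fqdn}: {exc}"]

    try:
        ptr_name, aliases, _ = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return [f"no reverse (PTR) record for {ip}: {exc}"]

    # The name found by the reverse lookup should match the FQDN we started with.
    names = {ptr_name.lower(), *(alias.lower() for alias in aliases)}
    if fqdn.lower() not in names:
        return [f"reverse lookup of {ip} returns {ptr_name}, not {fqdn}"]
    return []


def port_open(host, port, timeout=3.0):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(fqdn, ports=(22, 1270)):
    """Run the DNS and port checks; 22 is assumed for SSH, 1270 is the port from the table."""
    problems = check_dns(fqdn)
    for port in ports:
        if not port_open(fqdn, port):
            problems.append(f"TCP port {port} on {fqdn} is not reachable")
    return problems
```

Run it from the Management Server that performs the Discovery, for example `preflight("linux01.contoso.local")` (a made-up host name). An empty list means none of these particular checks failed; it does not cover the SSH version or account issues from the table. Note that port 1270 only answers once the Agent is installed, so before the first installation that check is expected to fail.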
Other resources for troubleshooting OM12 SP1 UNIX/Linux Agent installation issues:
- Bob Cornelissen: http://www.bictt.com/blogs/bictt.php/2011/05/29/scom-trick-15-cross-platform
- Microsoft TechNet Wiki: http://social.technet.microsoft.com/wiki/contents/articles/4966.troubleshooting-unixlinux-agent-discovery-in-system-center-2012-operations-manager.aspx
- Stefan Roth: http://blog.scomfaq.ch/2012/09/11/scom-2012-linux-discovery-unspecified-failure/
- Enabling logging and debugging in OM12: http://technet.microsoft.com/en-us/library/hh212862
- Microsoft TechNet – Troubleshooting UNIX/Linux monitoring: http://technet.microsoft.com/en-us/library/hh212885
Other useful resources, all related to UNIX/Linux monitoring with OM12: