Restoring Normal Collector Loading After Any SLS Cluster Incident or Restart

Overview

This article explains how to restore log loading when the collector stops loading logs after an SLS cluster restart.

 

Environment

All Sensage AP versions 

 

Requirements

A full Sensage AP deployment with a loading collector

 

Root Cause

The collector's loader process becomes unresponsive after the SLS restart and is not respawned.

 

Resolution

  1. Check for SLS corruption that might be affecting loading.
  2. Check the live collector error log for the following message, which repeats every 15 seconds for each affected loader:

    2018-08-29 15:33:25 local1.err blph841.bhdc.att.com collector[109518]: PARTS:7f464ffc0d7973c7d34895591a97fa17:0003:0001 2018-08-29T15:33:25+0000|109518|1|C2020|Controller::spawnLoader|line 794|ERROR [C2020]: <Collector - spawn loader> A Loader is not responding. [A Loader has disappeared and will not be respawned.  May have exited on error or been killed.  (name: l_microsoft_windows2008_iptv_load)]\nError stack:\nX at /opt/app/sensage/latest/bin/../lib/perl/Addamark/Log.pm line 349\n    Addamark::Log::log_error('C2020', 'l_microsoft_windows2008_iptv_load') called at /opt/app/sensage/latest/bin/../lib/perl/Addamark/Collector/Controller.pm line 794\n    Addamark::Collector::Controller::spawnLoader('Addamark::Collector::Controller=HASH(0x3492b10)', 'l_microsoft_windows2008_iptv_load') called at /opt/app/sensage/latest/bin/../lib/perl/Addamark/Collector/Controller.pm line 693\n    Addamark::Collector::Controller::manage

  3. Confirm whether the SLS was restarted, or had an issue, before a collector restart.
  4. Rename the .noload files to .log files, then restart the collector so that loading resumes correctly.
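
Steps 2 and 4 can be scripted. The following Python sketch is one possible approach, not a Sensage-provided tool. It assumes the C2020 entries look like the sample message above, that the .noload files sit directly in a single directory you pass in, and that replacing the .noload suffix with .log is the intended rename. The function names affected_loaders and rename_noload_files are hypothetical.

```python
import re
from pathlib import Path

# Captures the loader name from a C2020 "Loader is not responding" entry.
# Assumption: entries follow the format of the sample message in step 2.
C2020_LOADER = re.compile(r"ERROR \[C2020\].*?\(name: ([^)]+)\)")


def affected_loaders(log_lines):
    """Return the sorted, de-duplicated loader names found in C2020 errors."""
    names = set()
    for line in log_lines:
        match = C2020_LOADER.search(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def rename_noload_files(directory, dry_run=True):
    """Rename each *.noload file directly inside `directory` to *.log.

    Returns a list of (old_path, new_path) pairs. With dry_run=True the
    pairs are only reported and nothing on disk is changed. A file whose
    .log target already exists is skipped rather than overwritten.
    """
    planned = []
    for old_path in sorted(Path(directory).glob("*.noload")):
        new_path = old_path.with_suffix(".log")
        if new_path.exists():
            continue
        if not dry_run:
            old_path.rename(new_path)
        planned.append((old_path, new_path))
    return planned
```

Run rename_noload_files with dry_run=True first and review the returned pairs, then repeat with dry_run=False before restarting the collector. The directory that holds the .noload files depends on the collector configuration and is not specified here.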

 

Validation

The renamed files load correctly once the loader restarts.
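
As a rough way to confirm the result, the Python sketch below checks that no C2020 errors were logged after the restart time and that no .noload files remain. It assumes each log line begins with a timestamp in the form shown in the sample message (YYYY-MM-DD HH:MM:SS) and that the .noload files sit directly in one directory. Both function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def c2020_errors_since(log_lines, restart_time):
    """Return the C2020 lines stamped at or after `restart_time`.

    Assumption: each line starts with "YYYY-MM-DD HH:MM:SS", as in the
    sample message. Lines without a parseable timestamp are ignored.
    """
    recent = []
    for line in log_lines:
        if "ERROR [C2020]" not in line:
            continue
        try:
            stamp = datetime.strptime(line.lstrip()[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
        if stamp >= restart_time:
            recent.append(line)
    return recent


def remaining_noload_files(directory):
    """Return the *.noload files still present directly inside `directory`."""
    return sorted(Path(directory).glob("*.noload"))
```

If both functions return empty lists a few minutes after the restart, the loaders are most likely running again. This is an indirect check and does not replace confirming that the data is present in the SLS.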

 

Content Author: Miguel Molina
