Data migration after a storage performance issue, or stopping the collector for long periods of time, can cause the remote collectors to go out of sync. When the load collector is restarted, data is pulled quickly from collectors with less data and more slowly from collectors with more data, so the merged files contain data from a broad range of timestamps.
The fragmentation slows down the load considerably and requires compacting to bring the table to a healthy and performant state.
The best alternative is to update the PTLs to load the data using the current load time as the timestamp, which ensures no fragmentation regardless of the time order of the incoming data. Open a support ticket to request this change.
Although PTL updates are supposed to be, and historically have been, done by the Ignite PS team, L2 support agents may sometimes make the changes if they have the required knowledge and skill.
The attached updated PTL file contains the required timestamp parsing logic, as implemented by an L2 agent for a customer in this ticket.
Screenshot from file:
The change sets TS to the current load time, and the event timestamp can now be stored in a new EVENT_TS column. Calculate the current timestamp in the Perl parsing section rather than in the SQL part of the PTL: a current timestamp taken in the SQL part is a single, identical value, so larger files can produce large leaves, which cause uncompactable fragmentation. Calculating it in the Perl section guarantees distinct, more widely distributed timestamps at load time and zero fragmentation regardless of the time order of the data.
Once the PTL is updated, verify that fragmentation is not increasing in the table-level directory. Run this command several times during collector loading to count the fragments:
[sensage@mlpd538 nix_sshd2_event_ts-1.d]$ zcat NODE.dat | grep LeafIndex | wc -l
The number of fragmented leaves should remain unchanged, and the load speed should return to optimal.
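To avoid rerunning the check by hand, the count can be sampled in a loop. The following is a minimal sketch, not part of the product: the function names count_leaves and watch_leaves, the default of 5 samples, and the 60-second interval are hypothetical, and it assumes NODE.dat is gzip-compressed, as the zcat command above implies.

```shell
#!/bin/sh
# Hypothetical helpers for watching the fragment count during a load.

# Print the number of LeafIndex entries in a compressed NODE.dat file.
# grep -c counts matching lines, equivalent to "grep LeafIndex | wc -l".
count_leaves() {
    zcat < "$1" | grep -c LeafIndex
}

# Print one timestamped count per sample.
# Arguments: file, number of samples (default 5), seconds between samples (default 60).
watch_leaves() {
    file=$1
    samples=${2:-5}
    interval=${3:-60}
    i=1
    while [ "$i" -le "$samples" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(count_leaves "$file")"
        # Sleep between samples, but not after the last one.
        [ "$i" -lt "$samples" ] && sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, running `watch_leaves NODE.dat 10 30` from the table-level directory prints ten counts taken 30 seconds apart. A count that stays constant across samples is consistent with the updated PTL no longer adding fragmentation.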