Hi @Shashank_87,
This is going to be a long post, but please bear with it.
First, a bit on your GC logs. I don't think there are 2 different GC algorithms here. The first one, ParNew, is the eden (young) generation cleanup, and the second one is CMS on the oldgen. So basically, these logs come from one single application/process on a server.
With CMS, the eden generation cleanup IS a stop-the-world event. So for your questions, I suggest the following:
1- Frequency of GC: Split this into 2 parts, the eden (young) gen cleanup and the oldgen cleanup.
So you use something like this: |rex field=_raw "gcName:(?<eden>.*)"
Now, if you run the search over, say, the last 1 hour, |stats count(eden) gives you the number of eden cleanups in that hour. Note that this rex captures whatever follows gcName:, so restrict the base search to the ParNew events if you want the count for eden only. Which time period you choose for your frequency calculations is best left to you. In the same way you can extract the frequency for the second set, the oldgen.
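If you would rather get both counts from one search, a minimal sketch could look like the following. It assumes every event carries a single-token gcName value (for example ParNew for the young gen) that ends at whitespace, a comma or a semicolon; your_index, your_gc_sourcetype and the field name gc_name are placeholders of mine, not something taken from your logs:

```
index=your_index sourcetype=your_gc_sourcetype earliest=-1h
| rex field=_raw "gcName:\s*(?<gc_name>[^\s,;]+)"
| stats count AS gc_count BY gc_name
```

This should return one row per collector name with its count for the last hour. Swapping the stats line for |timechart span=1h count BY gc_name would show how the frequency changes over time.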
2- Time taken by GC: I can see 2 fields here, colTime & gCDuration. Which one do you want to use to determine the time taken for 1 cleanup cycle? Assuming you use colTime, your code will be something like this - |rex field=_raw "colTime:(?<time>.*)" BUT I have doubts about these values. GC times are generally in milliseconds, in which case the value of your oldgen cleanup time is far too high! Your application would not survive pauses that long. So instead you might want to use gCDuration. This is specific to what your JVM prints out, and Splunk cannot tell you which field holds the GC duration or how your GC logs capture it.
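As a rough sketch of the timing search, assuming gCDuration is written as a plain integer in each event and gcName is a single token; the index, sourcetype and the extracted field names gc_name and gc_duration are placeholders I made up:

```
index=your_index sourcetype=your_gc_sourcetype
| rex field=_raw "gcName:\s*(?<gc_name>[^\s,;]+)"
| rex field=_raw "gCDuration:\s*(?<gc_duration>\d+)"
| stats avg(gc_duration) AS avg_duration max(gc_duration) AS max_duration BY gc_name
```

Splitting by gc_name keeps the short young gen pauses from being averaged together with the oldgen cycles. If the duration in your logs is a decimal value or carries a unit suffix, the \d+ part of the pattern will need adjusting.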
3- Other stuff: Assuming you have come this far, there is a lot of analytics you can run on top of this. For example, you can run a predict command on the heap value at the end of each GC. So if I use something like this - |rex field=_raw "gCEnd:(?<heap>.*)" I can append a timechart and a predict to it: |rex field=_raw "gCEnd:(?<heap>.*)"|timechart span=15min avg(heap) as avg_heap|predict avg_heap AS predicted_heap algorithm=LLT upper90=high lower90=low future_timespan=10 holdback=10
This will predict your average heap for the next 10 intervals (at a 15 min interval, that comes to the next 2.5 hrs). To read more on the usage of the predict command and to customize your algorithm, refer here - https://docs.splunk.com/Documentation/Splunk/7.3.1/SearchReference/Predict
I would still recommend LLT, since this is a case which is not seasonal/cyclical but has a clear trend. You can then set a threshold of, say, 3-5GB as your total available heap, look 2.5 hrs into the future and see if your application memory is about to get exhausted.
Remember, each time a GC cleanup happens the heap drops, and the predict command will take that lower value of the heap into consideration.
Let us know how it goes 🙂