
How to get capacity planning and availability reports in Splunk for servers?

ansif
Motivator


I want to build reports on server availability and capacity planning. Are there any ready-made searches available for these reports?


lakshman239
Influencer

Capacity planning is very broad. With the *nix and Windows TAs, you could monitor and trend CPU, memory, and disk utilisation over a period of time [say daily, monthly, yearly]. Based on your goals, you could then decide to procure additional hardware or disk when you are consistently hitting your threshold [e.g. disk usage is more than 75%].

You can also group performance by application, e.g. web server usage and database usage, and decide to procure servers only where they are needed to increase your scale/availability. A sketch of that kind of trend search is below.
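As a starting point, here is a minimal daily disk-trend sketch, assuming the Windows TA's Perfmon fields (counter, instance, Value) and the Perfmon:FreeDiskSpace sourcetype used later in this thread; the index is a placeholder, so adjust to your environment:

index=xxx sourcetype="Perfmon:FreeDiskSpace" counter="% Free Space" instance!="_Total"
| eval Used_percent=100-Value
| timechart span=1d avg(Used_percent) AS avg_disk_used_pct

To flag hosts that are consistently over a 75% threshold, aggregate by host instead:

index=xxx sourcetype="Perfmon:FreeDiskSpace" counter="% Free Space" instance!="_Total"
| eval Used_percent=100-Value
| stats avg(Used_percent) AS avg_used_pct by host
| where avg_used_pct > 75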


ansif
Motivator

Can I have a sample search to achieve this using average values of CPU, memory, and disk?


skoelpin
SplunkTrust

I actually just built a capacity planning solution for my organization. It uses machine learning to forecast when a server or cluster will run out of disk, and it doubles as a "what if machine": the user can walk through scenarios such as "if I remove 10TB from this cluster, when will it run out of disk?" You can also enter any future date and it will give you the forecasted disk usage at that date.
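For intuition, the "what if" part is just linear-model arithmetic. A minimal sketch with hypothetical field names (slope in GB/day, negative while the disk fills; y_intercept as GB free today):

| eval what_if_free_gb = y_intercept - 10240
| eval days_to_full = -what_if_free_gb / slope

Removing 10TB lowers the intercept by 10240 GB, and with a negative slope the drive hits zero free space after days_to_full days.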

You should first define what your future state will be and what you want to accomplish.


gowtham495
Path Finder

I'm working on a similar problem, @skoelpin. Could you please elaborate on your approach to solving this?
The problem I'm facing is that a single host has multiple mounts (C:, D:, etc.). My approach works well when the server list is small, but as the number of servers increases it becomes difficult.
Thanks in advance.


skoelpin
SplunkTrust

Sure, I had the same problem. We had to figure out a clever way to scale this, and we achieved it through a few methods. We started with a single drive and 5 clusters, 15 servers in total. I created 2 lookup files: the first holds host values to drive the first dropdown, and when the user selects the app, it dynamically populates the second dropdown so the user can select a single host or an aggregate of the cluster. The second lookup holds a row per host with the slope, y-intercept, and drive letter. Any time disk is purged or added, the y-intercept value changes, but the slope remains constant.

When we started to scale, we had to reduce our dependency on the lookups because it was getting difficult to maintain these values across hundreds of servers. We found a way to dynamically populate the slope value and created an additional dropdown for drives so we could handle multiple drives per host.

Another approach we took to match the model name to the selected host value was to use a good naming convention for the model names. If the user selects a hostname in the dropdown, that hostname is passed into the model name, which looks like this: | apply Forecasting_$HOST$.
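A sketch of how that second lookup can drive the forecast, with hypothetical names (a disk_models.csv lookup with host, drive, slope, and y_intercept columns; $HOST$ is the dashboard token mentioned above and $DRIVE$ is an assumed token for the drive dropdown):

| inputlookup disk_models.csv
| search host="$HOST$" drive="$DRIVE$"
| eval days_ahead=30
| eval forecast_free_gb = y_intercept + slope * days_ahead

With a negative slope, forecast_free_gb falls over time, and solving y_intercept + slope * t = 0 gives the number of days until the drive is full.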

One last word of advice: create short feedback loops to judge accuracy. You have to be confident in the results you're getting from the forecast, so dedicating a few panels to accuracy is important.

ansif
Motivator

Can I have a sample search to achieve this using average values of CPU, memory, and disk?


skoelpin
SplunkTrust

Sure. The SPL below just does disk, but you can easily add CPU and memory with additional counters.

index=xxx host=xxx sourcetype="Perfmon:FreeDiskSpace" (counter="% Free Space" OR counter="Free Megabytes") instance=G:
| eval FreeMBytes=if(counter="Free Megabytes", Value, null())
| eval storage_used_percent=if(counter="% Free Space", 100-Value, null())
| eval FreeGB=FreeMBytes/1024
| eval Free_percent=100-storage_used_percent
| timechart span=1d min(FreeGB) AS FreeGB min(Free_percent) AS Free_percent
| eval Used_percent=100-Free_percent
| eval Total_Cap=100*(FreeGB/Free_percent)

Next, I created a timeshift so I could create empty buckets for future values, then fed it into the MLTK to fill the empty buckets with (slope + the previous value) to get forecasted future values. The "what if" part comes from adjusting the y-intercept value.

| makeresults count=100000 
| streamstats count as count 
| eval earliest_time=now() 
| eval time=case(count=100000,relative_time(earliest_time,"+100000d"),count=1,earliest_time) 
| makecontinuous time span=1d 
| eval timeAsANumber=time 
| eval _time=time 
| eval time_human=strftime(time, "%Y-%m-%d %H:%M:%S") 
| fields + time 
| append 
    [| search index=xxx host=xxx sourcetype="Perfmon:FreeDiskSpace" (counter="% Free Space" OR counter="Free Megabytes") instance=G:
    | eval FreeMBytes=if(counter="Free Megabytes", Value, null())
    | eval storage_used_percent=if(counter="% Free Space", 100-Value, null())
    | eval FreeGB=FreeMBytes/1024
    | eval Free_percent=100-storage_used_percent
    | timechart span=1d min(FreeGB) AS FreeGB min(Free_percent) AS Free_percent
    | eval Used_percent=100-Free_percent
    | eval Total_Cap=100*(FreeGB/Free_percent)]
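The MLTK step isn't shown above: after training a model on the historical series, applying it to the combined results fills the empty future buckets. A hedged sketch, assuming the Machine Learning Toolkit's fit and apply commands and a hypothetical model name:

Train once on the historical daily series, using the numeric time field as the predictor:

index=xxx host=xxx sourcetype="Perfmon:FreeDiskSpace" (counter="% Free Space" OR counter="Free Megabytes") instance=G:
| eval FreeMBytes=if(counter="Free Megabytes", Value, null())
| eval FreeGB=FreeMBytes/1024
| timechart span=1d min(FreeGB) AS FreeGB
| eval time=_time
| fit LinearRegression FreeGB from time into Forecasting_host1

Then, at the end of the combined search above:

| eval time=coalesce(time,_time)
| apply Forecasting_host1

The predicted(FreeGB) field carries the forecast into the empty future buckets, and shifting that line's y-intercept is what drives the "what if" scenarios.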

lakshman239
Influencer

Nope. You would need to build one based on your needs.


ansif
Motivator

Do you have any suggestions for capacity planning? I am using both the Unix and Windows add-ons to get memory, CPU, and disk utilization.
