Getting Data In

How can I determine the total amount of data received and indexed by UFs?

Glasses2
Communicator

Hi 
I am working on a query to determine the hourly (or daily) totals of all indexed data (in GBs) coming from UFs.

In our deployment, UFs send directly to the Indexer Cluster.  

The issue I am having with the following query is that the volume is not realistic, and I am probably misunderstanding the _internal metrics log. Perhaps the kb field is not the correct field to sum as data throughput?

index=_internal source=*metrics.log group=tcpin_connections fwdType=uf 
| eval GB = kb/(1024*1024) 
| stats sum(GB) as GB

Any advice appreciated.
Thank you


richgalloway
SplunkTrust

The Metrics log is a sample of events, not an audit log.

---
If this reply helps you, Karma would be appreciated.

isoutamo
SplunkTrust

Here are some comments about metrics.log:

By default, metrics.log reports the top 10 results for each type.

See more at https://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Aboutmetricslog

As you can see, metrics.log doesn't contain all metrics; it contains only samples of them. As @richgalloway already said, you must use license_usage or calculate the totals from _raw.

Glasses2
Communicator

Thank you for your reply. Do you have a method of querying to get an answer to my question?
I am not finding the key logs containing UF data throughput or ingest information.


richgalloway
SplunkTrust

The most accurate method would be to add up the size of _raw for each UF (host), but that would have terrible performance.
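For reference, a minimal sketch of that _raw approach (len(_raw) returns the character length of the raw event, which approximates its size in bytes; the index and time range here are placeholders):

index=* earliest=-1d@d latest=now
| eval raw_bytes = len(_raw)
| stats sum(raw_bytes) as bytes by host
| eval GB = round(bytes/1024/1024/1024, 2)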

Try using the license_usage log.  The h field is the host (UF) sending the data.

index=_internal source=*license_usage.log
| stats sum(b) as bytes by h
| eval KB = bytes/1024
| rename h as UF
| table UF KB

Glasses2
Communicator

Thank you for the reply.  I also looked at this log, but it requires curating an exact list of the UFs, because I have some pollution, e.g. h = HFs, SC4S, etc.  The license_usage log may be the best route if I can put together a lookup of just UFs.
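A sketch of that lookup approach, assuming a hypothetical lookup file uf_hosts.csv with a host column listing only the UFs (the subsearch turns the lookup into an h=... filter on the main search):

index=_internal source=*license_usage.log
    [| inputlookup uf_hosts.csv | rename host as h | fields h]
| stats sum(b) as bytes by h
| eval GB = round(bytes/1024/1024/1024, 2)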


Glasses2
Communicator

index=_internal source=*license_usage.log earliest=-1d@d latest=now [search index=_internal source=*metrics.log fwdType=uf earliest=-1d@d latest=now | rename hostname as h | fields h] 
| stats sum(b) as total_usage_bytes by h 
| eval total_usage_gb = round(total_usage_bytes/1024/1024/1024, 2) 
| fields - total_usage_bytes
| addcoltotals label="Total" labelfield="h" total_usage_gb

I think this is what I wanted, unless someone thinks it's inaccurate?
Please advise.
TY

richgalloway
SplunkTrust

If you're confident the sampling done for metrics.log will catch all of your UFs then the search looks good.


Glasses2
Communicator

The numbers are not exact: DS Forwarder Management shows 1275, dc(h) from metrics gives 1287, and the total stats count from the final query gives 1166, so it's not accurate.  I will need to create a lookup of UFs.
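One way to seed such a lookup from the metrics samples (uf_hosts.csv is a hypothetical filename; since metrics.log is sampled, a longer time range, or the deployment server's client list, may be needed to catch every UF):

index=_internal source=*metrics.log group=tcpin_connections fwdType=uf earliest=-7d
| stats count by hostname
| rename hostname as host
| fields host
| outputlookup uf_hosts.csv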

Thank you for your support.
