
How to get a record count of a file under some path

prerana_jain
Explorer

How can I get the record count of a particular file under a path that contains more than one file?

Example: on host=xxxx, /home/xxxx/ contains many files. I need the record count of each file under /home/xxxx.

wmyersas
Builder

UPDATE based on comment:

If all you want to know is how many events came from each file, then just do:

index=ndx sourcetype=srctp
| where source like("%/%")
| stats count by source

This is identical to my second-provided suggestion in the initial answer.


Expanding somewhat on @kamlesh_vaghela's answer, you could do either of the following:

index=ndx sourcetype=srctp 
| where source like("%/%")
| makemv delim="/" source
| eval filename=mvindex(source,-1)
| stats count by filename

This filters for data that arrived with a *nix file path in its source field (all data has a source), converts the source field into a multivalue field (split on the path separator), grabs the filename, which is always in the last position of the multivalue field (i.e. mvindex(source,-1)), and then counts by the newly minted filename.

The problem with this approach is that if the same filename exists in multiple paths, those files get grouped together. That may or may not be a bad thing.
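
To see the grouping effect outside Splunk, here is a minimal Python sketch (not SPL) that mirrors the split-and-take-last logic of makemv plus mvindex(source,-1). The source values are hypothetical and chosen so that a.log exists in two different directories.

```python
from collections import Counter

# Hypothetical source values, one per indexed event (not taken from the thread).
sources = [
    "/home/xxxx/a.log",
    "/home/xxxx/a.log",
    "/home/xxxx/b.log",
    "/var/log/a.log",
]

def filename_of(source):
    # Same idea as makemv delim="/" followed by mvindex(source,-1):
    # split on "/" and keep the last piece.
    return source.split("/")[-1]

# Rough equivalent of: stats count by filename
by_filename = Counter(filename_of(s) for s in sources)

# Rough equivalent of: stats count by source
by_source = Counter(sources)

print(by_filename)
print(by_source)
```

In this sketch the two a.log files collapse into one filename bucket, while counting by the full source value keeps them apart.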

You can also just do the simpler:

index=ndx sourcetype=srctp 
| where source like("%/%")
| stats count by source

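This counts each file by its full path, so identical filenames in different directories stay separate. Note that the same path on multiple endpoints (for example, /var/log/messages) still shares one source value, so add host to the by clause to tell those apart.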


Of course, feel free to add more count by fields as needed, e.g.:

index=ndx sourcetype=srctp 
| where source like("%/%")
| makemv delim="/" source
| eval filename=mvindex(source,-1)
| stats count by filename host

Or

index=ndx sourcetype=srctp 
| where source like("%/%")
| stats count by source host
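
As a rough illustration of what adding host to the by clause does, this Python sketch (not SPL) counts events per (source, host) pair. The host names and events are hypothetical.

```python
from collections import Counter

# Hypothetical (host, source) pairs, one per event.
events = [
    ("web01", "/var/log/messages"),
    ("web01", "/var/log/messages"),
    ("web02", "/var/log/messages"),
]

# Rough equivalent of: stats count by source host
counts = Counter((source, host) for host, source in events)

for (source, host), n in sorted(counts.items()):
    print(source, host, n)
```

The same path on two hosts ends up in two separate rows, which counting by source alone would merge.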

If I misunderstood, and you're actually just looking to count the file paths you're collecting from (regardless of how many files are in any given path), you can do the following (hat tip for the start of the regex):

index=ndx sourcetype=srctp 
| where source like("%/%")
| rex field=source "(?<filepath>.+)\/[^\/]+$"
| stats count by filepath

This rexes out the filepath (everything up to the last forward slash) and then counts by filepath.

A slightly more expanded version of the previous, adding filename extraction to filepath extraction:

index=ndx sourcetype=srctp 
| where source like("%/%")
| rex field=source "(?<filepath>.+)\/(?<filename>[^\/]+)$"
| stats count by filepath filename
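
The regex itself can be tried outside Splunk. The following Python sketch (not SPL) uses the same pattern; the only change is that Python spells named groups as (?P<name>...) rather than (?<name>...). The helper name split_source and the sample path are hypothetical.

```python
import re

# Same pattern as the rex above, written with Python's named-group syntax.
# The "/" needs no escaping in a Python regex.
PATTERN = re.compile(r"(?P<filepath>.+)/(?P<filename>[^/]+)$")

def split_source(source):
    """Return (filepath, filename), or None if the pattern does not match."""
    m = PATTERN.search(source)
    if m is None:
        return None
    return m.group("filepath"), m.group("filename")

print(split_source("/home/xxxx/app.log"))
print(split_source("/messages"))
```

Because .+ is greedy, filepath takes everything up to the last slash. One edge case in this sketch: a file sitting directly under / has nothing before its only slash, so .+ cannot match and the helper returns None.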

kamlesh_vaghela
SplunkTrust

@prerana_jain

Here I have assumed that the file path is available in a field named filepath. Change the field name to match your data before running the search.
Can you please try this?

YOUR_SEARCH
| rex field=filepath "\/home\/path\/(?<filename>.*)"
| stats count by filename

Example:

| makeresults count=5 
| eval filepath="/home/path/File",name=1 
| accum name 
| eval filepath=filepath+name+".txt"
| rex field=filepath  "\/home\/path\/(?<filename>.*)" | stats count by filename

prerana_jain
Explorer

@kamalesh, I am getting the count of filenames, but I need the count of records present in each file.


wmyersas
Builder

@prerana_jain - see my answer

All you need to do to find out the number of events (records) in each file is:

index=ndx sourcetype=srctp
| where source like("%/%")
| stats count by source