Splunk Search

Log field extraction: when should I do it?

linu1988
Champion

Hello guys,
I have to extract around 30 to 40 fields from logs and monitor them. They are well formatted and can be extracted easily with regex. My only concern is where I should do it.

At index time or at search time? I am asking with performance in mind.

Sample Data

[Date][PreciseTime][Time][Pid][Tid][SrcFile][Function][TransactionID][AgentName][Resource][User][Group][Realm][Domain][Directory][Policy][AgentType][Rule][ErrorValue][ReturnValue][ErrorString][IPAddr][IPPort][Result][Returns][CallDetail][Data][Message] 
[====][===========][====][===][===][=======][========][=============][=========][========][====][=====][=====][======][=========][][][][==========][===========][===========][======][======][======][=======][==========][====][=======]

where "===" stands for the data. A field may or may not have a value.

| rex field=_raw "\[(?<DDD>[\d\/]+)\]\[(?<DDD1>[\d:\.]+)\]\[(?<DDD2>[\d\:]+)\]\[(?<DDD3>[\d]+)\]\[(?<DDD4>[\d]+)\]\[(?<DDD5>[A-Za-z_\.\:\d]+)\]"|table DDD,DDD1,DDD2,DDD3,DDD4,DDD5

I am also planning the regex for them. Is it okay to set this up at index time? And what can I put inside the [] instead of [A-Za-z_.:\d], where I may miss some characters?

Any kind of suggestion is welcome.

Thank you

1 Solution

martin_mueller
SplunkTrust

In almost every case you'll want search-time extractions: simple ones as EXTRACT-foo in props.conf, and more complex ones as REPORT-bar with a corresponding transforms.conf stanza [bar]. Only use indexed fields if you have a good reason to, such as a value that also appears frequently outside the field, which kills search-time filtering performance.

As for your character classes, consider using [^]]* for your data fields to match everything up to the closing square bracket.
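
As a rough sketch of what that might look like: the sourcetype name policy_trace and the stanza names below are made up for illustration, and only the first five fields from the sample header are covered. The remaining fields would follow the same \[(?<Name>[^\]]*)\] pattern. It assumes every event starts with the first bracket, so treat it as a starting point rather than a tested configuration.

```ini
# props.conf (search-time settings; "policy_trace" is a placeholder sourcetype)
[policy_trace]
# Simple inline extraction: named capture groups become field names.
EXTRACT-timestamps = ^\[(?<Date>[^\]]*)\]\[(?<PreciseTime>[^\]]*)\]\[(?<Time>[^\]]*)\]
# More complex extraction: point at a stanza in transforms.conf.
REPORT-process_ids = policy_trace_process_ids

# transforms.conf
[policy_trace_process_ids]
# Skip the first three bracketed fields, then capture the fourth and fifth.
# With named groups in REGEX, no FORMAT line is needed for a search-time extraction.
REGEX = ^(?:\[[^\]]*\]){3}\[(?<Pid>[^\]]*)\]\[(?<Tid>[^\]]*)\]
```

[^\]]* is the same class as [^]]*, only with the bracket escaped for readability. Because it allows zero characters, the pattern still matches when a field is empty ([]), so the later fields stay aligned.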

linu1988
Champion

I need to compute stats on the fields extracted from the logs. As you suggested, I will go with search-time extraction, which seems flexible, and if the searches turn out to be used frequently I will schedule them. Thank you for your help.


martin_mueller
SplunkTrust

Index-time field extractions will put some load on your indexer, yeah, but the bigger disadvantage I see is that you lose the flexibility of Splunk's schema-on-the-fly search-time extractions.

As for dashboards, those launch regular searches so it doesn't matter much if a search is on a dashboard or not. If you have a high number of users frequently loading the same dashboard with identical searches you're often better off just scheduling the searches behind the dashboard.
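
For illustration, scheduling the search behind a panel could look roughly like this. The stanza name, index, sourcetype, time range, and cron interval are all placeholders, and the stats clause simply uses two field names from the sample header.

```ini
# savedsearches.conf (all names and values here are illustrative placeholders)
[policy_trace_results_by_agent]
search = index=main sourcetype=policy_trace | stats count by AgentName, Result
dispatch.earliest_time = -24h
dispatch.latest_time = now
# Run on a schedule so viewers can reuse the results instead of each dispatching a search.
enableSched = 1
cron_schedule = */15 * * * *
```

A Simple XML panel can then refer to the report by name with <search ref="policy_trace_results_by_agent"></search>, which for a scheduled report should load the latest scheduled results rather than run a new search for every viewer.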

What's best for your case depends on your case though.


linu1988
Champion

Thanks, Martin. So if I do it at index time, will the load on the indexer be higher? And if the extraction happens at search time on every use, is that a good approach for dashboards? I have no intention of summarizing the data, as it would just be a reference for 1-3 days.
