Splunk field-extraction usage performance

Venkat_16
Contributor

Below is my sample log format

%timestamp% com_java_package1.subpackage someMessage exceptionMessage
%timestamp% someText com_java_package2.v1.subpackage exceptionMessage
%timestamp% com_java_package3_v2.subpackage exceptionMessage
%timestamp% someText someOtherText someVeryBigText com_java_package4.subpackage someMessage exceptionMessage

Usage 1:

index=someIndex sourcetype=someSourceType (packageName=com_java_package1 OR packageName=com_java_package2)

Usage 2:

index=someIndex sourcetype=someSourceType ("com_java_package1" OR "com_java_package2")

The logs are in very bad shape, so I cannot write a generic regex to extract the packageName field.
It would take a lot of effort to cover every combination needed to extract it.

My question: do I really need a field extraction for packageName?
Is there any performance benefit to one of the usages above over the other?


s2_splunk
Splunk Employee

You don't need field extractions if all you do is an event search for these log entries. As soon as you want to run any statistical function other than counting events, you'll need them.
Also, the raw text search will find the strings you are looking for anywhere in your log messages, so you may get events you don't want.
If you are sure that the package name is the only string of characters with the format abc.def.ghi...., you should be able to write a fairly simple regex to pull it out generically. It depends on what the rest of your log events look like.
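As a rough way to try the "simple regex" idea before putting it into Splunk, here is a minimal Python sketch that runs one candidate pattern against the four sample lines from the question. The pattern, the function name extract_package_name, and the assumption that the package name is the first whitespace-delimited token that starts with a letter or underscore and contains a dot are all mine, not something confirmed by the real logs. Note that Python writes named groups as (?P<name>...), whereas Splunk's rex command uses (?<name>...).

```python
import re

# Assumption: the package name is the first whitespace-delimited token that
# starts with a letter or underscore and contains at least one dot.
# A dotted token inside the exception message would also match, so only the
# first hit per event is taken.
PACKAGE_RE = re.compile(
    r"(?<!\S)(?P<packageName>[A-Za-z_]\w*(?:\.\w+)+)(?!\S)"
)

# The four sample lines from the question, placeholders left as posted.
SAMPLE_EVENTS = [
    "%timestamp% com_java_package1.subpackage someMessage exceptionMessage",
    "%timestamp% someText com_java_package2.v1.subpackage exceptionMessage",
    "%timestamp% com_java_package3_v2.subpackage exceptionMessage",
    "%timestamp% someText someOtherText someVeryBigText "
    "com_java_package4.subpackage someMessage exceptionMessage",
]


def extract_package_name(event):
    """Return the first dotted token in the event, or None if there is none."""
    match = PACKAGE_RE.search(event)
    return match.group("packageName") if match else None


if __name__ == "__main__":
    for event in SAMPLE_EVENTS:
        print(extract_package_name(event))
```

Usage 1 in the question filters on the top-level segment only (for example com_java_package1). If that is the value you want in the field, taking the part before the first dot of the extracted token would give it. Whether this pattern holds up depends on what else appears in the real events, so it is worth testing against a larger sample first.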
