Below is my sample log format:
%timestamp% com_java_package1.subpackage someMessage exceptionMessage
%timestamp% someText com_java_package2.v1.subpackage exceptionMessage
%timestamp% com_java_package3_v2.subpackage exceptionMessage
%timestamp% someText someOtherText someVeryBigText com_java_package4.subpackage someMessage exceptionMessage
Usage 1:
index=someIndex sourcetype=someSourceType (packageName=com_java_package1 OR packageName=com_java_package2)
Usage 2:
index=someIndex sourcetype=someSourceType ("com_java_package1" OR "com_java_package2")
The logs are in such bad shape that I cannot write a generic regex to extract the packageName field.
Covering every combination needed to extract packageName would take a lot of effort.
Now my question is: do I really need a field extraction for packageName?
Is there any performance benefit to one of the usages above over the other?
You don't need field extractions if all you do is an event search for these log entries. As soon as you want to run any kind of statistical function other than counting events, you'll need the field.
Also, the raw text search will find the strings you are looking for anywhere in your log messages, so you may get events you don't want.
If you are sure that the package name is the only string of characters with the format abc.def.ghi...., you should be able to write a pretty simple regex to pull it out generically. It depends on what the rest of your log events look like.
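As a rough illustration of that idea, here is a minimal Python sketch that prototypes such a pattern against the four sample lines from the question. It assumes the package name is the only dot-separated run of identifiers in each event, and that you only want the first segment (for example com_java_package1) as packageName. The names PACKAGE_RE and extract_package_name are made up for this example. If the pattern holds up on your real data, the same regex could probably be tried in a rex command, where the named group is written (?<packageName>...) instead of Python's (?P<packageName>...).

```python
import re

# Assumption: a package name is a run of identifiers joined by dots
# (abc.def.ghi...), and nothing else in the event has that shape.
# Each segment must start with a letter or underscore, so numeric
# fragments such as "00.123" in a timestamp are not matched.
PACKAGE_RE = re.compile(
    r"(?<![\w.])"                        # do not start in the middle of a token
    r"(?P<packageName>[A-Za-z_]\w*)"     # first segment, e.g. com_java_package1
    r"(?:\.[A-Za-z_]\w*)+"               # one or more ".segment" parts
)


def extract_package_name(event):
    """Return the first segment of the dotted name, or None if none is found."""
    match = PACKAGE_RE.search(event)
    return match.group("packageName") if match else None


# The sample lines from the question, used as-is.
samples = [
    "%timestamp% com_java_package1.subpackage someMessage exceptionMessage",
    "%timestamp% someText com_java_package2.v1.subpackage exceptionMessage",
    "%timestamp% com_java_package3_v2.subpackage exceptionMessage",
    "%timestamp% someText someOtherText someVeryBigText "
    "com_java_package4.subpackage someMessage exceptionMessage",
]

for line in samples:
    print(extract_package_name(line))
```

Anything else that looks like a dotted identifier, such as a file name or a host name, would also match, so check the pattern against a larger sample of real events before relying on it.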