I need to extract the same fields from two sets of logs with very different formats, and I also need to use the additional fields in one of them to shortlist the logs. Let me explain the case with an example:
LOG_TYPE_1 || field_1 || field_2 || field_3............. || field_9
LOG_TYPE_2 || field_a || field_1 || field_2 || field_b || field_c || field_3...........|| field_9
I have to filter the LOG_TYPE_2 events with | where field_a="type_a"
For both log types I then need Log_type, field_1, field_2, field_3 and field_9, after which the rest of the query is common to both.
For the case above, how can I write two rex/regex extractions and run this as a single Splunk search (or in the most efficient manner), rather than resorting to a slow, lengthy JOIN?
P.S. There are many other types of logs in the data. I only need the two above for this purpose.
The search below should work. It pulls in both data sets by putting an OR between the two strings to search for, then runs the two rex commands, each of which only applies to the event type it matches. Finally, it keeps all the events of the first log type plus the events of the second type where field6 = "direct".
index=* host=* "LOG_RESPONSE" OR "LOG_QUERY"
| rex ".*LOG_RESPONSE \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<result>.+).*"
| rex ".*QUERY \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<field5>.+) \|\| (?<field6>.+)\|\| (?<result>.+).*"
| search "LOG_RESPONSE" OR field6 = "direct"
If there is a nicer way to recognize the "LOG_RESPONSE" events than matching on that string, you can change the | search ... part accordingly.
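One possible way to do that is to capture the log type as a field with an extra rex and filter on the field instead of the raw string. This is only a sketch: log_type is a hypothetical field name that does not appear in the original search, and the extra rex assumes the type token is immediately followed by " ||" in the raw event. The other two rex commands are copied unchanged from the search above.

```spl
index=* host=* "LOG_RESPONSE" OR "LOG_QUERY"
| rex "(?<log_type>LOG_RESPONSE|LOG_QUERY) \|\|"
| rex ".*LOG_RESPONSE \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<result>.+).*"
| rex ".*QUERY \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<field5>.+) \|\| (?<field6>.+)\|\| (?<result>.+).*"
| search log_type="LOG_RESPONSE" OR field6="direct"
```

With log_type available as a field, the common part of the query can also group or split by it (for example with stats ... by log_type) if that turns out to be useful.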
Hi AshimaE,
if the two kinds of logs have different sourcetypes, you could extract the field separately for each sourcetype, using the same field name but a different regex for each.
If instead all the logs have the same sourcetype (not a good configuration!), you could extract two fields with different regexes and then merge them using the coalesce function, something like this:
| eval my_field=coalesce(my_field1,my_field2)
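Spelled out for a single field, the coalesce approach might look like the sketch below. The field names id_response and id_query are hypothetical, the LOG_RESPONSE and LOG_QUERY tokens are taken from the searches elsewhere in this thread, and the patterns assume that field values never contain a "|" character and that fields are separated by " || ".

```spl
index=* host=* "LOG_RESPONSE" OR "LOG_QUERY"
| rex "LOG_RESPONSE \|\| (?<id_response>[^|]+) \|\| "
| rex "LOG_QUERY \|\| (?<id_query>[^|]+) \|\| "
| eval id=coalesce(id_response, id_query)
```

Each rex only populates its own field on the event type it matches, so coalesce picks whichever of the two is non-null and the rest of the search can work with id alone.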
Bye.
Giuseppe
I believe it would help us to have some real data and the corresponding sample search (the one you would use to extract fields from a single log type).
The individual rex searches are as follows:
index=* host=* "LOG_RESPONSE" | rex ".*LOG_RESPONSE \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<result>.+).*"
index=* host=* "LOG_QUERY" | rex ".*QUERY \|\| (?<id>.+) \|\| (?<sequence>.+) \|\| (?<field1>.+) \|\| (?<field2>.+) \|\| (?<field3>.+) \|\| (?<field4>.+) \|\| (?<field5>.+) \|\| (?<field6>.+)\|\| (?<result>.+).*" | where field6 = "direct"
After that I did the rest of the processing, which is common to both, separately for each search.
Is it possible to combine the two rex commands above in a single query without using JOIN?
Agreed. I find it very hard to follow what exactly you are trying to achieve, and without something that looks like the actual data it's even harder to make sense of this.