Hello all,
I'm relatively new to Splunk and have been trying to correlate a series of events that occur in our logs.
2013-01-14 11:12:20,512 [71] 54110 INFO WebService RequestTypeA .......
2013-01-14 11:12:23,512 [71] 54110 INFO WebService UserLogin: Tester .......
2013-01-14 11:12:25,512 [71] 54110 INFO WebService Response .......
The log is receiving thousands of entries per minute, so the way I've been handling this manually is grepping through our log files for:
[71] 54110
Because not all of the information is ever present within a single log entry, I'd like to chain them together using the transaction command, keyed on a field that matches the following regex:
\[\d+\]\s+\d+
Which seems to work based on what I'm seeing at: http://regexpal.com/ . Is this a scenario where I would need to create a custom field at index time?
The search I'd like to run would look for a particular type of request, look up its corresponding unique identifier ([71] 54110) and group those events as a transaction, then take that result set and look up all events containing "UserLogin". The search might look something like:
index="main" "RequestTypeA" | transaction [CUSTOMKEY] | search UserLogin
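As a quick sanity check outside Splunk, the key regex can be tried against the sample entries from this question (a Python sketch, not Splunk syntax; the capture groups are only there to show which pieces would become the key):

```python
import re

# The three sample entries from the log above
lines = [
    "2013-01-14 11:12:20,512 [71] 54110 INFO WebService RequestTypeA .......",
    "2013-01-14 11:12:23,512 [71] 54110 INFO WebService UserLogin: Tester .......",
    "2013-01-14 11:12:25,512 [71] 54110 INFO WebService Response .......",
]

# Candidate key: the "[pid] threadid" pair that identifies a request
key_re = re.compile(r"\[(\d+)\]\s+(\d+)")

for line in lines:
    m = key_re.search(line)
    print(m.group(1), m.group(2))  # pid and threadid, "71" and "54110" here
```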
To do this, I've modified:
transforms.conf
[pididThreadid]
REGEX = \[(\d+)\]\s+(\d+)
FORMAT = pid_thread_key::$1$2
WRITE_META = true
REPEAT_MATCH = false
CLEAN_KEYS = 1
props.conf
[log4j]
TRANSFORMS-pididThreadid = pididThreadid
fields.conf
[pid_thread_key]
INDEXED=true
Am I on the right track here?
You can do this from the search line using the "rex" command as well. It might be easier to create two different fields because you might want to have a transaction for each pid and threadid.
Here is an example:
index="main" "RequestTypeA" | rex field=_raw ",\d+\s+\[(?<pid>\d+)\]\s+(?<threadid>\d+)\s+" | transaction threadid maxspan=15m
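To check that rex pattern outside Splunk, here is the same expression tried in Python against one of the sample lines (note that Python's re module spells named groups (?P<name>...), whereas Splunk accepts the (?<name>...) form used above):

```python
import re

# tgow's pattern, with the named groups translated to Python's syntax
pattern = re.compile(r",\d+\s+\[(?P<pid>\d+)\]\s+(?P<threadid>\d+)\s+")

sample = "2013-01-14 11:12:20,512 [71] 54110 INFO WebService RequestTypeA ......."
m = pattern.search(sample)
print(m.group("pid"), m.group("threadid"))  # -> 71 54110
```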
Do not make a custom field at index time. You will find that it is hard to manage and costly over the long run. Use the rex command as tgow suggests, or use a search-time field extraction. These choices will perform equally well and will not cause the heartache of index-time fields.
Next question: why do you need the transaction command?
Why not do this:
1. Create a field extraction for the unique identifier. Call it uid or something.
2. Search like this
index="main" "RequestTypeA" uid="[71] 54110" UserLogin
Of course, I don't know what you want to see in your final results.
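The effect of that uid filter can be mimicked in plain Python over the sample events (a sketch only; the uid helper name and the fourth, contrasting event are made up for illustration):

```python
import re

events = [
    "2013-01-14 11:12:20,512 [71] 54110 INFO WebService RequestTypeA .......",
    "2013-01-14 11:12:23,512 [71] 54110 INFO WebService UserLogin: Tester .......",
    "2013-01-14 11:12:25,512 [71] 54110 INFO WebService Response .......",
    # A hypothetical event from a different request, for contrast
    "2013-01-14 11:12:26,002 [72] 98001 INFO WebService UserLogin: Other .......",
]

uid_re = re.compile(r"\[\d+\]\s+\d+")

def uid(event):
    """Stand-in for the suggested search-time 'uid' field extraction."""
    m = uid_re.search(event)
    return m.group(0) if m else None

# Rough equivalent of: uid="[71] 54110" UserLogin
hits = [e for e in events if uid(e) == "[71] 54110" and "UserLogin" in e]
print(hits)  # only the "UserLogin: Tester" event
```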
Thanks! Works great.
You most definitely can. As lisa says in her answer, you should NEVER create index-time field extractions unless you're completely sure of what you're doing and why. To the commands in the search pipeline, there is no difference between search-time and index-time extracted fields - they're just fields by the time they arrive at the search pipeline.
Thanks!
What I'm trying to do is extract an array of values that match \[\d+\]\s+\d+ for my search criteria, then, for every value in that array, link the requests back together using the transaction command.
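For what it's worth, bucketing events by that key is essentially what the grouping step amounts to; a rough Python analogue (a sketch, using a \[\d+\]\s+\d+ key and the sample events from the question, not what transaction actually does internally):

```python
import re
from collections import defaultdict

events = [
    "2013-01-14 11:12:20,512 [71] 54110 INFO WebService RequestTypeA .......",
    "2013-01-14 11:12:23,512 [71] 54110 INFO WebService UserLogin: Tester .......",
    "2013-01-14 11:12:25,512 [71] 54110 INFO WebService Response .......",
]

key_re = re.compile(r"\[\d+\]\s+\d+")

# Bucket events by their "[pid] threadid" key -- a rough analogue of
# grouping with | transaction <field> on an extracted field
transactions = defaultdict(list)
for e in events:
    m = key_re.search(e)
    if m:
        transactions[m.group(0)].append(e)

print(len(transactions["[71] 54110"]))  # 3 events share the key
```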
I thought I read somewhere that you couldn't use a field extracted at search time to do this, and that's what sent me down the route I was on.
Using the search you outline, I can't seem to get the transaction piece to work.
Can you use fields extracted at search time with the transaction command?