I would like to do this permanently but let us try it on the command line first.
command:
source="C:\\..." | rex "^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<registrant>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)" | lookup registrant.csv registrant
input, apache custom log
1.1.1.1 - - [24/Apr/2013:16:10:08 +0200] "GET /someurl HTTP/1.1" 200 278356 "-" "some useragent" 123 "9999" 111.222.333.444
Lookup file:
registrant,fullname
9999,John Doe
7777,Jane Doe
UPDATE
So I tried the suggestion given.
In etc\system\lookups\registrants.csv I have
applicationid,username,email
0,Anonymous,
1234,John Doe,johndoe@xxx.com
4321,Jane Doe,janedoe@xxx.com
in \etc\system\local\props.conf I have
[registrants]
EXTRACT-fields=^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<applicationid>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)
LOOKUP-applicationid=registrants username
in \etc\system\local\transforms.conf I have
[registrants]
filename = registrants.csv
I am now getting error on all searches: The lookup table 'registrant' does not exist. It is referenced by configuration 'combined_wcookie'.
which I have no idea how to get rid of :(((
UPDATE
The documentation is not clear. It uses the same names for the source file fields and the csv.
I am now a bit further - one issue is that the csv is forced by splunk to be numeric values since the field I am testing happens to be a number. In the log file it is a string so I have to extract it without quotes.
I now extract ONE (1) record out of many. It is not encouraging...
PS: On windows command > file.txt 2>&1
works in normal circumstances
Does not matter where I put the redirect on my win7 box. I get a popup, a fast scrolling set of data and the popup closes - Why does splunk make me feel like I haven't been around computers since 1986!?
btool sends its output to stdout so you should be able to capture it. That said, your stderr>stdout redirect comes too late even if btool would be sending to stderr - it should be before you redirect everything into splunk.txt
thanks again. Just tried this. I cannot seem to capture the output no matter what I do $SPLUNK_HOME/bin/splunk cmd btool props list --debug > splunk.txt 2>&1 or whatever
For instance, this should prove helpful in your case:
$SPLUNK_HOME/bin/splunk cmd btool props list --debug
This will show you all props settings, and info on where they came from (that's what the --debug flag does).
Ah OK, my apologies - didn't mean to state the obvious, just figured you didn't know about the error 🙂
You might find the btool utility to be very useful when checking for stuff like this. Runs from the command line and shows you various combined configurations after Splunk has merged them from all possible places, so you don't have to look in 20 different props files for instance...you just need to check the btool output.
Yes, thanks - I realise that. the problem is finding the "somewhere" since it is not in any file so it must be in some cached setting.
You've defined your lookup as "registrants" but somewhere you're referencing it as "registrant" (lacking the s).
Thanks @chris - I'll check it after my vacation. The command line thing is very interesting seeing how hard it seems to get debug information - and the whole issue here is that the fields are only recognised if I do a | rex
"^my regex"
To refresh props & transforms search time configs you can use the following command in splunk "| extract reload=True" (or restart Splunk). If you do a search are the fields that you defined in EXTRACT recognized? This has to work for the automated lookup to work. To check if the lookup configuration is ok you can use the following splunk comman "| inputlookup registrants".
There are 2 things that i can see here the stanza in props.conf should match the sourcetype of your events you are searching it is not related to the name in props or the lookup table if this is on purpose then you are ok. The 2nd thind is the automatic lookup definition it should be LOOKUP-
If you put your registrant.csv file in $SPLUNK_HOME/etc/system/lookups your command should work and you should see "John Doe" in the fields sidebar in a field called fullname.
You will probably get an info message that looks similar to this: Assuming implicit lookup table with filename 'registrant.csv'.
The 'proper/complete' way to do this is described in Configure field lookups in the Knowledge Manager Manual. You will have to create a transforms.conf file and optionally a props.conf file for automated lookups. Those files can also be used to do the field extraction that you're doing in the command for you.
Both files can be created in $SPLUNK_HOME/etc/system/local/
To make things easier it makes sense to use sourcetypes in your searches and config files (all your inputs with the same log format should have the same sourcetype) your log looks almost like the predefined access_combined sourcetype but I do not think that the registrant field which is important for you is standard. So I suggest you set it to a custom value e.g. access_custom
transforms.conf
[registrant]
filename = registrant.csv
props.conf
[access_custom]
EXTRACT-fields=^(?<client_ip>[0-9\.]+) (?<user>[0-9\-]*) (?<profile>[0-9\-]*) (?<timestamp>\[[^\]]+\]) (?<url>\"[^\"]+\") (?<http_status>[0-9\-]+) (?<bytes>[0-9\-]+) (\"[^\"]+\") (?<user_agent>\"[^\"]+\") (?<processing_time>[0-9\-]+) (?<registrant>\"[^\"]+\") (?<forward_for>[0-9\.\-]+)
LOOKUP-registrant=registrant registrant
Please see update
That looks very promising I'll look tomorrow
If you haven't already dived into the whole of docs.splunk.com I highly recommend it. There's loads of info on subjects like this. For instance http://docs.splunk.com/Documentation/Splunk/5.0.2/Knowledge/Addfieldsfromexternaldatasources and http://docs.splunk.com/Documentation/Splunk/5.0.2/Tutorial/Usefieldlookups and http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Lookup
Short answers:
1. Lookup files go in a suitable lookups directory, for instance etc/apps/search/lookups
2. In the field "fullname".
I guess I've been doing this for too long to think of how things else would work - it's pretty straightforward as I see it (except for the idiotic aliasing syntax that's completely the opposite of how one would expect it should be). You need a lookup somewhere and you need to somehow tell Splunk that this lookup should be applied. That's not that complicated, is it? 🙂
As I've said previously the lookup will work as long as there are fields for it to work on, which is why you need a lookup command after an inline rex command.
Thank. I did read that page from end to end and that is how I uploaded the lookup. What I fail to grasp is the next step. Since my automatic extraction didn't work, the automatic use of lookup also did not work, now I am trying to test the thing with your suggested Rex and that finally started to work, but the lookup seems so unnecessary complicated - as does many things in splunk. Powerful but opaque. I will try again tomorrow. Thanks for the persistent assistance