Splunk Lookups - Data Enrichment

morningwood
Explorer

Unfortunately, our proxy data does not have user information. However, I do have access to AV data that maps client IP to user information.

Via the "Lookup Definitions" link in the Splunk Manager I can set up the max and min offset for my "enrichment data". As I understand it, these settings would be used if my enrichment data is in the future. Unfortunately, my enrichment data is usually one day behind. This is causing issues with the correct enrichment data being applied to the events.

Config below:

[av_lookup]
filename = av_lookup.csv
time_field = savreportcheckin
lookup_table = av_lookup ip_address AS c_ip OUTPUTNEW clientuser computer savreportcheckin

Enrichment data below:

clientuser,ip_address,savreportcheckin,computer
u000000,10.0.0.0,2010-07-26 22:24:00,WP103702A740532
z000000,10.0.0.0,2010-07-27 22:23:00,WP103702A740532

If I search on this event data:

2010-07-26 22:55:09 3 10.0.0.0 200 TCP_RESCAN_HIT 2597 851 GET http www.newyorklife.com 80 - - 206.210.18.92 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" Financial%20Services - 127.0.0.1 -

2010-07-26 22:55:09 1 10.0.0.0 200 TCP_HIT 7268 866 GET http www.newyorklife.com 80 - - 206.210.18.92 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" Financial%20Services - 127.0.0.1 -

The z000000 user ID is returned. This is the incorrect user ID: the event occurred on 07-26, not 07-27, and the correct user (u000000) updated their AV signatures approximately 30 minutes before this event happened.

Is there a setting within lookup definitions to match "enrichment" data only to events that occurred within a given number of seconds before or after the enrichment timestamp?

Thanks for the help,

Scott


gkanapathy
Splunk Employee

I believe that you can specify negative values for max_offset_secs and min_offset_secs to restrict times to the past.

However, I'm not sure that this matters too much. I think the documentation of these settings is poor because its use of "ahead" and "behind" is ambiguous. I think the actual meaning (and default behavior) of these settings will work just fine for you. In other words, if the timestamp on the IP/user mapping is the time when it becomes effective, and the mapping should be used until a more recent (newer) mapping appears (which would be the normal case), then the defaults should work. I guess all you might need to do is set min_offset_secs to some number less than zero to account for timestamp discrepancies.

I am about to file a bug on the ambiguity of the docs on this point.
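To make the "effective until superseded" reading above concrete, here is a minimal Python sketch of that matching rule applied to the two sample rows from the question. It only illustrates the intended semantics and is not Splunk's implementation; the names `build_index`, `lookup_user`, and the `max_age` parameter are hypothetical and do not correspond to Splunk settings.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# (savreportcheckin, clientuser) pairs for ip_address 10.0.0.0,
# copied from the sample av_lookup.csv rows in the question.
ROWS = [
    ("2010-07-26 22:24:00", "u000000"),
    ("2010-07-27 22:23:00", "z000000"),
]


def build_index(rows):
    """Parse the timestamps and sort the mappings by check-in time."""
    return sorted((datetime.strptime(ts, FMT), user) for ts, user in rows)


def lookup_user(index, event_time, max_age=None):
    """Return the user from the newest mapping at or before event_time.

    max_age (hypothetical) optionally rejects mappings that are older
    than the given timedelta relative to the event.
    """
    event_dt = datetime.strptime(event_time, FMT)
    times = [t for t, _ in index]
    pos = bisect_right(times, event_dt)  # count of mappings not after the event
    if pos == 0:
        return None  # every mapping is newer than the event
    checkin, user = index[pos - 1]
    if max_age is not None and event_dt - checkin > max_age:
        return None  # newest earlier mapping is too stale
    return user


index = build_index(ROWS)
print(lookup_user(index, "2010-07-26 22:55:09"))
```

Under this rule the 22:55:09 event on 07-26 resolves to u000000, because the z000000 row only becomes effective on 07-27. Whether Splunk's time-based lookup behaves exactly this way with default offsets is the open question in this thread.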


morningwood
Explorer

gkanapathy,

I changed my settings to:

[av_lookup]
filename = av_lookup.csv
max_offset_secs = 86400
min_offset_secs = -86400
time_field = savreportcheckin

and I am unable to perform any lookups at all. I thought it might have been a time_format issue, so I changed my time_format to:

time_format = %Y-%m-%d %H:%M:%S

and it still is broken.

When I removed the min and max settings, the lookups started working again. Any thoughts on a setting that I might be missing or have wrong?
