Lookup tables: two source fields aliased to match one lookup field

borgy95
Path Finder

I have some very large lookup tables of known bad domains (4M+ entries).

The lookup has a field called 'kap_chk' that I want to match against two search-time extractions: one is kap_uri, the other kap_tld.
To get them both to match, I figured I'd alias both of them to kap_chk.

So my props.conf and transforms.conf would look like this:
props.conf

[access_combined_wcookie]
REPORT-kap_uri = kapersky_uri
REPORT-kap_tld = kapersky_tld
FIELDALIAS-kapersky_uri = kap_uri AS kap_chk
FIELDALIAS-kapersky_tld = kap_tld AS kap_chk

transforms.conf

[ss2url_lookup]
filename = ss2url_lookup.csv
case_sensitive_match = false

[ss1url_lookup]
filename = ss1url_lookup.csv
case_sensitive_match = false

[kapersky_uri]
SOURCE_KEY = uri
REGEX = (?:(http\:\/\/www\.|\w+:\/\/|www\.|)(?<kap_uri>.+))

[kapersky_tld]
SOURCE_KEY = uri
REGEX = (http|https|ftp)?(:\/\/)?(www\.)?(?<kap_tld>.+?)(\/|:).*

In your opinion, is this the correct way to go about achieving this?
In the end it should mean I can do the following:

sourcetype="access_combined_wcookie" AND kap_chk="*" | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk

Thanks.


woodcock
Esteemed Legend

You have not stated your goal, but I assume it is this: if kap_uri and kap_tld are the same, do the lookup only once. If so, this will work efficiently:

sourcetype="access_combined_wcookie" | eval kap_chk=if(kap_uri==kap_tld, kap_uri, kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk | table masktype, maskid, kap_chk
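
To see what the eval / makemv / mvexpand steps do before the lookup runs, here is a minimal sketch on a single synthetic event. It assumes a Splunk version that has the makeresults command, and the two field values are made up for illustration; they are not taken from the original data.

```
| makeresults
| eval kap_uri="example.com/path/page.html", kap_tld="example.com"
| eval kap_chk=if(kap_uri==kap_tld, kap_uri, kap_uri . ":::" . kap_tld)
| makemv delim=":::" kap_chk
| mvexpand kap_chk
| table kap_uri, kap_tld, kap_chk
```

This should return two rows, one with kap_chk set to the kap_uri value and one with it set to the kap_tld value, so each candidate gets its own pass through the lookup. When the two fields are equal, kap_chk stays single-valued and only one row comes out.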


borgy95
Path Finder

My aim is to do search-time extractions on the uri field, producing the custom fields kap_uri and kap_tld. I would then like to alias both fields to kap_chk so that either one can be matched against the lookup table field kap_chk.



woodcock
Esteemed Legend

The documentation says:

Note: Splunk Enterprise's field aliasing functionality does not currently support multivalue fields.

This probably also means that if your events have both fields, kap_chk will be set only once, to one of them (probably the first alias wins, and the second is thwarted because kap_chk already exists). But even if you could create a multivalue field with FIELDALIAS, it still would not work with the lookup; you would still have to pass the stream through mvexpand. If you need something "automatic-ish", I suggest creating a macro out of my solution and always calling the macro.
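
A rough sketch of what such a macro could look like in macros.conf. The macro name kap_chk_lookup is hypothetical (pick whatever suits you), and the definition simply wraps the search pipeline from the accepted solution:

```
# macros.conf
# "kap_chk_lookup" is a made-up name for illustration only.
[kap_chk_lookup]
definition = eval kap_chk=if(kap_uri==kap_tld, kap_uri, kap_uri . ":::" . kap_tld) | makemv delim=":::" kap_chk | mvexpand kap_chk | dedup kap_chk | lookup ss1url_lookup kap_chk OUTPUT masktype maskid kap_chk
```

Assuming that stanza is in place, the search would then shrink to something like:

```
sourcetype="access_combined_wcookie" | `kap_chk_lookup` | table masktype, maskid, kap_chk
```

With this approach the two FIELDALIAS lines in props.conf would no longer be needed, since kap_chk is built at search time by the macro.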
