Splunk Search

What is the best way to solve this key pair field value issue?

sjaworski
Communicator

I have a data set in which several key-value pairs in the same event share the same key name.


The data source is Websense proxy logs authenticated by Active Directory.

=1 category=227 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200
=1 category=227 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200
=7 category=1526 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDdomain,DC=local/Splunk Nerd src_host=192.168.100.200


By default, Splunk extracts only the first OU= and the first DC= pair; the remaining OU and DC pairs are not extracted. I tried | eval NextOU=mvindex(OU,1), but that does not seem to work. I wonder whether that is because the OU= pairs are all on the same line?

I have a working regex that extracts the username: | rex field=_raw "^.local\/(?P.?)src_host.*$" I could write a regex to extract each OU and DC pair. However, a particular user may be nested under more or fewer OUs.

Not sure what I am missing here. I looked into the extract command, but I think Splunk is working as expected.


lguinn2
Legend

Try this and see... carefully copy the exact spacing, etc.

In props.conf

[yoursourcetypehere]
REPORT-websense_ext=websense_extraction

In transforms.conf

[websense_extraction]
DELIMS = ", ", "="
MV_ADD = true
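
To illustrate roughly what DELIMS combined with MV_ADD does, the following Python sketch splits an event on commas and spaces, then splits each piece on the first "=" and keeps every value for a repeated key. It is only an approximation, not Splunk's actual parser: the function name delimiter_extract, its parameters, and the choice to skip pieces that contain no "=" are assumptions made for this example.

```python
import re
from collections import defaultdict


def delimiter_extract(event, pair_delims=", ", kv_delim="="):
    """Approximate delimiter-based key-value extraction (hypothetical helper)."""
    fields = defaultdict(list)
    # Every character in pair_delims acts as a separator between pairs.
    tokens = re.split("[" + re.escape(pair_delims) + "]+", event)
    for token in tokens:
        key, sep, value = token.partition(kv_delim)
        # Assumption: pieces without a key/value delimiter are skipped.
        if sep and key:
            # Keep every value for a repeated key, similar to MV_ADD = true.
            fields[key].append(value)
    return dict(fields)


# Shortened copy of the sample line from the question.
sample = (
    "user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,"
    "OU=WidgetCo,DC=subdomain,DC=TLDDomain,DC=local/Splunk Nerd "
    "src_host=1.2.3.4"
)

result = delimiter_extract(sample)
print(result["OU"])
print(result["DC"])
```

In this sketch all five OU values and all three DC values are kept, but values are also cut at spaces, so "Sub Department" is stored as "Sub" and "Department" is left over as a piece with no key.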

sjaworski
Communicator

Hi lguinn2,

The information you provided works, thank you. However, additional event fields are now created with garbage data. I think this may be because other key-value pairs in the same event are separated by spaces. I am reviewing the props.conf and transforms.conf documentation to better understand what is occurring.

What do you think? Can Splunk parse key-value pairs that are separated by spaces and by commas in the same event?

Here is an entire event.

Dec 3 16:48:31 101.1.1.42 vendor=Websense product=Security product_version=3.2.1 action=permitted severity=1 category=17 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDDomain,DC=local/Splunk Nerd src_host=1.2.3.4 src_port=60608 dst_host=context.bestbuy.com dst_ip=172.226.16.62 dst_port=80 bytes_out=2102 bytes_in=768 http_response=200 http_method=GET http_content_type=image/gif http_user_agent=Mozilla/5.0_(compatible;_MSIE_9.0;_Windows_NT_6.1;_WOW64;_Trident/5.0) http_proxy_status_code=200 reason=- disposition=1048 policy=Web Surfer role=8 duration=3 url=http://context.bestbuy.com/


lguinn2
Legend

Other things to try:

  1. Leave out the DELIMS attribute, but keep the MV_ADD. Don't change anything else in the answer above and see what happens.

  2. Replace the transforms.conf stanza with

    [websense_extraction_ou]
    REGEX = (OU)=(\S+?)(?:\s|,)
    FORMAT = $1::$2
    MV_ADD = true

    [websense_extraction_dc]
    REGEX = (DC)=(\S+?)(?:\s|,)
    FORMAT = $1::$2
    MV_ADD = true

and props.conf becomes

[yoursourcetypehere]
REPORT-websense_ext=websense_extraction_ou,websense_extraction_dc
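
The two regex stanzas can be tried outside Splunk with a short Python sketch. It applies the OU and DC patterns (written here with a non-capturing group, (?:...)) to a shortened copy of the sample event and collects every match, which is roughly what MV_ADD = true does. The helper name extract_all and the two comma-delimited variant patterns are assumptions for illustration and are not part of the original answer.

```python
import re

# Shortened copy of the sample event from this thread.
event = (
    "severity=1 category=17 user=LDAP://1.2.3.4 OU=Sub Department,OU=IS,"
    "OU=HQ,OU=Employee,OU=WidgetCo,DC=subdomain,DC=TLDDomain,"
    "DC=local/Splunk Nerd src_host=1.2.3.4 src_port=60608"
)


def extract_all(pattern, text):
    """Return every value captured by group 2, in order of appearance."""
    return [m.group(2) for m in re.finditer(pattern, text)]


# Patterns as used in the transforms.conf stanzas.
ou_values = extract_all(r"(OU)=(\S+?)(?:\s|,)", event)
dc_values = extract_all(r"(DC)=(\S+?)(?:\s|,)", event)

# Hypothetical variants: stop at a comma (or a slash for DC), not at whitespace.
ou_variant = extract_all(r"(OU)=([^,]+)", event)
dc_variant = extract_all(r"(DC)=([^,/]+)", event)

print(ou_values, dc_values)
print(ou_variant, dc_variant)
```

In this sketch, \S+? stops at the first whitespace, so the first OU comes back as "Sub" and the last DC as "local/Splunk". If the full "Sub Department" value is wanted, the comma-delimited variants may be worth testing in transforms.conf, though they have not been verified against a live Splunk instance.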

sjaworski
Communicator

This works perfectly. I may have been incorrect about the additional key-value pairs being created by the props.conf and transforms.conf modification. What appears to be happening is that Splunk is extracting additional event fields from very long URL strings that contain sometext=sometext. Depending on the results of my search, I sometimes see more or fewer of these odd event fields.

Thank you again for the help.


lguinn2
Legend

Yes, I have seen that problem with URL strings, too. There isn't much you can do about it, except just ignore the weird fields.
