Splunk Search

What is the best way to merge a set of values that refer to the same thing in a field?

Stevelim
Communicator

For example, in a field "customer", I have the following events and values:
Event 1: abc
Event 2: abc pte ltd

I want to merge their values so that both say "abc". Is it possible to do this programmatically instead of using a manual replace command for every occurrence?

1 Solution

woodcock
Esteemed Legend

Based on your clarification, if we assume the first word is key, you can do it like this:

... | rex field=customer "(?<CustomerAsFirstWord>[\S]+)" | ...
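
As a hedged sketch of how that extraction might feed a count, assuming the first word really does identify the customer (it would also merge any unrelated customers that happen to share a first word). The `^` anchor is an addition here that pins the match to the start of the value:

```
... | rex field=customer "^(?<CustomerAsFirstWord>\S+)"
    | stats count BY CustomerAsFirstWord
```

With the example values Abc, Abc Pte Ltd and Abc Technologies, this should collapse to a single row for Abc.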

But this is surely not good enough on its own, so the next thing you can do is create a lookup file like this:

rawCustomer,normalizedCustomer
Abc,Abc
Abc Pte Ltd,Abc
Abc Technologies,Abc

Then you do this:

... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer AS customer
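
One possible variation, sketched here under the assumption that values missing from the lookup should pass through unchanged: write the lookup result to its own field and fall back to the original with `coalesce`. It reuses the `mylookup`, `rawCustomer` and `normalizedCustomer` names from above and assumes `mylookup` is a configured lookup definition.

```
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
    | eval customer=coalesce(normalizedCustomer, customer)
    | fields - normalizedCustomer
    | stats count BY customer
```

To find values that still need a row in the lookup, rather than crawling the whole list by hand, something like this may help:

```
... | lookup mylookup rawCustomer AS customer OUTPUT normalizedCustomer
    | where isnull(normalizedCustomer)
    | stats count BY customer
```

Note that CSV lookup matching is case-sensitive by default, so "abc" would not match a row keyed "Abc" unless the lookup definition sets case_sensitive_match = false.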

Stevelim
Communicator

Thank you so much! I didn't know Splunk was able to generate an output and append another value!


woodcock
Esteemed Legend

It would really help if you majorly clarified your question with full details including exactly what is in what fields. I am assuming that field customer is a multivalued field and that you would like to see how many events go with each customer; you can do that like this:

... | stats count BY customer

Stevelim
Communicator

Hi there, here's the additional information:

Example Data:
Event 1: customer=Abc
Event 2: customer=Abc Pte Ltd
Event 3: customer=Abc Technologies

I would like to normalize all of them to Abc, as they are actually the same entity. Different departments keyed in the same entity under different names.

If I do a stats count by customer, it will probably treat all 3 events as 3 different entities. My current solution is to replace "Abc Pte Ltd" with "Abc" in customer. I'm just wondering whether there is a more general solution that does this automatically, so that I don't have to crawl through the entire list (say, via a stats values(customer) command) and slowly add replace commands one by one.

Hope this clears up my situation.
