Splunk Search

Create a new field and populate it with normalized data?

dbcase
Motivator

Hi,

I have a field in my existing data set called mso. Within that field are company names

Example

CompanyA
CompanyAA1
CompanyABB1
CoB
CoBBBB1
CompC
CompC2

What I would like to do is create a new field called normalized_mso with normalized company names. so that it looks like the below

CompanyA
CoB
CompC

How could I accomplish this?

0 Karma

DalJeanis
Legend

This works against your examples.

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| eval mso2=mso
| rex mode=sed field=mso2 "s/(^[A-Z]+?[a-z]+)([A-Z]*?[a-z]*?[a-zA-Z]*?$)/\1/g"

You may need to post any examples you have of companies that have numbers, special characters, or all lower case for testing. This works against a few more scenarios...

 | makeresults
 | eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron,blahblahTheblah,Protection1Green,327weirdness Inc,Iceberg Titanic Company"
 | makemv mso delim="," 
 | mvexpand mso
 | eval mso2=mso
 | rex mode=sed field=mso2 "s/^([^a-z]*)([a-z0-9]+)([^a-z0-9\n]*)([a-zA-Z0-9 ]*)$/\1\2/g"
0 Karma

niketn
Legend

@dbcase, if you can use the upper case character in second string as the pattern to end field extraction, you can use the rex command | rex field=<YourFieldNameGoesHere> "(?<mso_new>[A-Z]{1}[^A-Z]+)".
Following is a run anywhere search for you to try out:

| makeresults
| eval mso="ExxonUSA,ExxonEurope,ExxonAPAC,ShellUSA,ShellSouthAmerica,ShellNZ,ChevronCalifornia,ChevronIceland,Chevron"
| makemv mso delim="," 
| mvexpand mso
| rex field=mso "(?<mso_new>[A-Z]{1}[^A-Z]+)"

PS: The regular expression means start with Capital letter and extract all characters before find next Upper Case character. You can test the same on regex101.com as well.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

dbcase
Motivator

Hi Niketnilay,

You are a busy man!!! Sadly the real data doesn't necessarily have an uppercase character.

0 Karma

niketn
Legend

@dbcase 😄 LOL I am on leave today, travelling to hometown 🙂

If there is no pattern your option would be to use lookup with wildcard. Refer to this answer https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

woodcock
Esteemed Legend

You need to give us the rules for equation. For example:
All names start with and end with a single capital letter with all-lowercase between.
Everything after the 2nd (ending) capital letter is trash.

0 Karma

dbcase
Motivator

Well the names are kinda all over the map, thats why I made the examples variable.

Right now I'm matching using "like" and while it mostly works, what I really need is a new field in the existing data set.

Here is a better example

Company names (made up)

ExxonUSA
ExxonEurope
ExxonAPAC
ShellUSA
ShellSouthAmerica
ShellNZ
ChevronCalifornia
ChevronIceland
Chevron

I'd like to end up with
Exxon
Shell
Chevron

0 Karma

woodcock
Esteemed Legend

I get the general idea but I need some rules of structure against which to code.

0 Karma
Get Updates on the Splunk Community!

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...