Splunk Search

How to normalize field values that differ only slightly? Regex? Match? Replace?

UMDTERPS
Communicator

Hi! 👨‍💼


I am a little stuck on how to normalize the "Operating System" data I have. Currently, we have a field called "Operating System", and our data looks something like this:

 

Operating System

Windows 10        Enterprise
Windows 10 
Windows 10 enterprise 
Windows 10 
windows 10 
windows 10 20H2
Windows 10 V2004
windows 10 2004
Windows Server 
windows server
RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8
Windows 2012r2
Windows Server 2012 R2
Windows Server 2012

 

Because the data isn't normalized, a stats count shows 170+ distinct operating system values. What is the most efficient way to normalize the data without writing 170+ "replace" or "match" statements?

For example, how would I make the following just "RHEL 8": 

RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8

Thanks!


ITWhisperer
SplunkTrust

What do you mean by "normalize"? Do you want all the Windows * operating systems to be simply Windows, and all the rest to be *nix for example?


UMDTERPS
Communicator

For example, how would I make the following just "RHEL 8": 

RHEL8
RHEL 8
rhel8
rhel 8
rhel 8.6
Linux Server rel 8

For example, how would I make the following just "Windows 10":

Windows 10 Enterprise
Windows 10
Windows 10 enterprise
Windows 10
windows 10
windows 10 20H2
Windows 10 V2004
windows 10 2004

For example, how would I make the following just "Windows Server 2012":

Windows 2012r2
Windows Server 2012 R2
Windows Server 2012

Etc...

Thanks


ITWhisperer
SplunkTrust
| eval os=case(match(os,"(?i)rhel\s*8[\d\.]*"),"RHEL 8",match(os,"Linux Server rel 8"),"RHEL 8",match(os,"(?i)\s*windows\s10.*"),"Windows 10",match(os,"Windows (|Server )2012.*"),"Windows Server 2012",1==1,os)

and so on
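
One way to try rules like these before pointing them at real data is to feed a few sample values through makeresults. This is only a sketch: the sample values come from the question, the output field name "normalized" is made up for illustration, and the patterns are slightly loosened variants of the ones above (case-insensitive throughout, with "Server" optional before 2012). case() returns the value for the first condition that is true, so more specific patterns should come before broader ones, and the final true() branch keeps any value that no rule matched.

```
| makeresults
| eval os=split("RHEL8,rhel 8.6,Linux Server rel 8,windows 10 20H2,Windows 2012r2,Windows Server", ",")
| mvexpand os
| eval normalized=case(
    match(os,"(?i)rhel\s*8"),"RHEL 8",
    match(os,"(?i)linux server rel 8"),"RHEL 8",
    match(os,"(?i)windows\s+10"),"Windows 10",
    match(os,"(?i)windows\s+(server\s+)?2012"),"Windows Server 2012",
    true(),os)
| stats count by normalized
```

Values that still appear unchanged in the result (here, the bare "Windows Server") are the ones that need another rule, which makes it easier to work down a list of 170+ variants family by family.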

UMDTERPS
Communicator

It doesn't seem to be working. For example, the field we have for OS is called "Operating System", and there is one entry that is "RHEL 8". I ran the following SPL:

 

|eval "os"=case(match("os","RHEL 8"),"RHEL 8")
|fields ip "system" os

 


The search runs with no errors, but it returns nothing for "os":

IP                         system    os
192.168.1.1      ABC 

"os" is blank. Any ideas?

Thanks!


ITWhisperer
SplunkTrust

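If your field name has spaces in it, you need to enclose it in single quotes, not double quotes. In eval, double quotes make a string literal, so match("os","RHEL 8") tests the literal text "os" rather than the value of a field.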
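
As a rough sketch of the corrected search, assuming the source field really is named "Operating System" and the result should go into a new field called os (the ip and system field names are taken from the post above):

```
| eval os=case(
    match('Operating System',"(?i)rhel\s*8"),"RHEL 8",
    true(),'Operating System')
| fields ip system os
```

The single quotes tell eval to read the value of the field, and the true() branch passes the original value through when no rule matches. Without a default branch, case() returns null for unmatched events, which is why os came back blank.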

UMDTERPS
Communicator

Ahh Yes! Thanks!  It works now!  Karma Granted!
