Splunk Search

How to search a lookup table and return the matching term?

clv1clv1
Explorer

All-

I am new to Splunk and trying to figure out how to return a matched term from a CSV table with inputlookup. I just researched and found that inputlookup returns a Boolean response, making it impossible to return the matched term.

With that being said, is the any way to search a lookup table and return a matching term? I would imagine eval would do the trick, but I have not been successful in making it work.

I have tried the below:

index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | return 10000 $aterms] | eval matchedterm=if( [|inputlookup aterms.csv | return 10000 $aterms], $aterms)

Thanks for the help!

1 Solution

gokadroid
Motivator

Lets say your Lookup table is "inputLookup.csv" and it is as follows:

Field1,Field2
AA,11
AB,22
AC,33
BA,21
BB,22
BC,23

You can match terms from input lookup on any of the above fields Field1 or Field2 as follows (I am matching on Field1 and displaying Field2):

|inputlookup inputLookup.csv | search Field1=A* | fields Field2 

If you now want to use all the Field2 values which returned based on your match Field1=A* as subsearch then try:

yourBaseSearch [ |inputlookup inputLookup.csv | search Field1=A* | fields Field2 ]

Updating as per the comment conversation:

index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | return 10000 $aterms]
 | rename _raw as rawText
 | eval foo=[|inputlookup aterms.csv |stats values(aterms) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""]
 | eval foo=split(foo,",") 
 | mvexpand foo 
  | eval foo=lower(foo)     
  | eval rawText=lower(rawText)
 | where like(rawText,"%"+foo+"%")
 | table rawText, foo

View solution in original post

renjujacob88
Path Finder

index=proxysg sourcetype=proxysg_base search [| inputlookup lookup.csv | eval search="".term."" | rename search as term |fields term(lookup columnname) ]

0 Karma

gokadroid
Motivator

Lets say your Lookup table is "inputLookup.csv" and it is as follows:

Field1,Field2
AA,11
AB,22
AC,33
BA,21
BB,22
BC,23

You can match terms from input lookup on any of the above fields Field1 or Field2 as follows (I am matching on Field1 and displaying Field2):

|inputlookup inputLookup.csv | search Field1=A* | fields Field2 

If you now want to use all the Field2 values which returned based on your match Field1=A* as subsearch then try:

yourBaseSearch [ |inputlookup inputLookup.csv | search Field1=A* | fields Field2 ]

Updating as per the comment conversation:

index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | return 10000 $aterms]
 | rename _raw as rawText
 | eval foo=[|inputlookup aterms.csv |stats values(aterms) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""]
 | eval foo=split(foo,",") 
 | mvexpand foo 
  | eval foo=lower(foo)     
  | eval rawText=lower(rawText)
 | where like(rawText,"%"+foo+"%")
 | table rawText, foo

msichani
Explorer

Sorry to post comment on an old answer;
The solution is working fine but it uses a lot of resources when the number of rows in csv file and index size grow. In my case, I have a structured data file like this:

Field-ID,Field-SourceType,Field-Substring
1,sourcetype1,Some text goes here
2,sourcetype1,Another other text with WILDCARD * here
3,sourcetype2,This is a different text for different sourcetype
...

I run the above query (returning "Field-Substring" field) against some index data/events to count the number of occurrences of substrings. As there are huge number of events and quite large number of substrings in the csv file, it takes ages to return the result. Just wondering if there's another method to expedite searching unstructured log files for all the values in my lookup csv file and return the stats/count/etc.
These unstructured indexed data/logs are only categorised based on different sourcetypes and as you can see in the lookup csv file, each line shows the substring and it's corresponding sourcetype which needs to be searched.

0 Karma

ariez
New Member

Great solution! Not sure if I apply it correctly. I assume mvexpand utilize a lot of memory. I hit the error "command mvexpand results will be truncated due to excessive memory usage". We are not able to modify the 500MB memory threshold. Any other method beside using mvexpand?

0 Karma

nblhmus
New Member

Hello, I have got other fields like latitude and longitude. How do I show it on the Table as well?

0 Karma

clv1clv1
Explorer

my lookup table is a list of hundreds of strings that I am searching against logs. The search works perfect as: index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | return 10000 $aterms] and returns all of the events that have one of the matching strings in my lookup table.

The issue is that I need to display the input table string that was identified in the corresponding event.

0 Karma

gokadroid
Motivator

Are you trying to match the strings which get returned from inputlookup to the event strings of outersearch randomly or the outer search will have the text matched randomly but the matched string will happen to be a field value in your outer search. As the latter case might be a good case for lookup command then.

0 Karma

clv1clv1
Explorer

Yeah, so we are looking for keywords in the outer search, I just need the inner search to identify the keyword that hit. The input lookup only returns the boolean true, so I would need the inner search to identify the keyword in the event that was returned from the input lookup.

0 Karma

gokadroid
Motivator
0 Karma

clv1clv1
Explorer

That post is simply highlighting the matched term, I am looking to add the matched term into a new field.

0 Karma

gokadroid
Motivator

I put in some research and coming back with solution which does following:

  • Outer search matches your lookup strings in events
  • Rename _raw as rewText so not to lose it downstream
  • Take out all the strings in your lookup in a field called foo
  • Split foo as multivalue field
  • Expand the field foo and match it piecemeal in your rawText.
  • When matched table it out with rawText and foo.

Let me know if it worked so I can update in answer. Hope that is what you were looking for

index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | return 10000 $aterms]
| rename _raw as rawText
| eval foo=[|inputlookup aterms.csv |stats values(aterms) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""]
| eval foo=split(foo,",") 
| mvexpand foo 
 | eval foo=lower(foo)     
 | eval rawText=lower(rawText)
| where like(rawText,"%"+foo+"%")
| table rawText, foo

doodoodonk
Engager

*Splunk Version: 6.5.1 *

I needed to do this to see what strings were matching my powershell script blocks to weed out high false positive rates, but I keep getting a eval malformed error when I try to example above.

  1. sourcetype=WinEventLog:PowerShell [|inputlookup Powershell_Rules.csv | return 10000 $Powershell_Rules]
  2. | rename _raw as rawText
  3. | eval foo=[|inputlookup Powershell_Rules.csv |stats values(Powershell_Rules) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""]
  4. | eval foo=split(foo,",")
  5. | mvexpand foo
  6. | eval foo=lower(foo)
  7. | eval rawText=lower(rawText)
  8. | where like(rawText,"%"+foo+"%")
  9. | table rawText, foo

    Powershell_Rules CSV List

    PS_Commands Match
    Set-MasterBootRecord Set-MasterBootRecord
    Get-GPPPassword Get-GPPPassword
    Get-Keystrokes Get-Keystrokes
    Get-TimedScreenshot Get-TimedScreenshot
    Get-VaultCredential Get-VaultCredential
    Get-ServiceUnquoted Get-ServiceUnquoted
    Get-ServiceEXEPerms Get-ServiceEXEPerms
    Get-ServicePerms Get-ServicePerms
    Get-RegAlwaysInstallElevated Get-RegAlwaysInstallElevated
    Get-RegAutoLogon Get-RegAutoLogon

Error I get is: Error in 'eval' command: The expression is malformed. An unexpected character is reached at ')'.

0 Karma

kishorksudha
Explorer

I get a similiar error when trying the command

| eval foo=[|inputlookup Powershell_Rules.csv |stats values(Powershell_Rules) as query | eval query=mvjoin(query,",") | fields query | format "" "" "" "" "" ""]

Error in 'eval' command: Fields cannot be assigned a boolean result. Instead, try if([bool expr], [expr], [expr]).

0 Karma

clv1clv1
Explorer

It worked perfectly!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Thanks a million!!

sundareshr
Legend

See if this is what you're looking for...

index=proxysg sourcetype=proxysg_base [|inputlookup aterms.csv | rename aterms AS search | return search]
0 Karma
Get Updates on the Splunk Community!

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...