Hello!
I have two CSV files:
An example of the first file:
And an example of the second file:
I need to associate each hostname with its corresponding location. To do that, I need to search for each hostname if its begining match with one of the expressions in the second list. Example: ABCZER12, ABCSDF56 and ABCFHG76 will match with ABC, means that those hostnames correspond to the location Europe.
Some hostnames can not match any of the expressions in the second list (ex: OPQYGT65).
In the second file, expressions does not always have the same amount of letters.
In the two files, there is no field that are the same, so I think that I can't use lookups to resolve my problem here.
If anyone have an idea to resolve my problem, it would be great.
Thanks 🙂
Have you looked at the Wild card option with lookups? If you're able to convert data from 2nd csv to a lookup with following format, you should be able to use the wildcard feature.
hostprefix,country
ABC*,Europe
OPQR*,Brazil
XY*,USA
See an example here. https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html
Yes, thank you. It works.
okay, so you are going to have to reformat your second file in order to make this work. Here's one way.
[your search here ]
| fields hostname _time Otherstuff
| eval MatchField=mvappend(
"Left02=".substr(hostname,1,2),
"Left03=".substr(hostname,1,3),
"Left04=".substr(hostname,1,4),
"None99=((None))")
| mvexpand MatchField
| join max=0 MatchField
[| inputlookup prefixfile.csv
| fields prefix region
| eval MatchField="Left0".len(prefix)."=".prefix
| fields MatchField region
| append [makeresults | eval MatchField="None99=((None))" | eval region="((none))" ]
| dedup MatchField region
]
| sort 0 _time hostname MatchField
| dedup _time hostname
| fields hostname _time Otherstuff MatchField region
The above code assumes there won't be any events on the same hostname with the same _time timestamp. If your first file is only the host name, just remove _time and Otherstuff from the above code and it should work.
If you have any prefixes longer than 9 digits, you'll have to adjust the code somewhat. (Format the len(prefix) as 2-digits.)
edited to use sort 0 instead of sort in case there were more than 100 values to be sorted.
Obviously the code "Left04=".substr(hostname,1,4), needs to be repeated for any length up to the longest one in your second file.