I have two files, each containing a simple list of filenames. I'd like to compare one file to the other and remove anything from the first file that also appears in the second. For example:
File 1
foo.exe
bar.exe
car.exe
dar.exe
File 2
car.exe
dar.exe
smar.exe
My desired output would be:
foo.exe
bar.exe
as I am trying to use File 2 as a sort of whitelist and filter File 1 through it. Is this even possible? Any help would be greatly appreciated, thanks!
I'm assuming each line in a file is one event in Splunk.
source="file1*" OR source="file2*" | eventstats count by _raw | where count = 1 AND match(source,"file1")
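For readers who want to see the logic outside Splunk, here is a minimal Python sketch of what the query does, assuming each line is one event. The function name `filter_through_whitelist` is hypothetical and is not part of Splunk; it only mirrors the `eventstats count by _raw` plus `where count = 1` idea.

```python
from collections import Counter


def filter_through_whitelist(file1_lines, file2_lines):
    """Keep lines of file 1 whose text occurs exactly once across both files."""
    # Equivalent in spirit to "eventstats count by _raw": count each raw line
    # across the combined events of both sources.
    counts = Counter(file1_lines) + Counter(file2_lines)
    # Equivalent in spirit to "where count = 1 AND match(source, "file1")":
    # keep only the file 1 lines that were seen once in total.
    return [line for line in file1_lines if counts[line] == 1]


file1 = ["foo.exe", "bar.exe", "car.exe", "dar.exe"]
file2 = ["car.exe", "dar.exe", "smar.exe"]
print(filter_through_whitelist(file1, file2))  # ['foo.exe', 'bar.exe']
```

One consequence of counting this way: a line that appears twice within file 1 is also dropped, even if it is absent from file 2, because its count is greater than 1.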
OK, it turns out you were spot on with that query. I recreated the data sets, imported them again, and it worked first shot. It must have had something to do with the method of importing data. All I know is that I'm happy I can continue on now, so thank you very much Martin 😄
So you get results after the eventstats but nothing after the where?
There are only two things that can go wrong, and they're unrelated to the version. Either the count filters out everything, in which case the pre-where results have no events with count=1, or the match filters out everything, in which case the matching against the source file name doesn't work.
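The two failure modes can be checked separately. Below is a rough Python sketch of that idea, not Splunk code: the function `diagnose` and the `(source, raw)` tuple layout are assumptions made for illustration, and Python's `re` module stands in for Splunk's regex matcher.

```python
from collections import Counter
import re


def diagnose(events, source_pattern):
    """Return how many events survive each filter on its own.

    events: list of (source, raw) tuples (hypothetical layout).
    source_pattern: regular expression tested against the source field.
    """
    counts = Counter(raw for _, raw in events)
    # Filter 1: events whose raw text occurs exactly once overall.
    unique = [e for e in events if counts[e[1]] == 1]
    # Filter 2: events whose source matches the pattern.
    matched = [e for e in events if re.search(source_pattern, e[0])]
    return len(unique), len(matched)


events = [
    (r"c:\splunk data\difffile1.csv", "foo.exe"),
    (r"c:\splunk data\difffile1.csv", "car.exe"),
    (r"c:\splunk data\difffile2.csv", "car.exe"),
]
print(diagnose(events, "difffile1"))
```

If the first number is zero, the count is the problem; if the second is zero, the pattern given to the match is the problem.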
Ahh yes, I did have the double \'s in, but this morning when I was setting it up again I neglected to put those in, sorry.
I also tried moving the WHERE statement into the search statement, since Splunk was very insistent about combining the two, but it didn't make any difference.
One other thing I just realised: I am running an old version. I will update immediately and try it again, as this might be an issue as well. :<
Note that the second argument to match() is treated as a regular expression. Writing "\s" makes the matcher look for a whitespace character, while "\d" makes it look for a digit. You will have to escape the backslashes like so: "\\"
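The effect is easy to reproduce with any regex engine. Here is a small sketch using Python's `re` module as a stand-in (an assumption: Splunk uses its own regex matcher and its own string-literal escaping rules, so the exact number of backslashes needed in SPL may differ). The path is the one from this thread.

```python
import re

# The literal value of the source field, backslashes included.
source = r"c:\splunk data\difffile1.csv"

# Unescaped: "\s" is read as "whitespace" and "\d" as "digit",
# so the pattern no longer describes the literal path.
unescaped = r"c:\splunk data\difffile1.csv"

# Escaped: each backslash is doubled so it matches a literal backslash.
escaped = r"c:\\splunk data\\difffile1\.csv"

print(re.search(unescaped, source) is not None)    # False
print(re.search(escaped, source) is not None)      # True
print(re.search("difffile1", source) is not None)  # True
```

The last line is the shortcut of matching only on a fragment of the file name that contains no backslashes at all.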
The lazy would just write match(source, "difffile1") 🙂
The query does indeed produce something when I strip off the WHERE command. My query string is thus:
source="c:\splunk data\difffile1.csv*" OR source="c:\splunk data\difffile2.csv*" | eventstats count by _raw | where count = 1 AND match(source, "c:\splunk data\difffile1.csv")
Thanks again for working this through with me, I very much appreciate it!
Does the query without the where command produce outputs?
What's your modified query, and your source field contents?
Thanks for your help. I've replaced "fileX" with my file names and run the query, but it doesn't produce any output. PS: each line in the source files is its own event, you're correct.