Hello,
I have a pattern in one file, and I need to check whether it occurs in another file. The two files look like this:
file1:
aaa bbb ccc STRING I NEED 1 ddd some random text
aaa bbb ccc STRING I NEED 2 ddd some random text
aaa bbb ccc STRING I NEED 3 ddd some random text
file2:
www xxx PATTERN FROM FILE 1 yyy zzz
www xxx PATTERN FROM FILE 1 yyy zzz
I tried something like this, but it doesn't return anything:
source="file2" [search source="file1" "aaa bbb ccc" | rex "aaa bbb ccc (?<extraction_name>.*) ddd"]
I admit I don't fully understand the above query. Any help would be appreciated, thanks.
Just as in math you evaluate the parentheses first, in SPL the square brackets (the subsearch) run first. The subsearch in this case looks for the "STRING I NEED" in source file 1. The results of the subsearch then become part of the main search, as in "source=file2 STRING I NEED". Your query seems nearly there. Try adding a return statement to the subsearch, like this:
source="file2" [search source="file1" "aaa bbb ccc" | rex "aaa bbb ccc (?<extraction_name>.*) ddd" | return extraction_name]
Hi Vettori,
I need some additional information:
In file 1, is each event exactly the pattern to search for in file 2, or does each event contain a pattern to search for in file 2?
Either way, if each event in file 1 is exactly the pattern to search for, you could use a subsearch (the square brackets) like this:
search_on_File2 [ search search_on_File1 | rename _raw AS query | fields query ]
| ...
If instead the events look like the example you shared, "aaa bbb ccc STRING I NEED 1 ddd some random text", you have to extract the pattern from the file 1 search results (e.g. using a regex like ^(?<pattern>.*)\sSTRING I NEED), so try something like this:
search_on_File2 [ search search_on_File1 | rex "^(?<pattern>.*)\sSTRING I NEED" | rename pattern AS query | fields query ]
| ...
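Adapted to the sample data in the question, the second form might look like the sketch below. It is untested, and it assumes the text between "aaa bbb ccc" and "ddd" is what should be looked up in file 2. A subsearch field named query (or search) is handed to the outer search as a bare value, without the field name. The eval wraps each value in double quotes so that it should be matched as a phrase rather than as separate words, and dedup only avoids repeating the same value.

```
source="file2"
    [ search source="file1" "aaa bbb ccc"
      | rex "aaa bbb ccc (?<extraction_name>.*) ddd"
      | dedup extraction_name
      | eval query="\"" . extraction_name . "\""
      | fields query ]
```

Running the part inside the brackets on its own with | format appended shows the exact string that would be handed to the outer search.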
Bye.
Giuseppe
My case is the latter one. Seems solved now though. Thanks.
I tried the query with return, but it still did not return any results. However, using the format command seems to work.
So now my query is:
source="file2" [search source="file1" "aaa bbb ccc" | rex "aaa bbb ccc (?<extraction_name>.*) ddd" | stats count by extraction_name | fields + extraction_name | format | eval search =replace(search, "extraction_name=", "")]
I ran the subquery above on its own and saw that it returned:
((STRING I NEED 1) OR (STRING I NEED 2) OR (STRING I NEED 3))
which is exactly what I need to be searched in the outer query.
With the query in the question, my intention was that the subquery would return something along the lines of:
("STRING I NEED 1" "STRING I NEED 2" "STRING I NEED 3")
It turned out it didn't, and I don't understand why.
Thanks.
The OR keywords are significant. Without them, AND is implied, which won't work since no event has all of the strings you need. Good on you for discovering format.
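For what it's worth, here is my understanding of why return extraction_name found nothing. By default, return passes back only the first result, and it passes it as a field=value pair, so the outer search becomes something like source="file2" extraction_name="STRING I NEED 1". Events in file2 have no extraction_name field, so nothing matches. A possible alternative, untested here, is the $ form of return, which hands back the values alone and joins them with OR. The 100 is an arbitrary cap on the number of rows returned.

```
source="file2" [search source="file1" "aaa bbb ccc" | rex "aaa bbb ccc (?<extraction_name>.*) ddd" | dedup extraction_name | return 100 $extraction_name]
```

As far as I know, the values come back unquoted this way, so a multi-word value may be matched as separate words rather than as an exact phrase.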
If your problem is resolved, please accept an answer to help future readers.