I have this search string, and I'm unsure of what some of it does. This is the search:
| inputlookup append=T malware_tracker | stats min(firstTime) as firstTime,dc(dest) by signature | eval _time=firstTime | `daysago(30)` | sort 100 - firstTime | `uitime(firstTime)` | fields firstTime,signature,dc(dest)
This is what I'm confused about:
-I think that what the inputlookup append=T malware_tracker is doing is ignoring indexes and using another input source. Is the other input source data from malware_tracker? Is malware_tracker even an input source? Or is the input source something/somewhere else entirely?
-Again, I think the stats min part is obtaining the minimum value of the field (firstTime). But what is that field? Did eval create it?
-What is the dc(dest) by signature part doing?
-What is the eval _time=firstTime part doing?
-Is the very last part just showing you specific fields in the final output?
I'll try and address each question here below...
-I think that what the inputlookup append=T malware_tracker is doing is ignoring indexes and using another input source. Is the other input source data from malware_tracker? Is malware_tracker even an input source? Or is the input source something/somewhere else entirely?
In Splunk, data can be indexed (stored in indexes) and/or stored in lookup tables (http://docs.splunk.com/Documentation/Splunk/6.4.3/SearchReference/Lookup). Data stored in indexes (large volumes) is time-series data, whereas data stored in lookup tables (small volumes) is fairly static and is used mostly as a cross-reference. For example, you can have a list of IP address to hostname mappings stored in a lookup table, while all the data pertaining to those IP addresses, streamed from various devices, is stored in indexes. The inputlookup command is one way to view the data stored in lookup files. In your example, there must be a lookup called malware_tracker (it probably stores fairly static malware data). append=t means the data returned from the lookup is appended to the current set of results rather than replacing it. So, if you have anything before the | inputlookup command, the lookup data will be appended to that. If | inputlookup is the start of your search, then there's nothing to append to.
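To make that concrete, here is a minimal sketch of the two ways inputlookup might be used. It assumes malware_tracker is a lookup that exists on your search head; the index and sourcetype names in the second search are made up for illustration.

```
| inputlookup malware_tracker
```

```
index=hypothetical_index sourcetype=hypothetical_av
| inputlookup append=t malware_tracker
```

The first search returns only the rows of the lookup. The second runs the indexed search first and then adds the lookup rows to the end of those results.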
-Again, I think the stats min part is obtaining the minimum value of the field (firstTime). But what is that field? Did eval create it?
If | inputlookup is the start of your search, the lookup file must contain a field called firstTime, and min(firstTime), as you correctly deduced, is the lowest value in that field. eval did not create it: the eval comes later in the pipeline and only creates _time.
-What is the dc(dest) by signature part doing?
dc(dest) = distinct_count(dest), which is a count of the distinct values in the field called dest. The by clause groups the stats command, so min(firstTime) and dc(dest) are calculated separately for each value in the field called signature.
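If you want to see the grouping on its own, something like the following should run anywhere, since it builds four fake events with makeresults instead of reading your lookup. The signature names, hosts, and firstTime values are invented purely for illustration.

```
| makeresults count=4
| streamstats count as n
| eval signature=if(n<=3, "Trojan.A", "Worm.B")
| eval dest=case(n==1, "host1", n==2, "host2", n==3, "host1", n==4, "host3")
| eval firstTime=1000 + n*100
| stats min(firstTime) as firstTime, dc(dest) by signature
```

This should give one row per signature: Trojan.A with firstTime 1100 and dc(dest) 2 (host1 appears twice but is counted once), and Worm.B with firstTime 1400 and dc(dest) 1.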
-What is the eval _time=firstTime part doing?
This assigns the value of the field firstTime to a field called _time, which is the field Splunk uses as the timestamp of each result.
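A small sketch of what that looks like, assuming firstTime holds an epoch timestamp (the value below is arbitrary):

```
| makeresults
| eval firstTime=1475280000
| eval _time=firstTime
| table _time, firstTime
```

Splunk Web renders _time as a date, while firstTime stays a plain number. Copying firstTime into _time is presumably what lets the `daysago(30)` macro that follows filter on time, but that depends on how the macro is defined in your environment.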
-Is the very last part just showing you specific fields in the final output?
That's right, the last part limits the final set of fields to firstTime, signature, and dc(dest).
Awesome, thank you a lot, this really helps!
Just one more question: how does it group the values by the field called signature, and what does that field do? Also, could you explain what fields are/do? I've read through the Splunk site on it, but I still don't really understand it very well.
Could you just help me with this last question? How does it group the values by the field called signature and what does that field do? And I figured out what fields are.
signature must be a field in your lookup file. Grouping by signature means the search will show min(firstTime) and dc(dest) for each value in the signature field.
Ok thank you a lot, you have no idea how much you helped me out