Splunk Search

Can I get clarification on what my search string is doing?

Justin1224
Communicator

I have this search string, and I'm unsure of what some of it does. This is the search:

| inputlookup append=T malware_tracker | stats min(firstTime) as firstTime,dc(dest) by signature | eval _time=firstTime | `daysago(30)` | sort 100 - firstTime | `uitime(firstTime)` | fields firstTime,signature,dc(dest)

This is what I'm confused about:
-I think that what the inputlookup append=T malware_tracker is doing is ignoring indexes and using another input source. Is the other input source data from malware_tracker? Is malware_tracker even an input source? Or is the input source something/somewhere else entirely?
-Again, I think the stats min part is obtaining the minimum value of the field (firstTime). But what is that field? Did eval create it?
-What is the dc(dest) by signature part doing?
-What is the eval _time=firstTime part doing?
-Is the very last part just showing you specific fields in the final output?

sundareshr
Legend

I'll try to address each question below.

-I think that what the inputlookup append=T malware_tracker is doing is ignoring indexes and using another input source. Is the other input source data from malware_tracker? Is malware_tracker even an input source? Or is the input source something/somewhere else entirely?

In Splunk, data can be indexed (stored in indexes) and/or stored in lookup tables (http://docs.splunk.com/Documentation/Splunk/6.4.3/SearchReference/Lookup). Data stored in indexes (large volumes) is time-series data, whereas data stored in lookup tables (small volumes) is fairly static and used mostly for cross-reference. For example, you can have a list of IP-address-to-hostname mappings stored in a lookup table, versus all the data pertaining to that IP address, streamed from various devices, stored in indexes.

The inputlookup command is one way to view the data stored in lookup files. So yes, this search reads from a lookup rather than from an index. In your example, there must be a lookup called malware_tracker (it probably stores relatively static malware data). The append=t option means the data returned from the lookup is appended to the current set of results rather than replacing it. So, if you have anything before the | inputlookup command, the lookup data will be appended to that. If | inputlookup is the start of your search, there's nothing to append to, so append=t has no effect.
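
To see which fields the lookup actually contains, you can run inputlookup on its own. This is only a sketch: it assumes the malware_tracker lookup is visible from the app you are searching in, and head 5 is there just to keep the output short.

```spl
| inputlookup malware_tracker
| head 5
```

Because inputlookup is the first command here, adding append=t would not change the output.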

-Again, I think the stats min part is obtaining the minimum value of the field (firstTime). But what is that field? Did eval create it?

If | inputlookup is the start of your search, the lookup file must contain a field called firstTime. As you correctly deduced, min(firstTime) is the lowest value in that field. eval did not create it; the eval comes after stats in the search and only copies the value.

-What is the dc(dest) by signature part doing?

dc(dest) is short for distinct_count(dest), which is a count of the distinct values in the field called dest. The by clause groups the results of the stats command, so min(firstTime) and dc(dest) are each calculated separately for every value of the field called signature.
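
A small self-contained search can make the grouping easier to see. This is a sketch with made-up data: the signature names, host names, and timestamps are invented for illustration, and makeresults and streamstats are used only to generate five sample rows without touching your lookup.

```spl
| makeresults count=5
| streamstats count as n
| eval signature=case(n<=3, "Trojan.Example", true(), "Worm.Example")
| eval dest=case(n==1, "hostA", n==2, "hostB", n==3, "hostA", n==4, "hostC", n==5, "hostC")
| eval firstTime=1470000000 + n*3600
| stats min(firstTime) as firstTime, dc(dest) by signature
```

This should return two rows, one per signature. Trojan.Example has three input rows but only two distinct dest values (hostA and hostB), so its dc(dest) is 2 and its firstTime is 1470003600. Worm.Example has two rows that both point at hostC, so its dc(dest) is 1 and its firstTime is 1470014400.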

-What is the eval _time=firstTime part doing?

This assigns the value of the field firstTime to the field _time, which is Splunk's standard timestamp field.
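
One likely reason for copying firstTime into _time is that later steps filter on _time. The definition of the daysago macro is not shown in this thread, so the following is only a sketch under the assumption that it keeps rows whose _time falls within the last N days. The where clause below is a stand-in for that assumed behavior, not the macro's real definition.

```spl
| makeresults
| eval firstTime=relative_time(now(), "-45d")
| eval _time=firstTime
| where _time >= relative_time(now(), "-30d")
```

This returns no rows, because the sample firstTime is 45 days old. Changing "-45d" to "-10d" should return one row. You can check what the macro really does under Settings > Advanced search > Search macros.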

-Is the very last part just showing you specific fields in the final output?

That's right. The last part limits the final set of fields to firstTime, signature, and dc(dest).

Justin1224
Communicator

Awesome, thank you a lot, this really helps!

Justin1224
Communicator

Just one more question: how does it group the values by the field called signature, and what does that field do? Also, could you explain what fields are/do? I've read through the Splunk site on it, but I still don't really understand it very well.

Justin1224
Communicator

Could you just help me with this last question? How does it group the values by the field called signature and what does that field do? And I figured out what fields are.

sundareshr
Legend

signature must be a field in your lookup file. Grouping by signature means the search will show min(firstTime) and dc(dest) once for each distinct value in the signature field.

Justin1224
Communicator

Ok thank you a lot, you have no idea how much you helped me out
