Solved: Summary Index or not?

tradecraft1914 · ‎06-18-2013

All,

We have Windows and Linux BIND DNS servers logging into one index in Splunk. Because of the way Windows logs domain names in DNS requests, we are doing a search-time extraction. If we want to search both types of DNS logs for any lookups of www.splunk.com, we run a search like this:

index=dnslogs win_query="www.splunk.com" OR bind_query="www.splunk.com"

It works but is very inefficient because of the search-time extraction on win_query. What I would like to do is create a new index and populate it daily with the unique values from each of those fields, deduplicating between the fields of course. I have been researching and am not certain that a summary index is what I want. We basically want to search months' worth of DNS logs to see whether a domain shows up or not. At that point we don't need the actual log event, just whether it exists. Is it possible to take unique values from two different fields and populate a new index with those values? Other suggestions?

chris · ‎06-18-2013

Hi tradecraft,

I would set up the different DNS logs to have different sourcetypes (info, more info).
Then, based on those sourcetypes, I would configure automatic search-time field extractions using props.conf and transforms.conf (info).
The fields that you create should have the same names.
The Common Information Model (CIM) suggests names (CIM info, field reference) that can and should be used across all sourcetypes to make the searches easy.
To make your solution reusable, you should organize the configurations (props and transforms) in so-called technology add-ons (Info about TAs).

This is a lot of work, but it is worth it: you will have a solid Splunk installation that can easily fulfill future requirements.

A lot of the work has already been done.

Have a look at the bind app: download here
The Active Directory app also contains an add-on for Windows DNS: download here (unpack the tar.gz, search for TA-DNSServer-NT5 or TA-DNSServer-NT6, and put that folder into your apps directory). I'm not sure whether the field names of the AD app are CIM compliant, but if the fields are extracted they can be aliased.
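
As a rough sketch of the aliasing step: assuming the two sourcetypes are called win_dns and bind_dns (hypothetical names) and that the existing extractions already produce the win_query and bind_query fields from the question, something like the following in props.conf would expose both under one shared name. The name query is used here on the assumption that it matches the CIM field for DNS lookups; check the CIM field reference before relying on it.

```
# props.conf (sourcetype names are assumptions, adjust to your own)
[win_dns]
# Expose the Windows field under the shared name
FIELDALIAS-cim_query = win_query AS query

[bind_dns]
# Expose the BIND field under the same shared name
FIELDALIAS-cim_query = bind_query AS query
```

With that in place, a single term such as query="www.splunk.com" should cover both log types. Note that aliases are applied at search time, so they simplify the search but do not remove the cost of the underlying extraction.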

I would only start using a summary index if the amount of events you have to process in one search is too big (if a single search takes too long to complete).
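
If a summary index does turn out to be needed, a scheduled search along these lines is one way to populate it. This is only a sketch: the stanza name, the schedule, and the target index dns_summary are made up for illustration, the target index has to exist already, and the search assumes the win_query and bind_query fields from the question.

```
# savedsearches.conf (stanza name, schedule and index names are assumptions)
[DNS unique queries - daily]
enableSched = 1
# Run shortly after midnight over the previous full day
cron_schedule = 15 0 * * *
dispatch.earliest_time = -1d@d
dispatch.latest_time = @d
# Merge the two fields into one, then keep one row per distinct domain
search = index=dnslogs | eval query=coalesce(win_query, bind_query) | stats count by query
# Write the results to the summary index
action.summary_index = 1
action.summary_index._name = dns_summary
```

This gives one row per domain per day, so the same domain will still appear once for every day it was looked up. Searching the summary for a domain then only has to scan those rows rather than the raw DNS events.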

Hope this gets you started

chris · ‎06-19-2013

Ok, glad you found a solution to your issue

tradecraft1914 · ‎06-18-2013

Thanks Chris. I should have clarified a couple of things. We do have different sourcetypes for the Windows and Linux DNS logs. We wanted to stay away from search-time extractions because they are very expensive resource-wise: the same search takes 15 minutes longer with the search-time extraction.

I just found a way to use SEDCMD to fix the odd Microsoft format for the domain being queried prior to indexing. Once that is done we can search the index without the need for any search-time extractions. Hopefully 🙂 Thanks again.
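
For anyone landing here later, a minimal sketch of what such a SEDCMD might look like. The thread does not show the actual setting used, so this is an assumption: it presumes the Windows DNS debug log writes names with length markers, e.g. (3)www(6)splunk(3)com(0), and that the sourcetype is called win_dns (hypothetical).

```
# props.conf on the indexer or heavy forwarder (sourcetype name is an assumption)
[win_dns]
# Replace each length marker such as (3) or (0) with a dot before indexing
SEDCMD-dns_labels = s/\(\d+\)/./g
```

With that substitution, (3)www(6)splunk(3)com(0) would be indexed as .www.splunk.com. including the leading and trailing dots. Two caveats: SEDCMD rewrites the whole raw event, so any other number in parentheses in the same event would also become a dot, and it only affects data indexed after the change, not events already in the index.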
