Splunk Search

Does index time field extraction make sense for our situation?

shawnce
Engager

(currently using Splunk 4.3.3 build 128297)

I have poked around the docs covering index-time field extraction and some of the related Q&A, but I decided to ask directly, outlining our situation.

We have a logging facility that several of our future products will use. This facility receives JSON payloads containing key/value pairs like the following (names have been changed to protect the innocent).

{ 
  "key1" : "value1",
  "key2" : "value2",
  (could contain more pairs)
  "entries" : [
                {
                 "key3" : "value3a",
                 "key4" : "value4a",
                 "key5" : "value5",
                 (could contain more pairs)
                },
                {
                 "key3" : "value3b",
                 "key4" : "value4b",
                 "key6" : "value6",
                 (could contain more pairs)
                },
                (could contain more entries)
              ]
}

When the logging facility gets the above example JSON payload, it turns it into the following two log statements (one per element of "entries", each carrying the shared top-level pairs) and pushes those to Splunk via TCP.

timestamp key1="value1" key2="value2" key3="value3a" key4="value4a" key5="value5"
timestamp key1="value1" key2="value2" key3="value3b" key4="value4b" key6="value6"
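
The flattening described above can be sketched roughly as follows. This is only an illustration, not our actual facility: the function name `flatten_payload` is hypothetical, the literal string "timestamp" stands in for a real timestamp, and the sketch assumes values contain no double quotes that would need escaping.

```python
import json


def flatten_payload(payload, timestamp):
    """Return one key="value" log line per element of payload["entries"]."""
    # Top-level pairs (everything except "entries") are shared by every line.
    shared = {k: v for k, v in payload.items() if k != "entries"}
    lines = []
    for entry in payload.get("entries", []):
        # Shared pairs come first, then the pairs specific to this entry.
        pairs = {**shared, **entry}
        fields = " ".join(f'{key}="{value}"' for key, value in pairs.items())
        lines.append(f"{timestamp} {fields}")
    return lines


example = json.loads("""
{
  "key1": "value1",
  "key2": "value2",
  "entries": [
    {"key3": "value3a", "key4": "value4a", "key5": "value5"},
    {"key3": "value3b", "key4": "value4b", "key6": "value6"}
  ]
}
""")

for line in flatten_payload(example, "timestamp"):
    print(line)
```

Run against the example payload, this prints the two log statements shown above.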

We are defining "key1" to denote the product/component submitting the data. Its value would follow a reverse-DNS-style naming convention, with no real restrictions on the hierarchy other than ensuring it is likely to be unique across our family of products. For example: "mycompany.product.component" or "mycompany.mydivision.product.component.subcomponent".

The remaining key/value pairs are product-specific (i.e., they can be whatever the product wants). In other words, key1 will be used to namespace the rest of the key/value pairs.

We are considering extracting "key1" at index time. I believe doing so would speed up our ability to focus on the events coming from a particular product and/or component out in the field.

Search possibilities...

key1="mycompany.product.*" ...blah...
key1="mycompany.product.component"  ...blah...
key1="*.component.*"  ...blah...
etc.
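
To show which namespaces each pattern would select, here is a small sketch that approximates the wildcard matching with Python's `fnmatch` module. It is not Splunk's matcher; it assumes `*` matches any run of characters and that the comparison is case-insensitive. The helper name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern, value):
    """Approximate a wildcard match on a key1 value."""
    # Assumption: matching is case-insensitive, so lowercase both sides.
    return fnmatchcase(value.lower(), pattern.lower())


names = [
    "mycompany.product.component",
    "mycompany.mydivision.product.component.subcomponent",
]

for pattern in ("mycompany.product.*", "mycompany.product.component", "*.component.*"):
    selected = [name for name in names if matches(pattern, name)]
    print(pattern, "->", selected)
```

Under these assumptions, "*.component.*" selects only the name that has something after ".component.", so a value ending in ".component" would need a pattern such as "*.component" instead.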

Opinions?

0 Karma

sdaniels
Splunk Employee

Based on this post, it sounds like this may be one of the cases where it does make sense:

http://splunk-base.splunk.com/answers/842/do-search-time-fields-have-performance-considerations?page...

0 Karma

tfletcher_splun
Splunk Employee

Have you considered making key1 the sourcetype or the source? It is a safer solution and will still allow you to use metasearch and other fun indexed-field tricks.

I advise against using custom indexed fields, mainly because they change the structure of your index compared to your other indices, and the docs advise against them.
