Splunk Search

Automatically route incoming data to multiple indexes

duenguyen
Explorer

Can the indexer be smart enough to route events to a dedicated index based on a data value?

Here is my data:
"2013-12-02 20:30:30","a@a1.com", . . .
"2013-12-02 20:30:30","b@b2.com", blah blah
"2013-12-02 20:30:30","a@a1.com", blah blah
"2013-12-02 20:30:30","b@b2.com", foo bar
"2013-12-02 20:30:30","c@c.com", . . .

Right now the data is fed over a TCP port into the main index. From there I set up multiple summary indexes: if email (the second column) equals a@a1.com, the event goes to index=a_a1; if email=b@b2.com, it goes to index=b_b2; if email=c@c.com, it goes to index=c_c; and so on.

I was wondering whether there is any way to set this up at index time so that events go straight to the designated index, rather than landing in the main index first and then being copied to the designated index by summary indexing.


martin_mueller
SplunkTrust

Sure, you can achieve that using props.conf and transforms.conf like this:

props.conf

[your_sourcetype]
...
TRANSFORMS-assign_index = assign_a_index, assign_b_index, assign_c_index

transforms.conf

[assign_a_index]
SOURCE_KEY = email
REGEX = ^a@a1\.com$
DEST_KEY = _MetaData:Index
FORMAT = a_a1

[assign_b_index]
SOURCE_KEY = email
REGEX = ^b@b2\.com$
DEST_KEY = _MetaData:Index
FORMAT = b_b2

[assign_c_index]
SOURCE_KEY = email
REGEX = ^c@c\.com$
DEST_KEY = _MetaData:Index
FORMAT = c_c

Note that this assumes the field email is available at index time, i.e. it is not extracted using rex or any other search-time command. If it is not, you can remove the SOURCE_KEY entry and change the regex to match on the raw data instead.
See http://docs.splunk.com/Documentation/Splunk/6.0.2/Admin/transformsconf for reference.
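
As a rough sketch of the raw-data variant described above, assuming every event looks like the sample lines in the question (a quoted timestamp, a comma, then the quoted email), the first stanza could look something like this. SOURCE_KEY is left out, so the regex runs against _raw, which is the default. The exact pattern is an assumption about the data layout and should be checked against real events.

```ini
# transforms.conf (sketch; the pattern assumes the sample event layout)
[assign_a_index]
# No SOURCE_KEY: the regex is applied to the raw event text (_raw)
REGEX = ^"[^"]*","a@a1\.com"
DEST_KEY = _MetaData:Index
FORMAT = a_a1
```

The stanzas for b@b2.com and c@c.com would follow the same shape with their own email and index name.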

martin_mueller
SplunkTrust

You'll have thousands of different parties in your Splunk? :notbad:

You could define the transforms stanza like this:

[assign_dynamic_index]
SOURCE_KEY = email
REGEX = ^(.+)@(.+)$
DEST_KEY = _MetaData:Index
FORMAT = $1_$2

That will take an email value of foo@bar.com and send it to index foo_bar.com - make sure that index exists.
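
A possible variant, if you would rather keep the a_a1 / c_c naming from the question and match on the raw event: the sketch below captures the part before the @ and the domain up to its first dot. It assumes the sample event layout from the question, and it assumes the captured pieces contain only characters that are allowed in index names on your Splunk version (an address such as john.doe@x.com would not qualify, because of the dot). The indexes.conf stanza is only an illustration of defining one target index ahead of time; the paths shown follow the usual $SPLUNK_DB layout and may differ in your environment.

```ini
# transforms.conf (sketch; pattern assumes the sample event layout)
[assign_dynamic_index]
# $1 = local part of the email, $2 = domain up to the first dot
REGEX = ^"[^"]*","([^@"]+)@([^."]+)\.
DEST_KEY = _MetaData:Index
FORMAT = $1_$2

# indexes.conf (sketch; one stanza per target index, defined in advance)
[a_a1]
homePath   = $SPLUNK_DB/a_a1/db
coldPath   = $SPLUNK_DB/a_a1/colddb
thawedPath = $SPLUNK_DB/a_a1/thaweddb
```

With this pattern, a@a1.com would be routed to a_a1 and c@c.com to c_c. The transform only sets the index name on the event; it does not create the index, so each one still has to be defined.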


duenguyen
Explorer

Given that new users could come into the system, it would be very tedious to create an entry in props.conf and transforms.conf for each user, not to mention that transforms.conf could become bloated with thousands of user emails. Is there a way (with a regex?) to make this dynamic?


martin_mueller
SplunkTrust

As for #1: if no stanza matches, the index is not overwritten, so the event goes wherever it would have gone without the transforms.conf entries. You can of course also define a match-all stanza as a default.
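
A minimal sketch of such a match-all default, assuming a hypothetical catch-all index named unmatched_email (that name is not from this thread and the index would have to exist). Transforms in one TRANSFORMS- class are applied in the order listed and each match overwrites the index, so the default goes first and the specific stanzas after it.

```ini
# props.conf (sketch)
[your_sourcetype]
# Default first; any later stanza that matches overrides it
TRANSFORMS-assign_index = assign_default_index, assign_a_index, assign_b_index, assign_c_index

# transforms.conf (sketch)
[assign_default_index]
# Matches any non-empty event
REGEX = .
DEST_KEY = _MetaData:Index
# Hypothetical index name; must be defined in indexes.conf
FORMAT = unmatched_email
```

If falling back to the index configured on the input (main in the question) is good enough, no default stanza is needed.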


duenguyen
Explorer

Thank you very much for your response. Yes, email is always present. Two other quick questions:

1) If the email is not one of the known emails (the else case), will the event automatically fall back to the main index?
2) When I played around with summary indexes to achieve a map-reduce concept, I found them slower than the main index; see my question (http://answers.splunk.com/answers/130457/why-smaller-index-run-much-slower-than-csv-larger). Any idea why?
