For sourcetypes, I'm trying to come up with a regex that pulls out the 'root' name while accounting for the suffixes Splunk appends to sourcetypes it has attempted to learn (-too_small and -#). What is tripping me up is when a sourcetype already has a hyphen in it. This regex gets me pretty close, but if anyone can recommend a tweak I'd be all ears.
| rex field=sourcetype "(?<st>[^-]+)(?<d>-too_small$|-\d+$)?"
So, for example, if my sourcetype is 'my-sourcetype', the regex above returns 'my' instead of 'my-sourcetype'.
Use this instead:
| rex field=sourcetype "^(?<st>.+?)(?<d>-too_small|-\d+)?$"
I made these changes:
- <st> now uses . instead of [^-], so it can match hyphens inside the sourcetype name
- <st> is non-greedy, to avoid eating up <d>
- the $ anchor is moved out of the optional <d>, to force the non-greedy <st> to go all the way to the end
- the ^ anchor is added for prettiness/readability; there is no functional need for it, because <st> will start at the beginning anyway
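
If you want to sanity-check the behaviour outside Splunk, here is a minimal sketch using Python's re module. It assumes Python's engine treats these particular patterns the same way rex does. Note that Python spells named groups (?P<name>...), and that the helper split_sourcetype and the sample sourcetype names are made up for illustration.

```python
import re

# Python needs (?P<name>...) for named groups; the rest mirrors the rex patterns.
OLD = re.compile(r"(?P<st>[^-]+)(?P<d>-too_small$|-\d+$)?")
NEW = re.compile(r"^(?P<st>.+?)(?P<d>-too_small|-\d+)?$")


def split_sourcetype(pattern, sourcetype):
    """Return (st, d) from the first match; d is None when there is no suffix."""
    m = pattern.search(sourcetype)
    return (m.group("st"), m.group("d")) if m else (None, None)


# Hypothetical sourcetype names, chosen only to exercise each case.
samples = [
    "my-sourcetype",
    "my-sourcetype-too_small",
    "my-sourcetype-2",
    "access_combined-3",
]

results = {s: split_sourcetype(NEW, s) for s in samples}

for s in samples:
    print(f"{s!r}: old={split_sourcetype(OLD, s)} new={results[s]}")
```

The old pattern stops <st> at the first hyphen, while the new one only gives up a trailing -too_small or -digits to <d>. One caveat: a sourcetype whose real name ends in a hyphen followed by digits would also have that part split off into <d>, since the pattern cannot tell it apart from a learned suffix.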