We recently instrumented our OpenShift environment to index data into Splunk. I'm looking for the best approach for extracting fields when the sourcetype names coming out of OCP are dynamic. The data format will remain consistent per sourcetype; I just need to key on two sets of sourcetypes, batch and soap. Once I define the structure of each, we can apply the field extraction to any new sourcetypes created out of OpenShift (which may not happen often, but I still want to prepare). I figure I can key on kube:container: at the beginning and -soap-app (or -batch-app) at the end of each?
Soap app Sourcetypes:
kube:container:organization-soap-app
kube:container:customer-soap-app
kube:container:our-us-app-internal-soap-app
kube:container:internal-regulations-soap-app
Batch app Sourcetypes:
kube:container:our-us-app-internal-batch-app
kube:container:customer-batch-app
kube:container:organization-batch-app
kube:container:compliance-regulations-batch-app
kube:container:internal-regulations-batch-app
props.conf:
[mysourcetype]
TRANSFORMS-mytransforms = mytransforms
transforms.conf:
[mytransforms]
SOURCE_KEY = MetaData:Source
REGEX = kube:container:(\S+)-(soap|batch)-app
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::$2-$1
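This should take an event with a sourcetype of "mysourcetype" and a source of "kube:container:organization-soap-app" and give it a final sourcetype of "soap-organization". Note that the transform reads the source field (SOURCE_KEY = MetaData:Source), so it only works if the kube:container:... name is present in source. You could add more alternatives ("|KEYWORD") to the second capture group of the regex.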
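To sanity-check the regex before deploying it, here is a minimal Python sketch that mimics what the REGEX and FORMAT lines above are expected to do. It is only an approximation: Splunk uses PCRE rather than Python's re module (the two should agree on a pattern this simple), and the function name rewrite_sourcetype and the sample list are made up for illustration.

```python
import re

# Same pattern as the REGEX line in the transforms.conf stanza above.
PATTERN = re.compile(r"kube:container:(\S+)-(soap|batch)-app")


def rewrite_sourcetype(source):
    """Mimic FORMAT = sourcetype::$2-$1 (hypothetical helper, not a Splunk API).

    Returns the rewritten sourcetype name, or None when the regex does not
    match, in which case Splunk would leave the sourcetype unchanged.
    """
    match = PATTERN.search(source)
    if match is None:
        return None
    # $2 is the app type (soap or batch), $1 is everything before it.
    return f"{match.group(2)}-{match.group(1)}"


# Sample names taken from the question, plus one that should not match.
samples = [
    "kube:container:organization-soap-app",
    "kube:container:our-us-app-internal-batch-app",
    "kube:container:compliance-regulations-batch-app",
    "kube:container:some-other-container",
]

for name in samples:
    print(name, "->", rewrite_sourcetype(name))
```

Because \S+ is greedy and then backtracks, multi-hyphen names such as our-us-app-internal are captured whole in the first group.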
IMO, if you have two or more sources with the same structure and the same field extractions, then they're all of the same sourcetype. If you need to distinguish them for some reason, use source or the presence of a certain string (like "-soap-app").
maybe this?
kube:container:([^\s]+)-soap-app
and
kube:container:([^\s]+)-batch-app
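A quick way to try the two patterns above against the sourcetype names from the question is a small Python sketch like the one below. The ^ and $ anchors are an addition of mine (an assumption, so that a name with extra text after -soap-app is rejected), and classify is a hypothetical helper name.

```python
import re

# The two suggested patterns, with ^ and $ anchors added as an assumption
# so that only names ending exactly in -soap-app / -batch-app match.
SOAP = re.compile(r"^kube:container:([^\s]+)-soap-app$")
BATCH = re.compile(r"^kube:container:([^\s]+)-batch-app$")


def classify(sourcetype):
    """Return (kind, app_name) for a matching sourcetype, else None."""
    for kind, pattern in (("soap", SOAP), ("batch", BATCH)):
        match = pattern.match(sourcetype)
        if match:
            return kind, match.group(1)
    return None


for name in (
    "kube:container:customer-soap-app",
    "kube:container:internal-regulations-batch-app",
    "kube:container:customer-soap-app-v2",  # rejected because of the $ anchor
):
    print(name, "->", classify(name))
```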