The Splunk Distribution of the OpenTelemetry (OTel) Collector ingests metrics, traces, and logs into the Splunk platform through the HTTP Event Collector (HEC). If you are a DevOps Engineer or SRE, you may already be familiar with the OTel Collector’s flexibility; for those less experienced, this blog post serves as an introduction to routing logs.
The idea of OpenTelemetry as a whole is to unify the data so it's suitable for every input and output and put some processors in between to make it possible to perform operations on data (such as transforming and filtering). You may already see that one of the biggest advantages of OTel Collector is its flexibility - but sometimes figuring out how to use it in practice is a challenge.
One of the most common cases in log processing is setting up the event’s index. If you’re familiar with the Splunk HEC exporter, you might recall this configuration snippet:
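A minimal sketch of such an exporter block follows. The token and endpoint are placeholders, not values from a real deployment.

```yaml
exporters:
  splunk_hec:
    # Placeholder HEC token and endpoint; substitute your own.
    token: "00000000-0000-0000-0000-000000000000"
    endpoint: "https://splunk.example.com:8088/services/collector"
    # Static index applied to every event sent through this exporter.
    index: "logs"
```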
This indicates that every event used by this exporter will be sent to the logs index.
As you can see, the logs index is specific to an exporter, so the intuitive approach is to create as many splunk_hec exporters as you need, plus multiple filelog receivers, so that you can control which files go to which index.
Imagine a scenario where all the logs go to the ordinary logs index, but some should only be visible to people with higher permission levels. These logs are gathered by the filelog/security receiver, and the pipeline structure would look like this:
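As a rough sketch of that static layout, assuming hypothetical file paths, a hypothetical security index name, and placeholder HEC credentials, the configuration needs one receiver, one exporter, and one pipeline per index:

```yaml
receivers:
  filelog:
    include: [/var/log/app/*.log]        # hypothetical path for ordinary logs
  filelog/security:
    include: [/var/log/security/*.log]   # hypothetical path for restricted logs

exporters:
  splunk_hec/logs:
    token: "00000000-0000-0000-0000-000000000000"   # placeholder
    endpoint: "https://splunk.example.com:8088/services/collector"
    index: "logs"
  splunk_hec/security:
    token: "00000000-0000-0000-0000-000000000000"   # placeholder
    endpoint: "https://splunk.example.com:8088/services/collector"
    index: "security"                                # hypothetical index name

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [splunk_hec/logs]
    logs/security:
      receivers: [filelog/security]
      exporters: [splunk_hec/security]
```

Every additional index means another receiver, exporter, and pipeline, which is the duplication that dynamic routing avoids.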
But is it really the best solution? Let’s consider a few questions here:
Today I’ll show you how to create a pipeline with dynamic index routing, meaning it is based on incoming logs and not statically set, with a transform processor and Splunk OpenTelemetry Collector for Kubernetes (SOCK). The idea is based on this attribute from Splunk HEC Exporter documentation:
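As a sketch of the setting in question, the exporter's hec_metadata_to_otel_attrs block maps attributes to HEC metadata fields. The values shown are the defaults as I understand them from the exporter's README, so verify them against the version you run.

```yaml
exporters:
  splunk_hec:
    hec_metadata_to_otel_attrs:
      # Attribute whose value becomes the HEC event's index field.
      index: "com.splunk.index"
      # Related mappings for the other HEC metadata fields.
      source: "com.splunk.source"
      sourcetype: "com.splunk.sourcetype"
      host: "host.name"
```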
This means that we can specify com.splunk.index as a resource attribute for a log, and it will overwrite the default index. Let’s go through a few examples of how we can do it in SOCK.
Before we cover how to overwrite your pipelines, let’s start with how you can view the pipeline. The final config is the result of your configuration in values.yaml, as well as the default configuration that is delivered by SOCK. The config’s yaml file is in the pod’s configmap.
Because logs are handled by the agent, look at the agent’s config by running kubectl describe configmap my-splunk-otel-collector-otel-agent.
Here my-splunk-otel-collector-otel-agent is the configmap’s name. It might differ in your case, especially if you chose a different installation name from the one in the Getting Started docs. You can list the configmaps in the current namespace by running kubectl get configmaps.
An output example for a default namespace would be:
After successfully running the describe command, scroll all the way down until you see the pipelines section. For logs, it looks more or less like this:
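The listing below is a sketch reconstructed from a typical SOCK installation, not output copied from a live cluster. Component names and their order vary between chart versions, so treat your own configmap as the source of truth.

```yaml
service:
  pipelines:
    logs:
      exporters:
        - splunk_hec/platform_logs
      processors:
        - memory_limiter
        - k8sattributes
        - filter/logs
        - batch
        - resourcedetection
        - resource
        - resource/logs
        - resource/add_environment
      receivers:
        - filelog
        - fluentforward
        - otlp
```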
Now you know what components your logs pipeline is made of!
Now let’s get our hands dirty! Let’s see the easy examples of index routing based on real scenarios.
Let’s say we want to send all the events whose log.iostream attribute is stderr to error_index. This would capture events emitted to the error stream and send them to their own index.
This requires two things: defining a transform processor with the right statement, and adding that processor to the logs pipeline.
Every transform processor consists of a set of statements. We need to create one that matches our use case, by defining what we need and writing it specifically for OTel. The logical statement here would be:
set com.splunk.index value to be error_index for EVERY log from the pipeline whose attribute log.iostream is set to stderr
Then the statement in the transform processor’s syntax described here looks like this:
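A minimal sketch of that processor is shown here. The name transform/set_index is an arbitrary label chosen for illustration, and error_mode: ignore is an optional setting added so that a failing statement does not drop the log.

```yaml
processors:
  transform/set_index:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Route stderr events by overriding the index resource attribute.
          - set(resource.attributes["com.splunk.index"], "error_index") where attributes["log.iostream"] == "stderr"
```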
Next, we need to append the processor to the logs pipeline. To do that, we need to copy and paste the current processors under the agent.config section then insert our processor at the end.
The whole config will be:
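As a sketch of the relevant part of values.yaml, assuming the default processor list shown is what your configmap reports (it differs between chart versions) and that transform/set_index is an illustrative name:

```yaml
agent:
  config:
    processors:
      transform/set_index:
        log_statements:
          - context: log
            statements:
              - set(resource.attributes["com.splunk.index"], "error_index") where attributes["log.iostream"] == "stderr"
    service:
      pipelines:
        logs:
          processors:
            # Copied from the existing pipeline; lists are replaced, not merged.
            - memory_limiter
            - k8sattributes
            - filter/logs
            - batch
            - resourcedetection
            - resource
            - resource/logs
            - resource/add_environment
            # Our new processor goes last.
            - transform/set_index
```

The existing processors have to be repeated because overriding a list in values.yaml replaces the whole list rather than appending to it.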
After applying the config, the stderr events appear in the error_index:
The next scenario is passing an event to a different index when something specific is written in the body of the log, for example, every log that contains [WARNING].
All the keywords used here come from the transform processor documentation. We can use the transform processor again, this time with the following logic: set com.splunk.index to the target index for every log whose body contains [WARNING].
Here are some sources that can be used to learn more about OpenTelemetry Transformation Language and its grammar.
Then we repeat the steps described in the previous solutions section. The final config is:
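A sketch of what this could look like in values.yaml follows. The index name warning_index and the processor name transform/set_index are hypothetical, and the processor list is assumed to match your chart's defaults. The pattern escapes the square brackets so they are matched literally.

```yaml
agent:
  config:
    processors:
      transform/set_index:
        log_statements:
          - context: log
            statements:
              # IsMatch applies a regular expression to the log body.
              - set(resource.attributes["com.splunk.index"], "warning_index") where IsMatch(body, "\\[WARNING\\]")
    service:
      pipelines:
        logs:
          processors:
            - memory_limiter
            - k8sattributes
            - filter/logs
            - batch
            - resourcedetection
            - resource
            - resource/logs
            - resource/add_environment
            - transform/set_index
```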
And the result in Splunk Enterprise looks like this:
At this point, you might think “Oh right, that looks easy, but how would I know what attributes to use?” The logs in the transform processor can use all the elements described here, but the most useful ones are:
You can see them in the Splunk Enterprise event preview:
However, there’s no indication as to which dimensions are attributes and which are resource.attributes. You can see how it looks by running your OTel agent with this config:
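One way to do this, sketched under the assumption that your collector version ships the debug exporter (older releases used an exporter named logging instead) and that your logs exporter is called splunk_hec/platform_logs:

```yaml
agent:
  config:
    exporters:
      debug:
        # "detailed" prints each record with its resource and log attributes.
        verbosity: detailed
    service:
      pipelines:
        logs:
          exporters:
            - splunk_hec/platform_logs
            - debug
```

The records are then written to the agent pod's own log output, which you can read with kubectl logs.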
This will produce all the information about log structure and which attributes are really the resource.attributes:
From this snippet, you can see that only logtag and log.iostream are attributes; all the rest are part of the resource.attributes.
The transform processor has many options beyond the ones described above; see its documentation for the full list.
Let’s go even deeper and operate on two variables instead of one.
You may want to annotate the whole namespace with one splunk.com/index annotation, but redirect logs from specific pods in that namespace somewhere else. You can do this by adding an extra annotation to the pods of your choice and using a transform processor to act on it.
Let’s say the annotation is second_index. This is how it looks in kubectl describe of the pod:
First, to redirect logs from the pods according to the second_index annotation, convert the annotation to a resource attribute. This can be done with the extraAttributes.fromAnnotations config:
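A minimal sketch of that values.yaml section, assuming the annotation key is second_index and that the resulting attribute should keep the same name:

```yaml
extraAttributes:
  fromAnnotations:
    # key: the pod annotation to read.
    - key: second_index
      # tag_name: the name the value gets in resource.attributes.
      tag_name: second_index
```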
tag_name is the identifier of the element in resource.attributes; it is optional. If you don’t configure it, the attribute name follows the format k8s.pod.annotations.<key>.
With tag_name you can decide what your attribute is called; in this example it is the same as the key, second_index.
Now that we have the second_index resource attribute set up, we can set the index destination for logs. We will use the transform processor for this purpose:
We will replace the com.splunk.index resource attribute with the second_index attribute, but only when the second_index attribute is present - so it doesn’t affect logs from other pods.
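That conditional copy can be sketched as a single statement. The processor name transform/second_index is an illustrative label, not something SOCK defines.

```yaml
processors:
  transform/second_index:
    log_statements:
      - context: log
        statements:
          # Only touch logs whose resource actually carries second_index.
          - set(resource.attributes["com.splunk.index"], resource.attributes["second_index"]) where resource.attributes["second_index"] != nil
```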
Once the attribute has been moved to the log's index, we can get rid of it. This requires adding another statement to the transform processor:
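As a sketch, the cleanup is a delete_key call placed after the set statement, so the value is copied before it is removed. The processor name is again illustrative.

```yaml
processors:
  transform/second_index:
    log_statements:
      - context: log
        statements:
          # 1. Copy the annotation value into the index attribute.
          - set(resource.attributes["com.splunk.index"], resource.attributes["second_index"]) where resource.attributes["second_index"] != nil
          # 2. Drop the helper attribute so it is not indexed with the event.
          - delete_key(resource.attributes, "second_index") where resource.attributes["second_index"] != nil
```

As in the earlier examples, the processor still has to be appended to the logs pipeline under agent.config for it to take effect.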
Routing by pod label works exactly the same as the annotation example from the previous section; the only difference is in how we transform the label into a resource attribute. Say we now have the second_index label on a pod:
We can make it visible to the OTel collector with this config snippet:
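A minimal sketch, assuming the label key is second_index and mirroring the annotation setup:

```yaml
extraAttributes:
  fromLabels:
    # key: the pod label to read.
    - key: second_index
      # tag_name: the name the value gets in resource.attributes.
      tag_name: second_index
```

The transform processor statements from the annotation example can then be reused unchanged, since they only look at the second_index resource attribute.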
In this article, I showed you how to route logs to different indexes. It is a commonly used feature and it can be used in many scenarios, as we can see in the examples. We will expand on other SOCK features in later articles, so stay tuned!