Splunk Search

How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

servlette
Engager

Hi,

I have something like the following, where I have a message producer and consumer.
I am using ActiveMQ for messaging.

Sometimes I notice that the consumer didn't get some messages. I'm logging like this:

Producer code: log.info("Status=Produced, TransactionId=123");
Consumer code: log.info("Status=Consumed, TransactionId=123");

I also have a Dead Letter queue consumer, which logs something like:

DLQConsumer: log.info("Status=Discarded, TransactionId=123");

The whole Producer/Consumer flow is Async.

I need Splunk to alert me when it sees a transaction that was not processed by the Consumer.

How do I write a Splunk search to alert me for these?

In a nutshell what I would like to get reported is that:

All messages produced should be consumed; if not, I need to be alerted with the TransactionId.

Also, I don't want Splunk to report a message that was just produced and has not yet been consumed.
Maybe I can set the time range from (current time - 15 minutes) to (current time - 1 minute) to avoid that situation.


woodcock
Esteemed Legend

I don't know why you mentioned the DLQ but something like this should work for you:

... | reverse | streamstats current=t count(eval(Status="Produced")) AS sessionID by TransactionId | stats earliest(_time) AS startTime latest(_time) AS endTime count by TransactionId sessionID | where count=1 | eval waitingSeconds = now() - startTime | where waitingSeconds > (15*60)
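
For comparison, a stats-only variant might look like the sketch below. It is not tested against real data and rests on a few assumptions: the index name app_logs is hypothetical, and Status and TransactionId are assumed to be extracted as fields from the log lines shown in the question. The search window runs up to now so that a late Consumed event is still seen, and the one-minute grace period is applied to the Produced timestamp rather than to the search window itself.

```spl
index=app_logs TransactionId=* (Status=Produced OR Status=Consumed) earliest=-15m latest=now
| stats min(eval(if(Status="Produced", _time, null()))) AS producedTime
        count(eval(Status="Consumed")) AS consumedCount
        by TransactionId
| where isnotnull(producedTime) AND consumedCount=0
| where producedTime <= relative_time(now(), "-1m")
| eval waitingSeconds = now() - producedTime
| table TransactionId producedTime waitingSeconds
```

Saved as an alert that triggers when the number of results is greater than zero, this would list each TransactionId that was produced more than a minute ago and has no Consumed event yet. Ending the search window at one minute ago instead could hide a Consumed event that arrived in the last minute and produce a false positive.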

servlette
Engager

The reason I mentioned the DLQ is that I wanted a report telling me how many messages were not processed [on the Consumer layer]. Ideally, if I produce X, I want to consume all X. Irrespective of where the messages go (either to the DLQ or simply not consumed), I need a report that clearly tells me X were produced and X - n were consumed, and the report should list just the n unconsumed records along with their TransactionIds.
Yesterday I ran into an issue where the Producer dropped off messages and I didn't see any activity on the Consumer side. The messages were processed by the DLQConsumer after a while, as the Consumer had some issue (likely the connectivity to ActiveMQ was broken). Though a simple restart resolved the issue, I had no clue why no messages were being processed by the Consumer. The issue lasted for a few hours. I would have reacted sooner if I had had a Splunk alert for a situation like this, and that's why I posted this question yesterday.
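
A report along those lines could be sketched as follows. This is only a rough outline under the same assumptions as the log lines above: app_logs is a hypothetical index name, Status and TransactionId are assumed to be extracted fields, and the time range is whatever the time picker or the saved search specifies.

```spl
index=app_logs TransactionId=* (Status=Produced OR Status=Consumed OR Status=Discarded)
| stats count(eval(Status="Produced")) AS produced
        count(eval(Status="Consumed")) AS consumed
        count(eval(Status="Discarded")) AS discarded
        by TransactionId
| where produced > 0 AND consumed = 0
| eval outcome = if(discarded > 0, "Discarded (DLQ)", "Not consumed")
| eventstats count AS notConsumedTotal
| table TransactionId outcome notConsumedTotal
```

Each row is one of the n transactions that were produced but never consumed, notConsumedTotal repeats n on every row, and outcome separates messages that ended up in the DLQ from those with no further log event at all.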


woodcock
Esteemed Legend

That was my point: for the purposes of your question, DLQ is irrelevant. My answer should suffice as-is.


carmackd
Communicator

... | eval is_notable_event=if(condition,"t",null()) | transaction some_field | where isnull(is_notable_event)
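
Filled in for the logs in this question, that pattern might look something like the sketch below. It is untested, and the index name app_logs, the 15-minute maxspan, and the one-minute grace period are assumptions for illustration only.

```spl
index=app_logs TransactionId=* (Status=Produced OR Status=Consumed)
| eval is_notable_event=if(Status="Consumed", "t", null())
| transaction TransactionId maxspan=15m
| where isnull(is_notable_event) AND _time <= relative_time(now(), "-1m")
| table TransactionId _time
```

After transaction, is_notable_event is only set on groups that contain at least one Consumed event, so the isnull() filter keeps the transactions that were produced but not consumed. The _time of a transaction is the time of its earliest event, which is what the grace-period comparison relies on.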
