Deployment Architecture

Transaction problem when using "startswith"

Paolo_Prigione
Builder

Hi all, is there any known issue on the startswith clause, when using multiple fields to identify complex transactions?

In the following case using endswith alone does a good job, but using startswith or both of them will provide incorrect results.

Here's my case: it's a sample file, manually put together to explore the topic. Transactions are made by two types of events, those with tnsid "open" the transaction and those with altid "terminate" it. Those events are "chained" by an event having both fields in it.

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222
2010-12-01 08:00:15 - term - altid=111
2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:23 - chain - tnsid=CCC altid=555
2010-12-01 08:00:25 - term - altid=222
2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:30 - chain - tnsid=AAA altid=666
2010-12-01 08:00:35 - term - altid=555

I can achieve the expected result through the simplest transam command:

| file /tmp/tnsexp.log | extract | sort - _time | transam tnsid altid

T1:
2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111
T2:
2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:25 - term - altid=222
T3:
2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:23 - chain - tnsid=CCC altid=555
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:35 - term - altid=555
T4:
2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:30 - chain - tnsid=AAA altid=666

The latest is an "open" transaction which should or should not be returned depending on the keepevicted setting, but having Splunk no knowledge about what's an evicted transaction the parameter has no effect. If I wish to code some logic into the command:

  • ...| transam tnsid altid endswith="term" -> will provide correct results, keepevicted will correctly control the output of T4, but will leave closed_txn=0 for all of the results
  • ...| transam tnsid altid startswith="init" -> will break everything

----- Here's its output:

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111

2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222

2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:25 - term - altid=222

2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:35 - term - altid=555
  • ...| transam tnsid altid startswith="init" endswith="term" -> will still return incorrect results.

----- Here's its output:

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111

2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:25 - term - altid=222

2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:35 - term - altid=555

I'd need some help in finding out the rationale behind this behaviour...

Thanks, Paolo

Tags (2)

Ledion_Bitincka
Splunk Employee
Splunk Employee

This issue has been fixed in 4.3.x to get the correct behavior you should add: unifyends=true to your transaction command. The idea here is to tell the transaction command that the opening and closing events of a transaction must also share at least one field value with the rest of the events.

Lowell
Super Champion

So, is you issues with approach 1 simply that you can't determine if the transaction is still open or not? If so there may be a very simple solution.

You could determine if the transaction is complete by simply tacking a search or eval to the end.

| file /tmp/tnsexp.log | extract | sort - _time | transam tnsid altid | eval is_complete=searchmatch("term")

It is much better for performance if you can specify startswith or endswith because transaction can be evicted once they are complete (uses less memory); but it may be possible to get away without them using appropriate maxspan and/or maxpause settings.

I have some more complex approaches that might work here; but they involve subsearches and/or nested transactions, so there's no point going there if the above gets the job done.

Get Updates on the Splunk Community!

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

.conf24 is taking place at The Venetian in Las Vegas from June 11 - 14. Continue reading to learn about the ...

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...