Deployment Architecture

Transaction problem when using "startswith"

Paolo_Prigione
Builder

Hi all, is there any known issue on the startswith clause, when using multiple fields to identify complex transactions?

In the following case using endswith alone does a good job, but using startswith or both of them will provide incorrect results.

Here's my case: it's a sample file, manually put together to explore the topic. Transactions are made by two types of events, those with tnsid "open" the transaction and those with altid "terminate" it. Those events are "chained" by an event having both fields in it.

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222
2010-12-01 08:00:15 - term - altid=111
2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:23 - chain - tnsid=CCC altid=555
2010-12-01 08:00:25 - term - altid=222
2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:30 - chain - tnsid=AAA altid=666
2010-12-01 08:00:35 - term - altid=555

I can achieve the expected result through the simplest transam command:

| file /tmp/tnsexp.log | extract | sort - _time | transam tnsid altid

T1:
2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111
T2:
2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:25 - term - altid=222
T3:
2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:23 - chain - tnsid=CCC altid=555
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:35 - term - altid=555
T4:
2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:30 - chain - tnsid=AAA altid=666

The latest is an "open" transaction which should or should not be returned depending on the keepevicted setting, but having Splunk no knowledge about what's an evicted transaction the parameter has no effect. If I wish to code some logic into the command:

  • ...| transam tnsid altid endswith="term" -> will provide correct results, keepevicted will correctly control the output of T4, but will leave closed_txn=0 for all of the results
  • ...| transam tnsid altid startswith="init" -> will break everything

----- Here's its output:

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111

2010-12-01 08:00:03 - init - tnsid=BBB
2010-12-01 08:00:07 - mid1 - tnsid=BBB
2010-12-01 08:00:13 - chain - tnsid=BBB altid=222

2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:25 - term - altid=222

2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:35 - term - altid=555
  • ...| transam tnsid altid startswith="init" endswith="term" -> will still return incorrect results.

----- Here's its output:

2010-12-01 08:00:01 - init - tnsid=AAA
2010-12-01 08:00:05 - mid1 - tnsid=AAA
2010-12-01 08:00:09 - chain - tnsid=AAA altid=111
2010-12-01 08:00:11 - mid2 - altid=111
2010-12-01 08:00:15 - term - altid=111

2010-12-01 08:00:17 - init - tnsid=CCC
2010-12-01 08:00:19 - mid2 - altid=222
2010-12-01 08:00:21 - mid1 - tnsid=CCC
2010-12-01 08:00:25 - term - altid=222

2010-12-01 08:00:27 - init - tnsid=AAA
2010-12-01 08:00:28 - mid2 - altid=555
2010-12-01 08:00:29 - mid1 - tnsid=AAA
2010-12-01 08:00:35 - term - altid=555

I'd need some help in finding out the rationale behind this behaviour...

Thanks, Paolo

Tags (2)

Ledion_Bitincka
Splunk Employee
Splunk Employee

This issue has been fixed in 4.3.x to get the correct behavior you should add: unifyends=true to your transaction command. The idea here is to tell the transaction command that the opening and closing events of a transaction must also share at least one field value with the rest of the events.

Lowell
Super Champion

So, is you issues with approach 1 simply that you can't determine if the transaction is still open or not? If so there may be a very simple solution.

You could determine if the transaction is complete by simply tacking a search or eval to the end.

| file /tmp/tnsexp.log | extract | sort - _time | transam tnsid altid | eval is_complete=searchmatch("term")

It is much better for performance if you can specify startswith or endswith because transaction can be evicted once they are complete (uses less memory); but it may be possible to get away without them using appropriate maxspan and/or maxpause settings.

I have some more complex approaches that might work here; but they involve subsearches and/or nested transactions, so there's no point going there if the above gets the job done.

Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...