Splunk Search

How to write a search to group values until threshold is reached?

david_rose
Communicator

I have data from two different data sources. I am trying to figure out how to allocate values against a cost until the cost is "used up", in other words until the sum of the assigned VALUES equals the COST. The search should then move on to the next COST and do the same thing with the remaining values until all the VALUES are exhausted.

Some sample data:

COST     VALUES
20000   30000
20000   5000
20000   2000
         8000
         15000

Given this, I need to be able to identify which VALUES are associated with which COST. As seen below.

COST    VALUES
20000   
        5000
        15000
20000   
        20000
20000   
        8000
        2000
        10000

The nested values add up to the COST. Also notice 30000 from the sample data was split into 20000 and 10000, due to only needing 20000 to satisfy one of the COSTs.

I have been banging my head against this wall for a week, and I am leaning towards just scripting it in python and passing it back into Splunk, unless of course some of you Splunk geniuses know of a more "splunkish" way to accomplish this.

Thanks for your help!

msivill_splunk
Splunk Employee

To your point about scripting in Python: to make this a bit more "splunkish", you could create a custom search command (http://dev.splunk.com/view/python-sdk/SP-CAAAEU2). Your Python code could then be called as a command from within the query (SPL).

0 Karma

david_rose
Communicator

Yeah, that's what I was planning to do.

0 Karma

woodcock
Esteemed Legend

The required algorithm does appear to be a very poor fit for the SPL command set. Writing a program to do just this and then sending the data to it (i.e. a custom command) is probably the best way.
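
For illustration, here is a minimal Python sketch of the allocation logic such a program might use. It is only one possible reading of the requirement: the function name `allocate` is made up for this example, and it assumes non-negative numbers and that VALUES are consumed in the order they arrive. The original post does not say how the values should be ordered, so the grouping this produces differs from the one shown in the question, although each group still sums to its COST.

```python
from collections import deque


def allocate(costs, values):
    """Assign values to each cost in input order, splitting a value
    when only part of it is needed to finish the current cost.

    Hypothetical helper for illustration; assumes non-negative numbers.
    """
    queue = deque(values)  # values still waiting to be assigned
    groups = []
    for cost in costs:
        remaining = cost   # how much of this cost is still uncovered
        parts = []
        while remaining > 0 and queue:
            value = queue.popleft()
            used = min(value, remaining)
            parts.append(used)
            remaining -= used
            if value > used:
                # Only part of the value was needed: put the remainder
                # back at the front so the next cost consumes it first.
                queue.appendleft(value - used)
        groups.append((cost, parts))
    return groups


if __name__ == "__main__":
    costs = [20000, 20000, 20000]
    values = [30000, 5000, 2000, 8000, 15000]
    for cost, parts in allocate(costs, values):
        print(cost, parts)
```

If the values run out before a cost is covered, that cost simply gets a shorter (possibly empty) list. The same function could be called from inside a custom search command once the COST and VALUES fields have been read from the incoming results.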

0 Karma

woodcock
Esteemed Legend

If you fix your example, which I am pretty sure is broken (the VALUES values are not the same between the 2 sections), I will make an attempt to answer.

0 Karma

david_rose
Communicator

The values are correct. One of the initial values was 30000. Because only 20000 of that was needed to fully account for a cost, it needs to be split into 20000 and the remainder, 10000.

0 Karma

woodcock
Esteemed Legend

OK, I did not read/understand the comment at the bottom which clarifies it. I see what you need now but it is a doozy.

0 Karma