Is there a way to check if a value is between a li...

charlesmcdonald · ‎09-17-2018

Hello!

Is there a way to check if a number is between a list of ranges in a multi value field?

For example on this table, I would want to create a new true/false field based on if "Value" is between one of the values in the Ranges column. I know this should be possible with mvexpand but that would get quite verbose especially if there were multiple sets of Ranges.

I tried looking but I couldn't find a 'for each' equivalent for multi value fields though maybe there is something I missed.

Thanks for the help!

zonistj · ‎09-17-2018

Hi,

I've found a way to do this but I'm making some assumptions. This method works if your "Ranges" values always come in sets of threes.

| makeresults 1 
| eval ID="0001",Ranges="1536609307-1536609307,1536782107-1536782107,1536868507-1536868507",Value="1536682105" 
| makemv delim="," Ranges
| eval first_value=mvindex(Ranges,0),last_value=mvindex(Ranges,2)
| eval first_value=replace(first_value,"\-\d+","")
| eval last_value=replace(last_value,"\-\d+","") 
| where Value > first_value AND Value < last_value

We're using mvindex to create separate values for your first and last value and then the replace function to make it numerically searchable. After that, we just compare "Value" against the first and last value from your range.

Hope this helps!

charlesmcdonald · ‎09-17-2018

Hey thanks for the response, unfortunately my ranges can be anywhere from 0 to 50 values so this will not work. Additionally, they may not be continuous so only looking if the number is between the first start and the last end may miss some values.

zonistj · ‎09-17-2018

Thanks for the additional information. I can help with the first issue :

| makeresults 1 
| eval ID="0001",Ranges="1536609307-1536609307,1536782107-1536782107,1536868507-1536868507",Value="1536682105" 
| fields - count 
| makemv delim="," Ranges
**| eval test_count = mvcount(Ranges)-1**
| eval first_value=mvindex(Ranges,0),last_value=**mvindex(Ranges,test_count)**
| eval first_value=replace(first_value,"\-\d+","")
| eval last_value=replace(last_value,"\-\d+","") 
| where Value > first_value AND Value < last_value

By using the "mvcount" function, we can dynamically set our mvindex to the last value regardless if it's 3 or 50.

I'm not sure if we can get around the other issue of the data not being continuous. It seems like that's a data source issue that is in contradiction to your use case. If possible, it's best to address that upstream at the data source.

But we might be able to do something. Can you post an example of what the non-continuous data looks like?

charlesmcdonald · ‎09-17-2018

The original example I posted is non-continuous, but another example of values that could be in Range would be: '1536695705-1536868507, 1536954904-1537214108'

zonistj · ‎09-17-2018

Okay, I think this will work but it might not perform quickly depending on your volume of data:

| makeresults 1 
| eval ID="0001",Ranges="1536695705-1536868507,1536954904-1537214108",Value="1536702105" 
| fields - count 
| makemv delim="," Ranges
| rex field=Ranges "(?<first_value>\d+)\-(?<last_value>\d+)"
| mvexpand first_value
| mvexpand last_value
| stats values(Value) AS Value  by ID,first_value,last_value
| where Value > first_value AND Value < last_value

I still think this could be best solved upstream. You might be able to get clever with transforms.conf and extract the necessary fields out at index time.

charlesmcdonald · ‎09-18-2018

Hey, this solution still will not work because say a value is between the ranges it will still get selected I believe.

zonistj · ‎09-18-2018

Oh, here, it occurred to me that we could just do the mvexpand before rex'ing out the new fields. That should solve the issue:

| makeresults 1 
| eval ID="0001",Ranges="1536695705-1536868507,1536954904-1537214108",Value="1536702105" 
| fields - count 
| makemv delim="," Ranges 
| mvexpand Ranges 
| rex field=Ranges "(?<first_value>\d+)\-(?<last_value>\d+)" 
| stats values(Value) AS Value by ID,first_value,last_value 
| where Value > first_value AND Value < last_value

If you take out the final "where" statement, you'll see that all of your ranges are there, but only the ones that have "value" in the range show in the results with the "where" statement.

charlesmcdonald · ‎09-18-2018

Okay this might work, will have to play around with it to see if there are any ways to speed it up since I'm looking at 10 mil+ rows.

zonistj · ‎09-18-2018

You're right. This mvexpand technique results in the first and last values for each range on the same lines. I'll work on it further and see if I can figure something out. Thanks for the interesting problem!

Is there a way to check if a value is between a list of values?

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor

Introducing the 2024 SplunkTrust!