Splunk Search

How to correlate Dense Sensor Data with a Sparse Data Set?

ErikaE
Communicator

I have dense sensor data (~75k events in a 3 week period) from multiple sensors that I would like to correlate to a sparse set of data (~100 events in the same time period). The data span several different sensors.

Sensor data events have these fields:
_time, SensorName=SensorA, Data=394

Sparse data events have these fields:
_time, ImportantProperty=583

Sensor data is logged at a fixed interval when nothing interesting is happening, or whenever there is a significant change in the Data value.

What I would like to do is have Splunk tell me what the Data value was for each SensorName at the time that an ImportantProperty event occurred.

In the past, when I was looking at only one sensor, I have used transaction:

(source=densedata AND SensorName=SensorA) OR source=sparsedata
| transaction maxspan=10m maxevents=-1
| search ImportantProperty=*
| stats avg(Data) avg(ImportantProperty) by _time

However, this doesn't work when multiple SensorNames are involved.

It seems like what I want is to subsearch for the sensor data around the time each ImportantProperty event occurs. However, I can't find any documentation on running a search over multiple time ranges (i.e. one per each of the ~100 sparse events) and collecting all the sensor data events in those windows.

Help!

I am constrained by my position and can't show my actual data, only generalized searches like the above.

1 Solution

woodcock
Esteemed Legend

You need the map command; it runs an inner search once for every event returned by the outer search and lets you pass fields from each outer event into the inner search:

http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/map
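
A minimal sketch of the pattern, using the generalized field names from the question (the ±300-second window and the maxsearches value are illustrative assumptions, not part of the original answer):

```
source=sparsedata
| eval earliest_t=_time-300, latest_t=_time+300
| map maxsearches=200 search="search source=densedata earliest=$earliest_t$ latest=$latest_t$ | stats avg(Data) by SensorName"
```

map runs the quoted search once per sparse event, substituting each $token$ with that event's field value, so every inner run is scoped to one event's time window. Note that maxsearches must be at least the number of outer events (~100 here), or the extra events are not searched.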


DalJeanis
Legend

This is a job for streamstats ...

source=densedata OR source=sparsedata 

| rename COMMENT as "optionally, if there is no SensorName on the sparsedata, then do this"
| eval SensorName=coalesce(SensorName,"sensor1 sensor2 sensor3 sensoretc")
| makemv SensorName    
| mvexpand SensorName

| rename COMMENT as "Now we keep a running average of the last ten minutes Data on all sensors"
| sort 0 SensorName _time
| streamstats avg(Data) as avgData time_window=10m by SensorName

| rename COMMENT as "And keep only the sparse events, which now have the running average for Data"
| where source="sparsedata"
| fillnull value="((none))" avgData
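
If you then want one row per ImportantProperty event with a column per sensor, the result can be pivoted by appending a chart command to the pipeline above (a sketch, reusing the avgData field from that search):

```
| chart values(avgData) over _time by SensorName
```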

ErikaE
Communicator

Thanks for the help! The doc is pretty sparse on examples of map. Is it possible to return matching events or more than one row of a table?

My inner search is:

search="search earliest=early_t latest=late_t Sensor=Sensor* | stats avg(Data) by Sensor"

which returns a table like:

Sensor     avg(Data)
SensorA    39
SensorB    393
SensorC    392
SensorD    494

The doc says that the search is a literal search. Where can I find more info on what that means and any restrictions that might come with it?


ErikaE
Communicator

Here is how I got the output I was looking for:

source=sparsedata
| eval earliest_t=_time-300
| eval latest_t=_time+300
| eval midtime=_time+0
| map maxsearches=999 search="search source=densedata earliest=$earliest_t$ latest=$latest_t$ | eval mid_t=$midtime$ | eval ImportantProperty=$ImportantProperty$ | stats avg(Data), values(ImportantProperty), values(mid_t) by Sensor"
| table Sensor avg(Data) values(ImportantProperty) values(mid_t)

The important bit I was missing when I posted my first reply was the final table command to get the output in the format I wanted.
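
To further get one row per sparse event with a column per sensor, the multi-word columns can be renamed and then charted (a sketch; the rename targets are the column names the stats call above produces):

```
| rename "avg(Data)" AS avgData, "values(ImportantProperty)" AS ImportantProperty, "values(mid_t)" AS eventTime
| chart first(avgData) over eventTime by Sensor
```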
