Splunk Search

How to calculate a duration percentage in a transaction search for non-existent events?

thechivalrous
New Member

I'm trying to calculate the percentage of online time for a set of devices.

I created the following search:

...| transaction startswith="offline state start" endswith="offline state end" | stats sum(duration) as total_offline | eval online=100*(86400-total_offline)/86400

This works fine when "offline state start" and "offline state end" exist in the logs, since I can calculate the online percentage from them. But if those strings do not exist in the logs, how can I calculate the online percentage? Ideally it should be 100% online when everything is fine (meaning no offline events were found), but how can I get the eval to return 100% when the logs contain neither "offline state start" nor "offline state end"?
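A minimal sketch of one way to get 100% in the no-match case, offered as an assumption rather than a confirmed answer from this thread: when stats has no BY clause it should still return a single row, with total_offline empty if no transactions were found, so fillnull can set it to 0 before the eval. The leading ... is the same elided base search as in the query above.

```spl
...| transaction startswith="offline state start" endswith="offline state end"
| stats sum(duration) as total_offline
| fillnull value=0 total_offline
| eval online=100*(86400-total_offline)/86400
```

With no offline transactions, total_offline becomes 0 and online evaluates to 100*(86400-0)/86400 = 100.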


skoelpin
SplunkTrust

Do your events have a unique identifier that ties each "offline state start" to its "offline state end" and marks them as a pair? If not, I suggest combining those logs into one event at index time rather than at search time.

To do this, go to $SPLUNK_HOME/etc/system/local on the indexer and edit your props.conf file so that those independent events are indexed as one. Share some sample data and I'll help you out.


skoelpin
SplunkTrust

So you have from=offline state start and to=offline state end in the same event. If you're going to group them with a transaction command, they need to be separate events. Below I pasted your example but removed "offline state end" from the first event and "offline state start" from the second. In this case you can use the transaction command to group those independent events into one event.

[2015-08-11 00:38:53,747] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, action=timeout

[2015-08-10 23:33:47,244] INFO tracking.WidgetStateMachine [tid-339] [request_id=1122334455666:6:0] : cmd=TRANSIT_STATE, device_id=112233445566, to=offline state end, action=connect

index=whatever | transaction startswith="offline state start" endswith="offline state end" | stats sum(duration) as total_offline | eval online=100*(86400-total_offline)/86400

Since you already have both "offline state start" and "offline state end" in the same event, you do not need to group them together. Is the device_id unique to each start/end pair? You need a unique field to tie the first event to the last event.


thechivalrous
New Member

Sorry, I'm new to Splunk, so excuse my ignorance.

Actually, the logs I pasted before were taken "after" running the transaction command. I think that is why both (from=offline state start, to=offline state end) were found in the same event.

To answer your question about the uniqueness of device_id: device_id is unique. The situation is as follows:

  • Multiple devices exist in the test.
  • Each device can go offline multiple times during the test period (say 24 hours). For example, Device-1 could go offline for the first 4 hours, come back online for an hour, and then go offline for another 2 hours. So I need to calculate a total offline duration of 6 hours for Device-1. Similarly, I need to calculate the offline duration for every device in the test, grouped by device_id.
  • Based on the offline duration, I need to calculate the online duration (and percentage).

The problem is to build a query that handles both of the following cases:

  • Some devices are offline occasionally: calculate the online percentage over the 24-hour period.
  • None of the devices are offline during the 24 hours: calculate the online percentage (which will obviously be 100%, since there are no offline devices).
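One way the two cases above could be handled per device, as a sketch under assumptions: index=whatever is a placeholder base search, and every device is assumed to log at least one event with a device_id field during the 24-hour window (a device that logs nothing at all would have to come from a lookup or other inventory instead). The appended subsearch adds a zero-offline row for every device seen, so devices with no offline transactions still appear in the results and come out at 100%.

```spl
index=whatever
| transaction device_id startswith="offline state start" endswith="offline state end"
| stats sum(duration) as total_offline by device_id
| append [ search index=whatever device_id=* | stats count by device_id | eval total_offline=0 | fields device_id total_offline ]
| stats sum(total_offline) as total_offline by device_id
| eval online=round(100*(86400-total_offline)/86400, 2)
```

For the Device-1 example (6 hours offline, so total_offline=21600), the eval works out to 100*(86400-21600)/86400, that is 75% online.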


thechivalrous
New Member

Hi skoelpin,

I actually forgot to mention the field in the transaction command. It's device_id, as shown below.

...| transaction device_id startswith="offline state start" endswith="offline state end" | ...

So if the strings given in startswith and endswith are present in the logs, I see events like the ones below. Here is a sample log:

[2015-08-11 00:38:53,747] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=timeout
[2015-08-11 14:16:25,001] INFO tracking.WidgetStateMachine [tid-390] [request_id=1122334455666:0:FB5511D6] : cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=connect

[2015-08-10 16:21:16,071] INFO tracking.WidgetStateMachine [tid-169]: cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=timeout 
[2015-08-10 23:33:47,244] INFO tracking.WidgetStateMachine [tid-339] [request_id=1122334455666:6:0] : cmd=TRANSIT_STATE, device_id=112233445566, from=offline state start, to=offline state end, action=connect