Splunk Search

merging records based on matching fields

erik_frambach
New Member

Hi,

I have incoming telephone call detail records like these:

call_start caller ring_duration call_duration
2016/09/02_16:49:40 0049026084395 5 00:00:00
2016/09/02_16:49:40 0049026084395 1 00:02:19
2016/09/06_14:39:47 0031685829250 7 00:04:52
2016/09/05_15:34:14 0031655838055 3 00:00:50
2016/09/05_15:34:14 0031655838055 0 00:02:22

There is something odd about them. The first two started at exactly the same second and originate from the same caller. The last two ditto. The explanation is as follows. The call at 16:49:40 was directed to a general number (first record, phone rang). Someone picked it up (second record). This means that the first record is NOT a missed call. I should dismiss it, which I can do using "dedup call_start caller".

But now consider the last two records. In this case the call was picked up and transferred to someone else after 50 seconds. So these two records should be treated as one. However, the dedup trick will not work correctly here. Instead I would like to merge them into one record, and add up the call_durations. That should be fine for the first two records as well. Note that records like the third should not be affected.

Any thoughts on how to do this?

0 Karma

somesoni2
SplunkTrust
SplunkTrust

How about this

your base search | eval call_duration=round(strptime(call_duration,"%H:%M:%S")-relative_time(now(),"@d")) | stats values(ring_duration) as ring_duration sum(call_duration) as call_duration values(otherfieldthatyouneed) as otherfieldthatyouneed by call_start caller
0 Karma

erik_frambach
New Member

Some progress... Using a "join" I can get the correct total # of calls without losing the correct total call time:

...same search... | stats count as "Total # calls" sum(duration) | stats count as "Total calls" | join type=left [ ...same search... | dedup call_start caller | stats count as "Correct total calls" ]

But the other stats are still incorrect. Ideally, I think, all records that have both identical call_start and identical caller, should be merged into one. So records 1 and 2 would become, say:

2016/09/02_16:49:40 0049026084395 5+1 00:00:00+00:02:19

No information is lost, it just needs a bit more processing. Can this be done?

0 Karma

sundareshr
Legend

How about

... | bin span=1d _time | stats sum(call_duration) as call_dur sum(ring_duration) as ring_dur by _time caller
0 Karma

erik_frambach
New Member

Thanks sunareshr. I think the solution will be somewhat more complex. What I want bottom line is a report that shows:

  1. Total number of incoming calls (calls transferred or picked up by others should count as only 1)
  2. Total number of missed calls (not counting records like the first in my sample)
  3. Mean response time (excluding records with call_duration=0)
  4. Max response time (excluding records with call_duration=0)
  5. Total calling time (= sum(call_duration))

I've got all of this working, except that I need to handle cases with identical call_start and caller. Without a dedup or other construction, no 1 and no 2 are both too high.

Any ideas?

0 Karma

erik_frambach
New Member

No, the called_number is the same in the first and second record.

The more I think about, the more it seems that I need two separate searches. How would I do that?

Here's my current search that produces incorrect values for total # calls and missed calls. Maybe this search is not clever or efficient - sorry, I'm a newbie at Splunk:
... | convert dur2sec(call_duration) as duration
| eval missed=if(duration=0,1,0)
| eval wasputonhold=if(hold_time>0,1,0)
| stats count AS "Total # calls"
sum(missed) as "Missed calls"
sum(wasputonhold) as "Put on hold"
avg(ring_duration) AS mean_answertime
max(ring_duration) AS max_answertime
max(hold_time) AS max_wasputonhold
sum(duration) as Total_calltime
| eval Total_calltime=strftime(Total_calltime, "%H:%M:%S")
| eval mean_answertime=round(mean_answertime,1)
| eval mean_answertime=tostring(mean_answertime)." s"
| eval max_answertime=tostring(max_answertime)." s"
| eval max_wasputonhold=tostring(max_wasputonhold)." s"
| rename mean_answertime as "Mean answering time"
max_answertime as "Max. answering time"
max_wasputonhold as "Max. time on hold"
Total_calltime as "Total call time"

0 Karma

sundareshr
Legend

So from the data above, how do you know the first call went to a general number and the fourth one did not?

0 Karma

erik_frambach
New Member

From the field "called_number" (not shown here) I can tell: it's the general number. There is a group of 10 that can pick up a call to this number. From the fact that there are two records from the same caller at the same second, I deduced that this is what happened. The records don't provide proof of this, but the people involved confirmed that this is how it works.
The same goes for the 4th record. This is how these people told me they handle calls.
The telephone system simply throws out records of everything. Even if I dial one digit and then cancel the call immediately, I get a record showing what I did. The system can't be trained to be more intelligent, so I'm hoping Splunk can show its power.

0 Karma

sundareshr
Legend

So this is not the entire log/event? Can the called_number field be used to determine missed vs forwarded calls?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...