Splunk Search

Why do I lose data when comparing daily results with previous weeks, same day, using appendcols?

gatekeeper36
New Member

Hi,

My goal is to compare today's count, say Monday, to the previous Monday, and likewise compare the previous week's Monday to the Monday of the week before that. This runs every day over a 2-week window.

If I run the two searches below separately, no data is lost. Interestingly, when I combine them with appendcols, only the first few days of the earliest week are lost in the second search.

My search:
index=dataset earliest=-14d@d latest=@d method="00001111"
| bucket _time span=1d
| dedup TRANSACTIONID
| stats dc(users) as CurrentCount by _time
| eval Date=strftime(_time, "%F")
| appendcols
    [search index=uxi earliest=-21d@d latest=-7d@d method="00001111"
    | bucket _time span=1d
    | dedup TRANSACTIONID
    | stats dc(users) as PreviousWkCount by _time]

Example:
Results over the 2-week window, with data lost from 07/11 to 07/13 when appendcols is used. These aren't real numbers.

Date Day CurrentCount PreviousWkCount
07/11/2017 Tues 10000 5000 <== Count is not accurate in appendcols
07/12/2017 Wed 10000 5000 <== Count is not accurate in appendcols
07/13/2017 Thurs 10000 5000 <== Count is not accurate in appendcols
07/14/2017 Fri 10000 10000 <== Ok
07/15/2017 Sat 10000 10000 <== Ok
...
...
07/24/2017 Mon 10000 10000 <== Ok

Thanks,


DalJeanis
Legend

As a general rule, appendcols is just not the right tool for this kind of analysis. You can do it in one pass, without any subsearches.
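(Editorial aside: the reason appendcols misbehaves is that it pastes the subsearch's columns onto the outer results by row position, not by matching _time. This Python sketch, with made-up dates and counts, models that positional pairing:)

```python
# Illustrative only: appendcols combines results row-by-row, not by _time.
# If the subsearch comes back short (e.g. it hits a subsearch limit),
# every remaining value slides up and pairs with the wrong date.
outer = [("07-11", 10000), ("07-12", 10000), ("07-13", 10000), ("07-14", 10000)]
sub   = [("07-07", 5000), ("07-08", 5000)]  # earlier days were dropped

# Positional pairing, the way appendcols glues the two result sets together
paired = [(date, cur, sub[i][1] if i < len(sub) else None)
          for i, (date, cur) in enumerate(outer)]
# 07-11 now carries the previous-week count that belongs with 07-07,
# and the later days get no previous-week value at all
```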

Here's the base search...

index=dataset earliest=-21d@d latest=@d method="00001111"
| bucket _time span=1d
| dedup TRANSACTIONID
| stats dc(users) as DayCount by _time

... and you can use this run-anywhere sample for testing...

| gentimes start=7/4/17 end=7/25/17 
| eval _time=starttime
| streamstats count as TRANSACTIONID 
| eval DayCount=TRANSACTIONID+1000
| table _time DayCount

... then you do this...

| rename COMMENT as "Clone each record into two records, one for each time range"
| eval which=mvappend("current","prior")
| mvexpand which

| rename COMMENT as "Set the time ranges, and get rid of irrelevant records"
| eventstats min(_time) as mintime
| eval startcurrent=mintime+604800
| eval endprior=startcurrent+604800
| where (which="current" AND _time>=startcurrent) 
    OR (which="prior" AND _time<endprior)

| rename COMMENT as "Set fields for each time period, move `prior` records forward one week."
| eval PreviousDate=strftime(if(which="prior",_time,null()),"%Y-%m-%d")
| eval _time=if(which="prior",_time+604800,_time)
| eval CurrentCount=if(which="current",DayCount,null())
| eval PreviousWkCount=if(which="prior",DayCount,null())

| rename COMMENT as "Roll the data up and present"
| stats sum(CurrentCount) as CurrentCount  sum(PreviousWkCount) as PreviousWkCount max(PreviousDate) as PreviousDate by _time
| table _time CurrentCount PreviousDate PreviousWkCount 
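(Editorial aside: the clone-and-shift pipeline above can be modeled outside Splunk. This Python sketch mirrors each step — mvexpand-style cloning, trimming each clone to its window, shifting `prior` rows forward one week, then a stats-style roll-up — on sample data similar to the gentimes block; all names and counts are illustrative:)

```python
import datetime

# Sample data: one row per day over a 3-week window, like the gentimes block
start = datetime.datetime(2017, 7, 4)
rows = [{"time": start + datetime.timedelta(days=i), "DayCount": 1000 + i}
        for i in range(21)]

# Step 1: clone each record into a "current" and a "prior" copy (mvexpand)
cloned = [{**r, "which": w} for r in rows for w in ("current", "prior")]

# Step 2: keep current records in the last 2 weeks, prior records in the first 2
mintime = min(r["time"] for r in rows)
startcurrent = mintime + datetime.timedelta(weeks=1)
endprior = startcurrent + datetime.timedelta(weeks=1)
kept = [r for r in cloned
        if (r["which"] == "current" and r["time"] >= startcurrent)
        or (r["which"] == "prior" and r["time"] < endprior)]

# Step 3: shift prior records forward one week so they land on the day
# they should be compared against
for r in kept:
    if r["which"] == "prior":
        r["PreviousDate"] = r["time"].strftime("%Y-%m-%d")
        r["time"] += datetime.timedelta(weeks=1)

# Step 4: roll everything up by day (the final stats ... by _time)
result = {}
for r in kept:
    day = result.setdefault(r["time"], {})
    if r["which"] == "current":
        day["CurrentCount"] = r["DayCount"]
    else:
        day["PreviousWkCount"] = r["DayCount"]
        day["PreviousDate"] = r["PreviousDate"]
```

Because both clones of a day's record come from the same single search, no row can lose its partner the way appendcols rows can.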

niketn
Legend

@gatekeeper36, running a subsearch-based command like append or appendcols can silently drop events once the subsearch reaches its result limit. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches#Subsearch_performance_con...

You can try any or all of the following, whichever apply:

1) Reduce the query time window to cut the overall number of events.
2) Add filters like TRANSACTIONID="*" users=* so that only events carrying those fields are returned. If you have additional filters, apply them in the base search of both the main query and the subsearch.
3) Edit limits.conf to raise the number of results a subsearch may return, provided your Splunk infrastructure can handle it.
4) Summarize events (summary indexing), so that append operates on pre-computed stats rather than raw events.
5) Experiment with other correlation techniques, such as multisearch (see Splunk Answers: https://answers.splunk.com/answers/241562/does-multisearch-suffer-from-subsearch-limits.html#answer-...) or the union command, newly introduced in Splunk 6.6.
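For option 3, the subsearch result cap lives in the [subsearch] stanza of limits.conf. A minimal sketch — the values below are illustrative, not recommendations, and you should confirm the defaults for your Splunk version:

```
# $SPLUNK_HOME/etc/system/local/limits.conf
[subsearch]
# Maximum number of results a subsearch may return (default 10000)
maxout = 50000
# Maximum runtime in seconds before the subsearch is finalized (default 60)
maxtime = 120
```

On a distributed deployment this change must be made where the search runs (the search head), and raising these limits increases memory and runtime cost for every subsearch.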

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

gatekeeper36
New Member

Thanks for the quick reply. There isn't much room to maneuver on #1 and #2, but I am interested in #3-5 and will be looking into them.
