Splunk Search

subsearch fields do not appear in the table command..

smuderasi
Explorer

What is wrong with this search:

host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" |eval path=mvindex(split(source,"-"),0) | 
eval thread=mvindex(split(path,"/"),-1) |search "BIP CONTROL" | search "TASK MODE:      0"  | 
eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") | rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?[^,\s]+)" | 
 rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?[^,\s]+)" | rex "TOTAL\s+LOCKED/SKIPPED:\s+(?[^,\s]+)"|
 rex "TOTAL\s+IN\s+ERROR:\s+(?[^,\s]+)" | 
 rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?[^,\s]+)" |stats sum(total_accounts) 
 as Total_Accounts,sum(processed) as Total_Successful,sum(skipped) as "Total_Skipped",sum(error) as "Total_Error" by Bill_run,EXECUTION_DATE 
| where Bill_run like "%M%"  | 

appendcols

 [search host="****" sourcetype="kenan_pst" | where source like "%pst_pe_arb_bil_run_bip%"  |
 eval date=mvindex(split(mvindex(split(source,"/"),-1),"_"),6) |  stats earliest(_time) as first_event latest(_time) as 
last_event by source ,date | eval first_event=strftime(first_event,"%H:%M:%S") | eval last_event=strftime(last_event,"%H:%M:%S") 
 | eval time_diff=strptime(last_event,"%H:%M:%S") - strptime(first_event,"%H:%M:%S")
 |  table time_diff date] 

 |  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff  date
0 Karma
1 Solution

DalJeanis
Legend

Your posted code won't work. I don't know what this will do...

| where Bill_run like "%M%"  

... but it seems like you want this instead...

| where like(Bill_run,"%M%")  

... and you want it as high in the code as that field can be tested, if I have guessed right, that is immediately before the stats in your code, but it can be moved (along with the rex that extracts it) to right after the searches..


Most of the work in the subsearch can be simplified to a single stats range() presumably, the date in the subsearch has to be the same as EXECUTION_DATE, or there's nothing concrete to connect the two searches. The resulting code looks something like this...

 host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" 
| eval path=mvindex(split(source,"-"),0) 
| eval thread=mvindex(split(path,"/"),-1) 
| search "BIP CONTROL" 
| search "TASK MODE:      0"  
| rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?<Bill_run>[^,\s]+)" 
| where like(Bill_run,"%M%")  
| eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") 
| rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?<total_accounts>[^,\s]+)" 
| rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?<processed>[^,\s]+)" 
| rex "TOTAL\s+LOCKED/SKIPPED:\s+(?<skipped>[^,\s]+)"
| rex "TOTAL\s+IN\s+ERROR:\s+(?<error>[^,\s]+)" 
| stats sum(total_accounts) as Total_Accounts, 
        sum(processed) as Total_Successful, 
        sum(skipped) as "Total_Skipped", 
        sum(error) as "Total_Error" by Bill_run, EXECUTION_DATE 
| left join EXECUTION_DATE
    [search host="****" sourcetype="kenan_pst" 
    | where source like "%pst_pe_arb_bil_run_bip%"   
    | eval mydate=mvindex(split(mvindex(split(source,"/"),-1),"_"),6)  
    | stats range(_time) as time_diff by source ,mydate 
    | eval EXECUTION_DATE = strftime(strptime(mydate,"formate for mydate"),"%Y.%m.%d") 
    | table time_diff EXECUTION_DATE
    ] 
|  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff 

View solution in original post

DalJeanis
Legend

Your posted code won't work. I don't know what this will do...

| where Bill_run like "%M%"  

... but it seems like you want this instead...

| where like(Bill_run,"%M%")  

... and you want it as high in the code as that field can be tested, if I have guessed right, that is immediately before the stats in your code, but it can be moved (along with the rex that extracts it) to right after the searches..


Most of the work in the subsearch can be simplified to a single stats range() presumably, the date in the subsearch has to be the same as EXECUTION_DATE, or there's nothing concrete to connect the two searches. The resulting code looks something like this...

 host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" 
| eval path=mvindex(split(source,"-"),0) 
| eval thread=mvindex(split(path,"/"),-1) 
| search "BIP CONTROL" 
| search "TASK MODE:      0"  
| rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?<Bill_run>[^,\s]+)" 
| where like(Bill_run,"%M%")  
| eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") 
| rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?<total_accounts>[^,\s]+)" 
| rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?<processed>[^,\s]+)" 
| rex "TOTAL\s+LOCKED/SKIPPED:\s+(?<skipped>[^,\s]+)"
| rex "TOTAL\s+IN\s+ERROR:\s+(?<error>[^,\s]+)" 
| stats sum(total_accounts) as Total_Accounts, 
        sum(processed) as Total_Successful, 
        sum(skipped) as "Total_Skipped", 
        sum(error) as "Total_Error" by Bill_run, EXECUTION_DATE 
| left join EXECUTION_DATE
    [search host="****" sourcetype="kenan_pst" 
    | where source like "%pst_pe_arb_bil_run_bip%"   
    | eval mydate=mvindex(split(mvindex(split(source,"/"),-1),"_"),6)  
    | stats range(_time) as time_diff by source ,mydate 
    | eval EXECUTION_DATE = strftime(strptime(mydate,"formate for mydate"),"%Y.%m.%d") 
    | table time_diff EXECUTION_DATE
    ] 
|  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff 

woodcock
Esteemed Legend

There is no way that this is doing what you think it should; in order to have appendcols work, you must take great pains to ensure that both datasets have identical keys with no gaps in the exact same order. You would be better off showing us sample events from both datasets and explaining exactly what you need to do, including a mockup of the final desired results.

0 Karma

Sukisen1981
Champion

Hi
As stated above you can not use appencols if you dont have at least one common field with common field values as keys for both rows.
For example, if you use appendclos with a common field say name, then ALL the values for name should be present in both the main search and sub search,
If name is main search has John, Marry, Suki and Arsene
And the sub search returns values with John, Marry and Arsene append cols will match the first 2 common values and put the value of Arsene against Suki in the out table.
What you need is join - http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Join
refer to the link for the various join types, it will resolve your issue

0 Karma

woodcock
Esteemed Legend

No, using join subjects you to a low an inescapable subsearch limit, so do everything possible to avoid using it. See here:

Who: Nick Mealy, Sideview Apps
What: Best practices around grouping and aggregating data from different search results
Recording: https://splunk.webex.com/splunk/lsr.php?RCID=051cba772c9cce79c3eb3718b466233a
Slides: http://sideviewapps.com/slides/2017_05_02_sideview_let_stats_sort_them_out.pptx (Updated several times since the talk and since conf)
The version from the first virtual conf is http://sideviewapps.com/slides/2016_04_15_nick_mealy_grouping_talk.pptx
flowchart (in progress) : http://sideviewapps.com/slides/grouping_flowchart_INPROGRESS.png

0 Karma

cmerriman
Super Champion

when you run just your subsearch, do you get time_diff and date?

as a housekeeping question, why are you putting first_event and last_event in human readable time? you aren't actually using those two evals in your subsearch, since you need them in epoch to eval time_diff.

have you tried using fields instead of table?

appendcols doesn't join any of the fields together, so if you need EXECUTION_DATE joined with date, you might what to use join, also, if your subsearch has 5 results and your base search has 1000, there won't be a time_diff and date for every line of your base search, just the first 5.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Appendcols

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi smuderasi,
which subsearch are you speaking of? there are two subsearches.

Anyway your first subsearch is used to filter events resulting from the main search, so it doesn't add any field to results.

If you're speaking of appendcols subsearch, remember that this command append to all rows of main search fields of first row of the subsearch.
if you want to join main search results with subsearch results use join command.

Bye.
Giuseppe

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...