Solved: subsearch fields do not appear in the table comman...

smuderasi · ‎09-06-2017

What is wrong with this search:

host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" |eval path=mvindex(split(source,"-"),0) | 
eval thread=mvindex(split(path,"/"),-1) |search "BIP CONTROL" | search "TASK MODE:      0"  | 
eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") | rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?[^,\s]+)" | 
 rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?[^,\s]+)" | rex "TOTAL\s+LOCKED/SKIPPED:\s+(?[^,\s]+)"|
 rex "TOTAL\s+IN\s+ERROR:\s+(?[^,\s]+)" | 
 rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?[^,\s]+)" |stats sum(total_accounts) 
 as Total_Accounts,sum(processed) as Total_Successful,sum(skipped) as "Total_Skipped",sum(error) as "Total_Error" by Bill_run,EXECUTION_DATE 
| where Bill_run like "%M%"  | 

appendcols

 [search host="****" sourcetype="kenan_pst" | where source like "%pst_pe_arb_bil_run_bip%"  |
 eval date=mvindex(split(mvindex(split(source,"/"),-1),"_"),6) |  stats earliest(_time) as first_event latest(_time) as 
last_event by source ,date | eval first_event=strftime(first_event,"%H:%M:%S") | eval last_event=strftime(last_event,"%H:%M:%S") 
 | eval time_diff=strptime(last_event,"%H:%M:%S") - strptime(first_event,"%H:%M:%S")
 |  table time_diff date] 

 |  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff  date

DalJeanis · ‎09-10-2017

Your posted code won't work. I don't know what this will do...

| where Bill_run like "%M%"

... but it seems like you want this instead...

| where like(Bill_run,"%M%")

... and you want it as high in the code as that field can be tested, if I have guessed right, that is immediately before the stats in your code, but it can be moved (along with the rex that extracts it) to right after the searches..

Most of the work in the subsearch can be simplified to a single stats range() presumably, the date in the subsearch has to be the same as EXECUTION_DATE, or there's nothing concrete to connect the two searches. The resulting code looks something like this...

 host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" 
| eval path=mvindex(split(source,"-"),0) 
| eval thread=mvindex(split(path,"/"),-1) 
| search "BIP CONTROL" 
| search "TASK MODE:      0"  
| rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?<Bill_run>[^,\s]+)" 
| where like(Bill_run,"%M%")  
| eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") 
| rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?<total_accounts>[^,\s]+)" 
| rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?<processed>[^,\s]+)" 
| rex "TOTAL\s+LOCKED/SKIPPED:\s+(?<skipped>[^,\s]+)"
| rex "TOTAL\s+IN\s+ERROR:\s+(?<error>[^,\s]+)" 
| stats sum(total_accounts) as Total_Accounts, 
        sum(processed) as Total_Successful, 
        sum(skipped) as "Total_Skipped", 
        sum(error) as "Total_Error" by Bill_run, EXECUTION_DATE 
| left join EXECUTION_DATE
    [search host="****" sourcetype="kenan_pst" 
    | where source like "%pst_pe_arb_bil_run_bip%"   
    | eval mydate=mvindex(split(mvindex(split(source,"/"),-1),"_"),6)  
    | stats range(_time) as time_diff by source ,mydate 
    | eval EXECUTION_DATE = strftime(strptime(mydate,"formate for mydate"),"%Y.%m.%d") 
    | table time_diff EXECUTION_DATE
    ] 
|  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff

View solution in original post

DalJeanis · ‎09-10-2017

Your posted code won't work. I don't know what this will do...

| where Bill_run like "%M%"

... but it seems like you want this instead...

| where like(Bill_run,"%M%")

... and you want it as high in the code as that field can be tested, if I have guessed right, that is immediately before the stats in your code, but it can be moved (along with the rex that extracts it) to right after the searches..

Most of the work in the subsearch can be simplified to a single stats range() presumably, the date in the subsearch has to be the same as EXECUTION_DATE, or there's nothing concrete to connect the two searches. The resulting code looks something like this...

 host="**" source="*BIP*"  NOT source="*BIP98*" NOT source="*BIP99*" 
| eval path=mvindex(split(source,"-"),0) 
| eval thread=mvindex(split(path,"/"),-1) 
| search "BIP CONTROL" 
| search "TASK MODE:      0"  
| rex "MISSING\n\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\s+\W+\n(?<Bill_run>[^,\s]+)" 
| where like(Bill_run,"%M%")  
| eval EXECUTION_DATE=strftime(strptime(EXECUTION_DATE,"%d %b %Y"),"%Y.%m.%d") 
| rex "TOTAL\s+ACCOUNTS\s+QUEUED:\s+(?<total_accounts>[^,\s]+)" 
| rex "TOTAL\s+SUCCESSFULLY\s+PROCESSED:\s+(?<processed>[^,\s]+)" 
| rex "TOTAL\s+LOCKED/SKIPPED:\s+(?<skipped>[^,\s]+)"
| rex "TOTAL\s+IN\s+ERROR:\s+(?<error>[^,\s]+)" 
| stats sum(total_accounts) as Total_Accounts, 
        sum(processed) as Total_Successful, 
        sum(skipped) as "Total_Skipped", 
        sum(error) as "Total_Error" by Bill_run, EXECUTION_DATE 
| left join EXECUTION_DATE
    [search host="****" sourcetype="kenan_pst" 
    | where source like "%pst_pe_arb_bil_run_bip%"   
    | eval mydate=mvindex(split(mvindex(split(source,"/"),-1),"_"),6)  
    | stats range(_time) as time_diff by source ,mydate 
    | eval EXECUTION_DATE = strftime(strptime(mydate,"formate for mydate"),"%Y.%m.%d") 
    | table time_diff EXECUTION_DATE
    ] 
|  table Bill_run  EXECUTION_DATE Total_Accounts Total_Successful Total_Skipped  Total_Error time_diff

woodcock · ‎09-09-2017

There is no way that this is doing what you think it should; in order to have appendcols work, you must take great pains to ensure that both datasets have identical keys with no gaps in the exact same order. You would be better off showing us sample events from both datasets and explaining exactly what you need to do, including a mockup of the final desired results.

Sukisen1981 · ‎09-10-2017

Hi
As stated above you can not use appencols if you dont have at least one common field with common field values as keys for both rows.
For example, if you use appendclos with a common field say name, then ALL the values for name should be present in both the main search and sub search,
If name is main search has John, Marry, Suki and Arsene
And the sub search returns values with John, Marry and Arsene append cols will match the first 2 common values and put the value of Arsene against Suki in the out table.
What you need is join - http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Join
refer to the link for the various join types, it will resolve your issue

woodcock · ‎09-10-2017

No, using join subjects you to a low an inescapable subsearch limit, so do everything possible to avoid using it. See here:

Who: Nick Mealy, Sideview Apps
What: Best practices around grouping and aggregating data from different search results
Recording: https://splunk.webex.com/splunk/lsr.php?RCID=051cba772c9cce79c3eb3718b466233a
Slides: http://sideviewapps.com/slides/2017_05_02_sideview_let_stats_sort_them_out.pptx (Updated several times since the talk and since conf)
The version from the first virtual conf is http://sideviewapps.com/slides/2016_04_15_nick_mealy_grouping_talk.pptx
flowchart (in progress) : http://sideviewapps.com/slides/grouping_flowchart_INPROGRESS.png

cmerriman · ‎09-06-2017

when you run just your subsearch, do you get time_diff and date?

as a housekeeping question, why are you putting first_event and last_event in human readable time? you aren't actually using those two evals in your subsearch, since you need them in epoch to eval time_diff.

have you tried using fields instead of table?

appendcols doesn't join any of the fields together, so if you need EXECUTION_DATE joined with date, you might what to use join, also, if your subsearch has 5 results and your base search has 1000, there won't be a time_diff and date for every line of your base search, just the first 5.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Appendcols

gcusello · ‎09-06-2017

Hi smuderasi,
which subsearch are you speaking of? there are two subsearches.

Anyway your first subsearch is used to filter events resulting from the main search, so it doesn't add any field to results.

If you're speaking of appendcols subsearch, remember that this command append to all rows of main search fields of first row of the subsearch.
if you want to join main search results with subsearch results use join command.

Bye.
Giuseppe

subsearch fields do not appear in the table command..

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor