I have been trying to prepare pie chart with proper stats on types of database errors. For some unknown reasons, I am seeing incorrect stats when I execute the following query. Here whatever query is the first main query, fetches me correct results. When I run the each individual query, the count is different from the below subsearched query. Please help
index="xyz" attrs.io.kubernetes.pod.namespace="abc" "ERRORCODE=-4499" earliest=-60d@d
| stats count as "Connection Refused"
| appendcols
[ search index="xyz" attrs.io.kubernetes.pod.namespace="abc" "ERRORCODE=-4222" earliest=-60d@d
| stats count as "Unknown Host" ]
| appendcols
[ search index="xyz" attrs.io.kubernetes.pod.namespace="abc" "ERRORCODE=-4470" earliest=-60d@d
| stats count as "Connection is closed" ]
| appendcols
[ search index="xyz" attrs.io.kubernetes.pod.namespace="abc" "ERRORCODE=-4214" earliest=-60d@d
| stats count as "Connection Authorization Failure" ]
| transpose 5
| rename column as "Database Error Type", "row 1" as "Count"
@snayani appendcols and several similar correlation commands are restricted by sub search limitations of bringing in specific number of events for correlation depending on your settings. Since you are pulling last 60 days data, I am expecting your results are getting dropped using appendcols. This would be evident if you notice more recent data to be pulled correctly, however older events getting dropped and even not being pulled at all.
Looking at your query you are repeating same base search so you do not need appendcol at all
. Stats command is a better choice for your use case. While there is no silver bullet
or fixed way to identify which correlation method to use, following document should help you deciding the same in most common use cases: http://docs.splunk.com/Documentation/Splunk/latest/Search/Abouteventcorrelation
index="xyz" attrs.io.kubernetes.pod.namespace="abc" ( "ERRORCODE=-4499" OR "ERRORCODE=-4222" OR "ERRORCODE=-4470" OR "ERRORCODE=-4214" ) earliest=-60d@d
| rex "ERRORCODE=(?<ERRORCODE>-\d+)"
| stats count by ERRORCODE
| eval ERRORCODE=case(ERRORCODE=-4499,"Connection Refused".ERRORCODE,
ERRORCODE=-4222,"Unknown Host".ERRORCODE,
ERRORCODE=-4470,"Connection is closed".ERRORCODE,
ERRORCODE=-4214,"Connection Authorization Failure".ERRORCODE,
true(),"Unknown Error".ERRORCODE)
If you intend to create Pie chart with above data, I am not sure why you would require a transpose as the above stats can do the same.
Also although I have | rex "ERRORCODE=(?<ERRORCODE>-\d+)"
to extract ERRORCODE, since in your original query you did not use the field name, I expect the same to be automatically extracted during Spkunk's Search Time Field Discovery. Provided KV_MODE
is set to auto
and not none
for the above data sourcetype in your props.conf
. So, if the ERRORCODE field exists after running you base search, and also if you are on 6.6. or higher you can directly your Field Value based filter in your base search i.e. ERRORCODE IN (-4499,-4222,-4470,-4214)
.
While the above should be starting point, since you are correlating 60 days of data, you should also consider some optimization techniques:
1) Use summary indexing, by accelerating scheduled report/dashboard or si
or collect
command depending on need. For example daily once for previous day or hourly once for previous hour.
2) Create Data Model for ERRORCODEs and accelerate Data Model. Leverage tstats command on accelerate Data Model instead of stats which would perform way faster.
Following is a run anywhere search which cooks up some dummy ERRORCODEs as per your question. Please try out and confirm!
| makeresults
| eval data="ERRORCODE=-4499;ERRORCODE=-4499;ERRORCODE=-4222;ERRORCODE=-4222;ERRORCODE=-4222;ERRORCODE=-4470;ERRORCODE=-4470;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214"
| makemv data delim=";"
| mvexpand data
| rename data as _raw
| rex "ERRORCODE=(?<ERRORCODE>-\d+)"
| stats count by ERRORCODE
| eval ERRORCODE=case(ERRORCODE=-4499,"Connection Refused".ERRORCODE,
ERRORCODE=-4222,"Unknown Host".ERRORCODE,
ERRORCODE=-4470,"Connection is closed".ERRORCODE,
ERRORCODE=-4214,"Connection Authorization Failure".ERRORCODE,
true(),"Unknown Error".ERRORCODE)
@snayani appendcols and several similar correlation commands are restricted by sub search limitations of bringing in specific number of events for correlation depending on your settings. Since you are pulling last 60 days data, I am expecting your results are getting dropped using appendcols. This would be evident if you notice more recent data to be pulled correctly, however older events getting dropped and even not being pulled at all.
Looking at your query you are repeating same base search so you do not need appendcol at all
. Stats command is a better choice for your use case. While there is no silver bullet
or fixed way to identify which correlation method to use, following document should help you deciding the same in most common use cases: http://docs.splunk.com/Documentation/Splunk/latest/Search/Abouteventcorrelation
index="xyz" attrs.io.kubernetes.pod.namespace="abc" ( "ERRORCODE=-4499" OR "ERRORCODE=-4222" OR "ERRORCODE=-4470" OR "ERRORCODE=-4214" ) earliest=-60d@d
| rex "ERRORCODE=(?<ERRORCODE>-\d+)"
| stats count by ERRORCODE
| eval ERRORCODE=case(ERRORCODE=-4499,"Connection Refused".ERRORCODE,
ERRORCODE=-4222,"Unknown Host".ERRORCODE,
ERRORCODE=-4470,"Connection is closed".ERRORCODE,
ERRORCODE=-4214,"Connection Authorization Failure".ERRORCODE,
true(),"Unknown Error".ERRORCODE)
If you intend to create Pie chart with above data, I am not sure why you would require a transpose as the above stats can do the same.
Also although I have | rex "ERRORCODE=(?<ERRORCODE>-\d+)"
to extract ERRORCODE, since in your original query you did not use the field name, I expect the same to be automatically extracted during Spkunk's Search Time Field Discovery. Provided KV_MODE
is set to auto
and not none
for the above data sourcetype in your props.conf
. So, if the ERRORCODE field exists after running you base search, and also if you are on 6.6. or higher you can directly your Field Value based filter in your base search i.e. ERRORCODE IN (-4499,-4222,-4470,-4214)
.
While the above should be starting point, since you are correlating 60 days of data, you should also consider some optimization techniques:
1) Use summary indexing, by accelerating scheduled report/dashboard or si
or collect
command depending on need. For example daily once for previous day or hourly once for previous hour.
2) Create Data Model for ERRORCODEs and accelerate Data Model. Leverage tstats command on accelerate Data Model instead of stats which would perform way faster.
Following is a run anywhere search which cooks up some dummy ERRORCODEs as per your question. Please try out and confirm!
| makeresults
| eval data="ERRORCODE=-4499;ERRORCODE=-4499;ERRORCODE=-4222;ERRORCODE=-4222;ERRORCODE=-4222;ERRORCODE=-4470;ERRORCODE=-4470;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214;ERRORCODE=-4214"
| makemv data delim=";"
| mvexpand data
| rename data as _raw
| rex "ERRORCODE=(?<ERRORCODE>-\d+)"
| stats count by ERRORCODE
| eval ERRORCODE=case(ERRORCODE=-4499,"Connection Refused".ERRORCODE,
ERRORCODE=-4222,"Unknown Host".ERRORCODE,
ERRORCODE=-4470,"Connection is closed".ERRORCODE,
ERRORCODE=-4214,"Connection Authorization Failure".ERRORCODE,
true(),"Unknown Error".ERRORCODE)
Hi @niketnilay,
I've tried the actual query(first one) and the dummy query(second one) which cooks up some dummy data. Both of them worked like charm!!. Excellent!! Thanks again, for all the knowledge you have shared.
Glad that it worked... Now pass on the knowledge 😉
Hi @niketnilay,
I really appreciate for taking time to respond and giving such a wonderful explaination. I will try your solutions and let you know. Thanks again!!