Solved: Using where with a wildcard

HattrickNZ · ‎11-27-2017

I want to dynamically remove a number of columns/headers from my stats.
So my thinking is to use a wild card on the left of the comparison operator.
But this does not work

... | where "P-CSCF*">4

Sample Output: Say for example I just wanted to remove the columns P-CSCF-02 & P-CSCF-06 and have P-CSCF-05 and P-CSCF-07 showing. I would like to do it with something like ... | where "P-CSCF*">1 for example. I could use fields + option but that is static I want something dynamic.

_time                       P-CSCF-02   P-CSCF-05   P-CSCF-06   P-CSCF-07
2017-11-27T18:30:00.000+1300    1   2   0   6
2017-11-27T18:35:00.000+1300    0   2   0   6
2017-11-27T18:40:00.000+1300    0   2   0   6

EDIT1
This is my Desired Output with only 2 columns showing, the others omitted because they do not have any values > 1:

_time                         P-CSCF-05 P-CSCF-07
2017-11-27T18:30:00.000+1300    2   6
2017-11-27T18:35:00.000+1300    2   6
2017-11-27T18:40:00.000+1300    2   6

Should I be using something like ... | WHERE like(source,"/logs/%/camel-audit.log")]

Not what I want but this is wildcarding on the RHS of the comparison operator ...|search version=*10_2*

damien_chillet · ‎11-28-2017

You could try append the following to your time chart table:

| untable _time field count
| eventstats avg(count) as field_avg by field
| where field_avg > 1
| xyseries _time field count

That would remove all the field with an average lesser than 1. You can adjust depending on how you want to filter (use max, sum, min, etc.. instead of avg for example)

If you explain in more details when you want to remove a column, I could give you a more complete answer.

View solution in original post

MuS · ‎11-28-2017

So, looking at all the examples and after having a night rest over it, I gave it another try and came up with a bit a different approach which builds a dynamic list of field names and uses a threshold to match on the value. Here is the run everywhere example:

index=_internal date_minute=* 
| stats count by date_minute 
| eval header{date_minute}=date_minute 
| where 
    [ search index=_internal date_minute=* 
    | stats count by date_minute 
    | eval header{date_minute}=date_minute 
    | eval threshold=15, connector=" OR " 
    | foreach header* 
        [ eval total=if(isnull(total), "<<FIELD>>" .">=". threshold, total ."". connector ." ". "<<FIELD>>" .">=". threshold) ] 
    | table total 
    | rename total AS search ] 
| stats count by date_minute

The important part is the foreach() in the subsearch which basically builds a list like this:

| where header0>=15 OR header1>=15 OR header10>=15 OR header11>=15 OR header12>=15 OR header13>=15 OR header14>=15 OR header15>=15 OR header16>=15 OR header17>=15 OR header18>=15 OR header19>=15 OR header2>=15 OR header20>=15 OR header21>=15 OR header22>=15 OR header23>=15 OR header24>=15 OR header25>=15 OR header26>=15 OR header27>=15 OR header28>=15 OR header29>=15 OR header3>=15 OR header30>=15 OR header31>=15 OR header32>=15 OR header33>=15 OR header34>=15 OR header35>=15 OR header36>=15 OR header37>=15 OR header38>=15 OR header39>=15 OR header4>=15 OR header40>=15 OR header41>=15 OR header42>=15 OR header43>=15 OR header44>=15 OR header45>=15 OR header46>=15 OR header47>=15 OR header48>=15 OR header49>=15 OR header5>=15 OR header50>=15 OR header51>=15 OR header52>=15 OR header53>=15 OR header54>=15 OR header55>=15 OR header56>=15 OR header57>=15 OR header58>=15 OR header59>=15 OR header6>=15 OR header7>=15 OR header8>=15 OR header9>=15

The threshold value would be 1 in your case and you have to tweak the example of course, but it will get you a dynamic search approach and something to start with.

cheers, MuS

hettervik · ‎11-28-2017

How about something like this?

| foreach P-CSCF-* [ eval number=mvindex(split('<<FIELD>>', "-"), 2)]
| where number > 4

HattrickNZ · ‎11-28-2017

tks for this attempt. it looks valuable.
Although I want to filter by the value in the fields e.g. 1,2,0,6, not the actual field name e.g. P-CSCF-02.

hope that explains it. have edited my Q with my desired output.

hettervik · ‎11-28-2017

Yes, that's what the "split" is for. It splits the string (field name) into three parts, P, CSCF, and a number, and puts them into a multivalue field. The "mvindex" command takes the second indexed value in that multivalue field (i.e. the number), which you then can filter on.

damien_chillet · ‎11-28-2017

You could try append the following to your time chart table:

| untable _time field count
| eventstats avg(count) as field_avg by field
| where field_avg > 1
| xyseries _time field count

That would remove all the field with an average lesser than 1. You can adjust depending on how you want to filter (use max, sum, min, etc.. instead of avg for example)

If you explain in more details when you want to remove a column, I could give you a more complete answer.

somesoni2 · ‎11-28-2017

I don't think we need the eventstats here. Just the untable command and where filter on field count should do it.

HattrickNZ · ‎11-28-2017

tks this is it. and as above don't think I need the eventstats.
But the time axis seems to change as the below pics hopefully show. Any way I can keep it the same

before
[IMG]http://i63.tinypic.com/iw82vd.png[/IMG]

after
[IMG]http://i66.tinypic.com/35laogl.png[/IMG]

HattrickNZ · ‎11-28-2017

after a 2nd look, you need the eventstats

somesoni2 · ‎11-28-2017

Instead of xyseries command, use | timechart span=5m values(count) by field to resolve that x-axis issue.

HattrickNZ · ‎11-28-2017

tks, I did not try your way but I found this worked. probs very similar
untable _time field count | eventstats avg(count) as field_avg by field | where field_avg > 1 | xyseries _time field count | timechart max(*) as *

niketn · ‎11-27-2017

@HattrickNZ, do you need a generalized version of the following command (based on which fields are present)?

| where P-CSCF-02>4  OR  P-CSCF-05>4 OR P-CSCF-06>4 OR P-CSCF-07>4

Or

| where P-CSCF-02>4  AND  P-CSCF-05>4 AND P-CSCF-06>4 AND P-CSCF-07>4

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

HattrickNZ · ‎11-28-2017

I have edited my Q with the desired output if that helps.

Of the options you propose, the OR option might be a solution, but the AND option I am not sure what that would achieve.

MuS · ‎11-27-2017

Hi HattrickNZ,

after checking the other answer, I must admin I don't understand how it works and thought of a different approach to make it dynamic. Take this run everywhere search:

index=_internal date_minute=* 
| stats count by date_minute 
| eval header{date_minute}=date_minute 
| foreach header* 
    [ eval match=if(date_minute >= "30", "yes", "no")] 
| table date_minute match

The first 3 lines just make up dummy data and the foreach eval actually would be threshold to filter on later.

Hope this makes sense ...

cheers, MuS

HattrickNZ · ‎11-28-2017

tks for this attempt. it looks valuable.
Although I want to filter by the value in the fields e.g. 1,2,0,6, not the actual field name e.g. P-CSCF-02.

hope that explains it. have edited my Q with my desired output.

elliotproebstel · ‎11-27-2017

How about using rex to extract the value in question, and then you can use where to make the desired comparisons?

your base search | rex field=header_field "P-CSCF-(?<version_number>\d+)" | where version_number>1

MuS · ‎11-27-2017

hmm, looks like I don't understand how this works ?!? ¯\_(ツ)_/¯

elliotproebstel · ‎11-27-2017

Sure, I'm happy to explain my approach. As I understood the original post, @HattrickNZ would like to sort/filter the results based on the column headers, which all start with "P-CSCF-" and conclude with a number. My approach extracts that concluding number info a field called version_number, which makes it available for sorting, filtering, etc.

MuS · ‎11-28-2017

I still don't think this works, because your regex is using a specific field and the regex is on the value of that field not the field name itself. It could work if the field name P-CSCF-02 is actually in _raw, but we don't know for sure.
Don't take me wrong, my answer is also not really a good solution and just a workaround 😉

cheers, MuS

elliotproebstel · ‎11-28-2017

That's a fair point. I was assuming that the table being presented in the original post was the result of some stats commands, based on the poster saying, "I want to dynamically remove a number of columns/headers from my stats." If so, my approach would allow the data to be filtered along the search pipeline. If not, then you're right that this would not be a full solution.

HattrickNZ · ‎11-28-2017

tks for this attempt. it looks valuable.
Although I want to filter by the value in the fields e.g. 1,2,0,6, not the actual field name e.g. P-CSCF-02.

hope that explains it. have edited my Q with my desired output.

Using where with a wildcard

Introducing Splunk Enterprise 9.2

Adoption of RUM and APM at Splunk

Routing logs with Splunk OTel Collector for Kubernetes