Splunk Search

Can Splunk do this with a Search Query....?

lpolo
Motivator

I have a log that registers search queries with the following format:

  _time                     q       MAC
11/13/12 12:46:31.000 AM    CA      0000000B4BE8
11/13/12 12:46:31.200 AM    CARL        0000000B4BE8
11/13/12 12:46:31.250 AM    CARLIE      0000000B4BE8
11/13/12 12:46:31.255 AM    CARLIE B    0000000B4BE8
11/13/12 12:46:31.270 AM    CARLIE BR   0000000B4BE8
11/13/12 12:46:32.100 AM    CHAR        0000000B4BE8
11/13/12 12:46:32.150 AM    CHARL       0000000B4BE8
11/13/12 12:46:32.155 AM    CHARLI      0000000B4BE8
11/13/12 12:46:32.223 AM    CHARLIE     0000000B4BE8
11/13/12 12:46:32.354 AM    CHARLIE B   0000000B4BE8
11/13/12 12:46:33.400 AM    CHARLIE BR  0000000B4BE8

The result set I need should be:

MAC      count      q
0000000B4BE8 5     CARLIE BR 
0000000B4BE8 6     CHARLIE BR

The recommended query found in below answer is the following:

search construct|
| stats count by q, MAC
| eval len=length(q)
| eventstats max(len) AS longest by MAC| where len=longest
| table MAC q count

The query does not returned the distinct longest q. If this query is ran against the sample log presented before the result set is:

MAC      count    q
0000000B4BE8 6       CHARLIE BR

It should be:

MAC      count      q
0000000B4BE8 5     CARLIE BR 
0000000B4BE8 6     CHARLIE BR

Any ideas? How to fix it.

Thanks,
Lp

Tags (1)
0 Karma

sowings
Splunk Employee
Splunk Employee

Yes; you'll want to write a field extraction to either capture the "A" "AL" "ALF" "ALL" values as their own field, or include them as part of the "q" field, then you can | stats count by <field>.

If you capture it as q, the command would look like | stats count by q; if you kept the contents of q the same (right now, auto-KV extraction would extract it as "UP"), but created a new field (e.g. "otherfield") for the A- values, your command would be | stats count by q, otherfield

lpolo
Motivator

I clearly understand you. That is why I modified the original question to show an example.Thanks for your help.

0 Karma

sowings
Splunk Employee
Splunk Employee

In Splunk context, the word "distinct" is going to mean "values not exactly identical to", so from your main example CHAR and CHARL are distinct.

You seem to be asking "is this a substring of something else?" and that's a trickier question. I'll mull it over and see if I can come up with something.

0 Karma

lpolo
Motivator

I might have not laid out my question correctly. I just need the longest (length) distinct q by MAC. I modified the original question to illustrate the requirement.
Thanks for your help.

0 Karma

sowings
Splunk Employee
Splunk Employee

Now you're contradicting yourself; do you want

  • Longest query (length) by MAC -- this is the answer I've worked to provide, and is shown as the "but this is what I got" in your original question above (TYLER PERRY 11).
  • List of distinct queries by MAC -- this is what it *currently* sounds like you're asking for, though you had specifically requested "longest query" before....
  • Something else?
0 Karma

lpolo
Motivator

I fully tested your recommended query. I found that it fails identifying the longest distinct q. I updated the original question to show the problem.

0 Karma

sowings
Splunk Employee
Splunk Employee

If this helped, consider accepting it as your answer to the question....

0 Karma

sowings
Splunk Employee
Splunk Employee

Put the stats first, that should help overall search time.

0 Karma

lpolo
Motivator

It seems to work for a small data set but I am not sure for large data set. For large data sets it seems that is not doing its job. I have to fully test it. I will comment this question after I fully tested. For the time being it is a good progress.

0 Karma

sowings
Splunk Employee
Splunk Employee

Try


<your_query>
| stats count by q, MAC
| eval len=length(q)
| eventstats max(len) AS longest by MAC
| where len=longest
| table MAC q count

0 Karma

sowings
Splunk Employee
Splunk Employee

Search limitations? As in performance? No, I wouldn't think so. Eventstats can be a little heavy, but given what you're doing, it's necessary, so....

If you find that there are performance problems with the search, you could perform the stats count by q, MAC first, then do the eval, eventstats, where, etc, after.

Stats will reduce the overall number of rows by performing aggregation, and discarding detail (in this case, time stamps go away, but you get the count), so the next search commands have smaller data sets to work with, and are therefore a bit faster.

0 Karma

lpolo
Motivator

It is quite slow... Any way to improve the query?

0 Karma

lpolo
Motivator

I updated the question. It did the job. thanks I learned from your example. One more question. Should I expect any search limitations with this query?

0 Karma

sowings
Splunk Employee
Splunk Employee

If you're extracting q automatically by Splunk, you're using the automatic key=value extraction. Because of the whitespace in the q field, Splunk will see that as a separate word, and not as part of the value of "q". Therefore, yes you will need the rex command I've provided above.

0 Karma

lpolo
Motivator

I am already extracting q and MAC both are extracted automatically by Splunk. Do I need the rex command?

0 Karma

sowings
Splunk Employee
Splunk Employee

You asked for the longest q. The search string I've posted in the comment above (combined with the rex statement to capture all of the "q" field) works and produces the output you've asked for in the original question.

If you're having trouble with it, post the results you get from the above search, so that we can figure out what's going wrong.

0 Karma

lpolo
Motivator

It does not find the distinct searches. It just grabs the longest q. any idea?

0 Karma

sowings
Splunk Employee
Splunk Employee

Out of the box, Splunk will extract q having these values: UP SON HEL HELLO. I've had to use a regex on this sample to extract the full (with spaces) q field.

rex "q\=(?[\w\s]+)(?:\s+MAC=)"

Since you wanted it broken out by MAC, I also had to add this to my eventstats command, and the final stats call.


| eval len=length(q)
| eventstats max(len) AS longest by MAC
| where len=longest
| stats count by MAC, q
| table MAC q count
0 Karma

lpolo
Motivator

I updated the question. thanks for your help.

0 Karma

sowings
Splunk Employee
Splunk Employee

In every instance of your content sample, MAC is "1". Can you update that with actual sample data, and a clearer description of what you're after?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...