Splunk Search

Matching events that happen within the micro-seconds apart

dondky
Path Finder

Hello guys,

I'm working on monitoring our mssql error logs and running into a probably simple issue but I'm stumped.

I'm searching for something like this:

sourcetype=mssql:errorlog error_id=17806

Which returns this:

2016-10-03 14:30:37.77 Logon       Error: 17806, Severity: 20, State: 14.

However there is one event just above it that looks like this:

    2016-10-03 14:30:37.77 Logon       SSPI handshake failed with error code 0x8009030c, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The Windows error code indicates the 

Which gives us the error message: Notice the time stamp of "2016-10-03 14:30:37.77"

So I came up with this:

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N") | search [ search sourcetype=mssql:errorlog error_id=17806 | eval event_time=_time | convert timeformat="%F %H:%M:%S.%2N" ctime(event_time) | fields event_time]

Logic here is we do a sub search for the _time value against all events. Which works, but now we are getting:

2016-10-03 14:30:37.77 Logon       Login failed. The login is from an untrusted domain and cannot be used with Windows authentication. [CLIENT: 192.168.112.94]
2016-10-03 14:30:37.77 Logon       Error: 18452, Severity: 14, State: 1.

2016-10-03 14:30:37.77 Logon       SSPI handshake failed with error code 0x8009030c, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The Windows error code indicates the cause of failure. The logon attempt failed   [CLIENT: 192.168.112.94]
2016-10-03 14:30:37.77 Logon       Error: 17806, Severity: 20, State: 14.

Which returns additonal events with the time stamp "2016-10-03 14:30:37.77", how else might we achieve this to accurately match the single error_id with the prior event without returning additional events in the same time?

Thanks

Tags (1)
0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

Give this a try

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N") | search [ search sourcetype=mssql:errorlog error_id=17806 | eval event_time=_time | convert timeformat="%F %H:%M:%S.%2N" ctime(event_time) | fields event_time] | reverse | streamstats current=f window=1 values(error_id) as prev | where error_id=17806 OR prev=17806

View solution in original post

0 Karma

tomasmoser
Contributor

Hi,

You can use code below to "sort" events in more granular way to overcome default timestamp ingranularity (no milliseconds). Verified, it works.

Field "_cd" is an internal field in a format of :. Each event belows to a bucket and has offset (position) in raw data file)

sourcetype="mssql:errorlog"
| rex field=_cd "(?<bucket>\d+):(?<address>\d+)"
| sort _time, address

Output example:

2018-11-25 14:03:19.**19** Server      Microsoft SQL Server 2016 (SP1-CU3) (KB4019916) - 13.0.4435.0 (X64) 
    Apr 27 2017 17:36:12 
    Copyright (c) Microsoft Corporation
    Standard Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)
2018-11-25 14:03:19.19 Server      UTC adjustment: 1:00
2018-11-25 14:03:19.19 Server      (c) Microsoft Corporation.
2018-11-25 14:03:19.19 Server      All rights reserved.
2018-11-25 14:03:19.19 Server      Server process ID is 2460.
2018-11-25 14:03:19.19 Server      System Manufacturer: 'VMware, Inc.', System Model: 'VMware Virtual Platform'.
2018-11-25 14:03:19.19 Server      Authentication mode is MIXED.
2018-11-25 14:03:19.19 Server      Logging SQL Server messages in file 'E:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\ERRORLOG'.
2018-11-25 14:03:19.19 Server      The service account is 'EZCORP\Y1017783'. This is an informational message; no user action is required.
2018-11-25 14:03:19.19 Server      Registry startup parameters: 
     -d E:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\master.mdf
     -e E:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\ERRORLOG
     -l E:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\mastlog.ldf
2018-11-25 14:03:19.**19** Server      Command Line Startup Parameters:
     -s "MSSQLSERVER"

--tm

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Give this a try

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N") | search [ search sourcetype=mssql:errorlog error_id=17806 | eval event_time=_time | convert timeformat="%F %H:%M:%S.%2N" ctime(event_time) | fields event_time] | reverse | streamstats current=f window=1 values(error_id) as prev | where error_id=17806 OR prev=17806
0 Karma

dondky
Path Finder

Thanks for the fast response, however, not sure if this will work, I mean it gets the job done but what happens when we aren't searching for a specific error_id but filtering on severity_level instead:

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N") | search [ search sourcetype=mssql:errorlog severity_level >= 16 | eval event_time=_time | convert timeformat="%F %H:%M:%S.%2N" ctime(event_time) | fields event_time]

Plus not to keen on having to redefine error_id twice. I'll continue poking at this.

0 Karma

dondky
Path Finder

Here is an example when I run based on severity levels >= 16.

2016-10-04 09:57:17.48 spid41s     An error occurred in a Service Broker/Database Mirroring transport connection endpoint, Error: 8474, State: 11. (Near endpoint role: Target, far endpoint address: '')

2016-10-04 09:57:17.48 spid41s     Error: 9642, Severity: 16, State: 3.

2016-10-04 09:57:16.48 Logon       Length specified in network packet payload did not match number of bytes read; the connection has been closed. Please contact the vendor of the client library. [CLIENT: 192.168.97.39]

2016-10-04 09:57:16.48 Logon       Error: 17836, Severity: 20, State: 17.

2016-10-04 05:09:00.98 spid45s     An error occurred in a Service Broker/Database Mirroring transport connection endpoint, Error: 8474, State: 11. (Near endpoint role: Target, far endpoint address: '')
2016-10-04 05:09:00.98 spid45s     Error: 9642, Severity: 16, State: 3.

2016-10-04 05:08:59.97 Logon       Length specified in network packet payload did not match number of bytes read; the connection has been closed. Please contact the vendor of the client library. [CLIENT: 192.168.97.225]

2016-10-04 05:08:59.97 Logon       Error: 17836, Severity: 20, State: 17.

I'm wondering if transaction can help.

0 Karma

somesoni2
SplunkTrust
SplunkTrust

I think the autoregress command might help here. Give this a try

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N")  | reverse 
| autoregress PutYourFieldHere as prev | where PutYourFieldHere=YourFilterCondition OR prev=YourFilterCondition

So for your send example of Severity>16, it'll be like this

sourcetype=mssql:errorlog | eval event_time=strftime(_time,"%F %H:%M:%S.%2N") | reverse 
| autoregress Severity as prev | where Severity>=16 OR prev>=16
0 Karma

dondky
Path Finder

Thx, I'll re-test this. I think I've been overthinking it severely and maybe just alerting on multiple matches is ok. However I'll poke at these autogress commands prior.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...