Filtering logs before indexing

vrmandadi
Builder

I have JSON data; sample events are below. I want to filter out the events whose eventName field has the value GetObject, i.e. eventName=GetObject.

sample event 1

{"awsRegion": "xx", "recipientAccountId": "1111111111", "responseElements": null, "eventVersion": "1.05", "userAgent": "aws-sdk-java/1.11.569 Linux/4.14.77-70.59.amzn1.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 groovy/2.4.15 vendor/Oracle_Corporation", "sourceIPAddress": "54.xx.xx.xxx", "eventID": "25ac22f1-510c-461d-9c3f-4ef9010e6754", "requestID": "adaca173-a9d9-11e9-b947-41a6f2f9bb94", "eventName": "GetObject", "eventType": "AwsApiCall", "requestParameters": {"maxRecords": 100}, "userIdentity": {"accessKeyId": "edgwrhrhrwhwr", "principalId": "AROAJQQOLVHH4PVA7NPZM:redlock", "type": "AssumedRole", "arn": "arn:aws:sts::583542881430:assumed-role/RedLockReadOnlyCyber/redlock", "sessionContext": {"attributes": {"mfaAuthenticated": "false", "creationDate": "2019-07-19T03:45:44Z"}, "sessionIssuer": {"principalId": "AROAJQQOLVHH4PVA7NPZM", "userName": "ddd", "accountId": "ddd", "type": "Role", "arn": "arn:aws:iam::533333333:role/lockr"}}, "accountId": "111111"}, "eventSource": "asss.com", "eventTime": "2019-07-19T03:59:59Z"}

sample event 2
{"awsRegion": "xx", "recipientAccountId": "1111111", "responseElements": null, "eventVersion": "1.05", "userAgent": "aws-sdk-java/1.11.569 Linux/4.14.77-70.59.amzn1.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 groovy/2.4.15 vendor/Oracle_Corporation", "sourceIPAddress": "11.xx.xx.xxxx", "eventID": "25f2b67b-802d-4736-b218-6044ce605ed9", "requestID": "ad7e8c91-a9d9-11e9-b947-41a6f2f9bb94", "eventName": "DescribeAutoScalingGroups", "eventType": "AwsApiCall", "requestParameters": {"maxRecords": 100}, "userIdentity": {"accessKeyId": "ASXXXXXXXXXXXXXXX", "principalId": "AROAJQQSFGWEGEG:redlock", "type": "AssumedRole", "arn": "arn:aws:sts::123425413513:assumed-role/R/redl", "sessionContext": {"attributes": {"mfaAuthenticated": "false", "creationDate": "2019-07-19T03:45:44Z"}, "sessionIssuer": {"principalId": "AROEGEDG", "userName": "RedLockReadOnlyCyber", "accountId": "12222222", "type": "Role", "arn": "arn:aws:iam::22222222:role/OnlyCyber"}}, "accountId": "23333333"}, "eventSource": "autoscaling.amazonaws.com", "eventTime": "2019-07-19T03:59:59Z"}

Thanks in advance

1 Solution

woodcock
Esteemed Legend

On your Indexers (or HFs if you use them), do this:

In props.conf:

[<Your sourcetype here>]
TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject

In transforms.conf:

[drop_eventname_getobject]
REGEX = ,\s*"eventName":\s*"GetObject",
DEST_KEY=queue
FORMAT=nullQueue

vrmandadi
Builder

Thank you @woodcock, it worked. What exactly is the syntax that your REGEX uses? You used \s*, which matches whitespace characters, but in the data there is no space in , "eventName": "GetObject", . You also used a comma "," followed by an escape sequence, and finally you used "eventName": and "GetObject" as literal text.

Is this a different kind of regex syntax? Can you please explain how it works? Also, if we want to filter on another event name, should I just add another REGEX line in transforms.conf, or how do we add multiple filters?

woodcock
Esteemed Legend

I like to code for obvious/predictable variants so that, as much as possible, my RegEx is future-proof. It is true that this means that my RegEx is not quite as efficient as it optimally could be, but I believe that the fault-tolerance is worth it. Most people will not agree with me.
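
To make the pieces of that RegEx concrete, here is the same stanza with each part annotated. This is only a sketch of how it reads; the comments are explanation added for illustration, not new settings. Splunk evaluates REGEX as a PCRE-style regular expression, so this is ordinary regex syntax and not something Splunk-specific.

```ini
# transforms.conf (same stanza as in the solution, annotated)
[drop_eventname_getobject]
# ,              a literal comma; a comma has no special meaning in a regex, so no escaping is needed
# \s*            zero or more whitespace characters, so it matches with or without a space
# "eventName":   literal text
# \s*            zero or more whitespace characters again
# "GetObject",   literal text, including the trailing comma
REGEX = ,\s*"eventName":\s*"GetObject",
# Matching events are sent to the nullQueue, i.e. discarded before indexing
DEST_KEY = queue
FORMAT = nullQueue
```

In the sample events there is in fact one space after the preceding comma and one after the colon, and \s* covers both that layout and a compact one with no spaces. One caveat of this pattern: because it expects a comma on both sides, it would not match if "eventName" happened to be the first or last key in the event.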

vrmandadi
Builder

Thank you. How do I add another event name to that REGEX; is there a syntax for it? And can the line TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject be used for multiple source types if they have the same kind of data, or do we need to create another one?

woodcock
Esteemed Legend

Yes, any stanza in transforms.conf may be referenced multiple times from various stanzas in props.conf. That is the whole idea.
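
As for adding another event name, one possible approach is regex alternation inside the existing stanza. The sketch below assumes you also want to drop DescribeAutoScalingGroups (the event name from sample event 2, used here purely as an example), and "<Your other sourcetype here>" is a placeholder for a second sourcetype of your own.

```ini
# transforms.conf: one stanza that drops either event name
[drop_eventname_getobject]
# (?:A|B) is a non-capturing group that matches A or B
REGEX = ,\s*"eventName":\s*"(?:GetObject|DescribeAutoScalingGroups)",
DEST_KEY = queue
FORMAT = nullQueue

# props.conf: the same transform referenced from two sourcetypes
[<Your sourcetype here>]
TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject

[<Your other sourcetype here>]
TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject
```

Adding a second REGEX line to the same stanza should not be expected to work, since a stanza holds one value per setting. If you would rather keep one stanza per event name, the TRANSFORMS- line accepts a comma-separated list of stanza names. With the alternation approach the stanza name no longer describes everything it drops, so you may want to rename it in both files.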
