filtering logs before indexing

vrmandadi
Builder

I have JSON data; sample events are below. I want to filter out the events whose eventName field has the value GetObject, i.e. eventName=GetObject.

sample event 1

{"awsRegion": "xx", "recipientAccountId": "1111111111", "responseElements": null, "eventVersion": "1.05", "userAgent": "aws-sdk-java/1.11.569 Linux/4.14.77-70.59.amzn1.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 groovy/2.4.15 vendor/Oracle_Corporation", "sourceIPAddress": "54.xx.xx.xxx", "eventID": "25ac22f1-510c-461d-9c3f-4ef9010e6754", "requestID": "adaca173-a9d9-11e9-b947-41a6f2f9bb94", "eventName": "GetObject", "eventType": "AwsApiCall", "requestParameters": {"maxRecords": 100}, "userIdentity": {"accessKeyId": "edgwrhrhrwhwr", "principalId": "AROAJQQOLVHH4PVA7NPZM:redlock", "type": "AssumedRole", "arn": "arn:aws:sts::583542881430:assumed-role/RedLockReadOnlyCyber/redlock", "sessionContext": {"attributes": {"mfaAuthenticated": "false", "creationDate": "2019-07-19T03:45:44Z"}, "sessionIssuer": {"principalId": "AROAJQQOLVHH4PVA7NPZM", "userName": "ddd", "accountId": "ddd", "type": "Role", "arn": "arn:aws:iam::533333333:role/lockr"}}, "accountId": "111111"}, "eventSource": "asss.com", "eventTime": "2019-07-19T03:59:59Z"}

sample event 2
{"awsRegion": "xx", "recipientAccountId": "1111111", "responseElements": null, "eventVersion": "1.05", "userAgent": "aws-sdk-java/1.11.569 Linux/4.14.77-70.59.amzn1.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 groovy/2.4.15 vendor/Oracle_Corporation", "sourceIPAddress": "11.xx.xx.xxxx", "eventID": "25f2b67b-802d-4736-b218-6044ce605ed9", "requestID": "ad7e8c91-a9d9-11e9-b947-41a6f2f9bb94", "eventName": "DescribeAutoScalingGroups", "eventType": "AwsApiCall", "requestParameters": {"maxRecords": 100}, "userIdentity": {"accessKeyId": "ASXXXXXXXXXXXXXXX", "principalId": "AROAJQQSFGWEGEG:redlock", "type": "AssumedRole", "arn": "arn:aws:sts::123425413513:assumed-role/R/redl", "sessionContext": {"attributes": {"mfaAuthenticated": "false", "creationDate": "2019-07-19T03:45:44Z"}, "sessionIssuer": {"principalId": "AROEGEDG", "userName": "RedLockReadOnlyCyber", "accountId": "12222222", "type": "Role", "arn": "arn:aws:iam::22222222:role/OnlyCyber"}}, "accountId": "23333333"}, "eventSource": "autoscaling.amazonaws.com", "eventTime": "2019-07-19T03:59:59Z"}

Thanks in advance


woodcock
Esteemed Legend

On your Indexers (or HFs if you use them), do this:

In props.conf:

[<Your sourcetype here>]
TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject

In transforms.conf:

[drop_eventname_getobject]
REGEX = ,\s*"eventName":\s*"GetObject",
DEST_KEY=queue
FORMAT=nullQueue


vrmandadi
Builder

Thank you @woodcock, it worked. What exactly is the syntax that this REGEX uses? You used \s*, which matches whitespace characters, but in the data there is no space in , "eventName": "GetObject", and you used a comma "," escaped with /, and finally you used "eventName": and "GetObject" as literal text.

Is this a different kind of syntax? Can you please explain how it works? Also, if we want to add another event name, should I just add another REGEX line in transforms.conf, or how do I add multiple filters?


woodcock
Esteemed Legend

I like to code for obvious/predictable variants so that, as much as possible, my RegEx is future-proof. It is true that this means that my RegEx is not quite as efficient as it optimally could be, but I believe that the fault-tolerance is worth it. Most people will not agree with me.
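For readers following along, the pattern can be read piece by piece. The sketch below repeats the transforms.conf stanza from the accepted answer with explanatory comments added. The comments are an illustration added here, and they assume the regex is evaluated as PCRE against the raw event text, which is how Splunk index-time transforms normally behave.

```
[drop_eventname_getobject]
# ,              literal comma that ends the previous JSON key/value pair
# \s*            zero or more whitespace characters, so "," and ", " both match
# "eventName":   the key name and the colon, matched literally
# \s*            optional whitespace after the colon
# "GetObject",   the value, matched literally, plus the comma that ends the pair
REGEX = ,\s*"eventName":\s*"GetObject",
# Events matching REGEX are routed to nullQueue, i.e. dropped before indexing
DEST_KEY=queue
FORMAT=nullQueue
```

In sample event 1 the raw text has a space after the comma and another after the colon, so each \s* matches exactly one space. Because \s* also matches zero characters, the same pattern would still work if a sender emitted compact JSON with no spaces. Note that the pattern expects a comma on both sides, so it would presumably not match if eventName were the first or last key in the object.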


vrmandadi
Builder

Thank you. How do I add another event name to that REGEX? Is there a syntax for that? And can the setting TRANSFORMS-drop_eventname_getobject = drop_eventname_getobject be used for multiple source types if they have the same kind of data, or do we need to create another one?


woodcock
Esteemed Legend

Yes, any stanza in transforms.conf may be referenced multiple times from various stanzas in props.conf. That is the whole idea.
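As a rough sketch of both points (reusing one transform from several sourcetypes, and dropping more than one event name), something like the following should work. The stanza name drop_eventname_multiple and the class name TRANSFORMS-drop_eventname are hypothetical, the bracketed sourcetype names are placeholders, and DescribeAutoScalingGroups is used as the second event name only because it appears in sample event 2.

```
# props.conf (replace the bracketed placeholders with your own sourcetypes)
[<first sourcetype>]
TRANSFORMS-drop_eventname = drop_eventname_multiple

[<second sourcetype>]
TRANSFORMS-drop_eventname = drop_eventname_multiple

# transforms.conf
[drop_eventname_multiple]
# (?:A|B) is a non-capturing alternation: the value may be either name
REGEX = ,\s*"eventName":\s*"(?:GetObject|DescribeAutoScalingGroups)",
DEST_KEY = queue
FORMAT = nullQueue
```

Further event names can be added inside the parentheses, separated by |. An alternative is to keep one transforms.conf stanza per event name and list them comma-separated on the TRANSFORMS- line in props.conf. Adding a second REGEX line to the same stanza is not expected to work, since a stanza takes a single REGEX setting.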
