Extract fields in JSON during index time

sarnagar
Contributor

Hi ,
I'm new to field extractions in Splunk and would appreciate any help on this.
I have JSON-format logs like the ones below:

alt text

I want source and tag to be fields, i.e. they should not appear in the events but as separate fields, the way default fields appear on the left-hand side of the UI. I also want the word "line:" to be removed, so basically only my line event should appear in Splunk. How can I achieve this?
I believe props.conf and transforms.conf should be the solution, but I don't know how to approach that. What should the regex in my transforms capture? I don't understand what my regex should do.

1 Solution

sdaniels
Splunk Employee

I don't believe you'll need any regex, according to what I'm seeing. Or perhaps I don't understand exactly what you want to display. I am using the latest Splunk 6.5, and this is what I get when I ingest your events and assign _json as the sourcetype. That is simply the raw event viewer.

alt text

The fields are being created properly. Look at the left side of the above screenshot to see the extracted fields. Then, as shown below, you can simply use the table command to display the data how you'd like to see it.

alt text

sarnagar
Contributor

Hi @sdaniels ,

I couldn't attach an image in the comments section, so I'm responding here in the answers section.
Thank you for the response. But I believe that doesn't solve our customer's requirement completely.
Like you said, I can use SEDCMD to remove the word "line:".
But I want only the parts highlighted below to appear in the _raw events. Is that possible? How do we achieve that?

alt text

When I perform the search, the values of source and tag should not appear in the _raw events; they should appear only as extracted fields.

I tried the below props and transforms but it doesn't seem to work. Could you please help?

PROPS

[httpevent]
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none
SHOULD_LINEMERGE=true
category=Structured
disabled=false
pulldown_type=true
TRANSFORMS-fields = field1,filed2

TRANSFORMS

[field1]
REGEX = (?:[^"\n]*"){7}(?P[^"]+)
FORMAT = source::$1

[field2]
REGEX = (?:[^,\n]*,){2}"\w+":"(?P[^"]+)
FORMAT = tag::$1

sdaniels
Splunk Employee

I'm not sure what you are trying to accomplish here. If you only want the part highlighted in yellow to appear in the raw message, you'd need to modify _raw and delete the rest using SEDCMD. The fields that appear on the lower left of the Search page are extracted from _raw. If you remove data from _raw, it's not available to create fields, so you wouldn't have fields for source and tag. Is there a security concern here? Is it about abstracting away complexity from the user? Why does your customer want it done this particular way?

sarnagar
Contributor

Hi @sdaniels,

Basically, we had earlier indexed the Dynatrace collector logs for monitoring, and these logs appeared in the normal format in Splunk.
Now these (in the images above) are the Dynatrace collectors running in Docker containers. Since the collectors were dockerized, their logs appear in JSON format. We are trying to see if we can make this JSON appear like the old, regular non-JSON collector logs. Is that possible?

sdaniels
Splunk Employee

You can use SEDCMD to replace your raw event with only the highlighted part. You lose source and tag because they would no longer be part of the _raw message. You could also run a script on the data before it comes into Splunk and represent the data however you'd like so that it matches the old format.
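
The pre-processing idea can be sketched in Python. This is only a sketch under assumptions: the function name flatten_event is hypothetical, the input is assumed to be one JSON object per line as in the raw data posted in this thread, and how the result is then forwarded to Splunk is left out.

```python
import json


def flatten_event(raw_json):
    """Split one Docker JSON log record into (event_text, metadata)."""
    record = json.loads(raw_json)
    # "line" holds the original collector message; the remaining keys
    # (source, tag) are returned separately so they are not lost.
    event_text = record.pop("line", "")
    return event_text, record


# One of the raw events posted in this thread.
sample = (
    '{"line":"[AUDIT ] CWWKZ0058I: Monitoring dropins for applications. ",'
    '"source":"stdout",'
    '"tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/'
    'Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}'
)

event_text, metadata = flatten_event(sample)
print(event_text)          # the bare log message, without the JSON wrapper
print(metadata["source"])  # stdout
```

The event text could then be written out in whatever layout the old collector logs used, with source and tag carried alongside it rather than discarded.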

sarnagar
Contributor

@sdaniels,

Thank you. But I don't want to lose the data from tag and source. Otherwise, is there a way to extract more fields from the tag values, like below?
ORIGINAL:
","tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"

MODIFIED:
container-image=itec-artifactory.fmr.com:6555/com.fmr.pl000123.ezpaas.ezpaas-dynatrace-collector:6.3-11,container-service=Dynatrace_Collector_DEV-WLP_WLP.7.3hvzd4e5b5zdby4blgu1v8rm8,container-id=5125046f7489
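
The mapping from the ORIGINAL tag layout to the MODIFIED one can be sketched as a split on the last two slashes. This is a minimal sketch, assuming every tag has the shape image/service/container-id as in the sample events; the function name split_tag is hypothetical.

```python
def split_tag(tag):
    """Split a Docker log tag into image, service and container id."""
    # The image name itself contains a slash (registry/image:version),
    # so split from the right and use only the last two separators.
    parts = tag.rsplit("/", 2)
    if len(parts) != 3:
        raise ValueError("unexpected tag layout: " + tag)
    image, service, container_id = parts
    return {
        "container-image": image,
        "container-service": service,
        "container-id": container_id,
    }


# Tag value taken from the ORIGINAL example in this post.
tag = (
    "itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/"
    "Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"
)
fields = split_tag(tag)
print(",".join(key + "=" + value for key, value in fields.items()))
```

Splitting from the right is a deliberate choice here: the registry host and port add an extra slash on the left, so counting separators from the left would cut the image name in half.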

sdaniels
Splunk Employee

I would suggest posting a new question with exactly what you want to do. The information above is nothing like the raw events you posted originally. If you post a few raw event examples and details on what you want now, I'm sure we can get you what you need.

sdaniels
Splunk Employee

My reply wasn't posting in the comments section, so I'm responding to your comment here.

Sure. Why do you want it out of the raw event if it doesn't affect searching and viewing the data the way you want? In props.conf you can use the SEDCMD setting:

http://docs.splunk.com/Documentation/Splunk/6.2.4/Data/Anonymizedatausingconfigurationfiles#Anonymiz...

The link above shows how to anonymize data using a sed script: match a pattern, replace it, and so on. In your case, you replace the match with nothing. If you do this, you may then have to create a regex to pull out the source and tag fields manually, though I'm not sure. Right now the _json sourcetype is taking care of that for you. Try it out. You can use Regexr.com to play with regex matching if you need to change anything.
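
As a starting point for such a regex, here is a minimal Python sketch that pulls source and tag out of one of the raw events posted in this thread. The pattern names and the helper function are hypothetical, the patterns assume the values contain no escaped quotes, and Python's re module is only a stand-in for Splunk's regex engine, so the patterns may need adjusting there.

```python
import re

# Hypothetical patterns: each captures the quoted value that follows the key.
SOURCE_RE = re.compile(r'"source":"(?P<source>[^"]+)"')
TAG_RE = re.compile(r'"tag":"(?P<tag>[^"]+)"')


def extract_source_and_tag(raw):
    """Return (source, tag) from one raw JSON event, or None where absent."""
    source = SOURCE_RE.search(raw)
    tag = TAG_RE.search(raw)
    return (
        source.group("source") if source else None,
        tag.group("tag") if tag else None,
    )


# One of the raw events posted in this thread.
raw = (
    '{"line":"[AUDIT ] CWWKZ0058I: Monitoring dropins for applications. ",'
    '"source":"stdout",'
    '"tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/'
    'Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}'
)

source, tag = extract_source_and_tag(raw)
print(source)  # stdout
print(tag)
```

Anchoring on the key name rather than counting quotes or commas keeps the pattern from breaking when the line text itself contains commas or quoted words.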

Something like this in props.conf to remove source, and then something similar for tag:
SEDCMD-dumpsrc = s/,"source":"[^"]*"//g
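
To see what a sed-style substitution that deletes the source key and its value would do to an event, the same idea can be tried in Python first. This is only an approximation, assuming the source value contains no escaped quotes; Python's re.sub stands in for Splunk's sed processing, and the sample is one of the raw events from this thread.

```python
import json
import re

raw = (
    '{"line":"[AUDIT ] CWWKZ0058I: Monitoring dropins for applications. ",'
    '"source":"stdout",'
    '"tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/'
    'Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}'
)

# Python analogue of the sed expression s/,"source":"[^"]*"//g:
# drop the leading comma, the key and its quoted value in one go.
without_source = re.sub(r',"source":"[^"]*"', "", raw)

# The result should still be valid JSON, now without the source key.
remaining_keys = list(json.loads(without_source).keys())
print(without_source)
print(remaining_keys)  # ['line', 'tag']
```

Removing the leading comma together with the key is what keeps the remaining text well-formed JSON.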

sarnagar
Contributor

Hi @sdaniels ,

I did that. But I don't want source and tag to be displayed in the events. They should appear only as fields on the left side.
Is that possible?

sdaniels
Splunk Employee

Responded below. thanks

sarnagar
Contributor

RAW DATA:

{"line":"[ERROR ] CWWKS9660E: The orb element with the defaultOrb id attribute requires a user registry but no user registry became available within 10 seconds. As a result, no application will start. Ensure that you have configured an appropriate user registry in the server.xml file.","source":"stderr","tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}
{"line":"[AUDIT ] CWWKS4104A: LTPA keys created in 1.184 seconds. LTPA key file: /opt/ibm/wlp/output/defaultServer/resources/security/ltpa.keys","source":"stdout","tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}
{"line":"[AUDIT ] CWWKZ0058I: Monitoring dropins for applications. ","source":"stdout","tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}
{"line":"[ERROR ] CWWKG0074E: Unable to update the configuration for jndiReferenceEntry with the unique identifier customDataSourceFactoryEntry because of the exception: The value jdbc/actionateDB for attribute jndiName is not unique.","source":"stderr","tag":"itec-artifactory.fmr.com:6555/com.fmr.pl000123.demo.actionate:0.0.1-14/Actionate_DEV_ACTIONATE.1.385y3873nb5k4m7xsmwxokgum/92e6e10df174"}

sarnagar
Contributor

Could someone kindly help me write the regex for source and tag? I'm finding it very difficult to frame since this is new to me.
