Getting Data In

Log pre-processing

faustf
Communicator

Hi guys,
I defined my sourcetype as follows (in props.conf):

[anomalies]
DATETIME_CONFIG =
FIELD_NAMES = COL1, COL2, TIMESTAMP, COL4, COL5, KPI_ID, COL7, COL8, COL9, COL10, COL11, COL12, COL13, ALARM
INDEXED_EXTRACTIONS = csv
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = AAAA
pulldown_type = 1
disabled = false
FIELD_DELIMITER = ,
TIME_PREFIX = .*?,.*?,
MAX_TIMESTAMP_LOOKAHEAD = 10
TZ = UTC

and my log file is this:

1,2,1411261200000,4,5,6,7,8,9,10,[11],12,13,[ALARM]
1,2,1411261200000,4,5,6,7,8,9,10,[11],12,13,[ALARM]

My problem is that I need to replace every [ character with "[ and every ] character with ]" (i.e. wrap the bracketed values in double quotes).
I need this pre-processing because my log file also contains some lines in the following format:

1,2,1411261200000,4,5,6,7,8,9,10,[11,111,1111],12,13,[ALARM]

The field [11,111,1111] is my problem, because Splunk splits this field into 3 different fields:
[11
111
1111]
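
In other words, I'd like each line to end up like this, so that the CSV parser treats the bracketed list as a single (quoted) field:

1,2,1411261200000,4,5,6,7,8,9,10,"[11,111,1111]",12,13,"[ALARM]"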

How can I solve this problem?

Thank you!

1 Solution

gvmorley
Contributor

Hi,

A different approach would be to do the extractions at search time, using a combination of props.conf and transforms.conf.

I did a quick test with this in my props.conf:

[test-csv]
REPORT-fields = sourcetype-test-csv

You'd want to remove all of the FIELD_NAMES, INDEXED_EXTRACTIONS and FIELD_DELIMITER stuff.

And this in my transforms.conf:

[sourcetype-test-csv]
REGEX = ([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),\[([^]]+)\],([^,]+),([^,]+),\[([^]]+)\]
FORMAT = COL1::$1 COL2::$2 COL3::$3 COL4::$4 COL5::$5 COL6::$6 COL7::$7 COL8::$8 COL9::$9 COL10::$10 COL11::$11 COL12::$12 COL13::$13 COL14::$14

This looks like it would give you what you're looking for?
[screenshot of the resulting field extractions]

Just make sure that the REGEX you use in transforms.conf is appropriate for the format of your data. It was a rough and ready example, so may need refining!
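
If you want to check the regex against live events before committing it to transforms.conf, one option is to try it inline with the rex command (just a sketch; the capture-group names mirror the FORMAT line above, and the index/sourcetype are the names used in this thread, so adjust to yours):

index=anomalies sourcetype=test-csv
| rex "^(?<COL1>[^,]+),(?<COL2>[^,]+),(?<COL3>[^,]+),(?<COL4>[^,]+),(?<COL5>[^,]+),(?<COL6>[^,]+),(?<COL7>[^,]+),(?<COL8>[^,]+),(?<COL9>[^,]+),(?<COL10>[^,]+),\[(?<COL11>[^\]]+)\],(?<COL12>[^,]+),(?<COL13>[^,]+),\[(?<COL14>[^\]]+)\]"
| table COL3 COL11 COL14

Once it looks right there, the same expression (with the named groups swapped for the $n references) goes into transforms.conf.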

Hope that helps to get you closer to what you need.
(Oh and change the name of the 14th field to be ALARM if you want it to be the same as your example)


faustf
Communicator

Thanks.
This seems to be what I need, but I don't know why it isn't working in my environment.

I have a Splunk Enterprise server where I created an index called "anomalies".
I also have a second node where I installed the Splunk Universal Forwarder.

The configuration of the Splunk universal forwarder is the following:

File: /opt/splunkforwarder/etc/apps/search/local/inputs.conf

[monitor:///tmp/my.log]
disabled = false
index = anomalies
sourcetype = test-csv

File: /opt/splunkforwarder/etc/system/local/props.conf

[test-csv]
DATETIME_CONFIG = CURRENT
REPORT-fields = sourcetype-test-csv

File: /opt/splunkforwarder/etc/system/local/transforms.conf

[sourcetype-test-csv]
REGEX = ([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),\[([^]]+)\],([^,]+),([^,]+),\[([^]]+)\]
FORMAT = COL1::$1 COL2::$2 COL3::$3 COL4::$4 COL5::$5 COL6::$6 COL7::$7 COL8::$8 COL9::$9 COL10::$10 COL11::$11 COL12::$12 COL13::$13 COL14::$14

File: /tmp/my.log

1,2,1411261200000,4,5,6,7,8,9,10,[11],12,13,[ALARM]
1,2,1411261200000,4,5,6,7,8,9,10,[11],12,13,[ALARM]
1,2,1411261200000,4,5,GCD244,7,8,9,10,[11,12,13,[ALARM]
1,2,1411261200000,4,5,6,7,8,9,10,[11],12,13,[ALARM]
1,2,1411261200000,4,5,6,7,8,9,10,[11,111,1111],12,13,[ALARM]

Splunk Web search:

[screenshot of the Splunk Web search results]


gvmorley
Contributor

I believe that the props.conf and transforms.conf files need to be on your Search Head. Which, if you have just one Splunk Enterprise server, is the same as the Indexer.

My understanding is that the Universal Forwarder doesn't do any 'processing'. You'd need to use a Heavy Forwarder for that.

But as these props and transforms apply after indexing (at search time), your Search Head (Splunk Enterprise server) is the place for them.
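
To make that concrete, here's roughly how I'd lay it out (a sketch based on the paths you posted, assuming your single Splunk Enterprise server acts as both Indexer and Search Head):

On the Universal Forwarder, keep just the input:

File: /opt/splunkforwarder/etc/apps/search/local/inputs.conf

[monitor:///tmp/my.log]
disabled = false
index = anomalies
sourcetype = test-csv

On the Splunk Enterprise server:

File: $SPLUNK_HOME/etc/system/local/props.conf

[test-csv]
REPORT-fields = sourcetype-test-csv

File: $SPLUNK_HOME/etc/system/local/transforms.conf

[sourcetype-test-csv]
(the same REGEX and FORMAT lines as in my answer above)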


faustf
Communicator

You are right.
I moved the props.conf and transforms.conf to the Search Head. Perfect!!

But I don't understand why the first version of my props.conf was working (it was on the Universal Forwarder). This confused me.

Thank you again!!!


gvmorley
Contributor

No problem.

Just to round off the 'why was props.conf working on the forwarder'...

From looking at the manual here:
http://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf

If you look at the section on 'Structured Data Header Extraction and configuration', there's this:

"This feature and all of its settings apply at input time, when data is first read by Splunk. The setting is used on a Splunk system that has configured inputs acquiring the data."

So, as they apply at 'input' time, they get used by the forwarder, as it's the one with the configured input.

At least that's my understanding!
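
Put another way, for the sourcetypes in this thread the settings would split roughly like this (my sketch, so treat it as a guide rather than gospel):

Input-time settings, which live wherever the input is configured (the Universal Forwarder here):

[anomalies]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,
FIELD_NAMES = COL1, COL2, TIMESTAMP, COL4, COL5, KPI_ID, COL7, COL8, COL9, COL10, COL11, COL12, COL13, ALARM

Search-time settings, which live on the Search Head:

[test-csv]
REPORT-fields = sourcetype-test-csv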


gvmorley
Contributor

Also,

Always have a go with this stuff on a development system first!

I'd recommend running Splunk (with the Free License) on your laptop or PC, so that it's easy to test with.


faustf
Communicator

Of course!!!
Thanks


jkat54
SplunkTrust

Add this to your props.conf on your ingestion side (forwarders or indexers):

[sourceTypeName]
SEDCMD-AAAremoveSquareBrackets = s/\[|\]//g

This is to say: for every [ or ] character, replace it with nothing, and do so globally (the g flag makes the substitution apply to every bracket on the line, not just the first).

We put AAA in front of the class name because SEDCMDs are processed in ASCII order of precedence, and AAA is most likely the highest priority in your environment.
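
If you want to preview the substitution before deploying it, you can run a sample line through GNU sed (a rough sketch; the -E flag enables the same [|] alternation that the SEDCMD expression uses):

echo '1,2,1411261200000,4,5,6,7,8,9,10,[11,111,1111],12,13,[ALARM]' | sed -E 's/\[|\]//g'

which prints:

1,2,1411261200000,4,5,6,7,8,9,10,11,111,1111,12,13,ALARM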


jkat54
SplunkTrust

Well, that's not what your original question said, so my solution won't work for the updated version of your question.

They were broken into different fields because of the comma delimiter. It was my understanding that you just needed to remove the square brackets. Happy you found your solution either way!


faustf
Communicator

Thanks for your answer.

I need to convert the list in square brackets into a single element; otherwise the number of fields in the CSV depends on the number of elements in the bracketed list. I modified the props.conf in this way:

[anomalies]
....
SEDCMD-squares-open = s/\[/"[/g
SEDCMD-squares-close = s/\]/]"/g
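
On the sample line, those two substitutions produce this (each bracketed value wrapped in double quotes):

1,2,1411261200000,4,5,6,7,8,9,10,"[11,111,1111]",12,13,"[ALARM]"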

But even this way, the list is still split into several elements:

[screenshot of the search results]

As you can see in the first line of the image, the square brackets have been replaced with "[ and ]", but Splunk still splits these elements (COL11, COL12, COL13).
