Getting Data In

With a Splunk forwarder, what is the best way to control the format of incoming data?

oxthon
New Member

Hello everyone,

I hope you are fine.

I have a question about indexing data in Splunk, and in particular about validating the content of the data.

My setup is a distributed deployment: an indexer with a forwarder.

I receive data from a remote mount, as CSV files.

An example of structure:
date, host, ipv4, ipv6, dns, nb_packet, size, ....

line 125: ipv4=12.32.45.255 => valid, so it can go to index=right
line 356: ipv4= 42.hello!.84.125 => invalid, so it should go to index=error, quickly please 🙂

I would like to validate the content of the data, for example to check that the ipv4 field has a valid format.
Is it possible to control the format of each field's value in transforms.conf or props.conf?

Today, I validate the CSV files with a Python script (pandas).
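
At its core it is just a per-field format test; for the ipv4 column it is essentially this (a simplified sketch of the kind of check I run, not the real script):

import re

# Anchored IPv4 pattern: four octets, each 0-255.
IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$"
)

print(bool(IPV4_RE.match("12.32.45.255")))      # True  -> line 125 is fine
print(bool(IPV4_RE.match("42.hello!.84.125")))  # False -> line 356 should go to index=error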

Is Splunk able to do it?

If you have an example with a CSV with two or three fields, I'm interested.

I thank you a thousand times.

Oxthon.


MuS
Legend

Hi oxthon,

Well, looking at this in a purely technical way it is of course possible to do this; whether it makes sense is another question ¯\_(ツ)_/¯ It is most likely better to follow @ddrillic's advice and make sure those events do not get into Splunk in the first place.

But back to your question: you can use a props & transforms setup to check whether an event contains a valid IP and put it into index=right; anything with an invalid IP will go into index=error. This setup can be based on source, sourcetype, or host.

Try something like this:

props.conf

[your sourcetype name here]
TRANSFORMS-000-sourcetypeName-routing-based-on-ip = 001-sourcetypeName-routing, 000-default-errorRouting

transforms.conf

[000-default-errorRouting]
# Matches a key=value pair whose value starts with whitespace and contains junk
# characters (e.g. "ipv4= 42.hello!.84.125 "); such events are routed to index=error.
REGEX = =\s[\d\w\.!]+\s
DEST_KEY = _MetaData:Index
FORMAT = error

[001-sourcetypeName-routing]
# Matches a well-formed dotted-quad IP (e.g. "12.32.45.255"); such events go to index=right.
REGEX = (?:\d{1,3}\.){3}\d{1,3}
DEST_KEY = _MetaData:Index
FORMAT = right

These settings must go onto the parsing Splunk instance (heavy forwarder or indexer), and you need to restart that instance.
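
If you want to sanity-check the two REGEX values against the sample lines from your question before restarting, a quick offline test outside Splunk could look like this (just a sketch; it assumes that within one TRANSFORMS class the transforms run in the listed order, so a later match overrides the index set by an earlier one):

import re

# The two patterns from transforms.conf above.
error_re    = re.compile(r"=\s[\d\w\.!]+\s")          # 000-default-errorRouting
valid_ip_re = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")  # 001-sourcetypeName-routing

events = [
    "date,host1,ipv4=12.32.45.255 ,...",       # line 125: well-formed IP
    "date,host2,ipv4= 42.hello!.84.125 ,...",  # line 356: malformed IP
]

for event in events:
    # 000-default-errorRouting is listed last, so it wins whenever it matches.
    if error_re.search(event):
        index = "error"
    elif valid_ip_re.search(event):
        index = "right"
    else:
        index = "default"
    print(index, "<-", event)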

Hope this helps ...

cheers, MuS

ddrillic
Ultra Champion

-- control the content of the data.

It's not part of the product. Actually, the product prides itself on letting any data through, so I would place safeguards before the data is ingested, meaning, before it reaches Splunk.
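
For example, since you already run a Python/pandas check, that safeguard could split each CSV into a clean file (the only one the forwarder monitors) and an error file for review. A rough sketch, where the paths, the column name, and the IPv4 test are only assumptions:

import pandas as pd

# Anchored IPv4 pattern: four octets, each 0-255.
IPV4_PATTERN = r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$"

def split_csv(src, clean_path, error_path):
    """Write rows with a valid ipv4 field to clean_path, the rest to error_path."""
    df = pd.read_csv(src)
    ok = df["ipv4"].astype(str).str.strip().str.match(IPV4_PATTERN)
    df[ok].to_csv(clean_path, index=False)     # directory monitored by the forwarder
    df[~ok].to_csv(error_path, index=False)    # quarantined rows, never reach Splunk

split_csv("/mnt/remote/export.csv",
          "/opt/splunk_inputs/clean/export.csv",
          "/opt/splunk_inputs/errors/export.csv")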
