Getting Data In

Field extraction from filename with sourcetype csv during index time

tfechner
Path Finder

Hi,

I have several files on a server named like d-*_t-*.csv, e.g. d-edu_t-names.csv.
Each is a normal CSV file with many columns.

We have a universal forwarder installed on this server.

My inputs.conf shows:

[monitor:///opt/log/d-*_t-*.csv]
sourcetype=csv
index=tmp
connection_host=dns

Now I need the d-(.+) and t-(.+) parts of the filename written into new fields in the index at index time. This is because many more servers log with the same mechanism.
Next, we need a special field extraction and naming for the CSV columns, so I would like to add some extractions in transforms.conf. But how can I do this when we use sourcetype=csv, which is a global stanza?

Which solution has the least impact on the indexer? And how do I put the server name into a field, rather than only the source (= filename)?


somesoni2
SplunkTrust

Transforms can be applied per source as well, so you can create one props.conf stanza for your source, [source::/opt/log/d-*_t-*.csv], and add your transforms under it. If you want index-time field extraction (read para #4 of the first section of this page before deciding between index time and search time), it has to be set up on your indexer; otherwise, set it up on the search head.
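
As a rough illustration of the source-based stanza described above, an index-time setup could look something like the sketch below. The stanza name src_d_t_fields and the field names d_name and t_name are made up for this example, and the regex assumes the filename always follows the d-<x>_t-<y>.csv pattern from the question. Treat it as a starting point to test, not a verified configuration.

```
# props.conf (on the indexer or heavy forwarder)
[source::/opt/log/d-*_t-*.csv]
# "srcfields" is an arbitrary class name; the value names a transforms.conf stanza
TRANSFORMS-srcfields = src_d_t_fields

# transforms.conf (same instance as props.conf above)
[src_d_t_fields]
# Run the regex against the source metadata rather than the raw event
SOURCE_KEY = MetaData:Source
# Capture the d- and t- parts of the filename, e.g. d-edu_t-names.csv
REGEX = d-([^/]+?)_t-([^/]+)\.csv$
# Hypothetical field names; index-time fields use the name::value form
FORMAT = d_name::$1 t_name::$2
# Required so the extracted fields are written to the index
WRITE_META = true

# fields.conf (on the search head)
# Tells search that these fields are indexed
[d_name]
INDEXED = true

[t_name]
INDEXED = true
```

With the example file d-edu_t-names.csv, this is intended to yield d_name=edu and t_name=names on every event from that file.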


tfechner
Path Finder

OK, I did some investigating and the extraction could be done at search time.
I found a solution to extract fields from file names, but is it better to use the props.conf or the transforms.conf extraction method?

In transforms.conf I would use:
SOURCE_KEY = source
REGEX = d-(.+)_t-(.+)\.csv
FORMAT = field1::$1 field2::$2

or in props.conf:
EXTRACT-sourcefields = d-(?<field1>.+)_t-(?<field2>.+)\.csv in source
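
For comparison, the transforms.conf variant only takes effect at search time once a props.conf stanza references it through a REPORT- setting. A minimal sketch of that wiring follows. The stanza name src_d_t_fields_search and the field names d_name and t_name are placeholders invented for this example.

```
# props.conf (on the search head)
[source::/opt/log/d-*_t-*.csv]
# "srcfields" is an arbitrary class name; the value names a transforms.conf stanza
REPORT-srcfields = src_d_t_fields_search

# transforms.conf (on the search head)
[src_d_t_fields_search]
# SOURCE_KEY as posted above; a later reply in this thread
# suggests MetaData:Source instead
SOURCE_KEY = source
# The dot before csv is escaped so it matches a literal "."
REGEX = d-([^/]+?)_t-([^/]+)\.csv$
# Hypothetical field names in name::value form
FORMAT = d_name::$1 t_name::$2
```

The inline EXTRACT form in props.conf keeps everything in one file, while the REPORT form lets several source or sourcetype stanzas reuse the same transform.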


niketn
Legend

@tfechner it is a matter of deciding whether you want field extraction at index time (which puts more load on the indexer/heavy forwarder) or at search time (which puts the load on the search head). Since the field is extracted from source, it is essentially metadata, so I would say you can use transforms.conf together with props.conf for index-time field extraction. But do check whether your regex puts too much load on indexing and slows the indexing rate.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

tfechner
Path Finder

I do not need index-time fields any more; I got the missing information, so search time is enough. I will start with props/transforms. One correction: the source key should be "SOURCE_KEY = MetaData:Source".
