Splunk Search

How can I get rid of thousands of automatically created sourcetypes?

markgo
Engager

I've had the misfortune of feeding 30K input files of Amazon S3 CloudFront logs into my live Splunk instance without specifying a sourcetype.

This has created a serious problem: the bizarre headers that Amazon likes to use have produced thousands of automatically created variants of sourcetype-too-small (note that the REAL data does not cause this issue).

As a result, performance has slowed to a crawl.

I've deleted the "bad" events, but is there something I can do about the bad automatically created sourcetypes?

As to why I didn't notice this: it didn't become a problem until the number of sourcetypes grew to a prodigious value. And since my searches excluded the bad events, I never noticed the sourcetypes.

MuS
Legend

Hi markgo

I recently fixed a similar problem by adding this to my props.conf and transforms.conf:

**props.conf**
[default]
TRANSFORMS-meta = fix_auto_source

**transforms.conf**
[fix_auto_source]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Source
REGEX = ^(/.*|.:.*)
FORMAT = source::splunktcp://25000

This rewrites every automatically created source that matches the regex to splunktcp://25000.

Hope this helps a bit, and don't forget to change the regex to match your pattern.
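Before changing the regex, it can help to check which values the current pattern would catch. The following is only a rough sketch in Python, not something Splunk runs: it assumes the pattern is applied to the bare source string, and the sample sources and the `would_rewrite` helper are made up for illustration. Splunk uses PCRE, but for a pattern this simple Python's `re` module should behave the same way.

```python
import re

# Regex copied from the [fix_auto_source] stanza above.
SOURCE_REGEX = re.compile(r"^(/.*|.:.*)")

def would_rewrite(source):
    """Return True if the pattern matches, i.e. the source would be rewritten."""
    return SOURCE_REGEX.search(source) is not None

# Hypothetical source values, for illustration only.
samples = [
    "/var/log/cloudfront/example.log",   # Unix-style absolute path
    "C:\\logs\\cloudfront\\example.log",  # Windows-style drive path
    "splunktcp://25000",                  # already the target value
    "udp:514",                            # network input
]

for source in samples:
    print(source, "->", "rewrite" if would_rewrite(source) else "keep")
```

The first alternative catches anything starting with `/`, the second anything whose second character is `:` (such as a drive letter), so network inputs like the last two samples are left alone.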

regards
