Getting Data In

Change Index and Sourcetype

jpcontrerasadit
Explorer

I have a set of data where I want to send events with a 404 error code to a different index. In addition, after processing the records, I want to set a final, different sourcetype. Neither is working. Please advise...

props.conf:

[weblogs]
SHOULD_LINEMERGE = false
LINE_BREAKER = (&&&)(?=\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
category = Custom
pulldown_type = true
TRANSFORMS-1 = notfound
TRANSFORMS-2 = setsourcetype
disabled = false

transforms.conf:

[notfound]
REGEX = 404
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex

[setsourcetype]
SOURCE_KEY = _raw
REGEX = ^.
DEST_KEY = Metadata:Sourcetype
FORMAT = sourcetype::access_combined

dindu
Contributor

For line breaking, use the regex (&&&).
In props.conf, set MAX_EVENTS to 40000
and TRUNCATE to 20000 (check the maximum event length using the len function and adjust).

Create a new index named notfoundindex (Settings --> Indexes).
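
If you would rather define the index in a config file than in the UI, a minimal sketch of the indexes.conf stanza could look like the following. The three paths assume the default $SPLUNK_DB layout on a single standalone indexer, which is an assumption and not something stated in this thread.

```ini
# indexes.conf (sketch; paths assume the default $SPLUNK_DB layout)
[notfoundindex]
homePath   = $SPLUNK_DB/notfoundindex/db
coldPath   = $SPLUNK_DB/notfoundindex/colddb
thawedPath = $SPLUNK_DB/notfoundindex/thaweddb
```

The index needs to exist before the transform routes events to it; events sent to an index that does not exist are typically dropped.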

props.conf
[weblogs]
LINE_BREAKER = (&&&)
MAX_EVENTS = 40000
TRUNCATE = 20000

TRANSFORMS-01-notfound = notfound
TRANSFORMS-02-setsourcetype = setsourcetype

transforms.conf
[notfound]
REGEX = .*404.*
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex

[setsourcetype]
SOURCE_KEY = _raw
REGEX = .*
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::access_combined

avoelk
Communicator

First, one minor change to the REGEX for the 404 status events. If you use REGEX = .*404.*, it also picks up events that have, for example, a status of 200 with 404 appearing somewhere later in the line. To prevent this, you could use this REGEX instead:

"\s404\s

That way it only matches when 404 is the first number after a quotation mark and a blank space.
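
Dropped into the transform, that tighter pattern might look like the sketch below. It assumes the events follow the access_combined layout, where the status code comes right after the closing quote of the request string; the stanza and index names are the ones from the question.

```ini
# transforms.conf (sketch)
[notfound]
# Match 404 only when it directly follows a quotation mark and a space
REGEX = "\s404\s
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex
```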

Also, if you try to change the index AND the sourcetype for one input, you might run into problems: Splunk could potentially apply the new sourcetype first and only then try to route events to the new index using the regex above. By that point the events are already sourcetype=access_combined and no longer weblogs, so neither transform works, or only one of them does.

The solution is as follows: in props.conf, your stanza shouldn't address the sourcetype "weblogs" but rather the source from which your data originates:

[source::access_combined_no_breaks.log]

This way it doesn't matter what happens to your data first, because the source always stays the same.
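
Put together, a source-based props.conf stanza might look like this sketch. The line-breaking and TRANSFORMS settings are copied from the question; the comment about a wildcard prefix is an assumption about how the file is monitored, since the full source path is not given in this thread.

```ini
# props.conf (sketch)
# Keyed on the source rather than the sourcetype. If the source is a full
# path, the pattern probably needs a wildcard prefix, for example
# [source::.../access_combined_no_breaks.log]
[source::access_combined_no_breaks.log]
SHOULD_LINEMERGE = false
LINE_BREAKER = (&&&)(?=\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
TRANSFORMS-1 = notfound
TRANSFORMS-2 = setsourcetype
```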

 

Hope this helps anyone who runs into the same problem. If so, please consider a thumbs up 🙂


navidnaddimulla
New Member

Hi @jpcontrerasaditum - I am also trying to process a weblog with nearly 36k events and exactly the same requirements, which are:

  1. line break at &&&, then
  2. send 404 status code events to notfoundindex, and
  3. reassign all the events to the access_combined sourcetype.

But it doesn't seem to work with the entire log file. So I tried with only 10 events and was able to achieve 1, but not 2 and 3. I get the following error:

truncating at 10000 bytes because size exceeded splunk with a line length >= 15512

I tried TRUNCATE = 50000 and TRUNCATE = 0, but that makes Splunk unresponsive.

So were you able to resolve the issue? I'd appreciate it if you could help.


Ayn
Legend

Your index rewrite transform looks OK to me, but you've made a typo in the sourcetype-changing section: it should be MetaData:Sourcetype (with a capital D), not Metadata:Sourcetype.

To make things easier to debug you could/should also combine the TRANSFORMS statements into one so you can see more clearly which order they're applied in.

TRANSFORMS-changestuff = notfound, setsourcetype
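
With both changes applied to the configuration from the question, the result might look like the sketch below. Everything except the combined TRANSFORMS line and the corrected DEST_KEY is copied from the original post; treat it as a starting point under those assumptions, not a tested configuration.

```ini
# props.conf (sketch)
[weblogs]
SHOULD_LINEMERGE = false
LINE_BREAKER = (&&&)(?=\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
NO_BINARY_CHECK = true
# One TRANSFORMS line; the listed transforms are applied left to right
TRANSFORMS-changestuff = notfound, setsourcetype

# transforms.conf (sketch)
[notfound]
REGEX = 404
DEST_KEY = _MetaData:Index
FORMAT = notfoundindex

[setsourcetype]
SOURCE_KEY = _raw
REGEX = ^.
# Capital D in MetaData, per the typo noted above
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::access_combined
```

These are index-time settings, so they belong on the instance that parses the data (indexer or heavy forwarder), need a restart to take effect, and only apply to data indexed afterwards.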

navidnaddimulla
New Member

Hi @Ayn - I am also trying to process a weblog with nearly 36k events and exactly the same requirements, which are:

  1. line break at &&&, then
  2. send 404 status code events to notfoundindex, and
  3. reassign all the events to the access_combined sourcetype.

But it doesn't seem to work with the entire log file. So I tried with only 10 events and was able to achieve 1, but not 2 and 3. I get the following error:

truncating at 10000 bytes because size exceeded splunk with a line length >= 15512

I tried TRUNCATE = 50000 and TRUNCATE = 0, but that makes Splunk unresponsive.

So were you able to resolve the issue? I'd appreciate it if you could help.
