Our application requires access to raw events on light forwarders to do some custom processing before (or at the same time as) the events are passed to the central indexer. Is there any way to tee the event stream on the forwarder so that it goes to two destinations: the Splunk indexer AND our separate processor? If so, is it possible to get some kind of ID that is (or will be) assigned to the current event, so that our database can keep a reference to the original event as it is added to the database?
It is possible to forward raw data to third-party servers by configuring $SPLUNK_FORWARDER_HOME/etc/system/local/outputs.conf on the forwarder instance in this manner:
[tcpout]
[tcpout:fastlane]
server = 10.1.1.35:6996
sendCookedData = false
The last line makes the forwarder send raw (uncooked) data, so the events arrive without Splunk's own framing and metadata, such as additional timestamps prepended to each event.
Source: http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Forwarddatatothird-partysystemsd
Note: $SPLUNK_FORWARDER_HOME is /opt/splunkforwarder for a typical Linux installation.
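To tee the stream as the question asks, the usual approach is to list more than one target group in defaultGroup, which makes the forwarder clone every event to each group. The following is a minimal sketch of that, not a tested configuration: the group name indexers and the address 10.1.1.20:9997 are assumptions made up for illustration, while the fastlane group is the one shown above.

```ini
# outputs.conf on the forwarder (sketch; adjust names and addresses)

[tcpout]
# Listing two target groups clones each event to both of them.
defaultGroup = indexers, fastlane

# Hypothetical group for the central Splunk indexer (address is an assumption).
[tcpout:indexers]
server = 10.1.1.20:9997

# Third-party processor from the example above; receives raw, uncooked data.
[tcpout:fastlane]
server = 10.1.1.35:6996
sendCookedData = false
```

The indexer group keeps the default cooked format, so only the copy sent to the separate processor is raw.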
The light and universal forwarders do not parse events.
If you want to use Splunk to process the events, use a heavy forwarder with props and transforms.
Beware: data parsed on the heavy forwarder is not parsed a second time by the indexer, so make sure that your heavy forwarder has all the rules that you would otherwise apply on the indexers.
See http://docs.splunk.com/Documentation/Splunk/5.0/Deploy/Deployaforwarder
About your double processing (in Splunk and in your database):
You can send a copy of some events to a third-party system as raw data or as syslog.
See http://docs.splunk.com/Documentation/Splunk/5.0.1/Deploy/Forwarddatatothird-partysystemsd
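Sending only some events to the third-party system is typically done on a heavy forwarder by setting the _TCP_ROUTING key in a transform. The sketch below shows the general shape under several assumptions: the sourcetype my_sourcetype, the transform name route_copy_to_fastlane, and the regex ERROR are hypothetical, and outputs.conf is assumed to define target groups named indexers (a made-up name) and fastlane, with indexers as the defaultGroup.

```ini
# props.conf on the heavy forwarder (sketch)
# Apply the routing transform to one hypothetical sourcetype.
[my_sourcetype]
TRANSFORMS-routing = route_copy_to_fastlane

# transforms.conf on the heavy forwarder (sketch)
# Events matching the regex go to both groups; all other events
# follow defaultGroup from outputs.conf.
[route_copy_to_fastlane]
REGEX = ERROR
DEST_KEY = _TCP_ROUTING
FORMAT = indexers,fastlane
```

Because the transform overrides the default routing for matching events, both groups are listed in FORMAT so the indexer still receives them.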
As for a unique ID to identify the same event in Splunk and in the third-party system: you have to add the unique ID to your events yourself (probably in the original log, before indexing).