The problem has been solved.
At the same time as a number of other changes, some firewall rules were put in place around the Splunk server. By default, the WinEventLog:Security input queries AD to resolve SIDs in events (evt_resolve_ad_obj = 1), and it uses RPC ports to communicate with the AD servers. I have disabled this setting (evt_resolve_ad_obj = 0) and all event logs are now being indexed once again. There appears to be no issue with resolved usernames in the event logs.
I discovered this in splunkd.log with DEBUG logging turned on for WinEventLog*. Initially, the following entries appeared just as the Security log was beginning to be processed:
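For reference, the change looks roughly like this in inputs.conf. This is a minimal sketch and assumes the Security log is collected through the standard [WinEventLog://Security] stanza; which inputs.conf the stanza lives in (local or app directory) depends on your deployment, and any other settings already in the stanza should be left as they are.

```
[WinEventLog://Security]
# Do not contact a domain controller to resolve SIDs/GUIDs in events
evt_resolve_ad_obj = 0
```

A restart of Splunk on that server is typically needed for the change to take effect.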
WinEventLogChannel - EvtDC::bind: Found DC='\SERVER1.xyz.loc', DCsite='XYZ', ClientSite = 'XYZ', Domain='xyz.loc'
WinEventLogChannel - connectToDC: DsBind failed: (1722)'The operation completed successfully.'
WinEventLogChannel - init: Failed to bind to DC, dc_bind_time=21140 msec
Then every 21 seconds:
WinEventLogChannel - connectToDC: DsBind failed: (1722)'The operation completed successfully.'
WinEventLogChannel - WinEventLogChannel::translateSidLocally Translating sids locally...
I assume the events were still being indexed, just very slowly, so it appeared that Splunk would never finish indexing the Security log and move on to the other logs.
I have reviewed the firewall rules and need to allow the blocked RPC port (tcp/1026).