What is the nature of the data that is causing you to exceed your indexing volume, and how is it arriving in Splunk?
One option is simply not to index certain events, if you know which ones you'd like to exclude. You can do this by specifying a regex that matches them and routing the matching events to the nullQueue.
See the docs below for how to do this:
http://docs.splunk.com/Documentation/Splunk/5.0/Deploy/Routeandfilterdatad
A 100% effective, although unconventional, way to ensure that you never go over your indexing limit is to throttle how fast the indexer can run.
$SPLUNK/etc/system/local/limits.conf:
[thruput]
maxKBps =
To figure out what the value of maxKBps should be, divide the daily license cap (1 GB: 1073741824 bytes) by 86400 (seconds in a day) to get your maximum rate: about 12427 bytes/sec, or roughly 12 KB per second. This doesn't sound like much, and it isn't for a single second, but if Splunk indexes steadily all day long, you'll get close to your limit without going over it.
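As a worked example of that arithmetic, here is roughly what the stanza would look like for a 1 GB/day license. This is only a sketch: the value 12 assumes the 1 GB cap and rounds the result down to stay under it, so substitute the number for your own license.

```ini
# $SPLUNK/etc/system/local/limits.conf
# Assumed cap: 1 GB/day = 1073741824 bytes
# 1073741824 / 86400 = ~12427 bytes/sec = ~12.1 KB/sec, rounded down
[thruput]
maxKBps = 12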
For the filtering option, route the events you don't need indexed to the nullQueue. These events are discarded before indexing and do not count against your license.
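A minimal sketch of that routing, along the lines of the linked docs, might look like the following. The source path, the stanza name setnull, and the regex are placeholders picked for illustration; replace them with whatever identifies the events you want to drop.

```ini
# props.conf
# Hypothetical source; use your own source, sourcetype, or host stanza
[source::/var/log/messages]
TRANSFORMS-null = setnull

# transforms.conf
# Any event from that source matching REGEX is sent to the nullQueue
[setnull]
REGEX = sshd
DEST_KEY = queue
FORMAT = nullQueue
```

These settings are applied at parse time, so they belong on the indexer (or a heavy forwarder) rather than on a universal forwarder, and Splunk needs a restart to pick them up.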