This is related to an earlier question: http://answers.splunk.com/questions/490/why-do-variations-in-sourcetype-appear
This question is slightly different, however: the issue is not in IIS logs, but in an external CSV report we monitor that contains a header of field names.
I need all my sourcetypes to be set the same so they do not have to be referenced as "ORAExtendedOrderHistory*". From what I understood from that previous question, Splunk renames the sourcetype because it is trying to store the field names from the header, but when I look at the list of available fields, these are not in the list anyway. Unless I misunderstood the point of this functionality, it looks like it's not working for me, so I'd like to override it.
Source log /opt/oracle/admin/reports/ExtendedOrderHistoryCacti_SUNDAY.csv:
field1, field2, field3, field4, field5, field6, field7,field8,field9, field10, field11, field12, field13,field14, field15, field16
13-MAR-2010 23:59,WEB,OS,0,2,0,2,43,42,43,23,22,23,20,20,20,14-MAR-10,13-MAR-10,14-MAR-10
14-MAR-2010 00:00,WEB,OS,1,1,0,2,69,47,91,29,27,30,41,17,64,14-MAR-10,14-MAR-10,14-MAR-10
...
Monitors in inputs.conf:
[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_MONDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_TUESDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_WEDNESDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_THURSDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_FRIDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_SATURDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false

[monitor:///opt/oracle/admin/reports/ExtendedOrderHistoryCacti_SUNDAY.csv]
sourcetype = ORAExtendedOrderHistory
disabled = false
It is doing this because there is a built-in rule that takes any file name ending in .csv
and forces the header check and sourcetype generation. You can override this by putting this in props.conf:
[source::.../ExtendedOrderHistory*.csv]
sourcetype = ORAExtendedOrderHistory
priority = 101
or some other appropriate source pattern. You can repeat this multiple times for multiple source patterns if needed.
Yup, that's done it thanks.
This needs to be set on the forwarder actually, not the indexer. It's one of the few props.conf settings that happens on the input side before parsing. See: http://www.splunk.com/wiki/Where_do_I_configure_my_Splunk_settings%3F
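As a rough sketch of the forwarder-side placement described above: the stanza is the one from the answer, while the file location is an assumption (a default install with local overrides under $SPLUNK_HOME/etc/system/local/; an app's local directory should work similarly). The forwarder will most likely need a restart before the override applies.

```ini
# Assumed location on the forwarder: $SPLUNK_HOME/etc/system/local/props.conf
# Stanza copied from the answer above; adjust the source pattern to your own paths.
[source::.../ExtendedOrderHistory*.csv]
sourcetype = ORAExtendedOrderHistory
priority = 101
```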
I added the following to props.conf on my indexer:
[source::.../OrderHistoryCacti_*.csv]
sourcetype = ORAOrderHistory
priority = 101
[source::.../ExtendedOrderHistoryCacti_*.csv]
sourcetype = ORAExtendedOrderHistory
priority = 102
After running "| extract reload=T" on my indexer, I am still seeing the sourcetypes as "ORAExtendedOrderHistory-7" as they were before the change. Should I expect the change to be reflected immediately? Or will it only happen once the file rolls and Splunk encounters a new one?
When you see a sourcetype like xxxxxx-1, Splunk is trying to learn the data. For some reason, it doesn't know how to handle the CSV files correctly.
I had to convert mine to a .txt file for it to be assigned the sourcetype that I specified.