Getting Data In

How to Force Event Timezone Settings per Host

rbruno7
Explorer

Hi Guys,

We have built a small Splunk app to retrieve and index web usage info from multiple SQL databases. My Splunk version is 4.0.10.

We have proxies in multiple time zones that reports to the same SQL server and the only way to know what is the timezone of the event is to check where the reporting proxy is located.

Sample Data:

   DATE_TIME    RECORD_NUMBER   USER_NAME   WEBPROXY_IP DEST_IP SRC_IP  DEST_PORT   BYTES_SENT  BYTES_RECEIVED  URL_DOMAIN  FULL_URL
4/10/2010 11:00 543450000000000 user123 10.1.10.10  10.11.11.11 10.1.10.100 80  1000    2000    testsite.com    http://testsite.com/page1 
4/10/2010 11:01 123450000000000 user321 10.2.10.10  10.22.22.22 10.2.20.100 80  1000    2000    testsite.com    http://testsite.com/page1 
4/10/2010 11:02 433450000000000 user432 10.3.10.10  10.33.33.33 10.3.30.100 80  1000    2000    testsite.com    http://testsite.com/page1 

The WEBPROXY_IP is being forcibly set as the host for each event via props/transforms.

I saw the following example on the documentation and tried to replace the host name with the IP address of the source proxy (my host value) without success:

[http://www.splunk.com/base/Documentation/4.0.10/Admin/Applytimezoneoffsetstotimestamps][1]

Examples 
Events are coming to an indexer from New York City (in the US/Eastern timezone) and Mountain View, California (US/Pacific). To correctly handle the timestamps for these two sets of events, the props.conf for the indexer needs the timezone offset to be specified as US/Eastern and US/Pacific respectively. 
The first example sets the timezone offset of events from host names that match the regular expression nyc.* with the US/Eastern time zone. 

    [host::nyc*]    
    TZ = US/Eastern

Any thoughts / recommendations on how to force the timezones based on by host name (which is actually an IP address) on all past and future events?

Below are my configurations:

::::::::::::::
inputs.conf
::::::::::::::
 # Data coming from a SQL Database via scripted input every 60s
[script://$SPLUNK_HOME/etc/apps/webproxy/bin/webdb1.sh]
disabled = false
source = mssql
sourcetype = webproxy:webfilter
interval = 60   
index = idx_webproxy

 # Data coming from a SQL Database via scripted input every 60s
[script://$SPLUNK_HOME/etc/apps/webproxy/bin/webdb2.sh]
disabled = false
source = mssql
sourcetype = webproxy:webfilter
interval = 60
index = idx_webproxy


::::::::::::::
props.conf
::::::::::::::
[webproxy:webfilter]

AUTOKV=none
REPORT-webproxy_header = webproxy_header
REPORT-static_product_for_webproxy = static_product_for_webproxy
TRANSFORMS-force_host_for_webproxy = force_host_for_webproxy


 #My attempt to fix the issue
[host:: 10.1.10.10] 
TZ = US/Eastern     

[host:: 10.2.20.10]
TZ = US/Montain

[host:: 10.3.30.10]
TZ = US/Pacific


::::::::::::::
transforms.conf
::::::::::::::
[static_product_for_webproxy]
REGEX = (.)
FORMAT = vendor::webproxy product::webfilter

[force_host_for_webproxy]
REGEX =.*
SOURCE_KEY=MetaData:WEBPROXY_IP
DEST_KEY = MetaData:Host
FORMAT = host::$1

[webproxy_header]
DELIMS = "\t"
FIELDS = "DATE_TIME","RECORD_NUMBER","USER_ID","USER_NAME","WEBPROXY_IP","DEST_IP","SRC_IP","DEST_PORT","BYTES_SENT","BYTES_RECEIVED","URL_DOMAIN","FULL_URL"
Tags (1)
0 Karma
1 Solution

gkanapathy
Splunk Employee
Splunk Employee

A few problems with your configuraton:

  • I don't think you should have spaces before your host name in the host name stanza.
  • US/Montain should be US/Mountain.
  • DELIMS and FIELDS are only used at search time, so there is no field or key WEBPROXY_IP available at index time.
  • Most substantially, your host/MetaData:Host setting occurs after props.conf TZ setting, so it will never match. If you do in fact have mixed time zones in a single log from a single host, and no time zone specification in the event itself, this is a very hard problem to solve.

    While it might be possible to configure Splunk to work with it (I'm not sure it's actually possible, but if it was, you would need to send the data with a heavy forwarder or pass it through a heavy intermediate forwarder, modify the queue routing for the indexer's input, and it would need to be updated manually for DST changes.) it would be rather complex and probably introduce other problems. This sounds like something that you might want to request as an enhancement request from Splunk.

Update:

I realize now that you're getting the data via scripted input. It would be a lot easier than any Splunk-crazy configuration to just create multiple copies of the scripted input, each one selecting a different webproxy (or set of webproxies - modify the SQL select to add a WHERE WEBPROXY_IP IN ('1.2.3.4','5.6.7.8','9.0.1.2') clause) by time zone, and set the host value in the input stanza. Then the TZ setting in props will work.

That, or you maintain this in MSSQL by creating a SQL table with webproxy_ip mappings to timezones, and modifying the SQL select in your script to join on webproxy_ip, e.g., if you're doing:

SELECT DATE_TIME,RECORD_NUMBER,USER_NAME, WEBPROXY_IP,DEST_IP FROM MYTABLE

change it to

SELECT a.DATE_TIME as DATE_TIME, b.TZ as TZ, a.RECORD_NUMBER as RECORD_NUMBER, a.USER_NAME as USER_NAME, a.WEBPROXY_IP as WEBPROXY_IP,a.DEST_IP as DEST_IP, FROM MYTABLE a INNER JOIN proxytzmappingtable b on a.webproxy_ip = b.webproxyip

View solution in original post

gkanapathy
Splunk Employee
Splunk Employee

A few problems with your configuraton:

  • I don't think you should have spaces before your host name in the host name stanza.
  • US/Montain should be US/Mountain.
  • DELIMS and FIELDS are only used at search time, so there is no field or key WEBPROXY_IP available at index time.
  • Most substantially, your host/MetaData:Host setting occurs after props.conf TZ setting, so it will never match. If you do in fact have mixed time zones in a single log from a single host, and no time zone specification in the event itself, this is a very hard problem to solve.

    While it might be possible to configure Splunk to work with it (I'm not sure it's actually possible, but if it was, you would need to send the data with a heavy forwarder or pass it through a heavy intermediate forwarder, modify the queue routing for the indexer's input, and it would need to be updated manually for DST changes.) it would be rather complex and probably introduce other problems. This sounds like something that you might want to request as an enhancement request from Splunk.

Update:

I realize now that you're getting the data via scripted input. It would be a lot easier than any Splunk-crazy configuration to just create multiple copies of the scripted input, each one selecting a different webproxy (or set of webproxies - modify the SQL select to add a WHERE WEBPROXY_IP IN ('1.2.3.4','5.6.7.8','9.0.1.2') clause) by time zone, and set the host value in the input stanza. Then the TZ setting in props will work.

That, or you maintain this in MSSQL by creating a SQL table with webproxy_ip mappings to timezones, and modifying the SQL select in your script to join on webproxy_ip, e.g., if you're doing:

SELECT DATE_TIME,RECORD_NUMBER,USER_NAME, WEBPROXY_IP,DEST_IP FROM MYTABLE

change it to

SELECT a.DATE_TIME as DATE_TIME, b.TZ as TZ, a.RECORD_NUMBER as RECORD_NUMBER, a.USER_NAME as USER_NAME, a.WEBPROXY_IP as WEBPROXY_IP,a.DEST_IP as DEST_IP, FROM MYTABLE a INNER JOIN proxytzmappingtable b on a.webproxy_ip = b.webproxyip

gkanapathy
Splunk Employee
Splunk Employee

no, you have to use props.conf, with the "host" clause as you have above, and set "host" directly in the inputs.conf file for each script/stanza. this will also allow you to get rid of the host transform.

0 Karma

rbruno7
Explorer

Thanks Again, you have given my an idea. I could redo the scripts to filter agains the proxy sorce and have multiple scripts, specially because I don't have a high number of different sources (3 at the moment).

So what setting should I add to add timezone to the events by each source?
Would that be inside the inputs.conf ?

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

Oh never mind you are doing a scripted input. It might not be a bad idea to rewrite this to have the scripted input either look up and insert the TZ in the data, or to have multiple scripted inputs that got a subset of each table (by webproxy), each with a different host value in the input stanza

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

I think there is not an easy way within Splunk. How are you getting the data out of MSSQL? Would it be possible to divide each webproxy up into a separate file, or do a separate scripted input for each webproxy? Then you could set TZ by source::

0 Karma

rbruno7
Explorer

Thanks Gerald,
Nice catch, these were just typos when transcribing the config to this site. The real config had no spaces and Mountain was spelled correctly.

Thanks for the quick response.

Let me clarify, each host(WEBPROXY_IP) has only one timezone while each database server could contain events from multiple hosts and consequently multiple timezones.

With the above scenario is there any way around to set the timezone per host?

0 Karma
Get Updates on the Splunk Community!

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...