Splunk for BlueCoat app problem

laurensv
Path Finder

I'm currently sending BlueCoat logs in W3C ELFF format to Splunk. I've also installed the latest Splunk for Blue Coat app.

However, it seems that the log fields are not extracted correctly. None of the dashboard panels show the correct values. For instance, Top Websites shows "application/x-www-form-urlencoded;%20charset=utf-8" and "application/soap+msbin1" as the top 2 sites...

All logs have 39 fields which are separated by a space (" "). The fields are: date time time-taken c-ip sc-status s-action sc-bytes cs-bytes cs-method cs-uri-scheme cs-host cs-uri-path cs-uri-query cs-username s-hierarchy s-supplier-name cs(Content-Type) cs(User-Agent) sc-filter-result sc-filter-category x-virus-id s-ip s-sitename sc(Content-Encoding) x-bluecoat-release-version s-icap-info s-icap-status x-exception-reason x-exception-sourcefile x-virus-details x-icap-error-code x-icap-error-details cs-uri-stem cs-auth-group cs-auth-type x-cs-user-authorization-name sc-auth-status rs(Content-Type) rs(Content-Encoding).

However, the value of the "cs(User-Agent)" field is enclosed in double quotes, and there can be spaces between the opening and closing quote.
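To illustrate why a plain space split miscounts such a line, here is a minimal Python sketch. It is not how Splunk tokenizes events; the `tokenize` helper and its regex are assumptions made for illustration, and the sample line is a shortened line in the 39-field format listed above.

```python
import re

# Shortened sample in the 39-field format (User-Agent abbreviated).
LINE = (
    '2010-11-26 11:28:55 113 x.x.x.x 200 TCP_NC_MISS 42168 1691 POST https '
    'the.web.site /ProcessLegend.aspx - - DEFAULT_PARENT fqdn.host.name '
    'application/x-www-form-urlencoded;%20charset=utf-8 '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" OBSERVED none - '
    'x.x.x.x SG-HTTPS-Reverse-Proxy-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" '
    '- - - - https://the.web.site/ProcessLegend.aspx - - - - '
    'text/html;%20charset=utf-8 -'
)


def tokenize(line):
    """Split on whitespace, keeping a double-quoted run as one token."""
    return re.findall(r'"[^"]*"|\S+', line)


naive = LINE.split(" ")   # breaks the User-Agent into several pieces
quoted = tokenize(LINE)   # keeps the User-Agent as one token

print(len(naive))    # more than 39
print(len(quoted))   # 39
print(quoted[17])    # the whole quoted User-Agent (18th field)
```

With the naive split, every field after the User-Agent shifts to the right, so any extraction that counts spaces will mislabel them.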

The regex in the Splunk for BlueCoat app is the following:

[mainExtractions]
REGEX = \d+-\d+-\d+\s\d+:\d+:\d+\s(?<time_taken>\d+)\s(?<c_ip>\d+.\d+.\d+.\d+)\s(?<sc_status>[^\s]+)\s(?<s_action>[^\s]+)\s(?<sc_bytes>[^\s]+)\s(?<cs_method>[^\s]+)\s\"(?<cs_uri_scheme>[^\s]+)\"\s(?<cs_host>[^\s]+)\s+(?<cs_uri_port>[^\s]+)\s(?<cs_uri_path>[^\s]+)\s(?<cs_uri_query>[^\s]+)\s(?<cs_username>[^\s]+)\s(?<cs_auth_group>[^\s]+)\s(?<s_hierarchy>[^\s]+)\s(?<s_supplier_name>[^\s]+)\s(?<rs_content_type>[^\s]+)\s(?<cs_referer>[^\s]+)\s(?<cs_UserAgent>[^\s]+)\s\"(?<sc_filter_result>.*)\"\s(?<cs_categories>[^\s]+)\s(?<x_virus_id>[^\s]+)\s(?<s_ip>[^\s]+)

I think it doesn't correctly handle the double quotes around cs_UserAgent. Can anyone help with this?


laurensv
Path Finder

Ok, I've found the issue. The log format should be "bcreportermain_v1". After changing that, everything works!
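A minimal Python sketch of why the wrong log format produced those odd dashboard values, assuming the names in FIELDS are assigned to space-separated tokens purely by position. This imitates the idea and is not Splunk's implementation. The FIELDS list is the app's default as quoted in this thread, and the line is the start of one of the example lines posted here.

```python
# Field names from the app's default [delimExtractions] stanza.
FIELDS = [
    "date", "time", "time_taken", "dvc_ip", "user", "user_group",
    "x_exception_id", "filter_result", "category", "http_referrer", "holder",
    "http_response", "action", "http_method", "http_content_type",
    "uri_scheme", "dest_host", "dest_port", "uri_path", "uri_query",
    "uri_extension", "http_user_agent", "src_ip", "sc_bytes", "cs_bytes",
    "x_virus_id",
]

# Start of a line in the 39-field W3C ELFF layout (not bcreportermain_v1).
LINE = (
    "2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS 562 491 POST http "
    "another.web.site /Blablabla.svc - - DEFAULT_PARENT another.host.name "
    "application/soap+msbin1 - OBSERVED none -"
)

# Positional assignment: the Nth name gets the Nth token.
event = dict(zip(FIELDS, LINE.split(" ")))

print(event["dest_host"])   # application/soap+msbin1
print(event["uri_scheme"])  # another.host.name
```

The 17th name, dest_host, lands on the 17th token, which in this layout is cs(Content-Type). That would explain content types showing up under Top Websites, if that panel is driven by dest_host (an assumption on my part).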


laurensv
Path Finder

My transforms.conf:

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","src_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","dvc_ip","sc_bytes","cs_bytes","x_virus_id"

Example log lines:

2010-11-26 11:28:55 113 x.x.x.x 200 TCP_NC_MISS 42168 1691 POST https the.web.site /ProcessLegend.aspx - - DEFAULT_PARENT fqdn.host.name application/x-www-form-urlencoded;%20charset=utf-8 "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)" OBSERVED none - x.x.x.x SG-HTTPS-Reverse-Proxy-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" - - - - https://the.web.site/ProcessLegend.aspx - - - - text/html;%20charset=utf-8 -

2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS 562 491 POST http another.web.site /Blablabla.svc - - DEFAULT_PARENT another.host.name application/soap+msbin1 - OBSERVED none - x.x.x.x SG-HTTP-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" - - - - http://another.web.site/Blablabla.svc - - - - application/soap+msbin1 -

Edited of course 😉


laurensv
Path Finder

I think I finally got it working correctly 🙂 It seems that the transforms.conf file in the Splunk for BlueCoat app is wrong.

Original transforms.conf

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","dvc_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","src_ip","sc_bytes","cs_bytes","x_virus_id"

[nullPound]
REGEX = ^\#
DEST_KEY=queue
FORMAT=nullQueue

When I switch "dvc_ip" and "src_ip" in the above, all graphs are displayed correctly. According to the Blue Coat documentation ("SGOS Volume 8: Access Logging"), "src_ip" is actually the 4th field and "dvc_ip" is the 4th field from the end.

After copying the default transforms.conf file to the local directory and changing it like this:

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","src_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","dvc_ip","sc_bytes","cs_bytes","x_virus_id"

everything works.
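For anyone who wants to double-check the positions after such a swap, here is a small Python sketch. The `field_names` helper is hypothetical, and the FIELDS line is the app's default as quoted above.

```python
# FIELDS line from the app's default transforms.conf, as quoted above.
DEFAULT = (
    'FIELDS="date","time","time_taken","dvc_ip","user","user_group",'
    '"x_exception_id","filter_result","category","http_referrer","holder",'
    '"http_response","action","http_method","http_content_type","uri_scheme",'
    '"dest_host","dest_port","uri_path","uri_query","uri_extension",'
    '"http_user_agent","src_ip","sc_bytes","cs_bytes","x_virus_id"'
)


def field_names(fields_line):
    """Return the bare names from a FIELDS="a","b",... line."""
    value = fields_line.split("=", 1)[1]
    return [name.strip('"') for name in value.split(",")]


names = field_names(DEFAULT)

# Swap the two names, as in the local override.
i, j = names.index("dvc_ip"), names.index("src_ip")
names[i], names[j] = names[j], names[i]

print(names.index("src_ip") + 1)           # 4: fourth field
print(len(names) - names.index("dvc_ip"))  # 4: fourth from the end
```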

laurensv
Path Finder

Silvermail, can you confirm your setup? (log format etc...)

silvermail
Path Finder

Can you post the header and a few lines from your logs, as well as your transforms.conf? Probably the field-extractions naming is a bit different and that is why you are not getting the dashboards.


laurensv
Path Finder

Hmmm, still not working correctly... When I go to "Dashboards" -> "Traffic Dashboard", the "Top Websites" and "Top Clients" are still wrong 😞

Anybody running Splunk for BlueCoat 100% correctly? 😉


silvermail
Path Finder

Instead of using the regex, I am actually using the delimiters option, which I find much easier to configure.

This is an example of what mine looks like. You will need to change the delimiters in transforms.conf to match what your Blue Coat is outputting.

props.conf

[bcoat_proxysg]
TRANSFORM-main=nullPound
REPORT-main=delimExtractions
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none
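As a rough illustration of the timestamp settings above, this Python sketch parses the first 19 characters of an event with the equivalent format string. It only mirrors the idea of TIME_FORMAT and MAX_TIMESTAMP_LOOKAHEAD and is not Splunk's parser. The sample line is the start of one of the example lines in this thread.

```python
from datetime import datetime

LINE = "2010-11-26 11:28:55 113 x.x.x.x 200 TCP_NC_MISS"

LOOKAHEAD = 19  # mirrors MAX_TIMESTAMP_LOOKAHEAD: "YYYY-MM-DD HH:MM:SS"

# %T is shorthand for %H:%M:%S; it is spelled out here for portability
# across Python versions.
stamp = datetime.strptime(LINE[:LOOKAHEAD], "%Y-%m-%d %H:%M:%S")

print(stamp.isoformat())  # 2010-11-26T11:28:55
```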

transforms.conf

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","dvc_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","src_ip","sc_bytes","cs_bytes","x_virus_id"

[nullPound]
REGEX = ^\#
DEST_KEY=queue
FORMAT=nullQueue
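A small Python sketch of what the nullPound pattern is meant to do: lines that start with "#" (the ELFF header directives) are dropped and everything else is kept. The header lines below are hypothetical samples, and the list filter only stands in for routing to nullQueue.

```python
import re

NULL_POUND = re.compile(r"^\#")  # same pattern as the [nullPound] stanza

# Hypothetical input: two ELFF-style header lines and one event.
lines = [
    "#Software: SGOS",
    "#Fields: date time time-taken c-ip",
    "2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS",
]

# Keep only the lines the pattern does not match.
kept = [line for line in lines if not NULL_POUND.search(line)]

print(kept)  # only the event line remains
```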

laurensv
Path Finder

Silvermail, how are you sending the logs from the Blue Coat to Splunk and in which format?


laurensv
Path Finder

How do you correctly extract the User-Agent field? As I said in my post above, the User-Agent value is everything between the two double quotes ("Mozilla 4.5 whatever"). It doesn't get extracted correctly with a space as the delimiter, because there can be several space-separated words between the quotes...

Which log format are you using on your BlueCoat?
