Regex help - Cisco WSA syslog data

hfernandez_
Path Finder

Hi Splunkers,

I am sending Cisco WSA data in squid format via syslog to a Heavy Forwarder. The data arrives, but the add-on does not parse it correctly. The current add-on, Splunk Add-on for Cisco WSA v3.3.0 (https://splunkbase.splunk.com/app/1747/), supports parsing log files retrieved via FTP but not data received via syslog. It looks like the regex works up to x_resp_dvs_verdictname and then fails.

Has anyone successfully parsed Cisco WSA syslog data, or is anyone up for this regex challenge?

Here is the REGEX from the add-on:

REGEX = (?<timestamp>[0-9.]+)\s+(?<x_elapsed_time>[0-9]+)\s+(?<src_ip>[a-zA-Z0-9:.]*)\s+(?<txn_result_code>[A-Z_]*)\/(?<status>[0-9]*)\s+(?<bytes_in>[0-9]*)\s+(?<http_method>\w*)\s+(?<url>\S*)\s+["|']?(?<user>[^\s"']+)["|']?\s+(?<server_contact_mode>[^\/]+)\/(?<dest>\S*)\s+(?<http_content_type>\S*)\s+(?<acltag>\S*)\s+(?:<|&lt;)(?<x_webcat_code_abbr>[^,]+),(?<wbrs_score>[^,]+),["|']?(?<x_webroot_scanverdict>[0-9]{0,2}|\-|\w+)["|']?,["|']?(?<webroot_threat_name>[^,"']+)["|']?,(?<x_webroot_trr>[^,]+),(?<x_webroot_spyid>[^,]+),(?<x_webroot_trace_id>[^,]+),(?<x_mcafee_scanverdict>[^,]+),["|']?(?<x_mcafee_filename>[^,]+?)["|']?,(?<x_mcafee_scan_error>[^,]+),(?<x_mcafee_detecttype>[^,]+),(?<x_mcafee_av_virustype>[^,]+),["|']?(?<x_mcafee_virus_name>[^,]+?)["|']?,(?<x_sophos_scanverdict>[^,]+),(?<x_sophos_scancode>[^,]+),["|']?(?<x_sophos_file_name>[^,]+?)["|']?,["|']?(?<x_sophos_virus_name>[^,]+?)["|']?,(?<x_ids_verdict>[^,]+),(?<x_icap_verdict>[^,]+),(?<x_webcat_req_code_abbr>[^,]+),["|']?(?<x_webcat_resp_code_abbr>[^,]+?)["|']?,["|']?(?<x_resp_dvs_threat_name>[^,]+?)["|']?,["|']?(?<x_wbrs_threat_type>[^,"']+)["|']?,["|']?(?<x_avc_app>[^,"']+)["|']?,["|']?(?<x_avc_type>[^,"']+)["|']?,["|']?(?<x_avc_behavior>[^,"']+)["|']?,["|']?(?<x_request_rewrite>[^"',]+)["|']?,(?<x_avg_bw>[^,]+),(?<x_bw_throttled>[^,]+),(?<x_user_type>[^,]+),["|']?(?<x_resp_dvs_verdictname>[^,"']+)["|']?,["|']?(?<x_req_dvs_threat_name>[^,"']+)["|']?(,["|']?(?<x_amp_verdict>[^,"']+)["|']?,["|']?(?<x_amp_malware_name>[^"']+)["|']?,(?<x_amp_score>[^,]+),(?<x_amp_upload>[^,]+),["|']?(?<x_amp_filename>[^,]+?)["|']?,["|']?(?<x_amp_sha>[^"',]+)["|']?)?(,["|']?(?<x_file_verdict>[^"',]+)["|']?)?(,(?<x_archive_scan_verdict>[^,]+),["|']?(?<x_archive_scan_verdict_reason>[^"']+)["|']?)?(?:>|&gt;)\s*(?<vendor_suspect_user_agent>-|["|']?[^"']+["|']?)?\s*(?<bytes_out>[0-9]*)?[^\r\n]*$

Here's some example data from my environment:

Jan 8 07:37:35 server.com Jan 08 07:37:34 server.com wsa_access_logs_splunk: Info: 1578497849.565 51 xxx.xxx.xxx.xxx NONE_SSL/504 0 POST https://device.com:443/SE/app - DIRECT/device.com - OTHER-NONE-eun_internal_group-NONE-NONE-NONE-DefaultGroup-NONE <"IW_edu",5.0,0,"-",0,0,0,-,"-",-,-,-,"-",-,-,"-","-",-,-,"IW_edu",-,"-","Education","-","Unknown","Unknown","-","-",0.00,0,-,"Unknown","-",-,"-",-,-,"-","-",-,-,"-",-> -

Jan 8 07:37:35 server.com Jan 08 07:37:34 server.com wsa_access_logs_splunk: Info: 1578497849.513 101 xxx.xxx.xxx.xxx NONE_SSL/200 0 TCP_CONNECT xxx.xxx.xxx.xxx:443 - NONE/- - OTHER-NONE-NONE-NONE-NONE-NONE-NONE-NONE <"-",-,-,"-",-,-,-,-,"-",-,-,-,"-",-,-,"-","-",-,-,"-",-,"-","-","-","-","-","-","-",0.00,0,-,"-","-",-,"-",-,-,"-","-",-,-,"-",-> -

Jan 8 07:35:34 server.com Jan 08 07:35:34 server.com wsa_access_logs_splunk: Info: 1578497734.661 0 xxx.xxx.xxx.xxx TCP_DENIED_SSL/403 0 POST https://xxx.xxx.xxx.xxx:443/server/php/index.php - NONE/- - BLOCK_ADMIN_HTTPS_NonLocalDestination-NONE-NONE-NONE-NONE-NONE-NONE-NONE <"-",-,-,"-",-,-,-,-,"-",-,-,-,"-",-,-,"-","-",-,-,"-",-,"-","-","-","-","-","-","-",0.00,0,-,"-","-",-,"-",-,-,"-","-",-,-,"-",-> -

I appreciate the help with this one.

Thanks,


PavelProstine
Explorer

There were some changes in the log format between WSA 11.5.x and 11.8.x: a couple of new fields were added. Compare the sections "Interpreting Access Log Scanning Verdict Entries" in the two user guides:
https://www.cisco.com/c/en/us/td/docs/security/wsa/wsa11-5/user_guide/b_WSA_UserGuide_11_5_1.pdf and https://www.cisco.com/c/en/us/td/docs/security/wsa/wsa11-8/user_guide/b_WSA_UserGuide_11_8.pdf

I've just tested the following config with WSA 11.8.0-453 and it works.

Add this stanza to local/transforms.conf, then run debug/refresh or restart Splunk:

[kv_for_cisco_wsa_squid]
# adjusted for syslog input
REGEX = \sInfo: (?<timestamp>[0-9.]+)\s+(?<x_elapsed_time>[0-9]+)\s+(?<src_ip>[a-zA-Z0-9:.]*)\s+(?<txn_result_code>[A-Z_]*)\/(?<status>[0-9]*)\s+(?<bytes_in>[0-9]*)\s+(?<http_method>\w*)\s+(?<url>\S*)\s+["|']?(?<user>[^\s"']+)["|']?\s+(?<server_contact_mode>[^\/]+)\/(?<dest>\S*)\s+(?<http_content_type>\S*)\s+(?<acltag>\S*)\s+(?:<|<)(?<x_webcat_code_abbr>[^,]+),(?<wbrs_score>[^,]+),["|']?(?<x_webroot_scanverdict>[0-9]{0,2}|\-|\w+)["|']?,["|']?(?<webroot_threat_name>[^,"']+)["|']?,(?<x_webroot_trr>[^,]+),(?<x_webroot_spyid>[^,]+),(?<x_webroot_trace_id>[^,]+),(?<x_mcafee_scanverdict>[^,]+),["|']?(?<x_mcafee_filename>[^,]+?)["|']?,(?<x_mcafee_scan_error>[^,]+),(?<x_mcafee_detecttype>[^,]+),(?<x_mcafee_av_virustype>[^,]+),["|']?(?<x_mcafee_virus_name>[^,]+?)["|']?,(?<x_sophos_scanverdict>[^,]+),(?<x_sophos_scancode>[^,]+),["|']?(?<x_sophos_file_name>[^,]+?)["|']?,["|']?(?<x_sophos_virus_name>[^,]+?)["|']?,(?<x_ids_verdict>[^,]+),(?<x_icap_verdict>[^,]+),(?<x_webcat_req_code_abbr>[^,]+),["|']?(?<x_webcat_resp_code_abbr>[^,]+?)["|']?,["|']?(?<x_resp_dvs_threat_name>[^,]+?)["|']?,["|']?(?<x_wbrs_threat_type>[^,"']+)["|']?,["|']?(?<x_avc_app>[^,"']+)["|']?,["|']?(?<x_avc_type>[^,"']+)["|']?,["|']?(?<x_avc_behavior>[^,"']+)["|']?,["|']?(?<x_request_rewrite>[^"',]+)["|']?,(?<x_avg_bw>[^,]+),(?<x_bw_throttled>[^,]+),(?<x_user_type>[^,]+),["|']?(?<x_resp_dvs_verdictname>[^,"']+)["|']?,["|']?(?<x_req_dvs_threat_name>[^,"']+)["|']?(,["|']?(?<x_amp_verdict>[^,"']+)["|']?,["|']?(?<x_amp_malware_name>[^"']+)["|']?,(?<x_amp_score>[^,]+),(?<x_amp_upload>[^,]+),["|']?(?<x_amp_filename>[^,]+?)["|']?,["|']?(?<x_amp_sha>[^"',]+)["|']?)?(,["|']?(?<x_file_verdict>[^"',]+)["|']?)?(,(?<x_archive_scan_verdict>[^,]+),["|']?(?<x_archive_scan_verdict_reason>[^"']+)["|']?)?.*(?:>|>)\s*(?<vendor_suspect_user_agent>-|["|']?[^"']+["|']?)?

Let me know if it works.
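One way to sanity-check a change like this before restarting Splunk is to run the leading part of the pattern against a sample event offline. The sketch below is only an illustration and not part of the add-on. It uses Python's `re` module, which needs `(?P<name>...)` instead of the PCRE-style `(?<name>...)`, so the helper `to_python_regex` (a hypothetical name) rewrites the group syntax. Only the head of the regex is included, and the sample event is shortened from the TCP_CONNECT line in the question.

```python
import re

# Head of the syslog-adjusted pattern from the stanza above,
# truncated after the url field to keep the example short.
PCRE_HEAD = (
    r"\sInfo: (?<timestamp>[0-9.]+)\s+(?<x_elapsed_time>[0-9]+)\s+"
    r"(?<src_ip>[a-zA-Z0-9:.]*)\s+(?<txn_result_code>[A-Z_]*)\/(?<status>[0-9]*)\s+"
    r"(?<bytes_in>[0-9]*)\s+(?<http_method>\w*)\s+(?<url>\S*)"
)

# Shortened copy of a sample event from the question.
SAMPLE_EVENT = (
    "Jan 8 07:37:35 server.com Jan 08 07:37:34 server.com "
    "wsa_access_logs_splunk: Info: 1578497849.513 101 xxx.xxx.xxx.xxx "
    "NONE_SSL/200 0 TCP_CONNECT xxx.xxx.xxx.xxx:443 - NONE/- -"
)


def to_python_regex(pcre: str) -> str:
    """Rewrite PCRE named groups (?<name>...) as Python's (?P<name>...).

    Lookbehinds such as (?<= and (?<! are left untouched.
    """
    return re.sub(r"\(\?<(?![=!])", "(?P<", pcre)


def extract_head(event: str) -> dict:
    """Return the named captures for the head of the event, or {} on no match."""
    match = re.search(to_python_regex(PCRE_HEAD), event)
    return match.groupdict() if match else {}


fields = extract_head(SAMPLE_EVENT)
print(fields)
```

The same approach should extend to the full pattern, but Python's `re` and Splunk's PCRE engine are not identical, so a match here is a hint rather than proof that the transform will behave the same way.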

hfernandez_
Path Finder

Hi PavelProstine,

This looks promising. Let me give it a try.

Thanks,


hfernandez_
Path Finder

Ok, I implemented the changes and it worked. I appreciate the help. PavelProstine you're AWESOME!!!


PavelP
Motivator

The WSA log format has changed over time. Which WSA version are you running? Also check on the WSA under Administration > Log Subscription > accesslogs > Custom Fields.

To just make it work (quick and dirty), modify the regex by adding "Info:" at the start and removing the trailing part of the pattern:

Info: (?<timestamp>[0-9\.]+)\s+(?<x_elapsed_time>[0-9]+)\s+(?<src_ip>[a-zA-Z0-9:.]*)\s+(?<txn_result_code>[A-Z_]*)\/(?<status>[0-9]*)\s+(?<bytes_in>[0-9]*)\s+(?<http_method>\w*)\s+(?<url>\S*)\s+"?(?<user>[^\s"]*)"?\s+(?<server_contact_mode>[^\/]*)\/(?<dest>\S*)\s+(?<http_content_type>\S*)\s+(?<acltag>\S*)\s+(?:<|<)(?<x_webcat_code_abbr>[^,]+),(?<wbrs_score>[^,]+),"*(?<x_webroot_scanverdict>[0-9]{0,2}|\-|\w+)"*,"(?<webroot_threat_name>[^"]+)",(?<x_webroot_trr>[^,]+),(?<x_webroot_spyid>[^,]+),(?<x_webroot_trace_id>[^,]+),(?<x_mcafee_scanverdict>[^,]+),"(?<x_mcafee_filename>[^,]+)",(?<x_mcafee_scan_error>[^,]+),(?<x_mcafee_detecttype>[^,]+),(?<x_mcafee_av_virustype>[^,]+),"(?<x_mcafee_virus_name>[^"]+)",(?<x_sophos_scanverdict>[^,]+),(?<x_sophos_scancode>[^,]+),"(?<x_sophos_file_name>[^"]+)","(?<x_sophos_virus_name>[^"]+)",(?<x_ids_verdict>[^,]+),(?<x_icap_verdict>[^,]+),(?<x_webcat_req_code_abbr>[^,]+),(?<x_webcat_resp_code_abbr>[^,]+),"(?<x_resp_dvs_threat_name>[^"]+)","(?<x_wbrs_threat_type>[^,]+)","(?<x_avc_app>[^,]+)","(?<x_avc_type>[^,]+)","(?<x_avc_behavior>[^,]+)"

This will work, but it is still inefficient and drops some information from the log. You can add "TIME_PREFIX=Info:\s+" to the [cisco:wsa:squid] stanza in props.conf, plus a proper regex for the syslog part of the message (before "Info:"). To make the regex complete and parse all the fields, check your WSA version and the corresponding documentation (WSA_UserGuide_1x.x.pdf, "Monitor System Activity Through Logs").
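To illustrate why TIME_PREFIX matters here: each syslog-forwarded event in the question carries two syslog header timestamps before "Info:", while the epoch value after "Info:" is the one the WSA wrote for the transaction. The sketch below, in Python rather than Splunk configuration, mimics what a TIME_PREFIX of `Info:\s+` would point the timestamp parser at. The function name `event_time` is hypothetical, the sample line is shortened from the question, and the fractional seconds are ignored for simplicity.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Same idea as TIME_PREFIX = Info:\s+ : skip everything up to "Info:"
# and read the epoch timestamp that follows (whole seconds only).
EPOCH_AFTER_PREFIX = re.compile(r"Info:\s+(\d+)(?:\.\d+)?")

# Shortened copy of a sample event from the question.
SAMPLE_EVENT = (
    "Jan 8 07:37:35 server.com Jan 08 07:37:34 server.com "
    "wsa_access_logs_splunk: Info: 1578497849.565 51 xxx.xxx.xxx.xxx "
    "NONE_SSL/504 0 POST https://device.com:443/SE/app"
)


def event_time(event: str) -> Optional[datetime]:
    """Return the WSA transaction time in UTC, or None if "Info:" is absent."""
    match = EPOCH_AFTER_PREFIX.search(event)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


print(event_time(SAMPLE_EVENT))
```

Without the prefix, a timestamp parser reading from the start of the line would most likely pick up the syslog header time instead, which in the sample differs from the WSA time by a few seconds.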


hfernandez_
Path Finder

Hi PavelP,

I appreciate the feedback. We are currently running WSA v11.8.0-440. I have no custom fields defined on this filter and am using the defaults. Let me play around with that regex.

Thanks,


efavreau
Motivator

@hfernandez_ There could be lots of reasons for this. My suggestion is to check for a character limit: your regex has 1855 characters. Try reducing the number of characters in your capture groups, or breaking this up into more than one regular expression. While less than ideal, I have worked with other software where these types of workarounds were needed.
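As a rough illustration of splitting the work into more than one step: the sketch below (Python, not Splunk configuration) uses one short regex to isolate the comma-separated scanning-verdict block between `<` and `>`, then splits that block with the `csv` module instead of naming every field in a single pattern. The names `verdict_fields` and `VERDICT_BLOCK` are hypothetical, and the sample is the first event from the question with the syslog header removed.

```python
import csv
import re

# Step 1: a short pattern that only isolates the block between < and >.
VERDICT_BLOCK = re.compile(r"<(?P<block>[^<>]*)>")

# First sample event from the question, without the syslog header.
SAMPLE_EVENT = (
    'Info: 1578497849.565 51 xxx.xxx.xxx.xxx NONE_SSL/504 0 POST '
    'https://device.com:443/SE/app - DIRECT/device.com - '
    'OTHER-NONE-eun_internal_group-NONE-NONE-NONE-DefaultGroup-NONE '
    '<"IW_edu",5.0,0,"-",0,0,0,-,"-",-,-,-,"-",-,-,"-","-",-,-,"IW_edu",-,"-",'
    '"Education","-","Unknown","Unknown","-","-",0.00,0,-,"Unknown","-",-,"-",'
    '-,-,"-","-",-,-,"-",-> -'
)


def verdict_fields(event: str) -> list:
    """Return the values inside the <...> block, with quotes stripped."""
    match = VERDICT_BLOCK.search(event)
    if match is None:
        return []
    # Step 2: let the csv module handle the quoting and the commas.
    return next(csv.reader([match.group("block")]))


fields = verdict_fields(SAMPLE_EVENT)
print(len(fields), fields[0], fields[22])
```

Counting the values this way also makes it easy to compare an event against the number of fields a given regex expects. The sample yields 43 values, while the add-on regex in the question names at most 41 capture groups inside the brackets, which seems consistent with newer WSA versions having added a couple of fields.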

###

If this reply helps you, an upvote would be appreciated.