Splunk Add-on for Cisco WSA: How to configure field extractions using props.conf and transforms.conf for raw CloudFront logs?

MayankSplunk
Path Finder

I have set up a forwarder to send my CloudFront logs to Splunk; the raw log format is shown below. I tried following http://docs.splunk.com/Documentation/AddOns/released/CiscoWSA/Configurew3clogfieldextractions, adapting it to CloudFront logs, but had no luck. Below is how I have set up props.conf and transforms.conf. I can see the raw data in Splunk, but I want to map the values to field names so I can query them.

I'm trying to follow the approach described in http://answers.splunk.com/answers/57770/transforms-conf-and-props-conf-field-extractions.html

#Raw Logs

2015-01-27  12:48:48    JAX1    1871    71.1.1.16   GET d21rhj.cloudfront.net   /test/20150112/54b48398e4b0f8e9e9d6ddf2_141391808196.mp4    200 http://www.test.com/demo.html   Mozilla/5.0%2520(Windows%2520NT%25206.0;%2520WOW64)%2520AppleWebKit/537.36%2520(KHTML,%2520like%2520Gecko)%2520Chrome/39.0.2171.95%2520Safari/537.36    -   -   Hit Ba3xwT-zb-czH_zw==  v.test.com  http    637 0.002

#transforms.conf

[auto_kv_for_video_cloudfront_w3c]
REGEX=/\S+/g
FORMAT=date::$1,time::$2,x_edge_location::$3,sc_bytes::$4,c_ip::$5,cs_method::$6,cs_Host::$7,cs_uri_stem::$8,sc_status::$9,cs_referer::$10,cs_user_agent::$11,cs_uri_query::$12,cs_cookie::$13,x_edge_result_type::$14,x_edge_request_id::$15,x_host_header::$16,cs_protocol::$17,cs_bytes::$18,time_taken::$19

#props.conf

[cloudfrontprof]
pulldown_type=1
REPORT-auto_kv_for_video_cloudfront_w3c=auto_kv_for_video_cloudfront_w3c
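
A note on the transform above: the $1 to $19 references in FORMAT can only be filled from capture groups in REGEX, and \S+ has none. As far as I know, Splunk also expects a bare regular expression here, without the JavaScript-style /.../g wrapper. The Python sketch below only illustrates the capture-group point; the 19-group pattern and the shortened sample line are assumptions made for the example, not a tested Splunk configuration.

```python
import re

# The pattern from the question, minus the /.../g wrapper: no capture groups.
no_groups = re.compile(r"\S+")

# Hypothetical alternative: one capture group per whitespace-separated field.
FIELD_COUNT = 19
with_groups = re.compile(r"^" + r"\s+".join([r"(\S+)"] * FIELD_COUNT) + r"$")

# Shortened, made-up sample line with the same 19 columns as the raw log.
sample = (
    "2015-01-15 20:59:51 JAX1 1871 7.9.79.36 GET d21.cloudfront.net "
    "/a.m3u8 200 http://www.tv.com/ Mozilla/5.0 - - Hit Ba3x== "
    "videos.test.com http 637 0.002"
)

print(no_groups.groups)      # how many $n references could ever be filled
print(with_groups.groups)
match = with_groups.match(sample)
print(match.group(1), match.group(19))
```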

rarsan_splunk
Splunk Employee

CloudFront access logs use the W3C extended log format. You can therefore have Splunk parse the file header and automatically extract all fields at index time using the following simple props.conf:

[cloudfront-access-log]
INDEXED_EXTRACTIONS = W3C

You can learn more about field extractions for structured data files here:
http://docs.splunk.com/Documentation/Splunk/6.3.0/Data/Extractfieldsfromfileheadersatindextime
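
To make the idea concrete, here is a rough Python sketch of header-driven extraction: field names are read from the #Fields line and applied to each data line by position. This is only an illustration of the concept, not how Splunk implements INDEXED_EXTRACTIONS; the function name parse_w3c and the shortened five-column log are assumptions for the example.

```python
def parse_w3c(lines):
    """Yield one dict per data line, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Everything after the prefix is a whitespace-separated list of names.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Data line: pair each value with the header name at the same position.
            yield dict(zip(fields, line.split()))


# Shortened, hypothetical log with only the first five CloudFront columns.
log = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location sc-bytes c-ip",
    "2015-01-15 20:59:51 JAX1 1871 7.9.79.36",
]

events = list(parse_w3c(log))
print(events[0]["x-edge-location"])
```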

aakwah
Builder

The delimiter between fields appears to be a space, so you can use the following configuration:

transforms.conf

[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken

props.conf

[cloudfrontprof]
REPORT-main = cloudfront_w3c
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
TZ = UTC
pulldown_type = true
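
For reference, the value 19 is the length of a "YYYY-MM-DD HH:MM:SS" timestamp at the start of each line. The Python sketch below mimics that idea under the assumption that date and time are separated by a single space; Python's strptime is given %H:%M:%S explicitly in place of %T, and the sample line is shortened for the example.

```python
from datetime import datetime, timezone

# Shortened sample line; only the leading timestamp matters here.
line = "2015-01-15 20:59:51 JAX1 1871 7.9.79.36"

LOOKAHEAD = 19  # len("2015-01-15 20:59:51")

# Parse only the first LOOKAHEAD characters and treat the result as UTC.
stamp = datetime.strptime(line[:LOOKAHEAD], "%Y-%m-%d %H:%M:%S")
stamp = stamp.replace(tzinfo=timezone.utc)

print(stamp.isoformat())
```

If the date and time were separated by a tab instead of a space, this format string would no longer match, so it is worth checking the raw file.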

Regards,
Ahmed Elakwah

MayankSplunk
Path Finder

Hi Ahmed,

Thanks for the reply. I have implemented what you suggested, but when I run the following search I don't see any results, even though the raw logs are there. The field mapping doesn't seem to work. Any other thoughts?

index=video sourcetype="cloudfront" date="2015-01-27"

aakwah
Builder

In props.conf, the stanza name must match the sourcetype of the incoming logs for the settings to apply to them. So if your sourcetype is cloudfront, props.conf should look like this:

[cloudfront]
REPORT-main = cloudfront_w3c
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
TZ = UTC
pulldown_type = true
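
One quick way to spot this kind of mismatch is to compare the sourcetype values in inputs.conf against the stanza names in props.conf. The sketch below does that with Python's configparser on inline strings; it assumes the files are simple enough to parse as INI (real Splunk .conf files are not always), ignores source:: and host:: stanzas, and the helper name load_conf is made up for the example.

```python
import configparser


def load_conf(text):
    """Parse .conf text as INI; interpolation is off so % characters are safe."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


inputs = load_conf(
    "[monitor:///Users/m/Downloads/cloudfrontlogs.log]\n"
    "index=video\n"
    "sourcetype=cloudfront\n"
)
props = load_conf(
    "[cloudfrontprof]\n"
    "REPORT-main=cloudfront_w3c\n"
)

# Sourcetypes assigned by inputs.conf that have no props.conf stanza of the same name.
sourcetypes = {inputs[s]["sourcetype"] for s in inputs.sections()}
unmatched = sourcetypes - set(props.sections())
print(unmatched)
```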

Regards,
Ahmed

MayankSplunk
Path Finder

Thanks, I tried that but it didn't work. Any other ideas?

aakwah
Builder

Please use the following files, then restart the search head:

props.conf

[cloudfront]
TRANSFORMS-sourcetype = cloudfront
REPORT-main = cloudfront_w3c
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
TZ = UTC
pulldown_type = true

transforms.conf

[cloudfront]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::cloudfront

[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken

Please let me know if it worked.

Regards,
Ahmed

MayankSplunk
Path Finder

Sorry, it didn't work.

aakwah
Builder

In the raw log above, is the delimiter between fields one space or four spaces?

If it is 4 spaces, please edit transforms.conf to have:

  [cloudfront_w3c]
  DELIMS = "    "

Regards,
Ahmed

MayankSplunk
Path Finder

It seems to be a tab. I tried a tab as well as one, two, and four spaces; none of them helped.
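
For anyone checking the same thing, the Python sketch below shows why the delimiter matters for positional extraction: splitting a tab-separated line on a space leaves the whole line in the first field. The helper split_fields, the five-name field list, and the sample line (written here with explicit tabs) are assumptions for illustration only; this is not Splunk's DELIMS implementation.

```python
# First five field names only, to keep the example short.
FIELDS = ["date", "time", "x_edge_location", "sc_bytes", "c_ip"]


def split_fields(line, delimiter):
    """Map delimiter-separated values onto FIELDS by position."""
    return dict(zip(FIELDS, line.split(delimiter)))


# Hypothetical tab-separated line based on the raw log in the question.
tab_line = "2015-01-27\t12:48:48\tJAX1\t1871\t71.1.1.16"

wrong = split_fields(tab_line, " ")   # wrong delimiter: no split happens
right = split_fields(tab_line, "\t")  # matching delimiter: one value per field

print(wrong)
print(right)
print(repr(tab_line))  # repr() makes tabs visible as \t
```

Printing a raw line with repr(), or viewing the file with a tool that shows control characters, settles the space-versus-tab question quickly.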

0 Karma

aakwah
Builder

Really strange; I had the same config and it worked fine. As one last check, can you please run the search index=video to make sure that the sourcetype is correct?

Regards,
Ahmed

MayankSplunk
Path Finder

Sure. I tried without the sourcetype and it didn't return any records:

index=video date="2015-01-28"

Here is how my inputs.conf looks:

[monitor:///Users/m/Downloads/cloudfrontlogs.log]
index=video
sourcetype=cloudfront

aakwah
Builder

Fine, I now have all of your configuration on my search head. It would be great if you could provide a sample of the logs so I can reproduce the issue on my side.

MayankSplunk
Path Finder

Sure,

#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken

2015-01-15 20:59:51 JAX1 1871 7.9.79.36 GET d21.cloudfront.net /test-production/The_W/20150112/54b4838ce4b0bba7b00fa440/54b48398e4b0f8e9e9d6ddf2_1413918081967-dr4cae_t_1421116337023_640_360_600.m3u8 200 http://www.tv.com/test/business/m/2015/01/13/1_v.html Mozilla/5.0%2520(Windows%2520NT%25206.0;%2520WOW64)%2520AppleWebKit/537.36%2520(KHTML,%2520like%2520Gecko)%2520Chrome/39.0.2171.95%2520Safari/537.36 - - Hit Ba3xwTDLovRQ12HojdT-zb-czH_cmLUqtYl_m2FmuHE0ow== videos.test.com http 637 0.002
2015-01-15 20:58:05 ATL50 779890 18.12.8.57 GET d21.cloudfront.net /test-production/S_M/20150112/54b44e04e4b0bba7b00fa2b2/54b44e13e4b0f8e9e9d6dcd9_1413918194984-wmfwb0_t_1421102625151_320_180_30000000.ts 200 http://www.tv.com/test/static/js/p/vendor/jwplayer/jw-6.11/jwplayer.flash.swf Mozilla/5.0%2520(compatible;%2520MSIE%252010.0;%2520Windows%2520NT%25206.1;%2520Win64;%2520x64;%2520Trident/6.0) - - Hit QzZckws5b8rKxZttkRy_sWSXF3fOtRnO3Kje6Qf_fvP25YpnWVcvRQ== videos.test.com http 634 2.537

aakwah
Builder

Hello,
All fields are extracted successfully with the above configuration, as shown in the following screenshots:

https://drive.google.com/file/d/0B9wUSHOfDLvoT3E2dVVGcU92eGc/view?pli=1
https://drive.google.com/file/d/0B9wUSHOfDLvoSzFWMnFZWUxFZ2s/view?pli=1

I used the following configuration (I've added a nullPound transform to props.conf and transforms.conf to exclude the lines beginning with # at the start of the file):

inputs.conf (/opt/splunk/etc/system/local/inputs.conf)
[monitor:///tmp/support]
index = bcoat_logs
sourcetype = cloudfront

props.conf (/opt/splunk/etc/apps/search/default/props.conf)
[cloudfront]
TRANSFORMS-sourcetype = nullPound, cloudfront
REPORT-main = cloudfront_w3c
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
TZ = UTC
pulldown_type = true

transforms.conf
[nullPound]
REGEX = ^#
DEST_KEY=queue
FORMAT=nullQueue

[cloudfront]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::cloudfront

[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken
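
The effect of the nullPound stanza can be pictured with a small Python sketch: any line matching ^# is dropped before the remaining lines are treated as events. The function name keep_events and the shortened sample lines are assumptions for the example; in Splunk the dropping is done by routing matching events to nullQueue.

```python
import re

# Same pattern as the nullPound stanza: a # at the very start of the line.
NULL_POUND = re.compile(r"^#")


def keep_events(lines):
    """Return only the lines that would not be routed to nullQueue."""
    return [line for line in lines if not NULL_POUND.search(line)]


# Shortened, hypothetical CloudFront log file.
lines = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location",
    "2015-01-15 20:59:51 JAX1",
]

kept = keep_events(lines)
print(kept)
```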

I think some configuration files in your environment are overriding the configuration we added. Try searching for "cloudfront" in all files as follows:

grep -R 'cloudfront' /opt/splunk/etc/*

Regards,
Ahmed

MayankSplunk
Path Finder

Until now I was placing my props.conf in the wrong directory. I have updated it to the location you specified.

#transforms.conf -> etc/system/local/transforms.conf

Running the grep command gives following output:

etc/apps/search/default/props.conf:[cloudfront]
etc/apps/search/default/props.conf:TRANSFORMS-sourcetype = nullPound, cloudfront
etc/apps/search/default/props.conf:REPORT-main=cloudfront_w3c
etc/system/local/inputs.conf:[monitor:///Users/m/Downloads/cloudfrontlogs.log]
etc/system/local/inputs.conf:sourcetype=cloudfront
etc/system/local/transforms.conf:[cloudfront]
etc/system/local/transforms.conf:FORMAT = sourcetype::cloudfront
etc/system/local/transforms.conf:[cloudfront_w3c]

Nothing seems wrong with the grep output. I can try installing Splunk again and testing it.
