I have set up a forwarder to send my CloudFront logs to Splunk; the raw log format is shown below. I tried following http://docs.splunk.com/Documentation/AddOns/released/CiscoWSA/Configurew3clogfieldextractions and adapting it to CloudFront logs, but with no luck. My props.conf and transforms.conf are shown below. I can see the raw data in Splunk, but I want the values mapped to field names so I can query them.
I'm also trying to follow the approach described in http://answers.splunk.com/answers/57770/transforms-conf-and-props-conf-field-extractions.html
#Raw Logs
2015-01-27 12:48:48 JAX1 1871 71.1.1.16 GET d21rhj.cloudfront.net /test/20150112/54b48398e4b0f8e9e9d6ddf2_141391808196.mp4 200 http://www.test.com/demo.html Mozilla/5.0%2520(Windows%2520NT%25206.0;%2520WOW64)%2520AppleWebKit/537.36%2520(KHTML,%2520like%2520Gecko)%2520Chrome/39.0.2171.95%2520Safari/537.36 - - Hit Ba3xwT-zb-czH_zw== v.test.com http 637 0.002
#transforms.conf
[auto_kv_for_video_cloudfront_w3c]
REGEX=/\S+/g
FORMAT=date::$1,time::$2,x_edge_location::$3,sc_bytes::$4,c_ip::$5,cs_method::$6,cs_Host::$7,cs_uri_stem::$8,sc_status::$9,cs_referer::$10,cs_user_agent::$11,cs_uri_query::$12,cs_cookie::$13,x_edge_result_type::$14,x_edge_request_id::$15,x_host_header::$16,cs_protocol::$17,cs_bytes::$18,time_taken::$19
#props.conf
[cloudfrontprof]
pulldown_type=1
REPORT-auto_kv_for_video_cloudfront_w3c=auto_kv_for_video_cloudfront_w3c
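A note on the REGEX/FORMAT pair above: FORMAT refers to $1 through $19, which are capture groups, but `\S+` contains none, and the `/.../g` wrapper is JavaScript/Perl-style syntax that would most likely be read as literal characters. The Python sketch below illustrates only the capture-group point. Python's `re` stands in for Splunk's regex engine here, and the shortened sample line is made up for the example.

```python
import re

# Shortened, made-up sample: first three whitespace-separated fields only.
line = "2015-01-27\t12:48:48\tJAX1"

# The pattern from the question, without the /.../g wrapper: no capture groups.
no_groups = re.compile(r"\S+")

# One capture group per field, separated by whitespace.
with_groups = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)")

def extract(pattern, text):
    """Return the captured groups, the way $1, $2, ... would be filled."""
    match = pattern.search(text)
    return match.groups() if match else ()

print(no_groups.groups)           # number of capture groups in the pattern
print(extract(no_groups, line))   # nothing available to map to $1..$n
print(extract(with_groups, line))
```

With no groups there is nothing for $1, $2 and so on to refer to, which is consistent with the fields never showing up.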
CloudFront access logs use W3C extended log format. Therefore you can have Splunk parse the file header and automatically extract all fields at index-time using the following simple props.conf:
[cloudfront-access-log]
INDEXED_EXTRACTIONS = W3C
You can learn more about field extractions for structured data files here:
http://docs.splunk.com/Documentation/Splunk/6.3.0/Data/Extractfieldsfromfileheadersatindextime
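To make the header-driven approach concrete, here is a rough Python sketch of the idea behind it: take the field names from a `#Fields:` header line and pair them with the values on each data line. This is only an illustration, not how Splunk implements it. The `#Fields:` line, the tab separator, and the shortened three-column sample are assumptions, since the raw sample in the question does not show the header.

```python
def parse_w3c(lines):
    """Yield one dict per data line, using names from the '#Fields:' header."""
    names = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#Fields:"):
            # Header names are whitespace-separated after the '#Fields:' tag.
            names = line[len("#Fields:"):].split()
        elif line.startswith("#") or not line:
            # Other directives (e.g. '#Version:') and blank lines carry no data.
            continue
        else:
            yield dict(zip(names, line.split("\t")))

# Made-up, shortened sample for illustration.
sample = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location",
    "2015-01-27\t12:48:48\tJAX1",
]
events = list(parse_w3c(sample))
print(events)
```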
The delimiter between fields appears to be a space, so you can use the following configuration:
transforms.conf
[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken
props.conf
[cloudfrontprof]
REPORT-main=cloudfront_w3c
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
SHOULD_LINEMERGE = false
TZ = UTC
pulldown_type = true
Regards,
Ahmed Elakwah
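For readers unfamiliar with DELIMS/FIELDS, the effect is roughly "split each event on the delimiter and name the pieces in order". A minimal Python sketch of that idea follows. It is an analogy rather than Splunk's actual code, and the sample line is shortened and made up. It also shows what happens when the configured delimiter does not match the one in the data.

```python
FIELDS = ["date", "time", "x_edge_location", "sc_bytes", "c_ip"]

def extract(line, delim):
    """Split on delim and pair the pieces with FIELDS, in order."""
    return dict(zip(FIELDS, line.split(delim)))

# Made-up, shortened event that uses tabs between values.
tab_line = "2015-01-27\t12:48:48\tJAX1\t1871\t71.1.1.16"

print(extract(tab_line, "\t"))  # every field gets its own value
print(extract(tab_line, " "))   # no spaces in the line, so only 'date' is set
```

In the second call the whole line ends up in `date`, so a search such as date="2015-01-27" would find nothing even though the raw event is there.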
Hi Ahmed,
Thanks for the reply. I have implemented what you suggested, but when I run the following query I get no results, even though the raw logs are there. The field mapping doesn't seem to work. Any other thoughts?
index=video sourcetype="cloudfront" date="2015-01-27"
In props.conf, the stanza name must be the sourcetype of the logs you want the settings applied to. So if your sourcetype is cloudfront, props.conf should look like this:
[cloudfront]
REPORT-main=cloudfront_w3c
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
SHOULD_LINEMERGE = false
TZ = UTC
pulldown_type = true
Regards,
Ahmed
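On the timestamp settings: MAX_TIMESTAMP_LOOKAHEAD = 19 matches the length of a `YYYY-MM-DD HH:MM:SS` prefix, and `%T` is the usual strptime shorthand for `%H:%M:%S`. The Python sketch below checks both against the start of a shortened sample event. It spells out `%H:%M:%S` because Python's `strptime` may not accept `%T`, and it assumes the date and time are separated by a single space, as in the format string.

```python
from datetime import datetime, timezone

LOOKAHEAD = 19
# Shortened sample event; a single space between date and time is assumed.
event = "2015-01-27 12:48:48 JAX1 1871 71.1.1.16 GET"

prefix = event[:LOOKAHEAD]
# %H:%M:%S is the long form of %T; TZ = UTC is mirrored with tzinfo.
parsed = datetime.strptime(prefix, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)

print(repr(prefix))
print(parsed.isoformat())
```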
Thanks, I tried that but it didn't work. Any other ideas?
Please use the following files, then restart the search head:
props.conf
[cloudfront]
TRANSFORMS-sourcetype = cloudfront
REPORT-main=cloudfront_w3c
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
SHOULD_LINEMERGE = false
TZ = UTC
pulldown_type = true
transforms.conf
[cloudfront]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::cloudfront
[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken
Please let me know if it worked.
Regards,
Ahmed
Sorry, it didn't work.
Looking at the raw log above, is the delimiter between fields one space or four spaces?
If it is 4 spaces, please edit transforms.conf to have:
[cloudfront_w3c]
DELIMS = "    "
Regards,
Ahmed
It seems to be a tab. I tried a tab and one, two, and four spaces; none of them helped.
Really strange; I used the same config and it worked fine. As one last check, can you please run the search index=video to make sure the sourcetype is correct?
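One way to settle the delimiter question outside Splunk is to look at the raw characters of a line. A small Python sketch, assuming you paste in a line from the monitored file; the sample string here is made up and uses tabs purely for illustration:

```python
from collections import Counter

def whitespace_profile(line):
    """Count each whitespace character that appears in the line."""
    return Counter(ch for ch in line if ch.isspace())

# Made-up sample; replace with a line read from the real log file.
line = "2015-01-27\t12:48:48\tJAX1\t1871\t71.1.1.16\tGET"

profile = whitespace_profile(line)
print(repr(line))             # tabs show up as \t, spaces as ' '
print(dict(profile))
print(len(line.split("\t")))  # number of tab-separated columns
```

If the profile shows only `\t`, the delimiter in transforms.conf has to be a real tab rather than a run of spaces; it is commonly written as DELIMS = "\t", though that is worth confirming against the transforms.conf documentation for your version.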
Regards,
Ahmed
Sure. I tried without the sourcetype and it didn't fetch any records:
index=video date="2015-01-28"
Here is how my inputs.conf looks:
[monitor:///Users/m/Downloads/cloudfrontlogs.log]
index=video
sourcetype=cloudfront
Fine, I now have all of your configuration on my search head. If you can provide a sample of the logs, I can try to reproduce the issue on my side.
sure,
#Version: 1.0
2015-01-15 20:59:51 JAX1 1871 7.9.79.36 GET d21.cloudfront.net /test-production/The_W/20150112/54b4838ce4b0bba7b00fa440/54b48398e4b0f8e9e9d6ddf2_1413918081967-dr4cae_t_1421116337023_640_360_600.m3u8 200 http://www.tv.com/test/business/m/2015/01/13/1_v.html Mozilla/5.0%2520(Windows%2520NT%25206.0;%2520WOW64)%2520AppleWebKit/537.36%2520(KHTML,%2520like%2520Gecko)%2520Chrome/39.0.2171.95%2520Safari/537.36 - - Hit Ba3xwTDLovRQ12HojdT-zb-czH_cmLUqtYl_m2FmuHE0ow== videos.test.com http 637 0.002
2015-01-15 20:58:05 ATL50 779890 18.12.8.57 GET d21.cloudfront.net /test-production/S_M/20150112/54b44e04e4b0bba7b00fa2b2/54b44e13e4b0f8e9e9d6dcd9_1413918194984-wmfwb0_t_1421102625151_320_180_30000000.ts 200 http://www.tv.com/test/static/js/p/vendor/jwplayer/jw-6.11/jwplayer.flash.swf Mozilla/5.0%2520(compatible;%2520MSIE%252010.0;%2520Windows%2520NT%25206.1;%2520Win64;%2520x64;%2520Trident/6.0) - - Hit QzZckws5b8rKxZttkRy_sWSXF3fOtRnO3Kje6Qf_fvP25YpnWVcvRQ== videos.test.com http 634 2.537
Hello,
All fields are extracted successfully with the above configuration, as shown in the following screenshots:
https://drive.google.com/file/d/0B9wUSHOfDLvoT3E2dVVGcU92eGc/view?pli=1
https://drive.google.com/file/d/0B9wUSHOfDLvoSzFWMnFZWUxFZ2s/view?pli=1
I used the following configuration. (I've added a nullPound entry in props.conf and transforms.conf to exclude the lines that begin with # at the top of the file.)
inputs.conf (/opt/splunk/etc/system/local/inputs.conf)
[monitor:///tmp/support]
index = bcoat_logs
sourcetype = cloudfront
props.conf (/opt/splunk/etc/apps/search/default/props.conf)
[cloudfront]
TRANSFORMS-sourcetype = nullPound, cloudfront
REPORT-main=cloudfront_w3c
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD = 19
SHOULD_LINEMERGE = false
TZ = UTC
pulldown_type = true
transforms.conf
[nullPound]
REGEX = ^#
DEST_KEY=queue
FORMAT=nullQueue
[cloudfront]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::cloudfront
[cloudfront_w3c]
DELIMS = " "
FIELDS = date,time,x_edge_location,sc_bytes,c_ip,cs_method,cs_Host,cs_uri_stem,sc_status,cs_referer,cs_user_agent,cs_uri_query,cs_cookie,x_edge_result_type,x_edge_request_id,x_host_header,cs_protocol,cs_bytes,time_taken
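The nullPound transform is meant to discard header lines such as `#Version: 1.0` before they are indexed, so only data lines reach the delimiter-based extraction. The same filter, sketched in Python as an analogy (not Splunk code; the sample lines are shortened and made up):

```python
import re

HEADER = re.compile(r"^#")  # same pattern as the nullPound REGEX

def drop_headers(lines):
    """Keep only lines that do not start with '#'."""
    return [line for line in lines if not HEADER.search(line)]

# Made-up, shortened sample lines.
sample = [
    "#Version: 1.0",
    "2015-01-15\t20:59:51\tJAX1",
    "2015-01-15\t20:58:05\tATL50",
]
kept = drop_headers(sample)
print(kept)
```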
I think some configuration files in your environment are overriding the configuration we added. Try searching for "cloudfront" in all files, as follows:
grep -R 'cloudfront' /opt/splunk/etc/*
Regards,
Ahmed
Until now I was placing my props.conf in the wrong directory. I have moved it to the location you specified.
#transforms.conf -> etc/system/local/transforms.conf
Running the grep command gives the following output:
etc/apps/search/default/props.conf:[cloudfront]
etc/apps/search/default/props.conf:TRANSFORMS-sourcetype = nullPound, cloudfront
etc/apps/search/default/props.conf:REPORT-main=cloudfront_w3c
etc/system/local/inputs.conf:[monitor:///Users/m/Downloads/cloudfrontlogs.log]
etc/system/local/inputs.conf:sourcetype=cloudfront
etc/system/local/transforms.conf:[cloudfront]
etc/system/local/transforms.conf:FORMAT = sourcetype::cloudfront
etc/system/local/transforms.conf:[cloudfront_w3c]
Nothing seems wrong with the grep output. I can try reinstalling Splunk and trying again.