IIS log files are not read properly - parts of multiple lines getting put together as one

madrum
Explorer

I have a report that groups webpage requests from an IIS log by sc_status. The results are really bad because Splunk appears to be confused about which line, and which part of a line, it's reading. As a result, data like "myurl.com" shows up where "200" should be for sc_status.

I have Splunk set up to monitor the folder where log files are stored in real time and I manually selected IIS logs when identifying the format of the files.

This is what Splunk has stored for one request:
2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET /App_Themes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) stuff_id=stuff;+user=stuff;+persistcookie=True;+stuffSelection=STUFF1,STUFF2,STUFF3,STUFF4,STUFF5,;+MYWEBSITE=R285025761;+ASP.NET_SessionId=3sgbsssgrvbwizta31fcynmx;+MyWebSite.ASPXAUTH=D2E24F7A75F2114DCF6AFB5DA65C739A2972D39870A74C1735EF0B3A819F27D5E743DE70EB6C5D7ADF944507DA71042D235483889FEA3A736EFBA2E81AB02F47A08BA93D51C6563422CE17055236EA5BBDCC03A03B4389CE042ADDFB89AA7A7D6C7246376DB20045AD709BE50444332F048A79BD65269C0919B0A5ADA4EE415EE1E96BCFBF3D5D33507D663A5671DE9E https://m5.0+(Macintosh;+Intel+Mac+OS+X+10_10)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+... MYWEBSITE=R285025761;+ASP.NET_SessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome 200 0 0 24916 515 31

This request appears to be a mashup of two or more requests:
Part 1: 2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET /App_Themes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+Trident/7.0;+rv:11.0)+like+Gecko stuff_id=stuff;+user=stuff;+persistcookie=True;+datalistSelection=OFAC,PEP_FO,;+MYWEBSITE=R285025761;+ASP.NET_SessionId=ykvwd2cgbhjcjck45jcy1w13 https://mywebsite.thisisawesome.com/logon.aspx mywebsite.thisisawesome.com 304 0 0 92 593 62

Part 2: 2015-12-30 15:06:38 W3SVC3 MYWEBSERVER 192.111.11.11 GET /Includes/jquery-1.4.2.min.js - 80 - 209.15.236.88 HTTP/1.1 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_10)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+Safari/600.1.25 MYWEBSITE=R285025761;+ASP.NET_SessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome.com 200 0 0 24916 515 31

and part of another request in the middle.

I can see at least one place where the lines were mashed together. In this snippet, "5671DE9E https://m5.0+(Macintosh;+Int", you can see that "https://m" is part of a URL and "5.0+" is part of a user agent, but they're joined without a space as if they were one field.

Other than that, I'm not sure where the data is coming from in the log file to put that one request together in Splunk.
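
One way to spot events like this outside of Splunk is to check each event's shape. The sketch below is only an illustration: it assumes the 22 space-separated fields seen in the "Part 1" and "Part 2" lines above, with sc-status as the 17th field. The real order comes from the #Fields: header of the log file, so EXPECTED_FIELDS and SC_STATUS_INDEX are assumptions to adjust, and the two sample strings are shortened stand-ins rather than real log lines.

```python
import re

# Assumed layout, inferred from the sample lines in this post.
EXPECTED_FIELDS = 22
SC_STATUS_INDEX = 16  # zero-based position of sc-status in that assumed layout

TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
STATUS = re.compile(r"\d{3}")


def looks_merged(event: str) -> bool:
    """Return True if an event does not look like a single IIS log line."""
    fields = event.split()
    if len(fields) != EXPECTED_FIELDS:
        return True
    if not STATUS.fullmatch(fields[SC_STATUS_INDEX]):
        return True
    # A second timestamp inside one event is another sign of a missed line break.
    return len(TIMESTAMP.findall(event)) > 1


# Shortened, hypothetical events that follow the same layout as the samples above.
GOOD = (
    "2015-12-30 15:06:38 W3SVC3 MYWEBSERVER 192.111.11.11 GET /Includes/jquery-1.4.2.min.js "
    "- 80 - 209.15.236.88 HTTP/1.1 Mozilla/5.0 cookie=1 "
    "https://mywebsite.thisisawesome.com/Logon.aspx mywebsite.thisisawesome.com 200 0 0 24916 515 31"
)
MERGED = (
    "2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET /App_Themes/Blue/Blue.css "
    "- 80 - 54.69.58.243 HTTP/1.1 Mozilla/5.0 cookie=1 https://m5.0+(Macintosh) MYWEBSITE=R1 "
    "https://mywebsite.thisisawesome.com/Logon.aspx mywebsite.thisisawesome.com 200 0 0 24916 515 31"
)

if __name__ == "__main__":
    for name, event in (("GOOD", GOOD), ("MERGED", MERGED)):
        print(name, looks_merged(event))
```

In the merged sample, the two extra tokens push a URL into the sc-status position, which matches the symptom in the report.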

My question is, how do I get Splunk to read my IIS logs properly and not mash up multiple lines into one line?

Thanks!

jkat54
SplunkTrust

It appears you have a line merge / line breaker problem. Check inputs.conf to find the sourcetype assigned to these logs, then find the matching stanza in props.conf and make sure SHOULD_LINEMERGE = false and a LINE_BREAKER is configured. Breaking on the date looks like the best option.

inputs.conf:

[<input stanza>]
...
sourcetype=sourcetypeName

props.conf:

[sourcetypeName]
 ...
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}

Check the docs for reference. If you're sending from a universal forwarder, you'll need to put props on the forwarder.
http://docs.splunk.com/Documentation/Splunk/latest/admin/Propsconf
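
To see what breaking on the date does to a raw stream, here is a rough Python sketch of the idea. It is not how Splunk is implemented. It splits wherever a run of newlines is followed by a "YYYY-MM-DD HH:MM:SS" timestamp and uses a lookahead so the timestamp stays with the event that follows, which approximates the way LINE_BREAKER (as I understand the props.conf spec) discards only the text matched by its first capturing group. The function name and the sample text are made up for illustration.

```python
import re

# Break on newlines only when a timestamp comes next; the lookahead is not consumed,
# so each event keeps its own leading timestamp.
BREAK_BEFORE_TIMESTAMP = re.compile(
    r"[\r\n]+(?=\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})"
)


def split_events(raw: str) -> list[str]:
    """Split a raw log stream into events that each start with a timestamp."""
    pieces = BREAK_BEFORE_TIMESTAMP.split(raw)
    return [piece.strip("\r\n") for piece in pieces if piece.strip()]


if __name__ == "__main__":
    # Hypothetical, shortened stream; real IIS lines carry many more fields.
    raw = (
        "2015-12-30 15:06:54 GET /App_Themes/Blue/Blue.css 304\r\n"
        "2015-12-30 15:06:38 GET /Includes/jquery-1.4.2.min.js 200\r\n"
    )
    for event in split_events(raw):
        print(event)
```

A newline that is not followed by a timestamp does not end the event, so anything that wraps inside a line stays attached to the event it belongs to.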

madrum
Explorer

I believe I should be looking for a sourcetype of "iis" since that's what I've configured the data input as.

In the inputs.conf file, I do not see anything for iis so I'm not sure if any changes are necessary.

I see this in the props.conf file. SHOULD_LINEMERGE is already set to false.

[iis]
pulldown_type = true
MAX_TIMESTAMP_LOOKAHEAD = 32
SHOULD_LINEMERGE = False
INDEXED_EXTRACTIONS = w3c
detect_trailing_nulls = auto
category = Web
description = W3C Extended log format produced by the Microsoft Internet Information Services (IIS) web server

I'll add this below the description line: LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
... and see what happens.
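
For a quick check of what a single props.conf file says about line breaking for a sourcetype, something like the following Python sketch can help. It only reads one file's text, so it does not show the merged configuration that Splunk actually applies across apps, and it assumes the file is simple enough for Python's configparser (no duplicate settings in a stanza, no settings outside a stanza). The function name is hypothetical, and PROPS_SNIPPET is the [iis] stanza from above with the description line left out.

```python
import configparser

PROPS_SNIPPET = """
[iis]
pulldown_type = true
MAX_TIMESTAMP_LOOKAHEAD = 32
SHOULD_LINEMERGE = False
INDEXED_EXTRACTIONS = w3c
detect_trailing_nulls = auto
category = Web
"""


def line_breaking_settings(props_text: str, sourcetype: str) -> dict:
    """Return the line-breaking settings found in one props.conf stanza."""
    # interpolation=None so a '%' inside a regex value is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read_string(props_text)
    stanza = parser[sourcetype]
    return {
        "SHOULD_LINEMERGE": stanza.get("SHOULD_LINEMERGE"),
        "LINE_BREAKER": stanza.get("LINE_BREAKER"),  # None when not set in this text
    }


if __name__ == "__main__":
    print(line_breaking_settings(PROPS_SNIPPET, "iis"))
```

A missing LINE_BREAKER here only means this one file does not set it; another app or the system defaults may still supply a value.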

jkat54
SplunkTrust

Make it a lowercase false, but yeah, with SHOULD_LINEMERGE=false the event boundaries come from LINE_BREAKER. The merge settings (BREAK_ONLY_BEFORE, MUST_BREAK_AFTER, etc.) only apply when SHOULD_LINEMERGE=true.

jkat54
SplunkTrust

The inputs.conf will be located on the forwarder on the IIS servers, or wherever Splunk is reading the log files from.

You can run $SPLUNK_HOME/bin/splunk cmd btool inputs list --debug to see which inputs.conf stanzas are loaded and which app they're loaded from.
