Squid Log files

jgauthier
Contributor

Hey All,

I enabled the squid app for splunk and threw a log file into it. Pretty quick and easy, and I whipped out an additional dashboard. (Thanks to whoever put this together)

I noticed an issue, and in my noobness, I'm looking for some direction. When I loaded the log file, splunk recorded 80,000 records loaded at 8:00pm. Well, it's true I loaded them then, but I think it should have parsed the timestamp in each record so I can do historical reporting. (Correct me if I'm wrong)

So, I looked at the transform and the regex is:
^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$
format is:

duration::$1 clientip::$2 action::$3 http_status::$4 bytes::$5 method::$6 uri::$7 proto::$8 uri_host::$9 uri_port::$10 uri_path::$11 username::$12 hierarchy::$13 server_ip::$14 content_type::$15

The first field should be the timestamp. When looking at squid data in search, the "fields" list includes "timestamp", but splunk has determined that there are "none".

As a refresher, the log file entries look like this:

1301087053.193 182 10.2.40.179 TCP_MISS/400 1083 GET http://api.twitter.com/1/statuses/user_timeline.json? username DIRECT/199.59.148.87 application/json

My regex-fu is weak, and I'm definitely below average. However, shouldn't this include the timestamp in order for splunk to index it by time properly?

So, I want to load last month's data, but I will not be able to report on February 2011, because it all appears to be new data as of the load date.

Thanks for the advice. Moving forward, the records are correct. Obviously, splunk is doing its own timestamp.
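To make the question concrete, here is a minimal Python sketch that runs the regex quoted above against the sample log line. The field names come from the format line; `SQUID_RE`, `FIELDS`, and `SAMPLE` are just illustrative names, and Python's `re` module is only standing in for whatever Splunk does internally. It suggests the leading epoch value is matched by `^\d+\.\d+` but never captured, so this transform on its own would not hand a timestamp to anything.

```python
import re

# Regex copied from the transform quoted above, split in two for readability.
SQUID_RE = re.compile(
    r"^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+"
    r"((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$"
)

# Field names in the order used by the format line ($1 .. $15).
FIELDS = [
    "duration", "clientip", "action", "http_status", "bytes", "method",
    "uri", "proto", "uri_host", "uri_port", "uri_path", "username",
    "hierarchy", "server_ip", "content_type",
]

# Sample entry from the post.
SAMPLE = (
    "1301087053.193 182 10.2.40.179 TCP_MISS/400 1083 GET "
    "http://api.twitter.com/1/statuses/user_timeline.json? username "
    "DIRECT/199.59.148.87 application/json"
)

match = SQUID_RE.match(SAMPLE)
extracted = dict(zip(FIELDS, match.groups()))
for name, value in extracted.items():
    print(f"{name:>12} = {value}")

# The epoch value at the start of the line is consumed by the pattern,
# but it sits outside every capture group.
timestamp_captured = "1301087053.193" in match.groups()
print("timestamp captured:", timestamp_captured)
```

All fifteen fields line up with the format string (with `uri_port` empty for this URL), and the timestamp is the one piece of the line that does not land in any field.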

gkanapathy
Splunk Employee

The transforms.conf file is fine. However, the TIME_FORMAT specified in the props.conf file in the squid app is wrong. I'm not sure why or if it's a typo, but the file says:

TIME_FORMAT = %3N

It should be:

TIME_FORMAT = %s.%3N
TIME_PREFIX = ^

Changing/adding the lines should solve your problem.
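As a rough illustration of what the corrected format is meant to read, `%s` is the epoch seconds and `%3N` the three-digit milliseconds at the very start of the line. The Python sketch below does the same conversion by hand on the sample entry from the question; it is not how Splunk parses the value internally, and `SAMPLE` and `event_time` are just illustrative names.

```python
from datetime import datetime, timezone

# Sample entry from the question.
SAMPLE = (
    "1301087053.193 182 10.2.40.179 TCP_MISS/400 1083 GET "
    "http://api.twitter.com/1/statuses/user_timeline.json? username "
    "DIRECT/199.59.148.87 application/json"
)

# The timestamp is the first whitespace-delimited token on the line,
# which is what anchoring the time prefix at ^ points to.
raw = SAMPLE.split(maxsplit=1)[0]

# Epoch seconds before the dot, milliseconds after it.
seconds, millis = raw.split(".")

# Build the event time in UTC; milliseconds become microseconds.
event_time = datetime.fromtimestamp(int(seconds), tz=timezone.utc).replace(
    microsecond=int(millis) * 1000
)
print(event_time.isoformat())
```

For the sample line this works out to late March 2011 rather than the moment the file was loaded, which is the behaviour you want for historical reporting.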

The app probably worked in the past because when the defined TIME_FORMAT failed, it used the default Splunk time formats. Because of this issue, the default timestamps stopped working for timestamps after March 12, 2011, so Splunk just used current time, which isn't ideal.

jgauthier
Contributor

Sure! I modified some of yours to fit my needs better. So for instance I removed the client IP charts and replaced them with usernames. I also added a "heaviest bandwidth user" search on the main dashboard with this query: 'sourcetype="squid" action="*" | stats sum(bytes) as tb by username | sort -tb | head 10'

I then created a dashboard I called "Sites", in which I pull the top ten users of certain "hot" sites at my company, like facebook, pandora, youtube, etc. These are all done by 'hits', but I would like to reproduce them all by bandwidth as well, as they are different metrics!

Ayn
Legend

Hi! I'm the author of the Squid app. Thanks gkanapathy for discovering this issue. Like you say, it's a typo that has apparently gone undiscovered until now. I'll put out an updated version which fixes this.

jgauthier, I'm interested to hear what additional dashboard you created - maybe it's something that could be useful to Squid app users in general? In that case I could include that in the updated version as well.

jgauthier
Contributor

Thank you! I reloaded a small subset of data and ran some tests. It was perfect. I am going to reload the file now. Thanks so much!
