About parallaxed

parallaxed · ‎11-03-2010

Concluded it must have been another component taking over while splunkd was in iowait... We found our high iowait was due to network distance from the NFS filer (>3ms).

parallaxed · ‎09-23-2010

Try | head 5

parallaxed · ‎09-20-2010

If we could, we would. Unfortunately we can't aggregate all this data in one place, and that's part of the problem that we're using Splunk to try and solve. On the other note about caching: yes, these operations are cacheable (even in v3), but the cache is short-lived as I understand it, for the reasons you mentioned. Certain things like the attribute cache can be disabled for security. If the read pattern can't benefit much from caching, there may be some worth in determining which ops are faster without caching effects, and conservatively using those when network filesystems are in use.

parallaxed · ‎09-17-2010

The key to the final resolution is that our use case involves a lot of small files, and we have a notable latency (~5ms) between filer and indexer. Splunk uses stat() and access() a fair bit during its various uptake cycles. With lots of small files (as opposed to a few big ones), Splunk is spending expensive, uncached iops to stat() the files as it traverses the inputs. Had the situation been reversed (a few big files), the readahead cache would've kicked in, and the effect of the latency would've been negligible. To mitigate this a little, we added forwarders closer to the source (<1ms), to take advantage of a lower RTT on the noncached iops. Curiously, we've observed NFS caching being drastically less effective on access() calls at higher latencies, but we're still investigating some of these interesting side-effects.
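To get a rough feel for how per-file round trips add up, a minimal Python sketch along these lines may help. It is not how splunkd itself issues calls; `time_stat_calls` and `estimated_stat_cost` are hypothetical helper names, and the estimate assumes every stat() pays one full uncached round trip.

```python
import os
import time


def time_stat_calls(root):
    """Walk `root` and time one os.stat() per file.

    Returns (file_count, total_seconds). Hypothetical helper for
    illustration only; it is not part of Splunk.
    """
    count = 0
    total = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            try:
                os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += time.perf_counter() - start
            count += 1
    return count, total


def estimated_stat_cost(file_count, rtt_seconds):
    """Time spent if every stat() costs exactly one uncached round trip."""
    return file_count * rtt_seconds
```

Under that assumption, 100,000 small files at 5ms each come to roughly 500 seconds per full pass, against roughly 100 seconds at 1ms. Measured numbers from `time_stat_calls` will usually be lower on a second run, since the NFS attribute cache may still be warm.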

parallaxed · ‎09-17-2010

Do you think it leans too heavily on stat(), access() and friends? In our traces we saw mixtures of both. Is every call justified? You're not wrong about the latency, it turned out to be a major part of the problem - I've given a quick overview above...
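For anyone wanting to quantify the mix of calls in their own traces, here is a small Python sketch that tallies syscall names from strace output. It assumes the usual `name(args) = result` line layout, optionally prefixed by a PID (from -f) and a timestamp (from -t/-tt); `tally_syscalls` is a hypothetical name, and other strace formatting options are not handled.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line, allowing an
# optional "[pid N]" or bare PID prefix and an optional timestamp.
SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional pid prefix (strace -f)
    r"(?:\d+:\d+:\d+(?:\.\d+)?\s+)?"   # optional timestamp (strace -t/-tt)
    r"(\w+)\("                         # syscall name followed by "("
)


def tally_syscalls(lines):
    """Count how often each syscall name appears in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of a saved trace and comparing the counts for stat and access against read gives a quick idea of how metadata-heavy the workload is.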

parallaxed · ‎09-17-2010

Responding here to get the full formatting. This was solved using a combination of the above, although it's perhaps not as suitable as more advanced pattern matching on the host string.

Create a file (call it grp1) with the desired list of hosts inside. In this case I had a file containing:

box1.*
box2.*
box1500.*
... and so on

You need to get Splunk to index this file; go w/o linemerge (use a newline breaker).

For whatever strange reason,

| fields +host

displays the host field twice, and causes a strange artifact with | format, making your string look like:

OR ( host=box1.domain.com host=box1.* ) OR ( host=box1.domain.com host=box1.* )

The | rex overcomes this problem, so the final search string (to search for all hosts you listed in the file) is:

index=mydata [ search source=*grp1* | rex field=_raw "host=(?<host>.*)" | fields + host | format ]

The subsearch returns the results for the host group; the main search provides the data.
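Generating the group file by hand gets tedious for large groups, so a short Python sketch for writing it might look like the following. It assumes one `host=<pattern>` entry per line, which is inferred from the rex in the search above rather than stated outright; `write_host_group` is a hypothetical helper name.

```python
def write_host_group(path, patterns):
    """Write one 'host=<pattern>' line per entry and return the line count.

    The 'host=' prefix is an assumption inferred from the rex used in the
    search; adjust it to whatever your extraction expects.
    """
    count = 0
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for pattern in patterns:
            handle.write(f"host={pattern}\n")
            count += 1
    return count
```

Each line then becomes its own event once the file is indexed with a newline breaker.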

parallaxed · ‎09-17-2010

At least that limit can be changed!

parallaxed · ‎09-16-2010

Subsearch is the way to go I think, just need to find an optimal way of getting the data in there.

parallaxed · ‎09-16-2010

Liking this idea very much, but | metadata queries on hosts only return 10000 results :[ (afaik this is a hardcoded limitation), and the use case I'm looking at has in excess of that number. We may have to yield to generating plaintext files with the groups, and getting Splunk to index them so they can be returned with a simple subsearch for the source file with a group listing. Something along those lines...

parallaxed · ‎09-16-2010

Globbing/wildcards do not work with the example provided, unfortunately. For example, host1* matches host1, host10, host100, host1000 and all in between. There's no way to specify host321-host426, for example, using wildcards. A full pattern match or range operator [] would suffice, but as I understand it that's currently not possible.
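Since a wildcard can't express a numeric range, one workaround is to expand the range into explicit names outside Splunk. A minimal Python sketch, assuming hosts are named with a fixed prefix and an unpadded number (`expand_host_range` and `to_search_clause` are hypothetical names):

```python
def expand_host_range(prefix, first, last, suffix=""):
    """Return explicit host names for an inclusive numeric range."""
    if first > last:
        raise ValueError("first must not exceed last")
    return [f"{prefix}{n}{suffix}" for n in range(first, last + 1)]


def to_search_clause(hosts):
    """Join host names into a single OR clause usable in a search string."""
    terms = " OR ".join(f'host="{h}"' for h in hosts)
    return f"( {terms} )"
```

For instance, `expand_host_range("host", 321, 426)` yields the 106 names from host321 to host426, which can then be pasted into a search or written to a file for indexing.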

parallaxed · ‎09-16-2010

Hi,

We want to search for hundreds of hosts at a time. The question is similar to these:

http://answers.splunk.com/questions/968/how-can-i-easily-filter-or-limit-my-search-down-to-a-specific-group-of-hosts
^ Globbing is not good because a full text expansion will not match groups like the one in the title. Tags would be in the order of hundreds, which becomes difficult to maintain.

http://answers.splunk.com/questions/730/how-to-search-multiple-value-on-the-same-field/734#734
^ This is more promising, but not ideal for a managed installation where clients may use it, as the csv has to exist in a dir on the server.

What are the alternatives?

parallaxed · ‎08-13-2010

Restart was definitively needed, that was clearly hampering the testing.

parallaxed · ‎08-13-2010

The UI source is under .../share/splunk/search_mrsparkle/ if you want to hack it out. I don't think there's an option to disable it.

parallaxed · ‎08-13-2010

Looks like MetaData:Source should be used, but despite many variations and | extract reload=t, I can't seem to get this to work, even by attempting to force it, as per below.

transforms.conf

[net_type]
DEST_KEY = MetaData:Source
REGEX = .*
FORMAT = source::VMSTAT
WRITE_META = true

props.conf

[net]
SHOULD_LINEMERGE=false
TRANSFORMS-net_type = net_type

^ Firstly, this "forcing" seems like it should be valid - it may not be, please correct me. I'm looking to apply this depending on the raw text of the event, so my source type isn't fixed and can't be set in inputs.conf.

Is source override possible for only certain types of inputs? I should add this is Splunk 4.1.x, and that this transformation works if I use MetaData:Sourcetype instead of MetaData:Source. Why would it work with one field but not the other?

parallaxed · ‎07-26-2010

You'll need to do a couple of things.

Firstly, partition your DHCP data into two events as best you can - separating the salient data from the repeating headers. You can do this by setting the right LINE_BREAKER/MUST_BREAK_AFTER/BREAK_ONLY_BEFORE in props.conf:
http://www.splunk.com/base/Documentation/4.1/Admin/Propsconf

You then need to discard the trash event by sending it to the nullQueue, as described here:
http://www.splunk.com/base/Documentation/4.1.4/Admin/Routeandfilterdata

parallaxed · ‎06-16-2010

Thanks, Lowell. Since it is valid, can this::"$1" syntax (with quotes) appear in the spec for transforms.conf? It'd be good to make it clear in both places on the docs...

parallaxed · ‎06-15-2010

I should add that you're getting no results for the second conf, which kind of backs that up. The first transforms.conf is valid. If you think there's nothing wrong with your regex, try splitting the capture into two separate transforms and see if you can get it to work that way?

parallaxed · ‎06-15-2010

As with a lot of Splunk quirks, I don't see this documented (http://www.splunk.com/base/Documentation/latest/Admin/Transformsconf), so I'm not certain you need those quotes, or that it's even valid syntax in the latest version. Space-escaping is mentioned in that document, but only in relation to FIELDS= capturing, which is used alongside auto-kv/delims extraction (which is not what you're doing).

parallaxed · ‎06-14-2010

Curious - do you have these keys defined in fields.conf? You shouldn't need the quotes in transforms.conf, I'm unsure what that is supposed to achieve, but I assume it works for you in earlier versions? What does your props.conf look like?

parallaxed · ‎06-14-2010

Apologies for the lack of info: 4.1.2, Linux x64, 16GB, 4-way dual-core Xeon (2.33GHz). Files are being read off fast NetApp filers (nfsstat shows no bottlenecks)

parallaxed · ‎06-14-2010

Depending on the source of your data you need to set TZ appropriately, both on the input (props.conf), and in your environment ("export TZ=My/TimeZone"):
http://en.wikipedia.org/wiki/List_of_zoneinfo_time_zones

Splunk will then search correctly with your given offsets, unless something extra-special is happening.

parallaxed · ‎06-14-2010

We have a configuration that's been idling for over two days, and instead of processing locations that the tailing processor has acknowledged, it continues to loop over previously processed locations, and the internal logs. Does this mean there's a practical/hard limit on the number of directories that can be absorbed? It seems other monitor inputs are being neglected somewhat. The tailing processor acknowledged these directories 2 days ago, but had not yet processed down to the bottommost level (the files themselves). Are there any good commands to inspect what the tailing processor is up to? What's on the queues etc?

parallaxed · ‎06-14-2010

I haven't yet correlated with any pauses in strace, so that's promising. I'm guessing it's another component doing heavy lifting, but at the same time we have a lot of inputs to digest. Is there any way to give priority to the various input processors over other components?

parallaxed · ‎06-08-2010

Since the rewrite of the tailing processor in 4.1, on the whole it seems much better than previous incarnations, but it appears to induce a hardcoded delay on directory traversal. There are consistent gaps in our debug output. We have these set:

category.TailingProcessor=DEBUG
category.WatchedFile=DEBUG
category.BatchReader=DEBUG
category.FileTracker=DEBUG

The gaps we see are always around ~250-300ms, always when traversing into directories. Prior versions had similar problems, but these went away somewhat with the tailing_proc_speed option. In the worst case (0.3s), for 10000 distinct directories, this equates to ~50 minutes of idle time introduced by the tailing engine.

A few questions: Is this really a hardcoded pause? If so, what's the reasoning? Also, is there a way to tune / remove it?
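The idle-time figure is just directories multiplied by the observed gap, which a few lines of Python make explicit. This is only the arithmetic behind the estimate, assuming one pause per directory entered; the function names are hypothetical.

```python
def traversal_idle_seconds(directories, pause_seconds):
    """Total idle time if each directory traversal incurs one fixed pause."""
    return directories * pause_seconds


def as_minutes(seconds):
    """Convert seconds to minutes."""
    return seconds / 60.0


# Best and worst case for the 250-300ms gaps seen in the debug output.
best_case = as_minutes(traversal_idle_seconds(10000, 0.25))
worst_case = as_minutes(traversal_idle_seconds(10000, 0.30))
```

With 10000 directories, the 0.3s worst case gives 3000 seconds, i.e. the ~50 minutes quoted; the 0.25s case is a little under 42 minutes.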

parallaxed · ‎06-02-2010

Why is it hard coded? Is there an alternative to count sources that's as quick as using the metadata?

Posts	37
Solutions	5
Karma Given	32
Karma Received	13
Member Since	‎03-09-2010

Online Status	Offline
Date Last Visited	‎06-05-2020 02:02 AM

Searching for large groups of hosts (or any other ...

Override source (tcp:xxxx) of a tcp input using tr...

Practical limit for monitor inputs? 20000+ directo...

splunkd - hardcoded pause on directory traversal?

| metadata type=sources maxes out at 10000 - limit...

Stop times like '0:20:00' being read as 8pm
