About zsimic

zsimic · ‎05-21-2011

Thank you for the clarification, I was used to the stackoverflow.com way which goes the other way around 🙂

zsimic · ‎05-20-2011

My accept rate stays at 0% even though I have 3 out of 5 questions where I accepted an answer. Looks like there's a bug somewhere 🙂 Or is there something that I need to do to make it register?

zsimic · ‎05-20-2011

I have an input set up to monitor a folder where new log files get generated daily. Today, however, a bad process generated a file that is 4GB in size. The process that generated it was stuck in an infinite loop, generating the same log output over and over, until we noticed and killed it. That used up my Splunk limit for the day in one pass.

I have 2 questions:

- Is there a way to entirely delete that one file from the index (and reclaim the space that it used up)?
- Is there a way to tell Splunk to automatically ignore such bad files? Like a limit: tell it to stop indexing any file if it grows bigger than a certain size (say 100MB).

Is there any other clever setting that could tell Splunk to watch out for such occurrences (where a rogue process gets stuck in an infinite loop and generates garbage logs ad vitam aeternam)?
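One way to picture the size limit asked about here is a small pre-filter that runs outside Splunk and only passes along files under a threshold. This is a minimal sketch and not a Splunk setting: the function name `files_within_limit` is hypothetical, and the 100MB default is simply the example figure from the post.

```python
from pathlib import Path

# 100MB, the example limit mentioned in the post (an assumption, tune as needed)
MAX_BYTES = 100 * 1024 * 1024


def files_within_limit(folder, max_bytes=MAX_BYTES):
    """Return the regular files in `folder` whose size is at most `max_bytes`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.stat().st_size <= max_bytes
    )
```

A staging script could copy only the returned files into the folder that the input monitors. Note that a file that is still growing would pass the check until it crosses the threshold.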

zsimic · ‎05-20-2011

Thanks, that clarifies a lot 🙂 Initially I wanted followTail=1 because I want to skip the already existing GBs of log files that were generated before, but I found out that I can use something like ignoreOlderThan=2d which solves that pretty nicely. I see what you mean for sourcetype, it looks like it'll work out nicely. I'm giving it a try and will come back promptly to this answer. Thank you again for the answer.

zsimic · ‎05-19-2011

OK, this works well. The logs where this appears consider '_' to be a character just like any other, so tokenizing on it there is odd. But this is a very acceptable solution.

zsimic · ‎05-19-2011

How to search for a whole word? I try searching for something like "something", but I get matches for many things starting with "something" and followed by an underscore character '_', such as: "something_else", "something_other" etc. I would like to match only "something" (without any underscore after it)
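The matching behaviour being asked for can be illustrated with a regular expression, outside of Splunk. In the sketch below, `\b` is a word boundary, and because `_` counts as a word character in Python's `re` module, "something_else" is not a whole-word match. The helper name `matches_whole_word` is hypothetical, and this is not necessarily how the accepted answer solved it.

```python
import re

# \b is a word boundary; "_" is a word character, so "something_else" does not match
WHOLE_WORD = re.compile(r"\bsomething\b")


def matches_whole_word(line):
    """Return True if `line` contains "something" as a standalone word."""
    return WHOLE_WORD.search(line) is not None
```

Note that a hyphen is not a word character, so "something-else" would still match with this pattern.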

zsimic · ‎05-19-2011

Excellent! This seems to be quite suitable for this. Ignoring files older than 2 days will cover every situation in this case. Thanks!

zsimic · ‎05-19-2011

Looks like your quotes are wrong, and you didn't specify earliest/latest correctly. Try this:

    index="named" earliest=@d-13h latest=@d-1h

Or, if you prefer to have quotes everywhere:

    index="named" earliest="@d-13h" latest="@d-1h"

If you meant "between yesterday 11 PM and today 11 AM", then try this (same as yours, just quotes corrected):

    index="main" earliest="@d-1h" latest="@d+11h"

Hope this helps 🙂
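To see which window those modifiers describe, the arithmetic can be mimicked in a few lines: `@d` snaps to midnight of the current day, and the signed hour offset is then applied. This is only an illustration of the arithmetic, with a hypothetical helper name `day_offset`. It is not Splunk's own parser.

```python
from datetime import datetime, timedelta


def day_offset(now, offset_hours):
    """Mimic "@d" plus a signed hour offset: snap to midnight of `now`, then shift."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(hours=offset_hours)


# Example "now": 2011-05-19 15:30
now = datetime(2011, 5, 19, 15, 30)
window_start = day_offset(now, -1)   # "@d-1h"  -> yesterday 11 PM
window_end = day_offset(now, 11)     # "@d+11h" -> today 11 AM
```

With the same helper, `@d-13h` to `@d-1h` works out to yesterday 11 AM through yesterday 11 PM.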

zsimic · ‎05-19-2011

I have an ActiveBatch setup that generates many files (tens of thousands) in a folder. I'd like to have Splunk read only files freshly generated in these ActiveBatch folders. I am using the setting followTail=1 for now, and it works OK.

Is there a better way to do this? It took Splunk several hours of 100% CPU usage to go through a couple of such folders (with 30K files each). The files are generated once and are never modified after that (so "following their tail" is useless). Is there a way to tell that to Splunk? A setting similar to followTail, but that would tell it to:

- look only at new files in a folder (ignore any files that existed before the input was defined in Splunk)
- each file is created when the corresponding job starts running, and the file grows for some time (anywhere from 1 second to several hours, depending on how long the corresponding job takes to complete)
- once the corresponding job is finished, the log file will never be modified again (no use tailing it anymore)
- there are tens of thousands of such files, in several folders (it looks like tailing all those files is taking a serious toll on splunkd)
- each of these files has a common section at the end that can be used to determine that no more monitoring is necessary (you can see that common section in this question)
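The "common section at the end" idea can be sketched as a simple check on the last non-empty line of a file. The sketch assumes the closing line contains the text "End of Log", as in the sample footer the poster gives in the related Active Batch question. The helper name `is_finished` is hypothetical and this is not something Splunk does by itself.

```python
def is_finished(log_text):
    """Return True if the last non-empty line of the log carries the "End of Log" marker."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    return bool(lines) and "End of Log" in lines[-1]
```

A script using this check could, for instance, hand a file over for one-time indexing only once its job has finished.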

zsimic · ‎05-19-2011

I want to consume log files generated by jobs running under Active Batch. I'm pretty new to Splunk. What would be the best way to set this up? (I could maybe even make an app for this...) I have several questions; any help is greatly appreciated.

Which inputs.conf file should best hold this? I see that there are many choices for this... If I make an app, I guess the answer is obvious: in the app's 'local' folder. But if I don't make an app? Is it best to put it under etc/apps/search/local? Or etc/system/local? Or somewhere else?

Active Batch generates log files in folders of this form (on Windows):

    \\MachineName\ASCI_ABATLOG\SchedulerName\MachineName\process_name_0000032407-21Jan2011-173055_060.log

(the last part being some sort of unique number, followed by a date and timestamp, then a "sequence" number). Each run gets a separate log file. The contents of each log file are composed of whatever the job output is, plus a common section at the end describing where the process ran, when, exit code, etc. (it is unclear whether something can be done to teach Splunk to extract useful data from that common section). I've pasted an example of what that common section looks like at the bottom of this question.

My questions are (sorry, I read the documentation extensively, but can't figure out what the best approach here is):

- How to set things up so that 'source' and 'sourcetype' (and possibly 'host') get extracted from the path name?
- 'host' would correspond to 'MachineName' in the path above
- 'source' should be the full path up to the '_NNNN-date-...' part, I guess (I don't think it's useful to keep that part)
- 'sourcetype' should correspond to the 'process_name' part of the path above

I got started by defining the following, for now, in etc/apps/search/local/inputs.conf. It works, but 'source' and 'sourcetype' aren't useful, and I have to keep repeating sections like these...

    [monitor://\\FFSVK05\ASCI_ABATLOG\QAsched\FFSVK05]
    host = FFSVK05
    disabled = false
    followTail = 1
    index = qa

Here's what the 'common section' I mentioned above looks like. This is always at the end of each file... It would be awesome if Splunk could be taught about it so that it extracts the info in there in a meaningful way.

    ****************************************************
    *           J O B   S T A T I S T I C S            *
    *                                                  *
    *            ActiveBatch (r) Version 7             *
    *       The Enterprise Job Scheduling System       *
    *                  Engineered By                   *
    *          Advanced Systems Concepts Inc           *
    *             http://www.advsyscon.com             *
    ****************************************************
    Job Id : 33096
    Job Name : some_exe_name
    Batch Id : 33009
    Command Line : \\some_share\some_exe_name.exe some_command_line -foo=bar
    Working Directory : c:\temp
    Client Machine : MachineName
    Submitted by : DOMAIN\user
    Job Start Time : 1/24/2011 9:55:40 AM
    Execution User : DOMAIN\user
    Execution Queue : MachineName
    Execution Machine : MachineName
    Job Scheduler : QASCHED
    Job Completed at : 1/24/2011 10:02:16 AM
    Elapsed Time : 0 00:06:35.791
    CPU Time : 0 00:05:02.140
    Completion Status : 0 (0x0) : (The operation completed successfully. )
    -------------Job Object Statistics--------------
    Total User Time : 0 00:05:00.531
    Total Kernel Time : 0 00:00:01.609
    Page Faults : 97807
    Process Count : 1
    Peak Process Memory : 241967104
    Peak Job Memory : 241967104
    Read Operations : 875
    Read Byte Count : 3006947
    Write Operations : 140
    Write Byte Count : 9942
    Other I/O Operations : 76578
    Other I/O Byte Count : 6066282
    ----------
    Note: except where specifically noted, all times are based on the Execution Agent Machine.
    *********************End of Log*********************
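Before committing any extraction rules to Splunk configuration, the intended mapping can be prototyped with plain regular expressions. The sketch below makes several assumptions: the path always has the shape shown in the post, 'host' is taken from the folder just above the file, and the footer lines use " : " as the separator. The names `fields_from_path` and `parse_job_statistics` are hypothetical.

```python
import re

# Path shape assumed from the example in the post:
# \\Machine\ASCI_ABATLOG\Scheduler\Machine\process_name_<id>-<ddMonYYYY>-<HHMMSS>_<seq>.log
PATH_RE = re.compile(
    r"^(?P<source>\\\\[^\\]+\\ASCI_ABATLOG\\[^\\]+\\(?P<host>[^\\]+)\\(?P<sourcetype>[^\\]+?))"
    r"_\d+-\d{1,2}[A-Za-z]{3}\d{4}-\d{6}_\d+\.log$"
)


def fields_from_path(path):
    """Return host, source and sourcetype derived from a log path, or None if it does not fit."""
    match = PATH_RE.match(path)
    return match.groupdict() if match else None


def parse_job_statistics(text):
    """Collect "Key : Value" lines from the footer into a dict, skipping banner lines."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("*"):
            continue  # banner and "End of Log" lines
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

For the example path in the post, this yields host "MachineName", sourcetype "process_name", and a source that stops just before the "_0000032407-..." suffix.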

zsimic · ‎05-01-2011

All right! Made it work; all I needed to do was add index=imported in the join as well 🙂 As in:

    index=imported sourcetype=A class=FOO | join eventid [search index=imported sourcetype=B] | timechart span=24h count by eventid

zsimic · ‎05-01-2011

The join doesn't seem to work: I tried it, but I get no entries returned. The first part of the search had to be like this because of how I organized the data (I had to omit the first 'search' word, otherwise I get nothing):

    index=imported sourcetype=A class=FOO

That returns all results as I expected, but adding a join yields 0 events, no matter what I try.

zsimic · ‎05-01-2011

Hi, I'm trying to feed some data coming from a SQL database to Splunk. I have multiple tables that I'm trying to "flatten out" for Splunk, and I'd like to know if there is any way to keep the connections that existed in the SQL DB. I have something like this:

- Table A: columns timestamp, eventid, class, user, host and multiple other key-value pairs
- Table B: columns timestamp, eventid, severity, message. Many rows in Table B will correspond to 1 row in Table A
- Table C: columns eventid, type, value. Many rows in Table C will also correspond to 1 row in Table A

What I would like to know is basically: what's the best approach to teach Splunk about the connection that exists here via 'eventid' (which is just a number)? I'm planning on simply outputting the contents of Tables A, B and C in this form:

For Table A, one line per row:

    [time] eventid=<number>, class=<string>, user=<string>, host=<string>, etc...

For Table B, also one line per row:

    [time], eventid=<number>, severity=<string>: <message>

For Table C, also one line per row:

    [time], eventid=<number>, type=<string>, value=<decimal number>

I'm new to Splunk, so maybe what I'm worrying about here is irrelevant for Splunk... Is there a way to correlate entries from B and C without repeating 'class=...' in each entry output for B and C? Can Splunk basically find all 'eventid' numbers that were generated for a given 'class' (in a given time frame), then fetch all B and C items for all those 'eventid'-s? And could it sum/average the 'value' found in C for such a 'class'?

I don't want to repeat 'class=' in each line for B and C because the amount of data repeated would be huge:

- there are many fields like 'class' (like 'user' and 'host' shown above), some of them with relatively long values
- there are many entries in B and C for 1 row in A...

To explain this with a more concrete example: suppose I search for all Table A entries with class=FOO, I get a set of entries found with their 'eventid', I would like to now show a chart per day with the number (count) of entries in Table B having one of the 'eventid'-s in that set. Also, with Table C's 'value', I would like to see a chart with average 'value'-s per day, all the 'value'-s considered being averaged in that set of 'eventid'-s, and by 'type' (found in the Table C output). What to do basically (how to best organize the data I output) to make sure Splunk has everything it needs to make the connections?
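The correlation described in this example can be spelled out in plain Python, which makes it clear that 'eventid' alone is enough to connect the rows without repeating 'class' in B and C. This is a sketch of the intended result and not of how Splunk evaluates it. The row dictionaries, the 'day' key and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def eventids_for_class(a_rows, cls):
    """Set of eventids whose Table A row has the given class."""
    return {row["eventid"] for row in a_rows if row["class"] == cls}


def daily_b_counts(a_rows, b_rows, cls):
    """Count Table B rows per day, restricted to eventids of the given class."""
    wanted = eventids_for_class(a_rows, cls)
    counts = defaultdict(int)
    for row in b_rows:
        if row["eventid"] in wanted:
            counts[row["day"]] += 1
    return dict(counts)


def daily_c_averages(a_rows, c_rows, cls):
    """Average Table C 'value' per (day, type), restricted to eventids of the given class."""
    wanted = eventids_for_class(a_rows, cls)
    totals = defaultdict(lambda: [0.0, 0])  # (day, type) -> [sum, count]
    for row in c_rows:
        if row["eventid"] in wanted:
            acc = totals[(row["day"], row["type"])]
            acc[0] += row["value"]
            acc[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}
```

Both functions first collect the eventids for the class from Table A and then filter B or C by membership in that set.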

Posts	13
Solutions	1
Karma Given	6
Karma Received	9
Member Since	‎04-28-2011

Online Status	Offline
Date Last Visited	‎06-05-2020 02:02 AM

Why does my 'accept rate' stay at 0%?

How to delete a file entirely from an index?

How to restrict a search to match a full word?

How to tell splunk to read log files only once, bu...

Recommended way to consume Active Batch logs?

How to tell splunk that 2 entries are related when...

Re: Why does my 'accept rate' stay at 0%?

Re: Recommended way to consume Active Batch logs?

Re: How to restrict a search to match a full word?

Re: How to tell splunk to read log files only once...

Re: data query

Re: How to tell splunk that 2 entries are related ...
