I'm not sure where else to look to troubleshoot the problem below:
I have 4 data-input directories that are being watched by Splunk: two host dirs, each with an apache and a tomcat dir. Once each day at midnight EST, logs are copied here into dated dirs. One directory and its log seem to be routinely ignored, judging by a metadata search on the index for the sources.
I have tried deleting and re-adding the dir path in Splunk Manager (4.3). These dirs and files all have 644 and root:root and Splunk runs as root.
/opt/prod_log_spool/sjctriwa04/apache/FEB14 << ignored
/opt/prod_log_spool/sjctriwa04/tomcat/FEB14
/opt/prod_log_spool/sjctriwa03/apache/FEB14
/opt/prod_log_spool/sjctriwa03/tomcat/FEB14
/opt/prod_log_spool/sjctriwa04/apache/FEB15 << ignored
/opt/prod_log_spool/sjctriwa04/tomcat/FEB15
/opt/prod_log_spool/sjctriwa03/apache/FEB15
/opt/prod_log_spool/sjctriwa03/tomcat/FEB15
for example
# /opt/splunk/bin/splunk search "| metadata type=sources index="logprod*" earliest=-2d" | awk '/FEB14/ {print $4}'
/opt/prod_log_spool/sjctriwa03/tomcat/FEB14/catalina.out
/opt/prod_log_spool/sjctriwa04/tomcat/FEB14/catalina.out
/opt/prod_log_spool/sjctriwa03/apache/FEB14/extended_log
#
# /opt/splunk/bin/splunk search "| metadata type=sources index="hznprod*" earliest=-2d" | awk '/FEB15/ {print $4}'
/opt/prod_log_spool/sjctriwa03/tomcat/FEB15/catalina.out
/opt/prod_log_spool/sjctriwa04/tomcat/FEB15/catalina.out
/opt/prod_log_spool/sjctriwa03/apache/FEB15/extended_log
#
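The manual check above can be automated. The following is only a sketch: the function name `missing_sources` and the `HOSTS` variable are made up for illustration, and it assumes the source path is the 4th whitespace-separated column of the `metadata` output, as in the awk commands above.

```shell
# Report which expected log files do not appear in `metadata` output.
# Usage: missing_sources DAY < metadata_output
# HOSTS (hypothetical variable) overrides the default host list.
# The body runs in a subshell, so its variables stay local.
missing_sources() (
    day=$1
    found=$(awk '{print $4}')     # source paths Splunk reported (assumed column 4)
    for host in ${HOSTS:-sjctriwa03 sjctriwa04}; do
        for app in apache tomcat; do
            # File names as they appear in the listings above.
            if [ "$app" = apache ]; then file=extended_log; else file=catalina.out; fi
            path="/opt/prod_log_spool/$host/$app/$day/$file"
            # -x and -F: match the whole line, as a fixed string
            printf '%s\n' "$found" | grep -qxF -- "$path" || echo "MISSING $path"
        done
    done
)
```

It could be fed from the same search, for example `/opt/splunk/bin/splunk search '| metadata type=sources index=hznprod*' | missing_sources FEB14`, with the whole search string in single quotes so the shell leaves the `*` alone.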
Now today, for the 25th, three of the four logs have vanished! Even last night three of the four logs were still there, so two more went missing overnight! This is insane.
$ splunklogchk
Thu Mar 1 19:04:37 GMT 2012 - Splunk prod logs for the last 7 days summary ...
updated=Fri 24 Feb 2012 06:30:18 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB23/extended_log
updated=Fri 24 Feb 2012 06:21:53 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB23/extended_log
updated=Fri 24 Feb 2012 06:37:57 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB23/catalina.out
updated=Fri 24 Feb 2012 06:25:59 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB23/catalina.out
updated=Sat 25 Feb 2012 05:39:40 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB24/extended_log
updated=Sat 25 Feb 2012 05:37:17 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB24/extended_log
updated=Sat 25 Feb 2012 05:43:26 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB24/catalina.out
updated=Sat 25 Feb 2012 05:38:00 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB24/catalina.out
updated=Sun 26 Feb 2012 09:49:25 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB25/catalina.out
***
*** three logs missing here
***
updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out
*** 1 log missing here
updated=Tue 28 Feb 2012 05:39:55 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB27/extended_log
updated=Tue 28 Feb 2012 05:38:14 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB27/extended_log
updated=Tue 28 Feb 2012 05:41:23 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB27/catalina.out
updated=Tue 28 Feb 2012 05:39:10 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB27/catalina.out
updated=Wed 29 Feb 2012 07:50:42 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB28/extended_log
updated=Wed 29 Feb 2012 05:37:57 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB28/extended_log
updated=Thu 01 Mar 2012 02:09:33 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB28/catalina.out
updated=Wed 29 Feb 2012 05:38:48 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB28/catalina.out
updated=Thu 01 Mar 2012 09:45:06 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB29/extended_log
updated=Thu 01 Mar 2012 09:47:11 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB29/extended_log
updated=Thu 01 Mar 2012 09:48:38 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB29/catalina.out
updated=Thu 01 Mar 2012 09:46:39 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB29/catalina.out
nothing ... or am I missing something here in concept or syntax? like this?
/opt/splunk/bin/splunk search source="
or like this
/opt/splunk/bin/splunk search "| metadata type=sources index="hznprod_apache" source="
or something else ... ???
http://docs.splunk.com/Documentation/Splunk/latest/searchreference/metadata says that the metadata command doesn't support option 'source'.
Please note that at this stage I'm just looking for the file name/path to validate.
Further, I'm seeing something VERY odd. I have a cron job that moves the 4 files each night at 00:20 EST. Two hours later I have a cron job that does the meta search and emails me the results. I have noticed that occasionally Splunk will apparently 'forget' about files that it has already consumed!
For instance.
for logs of the 26th, i got this the first night:
updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Tue 28 Feb 2012 05:00:14 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out
the next night I got this - one log was 'forgotten'!
updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out
This has happened several times now. One time I did a one-time index of the file, verified it, and two days later it disappeared AGAIN! I'm VERY concerned about this behaviour if I'm seeing it with just 4 logs from 2 hosts (about 3GB/day). What's going to break when I scale up, and how am I going to track and verify that Splunk is actually doing what I have instructed it to do?
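One way to catch a 'forgotten' source automatically is to diff two nightly reports. This is a sketch under assumptions: the function name `forgotten_sources` is made up, and each report is assumed to be a saved file of `updated=..., source=...` lines like the ones above.

```shell
# List sources present in an older report but absent from a newer one.
# Usage: forgotten_sources OLD_REPORT NEW_REPORT
# The body runs in a subshell, so its variables stay local.
forgotten_sources() (
    LC_ALL=C; export LC_ALL               # same collation for sort and comm
    old=$(mktemp) new=$(mktemp)
    # Keep only the path that follows "source=" on each line.
    sed -n 's/.*source=//p' "$1" | sort -u > "$old"
    sed -n 's/.*source=//p' "$2" | sort -u > "$new"
    comm -23 "$old" "$new"                # lines found only in the old report
    rm -f "$old" "$new"
)
```

Run against the two FEB26 listings above, this would print only the sjctricwa03p apache path, which is the log that dropped out.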
FYI the output here is from metadata piped thru awk as below
# Print the sources Splunk reports for each date label in the d array.
# Assumes $end, the d array, $splUname and $splPass are set elsewhere,
# and that awk is gawk (strftime is a gawk extension).
querySplunk ()
{
    local date idx
    echo "Splunk prod logs for the last $end days summary ..."
    for date in "${d[@]}"; do
        echo "## $date"
        for idx in hznprod_apache hznprod_tc; do
            # $3 is the epoch "updated" time, $4 is the source path
            /opt/splunk/bin/splunk search "| metadata type=sources index=$idx" \
                -auth "$splUname:$splPass" |
                awk -v dateStr="$date" '$0 ~ dateStr {printf "updated=%s, source=%s\n", strftime("%c",$3), $4}'
        done
    done
}
nothing? I'm still wedged.... 😞
Let's make sure the data is even in Splunk.
Can you run the following search and let me know if it found anything:
source="*apache/FEB14*"
Make sure you set the search time range to "All Time". The most common cause I've seen for data not showing up in Splunk is that the date/time stamp got mangled during parsing, so the events were indexed with some future date.
Brian
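The future-timestamp theory can also be checked from the command line. The sketch below makes two assumptions: the function name `future_dated` is made up, and the `metadata` columns are taken to be firstTime, lastTime, recentTime, source, totalCount, type, which is how the sample output in this thread appears to be laid out.

```shell
# Flag sources whose newest event timestamp lies in the future.
# Usage: future_dated [NOW_EPOCH] < metadata_output
# Column 2 is assumed to be lastTime and column 4 the source path.
future_dated() {
    awk -v now="${1:-$(date +%s)}" \
        '$2 + 0 > now + 0 { print $4, "lastTime=" $2 }'
}
```

Piping the output of a `metadata type=sources` search into `future_dated` would list any source with events dated later than the current time. A search restricted with `earliest=-2d` could then miss a log whose events carry the wrong date.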
Thanks Brian. That 'FileStatus' info is super helpful; I wasn't aware of it. Puzzlingly, it shows that the file was read. I'm not sure what to make of that, because the meta search still shows it is not in the index. And in the splunk log (all), I see plenty of entries about tomcat/FEB14/catalina.out for both servers, but NONE for apache/FEB14/extended_log, yet I can see from the meta output that one server's apache extended_log WAS read. OK, now I'm even more confused.
> /opt/prod_log_spool/sjctricwa04p/apache/FEB14
>   parent /opt/prod_log_spool/sjctricwa04p/apache
>   type directory
> /opt/prod_log_spool/sjctricwa04p/apache/FEB14/extended_log
>   file position 111741482
>   file size 111741482
>   parent /opt/prod_log_spool/sjctricwa04p/apache
>   percent 100.00
>   type finished reading
> /opt/prod_log_spool/sjctricwa04p/apache/FEB15
>   parent /opt/prod_log_spool/sjctricwa04p/apache
>   type directory
> /opt/prod_log_spool/sjctricwa04p/apache/FEB15/extended_log
>   file position 117054516
>   file size 117054516
>   parent /opt/prod_log_spool/sjctricwa04p/apache
>   percent 100.00
>   type finished reading
# /opt/splunk/bin/splunk search "| metadata type=sources index="hznprod*" earliest=-2d" | grep FEB14
1328623241 1329281946 1329321589 /opt/prod_log_spool/sjctricwa03p/tomcat/FEB14/catalina.out 1035125 sources
1329177600 1329281953 1329322086 /opt/prod_log_spool/sjctricwa04p/tomcat/FEB14/catalina.out 786746 sources
1329195601 1329281999 1329321250 /opt/prod_log_spool/sjctricwa03p/apache/FEB14/extended_log 318495 sources
#
What do you get if you do a search for source="/FEB14/" with the time range of all time?
EDIT: That's source="
Brian
Do you see anything in $SPLUNK_HOME/var/log/splunkd.log on that server?
If you go to https://
Another good resource is http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/
Anyone have any ideas? I don't even know where to look next ...