Getting Data In

not indexing 1 of 4 logs

pcorchary
Explorer

I'm not sure where else to look to troubleshoot the problem below:

I have 4 data-input directories that are being watched by Splunk: two host dirs, each with an apache and a tomcat dir. Once each day at midnight EST, logs are copied here into dated dirs. One directory and its log are routinely ignored, judging by a metadata search on the index for the sources.

I have tried deleting and re-adding the dir path in Splunk Manager (4.3). These dirs and files all have 644 and root:root and Splunk runs as root.

/opt/prod_log_spool/sjctriwa04/apache/FEB14  << ignored 
/opt/prod_log_spool/sjctriwa04/tomcat/FEB14
/opt/prod_log_spool/sjctriwa03/apache/FEB14
/opt/prod_log_spool/sjctriwa03/tomcat/FEB14

/opt/prod_log_spool/sjctriwa04/apache/FEB15  << ignored 
/opt/prod_log_spool/sjctriwa04/tomcat/FEB15
/opt/prod_log_spool/sjctriwa03/apache/FEB15
/opt/prod_log_spool/sjctriwa03/tomcat/FEB15

for example

# /opt/splunk/bin/splunk search "| metadata type=sources index="logprod*" earliest=-2d" | awk '/FEB14/ {print $4}'
/opt/prod_log_spool/sjctriwa03/tomcat/FEB14/catalina.out
/opt/prod_log_spool/sjctriwa04/tomcat/FEB14/catalina.out
/opt/prod_log_spool/sjctriwa03/apache/FEB14/extended_log
# 
# /opt/splunk/bin/splunk search "| metadata type=sources index="hznprod*" earliest=-2d" | awk '/FEB15/ {print $4}'
/opt/prod_log_spool/sjctriwa03/tomcat/FEB15/catalina.out
/opt/prod_log_spool/sjctriwa04/tomcat/FEB15/catalina.out
/opt/prod_log_spool/sjctriwa03/apache/FEB15/extended_log
# 
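
A check like the one above can be turned into an explicit "what is missing" report. This is only a minimal sketch: `missing_sources` is a hypothetical helper name, and it assumes the source path is the 4th whitespace-separated column of the metadata output, as in the awk calls above.

```shell
#!/bin/sh
# Hypothetical helper: read `metadata type=sources` output on stdin and
# print every expected path (given as arguments) that is not listed.
# Assumption: column 4 of each output line is the source path.
missing_sources() {
    ms_seen=$(awk '{print $4}')
    for ms_expected in "$@"; do
        # -x: whole-line match, -F: fixed string (no regex surprises)
        printf '%s\n' "$ms_seen" | grep -qxF -- "$ms_expected" ||
            printf 'MISSING %s\n' "$ms_expected"
    done
}

# Example (paths taken from the listing above):
#   /opt/splunk/bin/splunk search '| metadata type=sources index=hznprod*' |
#       missing_sources \
#           /opt/prod_log_spool/sjctriwa04/apache/FEB14/extended_log \
#           /opt/prod_log_spool/sjctriwa04/tomcat/FEB14/catalina.out
```

Anything printed with a MISSING prefix is a file Splunk was expected to index but that the metadata search does not report.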

pcorchary
Explorer

Today, three of the four logs for the 25th have vanished! Last night three of the four were still there, so two more went missing overnight! This is insane.

$ splunklogchk

Thu Mar  1 19:04:37 GMT 2012 - Splunk prod logs for the last 7 days summary ...
updated=Fri 24 Feb 2012 06:30:18 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB23/extended_log
updated=Fri 24 Feb 2012 06:21:53 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB23/extended_log
updated=Fri 24 Feb 2012 06:37:57 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB23/catalina.out
updated=Fri 24 Feb 2012 06:25:59 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB23/catalina.out

updated=Sat 25 Feb 2012 05:39:40 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB24/extended_log
updated=Sat 25 Feb 2012 05:37:17 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB24/extended_log
updated=Sat 25 Feb 2012 05:43:26 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB24/catalina.out
updated=Sat 25 Feb 2012 05:38:00 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB24/catalina.out

updated=Sun 26 Feb 2012 09:49:25 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB25/catalina.out
***
*** three logs missing here
***
updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out
*** 1 log missing here

updated=Tue 28 Feb 2012 05:39:55 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB27/extended_log
updated=Tue 28 Feb 2012 05:38:14 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB27/extended_log
updated=Tue 28 Feb 2012 05:41:23 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB27/catalina.out
updated=Tue 28 Feb 2012 05:39:10 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB27/catalina.out

updated=Wed 29 Feb 2012 07:50:42 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB28/extended_log
updated=Wed 29 Feb 2012 05:37:57 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB28/extended_log
updated=Thu 01 Mar 2012 02:09:33 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB28/catalina.out
updated=Wed 29 Feb 2012 05:38:48 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB28/catalina.out

updated=Thu 01 Mar 2012 09:45:06 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB29/extended_log
updated=Thu 01 Mar 2012 09:47:11 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB29/extended_log
updated=Thu 01 Mar 2012 09:48:38 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB29/catalina.out
updated=Thu 01 Mar 2012 09:46:39 AM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB29/catalina.out

pcorchary
Explorer

nothing ... or am I missing something here in concept or syntax? like this?
/opt/splunk/bin/splunk search source="/FEB14/"

or like this
/opt/splunk/bin/splunk search "| metadata type=sources index="hznprod_apache" source="/FEB14/"

or something else ... ???

http://docs.splunk.com/Documentation/Splunk/latest/searchreference/metadata says that the metadata command doesn't support option 'source'.

please note here that I'm actually just (at this stage) looking for the file name/path to validate.

Further, I'm seeing something VERY odd. I have a cron job that moves the 4 files each night at 00:20 EST. Two hours later, another cron job runs the metadata search and emails me the results. I have noticed that Splunk will occasionally 'forget' about files that it has already consumed!

For instance.

for logs of the 26th, i got this the first night:

updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Tue 28 Feb 2012 05:00:14 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out

the next night I got this - one log was 'forgotten'!

updated=Mon 27 Feb 2012 05:25:37 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/apache/FEB26/extended_log
updated=Mon 27 Feb 2012 05:26:13 AM GMT, source=/opt/prod_log_spool/sjctricwa04p/tomcat/FEB26/catalina.out
updated=Tue 28 Feb 2012 05:01:22 PM GMT, source=/opt/prod_log_spool/sjctricwa03p/tomcat/FEB26/catalina.out

This has happened several times now. One time I did a one-time index of the file, verified it, and two days later it disappeared AGAIN! I'm VERY concerned about this behaviour if I'm seeing it with just 4 logs from 2 hosts (about 3 GB/day). What's going to break when I scale up, and how am I going to track and verify that Splunk is actually doing what I have instructed it to do?

FYI the output here is from metadata piped thru awk as below

querySplunk ()
{
    # Expects the caller to have set: $end (number of days), the array d
    # (date tags such as FEB26), $splUname and $splPass.
    # strftime() needs gawk.
    echo "Splunk prod logs for the last $end days summary ..."
    for date in "${d[@]}"; do
        echo "## $date"
        for idx in hznprod_apache hznprod_tc; do
            # metadata output columns: $3 = recentTime (epoch), $4 = source path
            /opt/splunk/bin/splunk search "| metadata type=sources index=$idx" \
                -auth "$splUname:$splPass" |
                awk -v dateStr="$date" '$0 ~ dateStr {printf "updated=%s, source=%s\n", strftime("%c", $3), $4}'
        done
    done
}
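
The function relies on a list of date tags that is built elsewhere and not shown. As a rough sketch only, the tags could be generated like this, assuming GNU date and assuming the tags match the dated directory names (FEB14, FEB15, ...). `date_tags` is a hypothetical helper name, not part of the original script.

```shell
#!/bin/sh
# Print the directory tags (e.g. FEB14) for the last N days, oldest first.
# Assumes GNU date (for -d "N days ago"); LC_ALL=C keeps month names English.
date_tags() {
    dt_i=$1
    while [ "$dt_i" -ge 1 ]; do
        LC_ALL=C date -d "$dt_i days ago" +%b%d | tr '[:lower:]' '[:upper:]'
        dt_i=$((dt_i - 1))
    done
}

# In bash, the array used by querySplunk could then be filled with:
#   end=7; d=( $(date_tags "$end") )
```

Starting at "1 day ago" rather than today reflects that each day's logs only arrive after midnight.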

pcorchary
Explorer

nothing? I'm still wedged.... 😞


Brian_Osburn
Builder

Let's make sure the data is even in Splunk.

Can you run the following search and let me know if it found anything:
source="*apache/FEB14*"
Make sure you set the search time range to "All time". The most common reason I've seen for data apparently not being in Splunk is that the date/time stamp got mangled during parsing and the events were actually indexed with some future date.
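
The future-date theory can also be checked from the command line. A minimal sketch, assuming the metadata CLI output has lastTime (epoch seconds of the newest event) in column 2 and the source path in column 4, which is what the sample output elsewhere in this thread suggests. `future_sources` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `metadata type=sources` output on stdin and
# print sources whose newest event time lies after a reference time.
# $1 = reference epoch (defaults to the current time).
# Assumption: column 2 = lastTime, column 4 = source path.
future_sources() {
    fs_now=${1:-$(date +%s)}
    awk -v now="$fs_now" '$2 + 0 > now + 0 {print $4, "lastTime=" $2}'
}

# Example:
#   /opt/splunk/bin/splunk search '| metadata type=sources index=hznprod*' |
#       future_sources
```

Any source listed here has events stamped later than "now", which would hide them from a search limited to the last few days.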

Brian


pcorchary
Explorer

Thanks Brian - that 'FileStatus' info is super helpful. I wasn't aware of it. Puzzlingly, it shows that the file was read. I'm not sure what to make of that, because the metadata search still shows that it is not in the index. And in the splunk log (all), I see plenty of entries about FEB14/tomcat/catalina.log for both servers, but NONE for apache/extended.log ... yet I can see from the metadata output that one server's apache/extended.log WAS read. OK, now I'm even more confused.

> /opt/prod_log_spool/sjctricwa04p/apache/FEB14
>     parent         /opt/prod_log_spool/sjctricwa04p/apache
>     type           directory
> /opt/prod_log_spool/sjctricwa04p/apache/FEB14/extended_log
>     file position  111741482
>     file size      111741482
>     parent         /opt/prod_log_spool/sjctricwa04p/apache
>     percent        100.00
>     type           finished reading
> /opt/prod_log_spool/sjctricwa04p/apache/FEB15
>     parent         /opt/prod_log_spool/sjctricwa04p/apache
>     type           directory
> /opt/prod_log_spool/sjctricwa04p/apache/FEB15/extended_log
>     file position  117054516
>     file size      117054516
>     parent         /opt/prod_log_spool/sjctricwa04p/apache
>     percent        100.00
>     type           finished reading

# /opt/splunk/bin/splunk search "| metadata type=sources index="hznprod*" earliest=-2d" | grep FEB14
1328623241 1329281946 1329321589 /opt/prod_log_spool/sjctricwa03p/tomcat/FEB14/catalina.out                1035125 sources
1329177600 1329281953 1329322086 /opt/prod_log_spool/sjctricwa04p/tomcat/FEB14/catalina.out                 786746 sources
1329195601 1329281999 1329321250 /opt/prod_log_spool/sjctricwa03p/apache/FEB14/extended_log                 318495 sources
# 


Brian_Osburn
Builder

What do you get if you search for source="*/FEB14/*" with the time range set to All time?

EDIT: That's source="*/FEB14/*", with an asterisk on each side of /FEB14/. For some reason the forum converted it to italics. o_O

Brian


Brian_Osburn
Builder

Do you see anything in $SPLUNK_HOME/var/log/splunkd.log on that server?

If you go to https://:8089/services/admin/inputstatus/TailingProcessor:FileStatus you should see what the server status is around what files it has open.

Another good resource is http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/
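
The FileStatus page can also be pulled from a shell, which is handy for a cron check. This is only a sketch: it assumes splunkd's management port is reachable on localhost:8089 and that SPLUNK_USER and SPLUNK_PASS hold valid credentials. Both helper names are hypothetical, and the response is raw XML, so the grep is a crude filter.

```shell
#!/bin/sh
# Assumptions (not from the thread): management host defaults to
# localhost, credentials come from SPLUNK_USER / SPLUNK_PASS.

# Build the FileStatus URL for a given management host.
filestatus_url() {
    printf 'https://%s:8089/services/admin/inputstatus/TailingProcessor:FileStatus\n' "$1"
}

# Fetch the status page and keep only lines mentioning a path fragment.
# $1 = path fragment, $2 = host (optional).
# -k skips certificate verification (default installs typically use a
# self-signed certificate); -s silences the progress meter.
filestatus_for() {
    curl -sk -u "$SPLUNK_USER:$SPLUNK_PASS" \
        "$(filestatus_url "${2:-localhost}")" | grep -F -- "$1"
}

# Example: filestatus_for apache/FEB14
```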

pcorchary
Explorer

Anyone have any ideas? I don't even know where to look next ...
