Getting Data In

Source does not show up in search

chaseleechun
Explorer

I added a directory containing 5 files, but search only returns events from 2 of them.

Some background:

  1. Added the 5 files individually, with default sourcetypes.
  2. Added the directory containing the 5 files, with a manually set sourcetype.
  3. Used | delete to remove the events from the files added in (1).
  4. Now search only returns events from 2 files.

I used "splunk list monitor" and it shows that all 5 files are being monitored.

I ran "| metadata type=sourcetypes" and the results show a "totalCount" for the 2 files only. (Note: the files added earlier in (1) had a totalCount of 0.)

Also, when I disable or delete the data input from (2), the results from the 2 files are still shown.

Can anyone explain this behaviour, or what I should or should not have done? How can I re-add the 5 files?

chaseleechun
Explorer

Found the answer in the crcSalt setting in inputs.conf: http://www.splunk.com/base/Documentation/latest/admin/Inputsconf

crcSalt = <string>
* Use this setting to force Splunk to consume files that have matching CRCs (cyclic redundancy checks). (Splunk only 
  performs CRC checks against the first few lines of a file. This behavior prevents Splunk from indexing the same 
  file twice, even though you may have renamed it -- as, for example, with rolling log files. However, because the 
  CRC is based on only the first few lines of the file, it is possible for legitimately different files to have 
  matching CRCs, particularly if they have identical headers.)
* If set, <string> is added to the CRC.
* If set to the literal string <SOURCE> (including the angle brackets), the full directory path to the source file 
  is added to the CRC. This ensures that each file being monitored has a unique CRC. When crcSalt is invoked, 
  it is usually set to <SOURCE>.
* Be cautious about using this attribute with rolling log files; it could lead to the log file being re-indexed 
  after it has rolled. 
* Defaults to empty. 
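
For anyone hitting the same problem, here is a minimal sketch of what the monitor stanza might look like with this setting applied. The directory path and sourcetype name are placeholders I made up for illustration, not the values from my setup; only crcSalt = <SOURCE> comes from the spec quoted above.

```ini
# inputs.conf
# Hypothetical directory and sourcetype; replace with your own values.
[monitor:///path/to/dir]
sourcetype = my_sourcetype
# Add each file's full path to its CRC so that files with identical
# headers are no longer treated as the same file.
crcSalt = <SOURCE>
```

Because the salt changes the CRC of every file under the stanza, files that were already indexed will probably be consumed again, so expect duplicate events for those unless the old ones are gone. As the spec warns, this is also risky for rolling log files.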

chaseleechun
Explorer

I read somewhere that previously indexed files may clash with newly added files if they have the same content. Could this be the case here, even though I have deleted the earlier indexed files?
