Splunk Search

Why do search results add up when the source is modified?

boney_s
Explorer

Hello friends,

I have indexed my own .log file into Splunk, and there are about 10 events in that log file. I wonder why the number of events returned doubles if I change the .log file manually (edit it in Notepad and save). For example, if I change the value of one field from A to B (fields are separated by ":") and, let's say, there are x results, Splunk will show x + x events for the new search.
Could somebody please explain how Splunk works in this scenario? Thanks in advance.

0 Karma

boney_s
Explorer

Hello Guys,

I have one more question related to the one above. I have indexed a DB table in Splunk, and its data changes frequently. Why does Splunk show previously stored results even if the DB table is empty? Is there any way to show real-time DB table values in Splunk? That is, if the table has 10 rows, Splunk should show "10 events", and if the table has 0 rows, Splunk should show "0 events". Is that possible?

Thanks in advance

0 Karma

MuS
Legend
0 Karma

boney_s
Explorer

Thanks man, that really saved me time.
BTW, what is the minimum interval I can set for a "scheduled report"? Right now I am using */1 * * * * (cron), which I believe runs it every minute. Is it possible to reduce this interval to the seconds range?

0 Karma

ngatchasandra
Builder

To answer this question, I went through the following steps:

  1. First, I created an index and indexed a CSV file into it. When I search for “index=test1” in the search bar, the result gives me 3 events:

    9/15/14
    10:52:20.411 AM
    8750321,"*ALL",JPD910,"11484/6788",UNKNOWN,"Mon Sep 15 10:52:20.411000","IPCMISC.C299
    State information for process 10036, User=8750321, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP06, Version=LIS0001, Thread ID=9280, Thread Name=WRK:8750321_09D55650_P5542319."
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/13/14
    3:28:27.380 PM

    6635103,"*ALL",JPD910,"2664/14920",UNKNOWN,"Sat Sep 13 15:28:27.380002","IPCMISC.C299
    State information for process 7764, User=6635103, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP07, Version=LIS0001, Thread ID=12632, Thread Name=WRK:Starting jdeCallObject."
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/11/14
    12:59:54.203 PM
    6107085,"*ALL",JPD910,"12760/14360",UNKNOWN,"Thu Sep 11 12:59:54.203001","IPCMISC.C299
    State information for process 10784, User=6107085, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP04, Version=LIS0001, Thread ID=13536, Thread Name=WRK:Starting jdeCallObject."
    host = student16-PC source = APPCB02.csv sourcetype = csv

  2. Second, I edited this CSV file and changed the first value of the first event:
    8750321 became 8750999.
    When I re-indexed the same file into another index, which I created and named test2, the same search gave me the following 6 events:

    9/15/14
    10:52:20.411 AM
    " State information for process 10036, User=8750321, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP06, Version=LIS0001, Thread ID=9280, Thread Name=WRK:8750321_09D55650_P5542319."""
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/15/14
    10:52:20.411 AM
    "8750999,""*ALL"",JPD910,""11484/6788"",UNKNOWN,""Mon Sep 15 10:52:20.411000"",""IPCMISC.C299"
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/13/14
    3:28:27.380 PM

    " State information for process 7764, User=6635103, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP07, Version=LIS0001, Thread ID=12632, Thread Name=WRK:Starting jdeCallObject."""
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/13/14
    3:28:27.380 PM

    "6635103,""*ALL"",JPD910,""2664/14920"",UNKNOWN,""Sat Sep 13 15:28:27.380002"",""IPCMISC.C299"
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/11/14
    12:59:54.203 PM
    " State information for process 10784, User=6107085, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP04, Version=LIS0001, Thread ID=13536, Thread Name=WRK:Starting jdeCallObject."""
    host = student16-PC source = APPCB02.csv sourcetype = csv

    9/11/14
    12:59:54.203 PM
    "6107085,""*ALL"",JPD910,""12760/14360"",UNKNOWN,""Thu Sep 11 12:59:54.203001"",""IPCMISC.C299"
    host = student16-PC source = APPCB02.csv sourcetype = csv

  3. Finally, I repeated the same process with an index named test (index=test sourcetype=tes), but this time I configured the sourcetype manually before saving. This gave me the following results:

    User,Role,Environment,PID,"Thread_Name","Date_Thread","File_Thread"
    host = student16-PC source = APPCB02.csv sourcetype = tes

    9/15/14
    10:52:20.411 AM
    8750321,"*ALL",JPD910,"11484/6788",UNKNOWN,"Mon Sep 15 10:52:20.411000","IPCMISC.C299
    State information for process 10036, User=8750321, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP06, Version=LIS0001, Thread ID=9280, Thread Name=WRK:8750321_09D55650_P5542319."
    host = student16-PC source = APPCB02.csv sourcetype = tes

    9/13/14
    3:28:27.380 PM

    6635103,"*ALL",JPD910,"2664/14920",UNKNOWN,"Sat Sep 13 15:28:27.380002","IPCMISC.C299
    State information for process 7764, User=6635103, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP07, Version=LIS0001, Thread ID=12632, Thread Name=WRK:Starting jdeCallObject."
    host = student16-PC source = APPCB02.csv sourcetype = tes

    9/11/14
    12:59:54.203 PM
    6107085,"*ALL",JPD910,"12760/14360",UNKNOWN,"Thu Sep 11 12:59:54.203001","IPCMISC.C299
    State information for process 10784, User=6107085, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP04, Version=LIS0001, Thread ID=13536, Thread Name=WRK:Starting jdeCallObject."
    host = student16-PC source = APPCB02.csv sourcetype = tes

This is the same result as in the first step, except that the first event is the header line of the CSV file.
In conclusion, I would say that the search results add up when the source is modified because the sourcetype is not configured manually.

Note: If you don't specify a sourcetype when searching, the events add up (are duplicated) whenever you re-run the search (index=test). For example, when I ran the same search as in step 3 without its sourcetype one day later, I got 10 events, as follows:

12/10/14 
1:47:56.000 PM  
User,Role,Environment,PID,"Thread_Name","Date_Thread","File_Thread"
host = student16-PC source = APPCB02.csv sourcetype = tes


9/15/14 
10:52:20.411 AM 
"   State information for process 10036, User=8750321, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP06, Version=LIS0001, Thread ID=9280, Thread Name=WRK:8750321_09D55650_P5542319."""
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/15/14 
10:52:20.411 AM 
"8758888,""*ALL"",JPD910,""11484/6788"",UNKNOWN,""Mon Sep 15 10:52:20.411000"",""IPCMISC.C299"
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/15/14 
10:52:20.411 AM 
8750321,"*ALL",JPD910,"11484/6788",UNKNOWN,"Mon Sep 15 10:52:20.411000","IPCMISC.C299
    State information for process 10036, User=8750321, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP06, Version=LIS0001, Thread ID=9280, Thread Name=WRK:8750321_09D55650_P5542319."
host = student16-PC source = APPCB02.csv sourcetype = tes


9/13/14 
3:28:27.380 PM  
"   State information for process 7764, User=6635103, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP07, Version=LIS0001, Thread ID=12632, Thread Name=WRK:Starting jdeCallObject."""
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/13/14 
3:28:27.380 PM  
"6635103,""*ALL"",JPD910,""2664/14920"",UNKNOWN,""Sat Sep 13 15:28:27.380002"",""IPCMISC.C299"
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/13/14 
3:28:27.380 PM  
6635103,"*ALL",JPD910,"2664/14920",UNKNOWN,"Sat Sep 13 15:28:27.380002","IPCMISC.C299
    State information for process 7764, User=6635103, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP07, Version=LIS0001, Thread ID=12632, Thread Name=WRK:Starting jdeCallObject."
host = student16-PC source = APPCB02.csv sourcetype = tes


9/11/14 
12:59:54.203 PM 
"   State information for process 10784, User=6107085, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP04, Version=LIS0001, Thread ID=13536, Thread Name=WRK:Starting jdeCallObject."""
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/11/14 
12:59:54.203 PM 
"6107085,""*ALL"",JPD910,""12760/14360"",UNKNOWN,""Thu Sep 11 12:59:54.203001"",""IPCMISC.C299"
host = student16-PC source = C:\Users\student16\Desktop\APPCB022.csv sourcetype = csv


9/11/14 
12:59:54.203 PM 
6107085,"*ALL",JPD910,"12760/14360",UNKNOWN,"Thu Sep 11 12:59:54.203001","IPCMISC.C299
    State information for process 10784, User=6107085, Role=*ALL, Environment=JPD910, Profile=NONE, Application=P5542319, Client Machine=WKCLSE1WEBCP04, Version=LIS0001, Thread ID=13536, Thread Name=WRK:Starting jdeCallObject."
host = student16-PC source = APPCB02.csv sourcetype = tes
0 Karma

somesoni2
Revered Legend

When Splunk monitors a file, or the files in a folder, it computes a CRC (cyclic redundancy check) for each file so that it will not re-index a file with the same data (even if the file gets renamed). The attribute that controls this CRC is called initCrcLength and is set in inputs.conf.

The default value for this property is 256 bytes (the first 256 bytes of the file). So, if you change anything in the file within the range set by initCrcLength, Splunk will treat it as a new file and re-index all the entries (not just the updated ones).
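
To make the check described above concrete, here is a rough Python sketch. It is only an illustration: zlib.crc32 stands in for whatever hash Splunk uses internally, and head_crc and the sample log lines are hypothetical names and data, not part of Splunk.

```python
import zlib

INIT_CRC_LENGTH = 256  # default initCrcLength mentioned in the post above


def head_crc(data: bytes, length: int = INIT_CRC_LENGTH) -> int:
    """CRC of the first `length` bytes (a stand-in for Splunk's internal check)."""
    return zlib.crc32(data[:length])


# Hypothetical log file: 10 events of 40 bytes each, 400 bytes in total.
original = b"".join(
    b"2014-09-15 10:52:%02d field=A user=%06d\n" % (i, i) for i in range(10)
)

# Manual edit of the first event, i.e. inside the first 256 bytes.
edited_head = original.replace(b"field=A", b"field=B", 1)

# A new event appended at the end, leaving the first 256 bytes untouched.
appended = original + b"2014-09-15 10:53:00 field=A user=000010\n"

# The head fingerprint changes after the edit, so the file looks new.
print(head_crc(edited_head) != head_crc(original))  # True

# The head fingerprint is unchanged after the append.
print(head_crc(appended) == head_crc(original))  # True
```

This would explain the x + x count in the question: editing a field near the top of a small file changes the fingerprint, so every event is indexed a second time, whereas appending leaves the fingerprint of the first 256 bytes as it was.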

0 Karma

boney_s
Explorer

Thank you guys. So I need to reduce initCrcLength. BTW, what is the minimum value for that parameter?

0 Karma

somesoni2
Revered Legend

Per the documentation, it should be in the range 256-1048576, so the minimum value is the default value. Is it possible for you to change the way you update the file? For example, if you just want to update one entry, could you create a file with just that entry?
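
As a rough sketch of that idea, assuming the updates are produced by a script: each changed entry is written to its own new file instead of being edited into the existing log. The function name write_update_file, the update_ file name prefix, and the sample entry are all hypothetical, and the temporary directory only stands in for whatever folder is being monitored.

```python
import os
import tempfile
import time


def write_update_file(directory: str, entry: str) -> str:
    """Write one changed entry to a new file instead of editing the existing log."""
    # Nanosecond timestamp in the name keeps each update file distinct (assumed naming scheme).
    name = "update_%d.log" % time.time_ns()
    path = os.path.join(directory, name)
    # Mode "x" fails rather than overwriting if the file already exists.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(entry.rstrip("\n") + "\n")
    return path


# Temporary directory used here in place of the real monitored folder.
with tempfile.TemporaryDirectory() as monitored_dir:
    new_path = write_update_file(monitored_dir, "2014-09-15 10:52:20 field=B")
    with open(new_path, encoding="utf-8") as handle:
        written = handle.read()
```

The existing log is never touched this way, so its first 256 bytes stay the same and only the new file's entry would need to be indexed.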

0 Karma

boney_s
Explorer

Thanks my friend.

0 Karma

MuS
Legend

Did you add a

crcSalt = <SOURCE>

in inputs.conf for this log?

0 Karma

boney_s
Explorer

No. I tried adding that setting to inputs.conf, but it is still not working.

0 Karma

MuS
Legend

Don't do this, because it can cause exactly this: re-indexing of files. Only include it if really needed.

0 Karma