Activity Feed
- Got Karma for Re: How to create timechart to show server log ingestion stats including servers with 0 ingestion?. 01-14-2025 10:22 AM
- Got Karma for Re: Insert a heading between rows and customise it.?. 01-04-2025 12:35 AM
- Got Karma for Re: Is there a way to have a "sliding window" time span in a non real time search?. 12-31-2024 08:27 AM
- Got Karma for Re: How to find out the count of zero for the multiple values of a field?. 12-11-2024 03:07 AM
- Got Karma for Re: How to do an outer join on two tables with two fields?. 07-02-2024 01:43 PM
- Got Karma for Re: How to use lookup file to set earliest and latest time. 07-01-2024 01:25 AM
- Got Karma for Re: Search lag error. 06-26-2024 10:46 AM
- Got Karma for Re: Search Query Within Link is Showing "Unexpected close tag" Error. 06-13-2024 10:12 AM
- Got Karma for Re: Is there a way to show local time of the device of that area?. 05-10-2024 10:55 AM
- Got Karma for Re: Can I perform stats count on a substring using regex?. 05-03-2024 10:52 AM
- Got Karma for Re: How to compare when a field value changes from current to previous?. 03-22-2024 02:12 AM
- Got Karma for Re: How to join two tables where the key is named differently and must be cleaned up first?. 11-07-2023 08:15 AM
- Got Karma for Re: Calculate percentage total over time using timewrap. 10-25-2023 11:27 AM
- Got Karma for Re: How to extract string from a position until the end of line?. 10-17-2023 07:22 AM
- Got Karma for Re: match 2 fields with same value. 10-11-2023 09:09 AM
- Got Karma for Re: How to get multiple events into single row or table format with timestamp of login logout of session. 09-20-2023 07:56 AM
- Got Karma for Re: How do I create a multivalue field with an eval function?. 09-07-2023 06:37 AM
- Got Karma for Re: How can I select the index to search dynamically?. 08-15-2023 08:19 PM
- Got Karma for Re: How to generate a search that will use values from my first search and find matching values on another index in a second search?. 08-15-2023 07:59 PM
- Got Karma for Re: Help using appendcols or join. 08-11-2023 01:04 AM
08-24-2020
01:11 PM
Moved question to Splunk Administration, "Getting Data In", where you will get better answers than in "Search".
08-24-2020
01:05 PM
1 Karma
Okay, we've torn your code apart and reconstructed it, and in the process realized that it probably wastes a lot of machine time relative to what you end up with.
In plain English, you are trying to get a count of Agent records that match a bunch of specific filters. Those filters come from either a JSON payload in the CEM-Applog records or from SN_CALL_FLAGS in the other records. It seems like you should be able to get this in a much more concise way, but we'd need to know more about the data in order to code that.
Here's a shot-in-the-dark rewrite to use the Splunk soup method.
( index=ivr_app sourcetype="CEM-AppLog" rosterInfo )
OR
( index=ivr_app
"pipeline at completion"
CALL_FLOW
DNIS
EXCHANGE
NOT NPS
NOT TFRDEST
NOT TFRNUM
NOT "SN_CONTACT_TYPE=Transfer"
NOT "SN_TARGET_TYPE=Release"
"SN_CONTACT_REASON="
(SN_CALL_FLAGS="foo1" OR NOT SN_CALL_FLAGS="foo2")
)
| rename COMMENT as "This section processes the CEM-Applog records"
| rex "^(?:[^{]*){7}(?P<my_data>.+)"
| spath input=my_data output=vq path=TOD
| spath input=my_data output=steps path=steps{}
| spath input=my_data output=type path=type
| spath input=my_data output=virtualQueue path=virtualQueue
| spath input=my_data output=last_step path=steps{}
| eval res = mvindex(last_step,mvcount(last_step)-1)
| spath input=res output=name path=name
| spath input=res output=type path=type
| rex field=_raw "SN_CONTEXT_ID (?P<SN_CONTEXT_ID1>[^\s]+) produced"
| dedup SN_CONTEXT_ID1 keepempty=true
| rename COMMENT as "This section processes the 'pipeline at completion' records"
| dedup SN_CONTEXT_ID CONNID keepempty=true
| eval SN_CONTEXT_ID=coalesce(SN_CONTEXT_ID,SN_CONTEXT_ID1)
| fields - SN_CONTEXT_ID1
| stats values(*) as * by SN_CONTEXT_ID
| foreach SN_CALL_FLAGS
[ eval <<FIELD>> = if(isnull(<<FIELD>>) OR len(<<FIELD>>)==0, "NO_CALL_FLAG", <<FIELD>>) ]
| search CLI="foo4" AND CONNID="foo5" AND SN_CALL_FLAGS="foo6" AND DNIS="foo7"
Read the above very carefully and skeptically, and then try it to see how it performs relative to your other code.
Further Notes -
Replacing code with asterisks can really mess up the advice you get. Tradition is to use "foo", "bar", and "baz" to represent different values, or "foo1", "foo2", "foo3", etc.
Even with actual code, the following construct does not seem to be meaningful:
AND SN_CALL_FLAGS="*" OR NOT SN_CALL_FLAGS="*"
If the two asterisks represent the same thing, then it's equivalent to isnotnull(SN_CALL_FLAGS). If they represent different things, then the first clause is redundant, because whenever the first clause is true, the second is always true.
06-30-2020
10:46 AM
Much of what you are doing does not make much sense. The initial dedup will give you exactly one record for each host and package, so all the following logic that presumes there is anything to count, or any multivalue fields in the records, is not needed. The latest() aggregate function for stats will pull the latest value for you, so you don't even have to dedup. So, this gets you the equivalent output of what you wrote:
tag=Windows_Update package=* host=*
| fields host package eventtype
| stats latest(_time) as _time latest(eventtype) as eventtype by host package
| eval status=case(eventtype=="Update_Successful", "Successful at ("._time.")",
eventtype=="Update_Failed", "Failed at ("._time.")",
true(),"NA")
| table host package status
Now, if you wanted a history, including the last value, then you could do something like this:
tag=Windows_Update package=* host=*
(eventtype="Update_Successful" OR eventtype="Update_Failed")
| fields host package eventtype
| eval status=case(eventtype=="Update_Successful", "Successful at ("._time.")",
eventtype=="Update_Failed", "Failed at ("._time.")")
| sort 0 host package - _time
| stats latest(status) as Last_Status list(status) as Status_History by host package
06-26-2020
02:41 PM
Subsearches are limited to 50K records, so this append isn't working for you:
| append [| inputlookup ram_error.csv | where not ID IN ($mytoken$) ]
There is a slightly different format you can use...
| eval rectype="newrec"
| inputlookup ram_error.csv append=true
| where (rectype="newrec") OR ( not ID IN ($mytoken$) )
| fields - rectype
06-26-2020
02:37 PM
Simple to understand and simple to support are good decisions. Well done.
06-26-2020
02:36 PM
Try this:
index=pan app=*
| stats count values(app) as app by src
| where NOT (app="ssl")
06-26-2020
02:33 PM
Okay, look at what happens when you do these commands:
| makeresults
| eval myfield1=mvappend("a","b","c")
| eval myfield2=mvjoin(myfield1,"!!!!")
| eval myfield3=split(myfield2,"!!!!")
and then this command:
| mvexpand myfield3
06-26-2020
02:26 PM
You're quite welcome. We love to help.
06-23-2020
12:24 PM
kmeans has an option for setting a range of k values to attempt:
| kmeans k=3-12
Just feed it different data a few times and see what it does for you. Here's the reference: https://docs.splunk.com/Documentation/Splunk/8.0.4/SearchReference/Kmeans
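For instance, here's a self-contained toy run you could paste into a search bar to see the k-range option in action; the x and y fields are invented purely for illustration and just give the command something numeric to cluster:
| makeresults count=500
| eval x=random()%100, y=random()%100
| kmeans k=3-12 x y
Swap the makeresults/eval lines for your real data and field names once you've seen how it behaves.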
06-23-2020
12:19 PM
If the searches are otherwise identical, then you can just use the multiselect value with the |n token filter to avoid formatting it, and set the value to a single space when you don't want it to contain anything of significance. Your search would look more or less like this:
"search with a token inside" $multiselect_value2|n$
And the checkbox like this:
<input type="checkbox" token="test_checkbox">
<choice value="true">Enable</choice>
<change>
<condition match="&quot;$test_checkbox$&quot; == &quot;true&quot;">
<set token="multiselect_value2">$multiselect_value$</set>
</condition>
<condition>
<set token="multiselect_value2"></set>
</condition>
</change>
</input>
That could be made slightly simpler, but that would work.
06-23-2020
12:07 PM
I'm skeptical that predict would be the right way to do that. It seems like the right thing to do would be, each night off peak, to calculate the next day's boundaries once for each 5, 10 or 15 minute increment, and output those times and limits to a lookup table. Then, you'd just have to calculate the current errors and read the lookup table to get the limits for whatever _time and site you are running and test the compliance.
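A very rough sketch of that two-part approach, with every name here (index, fields, lookup file, thresholds) invented for illustration. The nightly scheduled search might look something like
index=my_errors earliest=-30d@d latest=@d
| bin _time span=15m
| stats count as errors by _time site
| eval slot=strftime(_time, "%H:%M")
| stats avg(errors) as avg_errors stdev(errors) as sd_errors by site slot
| eval upper_limit = avg_errors + 3*sd_errors
| outputlookup error_limits.csv
and the frequently-run compliance check would then just be something like
index=my_errors earliest=-15m@m latest=@m
| stats count as errors by site
| eval slot=strftime(now() - (now() % 900), "%H:%M")
| lookup error_limits.csv site, slot OUTPUT upper_limit
| where errors > upper_limit
Treat both as air code; the point is only that the heavy statistics run once a night off peak, and the runtime search is a cheap lookup read.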
06-23-2020
12:00 PM
1 Karma
There are several ways. If it will be a consistent number of hours offset for each query (for example, 5 hours), then just subtract the number of seconds involved:
| eval _time = _time - 5*3600
You could also use a method such as the one in this answer: https://community.splunk.com/t5/Getting-Data-In/need-help-in-time-conversion/m-p/471116
Or you could look at my answer and koshyk's answers on this one and see if they can be adapted: https://community.splunk.com/t5/Getting-Data-In/Timezone-and-Timestamp-modification-at-search-report-time/td-p/16659/page/3
Or here: https://community.splunk.com/t5/Splunk-Search/Is-there-a-way-to-show-local-time-of-the-device-of-that-area/m-p/348658
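If the fixed-offset approach is enough and you only need a human-readable local time for display (rather than shifting _time itself), a minimal sketch would be something like this, where local_time is just a made-up field name:
| eval local_time = strftime(_time - 5*3600, "%Y-%m-%d %H:%M:%S")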
06-23-2020
11:43 AM
A summary index can contain literally any number of columns. Just output the record with one column for each item you want to report. So, if an event had values for functions a, c, r and t, and the Overall function was 1, then it might look like:
(time) total_function=23, overall=1, a=12, c=7, r=0, t=15
or, if I misunderstood your meaning, maybe it might be:
(time) total_function=23 overall="1;3" detail="a;c;r;t"
or:
(time) total_function=23 overall="1;3" detail="a=12;c=7;r=0;t=15"
The next record does not have to have all the same fields.
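As a hedged sketch of how records like the first form above might get written (the index name, the function field, and the function values are all invented for this example), the populating search could be something like
index=my_app_logs earliest=-1h@h latest=@h
| bin _time span=1h
| stats count as total_function count(eval(function="a")) as a count(eval(function="c")) as c count(eval(function="r")) as r count(eval(function="t")) as t values(overall) as overall by _time
| collect index=my_summary
Whatever columns the stats line produces are exactly the columns that land in the summary index.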
06-23-2020
10:57 AM
Try this:
index=My_log
| stats count by Account_ID Cell_Number
| lookup accountId.csv Account_ID output Account_ID as foundme
| where Account_ID = foundme
| table Account_ID Cell_Number
Notes:
1) You have to keep all the fields you need in the stats command somehow, or they will not exist afterwards.
2) When you output the lookup results, you need to give it a new name or you won't know whether it was found or not.
06-23-2020
10:53 AM
Here's one way:
index=pan app=*
| stats count by src app
| where app!="ssl"
Here's another:
index=pan app!="ssl"
| stats count by src
06-19-2020
01:15 PM
I see this as a nontrivial version of Splunk soup. I'd proceed like this...
index=A OR index=REL OR index=B OR index=C
| fields index parent child name status id
| rename COMMENT as "double the REL records, leaving the others single"
| eval myfan=mvrange(0,if(index="REL",2,1))
| mvexpand myfan
| rename COMMENT as "set up match keys and data fields"
| eval A_id=case(index="A",id, index="REL" AND myfan=0,parent)
| eval B_id=case(index="B",id, index="REL" AND myfan=0,child, index="REL" AND myfan=1,parent)
| eval C_id=case(index="C",id, index="REL" AND myfan=1,child)
| eval A_name=case(index="A",name)
| eval A_status=case(index="A",status)
| eval B_name_status=case(index="B",name."!!!!".status)
| eval C_name_status=case(index="C",name."!!!!".status)
At this point records look like this:
index=A id name status A_id A_name A_status
index=B id name status B_id B_name_status
index=C id name status C_id C_name_status
index=REL myfan=0 parent child A_id B_id
index=REL myfan=1 parent child B_id C_id
Then I'd proceed like this...
| rename COMMENT as "reduce to required fields with one of these two"
| fields - id name status parent child
| fields index myfan A_id A_name A_status B_id B_name_status C_id C_name_status
| rename COMMENT as "roll data from REL myfan=0 to A, then myfan=1 to A, then drop REL"
| eventstats values(eval(case(myfan=0,B_id))) as B_id by A_id
| eventstats values(eval(case(myfan=1,C_id))) as C_id by B_id
| where index!="REL"
| rename COMMENT as "now we have only A, B, C records, and the A records have all relevant keys."
| rename COMMENT as "Roll B record to A then drop B"
| eventstats values(B_name_status) as B_name_status by B_id
| where index!="B"
| rename COMMENT as "Roll C record to A then drop C"
| eventstats values(C_name_status) as C_name_status by C_id
| where index!="C"
| rename COMMENT as "Above could be a stats"
| rename COMMENT as "Add placeholders to handle potential NULLS"
| eval B_name_status=coalesce(B_name_status,"N/A!!!!N/A")
| eval C_name_status=coalesce(C_name_status,"N/A!!!!N/A")
| rename COMMENT as "split up the records, then the fields"
| mvexpand C_name_status
| mvexpand B_name_status
| eval B_name=mvindex(split(B_name_status,"!!!!"),0)
| eval B_status=mvindex(split(B_name_status,"!!!!"),1)
| eval C_name=mvindex(split(C_name_status,"!!!!"),0)
| eval C_status=mvindex(split(C_name_status,"!!!!"),1)
| rename COMMENT as "drop unneeded fields"
| table A_name A_status B_name B_status C_name C_status
That's all air code, so you'd have to shake it down with a small subset of the records before running the whole data set.
06-18-2020
03:35 PM
1 Karma
That SPL should not limit the lookup to 10K, and we have lookups that are in the millions. Try this:
| inputlookup ram_error.csv
| stats count
If that number is over 10K, then the problem is not with the part you've told us, but with the search you're using to bring the data back. Show us that, and we can help you fix it.
Okay, you need to figure out which records are dropped. Here's how I would work this:
1) Run each job for the exact same time range.
2) Use | loadjob to load the output data from each job and cut it down to a few key fields.
3) Use diff to compare the two files and find which records were added/deleted (see the sketch below).
4) Cut the file down to those records, look at just those records, and see what's going on.
It may be that they are duplicate records that are being dropped somehow, or it may be that something in the search that you haven't shown us has a reason to act differently.
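One in-Splunk way to do steps 2 and 3 is the set command; the two sids and the field list below are placeholders, not real values, so substitute your actual job ids and key fields:
| set diff
    [| loadjob <sid_of_first_job> | table host source my_key_field]
    [| loadjob <sid_of_second_job> | table host source my_key_field]
set diff returns the rows that appear in one result set but not the other, which is exactly the added/deleted list you want to inspect in step 4.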
06-18-2020
03:17 PM
2 Karma
Switch the view to look at the details (_raw). That field has spaces on either side of the word " all ". How is it defined in the JSON? It's probably an error in the extraction routine, possibly caused by an error in the JSON itself.
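If you just need the comparison to work in the meantime, trimming at search time is a quick band-aid; my_field here is a stand-in for whatever the field is actually called:
| eval my_field=trim(my_field)
The real fix is still to sort out why the extraction (or the JSON itself) is producing the padded value.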
06-16-2020
10:24 PM
To be clear -
1) Searches work from the farthest-inside braces and work out, with the exception of the braces that follow an appendpipe (we'll come back to that).
2) The subsearch in braces will return some set of information, and what set of information is defined by the verb that precedes it, if any.
3) If the braces are before the first pipe in the outside search, then the braces will return data as if the inside search had been run and then piped to the format command. Thus, if the end of the search in braces looked like this...
[ search index=foo ... more search terms ... | table host ]
then it returns as if the search had been run like this
index=foo ... more search terms ... | table host | format
and the output field called search contains something like this
( host="value1" ) OR ( host="value2" ) OR .... ( host="value99" )
Before the first pipe, the value of the field search is what will be dropped into the outer search.

In your case, the inner search is just taking the data out of the lookup table to create the above stuff. In the next step of your code, it runs that against the index as a subsearch. This is the place where @woodcock 's alarms went off. There are limits on how much data a subsearch will return, so you may get results that are wonky, depending on how long you run the search over.

But let's assume it works. Your data comes back from that subsearch, and then it gets run through on the right side of a join. Now, joins in Splunk are always, at their most basic level, left joins. Every record on the left side gets matched to the first matching record on the right side. If you have NOT told it to match every record on the right, then only the first gets matched. Then, if you've told it you only want an inner join, it will throw away unmatched records from the left side. (These are some of the reasons we suggest people avoid join when they can.) Finally, what you are matching to on the far left is the same lookup table records you had at the far inside.

On the other hand, if you use woodcock's version, then it works this way: all the records from any host are read. For each record, the lookup table is accessed in memory for a match. If a match is found, then the host value from the lookup table is copied to a new field. Next, if the new field is not found, the record is dropped as not being wanted.

Now, if it was possible that there would be no records for any particular host, and you wanted to make sure they were all represented, then you could add an append at the end like this
| inputlookup bsl_project_host.csv append=true

And, one final thing: after appendpipe, instead of the braces being processed first, they are processed when the data reaches that point in the search. The entire set of records up until that point is put through the logic inside the braces, and then the output of that (subject to subsearch limitations) is added onto the end of the current set of records.
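For reference, a bare-bones sketch of the lookup-match pattern described above; the index and sourcetype are placeholders, and it assumes the lookup's column is named host:
index=your_index sourcetype=your_sourcetype
| lookup bsl_project_host.csv host OUTPUT host as lookup_host
| where isnotnull(lookup_host)
| fields - lookup_host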
06-16-2020
10:07 PM
| addtotals fieldname="Total Count"
| addcoltotals labelfield=date_mday label="All Days"
The addtotals command will add up the totals horizontally, and addcoltotals will add them vertically. I've updated the code above to include these.
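If you want to see the two commands in isolation, here is a self-contained toy example you can paste as-is; the apples and oranges fields are invented just for the demo:
| makeresults count=3
| streamstats count as date_mday
| eval apples=date_mday*2, oranges=date_mday*3
| table date_mday apples oranges
| addtotals fieldname="Total Count" apples oranges
| addcoltotals labelfield=date_mday label="All Days"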
06-16-2020
07:27 PM
2 Karma
That's a very good start, @bowesmana. Two additions...
1) As a practice, we always include a fields command in the pseudocode to limit the junk and speed the search. If beginners learn that strategy early on, it will save them centuries of machine time. When doing values(*) as *, it's especially important.
2) streamstats is finicky with time_window, so if we're doing anything complicated, then we usually include a sort 0 right before the streamstats to explicitly validate the event order.
base search
| fields _time user ... the exact fields that you want to know about ...
| sort 0 _time user
| streamstats time_window=101ms values(*) as * by user
I gave it 1 extra ms, since I can never remember whether streamstats is inclusive or exclusive, and with ms it might matter.
06-16-2020
07:13 PM
1 Karma
@gcusello You probably also want to give him the code for the "30 second" part of his request. Here's one cut at that:
index=wineventlog (host=PC1 EventCode=4648) OR (host=PC2 EventCode=4624)
| fields _time EventCode host
| sort 0 _time EventCode host
| streamstats time_window=31s values(EventCode) AS eventcodes dc(EventCode) AS bothpresent list(host) as hosts
| where bothpresent=2
Or, if you needed the _raw from the two events...
index=wineventlog (host=PC1 EventCode=4648) OR (host=PC2 EventCode=4624)
| fields _time EventCode host
| sort 0 _time EventCode host
| streamstats time_window=31s values(EventCode) AS eventcodes dc(EventCode) AS bothpresent list(host) as hosts values(_raw) as Raw
| where bothpresent=2
Also, @dsdeepak , you might want to correlate some other field, such as the user field on both machines, into the search. If your hypothetical attacker would have the same Windows user field on both machines, then you need a little more code (see the sketch below). All this search will do is detect when there is ANY 4648 on one machine and 4624 on the other, not necessarily connected to each other.
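A rough sketch of that correlation, assuming the account ends up in a field literally called user (in Windows event logs it may instead be something like Account_Name, so adjust the field name accordingly):
index=wineventlog (host=PC1 EventCode=4648) OR (host=PC2 EventCode=4624)
| fields _time EventCode host user
| sort 0 _time
| streamstats time_window=31s values(EventCode) AS eventcodes dc(EventCode) AS bothpresent list(host) as hosts by user
| where bothpresent=2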
06-16-2020
07:06 PM
Try something like this:
your search that gets individual ratings in this form
| fields _time host "% _total Prcessor times"
| bin _time span=1m
| stats min("% _total Prcessor times") as min% by _time host
| eval MyWarning = case('min%' > 0.8, 1)
| streamstats time_window=301s sum(MyWarning) as Warning5m by host
| where Warning5m > 4
Modify the MyWarning eval for whatever format your data actually returns in. 80 or 0.80 or whatever.
06-16-2020
06:45 PM
1 Karma
There are a number of things I'd check. Python is finicky about indentation, so if none of the following fix it, I'd probably write a python script with a cut-back version of the SPL to check what the values are immediately before the stats command. Here are some ideas -
1) Run the stats together on one line.
2) Put quotes around GET.
3) Use tonumber to force RequestTime to be a number, in case for some reason it is being evaluated as a string.
4) Make sure that all lines are at the same indentation.
5) Add commas between each clause in the stats line.
search index=cdvr host=* AND source="/var/log/nginx/access.log" AND sourcetype="gemini-ecdn-nginx-access"
| rex field=_raw ".*?\t.*?\t.*?\t.*?\t(?<Method>\w+)\s/(?<URI>.+?)\sHTTP.+?\t.*?\t(?<Status>.+?)\t.*?\t.*?\t.*?\t.*?\s.*?\t.*?\t(?<host_header>.+?)\t"
| rex field=URI "(?<RecordingID>.*)\.(?<resource>.*)?\?.*"
| dedup RecordingID
| search Method="GET" resource="m3u8"
| stats count(eval(tonumber(RequestTime)<2.00)) as PlaybackNumSuccessful count(eval(RecordingID)) as PlaybackNumTotal
| eval PlaybackNumFailed=(PlaybackNumTotal-PlaybackNumSuccessful)
| eval SuccessPer = (PlaybackNumSuccessful/PlaybackNumTotal)*100
| eval PlaybackLatencyLessThan2SecSuccessRate=round(SuccessPer, 3)."%"
| fields PlaybackNumTotal PlaybackNumFailed PlaybackLatencyLessThan2SecSuccessRate
There's one further thing to try...
| stats sum(eval(case(tonumber(RequestTime)<2.00,1, true(),0))) as PlaybackNumSuccessful, sum(eval(case(tonumber(RequestTime)>=2.00,1,true(),0))) as PlaybackNumLong, sum(eval(case(isnull(RequestTime),1, true(),0))) as PlaybackNumNull, count(eval(RecordingID)) as PlaybackNumTotal
That will give you information about whether the RequestTime field is being interpreted incorrectly or not recognized at all.