About Sukisen1981

Sukisen1981 · ‎12-04-2019

Have you checked this answer out? Issue with DB Connect? https://answers.splunk.com/answers/579910/splunk-700-and-dbconnect-311-not-working-new-insta.html

Sukisen1981 · ‎11-15-2019

Great work @jomulzer. I first used this approach with the cluster command, to pass the threshold value (t) dynamically based on a user selection, and it works. I was sure it would work for your use case as well.

Sukisen1981 · ‎11-08-2019

Assuming you ultimately want the panel in a dashboard, you can use tokens to pass the holdback value dynamically. Of course, the token has to be set based on some condition. Even if the underlying XML code shows up in red, go ahead and try using tokens; it should work.
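A minimal sketch of the token idea, assuming a dashboard input that sets a token named holdback_tok (a hypothetical name) and a search built on the predict command. The index, span and future_timespan here are placeholders, not values from the original thread:

```spl
index=_internal earliest=-24h
| timechart span=15min count
| predict count holdback=$holdback_tok$ future_timespan=10
```

Splunk substitutes $holdback_tok$ before the search runs, so this only works inside a dashboard where the token already has a value.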

Sukisen1981 · ‎11-07-2019

The point is that if you apply a straight timechart without the stats command, you get an output with time as the first column and the values of the HCS field as columns from column 2 onwards. That is not what you needed. The key here is the bin and stats commands: bin splits time into 1-day intervals, and stats counts the HCS values within each 1-day bucket. Remove all code after the stats and look at the output; you will get 3 columns: _time, HCS, and a third column 'count'. I then simply take the values of that count, values(count), and apply a timechart. My first code was just giving you 1 or 0 because I was using timechart count, which merely counted the rows (occurrences) of the stats output and not the values of the count as you wanted. I am sure that if you look at the output up to the stats command, after removing the rest of the code, you will understand. Please accept the answer if this resolves your issue 🙂
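The difference is easier to see on synthetic data. This is only a sketch: the HCS values (check_a, check_b) and the timestamps are made up with makeresults and do not come from the real index.

```spl
| makeresults count=12
| streamstats count as n
| eval _time = relative_time(now(), "@d") - (n % 3) * 86400
| eval HCS = if(n % 2 == 0, "check_a", "check_b")
| bin span=1d _time
| stats count by _time, HCS
| timechart span=1d values(count) by HCS
```

Stop after the stats line to see the three-column intermediate result. Swapping the last line for timechart span=1d count by HCS shows how that variant counts the rows of the stats output instead of carrying the count values through.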

Sukisen1981 · ‎11-07-2019

index=Linux HCS "NOT OK" | search host!="psas*" host!="pccc*" host!="pisefibl*" host!="psapsap*" host=p* | bin span=1d _time | stats count by _time,HCS | timechart span=1d values(count) by HCS where count > 1
I modified the timechart portion a bit to get the values of the count. Check first without timewrap and then with timewrap.

Sukisen1981 · ‎11-07-2019

I am sorry, perhaps I am not able to grasp this. You want a timechart of the values whose count is >0? If yes, you can just modify the timechart part of the query above to something like this: previous stuff... | timechart span=1d values(count) by HCS | ...rest of the stuff

Sukisen1981 · ‎11-07-2019

Sukisen1981 · ‎09-11-2019

Hmm, have you tried this: https://splunkbase.splunk.com/app/3339/#/overview ? It does say that it works with 6.x, but you can still give it a go.

Sukisen1981 · ‎09-10-2019

We are monitoring Docker container logs in Splunk through a forwarder. It looks like we are ingesting a lot of unnecessary data, and the log volumes are in serious danger of exceeding our daily license limit. I am looking for suggestions from forum members who have trimmed Docker container logs. There are 2 options here: truncate/trim the logs on the Docker side, or blacklist something on the Splunk side. For example, if you look at the message fields, the message does not show any useful information. Has anyone worked on something similar and can suggest a string / pattern we can blacklist, or a way to do some trimming at the Docker container level?

Sukisen1981 · ‎09-04-2019

Hi @Shashank_87, let us know what you find out. Although the data looks like an oldgen GC happened once every 15 mins, that cannot be true. At least, I have not come across situations where the old gen gets cleaned up so frequently. This might just be the other, normal CMS-based GC cleanup; remember, CMS is an algorithm that marks objects for cleanup concurrently with the application threads. If this is the case, then your use case changes and you might need to look at events only where, if you consider your gcduration as time in milliseconds, it doesn't look like much and could be the CMS running concurrently with the app.

Sukisen1981 · ‎09-04-2019

hi @Shashank_87 your code should work, please try it out and confirm

Sukisen1981 · ‎09-03-2019

Hi @Shashank_87, how your application logs GC depends on which tool you are using. Splunk is merely ingesting whatever you forward to it; it has no role to play and is not transforming the data in any way. I suggest you have a quick chat with your JVM guy / the GC tuning folks in your team 🙂 Why the gc duration is coming as null depends on what is printed out by your verbose GC application logs.
I can see that your new examples now contain 2 oldgen cleanup events from the CMS algo, for servers 2 and 4. The 15-minute interval that you are talking about does not make sense if seen in isolation. Assuming you have a 3GB/4GB old gen heap, I have never seen a stable JVM that encounters a cleanup on the oldgen every 15 minutes; that should happen once in a day or 2. But if you are now consuming GC events for ALL JVM processes (like Prod Server 1, Prod Server 2 ... Prod Server n), what you are seeing is the GC triggered by different applications. So you can use something like this to see the number of GC triggers in the last 24 hours:
index=xxx earliest=-24h | rex field=x "gcName:(?<GC>.*)" | rex field=x "jvmDescription:(?<Server>.*)" | stats count(GC) by Server
This will give you a field count(GC) split by the prod servers for the past 24 hrs.
Regarding the time, I don't know how your GC algo is logging the output. Wild guess: if I treat coltime as milliseconds AND colcount as the number of collections, then it makes some sense and the GC time comes to ~7 secs for your second example above. This is still too high in my opinion, and I am not sure about it. You want Splunk to calculate something and build some predictions, and that's OK, but Splunk cannot interpret what the source data values mean. So things like why gcduration is null, or what coltime means, are something you need to clarify with your JVM team, who most probably enabled the GC logging in the first place.
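To try the per-server count without touching real data, here is a sketch on generated events. The event layout (jvmDescription and gcName on one line, separated by a space) and the server names are assumptions, not the real log format, and the captures use \S+ instead of .* so that each one stops at the next space:

```spl
| makeresults count=6
| streamstats count as n
| eval x = "jvmDescription:ProdServer" . tostring(n % 2 + 1) . " gcName:ParNew"
| rex field=x "gcName:(?<GC>\S+)"
| rex field=x "jvmDescription:(?<Server>\S+)"
| stats count(GC) by Server
```

If the real events put several key:value pairs on one line, the greedy .* in the original rex will swallow everything up to the end of the line, so a tighter capture like this may be needed there as well.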

Sukisen1981 · ‎09-03-2019

Hi @Shashank_87, this is going to be a long post, but consider it. Firstly, a bit on your GC logs. I don't think there are 2 different types of GC algos here. The first one, parnew, is the eden generation cleanup and the second one is CMS on the oldgen. So basically, these logs are from one single application/process on a server. In CMS, eden generation cleanup IS a stop-the-world event. So for your questions, I suggest the following:
1- Frequency of GC: Divide it into 2, eden (young) gen cleanup and oldgen cleanup, so you use something like this: | rex field=_raw "gcName:(?<eden>.*)" Now, if you run the search for, say, the last 1 hour, | stats count(eden) gives you the frequency of eden cleanups for the last 1 hour. What time period you choose for your frequency calcs is best left to you. In the same way you can extract the frequency for the second set, the oldgen.
2- Time taken by GC: I can see 2 fields here, colTime & gCDuration. Which one do you want to use to determine the time taken for 1 cleanup cycle? Assuming you use coltime, your code will be something like this: | rex field=_raw "colTime:(?<time>.*)" BUT I have doubts about these values. GC times are generally in milliseconds, in which case the value of your oldgen cleanup time is too high! Your application will burst. So instead you might want to use the gcduration. This is specific to your JVM algo printout, and Splunk cannot tell you which field captures the GC duration or how your GC logs record it.
3- Other stuff: Assuming you come this far, there is a lot of analytics that you can run. For example, you can run a predict command based on the coltime. So if I use something like this: | rex field=_raw "gCEnd:(?<heap>.*)" I can append a timechart to it:
| rex field=_raw "gCEnd:(?<heap>.*)" | timechart span=15min avg(heap) as avg_heap | predict avg_heap AS predicted_heap algorithm=LLT upper90=high lower90=low future_timespan=10 holdback=10
This will predict the next 10 data points (at a 15-minute span, that covers the next 2.5 hrs). To read more on the usage of the predict command and to customize your algo, refer here: https://docs.splunk.com/Documentation/Splunk/7.3.1/SearchReference/Predict I would still recommend LLT, since this is a case which is not seasonal/cyclical but has a clear trend. You can then set a threshold of, say, 3-5GB as your total available heap, peer 2.5 hrs into the future and see if your application memory is about to get exhausted. Remember, each time a GC cleanup happens, the predict command will take the lower value of the heap into consideration. Let us know how it goes 🙂
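The young/old split from point 1 can be sketched on generated events. The collector names ParNew and ConcurrentMarkSweep and the one-field event layout are assumptions for illustration; the real gcName values in your logs may differ:

```spl
| makeresults count=5
| streamstats count as n
| eval _raw = if(n <= 3, "gcName:ParNew", "gcName:ConcurrentMarkSweep")
| rex field=_raw "gcName:(?<gc>\S+)"
| eval generation = if(gc == "ParNew", "young", "old")
| stats count by generation
```

With these five events the counts come out as 3 young and 2 old. On real data, the time range of the search decides the period the frequency refers to.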

Sukisen1981 · ‎08-30-2019

Hmm, this is just a guess, but what happens if you take the query and save it to a new dashboard? I think drilldown will be enabled there; at the moment, maybe you are on the MLTK create model page dashboard?

Sukisen1981 · ‎08-29-2019

Hi @omaromar123, I am assuming you have gone through this: https://docs.splunk.com/Documentation/Splunk/7.3.1/Data/MonitorWindowsnetworkinformation If you are on Windows, can you tell us what error you are facing?

Sukisen1981 · ‎08-28-2019

hi @stagare Please accept the answer if it significantly helped resolve your issue or let us know if there are any more issues

Sukisen1981 · ‎08-28-2019

hi @parveen77 Please accept the answer if it significantly helped resolve your issue or let us know if there are any more issues

Sukisen1981 · ‎08-28-2019

hi @harinivgr Please accept the answer if it significantly helped resolve your issue or let us know if there are any more issues

Sukisen1981 · ‎08-28-2019

Try net stop splunkd and net start splunkd from cmd?

Sukisen1981 · ‎08-27-2019

Uhh, I suspected it might be left over from some previous installation that you were trying. Can you go to etc/apps, delete the installed folder for the Python for Scientific Computing app, and then try re-installing through the web?

Sukisen1981 · ‎08-27-2019

Can you try a direct install from the web? Click on the apps settings gear button > browse more apps > search for Python for Scientific Computing, and choose the relevant one, for which the green Install button will be enabled.

Sukisen1981 · ‎08-27-2019

Hi @harinivgr, try this:
| rex max_match=0 "^(?<lines>.+)\n+" | table lines | mvexpand lines

Sukisen1981 · ‎08-26-2019

Hi @parveen77, the unit for the span specified with the timechart command must be seconds or higher. The predict command cannot accept subseconds as an input when it calculates the period. So if I do this:
index="_audit" | stats count(action) by _time | rename count(action) as count | predict count
I get the error code -1, but if I just put a span of 1 min and use the same code with the bin command added:
index="_audit" | bin span=1min _time | stats count(action) by _time | rename count(action) as count | predict count
I receive the results. Timechart is not mandatory, and you can run both of these snippets as they are, since I ran them on the delivered _audit index for the last 24 hrs. If you use timechart, you need to use span (min 1 sec); if you use something else, like stats or a table, you have to take care of the time buckets yourself using the bin command.

Sukisen1981 · ‎08-26-2019

| makeresults | eval x="2019-08-26 20:21:18 10.1.82.42 GET /aaaa/bbbb/ccc/ddddd/eeeee username=test&branch=KEL&account=123456789 443 ABCD\HTTP/secure.abc.jss.pre 11.12.13.14 Java1.7.0_191 - 200 0 0 65018" | rex field=x ".*\s+(?<lastfld>.*)"
Replace 'x' by _raw:
| rex field=_raw ".*\s+(?<lastfld>.*)"

Sukisen1981 · ‎08-26-2019

Thanks @diogofgm, accepted your answer. Sorry if my question appeared too silly / stupid.

Posts	980
Solutions	80
Karma Given	27
Karma Received	140
Member Since	‎08-02-2016

Online Status	Offline
Date Last Visited	‎08-19-2022 08:07 AM
