I need to calculate the total memory used by a process. There are multiple processes with the same root name and numeric suffixes, but the data sampling is not consistent: sometimes it comes in at 2 samples per minute, sometimes 4. Here is a sample:
09/07/2017 14:25:56.050 -0400 ,instance=server#1 ,Value=31827849216
09/07/2017 14:25:56.050 -0400 ,instance=server ,Value=30434951168
09/07/2017 14:25:11.065 -0400 ,instance=server#1 ,Value=31827849216
09/07/2017 14:25:11.065 -0400 ,instance=server ,Value=30434951168
09/07/2017 14:24:26.064 -0400 ,instance=server#1 ,Value=31827849216
09/07/2017 14:24:26.064 -0400 ,instance=server ,Value=30434922496
How do I sum this for server* by minute? I can't average, since that would show half the memory used, and I can't sum, since that would show double for the minutes with 4 samples.
After some tries I figured out a way to collect this for multiple processes and sync the times in the process:
host=Hostname source="Perfmon:Memory" counter="Available MBytes"
| eval FreeGB=Value/1024
| bin span=1m _time
| dedup _time
| fields _time, host, FreeGB
| join host
    [search ComputerName="Hostname" sourcetype=WinHostMon Type=OperatingSystem
    | eval TotalGB=TotalPhysicalMemoryKB/1048576
    | fields host, TotalGB]
| eval Name="Memory Used", MemUsed=TotalGB-FreeGB
| table _time, host, Name, TotalGB, MemUsed
| append
    [search host=Hostname AND sourcetype="Perfmon:Process" AND counter="Working Set - Private"
        AND (process_name="background*" OR process_name="vizqlserver*" OR process_name="dataserver*"
             OR process_name="tdeserver*" OR process_name="redis-server*")
    | eval Name=case(like(instance, "background%"), "Backgrounder",
                     like(instance, "vizqlserver%"), "VizQL",
                     like(instance, "dataserver%"), "Data server",
                     like(instance, "tdeserver%"), "TD engine",
                     like(instance, "redis-server%"), "Cache server"),
           Value=Value/1073741824
    | bin span=1m _time
    | dedup _time, instance
    | stats sum(Value) as MemUsed by _time, Name, host
    | table _time, host, Name, MemUsed]
After this creating a pretty chart was easy.
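Outside Splunk, the core of that search — bin each sample to its minute, keep only one sample per (minute, instance) as dedup does, then sum across instances — can be sketched in Python. The rows mirror the sample events above; this is an illustration of the logic, not the production query:

```python
from datetime import datetime

# Sample rows mirroring the Perfmon events above: (timestamp, instance, value in bytes)
rows = [
    ("09/07/2017 14:25:56", "server#1", 31827849216),
    ("09/07/2017 14:25:56", "server",   30434951168),
    ("09/07/2017 14:25:11", "server#1", 31827849216),
    ("09/07/2017 14:25:11", "server",   30434951168),
    ("09/07/2017 14:24:26", "server#1", 31827849216),
    ("09/07/2017 14:24:26", "server",   30434922496),
]

def mem_per_minute(rows):
    # bin span=1m _time: truncate each timestamp to its minute
    # dedup _time, instance: keep only the first sample per (minute, instance)
    seen = {}
    for ts, instance, value in rows:
        minute = datetime.strptime(ts, "%m/%d/%Y %H:%M:%S").replace(second=0)
        seen.setdefault((minute, instance), value)
    # stats sum(Value) by _time: add the per-instance samples within each minute
    totals = {}
    for (minute, _instance), value in seen.items():
        totals[minute] = totals.get(minute, 0) + value
    return totals

for minute, total in sorted(mem_per_minute(rows).items()):
    print(minute, total / 1024**3)  # convert bytes to GB
```

Whether a minute held 2 or 4 samples, each instance contributes exactly one value per minute, so the totals stay comparable.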
For this kind of thing, the per_minute function of timechart is perfect. Because timechart always operates on some known timespan, per_minute in your scenario is calculated by taking the sum of Value for that timespan and dividing it by the timespan's length in minutes.
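As a rough sketch of that arithmetic (not Splunk's implementation), per_minute over a bucket is just the bucket's sum divided by the bucket length in minutes:

```python
def per_minute(values, span_seconds):
    # per_minute(X) over a timechart bucket: sum of X in the bucket,
    # divided by the bucket's length expressed in minutes
    return sum(values) / (span_seconds / 60)

# A 5-minute bucket containing ten samples of 6 units each:
print(per_minute([6] * 10, span_seconds=300))  # 60 / 5 = 12.0 per minute
```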
Could you take the max for each minute, then sum that? That would give you one data point per minute.
For example:
YOUR BASE SEARCH ...
| timechart max(memory) as maxmem span=1m
| stats sum(maxmem) as totalmem
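The same max-then-sum logic, sketched outside Splunk in Python (the minute keys and memory numbers here are made up for illustration):

```python
from collections import defaultdict

def max_then_sum(samples):
    """samples: iterable of (minute_key, memory) pairs.
    Take the max per minute, then sum those per-minute maxima."""
    per_minute_max = defaultdict(int)
    for minute, mem in samples:
        per_minute_max[minute] = max(per_minute_max[minute], mem)
    return sum(per_minute_max.values())

# Two samples in minute "14:24", four in "14:25" -- each minute still
# contributes exactly one value to the final sum.
samples = [("14:24", 10), ("14:24", 12),
           ("14:25", 11), ("14:25", 11), ("14:25", 13), ("14:25", 9)]
print(max_then_sum(samples))  # max(10,12) + max(11,11,13,9) = 12 + 13 = 25
```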
Using stats or timechart, you can do avg by server and it will average the values for each server separately, without worrying about how many events are in each bucket. So if you have 20 events with the total memory used for server1 and 3 events for server2, it will still give you the proper avg for each of the two servers. Sum will do just that: add them all up, so the disparity in event counts could be the problem. And how would you know how much memory was actually in use if you don't know how many events there are for each server? Let Splunk do the work.
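To see why avg per server is insensitive to uneven sample counts while sum is not, here is a small sketch (the sample counts and memory figures are hypothetical):

```python
def avg(values):
    return sum(values) / len(values)

# server1 reports 20 samples, server2 only 3 -- the per-server averages
# remain directly comparable, while the raw sums are skewed by sample count.
server1 = [30.0] * 20
server2 = [58.0, 58.0, 58.0]
print(avg(server1), avg(server2))  # 30.0 58.0
print(sum(server1), sum(server2))  # 600.0 174.0 -- misleading as "memory used"
```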
So assuming you want to get the average memory used across several servers (with the memory used in a field called Value and the hostname in a field called instance), you could do something like:
... | timechart span=1m avg(Value) by instance
If you really want the sum of the memory used, then I'm not understanding what you would want to do with that data.
You could also plot min and max memory used in the same visualization.
I guess I am confused because I am trying to read this query via the ODBC driver. While it looks OK in Splunk, when I bring it into Tableau the only fields I see are _time and _span; the actual numbers are not coming through.
Is there any way I can group all of server, server#1, and server#2 into one? I don't need to see it by specific instance, just a total across all of them.
...| bin _time span=30s | timechart minspan=30s sum(Value)
I need to calculate total memory used by a process
How are you determining what process it is if the events don't include a process id or name?
host=blahblah sourcetype="Perfmon:Process" process_name="server*" counter="Working Set - Private"
Depending on what you want to split by, this might be a helpful start:
index=x sourcetype=y | bin _time span=30s | timechart minspan=30s sum(Value) by instance