Join two search queries - two indices.

splunk_arz
Explorer

Hi,
we are trying to join information from two indices.

INDEX_A contains the GC log files for a specific environment.
To get the used memory we use rex.

INDEX_B contains the ps output for all of our environments (collected every 3600 seconds).
We use the ps output to get the max and the native memory information.

We try to join these two pieces of information with this query:

index=INDEX_A 
| regex _raw="(?ms)<gc-end id=\"\d+\" type=\"global\"" 
| rex field=_raw "(?ms)<gc-end id=\"\d+\" type=\"global\".*?<mem type=\"tenure\" free=\"(?<MemFree>\d+)\" total=\"(?<MemTotal>\d+)\" percent=\"(?<MemPercent>\d+)\".*" 
| rex field=source "^/PATH/[^/]+/(?<ServiceId>[^/]+)/" 
| eval MemUsed = (MemTotal - MemFree)/1024/1024 
| eval Instance = host."_".ServiceId 
| join type=inner "Instance",host [ 
   search index=INDEX_B 
   | rex field=_raw "^(\S+\s+){10}(?<NativeMem>\d+).*/PATH/(?<envid>[^/]+)/(?<ServiceId>[^/]+)/java/jre/bin/java[^\n]+-Xmx(?<MaxMemory>\d+)"  
   | eval NativeMem = NativeMem/1024 
   | eval Instance = host."_".ServiceId 
   ] 
| table _time,Instance, MaxMemory, NativeMem

When we execute just the inner search, we get the correct numbers:

2018-01-05 02:48:38     INSTANCE1   4096    6538.875000 
2018-01-05 06:48:39     INSTANCE1   4096    5881.972656

But when we try to join both queries,
it seems that only the newest native-memory value gets used.

2018-01-05 07:16:54.372     INSTANCE1   4096    5881.972656
2018-01-05 07:04:40.463     INSTANCE1   4096    5881.972656
2018-01-05 06:43:46.421     INSTANCE1   4096    5881.972656
2018-01-05 06:43:45.889     INSTANCE1   4096    5881.972656 

Is there a problem with the event timestamps? The timestamps of the two indices do not correlate.
The number of events is also not the same.
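
One possible explanation, stated as an assumption and not a confirmed diagnosis: join defaults to max=1, so each main-search row is paired with only one subsearch row per key, and because the subsearch returns its results newest first, that row would be the most recent ps sample. The Python sketch below only models that behaviour; `join_max1` and the sample rows are made up for illustration and are not Splunk code.

```python
def join_max1(main_rows, sub_rows, key):
    """Model an inner join where each main row gets at most one partner (max=1)."""
    first_match = {}
    for row in sub_rows:  # assumption: sub_rows arrive newest first
        first_match.setdefault(row[key], row)  # only the first row per key is kept
    joined = []
    for row in main_rows:
        match = first_match.get(row[key])
        if match is not None:  # inner join: rows without a partner are dropped
            # Fields of the main row win, so the GC event keeps its own _time.
            joined.append({**match, **row})
    return joined


# Hypothetical sample data, shaped like the outputs in this post.
gc_events = [
    {"_time": "2018-01-05 07:16:54", "Instance": "INSTANCE1"},
    {"_time": "2018-01-05 06:43:46", "Instance": "INSTANCE1"},
]
ps_samples = [
    {"_time": "2018-01-05 06:48:39", "Instance": "INSTANCE1", "NativeMem": 5881.972656},
    {"_time": "2018-01-05 02:48:38", "Instance": "INSTANCE1", "NativeMem": 6538.875},
]

result = join_max1(gc_events, ps_samples, "Instance")
for row in result:
    print(row["_time"], row["Instance"], row["NativeMem"])
```

In this model every GC event ends up with the same NativeMem value, which matches the symptom above. If that is really the cause, the older ps samples are never considered, no matter how the timestamps relate.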

Any suggestions?
Any other solutions?

We are using Splunk 6.5.1.

Thanks for your help.

Best regards, Johann

1 Solution

splunk_arz
Explorer

Hi, this solution is a close approximation: we get no NativeMem values for those instances for which we find no GC cycle event (and therefore no MemUsed value).

index=Index_a 
| regex _raw="(?ms)<gc-end id=\"\d+\" type=\"global\"" 
| rex field=_raw "(?ms)<gc-end id=\"\d+\" type=\"global\".*?<mem type=\"tenure\" free=\"(?<MemFree>\d+)\" total=\"(?<MemTotal>\d+)\" percent=\"(?<MemPercent>\d+)\".*" 
| rex field=source "^/PATH/[^/]+/(?<ServiceId>[^/]+)/" 
| eval MemUsed = (MemTotal - MemFree)/1024/1024 
| eval Instance = host."_".ServiceId 
| bucket _time span=5m 
| join usetime=true type=inner Instance [ 
   search index=Index_b 
   | rex field=_raw "^(\S+\s+){10}(?<NativeMem>\d+).*/PATH/(?<envid>[^/]+)/(?<ServiceId>[^/]+)/java/jre/bin/java[^\n]+-Xmx(?<MaxMemory>\d+)"  
   | eval NativeMem = NativeMem/1024 
   | eval Instance = host."_".ServiceId
   | dedup Instance, NativeMem, MaxMemory
   | bucket _time span=5m 
   ] 
| stats max(*) as * by _time Instance
| timechart span=5m max(NativeMem) as NativeMem, max(MaxMemory) as MaxMemory,max(MemUsed) as MemUsed by Instance
| filldown


splunk_arz
Explorer

Hi!
I found a solution which seems to work for me:

index=INDEX_A sourcetype="was_gc" 
| regex _raw="(?ms)<gc-end id=\"\d+\" type=\"global\"" 
| rex field=_raw "(?ms)<gc-end id=\"\d+\" type=\"global\".*?<mem type=\"tenure\" free=\"(?<MemFree>\d+)\" total=\"(?<MemTotal>\d+)\" percent=\"(?<MemPercent>\d+)\".*" 
| rex field=source "^/PATH/(tom|ebkapps)/[^/]+/(?<ServiceId>[^/]+)/" 
| eval MemUsed = (MemTotal - MemFree)/1024/1024 
| eval Instance = host."_".ServiceId 
| bucket _time span=5m 
| append [ 
   search index=INDEX_B
   | rex field=_raw "^(\S+\s+){10}(?<NativeMem>\d+).*/PATH/(?<envid>[^/]+)/(?<ServiceId>[^/]+)/java/jre/bin/java[^\n]+-Xmx(?<MaxMemory>\d+)"  
   | eval NativeMem = NativeMem/1024 
   | eval Instance = host."_".ServiceId
   | dedup Instance, NativeMem, MaxMemory
   | bucket _time span=5m 
   ] 
| stats max(*) as * by _time Instance
| where isnotnull(MemUsed)
| timechart span=5m max(NativeMem) as NativeMem, max(MaxMemory) as MaxMemory,max(MemUsed) as MemUsed by Instance
| filldown

I use dedup 🙂
I use bucket with span=5m.
After that I apply stats max(*), so each row contains just one value per field.
Afterwards I remove the rows where MemUsed is NULL.
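
To make these four steps easier to review, here is a minimal Python sketch of the same idea (append, bucket to 5 minutes, take the max per bucket and Instance, drop rows without MemUsed). It is only a model under assumptions: the helper names and the sample rows are hypothetical, timestamps are plain epoch seconds, and dedup and timechart are left out.

```python
from collections import defaultdict

SPAN = 300  # seconds, mirrors span=5m


def bucket(epoch, span=SPAN):
    """Floor an epoch timestamp to the start of its bucket."""
    return epoch - epoch % span


def stats_max(rows):
    """Model `stats max(*) as * by _time Instance` on bucketed rows."""
    groups = defaultdict(dict)
    for row in rows:
        key = (bucket(row["_time"]), row["Instance"])
        for field, value in row.items():
            if field in ("_time", "Instance"):
                continue
            current = groups[key].get(field)
            groups[key][field] = value if current is None else max(current, value)
    return [
        {"_time": t, "Instance": inst, **fields}
        for (t, inst), fields in sorted(groups.items())
    ]


# Hypothetical rows: two GC events and two ps samples for one instance.
gc_rows = [
    {"_time": 1000, "Instance": "h1_svc", "MemUsed": 25.0},
    {"_time": 1100, "Instance": "h1_svc", "MemUsed": 30.0},
]
ps_rows = [
    {"_time": 950, "Instance": "h1_svc", "NativeMem": 3863.9, "MaxMemory": 512},
    {"_time": 4600, "Instance": "h1_svc", "NativeMem": 4000.0, "MaxMemory": 512},
]

combined = stats_max(gc_rows + ps_rows)            # append, then stats
with_gc = [r for r in combined if "MemUsed" in r]  # where isnotnull(MemUsed)
print(with_gc)
```

In this model the ps sample at 4600 lands in a bucket without any GC event, so the last filter drops it. Only buckets that contain both kinds of events survive.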

Maybe someone can review this solution.

Thanks for your help!!


richgalloway
SplunkTrust

@splunk_arz, if your problem is resolved, please accept an answer to help future readers.


splunk_arz
Explorer

Hi,
no, the problem is not solved. It seems that this solution is only an approximation. We only get those events where MemUsed, NativeMemory and MaxMemory have the same timestamp, since we use "where isnotnull(MemUsed)".
We do not evaluate the NativeMemory for those time slots that have no GC event.

When I use join, I get a pretty good approximation.

join type=inner Instance

kamlesh_vaghela
SplunkTrust

Hi @splunk_arz ,

Can you please try this?

index=INDEX_A 
| regex _raw="(?ms)<gc-end id=\"\d+\" type=\"global\"" 
| rex field=_raw "(?ms)<gc-end id=\"\d+\" type=\"global\".*?<mem type=\"tenure\" free=\"(?<MemFree>\d+)\" total=\"(?<MemTotal>\d+)\" percent=\"(?<MemPercent>\d+)\".*" 
| rex field=source "^/PATH/[^/]+/(?<ServiceId>[^/]+)/" 
| eval MemUsed = (MemTotal - MemFree)/1024/1024 
| eval Instance = host."_".ServiceId 
| append 
    [ search index=INDEX_B 
    | rex field=_raw "^(\S+\s+){10}(?<NativeMem>\d+).*/PATH/(?<envid>[^/]+)/(?<ServiceId>[^/]+)/java/jre/bin/java[^\n]+-Xmx(?<MaxMemory>\d+)" 
    | eval NativeMem = NativeMem/1024 
    | eval Instance = host."_".ServiceId 
    | table Instance MaxMemory, NativeMem ] 
| stats latest(_time) as _time,values(MaxMemory) as MaxMemory, values(NativeMem) as NativeMem by Instance

splunk_arz
Explorer

Hi!
Thanks for your advice, but this does not fix the problem.
When we use append, we also get results for instances we are not looking for, since the index collects information for all of our environments.

As I mentioned, the events in the two indices have different timestamps. Could this cause the problem when we try to join the search queries?

_time                    Instance    MaxMemory  NativeMem    MemUsed
2018-01-05 14:48:39.000  INSTANCE_1  512        3863.937500
2018-01-05 14:11:58.425  INSTANCE_1                          25.041870

When we use append, is there a way to compare the "inner" Instance name with the main Instance name?
Maybe this kind of "join" could work.
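
What I have in mind is roughly the following, sketched in Python because I am not sure how to express it in SPL (possibly something like eventstats count(MemUsed) by Instance followed by a where clause, but that is an untested assumption). The function name and the sample rows are hypothetical.

```python
def keep_known_instances(gc_rows, ps_rows):
    """Append ps rows to the GC rows, keeping only instances seen in the GC rows."""
    wanted = {row["Instance"] for row in gc_rows}
    return gc_rows + [row for row in ps_rows if row["Instance"] in wanted]


# Hypothetical rows: the ps index also holds instances from other environments.
gc_rows = [{"_time": 100, "Instance": "INSTANCE_1", "MemUsed": 25.04187}]
ps_rows = [
    {"_time": 200, "Instance": "INSTANCE_1", "NativeMem": 3863.9375, "MaxMemory": 512},
    {"_time": 200, "Instance": "OTHER_ENV_9", "NativeMem": 1024.0, "MaxMemory": 256},
]

rows = keep_known_instances(gc_rows, ps_rows)
print([row["Instance"] for row in rows])
```

This filters per instance and not per row, so the ps samples of a wanted instance would be kept even when they do not share a timestamp with a GC event.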

Thank you very much!


cmerriman
Super Champion

Instead of join, could you try append? When you run your search, are you getting any messages in the job inspector saying that the subsearch was truncated or timed out?

index=INDEX_A 
| regex _raw="(?ms)<gc-end id=\"\d+\" type=\"global\"" 
| rex field=_raw "(?ms)<gc-end id=\"\d+\" type=\"global\".*?<mem type=\"tenure\" free=\"(?<MemFree>\d+)\" total=\"(?<MemTotal>\d+)\" percent=\"(?<MemPercent>\d+)\".*" 
| rex field=source "^/PATH/[^/]+/(?<ServiceId>[^/]+)/" 
| eval MemUsed = (MemTotal - MemFree)/1024/1024 
| eval Instance = host."_".ServiceId 
| append 
    [ search index=INDEX_B 
    | rex field=_raw "^(\S+\s+){10}(?<NativeMem>\d+).*/PATH/(?<envid>[^/]+)/(?<ServiceId>[^/]+)/java/jre/bin/java[^\n]+-Xmx(?<MaxMemory>\d+)" 
    | eval NativeMem = NativeMem/1024 
    | eval Instance = host."_".ServiceId ] 
| table _time, host, Instance, MaxMemory, NativeMem, MemUsed
| stats values(*) as * by _time Instance host
| where isnotnull(MemUsed)
| fields - MemUsed