Splunk Search

Transforms on multiple indexes of XML sources with nested, duplicate keys

humbertocastro
New Member

I would like to get a single report by combining data from 3 different data sources. However, I am running into a problem where in one of my sources there is a duplicate field name (pool_name) and I cannot figure out how to split it up. The following is my search

index=Masking_View OR
index=Device OR
index=Association_Info
| rename Masking_View.View_Info.stor_grpname as sg_name
| rename Masking_View.View_Info.Device.dev_name as dev_name
| rename Masking_View.View_Info.Totals.total_dev_cap_mb as total_dev_cap_mb
| rename Association_Info.sg_name as sg_name
| rename Association_Info.policy_name as policy_name
| rename Device.dev_name as dev_name
| rename Device.pool.pool_name as pool_name
| rename Device.pool.pool_alloc_percent as pool_alloc_percent
| rename Device.pool.alloc_tracks_mb as alloc_tracks_mb
| fields sg_name, dev_name, total_dev_cap_mb, policy_name, pool_name, pool_alloc_percent, alloc_tracks_mb
| transaction sg_name, dev_name
| search mail*
| stats avg(pool_alloc_percent) sum(alloc_tracks_mb) first(total_dev_cap_mb) by sg_name, policy_name, pool_name

My sources are XML logs that have entries with the following structure:

Masking_View.log

<Masking_View>
  <View_Info>
    <stor_grpname>mail_SG</stor_grpname>
    <Device>
      <dev_name>0B01</dev_name>
    </Device>
    <Totals>
      <total_dev_cap_mb>1024003</total_dev_cap_mb>
    </Totals>
  </View_Info>
</Masking_View>

Device.log

<Device>
    <dev_name>0B01</dev_name>
    <total_tracks_mb>1024003</total_tracks_mb>
    <written_tracks_mb>414798</written_tracks_mb>
    <total_tracks_gb>1000.0</total_tracks_gb>
    <written_tracks_gb>405.1</written_tracks_gb>
    <pool_written_percent>41</pool_written_percent>
    <pool>
        <pool_name>VP_FC_01</pool_name>
        <alloc_tracks>0</alloc_tracks>
        <alloc_tracks_mb>0</alloc_tracks_mb>
        <alloc_tracks_gb>0.0</alloc_tracks_gb>
        <pool_alloc_percent>0</pool_alloc_percent>
    </pool>
    <pool>
        <pool_name>VP_SATA_01</pool_name>
        <alloc_tracks>6636960</alloc_tracks>
        <alloc_tracks_mb>414810</alloc_tracks_mb>
        <alloc_tracks_gb>405.1</alloc_tracks_gb>
        <pool_alloc_percent>41</pool_alloc_percent>
    </pool>
</Device>

Association_Info.log

<Association_Info>
    <sg_name>mail_SG</sg_name>
    <policy_name>Tier2</policy_name>
</Association_Info>

My props.conf looks like:

[Masking_View]
TRUNCATE = 0
MAX_EVENTS = 1000000
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Masking_View>
MUST_BREAK_AFTER = /Masking_View>

[Device]
TRUNCATE = 0
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Device>
MUST_BREAK_AFTER = /Device>

[Association_Info]
TRUNCATE = 0
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Association_Info>
MUST_BREAK_AFTER = /Association_Info>

I would like to have a result showing something like:

sg_name   policy_name   pool_name   avg(pool_alloc_percent) sum(alloc_tracks_mb)    first(total_dev_cap_mb)
mail_SG   Tier2         VP_FC_01    0                       0                       0
mail_SG   Tier2         VP_SATA_01  20.500000               414810                  1024003

Instead I get this:

sg_name   policy_name   pool_name   avg(pool_alloc_percent) sum(alloc_tracks_mb)    first(total_dev_cap_mb)
mail_SG   Tier2         VP_FC_01    20.500000               414810                  1024003
mail_SG   Tier2         VP_SATA_01  20.500000               414810                  1024003

Do you have any idea how to get the various pool_name sums and averages to group properly?

0 Karma

dart
Splunk Employee
Splunk Employee

Try the eval function mvzip and then mvexpand to get one event per pool.

0 Karma

humbertocastro
New Member

I could not get the mvzip to work quite right but I did solve this a different way... probably not the best method, but it gets me the results I need. If anyone is interested I was able to create the 3 sets of fields by doing the following.

First create a field that is semicolon delimited then use a regex to get the fields I want. The comment system is not allowing me to post the code. So I guess that is left as an exercise for the reader.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...