All Apps and Add-ons

How to parse ShopperTrak XML payload obtained via the REST API Modular Input?

Yorokobi
SplunkTrust
SplunkTrust

Hey folks,

Any suggestions for how to parse the XML payload from ShopperTrak's REST endpoint obtained via the REST API Modular Input?

Link to full payload: https://gist.github.com/yorokobi/2b8f1004be5b6142eaf05a1670bb1b8c (I tried enabling soft wrapping, but it didn't take.)

Pretty (and smaller) sample (use the gist for your own testing):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sites>
    <site storeID="1319">
        <traffic exits="13" enters="25" startTime="201608251030"/>
        <traffic exits="15" enters="13" startTime="201608251045"/>
        <traffic exits="16" enters="11" startTime="201608251100"/>
    </site>
    <site storeID="1455">
        <traffic exits="14" enters="26" startTime="201608251030"/>
        <traffic exits="30" enters="21" startTime="201608251045"/>
        <traffic exits="19" enters="17" startTime="201608251100"/>
    </site>
    <site storeID="2161">
        <traffic exits="32" enters="29" startTime="201608251330"/>
        <traffic exits="33" enters="35" startTime="201608251345"/>
        <traffic code="01" exits="" enters="" startTime="201608251400"/>
    </site>
    <site storeID="0486">
        <traffic exits="16" enters="17" startTime="201608251130"/>
        <traffic exits="13" enters="17" startTime="201608251145"/>
        <traffic exits="30" enters="21" startTime="201608251200"/>
    </site>
    <site storeID="2503">
        <traffic exits="3" enters="2" startTime="201608251230"/>
        <traffic exits="1" enters="4" startTime="201608251245"/>
        <traffic exits="4" enters="0" startTime="201608251300"/>
    </site>
    <site storeID="2162">
        <traffic exits="14" enters="18" startTime="201608251330"/>
        <traffic exits="19" enters="22" startTime="201608251345"/>
        <traffic exits="18" enters="12" startTime="201608251400"/>
    </site>
    <site storeID="2561">
        <traffic exits="5" enters="5" startTime="201608251330"/>
        <traffic exits="8" enters="8" startTime="201608251345"/>
        <traffic exits="3" enters="3" startTime="201608251400"/>
    </site>
    <site storeID="1647">
        <traffic exits="16" enters="12" startTime="201608251230"/>
        <traffic exits="8" enters="9" startTime="201608251245"/>
        <traffic code="02" exits="" enters="" startTime="201608251300"/>
    </site>
    <site storeID="0436">
        <traffic exits="20" enters="11" startTime="201608251230"/>
        <traffic exits="21" enters="20" startTime="201608251245"/>
        <traffic exits="22" enters="23" startTime="201608251300"/>
    </site>
    <site storeID="1751">
        <traffic exits="16" enters="9" startTime="201608251330"/>
        <traffic exits="15" enters="5" startTime="201608251345"/>
        <traffic exits="7" enters="12" startTime="201608251400"/>
    </site>
    <site storeID="1438">
        <traffic exits="18" enters="13" startTime="201608251330"/>
        <traffic exits="14" enters="11" startTime="201608251345"/>
        <traffic exits="12" enters="10" startTime="201608251400"/>
    </site>
    <site storeID="2029">
        <traffic exits="15" enters="19" startTime="201608251030"/>
        <traffic exits="23" enters="15" startTime="201608251045"/>
        <traffic exits="16" enters="18" startTime="201608251100"/>
    </site>
    <site storeID="0471">
        <traffic exits="19" enters="16" startTime="201608251230"/>
        <traffic exits="25" enters="15" startTime="201608251245"/>
        <traffic code="01" exits="" enters="" startTime="201608251300"/>
    </site>
</xml>

I have played with the following props.conf entries with pretty much zero success (not necessarily all at the same time--I haven't kept a running record).

SHOULD_LINEMERGE=false
BREAK_ONLY_BEFORE=<site\s
KV_MODE=xml
LINE_BREAKER=(<\/site>)

My attempts at using |xpath or |rex max_match=0 have likewise yielded less than stellar results.

What I need is a method of correctly tying storeID with exits, enters, startTime, and code when it shows up.

Thanks in advance. 😄

0 Karma

s2_splunk
Splunk Employee
Splunk Employee

Try this:

[shooperTrak]
KV_MODE = xml
MAX_TIMESTAMP_LOOKAHEAD = 128
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y%m%d%H%M
TIME_PREFIX = \sstartTime=\"
disabled = false
BREAK_ONLY_BEFORE = <site storeID

This should give you one event for each "site" element. You can configure props/transforms to get rid of the unneeded lines (XML declaration and "sites" envelope) if you want to.

0 Karma

Yorokobi
SplunkTrust
SplunkTrust

That breaks events by storeID nicely but now I need a way to get enters/exits/code/startTime by storeID.

New event:

<site storeID="0310"><traffic exits="3" enters="7" startTime="201608250700"/><traffic exits="4" enters="6" startTime="201608250715"/><traffic exits="4" enters="4" startTime="201608250730"/><traffic exits="3" enters="6" startTime="201608250745"/>

I tried again with |xpath "//site/traffic/@enters/" outfield=enters but wasn't able to get an 'enters' field and | stats first(site.traffic.@enters) BY site.@storeID site.traffic.@startTime produces one result instead of four.

Ideally, the end result should look like:

startTime     |  storeID  |  enters  |  exits  |  code  |\
---------------------------------------------------------\
201608250700  |     0133  |       7  |      3  |        |\
201608250715  |     0133  |       6  |      4  |        |\
201608250730  |     0133  |       6  |      3  |        |\
0 Karma

s2_splunk
Splunk Employee
Splunk Employee

UPDATE: This should work for you:

yoursearch | stats list(site.traffic{@startTime}) as Time, list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits, list(site.traffic{@code}) as Code by site{@storeID} | rename "site{@storeID}" as storeID | mvexpand Time | mvexpand Enters | mvexpand Exits | table Time storeID Enters Exits Code

I didn't have much time to spend on this, but you will hopefully get the idea from this:

<yoursearch> | stats list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits by site{@storeID}

It needs to be refined to add your startTime and you can convert the multivalue field into individual table entries, if need be.

I hope this gets you started.

0 Karma

Yorokobi
SplunkTrust
SplunkTrust

I suspect what would work best is a custom response handler for the REST TA.

0 Karma

s2_splunk
Splunk Employee
Splunk Employee

That's an alternative, if what I provided doesn't work for you (it worked for me with your sample data).

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...