Hey folks,
Any suggestions for how to parse the XML payload from ShopperTrak's REST endpoint obtained via the REST API Modular Input?
Link to full payload: https://gist.github.com/yorokobi/2b8f1004be5b6142eaf05a1670bb1b8c (I tried enabling soft wrapping, but it didn't take.)
Pretty (and smaller) sample (use the gist for your own testing):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sites>
<site storeID="1319">
<traffic exits="13" enters="25" startTime="201608251030"/>
<traffic exits="15" enters="13" startTime="201608251045"/>
<traffic exits="16" enters="11" startTime="201608251100"/>
</site>
<site storeID="1455">
<traffic exits="14" enters="26" startTime="201608251030"/>
<traffic exits="30" enters="21" startTime="201608251045"/>
<traffic exits="19" enters="17" startTime="201608251100"/>
</site>
<site storeID="2161">
<traffic exits="32" enters="29" startTime="201608251330"/>
<traffic exits="33" enters="35" startTime="201608251345"/>
<traffic code="01" exits="" enters="" startTime="201608251400"/>
</site>
<site storeID="0486">
<traffic exits="16" enters="17" startTime="201608251130"/>
<traffic exits="13" enters="17" startTime="201608251145"/>
<traffic exits="30" enters="21" startTime="201608251200"/>
</site>
<site storeID="2503">
<traffic exits="3" enters="2" startTime="201608251230"/>
<traffic exits="1" enters="4" startTime="201608251245"/>
<traffic exits="4" enters="0" startTime="201608251300"/>
</site>
<site storeID="2162">
<traffic exits="14" enters="18" startTime="201608251330"/>
<traffic exits="19" enters="22" startTime="201608251345"/>
<traffic exits="18" enters="12" startTime="201608251400"/>
</site>
<site storeID="2561">
<traffic exits="5" enters="5" startTime="201608251330"/>
<traffic exits="8" enters="8" startTime="201608251345"/>
<traffic exits="3" enters="3" startTime="201608251400"/>
</site>
<site storeID="1647">
<traffic exits="16" enters="12" startTime="201608251230"/>
<traffic exits="8" enters="9" startTime="201608251245"/>
<traffic code="02" exits="" enters="" startTime="201608251300"/>
</site>
<site storeID="0436">
<traffic exits="20" enters="11" startTime="201608251230"/>
<traffic exits="21" enters="20" startTime="201608251245"/>
<traffic exits="22" enters="23" startTime="201608251300"/>
</site>
<site storeID="1751">
<traffic exits="16" enters="9" startTime="201608251330"/>
<traffic exits="15" enters="5" startTime="201608251345"/>
<traffic exits="7" enters="12" startTime="201608251400"/>
</site>
<site storeID="1438">
<traffic exits="18" enters="13" startTime="201608251330"/>
<traffic exits="14" enters="11" startTime="201608251345"/>
<traffic exits="12" enters="10" startTime="201608251400"/>
</site>
<site storeID="2029">
<traffic exits="15" enters="19" startTime="201608251030"/>
<traffic exits="23" enters="15" startTime="201608251045"/>
<traffic exits="16" enters="18" startTime="201608251100"/>
</site>
<site storeID="0471">
<traffic exits="19" enters="16" startTime="201608251230"/>
<traffic exits="25" enters="15" startTime="201608251245"/>
<traffic code="01" exits="" enters="" startTime="201608251300"/>
</site>
</xml>
I have played with the following props.conf
entries with pretty much zero success (not necessarily all at the same time--I haven't kept a running record).
SHOULD_LINEMERGE=false
BREAK_ONLY_BEFORE=<site\s
KV_MODE=xml
LINE_BREAKER=(<\/site>)
My attempts at using |xpath
or |rex max_match=0
have likewise yielded less than stellar results.
What I need is a method of correctly tying storeID with exits, enters, startTime, and code when it shows up.
Thanks in advance. 😄
Try this:
[shooperTrak]
KV_MODE = xml
MAX_TIMESTAMP_LOOKAHEAD = 128
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y%m%d%H%M
TIME_PREFIX = \sstartTime=\"
disabled = false
BREAK_ONLY_BEFORE = <site storeID
This should give you one event for each "site" element. You can configure props/transforms to get rid of the unneeded lines (XML declaration and "sites" envelope) if you want to.
That breaks events by storeID nicely but now I need a way to get enters/exits/code/startTime by storeID.
New event:
<site storeID="0310"><traffic exits="3" enters="7" startTime="201608250700"/><traffic exits="4" enters="6" startTime="201608250715"/><traffic exits="4" enters="4" startTime="201608250730"/><traffic exits="3" enters="6" startTime="201608250745"/>
I tried again with |xpath "//site/traffic/@enters/" outfield=enters
but wasn't able to get an 'enters' field and | stats first(site.traffic.@enters) BY site.@storeID site.traffic.@startTime
produces one result instead of four.
Ideally, the end result should look like:
startTime | storeID | enters | exits | code |\
---------------------------------------------------------\
201608250700 | 0133 | 7 | 3 | |\
201608250715 | 0133 | 6 | 4 | |\
201608250730 | 0133 | 6 | 3 | |\
UPDATE: This should work for you:
yoursearch | stats list(site.traffic{@startTime}) as Time, list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits, list(site.traffic{@code}) as Code by site{@storeID} | rename "site{@storeID}" as storeID | mvexpand Time | mvexpand Enters | mvexpand Exits | table Time storeID Enters Exits Code
I didn't have much time to spend on this, but you will hopefully get the idea from this:
<yoursearch> | stats list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits by site{@storeID}
It needs to be refined to add your startTime and you can convert the multivalue field into individual table entries, if need be.
I hope this gets you started.
I suspect what would work best is a custom response handler for the REST TA.
That's an alternative, if what I provided doesn't work for you (it worked for me with your sample data).