Getting Data In

How to ingest Strava GPX (XML) data...

GentleBen187
New Member

I'm trying to ingest various kinds of data to learn as much as I can about Splunk data ingestion. My latest attempt is with my mountain biking data, downloaded in GPX file format from Strava.

The format looks like the below...just with a bunch more events, roughly every 10 seconds, that capture Lat, Lon, and elevation.

There are a couple of challenges here for me:

  1. I assume that I need to associate the <name> field, which only appears once per file, with every event in the file so that Splunk understands that all of the lat, lon, and ele events belong to the proper ride. How can I do this?
  2. As a corollary to the above, is it possible to have the <name> field become the SOURCE value (rather than the name of the source file)?
  3. OK, so maybe just one challenge with a couple of parts to it. 🙂
  4. PLEASE HELP!

    <?xml version="1.0" encoding="UTF-8"?>
    <gpx creator="strava.com Android" version="1.1" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
     <metadata>
      <time>2014-03-19T22:03:02Z</time>
     </metadata>
     <trk>
      <name>Albino squirrel ride</name>
      <trkseg>
       <trkpt lat="35.2376560" lon="-80.6323440">
        <ele>230.8</ele>
        <time>2014-03-19T22:03:02Z</time>
       </trkpt>
       <trkpt lat="35.2375570" lon="-80.6322680">
        <ele>230.9</ele>
        <time>2014-03-19T22:49:19Z</time>
       </trkpt>
       <trkpt lat="35.2375230" lon="-80.6322810">
        <ele>230.9</ele>
        <time>2014-03-19T22:49:22Z</time>
       </trkpt>
      </trkseg>
     </trk>
    </gpx>
    
0 Karma

to4kawa
Ultra Champion

UPDATED:

| makeresults 
| eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
  <gpx creator=\"strava.com Android\" version=\"1.1\" xmlns=\"http://www.topografix.com/GPX/1/1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd\">
  <metadata>
  <time>2014-03-19T22:03:02Z</time>
  </metadata>
  <trk>
  <name>Albino squirrel ride</name>
  <trkseg>
  <trkpt lat=\"35.2376560\" lon=\"-80.6323440\">
  <ele>230.8</ele>
  <time>2014-03-19T22:03:02Z</time>
  </trkpt>
  <trkpt lat=\"35.2375570\" lon=\"-80.6322680\">
  <ele>230.9</ele>
  <time>2014-03-19T22:49:19Z</time>
  </trkpt>
  <trkpt lat=\"35.2375230\" lon=\"-80.6322810\">
  <ele>230.9</ele>
  <time>2014-03-19T22:49:22Z</time>
  </trkpt>
  </trkseg>
  </trk>
  </gpx>" 
| spath path="gpx.trk.trkseg.trkpt{@lat}" output=lat
| spath path="gpx.trk.trkseg.trkpt{@lon}" output=lon
| spath path="gpx.trk.trkseg.trkpt.ele" output=ele
| spath path="gpx.trk.trkseg.trkpt.time" output=time
| fields - _*
| eval _counter=mvrange(0,mvcount(time))
| stats list(*) as * by _counter
| foreach *
    [ eval <<FIELD>> = mvindex(<<FIELD>>,_counter)]
| eval _time=strptime(replace(time,"Z"," +0000"),"%FT%T %z")
| fields _time lat lon ele time

If transaction does not work for you, this query does.
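
For readers who find the mvrange/mvindex step above hard to follow, here is a rough Python analogue of the idea, not a translation of the SPL itself. It assumes the four multivalue fields arrive as parallel lists of equal length; the function name unzip_points is made up for this illustration.

```python
from datetime import datetime


def unzip_points(lat, lon, ele, time):
    """Turn four parallel lists into one row per index position.

    Hypothetical helper: it mirrors the idea of building a counter with
    mvrange(0, mvcount(time)) and then picking the i-th value of every
    field with mvindex.
    """
    rows = []
    for i in range(len(time)):
        # Same trick as the query: swap the trailing "Z" for an explicit
        # UTC offset so the timestamp can be parsed with %z.
        parsed = datetime.strptime(
            time[i].replace("Z", "+0000"), "%Y-%m-%dT%H:%M:%S%z"
        )
        rows.append(
            {
                "_time": parsed.timestamp(),
                "lat": lat[i],
                "lon": lon[i],
                "ele": ele[i],
                "time": time[i],
            }
        )
    return rows


rows = unzip_points(
    ["35.2376560", "35.2375570", "35.2375230"],
    ["-80.6323440", "-80.6322680", "-80.6322810"],
    ["230.8", "230.9", "230.9"],
    ["2014-03-19T22:03:02Z", "2014-03-19T22:49:19Z", "2014-03-19T22:49:22Z"],
)
for row in rows:
    print(row)
```

Each row ends up with exactly one lat, lon, ele, and time, which is the shape the final fields command in the query keeps.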

0 Karma

gavsdavs
Observer

You aren't tied to ingesting the file as a single event.

What if I have over 10,000 points in a GPX file?

Rethink the content of the file: each point is an event, and the whole GPX file is a collection of events.

It's entirely up to you, but if you have 10,000 points in a file, it's easier to handle 10,000 events than one event with a set of 10,000-member multivalue fields.
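
One way to get "each point is an event" is to flatten the file before Splunk ever sees it. The sketch below does that as a pre-processing step in Python, which nobody in this thread actually proposed, so treat it as an assumption-laden illustration. The names gpx_to_events and to_line are hypothetical, and the key=value output layout is just one possible choice. It also copies the ride name onto every point, which is what the original question asks for.

```python
import xml.etree.ElementTree as ET

# GPX 1.1 namespace, taken from the xmlns attribute in the sample file.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<gpx creator="strava.com Android" version="1.1" xmlns="http://www.topografix.com/GPX/1/1">
 <trk>
  <name>Albino squirrel ride</name>
  <trkseg>
   <trkpt lat="35.2376560" lon="-80.6323440">
    <ele>230.8</ele>
    <time>2014-03-19T22:03:02Z</time>
   </trkpt>
   <trkpt lat="35.2375570" lon="-80.6322680">
    <ele>230.9</ele>
    <time>2014-03-19T22:49:19Z</time>
   </trkpt>
  </trkseg>
 </trk>
</gpx>"""


def gpx_to_events(gpx_bytes):
    """Yield one dict per <trkpt>, each carrying the name of its <trk>."""
    root = ET.fromstring(gpx_bytes)
    for trk in root.findall("gpx:trk", GPX_NS):
        ride = trk.findtext("gpx:name", default="", namespaces=GPX_NS)
        for pt in trk.iterfind("gpx:trkseg/gpx:trkpt", GPX_NS):
            yield {
                "time": pt.findtext("gpx:time", default="", namespaces=GPX_NS),
                "ride": ride,
                "lat": pt.get("lat"),
                "lon": pt.get("lon"),
                "ele": pt.findtext("gpx:ele", default="", namespaces=GPX_NS),
            }


def to_line(event):
    """Format one point as a single timestamp-first key=value line."""
    return '{time} ride="{ride}" lat={lat} lon={lon} ele={ele}'.format(**event)


events = list(gpx_to_events(SAMPLE))
for event in events:
    print(to_line(event))
```

Writing one line per point means the file can be ingested with plain line breaking, and the ride name no longer has to be joined back on at search time.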

0 Karma

to4kawa
Ultra Champion

Similar Splunk answer

What if I have over 10,000 points in a GPX file?

Whether the log is single-line or multi-line, it's no problem, because I don't use mvexpand.
I've updated my answer.
Also, I think transaction is too slow.

0 Karma

to4kawa
Ultra Champion

| makeresults 
| eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<gpx creator=\"strava.com Android\" version=\"1.1\" xmlns=\"http://www.topografix.com/GPX/1/1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd\">
<metadata>
<time>2014-03-19T22:03:02Z</time>
</metadata>
<trk>
<name>Albino squirrel ride</name>
<trkseg>
<trkpt lat=\"35.2376560\" lon=\"-80.6323440\">
<ele>230.8</ele>
<time>2014-03-19T22:03:02Z</time>
</trkpt>
<trkpt lat=\"35.2375570\" lon=\"-80.6322680\">
<ele>230.9</ele>
<time>2014-03-19T22:49:19Z</time>
</trkpt>
<trkpt lat=\"35.2375230\" lon=\"-80.6322810\">
<ele>230.9</ele>
<time>2014-03-19T22:49:22Z</time>
</trkpt>
</trkseg>
</trk>
</gpx>" 
| spath

Hi @gavsdavs,
spath is useful here.

gavsdavs
Observer

Yeah, I see that, but I get a single event with a load of multivalue fields, and I have to do an mvexpand dance to blow it all to pieces.

I personally prefer to work with the events separate and stats or transact them together rather than mvexpand them apart.

0 Karma

gavsdavs
Observer

Set up a parsing stanza (in props.conf) so the data is ingested with a break at every line
(SHOULD_LINEMERGE=false)

Then use something like

| transaction startswith="\<trkpt" endswith="\</trkpt\>"
| xmlkv
| table time lat lon
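
To picture what the line-per-event plus transaction approach is doing, here is a small Python sketch of the same grouping idea: collect lines from an opening <trkpt through the matching </trkpt>, then pull the fields out of each group. It is only an analogy under simple assumptions (one tag per line, as in the sample), and the function names are made up for the illustration.

```python
import re

SAMPLE_LINES = """<metadata>
<time>2014-03-19T22:03:02Z</time>
</metadata>
<trk>
<name>Albino squirrel ride</name>
<trkseg>
<trkpt lat="35.2376560" lon="-80.6323440">
<ele>230.8</ele>
<time>2014-03-19T22:03:02Z</time>
</trkpt>
<trkpt lat="35.2375570" lon="-80.6322680">
<ele>230.9</ele>
<time>2014-03-19T22:49:19Z</time>
</trkpt>
</trkseg>
</trk>""".splitlines()


def group_trkpt_lines(lines):
    """Join the lines from each opening <trkpt to its closing </trkpt>."""
    records, current = [], None
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<trkpt"):
            current = [stripped]  # a new group starts here
        elif current is not None:
            current.append(stripped)
            if stripped.startswith("</trkpt>"):
                records.append(" ".join(current))
                current = None  # lines outside a group are ignored
    return records


def extract_fields(record):
    """Pull the lat/lon attributes and the <time> element from one group."""
    attrs = dict(re.findall(r'(\w+)="([^"]*)"', record))
    match = re.search(r"<time>([^<]*)</time>", record)
    return {
        "time": match.group(1) if match else None,
        "lat": attrs.get("lat"),
        "lon": attrs.get("lon"),
    }


points = [extract_fields(r) for r in group_trkpt_lines(SAMPLE_LINES)]
for point in points:
    print(point)
```

Note that the <time> inside <metadata> is skipped, because it falls outside any <trkpt> group.
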
0 Karma

gsopkoTC
Path Finder

You know, I was looking to do the same thing (different activity) and I found this Splunk blog post:
http://blogs.splunk.com/2015/03/22/downhill-splunking-part-1/

I would also look up the field extractor function of Splunk as you have a specific field to capture.

0 Karma

diogofgm
SplunkTrust

Can you post a proper sample? Use the code tag.

------------
Hope I was able to help you. If so, some karma would be appreciated.
0 Karma

GentleBen187
New Member

Done! Apparently the code sample editor is a bit finicky. Thanks for taking the time to notify me that my code snippet didn't come through properly!

B

0 Karma