Find the Distance Between Two or More Geolocation Coordinates
Comment by aworkman on aworkman's answer
Answer by MuS
Hi there,
Fast forward to today: `eval` now has trigonometric functions, so the *great-circle formula* (haversine form) can be computed directly in Splunk.
This example will provide the expected result:
```
| makeresults
| eval lat1=1, lon1=1, lat2=2, lon2=2
| eval rlat1=pi()*lat1/180, rlat2=pi()*lat2/180, rlat=pi()*(lat2-lat1)/180, rlon=pi()*(lon2-lon1)/180
| eval a=sin(rlat/2)*sin(rlat/2) + cos(rlat1)*cos(rlat2)*sin(rlon/2)*sin(rlon/2)
| eval c=2*atan2(sqrt(a), sqrt(1-a))
| eval distance=6371*c
| table lat1 lon1 lat2 lon2 distance
```
`distance` will be the distance in `km`, because 6371 is the Earth's mean radius in kilometres.
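If you want to sanity-check the number outside Splunk, the same arithmetic can be reproduced in a few lines of Python. This is only a sketch that mirrors the search above step by step; the function name `haversine_km` and the constant name are my own, not anything Splunk provides.

```python
import math

EARTH_RADIUS_KM = 6371  # same mean-radius constant as in the search above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)  # corresponds to rlat in the search
    dlon = math.radians(lon2 - lon1)  # corresponds to rlon in the search
    a = (math.sin(dlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


# Same sample points as the makeresults example: (1, 1) and (2, 2).
distance = haversine_km(1, 1, 2, 2)
print(round(distance, 1))  # about 157 km
```

The Splunk search and this script should agree to within rounding, since both use the same constant and the same formula.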
Hope this helps ...
Answer by Damien Dallimore
There is a [Haversine add-on on Splunkbase][1] that should do the trick for you.
Answer by rgonzale6
I'm working on a similar query and I much appreciate what you've both done here. I've worked up this:
```
| lookup geoip clientip
| dedup userID, client_city
| eval location=clientip."- ".client_city.", ".client_region.", ".client_country
| stats last(client_lat) as Lat1, last(client_lon) as Lon1, first(client_lat) as Lat2, first(client_lon) as Lon2, values(location) dc(client_city) as distinctCount by userID
| where distinctCount = 2
| eval distance=sqrt(pow(Lat1-Lat2,2)+pow(Lon1-Lon2,2))
| sort distance desc
```
Comment by sideview on sideview's answer
No, I don't see why you'd need to do the distance calculation *within* the stats clause. That would be a little crazy. Do it before and use some form of `last(distance) as distance by username`, or `by username distance` in your stats, and then filter afterwards. Or use some form of `last(src_ip_latitude) as src_ip_latitude last(src_ip_longitude) as src_ip_longitude` in stats and then do the distance calculation after.
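The order of operations described here (aggregate per user first, compute the distance on the aggregated row, filter last) can be sketched outside Splunk as well. The Python below is only an illustration of that ordering, not SPL: the event tuples are made-up sample data, and `haversine_km` is a hypothetical helper standing in for whatever distance calculation you use.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (hypothetical helper, mean radius 6371 km)."""
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2)
    return 6371 * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# Made-up events: (username, src_ip_latitude, src_ip_longitude), oldest first.
events = [
    ("alice", 51.5, -0.1),
    ("bob", 40.7, -74.0),
    ("alice", 48.9, 2.4),
    ("bob", 40.8, -73.9),
]

# Step 1, the "stats ... by username" step: keep first and last coordinates per user.
first_last = {}
for user, lat, lon in events:
    if user not in first_last:
        first_last[user] = [(lat, lon), (lat, lon)]
    else:
        first_last[user][1] = (lat, lon)

# Step 2, the "eval" after stats: one distance per aggregated row.
distances = {user: haversine_km(*a, *b) for user, (a, b) in first_last.items()}

# Step 3, the "where" filter: keep only users who moved more than 100 km.
far = {user: d for user, d in distances.items() if d > 100}
print(sorted(far))
```

The point is that the coordinates survive the aggregation step under new names, so the distance never has to be computed inside the aggregation itself.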
I think my question is a little more complex than I initially thought. My current base search only has the src_ip_latitude and src_ip_longitude fields. I want to break them up (e.g. latitude1, latitude2, etc.) grouped by the username. I'm thinking I would need to alter the end of my search to something like "where (count_country > 1) AND (distance > 100)". That means I likely need to do the distance calculation within my stats clause, because after my stats clause I no longer have access to the latitude and longitude fields.
Assuming you have those other four fields in your events, just tack the `| eval ` onto the end of the search. That eval alone will add an additional field called "distance" to all rows. Again, all four of those fields must be present under those exact, case-sensitive names on all events, or more generally on all incoming rows, whether they're events or have already been transformed or altered by other search language commands.
I completely forgot about the fact that the Earth is round. :-) Too bad I can't use the great-circle formula.
Answer by sideview
The Pythagorean theorem is a good approximation only for shorter distances. If you're actually dealing with pretty big distances you have to break out some trig functions and calculate the great-circle distance. http://en.wikipedia.org/wiki/Great-circle_distance
And since eval can't do trig functions ( see http://splunk-base.splunk.com/answers/26399/can-eval-evaluate-cosines ) that would lead you back to a custom search command again.
However, if your distances are all short enough, then what you propose just needs to be plugged into eval.
`| eval distance=sqrt(pow(src_ip_latitude1-src_ip_latitude2,2)+pow(src_ip_longitude1-src_ip_longitude2,2))`
Once that eval clause gives you that field called distance on your rows, you can do whatever you want with it.
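To get a feel for why the Pythagorean shortcut only holds over short distances, here is a small Python comparison (a sketch, not Splunk code). It applies the flat formula to raw degrees, as in the eval above, and compares it with a great-circle calculation for two made-up pairs of points that are both 10 degrees of longitude apart, one on the equator and one at 60° north. The helper names are my own.

```python
import math


def flat_degrees(lat1, lon1, lat2, lon2):
    """Pythagorean distance on raw lat/lon; the result is in degrees, not km."""
    return math.sqrt((lat1 - lat2) ** 2 + (lon1 - lon2) ** 2)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a mean Earth radius of 6371 km."""
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2)
    return 6371 * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))


equator = (0, 0, 0, 10)    # 10 degrees of longitude apart on the equator
north = (60, 0, 60, 10)    # 10 degrees of longitude apart at 60 N

flat_equator = flat_degrees(*equator)
flat_north = flat_degrees(*north)
km_equator = haversine_km(*equator)
km_north = haversine_km(*north)

# The flat formula cannot tell the two pairs apart ...
print(flat_equator, flat_north)
# ... while the great-circle distance at 60 N is roughly half the equatorial one.
print(round(km_equator), round(km_north))
```

So the flat value is fine for ranking nearby points against each other, but it is not a distance in kilometres and it increasingly overstates east-west separation as you move away from the equator.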