How to use geostats with a lookup containing Zip Codes, Population, and Latitude/Longitude to aggregate number of events against Population?

chowspecial
New Member

Hey guys,

So I have events that contain a lat/long. Here's an example of an event from the access log:

/search?queryText=somesearch&lat=37.812889&lon=-122.432615

I have a lookup table with zip codes, population, and the lat/long of the center of each zip code.
Here's an example of a row in the CSV that was uploaded:

94109, 55984, 37.7955808980854, -122.422218643082

I'd like to do a geostats where I can aggregate number of events against Population. Aka "in this general blob we are seeing this many events per estimated population."

I'm not entirely sure where to start, or whether this is even possible.

FYI, I'm going down the path of getting zip codes into those events to do a straight up lookup, but for now, I'd like to see what I can do.

jplumsdaine22
Influencer

Assuming your events contain zip codes, your search might look something like this:

<event search> | stats <some aggregator> by zipcode | lookup ... | geostats ...

(I have elided the lookup and geostats commands, as they are totally dependent on what your tables look like)

By default, geostats will group locations into buckets. You can change the resolution of those buckets with the binspanlat and binspanlong options to geostats. See http://docs.splunk.com/Documentation/Splunk/6.2.0/SearchReference/Geostats. There are also a few examples already on Splunk Answers.
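
To make the idea of bucketing by a latitude/longitude span concrete, here is a minimal Python sketch. It is only an illustration of grid binning, not how geostats is implemented internally; the function names `grid_cell` and `count_by_cell`, the 0.5 degree spans, and the third sample point are all assumptions made for the example.

```python
import math
from collections import Counter

def grid_cell(lat, lon, span_lat=0.5, span_lon=0.5):
    """Return the south-west corner of the grid cell containing (lat, lon).

    span_lat / span_lon play a role similar to a bin span: larger values
    give coarser buckets. (Hypothetical helper, not a Splunk API.)
    """
    return (math.floor(lat / span_lat) * span_lat,
            math.floor(lon / span_lon) * span_lon)

def count_by_cell(events, span_lat=0.5, span_lon=0.5):
    """Count events per grid cell."""
    return Counter(grid_cell(lat, lon, span_lat, span_lon) for lat, lon in events)

# First two points come from the question; the third is a made-up far-away point.
events = [
    (37.812889, -122.432615),
    (37.7955808980854, -122.422218643082),
    (40.7128, -74.0060),
]

counts = count_by_cell(events)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

With 0.5 degree spans, the two San Francisco points fall into the same cell; shrinking the spans separates nearby points into finer buckets.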

If you're using Splunk 6.3, you can make choropleths, which are even better. You'll need a KMZ (a zipped KML) file with the boundaries of the area you're interested in, rather than a straight lat/long table. See the geom command: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Geom

chowspecial
New Member

Yes, as I mentioned, my events DON'T have zip codes. I'm working on the plausibility of getting that in. Unfortunately we're not using 6.3 yet, but I may push for that.

jplumsdaine22
Influencer

Actually, you did not mention that. All you said was that your events contain lat/long. You will need to be clearer about the event data. For example, do the lat/long fields in your events correspond directly to lat/long entries in your lookup table? Also, what do you mean by Population? Do the events contain population stats, or do you mean zip code when you say population?

It will be easier to assist if you post:
A) Some sample events
B) A sample table of results

chowspecial
New Member

Ok. I have updated my question. I am going down the path of working with devs to see if getting zip code in the events is doable so that I can just do a straight lookup, but that will likely add unnecessary geo-lookups against our quota with our providers.

jplumsdaine22
Influencer

I see now. Yeah, it's going to be tricky, as there is no way to define a zip code for each lat/long pair in your data. You could try bucketing, but it won't be accurate, as each zip code is not a simple radius. You're going to need a different lookup table that contains the entire lat/long area for each zip code.

You should be able to find some data online (try http://www.opengeocode.org/download.php).
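
As a rough illustration of the approximate bucketing mentioned above, the Python sketch below assigns each event to the nearest zip-code center and then computes events per 1,000 residents. It is a sketch under stated assumptions, not a Splunk feature: `haversine_km` and `nearest_zip` are hypothetical helpers, the second table row is made up purely so there is something to compare against, and nearest-center matching will misassign events near irregular zip boundaries, which is exactly the inaccuracy described.

```python
import math
from collections import Counter

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# (zip, population, center_lat, center_lon)
zip_table = [
    ("94109", 55984, 37.7955808980854, -122.422218643082),  # row from the question
    ("00000", 10000, 40.0, -100.0),                         # made-up placeholder row
]

def nearest_zip(lat, lon, table):
    """Return the zip code whose center is closest to (lat, lon)."""
    return min(table, key=lambda row: haversine_km(lat, lon, row[2], row[3]))[0]

# Event coordinates taken from the sample access-log line in the question.
events = [(37.812889, -122.432615)]

counts = Counter(nearest_zip(lat, lon, zip_table) for lat, lon in events)
population = {z: pop for z, pop, _, _ in zip_table}

# Events per 1,000 residents for each zip code that received events.
rate_per_1000 = {z: n / population[z] * 1000 for z, n in counts.items()}
print(rate_per_1000)
```

A lookup built from real zip-code boundary polygons would replace `nearest_zip` with a point-in-polygon test; the per-population arithmetic stays the same.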

chowspecial
New Member

Cool. I may loosen my constraint to an entire CBSA for now, since at that scale the fuzziness of a region not being a simple radius matters less. Thanks!
