Search for URL not in Alexa Top 1m

aer9480
Explorer

Hi everyone,
I have a log with a field that contains a URL. I would like to perform a Splunk search and find all logs where the resource name is not in the Alexa top 1 million sites list. I want to see what unpopular websites people are visiting by only displaying the sites that are not in the top million list. My URL field is just called "url," and I have the Alexa list in the Threat Intelligence Downloads section. Does anyone have a search query that can get me close to what I'm trying to do? Thanks!

tsepanik1
New Member

I ran into a similar problem when trying to use the Alexa data. It seems Alexa no longer allows their data to be downloaded unless you pay for it. I'm not sure whether Splunk has a special arrangement with them, but this article worked for me: http://www.georgestarcher.com/splunk-app-for-es-and-alexa-top-sites/ Switching over to the Cisco top 1m was a pretty seamless transition: it still uses the same lookup names and populates the same places as the Alexa data used to. Hope that helps!

DalJeanis
Legend

aer9480
Explorer

Thanks for the quick response! I have read through the article you posted, but even the simplest command, |inputlookup alexa_by_str.csv, returns no results. I would like to verify that this is the right name for the CSV. Do you know which tab I would find the filename under? The Threat Intelligence Downloads section says the Alexa CSV is named top-1m.csv.zip. I found this in the URL field, which shows where the file resides: https://s3.amazonaws.com/alexa-static/top-1m.csv.zip. I think this is only where the file is downloaded from, though, and I'm assuming the name changes once it has been downloaded. That's my guess as to why it's not returning results. Do you think it could be a wrong filename, or is something else going on here? Thanks again!

DalJeanis
Legend

@aer9480 - The file could actually be stored anywhere. What you need to use is the lookup name defined in the app.

Go to the app. Then, under Settings, go to Lookups, then Lookup definitions. The Name column is what you want to use.
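
As a rough sketch of how that name would then be used, assuming the definition is called alexa_lookup_by_str (the name reported later in this thread), you could first check that the lookup actually contains rows:

```
| inputlookup alexa_lookup_by_str
| stats count
```

If the count is 0, the lookup has not been populated yet, and any "not in the list" search would flag every domain. Once it returns rows, something along these lines might get close to the original question. The index and sourcetype are hypothetical placeholders, and the lookup field names domain and rank are assumptions that should be confirmed against the columns returned by the inputlookup above:

```
index=proxy sourcetype=web_proxy url=*
| rex field=url "^(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?(?:www\.)?(?<domain>[^/:?#]+)"
| eval domain=lower(domain)
| lookup alexa_lookup_by_str domain OUTPUT rank
| where isnull(rank)
| stats count by domain
| sort - count
```

The rex only strips the scheme and a leading "www.", so subdomains such as mail.example.com would not match a list entry for example.com and would show up as unlisted. Treat the results as a starting point rather than a definitive list.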

aer9480
Explorer

Under Lookup definitions, the Name is listed as alexa_lookup_by_str. I believe I should be using the inputlookup command to view the contents of the lookup table, but when I try |inputlookup alexa_lookup_by_str, no results are returned. Any ideas on what else I can try to troubleshoot this? Thanks!

aer9480
Explorer

@DalJeanis Any ideas on what could be causing me to not see the results? Thanks!
