If I can see a pattern forming that will help me track users in my environment, how can I set up a search to serve these functions?
Case 1: A client consistently hits a set of URI stems. How can I track that scraper by looking for IP addresses or user agents that have hit all of those specific URI stems?
Case 2: If a client methodically increments through a specific field such as IPs or memberIDs (these are specific to my environment), how can I write a search to track their IP block or user agent?
I have an idea for Case 1 and similar issues. Create a CSV file (called uri.csv in the search below) containing all the URI stems that you want to track, and upload it using the lookups interface. The file should look like this:
uri_stem
splunk.com/apage
www.whatsup.com/anotherpage
"uri_stem" is the header line, which is required. Then search like this:
othersearchcriteria [ inputlookup uri.csv | rename uri_stem as search ]
| stats dc(uri) as count by ip
| where count >= [ inputlookup uri.csv | stats count | return $count ]
Two caveats. First, this search simply inserts the URI stems as search terms in the base search; it doesn't use the field names at all. However, you will need a field name for the stats command, and I assumed that it was "uri". Run the search and then look at the search job inspector to see what it actually did. Second, the list of uri_stems really shouldn't exceed about a hundred.
Finally, you could make a more sophisticated CSV, including more information or selection criteria, and use it as a true lookup. But I think this is a good starting point, as it shows a good way to build a dynamic list of criteria. To change the criteria, simply update the CSV file and reload it. You can even generate the CSV from another Splunk search - there are many possibilities.
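As a rough sketch of generating the CSV from another search: the search below rebuilds uri.csv from the most frequently requested URIs. It assumes the same "uri" field and "othersearchcriteria" placeholder as above; the 24-hour window and the limit of 50 are arbitrary choices of mine, so adjust them to your data.

```
othersearchcriteria earliest=-24h
| stats count by uri
| sort - count
| head 50
| rename uri as uri_stem
| fields uri_stem
| outputlookup uri.csv
```

By default outputlookup overwrites the existing uri.csv, so keep a copy if you maintain the list by hand. You could save this as a scheduled search to refresh the list automatically.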