The context: I'm looking for sensitive information patterns showing up in the IIS sourcetype that we have.
What I can already do: I can run this search:
sourcetype="iis"
| rex field=_raw "(?:^|[^0-9])(?<ccmaybe>(5[1-5][0-9]{14})|(4[0-9]{12}([0-9]{3})?)|(3[47][0-9]{13})|(6011[0-9]{12})|((30[0-5]|36[0-9]|38[0-9])[0-9]{11}))"
| search ccmaybe!=""
| table ccmaybe
What I need is the field each match shows up in, largely so I can exclude known fields that will never contain that data. But I do not want to list every field in the IIS logs by hand: partly because that query would be enormous, and partly because it would miss any fields we add to the logs later.
What should I do?
[edit 9/11] Updating with an example:
2018-09-11 18:25:33 172.0.0.1 GET /App/Admin/Login.aspx - 443 - 192.168.0.1 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_6)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/69.0.3497.81+Safari/537.36 - 302 0 0 0 127.0.0.1 1234567890abcdef-ABC - - TLSv1.2
The problem I'm having is, I want to search each field for anything that might have CC data, but I want to do this searching against the extracted fields, not against the raw data. I tried using the Luhn Splunk add-on, but it parses the entire raw log without spaces, which lumps everything together regardless of field.
Can you please post a sample of the data?
It's any IIS logs, but here's a simple example:
2018-09-11 18:25:33 172.x.x.x GET /App/Admin/Login.aspx - 443 - 192.168.0.1 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_6)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/69.0.3497.81+Safari/537.36 - 302 0 0 0 127.0.0.1 1234567890abcdef-ABC - - TLSv1.2
This expands out to various event fields, like true_ip and cs_method.
I think the punct field would be an excellent choice here. It would be the fastest way to exclude logging formats that are irrelevant.
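To find out which extracted field a hit came from, one option is to loop over the fields with foreach rather than running rex against _raw. The search below is an untested sketch, not a drop-in answer. It reuses the card-number patterns from the question, adds a trailing non-digit check (my addition), and collects the names of matching fields in a new multivalue field that I have called cc_fields (a made-up name). The fields - line only illustrates how a known-safe field could be dropped before the loop; cs_method is used because it was mentioned above. As far as I know, foreach * does not pick up internal fields such as _raw, so each test runs against one extracted field at a time.

```spl
sourcetype="iis"
| fields - cs_method
| foreach *
    [ eval cc_fields=if(match('<<FIELD>>', "(?:^|[^0-9])(?:5[1-5][0-9]{14}|4[0-9]{12}(?:[0-9]{3})?|3[47][0-9]{13}|6011[0-9]{12}|(?:30[0-5]|36[0-9]|38[0-9])[0-9]{11})(?:[^0-9]|$)"), mvappend(cc_fields, "<<FIELD>>"), cc_fields) ]
| where isnotnull(cc_fields)
| stats count by cc_fields
```

The final stats gives one row per field name that ever matched, which should make it easy to decide which fields to exclude. Replacing that last line with table _time cc_fields would show the individual events instead. Since no field names are hard-coded in the loop, fields added to the logs later would be covered as well.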