All Apps and Add-ons

How do I hide parameters in an SQL query?

geneoshaughness
Explorer

I have a slow query log from MySQL that, among other things, has the query statement in an extracted field.
For example the query might be:
SELECT * FROM SAMPLE WHERE USER_ID = 'John Doe' and COMPANY = 'Widget Inc.' and UID = 9999

I'd like to be able to group the SQL statements to see how often they occur.
I'd like to get a field that excludes all digits and any text between quotes.
For the same example, I'd like the field to look like this:
SELECT * FROM SAMPLE WHERE USER_ID = '' and COMPANY = '' and UID =

Thanks

1 Solution

geneoshaughness
Explorer

I should have updated this a while ago.
I was able to do this simply with a field extraction and by adding a field with an eval statement in the query.

A different thread suggested this stanza in /opt/splunk/etc/system/local/props.conf

[slow_query]
LINE_BREAKER = (?:;|Argument)(\n)(?:\# Time: |\# User@Host: )
SHOULD_LINEMERGE = false
TRUNCATE = 0

I added one field extraction for my slow_query source type to get all of the relevant fields:

[^\[\n]*\[(?P<DB_User>[^\]]+)[^\[\n]*\[(?P<DB_Host>[^\]]+).*\n# Query_time: (?P<Query_time>\d*\.\d*)  Lock_time: (?P<Lock_time>\d*\.\d*) Rows_sent: (?P<Rows_sent>\d*)  Rows_examined: (?P<Rows_examined>\d*).*SET timestamp\=\d+\;(?P<SQL_STATEMENT>.*)\;

In the search bar I used this:

sourcetype="slow_query"| eval Norm_sql=replace(SQL_STATEMENT, "(\d+|\'.*?\')", "XXXX")
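The effect of that replace() can be sketched in Python (a rough illustration, not part of the original post; Splunk's replace() uses PCRE, and Python's re behaves the same way for this pattern):

```python
import re

# Same pattern as in the eval: mask integer literals and single-quoted
# strings so structurally identical queries collapse into one form.
PATTERN = re.compile(r"(\d+|'.*?')")

sql = ("SELECT * FROM SAMPLE WHERE USER_ID = 'John Doe' "
       "and COMPANY = 'Widget Inc.' and UID = 9999")
norm = PATTERN.sub("XXXX", sql)
print(norm)
# SELECT * FROM SAMPLE WHERE USER_ID = XXXX and COMPANY = XXXX and UID = XXXX
```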

This chart adds up all of the time the database spent on each query in the slow log and ranks them. If your database server can handle the load, you could set long_query_time down to 0 to capture everything, but it will definitely hurt performance.

sourcetype="slow_query"| eval Norm_sql=replace(SQL_STATEMENT, "(\d+|\'.*?\')", "XXXX")  | stats sum(Query_time) by Norm_sql | sort sum(Query_time) desc
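The stats/sort pipeline behaves roughly like the following grouping, sketched in Python. The sample rows are invented purely for illustration; the normalization pattern is the one from the search above:

```python
import re
from collections import defaultdict

PATTERN = re.compile(r"(\d+|'.*?')")

# Hypothetical (SQL_STATEMENT, Query_time) rows from the slow log.
events = [
    ("SELECT * FROM SAMPLE WHERE UID = 9999", 4.2),
    ("SELECT * FROM SAMPLE WHERE UID = 1234", 3.1),
    ("SELECT * FROM ORDERS WHERE STATUS = 'open'", 6.0),
]

totals = defaultdict(float)
for sql, query_time in events:
    norm = PATTERN.sub("XXXX", sql)   # eval Norm_sql=replace(...)
    totals[norm] += query_time        # stats sum(Query_time) by Norm_sql

# sort sum(Query_time) desc
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
for norm, total in ranked:
    print(f"{total:.1f}  {norm}")
```

The two UID queries collapse into one normalized statement, so their times are summed before ranking.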


dwaddle
SplunkTrust
SplunkTrust

In the past, I have used sed expressions to normalize SQL statements. It gets kinda iffy depending on the complexity of your SQLs, but it is workable. Something like this:

| rex mode=sed "s/=\s+'[^']+'/= ?/g" | rex mode=sed "s/=\s+[0-9]+/= ?/g"

It becomes a game of figuring out what patterns you want to replace with "?" from the original SQL.
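Those two sed substitutions can be mirrored in Python for experimenting outside Splunk (a sketch, not part of the original answer; note that `\s+` means the patterns only catch values with whitespace after the `=`):

```python
import re

def normalize(sql: str) -> str:
    # Mirrors the two sed expressions above: replace quoted literals,
    # then numeric literals, with "?" placeholders.
    sql = re.sub(r"=\s+'[^']+'", "= ?", sql)
    sql = re.sub(r"=\s+[0-9]+", "= ?", sql)
    return sql

print(normalize("SELECT * FROM SAMPLE WHERE USER_ID = 'John Doe' and UID = 9999"))
# SELECT * FROM SAMPLE WHERE USER_ID = ? and UID = ?
```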

Kate_Lawrence-G
Contributor

dwaddle is right on this one. I find Kodos - http://kodos.sourceforge.net/ very helpful for figuring out complex regular expressions.

dwaddle
SplunkTrust
SplunkTrust

Well, it's possible I got the regex wrong. Or, it's equally possible that it's not in the right place in your search string. You need to know a little about how to apply regular expressions and sed transformation rules.

geneoshaughness
Explorer

That looks like it would do exactly what I want, but I don't know what to do with it. I tried adding it at the end of my search string, but it didn't do anything.

Kate_Lawrence-G
Contributor

Hi,

Well, I think the way to do this is to look at your three objects (USER_ID, COMPANY, UID) and break them out into fields, something like this (assuming Splunk isn't already extracting them):

source=<mysqlquerylow> "SELECT * FROM SAMPLE WHERE" | rex "USER_ID = '(?<usrid>[^']+)'" | rex "COMPANY = '(?<cmpy>[^']+)'" | rex "UID = (?<uid>\d+)" | eval yourfield="SELECT * FROM SAMPLE WHERE USER_ID = '".usrid."' and COMPANY = '".cmpy."' and UID = ".uid | stats count by yourfield

The benefit of this is that you can report on the fields you extracted and count them individually (e.g., how often UID 9999 comes up).

geneoshaughness
Explorer

It could be any SQL statement that takes over 3 seconds to run. It includes updates, inserts, and selects, and often involves several forms of JOIN statements and subqueries.

Kate_Lawrence-G
Contributor

OK, I think I see where you are going, but you will still need to break the data out into fields to report on it effectively.

Are these queries in any way similar? Do they start or end with something definite? Also, how are you determining whether they are poorly performing? Is that info in the log as well?

geneoshaughness
Explorer

I should have been clearer. I don't know what those fields are.
There are hundreds of different queries, each with a different WHERE clause.

Many of the queries are generated when the user makes different selections in the software.

My goal is to find poorly performing queries that are frequently used in order to optimize the database to perform better.
