All Apps and Add-ons

How to tell splunk that 2 entries are related when feeding it data from a SQL database?

zsimic
Path Finder

Hi,

I'm trying to feed some data coming from a SQL database to Splunk. I have multiple tables that I'm trying to "flatten out" for Splunk, and I'd like to know if there is any way to keep the connections that existed in the SQL DB. I have something like this:

  • Table A:
    • Columns: timestamp, eventid, class, user, host and multiple other key-value pairs
  • Table B:
    • Columns: timestamp, eventid, severity, message
    • Many rows in Table B will correspond to 1 row in Table A
  • Table C:
    • Columns: eventid, type, value
    • Many rows in Table C will also correspond to 1 row in Table A

What I would like to know is basically what's the best approach to teach Splunk about the connection that exists here via 'eventid'? (which is just a number). I'm planning on simply outputting the contents of Table A and B in this form:

For Table A, one line per row:

[time] eventid=<number>, class=<string>, user=<string>, host=<string>, etc...

For Table B, also one line per row:

[time], eventid=<number>, severity=<string>: <message>

For Table C, also one line per row:

[time], eventid=<number>, type=<string>, value=<decimal number>

I'm new to Splunk, so maybe what I'm worrying about here is irrelevant for Splunk...
Is there a way to correlate entries from B and C without repeating 'class=...' in each entry output for B and C?
Can Splunk basically find all 'eventid' numbers that were generated for a given 'class' (in a given time frame), then fetch all B and C items for all those 'eventid'-s? And could it sum/average the 'value' found in C for such a 'class'?

I don't want to repeat 'class=' in each line for B and C because the amount of data repeated would be huge:

  • there are many fields like 'class' (like 'user' and 'host' shown above), some of them with relatively long values
  • there are many entries in B and C for 1 row in A...

To explain this with a more concrete example: suppose I search for all Table A entries with class=FOO, I get a set of entries found with their 'eventid', I would like to now show a chart per day with the number (count) of entries in Table B having one of the 'eventid'-s in that set.

Also, with Table C's 'value', I would like to see a chart with average 'value'-s per day, all the 'value'-s considered being averaged in that set of 'eventid'-s, and by 'type' (found in the Table C output).

What to do basically (how to best organize the data I output) to make sure Splunk has everything it needs to make the connections?

Tags (2)
1 Solution

mslvrstn
Communicator

I believe you'll feel pretty comfortable with the command you want : join

search sourcetype=A class=FOO | join eventid [search sourcetype=B] | timechart span=24h count by eventid

After the join, you'll have a set of results, one line for every event in B that had an eventid in the first search.
You can do something similar for C, or perhaps it will make more sense to make the join search [sourcetype=B OR sourcetype=C]

View solution in original post

mslvrstn
Communicator

I believe you'll feel pretty comfortable with the command you want : join

search sourcetype=A class=FOO | join eventid [search sourcetype=B] | timechart span=24h count by eventid

After the join, you'll have a set of results, one line for every event in B that had an eventid in the first search.
You can do something similar for C, or perhaps it will make more sense to make the join search [sourcetype=B OR sourcetype=C]

zsimic
Path Finder

Allright! Made it work, all I needed to do was add index=imported as well in the join 🙂 as in:

index=imported sourcetype=A class=FOO | join eventid [search index=imported sourcetype=B] | timechart span=24h count by eventid

zsimic
Path Finder

The join doesn't seem to work, I tried but I get no entries returned. The first part of the search had to be like this because of how I organized the data (had to omit the first 'search' word otherwise I get nothing):

index=imported sourcetype=A class=FOO

That returns all results as I expected, but adding a join yields 0 events, no matter what I try

Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...