Knowledge Management

Problem with parsing multi-line Key Value events from DBconnect Dump

hmdoan
Explorer

I'm fairly new to using DB Connect. But I needed to index event data from a netcool reporting database (Oracle 11g). As this is an event correlation database, I was forced to perform a dump to Splunk on a daily basis (rising column tailing won't work here). I was using key-value format initially and it seemed to work just fine - until I realized that some Oracle columns were really poorly formatted with unmatched quotes, new lines, CRs, etc. in them.

This broke how Splunk indexed those events resulting in me missing about 10% of my total events.

The way around this was to use Multi-line Key Value format for my output.

Now everything came in just fine. All events were accounted for. However, my Splunk fields were all truncated:

Example:
The event shows the following:
summary=The power supply in Bay # 1 has changed to state: general failure
name=Bob Edwards
ticketnumber=No fault found

Splunk field shows up as:
summary=The
name=Bob
ticketnumber=No

How do I make Splunk pull in everything?

This was not a problem when I used the single key-value output - due to the quotes be automatically put in for strings. Not sure what to do form MKV.

0 Karma
1 Solution

hmdoan
Explorer

I really wish Splunk would put the same intelligence for field extraction in the MKV output as for the KV output. The KV output is smart enough to escape funny characters and add quotes for strings with spaces in them.

Oh well. My solution was even simpler. Clean up the events at the source. I ended up cleaning up my poorly formatted strings by running regexp_replace in my SQL query where ever I knew I might have had an ugly string. This cleaned up any special control characters as well as un-matched quotes. With this string clean-up in place, going back and using the KV output worked perfectly:

select
   to_char(lastmodified,'YYYY/MM/DD HH24:MI:SS') timestamp,
   to_char(lastmodified,'YYYY/MM/DD HH24:MI:SS') lastmodified,
   to_char(firstoccurrence,'YYYY/MM/DD HH24:MI:SS') firstoccurrence,
   to_char(lastoccurrence,'YYYY/MM/DD HH24:MI:SS') lastoccurrence,
   regexp_replace(summary,'[[:cntrl:]]|\"', '')summary,
   regexp_replace(text1,'[[:cntrl:]]|\"', '')text1,
   regexp_replace(text2,'[[:cntrl:]]|\"', ' ')text2,
   regexp_replace(text3,'[[:cntrl:]]|\"', ' ')text3,
   serial

from 
   Blah...

where
   Blah-Blah

View solution in original post

hmdoan
Explorer

I really wish Splunk would put the same intelligence for field extraction in the MKV output as for the KV output. The KV output is smart enough to escape funny characters and add quotes for strings with spaces in them.

Oh well. My solution was even simpler. Clean up the events at the source. I ended up cleaning up my poorly formatted strings by running regexp_replace in my SQL query where ever I knew I might have had an ugly string. This cleaned up any special control characters as well as un-matched quotes. With this string clean-up in place, going back and using the KV output worked perfectly:

select
   to_char(lastmodified,'YYYY/MM/DD HH24:MI:SS') timestamp,
   to_char(lastmodified,'YYYY/MM/DD HH24:MI:SS') lastmodified,
   to_char(firstoccurrence,'YYYY/MM/DD HH24:MI:SS') firstoccurrence,
   to_char(lastoccurrence,'YYYY/MM/DD HH24:MI:SS') lastoccurrence,
   regexp_replace(summary,'[[:cntrl:]]|\"', '')summary,
   regexp_replace(text1,'[[:cntrl:]]|\"', '')text1,
   regexp_replace(text2,'[[:cntrl:]]|\"', ' ')text2,
   regexp_replace(text3,'[[:cntrl:]]|\"', ' ')text3,
   serial

from 
   Blah...

where
   Blah-Blah

jcoates_splunk
Splunk Employee
Splunk Employee

the fields with spaces in them need to be quoted, or you need to change your delimiter (though if you do it that way, you'll then need to do your own key-value extraction).

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...