Unable to Index XML file

akram
Explorer

Hi,
I would like to index a whole folder that contains XML files for an SSO system. The XML log file names end with .svclog. The XML files contain info such as:

However, when I remove the first line of the logs, Splunk is suddenly able to index them.

Questions:
1. Does Splunk have a limitation on indexing XML files when the file contains a header that might prevent Splunk from indexing it?
2. How can I resolve this without modifying the raw file?

Thanks

mikaelbje
Motivator

In case you're still looking for a solution to this and need working line breaking, correct timestamp extraction, and automatic extraction of the XML tags, you can use the following:

[your_sourcetype_name]
LINE_BREAKER = [\>\s]((?=\#E2ETraceEvent))
SHOULD_LINEMERGE = False
TIME_PREFIX = SystemTime
MAX_TIMESTAMP_LOOKAHEAD = 50
KV_MODE = xml

Replace # with <

I also used initCrcLength=1024 in inputs.conf for the monitor stanza.
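To see where that LINE_BREAKER would split the data, here is a rough Python sketch. It assumes Splunk's documented rule that the previous event ends where the first capturing group starts and the next event begins where that group ends. The sample data and the split_events helper are made up for illustration, and the pattern is written with # already replaced by <.

```python
import re

# LINE_BREAKER from the post, with '#' replaced by '<'.
# The capturing group is an empty lookahead, so no text is discarded.
LINE_BREAKER = re.compile(r"[>\s]((?=<E2ETraceEvent))")

def split_events(raw):
    """Approximate Splunk's event breaking for the pattern above (hypothetical helper)."""
    events, start = [], 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # previous event ends at group start
        start = match.end(1)                      # next event starts at group end
    events.append(raw[start:])
    return events

# Made-up sample, not a real .svclog file.
sample = (
    '<E2ETraceEvent><TimeCreated SystemTime="2013-01-01T00:00:00Z"/></E2ETraceEvent>'
    '<E2ETraceEvent><TimeCreated SystemTime="2013-01-01T00:00:01Z"/></E2ETraceEvent>\n'
    '<E2ETraceEvent><TimeCreated SystemTime="2013-01-01T00:00:02Z"/></E2ETraceEvent>'
)

events = split_events(sample)
for event in events:
    print(repr(event))
```

Each printed event should start at an E2ETraceEvent tag, whether the events in the file are separated by a newline or run together on one line.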

jonuwz
Influencer

Splunk uses xpath.

XPath does NOT (as part of its specification) search non-null namespaces unless you tell it to.

<root xmlns="http://example.com">
  <key>value</key>
</root>

is not the same as:

<root>
  <key>value</key>
</root>

In the 1st example, key is actually "http://example.com":key

In the 2nd example, key is just "key" and the namespace is null - so xpath will find it.

In programming languages you usually have to register the default namespace (dn = "http://example.com"), then use it in the XPath expression:

nodes = xpath("//dn:key")

This is a huge pain in all languages. Non-default namespaces are always explicitly declared in the XML element names. It's only default namespaces that trip people up.
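A minimal sketch of the same gotcha in Python, using the standard library's xml.etree.ElementTree (which supports only a subset of XPath, and is not what Splunk itself runs). The dn prefix is an arbitrary name chosen here, as in the pseudocode above.

```python
import xml.etree.ElementTree as ET

with_ns = ET.fromstring('<root xmlns="http://example.com"><key>value</key></root>')
without_ns = ET.fromstring('<root><key>value</key></root>')

# No namespace in the document: the plain name matches.
plain_hit = without_ns.find("key")

# Default namespace in the document: the plain name finds nothing.
plain_miss = with_ns.find("key")

# Register a prefix for the default namespace and use it in the expression.
namespaces = {"dn": "http://example.com"}
ns_hit = with_ns.find("dn:key", namespaces)

print(plain_hit.text)   # value
print(plain_miss)       # None
print(ns_hit.text)      # value
```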

If Martin's answer does not solve the problem and you've stumbled across this 'gotcha', the easiest solution is going to be to SEDCMD out the xmlns="*" section in the logs before they are indexed. You can do this in Splunk.
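Before putting a substitution into Splunk, it can help to try the regular expression on a sample first. The sketch below does that in Python; the pattern and the strip_default_namespace helper are assumptions for illustration, not something taken from this thread. In props.conf the equivalent would presumably be a sed-style expression under a SEDCMD-<class> setting, which should be checked against the Splunk documentation.

```python
import re
import xml.etree.ElementTree as ET

# Matches only the default namespace declaration (xmlns="..."),
# not prefixed ones such as xmlns:foo="...".
XMLNS_ATTR = re.compile(r'\s+xmlns="[^"]*"')

def strip_default_namespace(raw):
    """Remove every default xmlns declaration from the raw text (hypothetical helper)."""
    return XMLNS_ATTR.sub("", raw)

raw = '<root xmlns="http://example.com"><key>value</key></root>'
cleaned = strip_default_namespace(raw)
print(cleaned)  # <root><key>value</key></root>

# With the declaration gone, a plain name lookup succeeds.
print(ET.fromstring(cleaned).find("key").text)  # value
```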

martin_mueller
SplunkTrust

If the XML files all share the same beginning, Splunk's default mechanism for detecting duplicates and log rotation may throw out all but the first file. Removing the first line may move unique parts of the file into the CRC window. See http://docs.splunk.com/Documentation/Splunk/latest/Data/Howlogfilerotationishandled for more info.
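A rough sketch of that effect, assuming the check covers the first 256 bytes of each file (Splunk's default initCrcLength). zlib.crc32 is only a stand-in for whatever checksum Splunk uses internally, and the file contents are made up.

```python
import zlib

def head_crc(data, length=256):
    """Checksum of the first `length` bytes, standing in for Splunk's initial CRC."""
    return zlib.crc32(data[:length])

# Made-up header: an XML declaration plus filler, 269 bytes in total.
first_line = b'<?xml version="1.0" encoding="utf-8"?>\n'
shared_rest = b"<!-- " + b"x" * 220 + b" -->\n"

file_a = first_line + shared_rest + b"<E2ETraceEvent>1</E2ETraceEvent>"
file_b = first_line + shared_rest + b"<E2ETraceEvent>2</E2ETraceEvent>"

# The first 256 bytes are identical, so both files look like the same file.
collide = head_crc(file_a) == head_crc(file_b)

# Dropping the first line shifts the unique part into the 256-byte window.
trimmed_a = file_a.split(b"\n", 1)[1]
trimmed_b = file_b.split(b"\n", 1)[1]
collide_trimmed = head_crc(trimmed_a) == head_crc(trimmed_b)

print(collide)          # True
print(collide_trimmed)  # False
```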

martin_mueller
SplunkTrust

Great - I've converted my comment to an answer to let you mark it as solved.

akram
Explorer

Guys, crcSalt = solved my issue. 😉 Thanks, appreciate it.

martin_mueller
SplunkTrust

If the CRC is the cause of the issue, then yes... however, if you are only a bit away from the window length you might just increase the window a bit (initCrcLength). What's best for you depends on your files.
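The two options can be compared with a small sketch. It assumes a 256-byte default window, uses zlib.crc32 as a stand-in for Splunk's checksum, and treats the salt as something unique per file such as the source path (which is what Splunk's crcSalt = <SOURCE> setting provides). The paths and file contents are made up.

```python
import zlib

def head_crc(data, length=256, salt=b""):
    """Checksum over an optional salt plus the first `length` bytes of the file."""
    return zlib.crc32(salt + data[:length])

# Made-up files sharing a header that is longer than 256 bytes.
shared_header = b"<!-- " + b"x" * 300 + b" -->\n"
file_a = shared_header + b"<E2ETraceEvent>1</E2ETraceEvent>"
file_b = shared_header + b"<E2ETraceEvent>2</E2ETraceEvent>"

# Default window: the files are indistinguishable.
same_default = head_crc(file_a) == head_crc(file_b)

# Longer window (like initCrcLength = 1024): the unique part is now included.
same_longer = head_crc(file_a, 1024) == head_crc(file_b, 1024)

# Salting with the (hypothetical) source path: the checksums differ
# even though the window itself is unchanged.
same_salted = (
    head_crc(file_a, salt=b"/logs/a.svclog")
    == head_crc(file_b, salt=b"/logs/b.svclog")
)

print(same_default, same_longer, same_salted)  # True False False
```

A longer window only helps when the files really differ within it, while a path-based salt makes Splunk treat a renamed or rotated copy of the same file as new, which is why the better choice depends on the files.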

akram
Explorer

If I set crcSalt = will that solve this?
