Hi,
I would like to index a whole folder that contains XML files for an SSO system. The XML log files end with .svclog. The XML files contain information such as:
However, when I try to remove the first line of the logs,
Questions:
1. Does Splunk have a limitation on indexing XML files when the file contains a header that might prevent it from being indexed?
2. How can I resolve this without modifying the raw files?
Thanks
In case you're still looking for a solution to this and need working line breaking, correct timestamp extraction, and automatic extraction of the XML tags, you can use the following:
[your_sourcetype_name]
LINE_BREAKER = [\>\s]((?=\<E2ETraceEvent))
SHOULD_LINEMERGE = False
TIME_PREFIX = SystemTime
MAX_TIMESTAMP_LOOKAHEAD = 50
KV_MODE = XML
I also used initCrcLength=1024 in inputs.conf for the monitor statement.
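To see roughly what the line breaker and the timestamp prefix are meant to do, here is a minimal Python sketch. It is not how Splunk implements LINE_BREAKER or TIME_PREFIX. It only mimics the idea with ordinary regular expressions, and the two sample events are made up for illustration.

```python
import re

# Made-up sample: two trace events written back to back, as in a .svclog file.
sample = (
    '<E2ETraceEvent><System>'
    '<TimeCreated SystemTime="2013-01-01T10:00:00.0000000Z" />'
    '</System></E2ETraceEvent>'
    '<E2ETraceEvent><System>'
    '<TimeCreated SystemTime="2013-01-01T10:00:05.0000000Z" />'
    '</System></E2ETraceEvent>'
)

# Zero-width boundary: preceded by ">" or whitespace and followed by the
# opening tag of the next event. Nothing is consumed by the split.
boundary = re.compile(r'(?<=[>\s])(?=<E2ETraceEvent)')
events = [e for e in boundary.split(sample) if e.strip()]

# Rough analogue of TIME_PREFIX = SystemTime: read the value that follows
# the SystemTime attribute name in each event.
stamp = re.compile(r'SystemTime="([^"]+)"')
timestamps = [stamp.search(e).group(1) for e in events]

for ts, event in zip(timestamps, events):
    print(ts, len(event))
```

Each element of `events` starts at an opening E2ETraceEvent tag, which is the property the stanza above relies on for one event per trace record.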
Splunk uses XPath.
XPath does NOT (as part of its specification) search non-null namespaces unless you tell it to.
<root xmlns="http://example.com">
<key>value</key>
</root>
is not the same as:
<root>
<key>value</key>
</root>
In the 1st example, key is actually "http://example.com":key
In the 2nd example, key is just "key" and the namespace is null - so xpath will find it.
In programming languages you usually have to register the default namespace under a prefix (dn = "http://example.com"), then use that prefix in the XPath expression:
nodes = xpath("//dn:key")
This is a huge pain in all languages. Non-default namespaces are always explicitly declared in the XML element names. It's only default namespaces that trip people up.
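The difference is easy to reproduce outside Splunk. The sketch below uses Python's standard xml.etree.ElementTree with the two example documents from above. The prefix name dn is arbitrary and only exists in the lookup mapping.

```python
import xml.etree.ElementTree as ET

with_ns = ET.fromstring('<root xmlns="http://example.com"><key>value</key></root>')
without_ns = ET.fromstring('<root><key>value</key></root>')

# Without a default namespace, the plain name matches.
plain_hit = without_ns.find('key')

# With a default namespace, the plain name finds nothing.
plain_miss = with_ns.find('key')

# Register the namespace under a prefix and use it in the path.
ns = {'dn': 'http://example.com'}
prefixed_hit = with_ns.find('dn:key', ns)

# The parser stores the fully qualified name on the element.
qualified_tag = with_ns[0].tag

print(plain_hit.text, plain_miss, prefixed_hit.text, qualified_tag)
```

The last value printed is the qualified tag, which shows that the element's real name includes the namespace URI.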
If Martin's answer does not solve the problem and you've stumbled across this 'gotcha', the easiest solution is going to be to use a sed command (SEDCMD in props.conf) to strip the xmlns="..." attribute from the logs before they are indexed. You can do this in Splunk.
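As a rough illustration of what such a substitution does, here is the same idea in Python. The regular expression is my own assumption about what you would want to match, not something taken from Splunk: it removes only default namespace declarations and leaves prefixed ones such as xmlns:foo="..." alone. Test it against your own data before using a similar pattern at index time.

```python
import re
import xml.etree.ElementTree as ET

# Matches a default namespace declaration such as xmlns="http://example.com".
# Prefixed declarations (xmlns:foo="...") do not match, because "=" must
# follow "xmlns" directly.
DEFAULT_NS = re.compile(r'\sxmlns="[^"]*"')

def strip_default_namespace(raw):
    """Return raw XML text with default namespace declarations removed."""
    return DEFAULT_NS.sub('', raw)

raw = '<root xmlns="http://example.com"><key>value</key></root>'
cleaned = strip_default_namespace(raw)

# After stripping, the plain element name is enough to find the node.
root = ET.fromstring(cleaned)
print(cleaned)
print(root.find('key').text)
```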
If the XML files all share the same beginning, Splunk's default mechanism for detecting duplicates and log rotation may throw out all but the first file. Removing the first line may move unique parts of the file into the CRC window. See http://docs.splunk.com/Documentation/Splunk/latest/Data/Howlogfilerotationishandled for more info.
Great - I've converted my comment to an answer to let you mark it as solved.
Guys, crcSalt =
If the CRC is the cause of the issue, then yes. However, if you are only a bit short of the window length, you might just increase the window a bit (initCrcLength). What's best for you depends on your files.
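One way to judge how large the window needs to be is to check where your files first differ. The sketch below is only an illustration of the idea: zlib.crc32 stands in for whatever checksum Splunk actually computes, the 256-byte default is my understanding of the initCrcLength default, and the two file contents are made up.

```python
import zlib

def head_crc(data, length=256):
    """Checksum of the first `length` bytes (stand-in for the initial CRC)."""
    return zlib.crc32(data[:length])

def first_difference(a, b):
    """Offset of the first differing byte, or None if the contents are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))

# Made-up contents: an identical 300-byte beginning, then unique data.
shared = b'<E2ETraceEvent>' * 20
file_a = shared + b'event-A'
file_b = shared + b'event-B'

same_at_256 = head_crc(file_a, 256) == head_crc(file_b, 256)
same_at_1024 = head_crc(file_a, 1024) == head_crc(file_b, 1024)
offset = first_difference(file_a, file_b)

print(same_at_256, same_at_1024, offset)
```

If the first differing offset across your files is only slightly beyond the current window, raising initCrcLength past that offset should be enough. If the files are identical for a long stretch, a salt is probably the better fit.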
does if i put crcSalt =