Sample data:
<id>WGBSTH8180T</id>
<sytems>
<sys_Id>14502</sys_Id>
<name>GYS</name>
<version>9901</version>
<ip_address>172.11.11.212</ip_address>
<connector>
<connector_Id>TH818AST001A</connector_Id></connector></sytems>
I am able to get the value WGBSTH8180T
with a regex like | rex field=Info "(?ms)(?.*?)"
Can anyone help me out with how to extract the values sys_Id (systems), name (systems), version (systems), ip_address (systems), and connector_Id (connector) from the data above?
I'm using the regexes below, but they're not working. Please help me write the correct regex:
| rex field=Info "(?ms)<sytems><sys_Id>(?<sytems>.*?)</sys_Id></sytems>"
| rex field=Info "(?ms)<sytems><name>(?<Name>.*?)</name></sytems>"
| rex field=Info "(?ms)<sytems><version>(?<Version>.*?)</version></sytems>"
| rex field=Info "(?ms)<sytems><ip_address>(?<Ip_Address>.*?)</ip_address></sytems>"
| rex field=Info "(?ms)<sytems><connector><connector_Id>(?<Connector_Ids>.*?)</sys_Id></connector_Id></sytems>"
Thanks in advance
There are lots of threads about this topic, such as https://answers.splunk.com/answers/49521/splunk-why-must-xml-sources-be-so-complicated.html
The following regex matches in 66 or fewer steps.
| rex field=Info "\<id\>(?<id>[^\<]+)\<[^\<]+\<[^\s]+\>\s+\<sys_Id\>(?<sys_Id>[^\<]+)\<[^\<]+\<(?<name>[^\<]+)\<[^\<]+\<(?<version>[^\<]+)\<[^\<]+\<ip_address\>(?<ip_address>[^\<]+)\<[^\<]+\<[^\s]+\>\s+\<(?<connector_id>[^\<]+)"
All the answers given so far will work. However, poorly optimized regex statements cause poor search performance on large datasets. What works well for 10k events doesn't work well for 1 billion events.
There are three reasons the rex commands are not working:
1) The characters between "<sytems>" and the following tag are not accounted for.
2) Slashes must be escaped.
3) The last rex has closing tags in the wrong order.
These rex commands should work, although @jplumsdaine22's answer is better.
| rex field=Info "<sys_Id>(?<sytems>.*?)<\/sys_Id>"
| rex field=Info "<name>(?<Name>.*?)<\/name>"
| rex field=Info "<version>(?<Version>.*?)<\/version>"
| rex field=Info "<ip_address>(?<Ip_Address>.*?)<\/ip_address>"
| rex field=Info "<connector_Id>(?<Connector_Ids>.*?)<\/connector_Id>"
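To try these rex commands without indexed data, a throwaway event can be built with makeresults. This is only a sketch: it assumes the XML sits in a field named Info, and the sample data from the question is pasted onto one line with spaces standing in for the line breaks. The final table command is added purely to display the extracted fields.

```
| makeresults
| eval Info="<id>WGBSTH8180T</id> <sytems> <sys_Id>14502</sys_Id> <name>GYS</name> <version>9901</version> <ip_address>172.11.11.212</ip_address> <connector> <connector_Id>TH818AST001A</connector_Id></connector></sytems>"
| rex field=Info "<sys_Id>(?<sytems>.*?)<\/sys_Id>"
| rex field=Info "<name>(?<Name>.*?)<\/name>"
| rex field=Info "<version>(?<Version>.*?)<\/version>"
| rex field=Info "<ip_address>(?<Ip_Address>.*?)<\/ip_address>"
| rex field=Info "<connector_Id>(?<Connector_Ids>.*?)<\/connector_Id>"
| table sytems Name Version Ip_Address Connector_Ids
```

Because each pattern is anchored only on its own opening and closing tag, whatever whitespace or line breaks sit between sibling tags no longer matters.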
If the log files you are indexing are valid XML, just use the spath command. See the search reference:
http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/Spath
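As a rough sketch of the spath approach, the search below pulls each value out by its element path. It assumes the XML is held in a field named Info (as in the question) and that spath accepts the sample even though it has two top-level elements; the output field names are simply the ones used in the rex commands above, and the makeresults/eval lines exist only to make the example self-contained.

```
| makeresults
| eval Info="<id>WGBSTH8180T</id> <sytems> <sys_Id>14502</sys_Id> <name>GYS</name> <version>9901</version> <ip_address>172.11.11.212</ip_address> <connector> <connector_Id>TH818AST001A</connector_Id></connector></sytems>"
| spath input=Info path=id output=id
| spath input=Info path=sytems.sys_Id output=sys_Id
| spath input=Info path=sytems.name output=Name
| spath input=Info path=sytems.version output=Version
| spath input=Info path=sytems.ip_address output=Ip_Address
| spath input=Info path=sytems.connector.connector_Id output=Connector_Ids
| table id sys_Id Name Version Ip_Address Connector_Ids
```

Running spath input=Info with no path argument should instead auto-extract every element under its full dotted name (for example sytems.connector.connector_Id), which can be handy for discovering the paths in the first place.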