Hi there,
I need some help writing a rex (regex) command. My requirement is to first search for Code=SEND and run stats count by CountryCode, then search for Code=RECEIVE and run stats count by CountryCode again.
This is my XML log:
<Cust>
<Code>SEND</Code>
<CountryCode>CN</CountryCode>
<Lty>
<CtyNm>BEIJING</CtyNm>
<Zip>100176</Zip>
</Lty>
</Cust>
<Cust>
<Code>RECEIVE</Code>
<CountryCode>JP</CountryCode>
<Lty>
<CtyNm>TOKYO</CtyNm>
<Zip>1000001</Zip>
</Lty>
</Cust>
I have the query below, but it does not meet the requirement above: it only matches the first Code=SEND and counts that.
index=*
| rex "(?P<country>[\w+\s+]+)"
| stats count by country
I appreciate your help.
If you ingest the data so that each <Cust> element is a separate event in Splunk with valid XML syntax, the field extraction can be done by adding KV_MODE = xml to props.conf on the search head(s).
So the main focus should be getting the data ingested correctly. If your data doesn't have a timestamp, the current time is used as the _time value for the event.
Try this in props.conf on your indexer or heavy forwarder:
[SourceType_name]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\s*\<Cust\>)
DATETIME_CONFIG = CURRENT
props.conf on the search head:
[SourceType_name]
KV_MODE = xml
If you separate events on the basis of the <Cust> tag, it will be easier to evaluate them.
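To see what the LINE_BREAKER pattern above would do before touching props.conf, here is a minimal offline sketch in Python (not Splunk itself). It assumes the raw data looks like the sample in the question. The names SAMPLE and split_events are made up for illustration. Splunk discards the text matched by the first capture group of LINE_BREAKER; the sketch imitates that by splitting on the same newline run without the capture group.

```python
import re

# Sample copied from the question; real logs may have other text around it.
SAMPLE = """<Cust>
<Code>SEND</Code>
<CountryCode>CN</CountryCode>
<Lty>
<CtyNm>BEIJING</CtyNm>
<Zip>100176</Zip>
</Lty>
</Cust>
<Cust>
<Code>RECEIVE</Code>
<CountryCode>JP</CountryCode>
<Lty>
<CtyNm>TOKYO</CtyNm>
<Zip>1000001</Zip>
</Lty>
</Cust>"""

# Same idea as LINE_BREAKER = ([\r\n]+)(?=\s*\<Cust\>):
# break on a run of newlines only when the next thing is a <Cust> tag.
# The capture group is dropped because re.split would otherwise return it.
BREAKER = re.compile(r"[\r\n]+(?=\s*<Cust>)")


def split_events(raw):
    """Return one string per <Cust> block (hypothetical helper)."""
    return [part.strip() for part in BREAKER.split(raw) if part.strip()]


events = split_events(SAMPLE)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

Each printed event should start with <Cust> and hold exactly one Code and CountryCode pair, which is the shape KV_MODE = xml needs.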
I'm not familiar with props.conf. The ... is nested inside a long XML output, and there are many events with these tags. Could you explain further how editing props.conf would let me run the query I need?
Have you already indexed your XML log?
If not, you can separate events on the basis of the start or end tag.
In your case I assumed that the starting tag is <Cust>,
so I separated the events on that basis.
See the LINE_BREAKER setting in the docs:
http://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf?utm_source=answers&utm_medium=in-...
Thanks for explaining, I get your point now. The log is already indexed and I can't undo that since I'm not an admin. Furthermore, the log is not pure XML; it also contains other text and SOAP messages.
Try this:
... | rex field=_raw max_match=0 "<Code>(?<Code>[^<]+)</Code>\s*<CountryCode>(?<CountryCode>[^<]+)" | eval merged=mvzip(Code, CountryCode, ";;") | mvexpand merged | eval merged=split(merged, ";;") | eval Code=mvindex(merged, 0) | eval CountryCode=mvindex(merged, 1)
Try this run-anywhere search:
| makeresults | eval raw="<Cust> <Code>SEND</Code> <CountryCode>CN</CountryCode> <Lty> <CtyNm>BEIJING</CtyNm> <Zip>100176</Zip> </Lty> </Cust>
<Cust> <Code>RECEIVE</Code> <CountryCode>JP</CountryCode> <Lty> <CtyNm>TOKYO</CtyNm> <Zip>1000001</Zip> </Lty> </Cust>"
| rex field=raw max_match=0 "<Code>(?<Code>[^<]+)</Code>\s*<CountryCode>(?<CountryCode>[^<]+)" | eval merged=mvzip(Code, CountryCode, ";;") | mvexpand merged | eval merged=split(merged, ";;") | eval Code=mvindex(merged, 0) | eval CountryCode=mvindex(merged, 1)
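If you want to check the pairing logic outside Splunk, the following Python sketch mirrors what the rex with max_match=0 plus mvzip/mvexpand is meant to do: pull every Code and CountryCode pair out of one block of text, then count per pair, which is the stats step asked for in the question. It is only an illustration. RAW, PAIR and count_by_code_and_country are hypothetical names, and the pattern assumes CountryCode directly follows Code, separated only by whitespace or newlines, as in the sample log.

```python
import re
from collections import Counter

# Multi-line sample as posted in the question (assumed shape of the raw event).
RAW = """<Cust>
<Code>SEND</Code>
<CountryCode>CN</CountryCode>
<Lty>
<CtyNm>BEIJING</CtyNm>
<Zip>100176</Zip>
</Lty>
</Cust>
<Cust>
<Code>RECEIVE</Code>
<CountryCode>JP</CountryCode>
<Lty>
<CtyNm>TOKYO</CtyNm>
<Zip>1000001</Zip>
</Lty>
</Cust>"""

# Python spells named groups (?P<name>...). The \s* between the tags lets the
# pattern match whether the tags share a line or sit on separate lines.
PAIR = re.compile(
    r"<Code>(?P<Code>[^<]+)</Code>\s*<CountryCode>(?P<CountryCode>[^<]+)"
)


def count_by_code_and_country(raw):
    """Count (Code, CountryCode) pairs, like 'stats count by Code, CountryCode'."""
    return Counter(
        (m.group("Code"), m.group("CountryCode")) for m in PAIR.finditer(raw)
    )


counts = count_by_code_and_country(RAW)
for (code, country), n in sorted(counts.items()):
    print(code, country, n)
```

Keeping Code and CountryCode together as one pair is the point of the mvzip step in the search: it stops a SEND code from being counted against a country that belongs to a RECEIVE block.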
Thank you. I get some ideas with your answer.
Is this only one event? Why haven't you separated the events on the basis of the tag?