Getting Data In

Best way to extract mix of positional and kv from csv?

tdotrob
Engager

My CSV log file has three fields that are positional followed by a variable mix of K=V pairs like so:

2017/12/11 20:28:57.424953, MQXF_PUT, A, PID=20942, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, BufferLength=424, MsgData=
2017/12/11 20:28:57.443437, MQXF_GET, A, PID=20973, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF72D9221, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=ASH, MD_PutApplType=MQAT_QMGR, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=32710, DataLength=852, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, MsgData=
2017/12/11 20:28:57.493279, MQXF_PUT, A, PID=20987, TID=4, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=4, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.45, MD_MsgId=414D5120424952434820202020202020F1FE2E5A8C946923, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=424, MsgData=
2017/12/11 20:28:57.496147, MQXF_GET, A, PID=20944, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.45, MD_MsgId=414D5120424952434820202020202020F1FE2E5A8C946923, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=2048, DataLength=424, MsgData=

The log records a variety of API calls, each with a different number and mix of fields. For example, the first record has an MD_MsgId but no MD_CorrelId, while all the other records have both. The second record contains several of these fields more than once.

Eventually I want to correlate these records: if we take the unique MD_MsgId from the first record and find it in the MD_MsgId or MD_CorrelId of any other record, that group of 4 entries constitutes a transaction. Doing that seems to require parsing both the positional fields at the front and the KV pairs that follow.
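
To make the intended grouping concrete, here is a minimal Python sketch of that correlation, run outside Splunk. It assumes each record has already been parsed into a dict whose MD_MsgId and MD_CorrelId entries are lists (a field can repeat within one record). The function name group_transactions and the rule "a record with an MD_MsgId but no MD_CorrelId starts a transaction" are illustrative assumptions, not something the log format guarantees.

```python
def group_transactions(records):
    """Group parsed records into transactions keyed by a root MD_MsgId.

    Assumption: a record that has an MD_MsgId but no MD_CorrelId starts
    a transaction, and its first MD_MsgId is the root identifier.
    """
    transactions = {}

    # Pass 1: find the root identifiers.
    for rec in records:
        if rec.get("MD_MsgId") and not rec.get("MD_CorrelId"):
            transactions[rec["MD_MsgId"][0]] = []

    # Pass 2: attach every record that mentions a root identifier
    # in either its MD_MsgId or its MD_CorrelId values.
    for rec in records:
        ids = set(rec.get("MD_MsgId", [])) | set(rec.get("MD_CorrelId", []))
        for root, members in transactions.items():
            if root in ids:
                members.append(rec)

    return transactions
```

With the four sample records above, this would put all four into one group keyed by the first record's MD_MsgId, since that value appears in the MD_MsgId or MD_CorrelId of each of the others.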

I tried creating a Source Type with 3 custom names but all the other fields show up as EXTRA_FIELD_1, EXTRA_FIELD_2, EXTRA_FIELD_3, etc. If I let it parse the KV pairs, the positional fields show up like MQXF_PUT="MQXF_PUT". I think I need to somehow do both at once.

Any suggestions greatly appreciated.


tdotrob
Engager

Still tweaking this but I ended up making a new source type that seems to be working. No idea if this is the best approach, or even a good approach (my Splunk experience is all of 4 days at this point) but it does the job.

In transforms.conf:

[cware-mqa]
# Extracts for Capitalware MQ Auditor 
# Extracts: cwmqa_method, cwmqa_ordinal
# 2017/12/11 22:38:57.021568, MQXF_GET, A, PID=30789, TID=1, CC=0, RC=0, variable KV pairs follow
REGEX = ^[^,]+, [[nspaces:cwmqa_method]],\s++[[nspaces:cwmqa_ordinal]],\s++[[all:other]]
MV_ADD = true

In props.conf:

[cware_mq_auditor]
pulldown_type = true
REPORT-access = cware-mqa
SHOULD_LINEMERGE = False
TIME_PREFIX = ^
category = Web
description = Parse positional fields from Capitalware MQ Auditor logs
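
For anyone who wants to sanity-check the extraction logic outside Splunk, the following is a rough Python sketch of the same idea: peel off the three positional fields, then collect the remaining K=V pairs, keeping repeated keys as lists (similar in spirit to MV_ADD = true). The names parse_line, LINE_RE and KV_RE are made up for illustration, and the sketch assumes values never contain commas, which holds for the sample lines but may not hold once MsgData is populated.

```python
import re
from collections import defaultdict

# Timestamp, method and ordinal are positional; the rest of the line is K=V pairs.
LINE_RE = re.compile(r"^([^,]+),\s*([^,\s]+),\s*([^,\s]+),\s*(.*)$")
# KEY=VALUE where the value runs up to the next comma and may be empty (MsgData=).
KV_RE = re.compile(r"(\w+)=([^,]*)")


def parse_line(line):
    """Return the positional fields plus a dict of multi-valued K=V fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # not an MQ Auditor style line

    timestamp, method, ordinal, rest = match.groups()

    fields = defaultdict(list)
    for key, value in KV_RE.findall(rest):
        fields[key].append(value.strip())  # repeated keys accumulate

    return {
        "timestamp": timestamp,
        "cwmqa_method": method,
        "cwmqa_ordinal": ordinal,
        "fields": dict(fields),
    }
```

Calling parse_line on one of the sample lines yields cwmqa_method and cwmqa_ordinal as plain strings and every K=V field as a list, so a record with two MD_MsgId entries keeps both values.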