Best way to extract mix of positional and kv from csv?

tdotrob
Engager

My CSV log file has three positional fields followed by a variable mix of K=V pairs, like so:

2017/12/11 20:28:57.424953, MQXF_PUT, A, PID=20942, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, BufferLength=424, MsgData=
2017/12/11 20:28:57.443437, MQXF_GET, A, PID=20973, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF72D9221, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=ASH, MD_PutApplType=MQAT_QMGR, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=32710, DataLength=852, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.42, MD_MsgId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, MsgData=
2017/12/11 20:28:57.493279, MQXF_PUT, A, PID=20987, TID=4, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=4, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.45, MD_MsgId=414D5120424952434820202020202020F1FE2E5A8C946923, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=424, MsgData=
2017/12/11 20:28:57.496147, MQXF_GET, A, PID=20944, TID=1, CC=0, RC=0, UserId=mqm, HConn=20971526, HObj=2, MD_PutDate=2017/12/12, MD_PutTime=01:28:57.45, MD_MsgId=414D5120424952434820202020202020F1FE2E5A8C946923, MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, MD_Expiry=-1, MD_Persistence=MQPER_PERSISTENT, MD_PutApplName=q, MD_PutApplType=MQAT_UNIX, MD_ReplyToQ=CHANNEL.REPLY, MD_ReplyToQMgr=ASH, BufferLength=2048, DataLength=424, MsgData=

The log records a variety of API calls, each with a different number and mix of fields. For example, the first record has an MD_MsgId but no MD_CorrelId. All the other records have both. The second record contains MD_MsgId more than once.

Eventually I want to correlate all of these so that if we take the unique MD_MsgId in the first record and find it in the MD_MsgId or MD_CorrelId of any other record, that group of 4 entries constitutes a transaction. Doing that seems to require parsing both the positional fields at the front and the KV pairs that follow.
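
To make the grouping rule concrete, here is a minimal Python sketch of the logic outside Splunk. It assumes the records are already parsed into dicts whose values are lists (since a key such as MD_MsgId can repeat within one line). The function name find_transaction, the "method" key, and the shortened IDs are hypothetical and only for illustration.

```python
def find_transaction(records):
    """Return the records linked to the first record's MD_MsgId.

    Each record maps a field name to a list of values, because a key
    such as MD_MsgId can appear more than once in a single log line.
    """
    root_ids = set(records[0].get("MD_MsgId", []))
    linked = []
    for rec in records:
        ids = set(rec.get("MD_MsgId", [])) | set(rec.get("MD_CorrelId", []))
        if ids & root_ids:  # shares an ID with the first record
            linked.append(rec)
    return linked


# Shortened, made-up IDs standing in for the 48-character hex values.
records = [
    {"method": ["MQXF_PUT"], "MD_MsgId": ["AAA1"]},
    {"method": ["MQXF_GET"], "MD_MsgId": ["AAA2", "AAA1"], "MD_CorrelId": ["AAA1"]},
    {"method": ["MQXF_PUT"], "MD_MsgId": ["BBB1"], "MD_CorrelId": ["AAA1"]},
    {"method": ["MQXF_GET"], "MD_MsgId": ["BBB1"], "MD_CorrelId": ["AAA1"]},
    {"method": ["MQXF_PUT"], "MD_MsgId": ["CCC1"]},  # unrelated call
]

transaction = find_transaction(records)
print(len(transaction))
```

With this made-up data the first four records end up in the transaction and the unrelated fifth one is left out.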

I tried creating a Source Type with 3 custom field names, but all the other fields show up as EXTRA_FIELD_1, EXTRA_FIELD_2, EXTRA_FIELD_3, etc. If I let it parse the KV pairs instead, the positional fields show up like MQXF_PUT="MQXF_PUT". I think I need to somehow do both at once.
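
As a rough illustration of what "both at once" means, the following Python sketch splits one line into the three positional fields and then collects the K=V pairs, keeping every value when a key repeats. It is not Splunk configuration. The names parse_line, timestamp, method and third are hypothetical, the sample line is a shortened version of the second record above, and the sketch assumes values never contain commas.

```python
from collections import defaultdict


def parse_line(line):
    """Split one log line into three positional fields plus K=V pairs."""
    parts = [p.strip() for p in line.split(",")]
    timestamp, method, third = parts[:3]
    fields = defaultdict(list)
    for part in parts[3:]:
        key, sep, value = part.partition("=")
        if sep:  # ignore anything that is not a K=V pair
            fields[key].append(value)  # repeated keys keep all values
    return timestamp, method, third, dict(fields)


# Shortened version of the second sample record.
line = (
    "2017/12/11 20:28:57.443437, MQXF_GET, A, PID=20973, TID=1, "
    "MD_MsgId=414D5120415348202020202020202020D9302F5AF72D9221, "
    "MD_CorrelId=414D5120415348202020202020202020D9302F5AF62D9221, "
    "MD_MsgId=414D5120415348202020202020202020D9302F5AF62D9221, MsgData="
)

timestamp, method, third, fields = parse_line(line)
print(method, third, len(fields["MD_MsgId"]))
```

Storing each value in a list is what lets the repeated MD_MsgId survive instead of the second occurrence overwriting the first.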

Any suggestions greatly appreciated.

tdotrob
Engager

Still tweaking this but I ended up making a new source type that seems to be working. No idea if this is the best approach, or even a good approach (my Splunk experience is all of 4 days at this point) but it does the job.

In transforms.conf:

[cware-mqa]
# Extracts for Capitalware MQ Auditor
# Extracts: cwmqa_method, cwmqa_ordinal, other (the remainder of the line)
# 2017/12/11 22:38:57.021568, MQXF_GET, A, PID=30789, TID=1, CC=0, RC=0, variable KV pairs follow
REGEX = ^[^,]+, [[nspaces:cwmqa_method]],\s++[[nspaces:cwmqa_ordinal]],\s++[[all:other]]
MV_ADD = true
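
For anyone who wants to check what that REGEX captures without loading it into Splunk, here is a rough Python equivalent. It rests on two assumptions that are worth verifying against your own default transforms.conf: that [[nspaces:x]] expands to a named group matching a run of non-whitespace characters, and that [[all:x]] expands to a named group matching the rest of the line. The possessive \s++ is written as plain \s+ so the sketch also runs on Python versions older than 3.11. The sample line is the one from the comment in the stanza.

```python
import re

# Assumed expansions: [[nspaces:x]] -> (?P<x>[^\s]+), [[all:x]] -> (?P<x>.*)
PATTERN = re.compile(
    r"^[^,]+, (?P<cwmqa_method>[^\s]+),\s+(?P<cwmqa_ordinal>[^\s]+),\s+(?P<other>.*)"
)

line = "2017/12/11 22:38:57.021568, MQXF_GET, A, PID=30789, TID=1, CC=0, RC=0"
match = PATTERN.match(line)
captured = match.groupdict()
print(captured)
# {'cwmqa_method': 'MQXF_GET', 'cwmqa_ordinal': 'A', 'other': 'PID=30789, TID=1, CC=0, RC=0'}
```

The timestamp is skipped by the leading ^[^,]+ and is not captured as a field here.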

In props.conf:

[cware_mq_auditor]
pulldown_type = true
REPORT-access = cware-mqa
SHOULD_LINEMERGE = False
TIME_PREFIX = ^
category = Web
description = Parse positional fields from Capitalware MQ Auditor logs