I am trying to bring MS Lync conversations into Splunk. We can get the To: and From: data, but the conversation data is encoded. I tried setting CHARSET = utf-8 in props.conf, but that doesn't work. Any thoughts? Is this possible?
Using a PowerShell script, we can get the following:
Date: Tue, 16 Jul 2013 15:01:02 GMT
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="MIME_Boundary"
From: user1@thisplace.com
To: user2@thisplace.com
Session-Id: 253f92dae4f9460bb66fb37a7cec8c13;9ca2fa0cac;3c2d56428e
Subject: Conversation between mcross@llbean.com and jandrews@llbean.com
--MIME_Boundary
Content-Transfer-Encoding: base64
Content-Type: application/msword;
charset="utf-8"
e1xydGYxXGZiaWRpc1xhbnNpXGFuc2ljcGcxMjUyXGRlZmYwXGRlZmxhbmcxMDMzXGRlZnRhYjM2
MHtcZm9udHRibHtcZjBcZm5pbFxmY2hhcnNldDAgTWljcm9zb2Z0IFNhbnMgU2VyaWY7fXtcZjFc
ZnN3aXNzXGZjaGFyc2V0MCBTZWdvZSBVSTt9e1xmMlxmc3dpc3MgQXJpYWw7fX0NCntcY29sb3J0
YmwgO1xyZWQwXGdyZWVuMFxibHVlMDt9DQpcdmlld2tpbmQ0XHVjMVxwYXJkXGx0cnBhclxmMFxm
czE3IFxiIFRyYW5zY3JpcHQgZm9yIGluc3RhbnQgbWVzc2FnaW5nIChJTSkgc2Vzc2lvbjogXGIw
XGIgQ29udmVyc2F0aW9uIGJldHdlZW4gbWNyb3NzQGxsYmVhbi5jb20gYW5kIGphbmRyZXdzQGxs
YmVhbi5jb21cYjBccGFyDQpccGFyDQpcYiBtY3Jvc3NAbGxiZWFuLmNvbSBbMjAxMy0wNy0xNiAx
NTowMTowMlpdOlxiMFxwYXINCiAgXGNmMVxmMVxmczIwIEhpIEpvc2ggYW5kIHdlbGNvbWUgYmFj
ayAtXH5wbGVhc2UgYXR0ZW5kIG10ZyBpbiBQaW5lY29uZSBhdCAxMWFtIC0gVk13YXJlIEhBIC0g
cGxlYXNlIGJyaW5nIExldmkgd2l0aCB5b3UgLSB5b3UgYm90aCBnb3QgaW52aXRlIGJ1dCBkaWQg
bm90IHJlc3BvbmRcY2YwXGYyXGZzMjRccGFyDQpcZjBcZnMxNyBcYiBqYW5kcmV3c0BsbGJlYW4u
Y29tIFsyMDEzLTA3LTE2IDE1OjAxOjEyWl06XGIwXHBhcg0KICBcY2YxXGYxXGZzMjAgSSB3aWxs
IGJlIGxhdGVcY2YwXGYyXGZzMjRccGFyDQpcZjBcZnMxNyBcYiBtY3Jvc3NAbGxiZWFuLmNvbSBb
MjAxMy0wNy0xNiAxNTowMToyNVpdOlxiMFxwYXINCiAgXGNmMVxmMVxmczIwIGhvdyBsYXRlIC0g
SSB3aWxsIGxldCBEaWNrIGtub3dcY2YwXGYyXGZzMjRccGFyDQpcZjBcZnMxNyBcYiBqYW5kcmV3
c0BsbGJlYW4uY29tIFsyMDEzLTA3LTE2IDE1OjAxOjM2Wl06XGIwXHBhcg0KICBcY2YxXGYxXGZz
MjAgNS0xMCBtaW5cY2YwXGYyXGZzMjRccGFyDQpcZjBcZnMxNyBcYiBqYW5kcmV3c0BsbGJlYW4u
Y29tIFsyMDEzLTA3LTE2IDE1OjAxOjQwWl06XGIwXHBhcg0KICBcY2YxXGYxXGZzMjAgd29ya2lu
ZyBvbiBhbiBpc3N1ZVxjZjBcZjJcZnMyNFxwYXINClxmMFxmczE3IFxiIG1jcm9zc0BsbGJlYW4u
Y29tIFsyMDEzLTA3LTE2IDE1OjAxOjQyWl06XGIwXHBhcg0KICBcY2YxXGYxXGZzMjAgb2sgLSB0
aGFua3NcY2YwXGYyXGZzMjRccGFyDQp9DQo=
--MIME_Boundary--
This is what's in the encoded text. I was able to view it using MS Word:
Transcript for instant messaging (IM) session: Conversation between user1@thisplace.com and user2@thisplace.com
user1@thisplace.com [2013-07-16 15:03:21Z]:
hi - moved to Snapdragon
user2@thisplace.com [2013-07-16 15:03:32Z]:
ok i will be there in a minute
It's not really the UTF-8 character set that's "encoding" the message. The lines at the end of your log sample are base64 encoded (note the Content-Transfer-Encoding: base64 header), and CHARSET in props.conf only controls character-set conversion, so it won't undo that. There is an app written by @dwaddle here on Splunkbase that does base64 decoding. Check it out: http://splunk-base.splunk.com/apps/35644/base64-custom-command
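
If you would rather decode the transcript before it reaches Splunk, here is a minimal Python sketch of the same idea, using only the standard library. It is not how the app above works internally. The function name decode_base64_parts and the sample message are made up for illustration, and the sketch assumes the MIME message is well formed, with a blank line between the headers and the body of each part.

```python
import base64
from email import message_from_string


def decode_base64_parts(raw_message):
    """Return the decoded text of every non-multipart MIME part.

    Hypothetical helper for illustration; not part of Splunk or Lync.
    """
    msg = message_from_string(raw_message)
    decoded = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        # decode=True undoes the Content-Transfer-Encoding (base64 here)
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Fall back to UTF-8 when the part declares no charset
        charset = part.get_content_charset() or "utf-8"
        decoded.append(payload.decode(charset, errors="replace"))
    return decoded


# Made-up sample shaped like the message in the question
RTF_TEXT = r"{\rtf1\ansi hi - moved to Snapdragon\par}"
SAMPLE = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="MIME_Boundary"',
    "From: user1@thisplace.com",
    "To: user2@thisplace.com",
    "",
    "--MIME_Boundary",
    "Content-Transfer-Encoding: base64",
    'Content-Type: application/msword; charset="utf-8"',
    "",
    base64.b64encode(RTF_TEXT.encode("utf-8")).decode("ascii"),
    "--MIME_Boundary--",
    "",
])

if __name__ == "__main__":
    for text in decode_base64_parts(SAMPLE):
        print(text)
```

Note that the decoded payload in your sample is RTF markup (it starts with {\rtf1), so after base64 decoding you will still see control words such as \par and \fs20 around the chat text. Getting clean text would need a separate RTF-to-text step.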