Solved: Removing all white spaces from event at Index time

Tim_1 · ‎09-21-2017

Hi all,

I want to remove whitespace from only the account value, not from the whole event, at index time. Is this possible?

Given the events look like this:

{"account": "Account", "justification": "TEST 1", "value": "50"}

{"account": "dev 1", "justification": "TEST 2", "value": "50"}

{"account": "uat test acc", "justification": "TEST 3", "value": "50"}

{"account": "a .. x .. y .. z .. etc", "justification": "TEST 4", "value": "50"}

I want it to look like this:

{"account": "Account", "justification": "TEST 1", "value": "50"}

{"account": "dev1", "justification": "TEST 2", "value": "50"}

{"account": "uattestacc", "justification": "TEST 3", "value": "50"}

{"account": "axyzetc", "justification": "TEST 4", "value": "50"}

cpetterborg · ‎09-21-2017

The following assumes that your data really looks like 1 .. n in the data stream, rather than something like 1 2 3 4 5 6 7 8 9 0. If you only have values like the latter, a simpler regex would do, but this one will work either way.

You could probably do something like this in props.conf:

SEDCMD-pass1 = s/Account ([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?/\1\3\5\7\9/

This will remove up to 4 spaces. If you need to do more, then add a second pass, or third pass:

SEDCMD-pass2 = s/Account ([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?/\1\3\5\7\9/

I haven't completely tested this, but I believe it to be fairly correct. If your event data differs much from this example, then it could make things more difficult.
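
As a quick offline check of how the nested groups in that expression are numbered, here is a minimal Python sketch. It uses Python's re module as a stand-in for Splunk's regex engine (an assumption; the constructs used here, \s, negated character classes and optional groups, behave the same way in both), and the sample event with an "Account 1 2 3" value is made up for illustration.

```python
import re

# Search pattern from the SEDCMD above, unchanged.
PATTERN = r'Account ([^"\s]+)(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?(\s([^"\s]+))?'

# Hypothetical event; the "Account 1 2 3" value is an assumption for this sketch.
SAMPLE = '{"account": "Account 1 2 3", "justification": "TEST 1", "value": "50"}'

match = re.search(PATTERN, SAMPLE)

# Groups are numbered by the position of their opening parenthesis, so the
# even-numbered (outer) groups include the leading whitespace and the
# odd-numbered (inner) groups hold the bare tokens.
groups = [match.group(n) for n in range(1, 10)]


def substitute(replacement):
    """Apply one sed-style substitution (first match only, like s/// without g)."""
    # Python 3.5+ replaces a backreference to an unmatched group with "".
    return re.sub(PATTERN, replacement, SAMPLE, count=1)


with_outer_groups = substitute(r"\1\2\4\6\8")  # whitespace is kept
with_inner_groups = substitute(r"\1\3\5\7\9")  # whitespace is dropped

print(groups)
print(with_outer_groups)
print(with_inner_groups)
```

In this sketch the literal Account and the space after it are part of the match but are not captured, so neither replacement keeps them.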

lfedak_splunk · ‎09-21-2017

Hey @Tim_1 if they solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!

Tim_1 · ‎09-22-2017

Hi @Ifedak, will do so once I've found a solution. Thanks 🙂

DalJeanis · ‎09-21-2017

I assume that you mean you want to eliminate all spaces, or all white space, from the account field at index time?

Try something like this in transforms.conf:

[stanzaname]
SOURCE_KEY = account
REGEX = ^([^\s]+)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*([^\s]*)(\s+)*(.*)$
DEST_KEY = account
FORMAT = $1$3$5$7$9$11

You can repeat the phrase ([^\s]*)(\s+)* once for each additional space you want to eliminate, adding the next odd-numbered group to FORMAT each time. I'm not sure what the highest possible number of groups is.
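
The repetition described above can be generated rather than typed by hand. The following Python sketch is only an illustration: build_transform and apply_transform are hypothetical helper names, and Python's re module stands in for Splunk's regex engine, so it checks the regex logic and not Splunk's index-time behavior.

```python
import re


def build_transform(extra_tokens):
    """Build the REGEX and FORMAT strings for a given number of repeated phrases.

    extra_tokens=4 reproduces the REGEX and FORMAT shown in the stanza above.
    """
    regex = r"^([^\s]+)" + r"(\s+)*([^\s]*)" * extra_tokens + r"(\s+)*(.*)$"
    last_group = 2 * extra_tokens + 3
    # Only the odd-numbered groups hold non-whitespace text.
    fmt = "".join(f"${n}" for n in range(1, last_group + 1, 2))
    return regex, fmt


def apply_transform(regex, fmt, value):
    """Emulate REGEX + FORMAT on one field value (a stand-in, not Splunk itself)."""
    m = re.match(regex, value)
    if m is None:
        return value  # no match: leave the value unchanged
    group_numbers = [int(n) for n in re.findall(r"\$(\d+)", fmt)]
    # Groups that did not take part in the match are treated as empty strings.
    return "".join(m.group(n) or "" for n in group_numbers)


regex, fmt = build_transform(4)
print(regex)
print(fmt)
for account in ["Account", "dev 1", "uat test acc", "a b c d e f g"]:
    print(repr(account), "->", repr(apply_transform(regex, fmt, account)))
```

With four repetitions the last sample comes out as "abcdef g": the final (.*) group captures whatever is left over, spaces included, which is why the pattern has to grow with the number of spaces expected.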

Tim_1 · ‎09-22-2017

Hi @DalJeanis,

Thanks for the answer. Is there a way to do it without having to change it depending on the number of spaces? I would prefer not to have to create a separate stanza for each possible number of spaces.

Also, my question wasn't 100% clear about the data I want to reformat. I've updated the question to be more in line with what the data should be.

Tim_1 · ‎09-22-2017

Hi @cpetterborg,

Thanks for the answer. My question wasn't 100% clear with the examples, so I've updated the question to be more in line with what the data should be.

The data won't be integers, but strings.

cpetterborg · ‎09-22-2017

This should still work with strings of multiple characters.

Tim_1 · ‎09-22-2017

Yes, got it half working so far.
Thanks for the help. 🙂
Will accept when fully complete.

sbbadri · ‎09-21-2017

Try this:

| makeresults
| eval test="{\"account\": \"Account 1 2\", \"justification\": \"TEST 1\", \"value\": \"50\"}"
| rex field=test "(?P<t1>{\"account\":\s+)(?P<t2>\"Account\s+\S+.*\")(?P<t3>\,\s+\"justification\":\s+\"TEST\s+\d+\"\,\s+\"value\":\s+\"\d+\"})"
| rex field=t2 mode=sed "s/ //g"
| eval t4=t1+t2
| eval t5=t4+t3
| rename t5 as test

Tim_1 · ‎09-22-2017

Hi @sbbadri,

Thanks for the answer, but I am looking at doing this at index time and not at search time.
