Splunk Search

How to build a lookup table without delimiters?

ddrillic
Ultra Champion

Due to the nature of the data, we can't use any delimiters.

The data layout is as follows by character position.

Name = 1-8
Department = 9-12
Location = 13-24
New Department = 25-28
Status = 29-30

Is there a way to specify the lookup definition based on these character positions?


DalJeanis
SplunkTrust

The field names are not an issue. Knowing that the data is abstracted and/or encrypted is enough.

Splunk CAN bring in and process binary files...

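Assuming the data is all in one 30-byte field, then this would extract the binary-valued fields (capture group names can't contain spaces, hence New_Department; (?s) lets "." match newline bytes too)...

 | rex "(?s)^(?<Name>.{8})(?<Department>.{4})(?<Location>.{12})(?<New_Department>.{4})(?<Status>.{2})$"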

...but I'm just not sure what other gotchas there might be involved with just slapping that data into a lookup and trying to use it as is.

I am TEMPTED to convert each of those fields except Status into one to three 4-byte numbers. I know that would work without issue, but I don't know whether I'd be introducing unneeded complexity that the vanilla system would handle straight out of the box.
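As a rough illustration of the 4-byte-number idea, here is a minimal Python sketch that would run outside Splunk as a preprocessing step. It assumes big-endian unsigned 32-bit integers and a hypothetical helper name `record_to_numbers`; neither is specified anywhere in this thread, so treat both as placeholders.

```python
import struct

# Assumed encoding: big-endian unsigned 32-bit ints, no padding.
# Name = 2 ints, Department = 1, Location = 3, New_Department = 1,
# Status stays as its 2 raw bytes. Total: 7 * 4 + 2 = 30 bytes.
RECORD = struct.Struct(">2I I 3I I 2s")


def record_to_numbers(record: bytes) -> dict:
    """Split one 30-byte record into tuples of 4-byte numbers per field."""
    n1, n2, dept, l1, l2, l3, new_dept, status = RECORD.unpack(record)
    return {
        "Name": (n1, n2),
        "Department": (dept,),
        "Location": (l1, l2, l3),
        "New_Department": (new_dept,),
        "Status": status,  # left as raw bytes, per the idea above
    }
```

Each numeric field would then become one to three lookup columns, which is where the extra complexity mentioned above comes from.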

I SUSPECT, based on other questions and answers about binary data, that Splunk just isn't architected to handle it very well.

The best option that I can suggest is to convert the binary data into display-hex, thus taking up twice as much space, but consisting only of [0-9A-F]. Then it can be treated as character data.
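The display-hex conversion could be sketched like this in Python, run outside Splunk before the file is uploaded as a CSV lookup. It assumes the source file is nothing but 30-byte records back to back with no record separator; the function names and the `New_Department` header are hypothetical.

```python
import csv
import io
from typing import List

# Field layout from the question, as (name, width in bytes); 30 bytes total.
LAYOUT = [("Name", 8), ("Department", 4), ("Location", 12),
          ("New_Department", 4), ("Status", 2)]
RECORD_LEN = sum(width for _, width in LAYOUT)


def record_to_hex_row(record: bytes) -> List[str]:
    """Slice one fixed-length record and hex-encode each field."""
    if len(record) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} bytes, got {len(record)}")
    row, offset = [], 0
    for _, width in LAYOUT:
        row.append(record[offset:offset + width].hex().upper())
        offset += width
    return row


def records_to_csv(data: bytes) -> str:
    """Turn concatenated fixed-length records into CSV text with a header."""
    if len(data) % RECORD_LEN:
        raise ValueError("data length is not a multiple of the record length")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _ in LAYOUT])
    for start in range(0, len(data), RECORD_LEN):
        writer.writerow(record_to_hex_row(data[start:start + RECORD_LEN]))
    return out.getvalue()
```

Because every value ends up as [0-9A-F] only, commas, quotes, newlines and x'00' bytes in the raw data can no longer collide with the CSV format. The catch is that whatever field is matched against the lookup at search time would have to be in the same hex form.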


ddrillic
Ultra Champion

Much appreciated @DalJeanis

DalJeanis
SplunkTrust

Yes, but no.

First, there is no reason your delimiter can't be something that cannot occur in the data, such as "!!!!".

Second, unless the data is encrypted, those fields don't look like data types that would necessarily include ALL OF the special characters... semicolons, exclamation points, commas, @ # $ ^ & and so on.

So, what's up here?


ddrillic
Ultra Champion

Great - thank you @DalJeanis

Instead of the field names mentioned before, please consider the following:

Field1 = positions 01-08
Field2 = positions 09-12
Field3 = positions 13-24
Field4 = positions 25-28
Field5 = positions 29-30

These fields may contain any combination of characters (displayable and non-displayable), including special characters, so no combination of characters could reliably be used as a delimiter.
So the question is: can a lookup table be built from a structured file whose records are fixed length, as defined above, and if so, how?


somesoni2
SplunkTrust

Are you indexing this data? Do you want to use the data as-is as a lookup table file?


ddrillic
Ultra Champion

We would like to use the data as-is...


DalJeanis
SplunkTrust

Will the system have to deal with any binary zeroes (x'00') in the data?
