Splunk Search

How can I extract fields from a space delimited event with potential spaces in the field values?

jamesvz84
Communicator

How would I go about extracting fields from the event below? The challenge is that it appears to be space-delimited, but the values themselves can contain spaces. For example, the header datetime contains a space, and the user agent contains spaces (though the latter is wrapped in quotes).

What would be the best approach for extracting fields from this data?

Aug 27 17:48:19 10.252.22.22 Aug 27 10:46:48 10.251.106.44 2015-08-27 17:35:43 19 10.234.37.191 - - - OBSERVED "News/Media" http://bits.blogs.nytimes.com/2015/08/26/facebook-tests-a-digital-assistant-for-its-messaging-app/?_...  200 TCP_HIT GET image/jpeg http graphics8.nytimes.com 80 /images/2015/08/28/business/28eugoogle-web/28eugoogle-web-mediumThreeByTwo210.jpg - jpg "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36" 10.251.106.44 8762 4053 - "none" "none"

lguinn2
Legend

A field definition is ultimately a regular expression. You can certainly write a regular expression that would include spaces - or anything else! Of course, for a complicated event, the regular expressions may be complex as well.
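To illustrate the idea outside of Splunk, here is a minimal Python sketch of a regular expression that treats a double-quoted value as a single field and everything else as space-delimited. The `tokenize` helper and the shortened sample line are assumptions made for illustration; they are not part of the original event format or of any Splunk configuration.

```python
import re

# A token is either a double-quoted string (which may contain spaces)
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def tokenize(event):
    """Split an event on whitespace, keeping double-quoted values intact."""
    return [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in TOKEN.finditer(event)
    ]

# Shortened, hypothetical sample modelled on the event in the question.
sample = '200 TCP_HIT GET image/jpeg "Mozilla/5.0 (Windows NT 6.3; WOW64)" 10.251.106.44 - "none"'
tokens = tokenize(sample)
print(tokens)
```

The quoted user agent comes back as one token even though it contains spaces. Unquoted multi-word values, such as the leading timestamps, still need their own explicit pattern.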

You might be able to avoid writing your own regular expression if your data is one of the pretrained sourcetypes, or if there is an app for the data.

The timestamp is a special case. Splunk's default timestamp extraction is not confused by spaces, although it might have trouble with the fact that there are three timestamps in the event! Which one is the event time? Again, you can use regular expressions to help Splunk identify the proper timestamp; the Splunk documentation on timestamp recognition covers this.
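As a rough sketch of how a regular expression can single out one of several timestamps, the Python below picks the `YYYY-MM-DD HH:MM:SS` value and ignores the two syslog-style ones before it. Treating that third timestamp as the event time is purely an assumption, and `event_time` is a hypothetical helper. In Splunk itself this kind of anchoring is normally configured in props.conf with settings such as `TIME_PREFIX` and `TIME_FORMAT` rather than in code.

```python
import re
from datetime import datetime

# Matches only the ISO-style timestamp, not the "Aug 27 17:48:19" style ones.
ISO_TS = re.compile(r'\b(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\b')

def event_time(event):
    """Return the first ISO-style timestamp in the event, or None if absent."""
    m = ISO_TS.search(event)
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None

# Leading portion of the event from the question.
sample = ("Aug 27 17:48:19 10.252.22.22 Aug 27 10:46:48 10.251.106.44 "
          "2015-08-27 17:35:43 19 10.234.37.191 - - - OBSERVED")
ts = event_time(sample)
print(ts)
```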

I frankly think that "grouping fields" on the fly is an inconvenient way to do things. Remember that field extractions are dynamic: you can change them at any time. So even if you have already indexed the data, you can change the field definitions. [The exception is "index time" field extractions, which you should avoid as much as possible.]

If you need help writing the regular expressions, tell us exactly how you want the fields broken out in this event...

bschaithnyakuma
New Member

11/06/2018 01:31:21.784 (# 178) (58w8239-11212-2001-0078-00999393003903) Director (Director, 63) 1

I need to get (5***) as a field in the above log
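One possible reading of this request is that the wanted field is the parenthesized, hyphen-separated value that starts with 5. Under that assumption, a pattern along these lines would capture it. The Python below is only a sketch for checking the regular expression; `extract_id` and `record_id` are hypothetical names.

```python
import re

# Assumption: the wanted value is the parenthesized ID made of
# hyphen-separated word characters. "(# 178)" and "(Director, 63)"
# do not match because they contain no hyphen-separated run.
ID_IN_PARENS = re.compile(r'\((\w+(?:-\w+)+)\)')

def extract_id(event):
    """Return the hyphenated ID found inside parentheses, or None."""
    m = ID_IN_PARENS.search(event)
    return m.group(1) if m else None

line = ("11/06/2018 01:31:21.784 (# 178) "
        "(58w8239-11212-2001-0078-00999393003903) Director (Director, 63) 1")
record_id = extract_id(line)
print(record_id)
```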


changux
Builder

What do you think about using the space as the field separator and, once all the fields have been discovered, grouping some of them with eventtypes, for example? You can also use eval functions.

Use eventtypes

Group using eval
