Splunk Search

Single field not always extracted, but appears when piping into "extract"

spock_yh
Path Finder

I have set up a search-time field extraction that pulls a number of fields out of a URL in a log file.

My problem is that one of these fields is extracted for some events and not for others, for no apparent reason. Here are two such examples. The first manages to extract the field, the second doesn't:

1.1.9.1 - [20/Mar/2011:17:39:37 -0700] 15625 "some.web.site" GET "/myaccount/videos/B004CZXC54.flv" "" 307 - "medusa" "-" "Python-urllib/2.6" "2.2.2.2"

1.1.9.1 - [20/Mar/2011:18:10:45 -0700] 0 "some.web.site" GET "/myaccount/videos/B003QMJAXM.flv" "" 307 - "medusa" "-" "Python-urllib/2.6" "2.2.2.2"

The field I'm trying to extract is the one corresponding to the "myaccount" part. As you can see, the two events are extremely similar, yet the field shows up for only one of them.

The odd thing about this is that:

* If I pipe my search into | extract reload=T, I can see the missing field for all results.
* There are a number of fields after this missing field (for the "videos" part, "B003QM.." part, "flv" part, etc.) that are extracted fine.

The original regular expression was quite complex but I stripped it down to something simple that still shows the problem:

 /(?<medusa_account_alias>[^/]+)/(?<medusa_restype>videos|images)
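As a sanity check outside Splunk, the stripped-down pattern can be run against both sample events. This is only a sketch in Python, not Splunk's own extraction engine: Python writes named groups as (?P<name>...) instead of (?<name>...), and the helper name extract_fields is made up for illustration. If both events yield the field here, the regex itself is unlikely to be the cause.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"/(?P<medusa_account_alias>[^/]+)/(?P<medusa_restype>videos|images)"
)

# The two sample events from the post, copied verbatim.
EVENTS = [
    '1.1.9.1 - [20/Mar/2011:17:39:37 -0700] 15625 "some.web.site" GET '
    '"/myaccount/videos/B004CZXC54.flv" "" 307 - "medusa" "-" '
    '"Python-urllib/2.6" "2.2.2.2"',
    '1.1.9.1 - [20/Mar/2011:18:10:45 -0700] 0 "some.web.site" GET '
    '"/myaccount/videos/B003QMJAXM.flv" "" 307 - "medusa" "-" '
    '"Python-urllib/2.6" "2.2.2.2"',
]


def extract_fields(event):
    """Hypothetical helper: return the named groups of the first match, or {}."""
    match = PATTERN.search(event)
    return match.groupdict() if match else {}


for event in EVENTS:
    print(extract_fields(event))
```

Both events produce medusa_account_alias and medusa_restype with this pattern, which suggests the difference comes from somewhere else in the configuration.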

The problem field is the medusa_account_alias field. The fields following it seem to be extracted ok.

Any ideas will be greatly appreciated. Is this some kind of bug in Splunk, or am I missing something?

Ledion_Bitincka
Splunk Employee

Can you please provide the props.conf/transforms.conf stanzas that are responsible for performing the extractions and field aliasing?

spock_yh
Path Finder

The problem is caused by a field alias I have defined.

What I want is for medusa_account_alias to be filled either from the above regex or from another field ("accountId") that is extracted for a different format of the log row. I used an alias from accountId to medusa_account_alias, and this caused the problem.

How else can I achieve this, that is, a single field that gets filled in either of two disjoint cases?

Also, this doesn't explain why Splunk's behavior was so arbitrary. Why would it generate medusa_account_alias for one event and not for the other?
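On the "two disjoint cases" question, one common way to express this is "take the first non-null value", which is what the coalesce function of Splunk's eval command does, for example eval medusa_account_alias=coalesce(medusa_account_alias, accountId). The Python below is only a sketch of that logic on a dictionary of already-extracted fields. The helper names are hypothetical, and it does not show how Splunk orders aliases and extractions internally.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve_account_alias(fields):
    """Hypothetical helper: fill medusa_account_alias from whichever source exists.

    Prefers the value extracted by the URL regex and falls back to accountId,
    without ever overwriting an existing value with a missing one.
    """
    merged = dict(fields)
    alias = coalesce(fields.get("medusa_account_alias"), fields.get("accountId"))
    if alias is not None:
        merged["medusa_account_alias"] = alias
    return merged


# One event of each log format, plus one with neither field.
print(resolve_account_alias({"medusa_account_alias": "myaccount"}))
print(resolve_account_alias({"accountId": "12345"}))
print(resolve_account_alias({"status": "307"}))
```

The point of the design is that a missing accountId never clobbers a value the regex already produced, which is the property the plain alias did not give me.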
