I have the following message format.
2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" message="Calling method="calculate" class="math" data="size="98123" rows="9811" firstcolumn="customername"""
Splunk parsed that into the following fields:
appid = myapplication
responsetimems = 155
message = Calling method=
class = math
data = size=
rows = 9811
firstcolumn = customername
But I want the following:
appid = myapplication
responsetimems = 155
message = Calling method="calculate" class="math" data="size="98123" rows="9811" firstcolumn="customername"
method = calculate
class = math
data = size="98123" rows="9811" firstcolumn="customername"
size = 98123
rows = 9811
firstcolumn = customername
How can I do that?
What Kristian writes is correct: you need to customize the field extraction. Based on your data sample, the solution takes two steps.
Assuming your data is indexed with the sourcetype "answers-1371500719", create the following entries in props.conf and transforms.conf:
#props.conf
[answers-1371500719]
REPORT-get_kv_fields = get_kv_fields
#transforms.conf
[get_kv_fields]
REGEX = ([a-zA-Z0-9]+)\=\"([a-zA-Z0-9]+)\"
FORMAT = $1::$2
MV_ADD = true
This ensures that you obtain all fields and corresponding values that follow this convention:
field="value123"
This gives you the appropriate key/value pairs. Note that the message field is still incorrect at this point.
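To see which pairs that REGEX picks up, here is a rough Python emulation run against the sample event from the question. It is only a sketch: the helper name extract_kv is made up for illustration, and collecting repeated keys into a list is my approximation of what MV_ADD = true does, not Splunk's actual implementation.

```python
import re

# Sample event from the question.
EVENT = (
    '2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" '
    'message="Calling method="calculate" class="math" '
    'data="size="98123" rows="9811" firstcolumn="customername"""'
)

# Same pattern as REGEX in [get_kv_fields]; the backslashes before = and "
# in the stanza are optional escapes, so they are dropped here.
KV_PATTERN = re.compile(r'([a-zA-Z0-9]+)="([a-zA-Z0-9]+)"')


def extract_kv(event):
    """Collect every key="value" pair; a repeated key keeps all of its values."""
    fields = {}
    for key, value in KV_PATTERN.findall(event):
        fields.setdefault(key, []).append(value)
    return fields


fields = extract_kv(EVENT)
for name, values in fields.items():
    print(name, "=", values)
```

The pattern only matches values made up purely of letters and digits, so message and data (whose values contain spaces and quotes) are skipped here. That is why the message field needs the separate extraction.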
The message field can then be overridden using an inline regular expression at search time,
sourcetype="answers-1371500719" | rex field=_raw "message\=\"(?<message>.+\"?)\"\""
or automatically, by updating your props.conf entry:
#props.conf
[answers-1371500719]
REPORT-get_kv_fields = get_kv_fields
EXTRACT-message_field = message\=\"(?<message>.+\"?)\"\"
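A quick way to check what that message regex captures is to run it outside Splunk. The sketch below uses Python, which spells named groups (?P<name>...) where Splunk's regex accepts (?<name>...). The DATA_PATTERN line is not part of the answer above: it is my own assumption that the same idea can be applied to the data field, and it only works for this sample because data is the last thing inside message.

```python
import re

# Sample event from the question.
EVENT = (
    '2013-06-17 15:33:01+0200 appid="myapplication" responsetimems="155" '
    'message="Calling method="calculate" class="math" '
    'data="size="98123" rows="9811" firstcolumn="customername"""'
)

# Python form of: message\=\"(?<message>.+\"?)\"\"
MESSAGE_PATTERN = re.compile(r'message="(?P<message>.+"?)""')

# Hypothetical analogue for the data field (an assumption, not from the answer).
DATA_PATTERN = re.compile(r'data="(?P<data>.+"?)""')

message_match = MESSAGE_PATTERN.search(EVENT)
data_match = DATA_PATTERN.search(EVENT)

message = message_match.group("message") if message_match else None
data = data_match.group("data") if data_match else None

print("message =", message)
print("data =", data)
```

The greedy .+ runs to the end of the event and then backs off until two closing quotes remain, so the capture keeps the quote that closes the innermost value.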
In the end you end up with this:
gc
BTW: Thanks for posting a data sample. It is always easier when we can see the data.
Thanks for the solution. I also had the same problem and this works for me as well.
Update: I can change the message format. What I am looking for is how to convince Splunk to work with inner/nested fields.
Yes, you can do it, either through rex extractions in each search or through some configuration in props.conf. That would involve EXTRACTs where you specify exactly what you want extracted (pretty much the same regex syntax as for rex). You might want to set KV_MODE to none as well.
http://docs.splunk.com/Documentation/Splunk/5.0.3/admin/Propsconf
EXTRACT-<class> = [<regex>|<regex> in <src_field>]
* Used to create extracted fields (search-time field extractions) that do not reference
transforms.conf stanzas.
* Performs a regex-based field extraction from the value of the source field.
* <class> is a unique literal string that identifies the namespace of the field you're extracting.
**Note:** <class> values do not have to follow field name syntax restrictions. You can use
characters other than a-z, A-Z, and 0-9, and spaces are allowed. <class> values are not subject
to key cleaning.
* The <regex> is required to have named capturing groups. When the <regex> matches, the named
capturing groups and their values are added to the event.
* Use '<regex> in <src_field>' to match the regex against the values of a specific field.
Otherwise it just matches against _raw (all raw event data).
* NOTE: <src_field> can only contain alphanumeric characters (a-z, A-Z, and 0-9).
* If your regex needs to end with 'in <string>' where <string> is *not* a field name, change
the regex to end with '[i]n <string>' to ensure that Splunk doesn't try to match <string>
to a field name.
KV_MODE = [none|auto|multi|json|xml]
* Used for search-time field extractions only.
* Specifies the field/value extraction mode for the data.
* Set KV_MODE to one of the following:
* none: if you want no field/value extraction to take place.
* auto: extracts field/value pairs separated by equal signs.
* auto_escaped: extracts fields/value pairs separated by equal signs and honors \" and \\
as escaped sequences within quoted values, e.g field="value with \"nested\" quotes"
* multi: invokes the multikv search command to expand a tabular event into multiple events.
* xml : automatically extracts fields from XML data.
* json: automatically extracts fields from JSON data.
* Setting to 'none' can ensure that one or more user-created regexes are not overridden by
automatic field/value extraction for a particular host, source, or source type, and also
increases search performance.
* Defaults to auto.
* The 'xml' and 'json' modes will not extract any fields when used on data that isn't of the
correct format (JSON or XML).
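Since the message format can be changed, the escape convention described under auto_escaped is worth a look. The Python sketch below only illustrates what honoring \" and \\ inside a quoted value means; it is not Splunk's implementation. The re-formatted event and the helper name parse_escaped are assumptions made up for the example.

```python
import re

# Hypothetical re-formatted event: the inner quotes are backslash-escaped.
EVENT = r'appid="myapplication" responsetimems="155" message="Calling method=\"calculate\" class=\"math\""'

# key="value", where the value may contain backslash-escaped characters.
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_escaped(text):
    """Extract key="value" pairs and undo the \\" and \\\\ escapes in each value."""
    fields = {}
    for key, raw in PAIR.findall(text):
        fields[key] = re.sub(r'\\(["\\])', r'\1', raw)
    return fields


outer = parse_escaped(EVENT)
# Second pass over the unescaped message value yields the nested fields.
inner = parse_escaped(outer["message"])

print(outer)
print(inner)
```

With the inner quotes escaped, the outer pass sees message as one value, and the nested pairs only appear when that value is parsed a second time.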
Hope this helps,
K
Sorry, but it's not what I am looking for. See the update.