I have a log file that sometimes contains a very long field.
A row of my log is:
2018-07-31 10:22:38.8701 inoutLogger level="ERROR" timestamp="31/07/2018 10:22:38" Elapsed_ms="1218.7727" richiesta='"<?xml version="1.0" encoding="utf-16"?><my very long xml>"'
My props.conf is:
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE=false
TRUNCATE=0
pulldown_type = 1
Sometimes when I search for the field "richiesta", the value is truncated.
search xxxx | table richiesta
I obtain only a part of the xml (e.g. "<?xml version="1.0").
Any suggestion?
Thanks
Gianluca
Take a look at this post.
Hi kmorris,
in my props.conf I already have TRUNCATE=0. According to the documentation, Splunk should never truncate.
Kind Regards
Gianluca
Did you define TRUNCATE=0 under the same stanza as your sourcetype or source? I am asking to see whether there are any precedence issues.
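One way to check, assuming shell access to the Splunk instance, is btool, which prints the merged configuration along with the file each setting comes from (the sourcetype name below is the one from this thread):

```
# show the merged props.conf settings for the sourcetype, with provenance
$SPLUNK_HOME/bin/splunk btool props list invest-be-inout-crg --debug
```

If TRUNCATE=0 shows up from a different file than you expect, that would point to a precedence problem.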
hi nittala_surya,
my props.conf is:
[invest-be-inout-crg]
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE=false
TRUNCATE=0
pulldown_type = 1
TRANSFORMS-filter_logs = extract_fields-invest-be-inout,extract_fields_Source_Wel
In my search query I set sourcetype=invest-be-inout-crg, so I think TRUNCATE is in the correct place. Is there a way to check this in the UI?
By the way, I will try TRUNCATE=20000 and see if that solves the problem.
Thanks
regards
Gianluca
Hi,
Are you able to see the full value of the richiesta field in the raw log?
Are you getting the truncated value only when displaying it in a table?
Hi thambisetty,
If I run the query "search xxxx | fields richiesta" I see the field truncated even though the raw data is complete.
If I run a similar query, "search xxxx | eval len=len(_raw) | eval len_rich=len(richiesta) | table richiesta len len_rich",
it looks like the field is truncated when the raw event is longer than 10000 characters: it is always truncated above 10000 characters and never truncated below 10000.
Bye
That means the field value is not being extracted as you expected. If you can post a sample raw event, I can help you with a regex to extract the richiesta field value.
Hi thambisetty,
I didn't define a transform because, according to Splunk's documentation, a log already written as key='value' should have its values extracted automatically.
Writing a transform is an option I can consider if I don't find another solution.
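If I do go that route, a minimal explicit search-time extraction in props.conf might look like this (the regex is only a sketch and assumes the value is always wrapped in ='"…"' quoting, as in my sample row):

```
# props.conf — hypothetical explicit extraction for richiesta;
# the regex assumes richiesta='"..."' quoting and no "' sequence inside the XML
[invest-be-inout-crg]
EXTRACT-richiesta = richiesta='"(?<richiesta>.*?)"'
```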
Thanks
Kind regards
Gianluca
If you want to extract values at search time, you can use the spath command, like this: search xxxx | spath input=richiesta
This extracts all the XML fields automatically. More info here.
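For example, given the XML sample shown later in this thread, a specific element can be pulled out with a path argument (the path below is an assumption based on that sample; XML namespaces may need extra handling):

```
search xxxx
| spath input=richiesta output=username path=MemoConsulenzaRequest.ZSRVEXT.Username
```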
If this is not what you're looking for, then please provide some sample events and I can help you with regular expressions to extract the fields using props.conf.
Hi nittala_surya,
Thank you for your reply. What I am not able to understand is why the richiesta field contains only part of the xml.
The spath command is very interesting.
Kind regards
What do you mean by "a part of xml"? As in, only the richiesta field contains xml data and the rest of the raw event is plain text?
My richiesta field in the log file is:
.... richiesta='"<?xml version="1.0" encoding="utf-16"?><MemoConsulenzaRequest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><ZSRVEXT><Username xmlns="http://www.cadit.it/MW/MWGSSRE">ut27537</Username></ZSRVEXT></MemoConsulenzaReques...' .....
If I query richiesta at search time, I obtain only the first part: "<?xml version="1.0"
I think the default values of the [kv] (key-value) stanza are the reason for the truncation. According to limits.conf, below are the default values. Check whether any of these apply to "richiesta".
avg_extractor_time = <integer>
* Maximum amount of CPU time, in milliseconds, that the average (over search
results) execution time of a key-value pair extractor will be allowed to take
before warning. Once the average becomes larger than this amount of time a
warning will be issued
* Default: 500 (.5 seconds)
limit = <integer>
* The maximum number of fields that an automatic key-value field extraction
(auto kv) can generate at search time.
* If search-time field extractions are disabled (KV_MODE=none in props.conf)
then this setting determines the number of index-time fields that will be
returned.
* The summary fields 'host', 'index', 'source', 'sourcetype', 'eventtype',
'linecount', 'splunk_server', and 'splunk_server_group' do not count against
this limit and will always be returned.
* Increase this setting if, for example, you have indexed data with a large
number of columns and want to ensure that searches display all fields from
the data.
* Default: 100
maxchars = <integer>
* Truncate _raw to this size and then do auto KV.
* Default: 10240 characters
maxcols = <integer>
* When non-zero, the point at which kv should stop creating new fields.
* Default: 512
max_extractor_time = <integer>
* Maximum amount of CPU time, in milliseconds, that a key-value pair extractor
will be allowed to take before warning. If the extractor exceeds this
execution time on any event, a warning will be issued.
Just a note: the longer and more complicated your events, the more you stand to gain from hand-coding the field extractions. The auto extractor, while "correct", does not necessarily produce the most efficient regular expressions for the data.