Hi,
I am receiving logs from an email gateway, and every field value is wrapped in ' characters, which are captured as part of the field value. Below is a sample log.
<22>May 21 14:16:30 meg234 : app='smtp', name='Email Status', policy_name='', dvc_host='', virtual_host='meg.test.com', event_id=50006, reason_id=77, direction=1, src_ip='1.1.1.1', src_host='meg.test.com', dest_ip='2.2.2.2', dest_host='', rhdr_ip='', is_primary_action=, scanner='', action='', status='Email Delivered', sender=, recipient='', msgid='5b69_036d_201e8739_1cef_495b_a267_8ce04d4b9c36', orig_msgid='2ecf5795f6ea4e1ca81c732102316082@test.local', nrcpts=1, relay='', subject='sadeer Final', encryption_type='0', orig_subject='', orig_sender='', size=238141, attachments='Companytest.docx, test.xlsx', number_attachments=2, virus_name='', file_name='', spamscore=, spamthreshold=, spamrules='', URL='', contentrule='[]', content_terms='[]', tz='GMT', tz_offset='+0000', dlpfile='', dlprules='', dlpclassification='', dlpfileuploaded='', dlpfiledigest='', dlpfilesize='', iascore=, iathreshold=, ts_reputation_score=, ts_geo_location='', ts_ip_rep_status=, ts_hash_length=, ts_lookup_hash='', local-time='2017-05-21_14:16:23_GMT' scan-host-name='meg', scan-host-ip='1.1.1.1', host-name='meg234', host-domain-name='test.com', mac-address='00:00:00:34:79:45', product='FG (9.9) PM5600', user-name='test'
All of the captured field values include the ' character at the beginning and end of the value. I want to remove the ' character from the beginning and end of the value for all of the fields.
Please help me with this.
The Search-Time Order of Operations is this:
Sourcetype RENAME
EXTRACT-xxx
REPORT-xxx
KV_MODE
FIELDALIAS-xxx
EVAL-xxx
LOOKUP-xxx
MILLISECONDS
FILTER
EVENTTYPING
TAGGING
So use EVAL instead of EXTRACT and try this:
[YourSourcetypeHere]
EVAL-app=replace(app, "^'|'$", "")
And so on for all of the field names.
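To see what the anchored pattern in that replace() call does before putting it in props.conf, here is a minimal Python sketch. It is not Splunk code: it assumes Python's re module treats the pattern "^'|'$" the same way for these simple values, and the helper name strip_outer_quotes is made up for illustration. The sample values are taken from the log above.

```python
import re

def strip_outer_quotes(value: str) -> str:
    """Remove one leading and one trailing single quote, if present.

    Hypothetical helper mirroring the pattern used in
    EVAL-app=replace(app, "^'|'$", "").
    """
    return re.sub(r"^'|'$", "", value)

# Values as they appear in the sample event, quotes included.
samples = ["'smtp'", "'Email Status'", "''", "50006"]
cleaned = [strip_outer_quotes(v) for v in samples]
print(cleaned)
```

Unquoted values such as event_id=50006 pass through unchanged, and an empty quoted value becomes an empty string.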
Hello,
To handle it at search time, you can add the following to props.conf (on the search heads):
[Sourcetype]
EXTRACT-app = app=\'(?<app>\w+)\'
and so on for other fields.
Regards
No, this will not help because EXTRACT happens before KV_MODE; that's why I asked how the fields were being created.
@woodcock As per my test (Splunk version 6.5.3), EXTRACT is working fine with KV_MODE=auto.
This is very strange but hey; there you go!
I believe that allowing EXTRACT to work after KV_MODE is intended to make some tweaks on the automatically extracted fields.
During search, you can do it like this:
... | foreach * [rex field=<<FIELD>> mode=sed "s/^'// s/'$//"]
During indexing, we will need to know how you are indexing your fields.
During search, the command below works, but I need to fix this in props.conf.
index=test | rex mode=sed "s/'//g"
Your SEDCMD approach is wrong because it does not account for the fact that the ' character frequently occurs inside the field data with an escape character; it will strip the quote but leave the escape, which is very confusing. How are you creating your fields now? Are you using KV_MODE=auto?
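The concern about embedded quotes can be illustrated outside Splunk with a short Python sketch. The value below is hypothetical (the sample event in this thread contains no escaped quotes), and Python's string handling only stands in for sed here; the point is the difference between removing every ' and removing only the outer pair.

```python
import re

# Hypothetical field value with an escaped quote inside it; not from the sample log.
raw = r"'O\'Brien report'"

# Equivalent in spirit to s/'//g: every quote is removed, the backslash stays behind.
global_strip = raw.replace("'", "")

# Equivalent in spirit to s/^'// followed by s/'$//: only the outer quotes go.
anchored_strip = re.sub(r"^'|'$", "", raw)

print(global_strip)
print(anchored_strip)
```

The global version leaves a stray backslash in the value, while the anchored version keeps the inner escaped quote intact.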
Yes, I am using auto mode.
Just to add..
When I use the search below, the ' character is removed.
index=test | rex mode=sed "s/'//g"
But when I add the line below to props.conf, the ' characters are not removed.
SEDCMD-RemoveSingleQuotes = s//'//g
SEDCMD is used at index time.
As per the docs (http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf):
SEDCMD-<class> = <sed script>
* Only used at index time.