Splunk Search

Remove the special character ' from the beginning and end of field values

mustafag
Path Finder

Hi,
I am receiving logs from an email gateway. All the field values are enclosed in ' characters, and those characters are captured as part of the field value. Below is a sample log.

<22>May 21 14:16:30 meg234 : app='smtp', name='Email Status', policy_name='', dvc_host='', virtual_host='meg.test.com', event_id=50006, reason_id=77, direction=1, src_ip='1.1.1.1', src_host='meg.test.com', dest_ip='2.2.2.2', dest_host='', rhdr_ip='', is_primary_action=, scanner='', action='', status='Email Delivered', sender=, recipient='', msgid='5b69_036d_201e8739_1cef_495b_a267_8ce04d4b9c36', orig_msgid='2ecf5795f6ea4e1ca81c732102316082@test.local', nrcpts=1, relay='', subject='sadeer Final', encryption_type='0', orig_subject='', orig_sender='', size=238141, attachments='Companytest.docx, test.xlsx', number_attachments=2, virus_name='', file_name='', spamscore=, spamthreshold=, spamrules='', URL='', contentrule='[]', content_terms='[]', tz='GMT', tz_offset='+0000', dlpfile='', dlprules='', dlpclassification='', dlpfileuploaded='', dlpfiledigest='', dlpfilesize='', iascore=, iathreshold=, ts_reputation_score=, ts_geo_location='', ts_ip_rep_status=, ts_hash_length=, ts_lookup_hash='', local-time='2017-05-21_14:16:23_GMT' scan-host-name='meg', scan-host-ip='1.1.1.1', host-name='meg234', host-domain-name='test.com', mac-address='00:00:00:34:79:45', product='FG (9.9) PM5600', user-name='test'

Every captured field value includes the special character ' at its beginning and end. I want to remove the ' from the beginning and end of the value for all of the fields.
Please help me with this.

0 Karma
1 Solution

woodcock
Esteemed Legend

The Search-Time Order of Operations is this:

Sourcetype RENAME
EXTRACT-xxx
REPORT-xxx
KV_MODE
FIELDALIAS-xxx
EVAL-xxx
LOOKUP-xxx
MILLISECONDS
FILTER
EVENTTYPING
TAGGING

So use EVAL instead of EXTRACT and try this:

[YourSourcetypeHere]
EVAL-app=replace(app, "^'|'$", "")

And so on for all of the field names.
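
To see what the pattern in the EVAL above matches, here is a minimal Python sketch (an illustration only, not Splunk configuration) that applies the same regular expression with `re.sub`. The helper name `strip_outer_quotes` is hypothetical, and the sample values are taken from the log in the question. Splunk's `replace()` uses its own regex engine, so treat this as a check of the pattern's logic rather than of Splunk's behaviour.

```python
import re

# Same pattern as in the EVAL: a quote at the very start OR at the very end.
OUTER_QUOTES = re.compile(r"^'|'$")

def strip_outer_quotes(value: str) -> str:
    """Remove one leading and one trailing single quote, if present."""
    return OUTER_QUOTES.sub("", value)

# Values as they appear in the sample log.
for raw in ["'smtp'", "'Email Status'", "''", "50006"]:
    print(repr(raw), "->", repr(strip_outer_quotes(raw)))
```

Because the pattern is anchored with `^` and `$`, quotes or commas inside a value (for example in the `attachments` field) are left untouched, and unquoted values such as `event_id=50006` pass through unchanged.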

0 Karma

aakwah
Builder

Hello,

To handle it at search time, you can add the following to props.conf (on the search heads):

[Sourcetype]
EXTRACT-app = app=\'(?<app>\w+)\'

and so on for other fields.
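
One caveat when repeating this for other fields: `\w+` matches only letters, digits, and underscores, so a pattern of the same shape would not match values in the sample log that contain spaces or are empty (for example `name='Email Status'` or `policy_name=''`). The Python sketch below (an illustration, not Splunk configuration) compares `\w+` with `[^']*`. The helper `extract` is hypothetical, and the `[^']*` alternative is a suggestion that was not tested in Splunk in this thread.

```python
import re

# A shortened excerpt of the sample log from the question.
sample = "app='smtp', name='Email Status', policy_name='', dvc_host=''"

def extract(field: str, value_pattern: str, text: str):
    """Return the value captured for field='...', or None if nothing matches."""
    match = re.search(rf"\b{re.escape(field)}='({value_pattern})'", text)
    return match.group(1) if match else None

for field in ["app", "name", "policy_name"]:
    word_only = extract(field, r"\w+", sample)
    any_non_quote = extract(field, r"[^']*", sample)
    print(field, "| \\w+ ->", repr(word_only), "| [^']* ->", repr(any_non_quote))
```

With `\w+`, only `app` is captured here; `[^']*` also captures the value with a space and the empty value.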

Regards

0 Karma

woodcock
Esteemed Legend

No, this will not help because EXTRACT happens before KV_MODE; that is why I asked how the fields were being created.

0 Karma

aakwah
Builder

@woodcock As per my test (Splunk version 6.5.3), EXTRACT works fine with KV_MODE=auto.

0 Karma

woodcock
Esteemed Legend

This is very strange but hey; there you go!

0 Karma

aakwah
Builder

I believe that allowing EXTRACT to work alongside KV_MODE is intended to let you tweak the automatically extracted fields.

0 Karma

woodcock
Esteemed Legend

During search, you can do it like this:

... | foreach * [rex field=<<FIELD>> mode=sed "s/^'// s/'$//"]

During indexing, we will need to know how you are indexing your fields.

0 Karma

mustafag
Path Finder

During the search, the command below works, but I need to fix this in props.conf.
index=test | rex mode=sed "s/'//g"

0 Karma

woodcock
Esteemed Legend

Your SEDCMD approach is wrong because it does not consider that the ' character frequently occurs inside the field data with an escape character; this will strip the quote but leave the escape, which is very confusing. How are you creating your fields now? Are you using KV_MODE=auto?
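
A small Python sketch of that problem, assuming a hypothetical value in which an inner quote is escaped with a backslash (the sample log in the question does not contain one). It contrasts removing every quote, as `s/'//g` does, with removing only the leading and trailing quote. This illustrates the regex logic only and is not Splunk code.

```python
import re

# Hypothetical field value with an escaped quote inside it.
raw_value = r"'O\'Brien report'"

# Global removal, as in s/'//g: every quote goes, the backslash stays behind.
global_strip = re.sub(r"'", "", raw_value)

# Anchored removal: only the first and last quote are removed.
anchored_strip = re.sub(r"^'|'$", "", raw_value)

print(global_strip)    # O\Brien report
print(anchored_strip)  # O\'Brien report
```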

0 Karma

mustafag
Path Finder

Yes, I am using auto mode.

0 Karma

mustafag
Path Finder

Just to add:
When I use the search below, the special character ' is removed.
index=test | rex mode=sed "s/'//g"

But when I add the line below to props.conf, the special characters are not removed.

SEDCMD-RemoveSingleQuotes = s/'//g

0 Karma

aakwah
Builder

SEDCMD is used at index time.

As per the docs (http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf):

SEDCMD-<class> = <sed script>
* Only used at index time.
0 Karma