All Apps and Add-ons

Need to remove numeric values from field to find top values

tachu
Explorer

I have millions of values indexed that look like this

,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_33097521 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_20709664 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_586_user_by_outfit_32587030 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_592_impression_33141624 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_33952008 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_new_33208512 A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_new_stars_32270501 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_32635194 ,A}MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_32635194 A}USER_V2_ID_TO_CAREER_TIER_1417875 A}US_MC_USER_EVENTS_BY_UID_U_802735 A}USER_OUTFIT_LOOK_RATING_STARS_SAVED_KEY3_17481979 A}USEROUTFIT_32305379

There is many more variations. I need to create a field that grabs all the alpha characters and excludes all numbers to be able to see top patters/values IE

MCTEST1_SI_EVENTS_TEST1_SI_EVENTS_no_event_id_total_value_season_percent_stars_

MCTEST1_SI_EVENTS_TEST1_SI_EVENTS__impression_

MCTEST1_SI_EVENTS_TEST1_SI_EVENTS__user_by_outfit_

USER_V2_ID_TO_CAREER_TIER_

US_MC_USER_EVENTS_BY_UID_U_
0 Karma

kristian_kolb
Ultra Champion

You can easily do it in inline in a search with rex;

...| rex field=your_fieldname mode=sed "s/\d//g"

If you don't specify a field name, the sed script will run aginst the whole event (the _raw field).

Then you can do your stats/top/chart etc on the field.

http://docs.splunk.com/Documentation/Splunk/6.0.1/SearchReference/Rex

/K

kristian_kolb
Ultra Champion

alternatively, if you want to keep some numbers, like in TEST1 or V2, but not those sub-parts that are ONLY numbers, like _2342_ you can just alter the sed script slightly;

...| rex field=your_fieldname mode=sed "s/_\d+/_/g"

0 Karma

sideview
SplunkTrust
SplunkTrust

If these are in a field called myField then you would just tack this on the end of your search:

| eval myField=replace(myField,"_\d+$","")

and that will effectively clip the _33097521 off the end of all the values.

Therefore if you have this:

| eval myField=replace(myField,"_\d+$","") | top 100 myField

you'll get the top 100 values, considering only the part up to the big integers.

You can also use rex instead of eval. The following will do the same thing as the eval syntax above:

| rex field="myField" "(?<myField>.+)_\d+"`
0 Karma

kristian_kolb
Ultra Champion

Oops, answering a little late. Hmm.. Neither the rex nor the replace will handle numbers in the middle of the strings.

0 Karma

lukejadamec
Super Champion

Is that an example of one event, or is each line an event?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...