Solved: How to edit my search to prevent duplicate values ...

splunker1981 · ‎07-30-2019

Hello fellow Splunkers

Not sure the best way to approach the following problem. I use replace to update values within a fields. I'm running into an issue when I replace the 2 different strings values with the same string which ends up resulting in duplicate values for each of the renamed strings

| searchHere
| stats count(product_id) as total_product_id by assest_tag, product_version
| replace "storeX" with "local" in asset_tag 
| replace "storeY_NZ" with "local" in asset_tag 
| where asset_tag == local
| eventstats sum(total_product_id) as total_in_inventory
| eval perc = round( total_product_id * 100 / total_in_inventory, 1 ) 
| table product_version, total_in_inventory, perc

Part of the issue that I'm having is that I'm getting duplicate production versions. My guess is because of the replace. How can I resolve this?

jacobpevans · ‎07-30-2019

Howdy Splunker,

Could you provide some sample data and explain where you see the duplicates?

Based on the information given, my guess is one or more of the following:
- Move the two replace lines to above the first stats
- Add by asset_tag, product_version to the end of your eventstats
- Change count(product_id) to just count

Here's a run-anywhere search that I was using to try to understand your data. The output of it looks accurate to me.

| makeresults count=50
| eval asset_tag       = random()%10 * 1000 + 1000
| eval product_version = random()%10 * 10   + 10
| replace "2000" with "1000" in asset_tag
| replace "3000" with "1000" in asset_tag
| stats count as total_product_id by asset_tag, product_version
| eventstats sum(total_product_id) as total_in_inventory 
| eval perc = round( 100 * (total_product_id / total_in_inventory), 1 )
| sort asset_tag product_version
| addcoltotals labelfield=asset_tag total_product_id perc

Cheers,
Jacob

If you feel this response answered your question, please do not forget to mark it as such. If it did not, but you do have the answer, feel free to answer your own post and accept that as the answer.

View solution in original post

jacobpevans · ‎07-30-2019

Howdy Splunker,

Could you provide some sample data and explain where you see the duplicates?

Based on the information given, my guess is one or more of the following:
- Move the two replace lines to above the first stats
- Add by asset_tag, product_version to the end of your eventstats
- Change count(product_id) to just count

Here's a run-anywhere search that I was using to try to understand your data. The output of it looks accurate to me.

| makeresults count=50
| eval asset_tag       = random()%10 * 1000 + 1000
| eval product_version = random()%10 * 10   + 10
| replace "2000" with "1000" in asset_tag
| replace "3000" with "1000" in asset_tag
| stats count as total_product_id by asset_tag, product_version
| eventstats sum(total_product_id) as total_in_inventory 
| eval perc = round( 100 * (total_product_id / total_in_inventory), 1 )
| sort asset_tag product_version
| addcoltotals labelfield=asset_tag total_product_id perc

Cheers,
Jacob

If you feel this response answered your question, please do not forget to mark it as such. If it did not, but you do have the answer, feel free to answer your own post and accept that as the answer.

woodcock · ‎07-30-2019

@jacobevans nailed it.

How to edit my search to prevent duplicate values when using replace command

Join Us for Splunk University and Get Your Bootcamp Game On!

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

Announcing Scheduled Export GA for Dashboard Studio