Solved: Re: Duplicate values for each fields coming up whe...

srv007 · ‎06-19-2023

Below is the splunk query that i'm using.

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename

the above query is working fine, just duplicate values coming for tier sownername servicename

Output :

Script_run IDBC-Hostname user login-count tier sownername servicename

2023/06/19 12:25 abc12345 11133121-990 4 2 Bob ADF-Co

2 Bob ADF-Co

2023/06/15 17:25 xzyz1112 33421111-990 2 1 Sam AXF-Co

1 Sam AXF-Co

ITWhisperer · ‎06-20-2023

It looks like you have multiple entries in your owner.csv for hostname.

Ideally, you should only have one entry

If you don't want to resolve the duplicates in your owner.csv file, try something like this

|stats values(*) as * by Script_run IDBC-Hostname user login-count

View solution in original post

ITWhisperer · ‎06-20-2023

Try mvdedup(user)

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
| eval user=mvdedup(user)
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename

srv007 · ‎06-20-2023

still same result, no changes

ITWhisperer · ‎06-20-2023

Try the same with IDBC-Hostname

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
| eval user=mvdedup(user)
|lookup read-login.csv user-id as user OUTPUTNEW user-id
| eval IDBC-Hostname=mvdedup('IDBC-Hostname')
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename

srv007 · ‎06-20-2023

still same, no help.

ITWhisperer · ‎06-20-2023

Perhaps it is time for you to share some of your events. For example, what do you get after this part of the search?

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count

srv007 · ‎06-20-2023

Below is the result :

Script_run IDBC-Hostname user login-count

2023/06/19 12:25 abc12345 11133121-990 4
2023/06/15 17:25 xzyz1112 33421111-990 2

2023/06/14 19:15 arsd1123 7877333-750 1

2023/06/12 12:25 abc12345 9911128-630 3

Here it is expected to have repeated hostnames as the user id is varying and we need sum total of login-count so we shouldnt dedup hostnames.

Now the problem of duplicate value happens for fields which we using for lookup like tier

ITWhisperer · ‎06-20-2023

Try adding one more line at a time to find out which line is producing the duplicates

srv007 · ‎06-20-2023

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count

|lookup read-login.csv user-id as user OUTPUTNEW user-id
|search user-id=*
|sort -login-count
|fields Script_run IDBC-Hostname user login-count tier sownername servicename

This gives output as below--

Script_run IDBC-Hostname user login-count tier sownername servicename

2023/06/19 12:25 abc12345 11133121-990 4
2023/06/15 17:25 xzyz1112 33421111-990 2

2023/06/14 19:15 arsd1123 7877333-750 1

2023/06/12 12:25 abc12345 9911128-630 3

srv007 · ‎06-20-2023

|lookup read-login.csv user-id as user OUTPUTNEW user-id

|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename

|search user-id=*
|sort -login-count
|fields Script_run IDBC-Hostname user login-count tier sownername servicename

When i add lookup for owner.csv then its creating duplicates for the lookup fields.

owner.csv has tier,sownername servicename field values which i'm comparing and adding to my main search
read-login.csv has user account info.

read.ps1 has info about about all hostnames and login count. Using read-login.csv im fetching details for accounts mentioned in that csv only, and doing lookup with owner.csv i'm adding the extra fields to the search results.

ITWhisperer · ‎06-20-2023

It looks like you have multiple entries in your owner.csv for hostname.

Ideally, you should only have one entry

If you don't want to resolve the duplicates in your owner.csv file, try something like this

|stats values(*) as * by Script_run IDBC-Hostname user login-count

srv007 · ‎06-20-2023

Thanks Mate, this did help.
all repeated fields are gone, at the same time not affecting the hostname.
so output is below :

The only concern is can we remove user-id from getting displayed as same info already being shown under user.
And if we can re-allign the order of the fields.
example --

Script_run IDBC-Hostname user login-count tier sownername servicename

srv007 · ‎06-20-2023

Update --

i added | table and the fields to show and this as fixed the above mentioned too.

Kudos!!

ITWhisperer · ‎06-20-2023

Where did you add the stats command?

srv007 · ‎06-20-2023

index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort -login-count
|stats values(*) as * by Script_run IDBC-Hostname user login-count
|table Script_run IDBC-Hostname user login-count tier sownername servicename

ITWhisperer · ‎06-20-2023

So each event has one value for IDBC-Hostname, and your owner.csv has only one entry for each hostname?

srv007 · ‎06-20-2023

owner.csv has only one entry for each hostname? >> correct
So each event has one value for IDBC-Hostname >> can have more, hostname can re-appear but against different user and its run daily, so can have more than one entry for same hostname.

srv007 · ‎06-20-2023

want to understand why the lookup fields are appearing repeated and how to fix it.
I did try
|mvexpand tier | dedup IDBC-Hostname

but then that removes duplicate entries which shouldn't happen as it can be case same server repeated and we need data for it. This dedup affects login-count since dedup removes duplicate hostnames.

Duplicate values for each fields coming up where lookup is used?

dashboard

Introducing the Splunk Community Dashboard Challenge!

Get the T-shirt to Prove You Survived Splunk University Bootcamp

Wondering How to Build Resiliency in the Cloud?