Below is the splunk query that i'm using.
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename
the above query is working fine, just duplicate values coming for tier sownername servicename
Output :
Script_run IDBC-Hostname user login-count tier sownername servicename
2023/06/19 12:25 abc12345 11133121-990 4 2 Bob ADF-Co
2 Bob ADF-Co
2 Bob ADF-Co
2 Bob ADF-Co
2023/06/15 17:25 xzyz1112 33421111-990 2 1 Sam AXF-Co
1 Sam AXF-Co
1 Sam AXF-Co
1 Sam AXF-Co
It looks like you have multiple entries in your owner.csv for hostname.
Ideally, you should only have one entry
If you don't want to resolve the duplicates in your owner.csv file, try something like this
|stats values(*) as * by Script_run IDBC-Hostname user login-count
Try mvdedup(user)
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
| eval user=mvdedup(user)
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename
still same result, no changes
Try the same with IDBC-Hostname
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
| eval user=mvdedup(user)
|lookup read-login.csv user-id as user OUTPUTNEW user-id
| eval IDBC-Hostname=mvdedup('IDBC-Hostname')
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort - login-count
| fields Script_run IDBC-Hostname user login-count tier sownername servicename
still same, no help.
Perhaps it is time for you to share some of your events. For example, what do you get after this part of the search?
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
Below is the result :
Script_run IDBC-Hostname user login-count
2023/06/19 12:25 abc12345 11133121-990 4
2023/06/15 17:25 xzyz1112 33421111-990 2
2023/06/14 19:15 arsd1123 7877333-750 1
2023/06/12 12:25 abc12345 9911128-630 3
Here it is expected to have repeated hostnames as the user id is varying and we need sum total of login-count so we shouldnt dedup hostnames.
Now the problem of duplicate value happens for fields which we using for lookup like tier
Try adding one more line at a time to find out which line is producing the duplicates
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|search user-id=*
|sort -login-count
|fields Script_run IDBC-Hostname user login-count tier sownername servicename
This gives output as below--
Script_run IDBC-Hostname user login-count tier sownername servicename
2023/06/19 12:25 abc12345 11133121-990 4
2023/06/15 17:25 xzyz1112 33421111-990 2
2023/06/14 19:15 arsd1123 7877333-750 1
2023/06/12 12:25 abc12345 9911128-630 3
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort -login-count
|fields Script_run IDBC-Hostname user login-count tier sownername servicename
When i add lookup for owner.csv then its creating duplicates for the lookup fields.
owner.csv has tier,sownername servicename field values which i'm comparing and adding to my main search
read-login.csv has user account info.
read.ps1 has info about about all hostnames and login count. Using read-login.csv im fetching details for accounts mentioned in that csv only, and doing lookup with owner.csv i'm adding the extra fields to the search results.
It looks like you have multiple entries in your owner.csv for hostname.
Ideally, you should only have one entry
If you don't want to resolve the duplicates in your owner.csv file, try something like this
|stats values(*) as * by Script_run IDBC-Hostname user login-count
Thanks Mate, this did help.
all repeated fields are gone, at the same time not affecting the hostname.
so output is below :
The only concern is can we remove user-id from getting displayed as same info already being shown under user.
And if we can re-allign the order of the fields.
example --
Script_run IDBC-Hostname user login-count tier sownername servicename
Update --
i added | table and the fields to show and this as fixed the above mentioned too.
Kudos!!
Where did you add the stats command?
index=standard_tanium source="Prod-CSV" script=read.ps1"
|rename field01 as user
|rename field02 as login-count
|rename CSV-Timestamp as Script_run
|rex mode=sed field=user "/[&?].*//g"
|table Script_run IDBC-Hostname user login-count
|lookup read-login.csv user-id as user OUTPUTNEW user-id
|lookup owner.csv hostname as IDBC-Hostname OUTPUTNEW tier sownername servicename
|search user-id=*
|sort -login-count
|stats values(*) as * by Script_run IDBC-Hostname user login-count
|table Script_run IDBC-Hostname user login-count tier sownername servicename
So each event has one value for IDBC-Hostname, and your owner.csv has only one entry for each hostname?
owner.csv has only one entry for each hostname? >> correct
So each event has one value for IDBC-Hostname >> can have more, hostname can re-appear but against different user and its run daily, so can have more than one entry for same hostname.
want to understand why the lookup fields are appearing repeated and how to fix it.
I did try
|mvexpand tier | dedup IDBC-Hostname
but then that removes duplicate entries which shouldn't happen as it can be case same server repeated and we need data for it. This dedup affects login-count since dedup removes duplicate hostnames.