CSV content:

swarjs · ‎01-11-2018

We are trying to parse an entry in the csv which is of the below format,
2018-01-11 00:00:00,default.MS_2016,shekhar.com,Forward-Mapping,No,,10.102.31.216,1,0,2,0,0,0,1,0,0,0,0,1,0,0,1,1,0,1,0,8,0,0,hinf=10|cert=15|ipsec=80|

The value in the last column needs to be parsed and it should be displayed as separate columns and corresponding value.
Example: For the entry "hinf=10|cert=15|ipsec=80|"
The Table needs to display,
hinf cert ipsec
10 15 80

Could anyone please let me know if this is possible.

mayurr98 · ‎01-11-2018

hey @swarjs

Try this run anywhere search

| makeresults | eval _raw="2018-01-11 00:00:00,default.MS_2016,shekhar.com,Forward-Mapping,No,,10.102.31.216,1,0,2,0,0,0,1,0,0,0,0,1,0,0,1,1,0,1,0,8,0,0,hinf=10|cert=15|ipsec=80|" | rex field=_raw "hinf=(?P<hinf>[^\|]+)\|cert=(?P<cert>[^\|]+)\|ipsec=(?P<ipsec>[^\|]+)" | table hinf cert ipsec

Let me know if this helps!

swarjs · ‎01-11-2018

Thanks for the quick response!!
Sorry I probably missed out on adding more details in my question.

As a standalone search, your query is working fine. But in our scenario the last value in the CSV is a dynamic list of key-value pairs. So there is no fixed keyword and no fixed number of entries.

Following is the existing query for the report,
sourcetype=ib:dns:zone index=ib_ipam (view="*") * * * | msservers ms_primary | dedup view,zone_name | noop | rename view as View, zone_name as Zone, zone_format as Function, signed as Signed, hosts as Hosts, rr_total as "Total Records", rr_a as "A Records", rr_aaaa as "AAAA Records", rr_cname as "CNAME Records", rr_dhcid as "DHCID Records", rr_dname as "DNAME Records", rr_dnskey as "DNSKEY Records", rr_ds as "DS Records", rr_mx as "MX Records", rr_naptr as "NAPTR Records", rr_nsec as "NSEC Records", rr_nsec3param as "NSEC3PARAM Records", rr_nsec3 as "NSEC3 Records", rr_ns as "NS Records", rr_ptr as "PTR Records", rr_rrsig as "RRSIG Records", rr_soa as "SOA Records", rr_srv as "SRV Records", rr_tlsa as "TLSA Records", rr_txt as "TXT Records", rr_other as "Other Records", rr_lbdn as "LBDN" | eval Timestamp = strftime(_time, "%Y-%m-%d %H:%M:%S %Z") | table Timestamp, Zone, Function, Signed, Hosts, "LBDN", "Total Records", "A Records", "AAAA Records", "CNAME Records", "DHCID Records", "DNAME Records", "DNSKEY Records", "DS Records", "MX Records", "NAPTR Records", "NSEC Records", "NSEC3PARAM Records", "NSEC3 Records", "NS Records", "PTR Records", "RRSIG Records", "SOA Records", "SRV Records", "TLSA Records", "TXT Records", "Other Records"

In the query above, rr_aaaa, rr_a, rr_soa etc are the headers of the csv.

This needs enhancement to support the new column added in the CSV(rr_unknown = "hinfo=10|cert=15|ipsec=80|").

mayurr98 · ‎01-11-2018

hey so you can write

| rex field=rr_unknown "hinf=(?P<hinf>[^\|]+)" | rex field=rr_unknown  "cert=(?P<cert>[^\|]+)" | rex field=rr_unknown "ipsec=(?P<ipsec>[^\|]+)" | table hinf cert ipsec

OR

| extract pairdelim="|", kvdelim="="

swarjs · ‎01-11-2018

But the values that are populated in the last column are dynamic. So there is no fixed format for the key value pair and also there is no fixed count of the key-value pairs that would occur in this column. So using the above "rex" probably would not work since it expects particular field keys.

Could you please let me know how to write a generic version of the regex, where in the last column will be parsed based on key-value pairs. Is there a way that we could specify the "extract" call to work on a particular field, say "rr_unknown".

I am new to splunk queries, so please do not mind if my question sounds trivial.

mayurr98 · ‎01-11-2018

can you give me sample input? with the exact field name and values?

swarjs · ‎01-11-2018

CSV content:

timestamp,view,zone_name,zone_format,signed,grid_primary,ms_primary,rr_a,......rr_other,rr_total,hosts,rr_lbdn,rr_dhcid,rr_unknown
2018-01-12 00:00:00,default.MS_2016,shekhar.com,Forward-Mapping,No,,10.102.31.216,1,0,2,0,0,0,1,0,0,0,0,1,0,0,1,1,0,1,0,8,0,0,0,|hinfo=10|cert=20|loc=30|abc=67|
2018-01-12 00:00:00,default.MS_2016,zone.com,Forward-Mapping,No,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,|caa=89|cname=90|aaaa=100|

In the above CSV entries, the last column("rr_unknown"), could contain the key value pairs.
this column needs to be further parsed and each entry of that parsing entry needs to be displayed as a separate column in the table.

Example: The table display should look as follows,
Timestamp Zone 'A Record' 'AAAA Record' .. .. .. .. .. 'DHCID Record' 'CAA Record' 'CNAME Record' 'AAAA Record'
2018-01-12 00:00:00 zone.com 0 0 .. .. .. .. .. 0 89 90 100

The query that I am currently using is as follows,
sourcetype=ib:dns:zone index=ib_ipam (view="") | msservers ms_primary | dedup view,zone_name | noop | rename view as View, zone_name as Zone, zone_format as Function, signed as Signed, hosts as Hosts, rr_total as "Total Records", rr_a as "A Records", rr_aaaa as "AAAA Records", rr_cname as "CNAME Records", rr_dhcid as "DHCID Records", rr_dname as "DNAME Records", rr_dnskey as "DNSKEY Records", rr_ds as "DS Records", rr_mx as "MX Records", rr_naptr as "NAPTR Records", rr_nsec as "NSEC Records", rr_nsec3param as "NSEC3PARAM Records", rr_nsec3 as "NSEC3 Records", rr_ns as "NS Records", rr_ptr as "PTR Records", rr_rrsig as "RRSIG Records", rr_soa as "SOA Records", rr_srv as "SRV Records", rr_tlsa as "TLSA Records", rr_txt as "TXT Records", rr_other as "Other Records", rr_lbdn as "LBDN" | eval Timestamp = strftime(_time, "%Y-%m-%d %H:%M:%S %Z") | table Timestamp, Zone, Function, Signed, Hosts, "LBDN", "Total Records", "A Records", "AAAA Records", "CNAME Records", "DHCID Records", "DNAME Records", "DNSKEY Records", "DS Records", "MX Records", "NAPTR Records", "NSEC Records", "NSEC3PARAM Records", "NSEC3 Records", "NS Records", "PTR Records", "RRSIG Records", "SOA Records", "SRV Records", "TLSA Records", "TXT Records", "Other Records"

This query needs to be enhanced to accommodate the new column "rr_unknown" and display the values under this column as individual columns.

p_gurav · ‎01-11-2018

Hi swarjs,

After indexing is "hinf=10|cert=15|ipsec=80|" is coming in one field?

swarjs · ‎01-11-2018

Yes correct. After extraction, "hinf=10|cert=15|ipsec=80|" will be in one field. Need to further process this and display it as separate columns in a table.

p_gurav · ‎01-11-2018

Then you can use field extractor to extract values , Or use rex command "| rex field=(replace field name where it is being captured) "hinf=(?P[^|]+)|cert=(?P[^|]+)|ipsec=(?P[^|]+)""

swarjs · ‎01-11-2018

Hi p_gurav,
Thanks for the inputs. Sorry I have updated more details about my query below. One of the concern is that the last column of my csv is dynamic. ie. there is no fixed pattern for the key value pairs. Only the delimiters are fixed.

Parsing a field in csv having key value pairs and displaying it as header and value in table

CSV content:

Adoption of RUM and APM at Splunk

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!