Splunk App for AWS Billing: Why is a single entry of raw data showing 2 results (count=2 and 200%)?

mjm295
Path Finder

When I do a particular search on a unique record ID, I get 1 piece of raw data back, but some of the fields are reporting 2 entries. I believe this is skewing my results further down the line.

For a particular search:

index=aws-bill  RecordId=39613589688296092585051622

I get exactly 1 event, but when I hover over the BlendedCost field, I see 2 entries: the value is "0.00000170" but the count is 2. Why is this?

Also, when I do this search and show as a chart:

index=aws-bill  RecordId=39613589688296092585051622 | timechart sum(BlendedCost) as $ by showback

I get a bar chart with the value "0.00000340", which is double the BlendedCost.

Where is this coming from? What are my options for getting better results?

1 Solution

woodcock
Esteemed Legend

OK, that explains it: you are telling Splunk to extract JSON fields twice, once at index time (INDEXED_EXTRACTIONS=json) and once at search time (KV_MODE=json). Get rid of the KV_MODE setting.

See this Q&A for a more complete discussion:

http://answers.splunk.com/answers/174939/why-are-my-json-fields-extracted-twice.html
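
For reference, here is a sketch of how the stanza from the posted props.conf might look with the search-time JSON extraction turned off. The other settings are copied from the post unchanged. Setting KV_MODE=none explicitly, rather than just deleting the line, is an assumption on my part; the idea is to keep the search-time default from attempting its own extraction. The search-time setting also has to be in effect wherever the search actually runs.

```
[source::SplunkAppforAWSBilling_Import]
# Index-time JSON extraction stays as posted
INDEXED_EXTRACTIONS=json
# Assumption: disable search-time extraction explicitly instead of deleting the line
KV_MODE=none
TIME_PREFIX=\"UsageStartDate\":
TIME_FORMAT=%Y-%m-%d %H:%M:%S
```

With only one extraction left, the same search should show a count of 1 for BlendedCost, and the timechart sum should no longer be doubled.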


monkee
Path Finder

The latest version, 2.0.9, uses only KV_MODE=json, so it does not produce any duplicates. Thanks to woodcock for the heads up.


woodcock
Esteemed Legend

Your picture is unambiguously clear: it is because your 1 matching event has a multivalued field called BlendedCost with 2 values, both of which are the same: 0.00000170. How is the BlendedCost field created? What is in the raw data (_raw field)?
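
One way to confirm this from the search bar is to count the values in the field. This is only a sketch built on the search from the question; value_count is a name made up for illustration.

```
index=aws-bill RecordId=39613589688296092585051622
| eval value_count=mvcount(BlendedCost)
| table RecordId BlendedCost value_count
```

If value_count comes back as 2 for a single event, the field is multivalued, and sum(BlendedCost) adds both values. As a stopgap until the extraction itself is fixed, adding `| eval BlendedCost=mvindex(BlendedCost, 0)` before the timechart keeps only the first value.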

mjm295
Path Finder

Hi, no, it seems to be a single entry in the raw data: "BlendedCost": "0.00000170"

It is imported from a spreadsheet that comes from AWS billing. BlendedCost is one of the columns in the spreadsheet, and that also has only the single entry.

Raw data is:

{"user:hostname": "awswarsp01", "PricingPlanId": "505699", "user:showback": "IT:Aris", "ProductName": "Amazon Elastic Compute Cloud", "ResourceId": "i-9fdb5ea1", "PayerAccountId": "311971337317", "UsageStartDate": "2015-08-01 00:00:00", "BlendedCost": "0.00000170", "InvoiceID": "Estimated", "ReservedInstance": "N", "RecordType": "LineItem", "RecordId": "39613589688296092585051622", "Operation": "InterZone-Out", "user:Name": "inst-aris-app-01", "SubscriptionId": "28816468", "user:project": "aris design", "ItemDescription": "$0.010 per GB - regional data transfer - in/out/between EC2 AZs or using IPs or ELB", "UnBlendedCost": "0.00000170", "UnBlendedRate": "0.0100000000", "UsageType": "APS2-DataTransfer-Regional-Bytes", "LinkedAccountId": "311971337317", "BlendedRate": "0.0100000000", "user:environment": "production", "UsageQuantity": "0.00016988", "UsageEndDate": "2015-08-01 01:00:00", "RateId": "3510837"}

Mark

woodcock
Esteemed Legend

I didn't say it was in the raw data twice (although that is one way to have a multivalued field created with the same value twice). So now we have half of the pieces of the puzzle; what are your Splunk configurations (particularly inputs.conf, props.conf and transforms.conf)?

mjm295
Path Finder

inputs.conf:

[script:///opt/splunk/etc/apps/SplunkAppforAWSBilling/bin/ProcessDetailedReport.py]
disabled = 0
index = aws-bill
interval = 10800
passAuth = splunk-system-user
source = SplunkAppforAWSBilling_Import
sourcetype = SplunkAppforAWSBilling_Processor

props.conf:

[source::SplunkAppforAWSBilling_Import]
INDEXED_EXTRACTIONS=json
KV_MODE=json
TIME_PREFIX=\"UsageStartDate\":
TIME_FORMAT=%Y-%m-%d %H:%M:%S

transforms.conf:

#######################
# Lookups
#######################

[payer_account_id]
filename = payer_account_id.csv

[linked_account_id]
filename = linked_account_id.csv

mjm295
Path Finder

I am trying to highlight that 2nd search, but I get this error:
"You are only allowed to submit 2 posts per day until you reach 40 points of reputation level."
