How to LINEMERGE sourcetype="mscs:storage:blob" for Azure blob storage files?

Log_wrangler
Builder

I have installed the Splunk_TA_microsoft-cloudservices app on a heavy forwarder, which routes to multiple indexers in a "distributed search peer" configuration.

The app is working and I can find the events in my search head

for example, index=azure sourcetype = mscs:storage:blob

The files that are sent to the blob storage are in json format.

The problem is that Splunk is not parsing it correctly.

I believe I need to add SHOULD_LINEMERGE somewhere.

I was thinking that I need to add it to "props.conf" on the forwarder in /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local

When I look in .../default I can see a props.conf but there is no stanza for sourcetype = mscs:storage:blob.

Do I create a props.conf in .../local and enter a stanza for sourcetype = mscs:storage:blob and add SHOULD_LINEMERGE = true?

Ideally I want the breaks to occur on the timestamps.

Please advise.

Thank you

0 Karma

jkat54
SplunkTrust

The app ships with a default props.conf. I suggest you download the app again and figure out what is missing in your deployment.

0 Karma

Log_wrangler
Builder

Thank you for the reply.

I must not have explained my question clearly.

I have props.conf in /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/default,

however it does not reference the sourcetype = mscs:storage:blob

I have:

[mscs:storage:table]
KV_MODE = json
TIME_PREFIX = "Timestamp":\s*"
SHOULD_LINEMERGE = false

But I don't have:

[mscs:storage:blob] < where I would set up "SHOULD_LINEMERGE"

The only "blob" references are: (via grep -i blob props.conf)

[source::...splunk_ta_microsoft-cloudservices_storage_blob*.log*]
sourcetype = mscs:storage:blob:log

Before I download again and lose existing work, is there a way I can verify that the following should be in default?

[mscs:storage:blob]
KV_MODE = json
TIME_PREFIX = "Timestamp":\s*"
SHOULD_LINEMERGE = false

OR do I create it in

/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local

because I usually don't edit default as I learned that is a no-no.
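On the default-versus-local question: settings in an app's local directory are layered over default and survive app upgrades, so a new stanza would normally go there. A minimal sketch of that placement follows. The setting values are simply copied from the [mscs:storage:table] stanza quoted above and are an assumption; they only fit blob data that actually carries a "Timestamp" field. Line breaking and timestamp settings take effect where the data is parsed (here, the heavy forwarder), while KV_MODE is a search-time setting and belongs on the search head.

```ini
# Sketch of /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/props.conf
# on the heavy forwarder (index-time parsing happens here).
# Values are assumptions borrowed from the [mscs:storage:table] stanza.
[mscs:storage:blob]
SHOULD_LINEMERGE = false
TIME_PREFIX = "Timestamp":\s*"

# Sketch of a props.conf on the search head (a separate file on a separate host).
# KV_MODE is applied at search time, so it has no effect on the forwarder.
[mscs:storage:blob]
KV_MODE = json
```

Running `splunk cmd btool props list mscs:storage:blob --debug` on each host should show which file every effective setting comes from.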

Thank you

0 Karma

Kendo213
Communicator

I'm having the same issue. There is no sourcetype configured in props.conf for this add-on for storage blobs. I'm having issues coming up with a props.conf that will identify timestamps, etc.

This is what I've been testing:

SHOULD_LINEMERGE = false
LINE_BREAKER = (,[\r\n]+\s+){
SEDCMD-REMOVEFOOTER = s/\s\}\s+\]\s+\}//
SEDCMD-REMOVEHEADER = s/\{\s+"records":\s+\[\s+//
disabled = false
TIME_PREFIX = "time":\s"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%7Q
MAX_TIMESTAMP_LOOKAHEAD = 50

0 Karma

Log_wrangler
Builder

Thanks for your comment, I guess we are in the same boat. Are you testing this in

/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local

with a newly created stanza for

[mscs:storage:blob] ??

0 Karma

Log_wrangler
Builder

Rephrase: Did you have to make any additional changes to inputs.conf?
https://docs.splunk.com/Documentation/AddOns/released/MSCloudServices/Configureinputs5

0 Karma

Kendo213
Communicator

I didn't make any changes, I just did everything via the UI, except this props.conf.

The props.conf I provided sort of works, but it creates that weird delay, and when I preview this props.conf in the UI, the timestamp match doesn't begin at the start of the timestamp; it starts after 2018-07-*.

Let me know what you come up with.

0 Karma

Log_wrangler
Builder

So I am using this (same line breaker as you are)...

[mscs:storage:blob]
SHOULD_LINEMERGE = true
LINE_BREAKER = (,[\r\n]+\s+){
TRUNCATE = 0
KV_MODE = json

and it works for my data
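For comparison, here is a sketch of the same stanza with line merging turned off, assuming the blob is pretty-printed JSON in which records are separated by a comma, a newline, and indentation (the pattern both posters use). With SHOULD_LINEMERGE = false, LINE_BREAKER alone decides the event boundaries: the text matched by the first capture group is discarded, and whatever follows it (here the opening brace) starts the next event.

```ini
# Sketch only; assumes records are separated by ",<newline><indent>{".
[mscs:storage:blob]
# Let LINE_BREAKER define events instead of re-merging lines afterwards.
SHOULD_LINEMERGE = false
# The capture group (comma, newlines, indentation) is dropped at each break.
# The brace is escaped for clarity; it still begins the next event.
LINE_BREAKER = (,[\r\n]+\s+)\{
# 0 disables truncation of long events.
TRUNCATE = 0
```

With SHOULD_LINEMERGE = true, Splunk by default starts a new event at each line in which it finds a timestamp (BREAK_ONLY_BEFORE_DATE). In pretty-printed JSON the timestamp line sits below the record's opening brace, so breaking purely on timestamps tends to leave that brace attached to the previous event.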

but I would like to clean up the trailing

       }
    ]
}

AND remove these

}
,
}

I might ask another json format line break question to see if anyone else has solved this.

But thanks for the help, please convert your comment to an answer and I will accept!
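One possible way to trim the leftover wrapper is SEDCMD, which rewrites the raw event text at index time after line breaking. The sketch below assumes the payload has the shape `{ "records": [ {...}, {...} ] }`, as suggested by the header pattern posted earlier in this thread; the class names after `SEDCMD-` are arbitrary labels chosen for illustration.

```ini
# Sketch only; assumes a {"records": [ ... ]} wrapper around the events.
[mscs:storage:blob]
# Remove the opening wrapper left at the start of the first event.
SEDCMD-strip_records_header = s/^\s*\{\s*"records":\s*\[\s*//
# Remove the closing "] }" left at the end of the last event,
# keeping the record's own closing brace.
SEDCMD-strip_records_footer = s/\}\s*\]\s*\}\s*$/}/
```

The footer pattern would also fire on a record that itself ends with an array of objects, so it is worth checking against real data in the Add Data preview first. SEDCMD only affects data indexed after the change.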

0 Karma

Kendo213
Communicator

Are you seeing any delays in data coming in? With any sort of props.conf for this sourcetype, we start seeing a delay in how often the data gets pulled in, and it's fairly random.

0 Karma

Log_wrangler
Builder

well in my dir /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local
my inputs.conf

looks like this

[mscs_storage_blob://interesting-events]
account = my_azure_logs
blob_list = *
blob_mode = append
collection_interval = 36
container_name = insights-operational-logs

so the interval is reaching out every 36 seconds...

I am not seeing any big problem. If I search "last 15 minutes" I get results about every 2 minutes, but I am using a heavy forwarder to forward to my indexers, which are not clustered but in a distributed peer setup. I hope that helps.

0 Karma

jkat54
SplunkTrust

Did you try downloading older versions of the app and checking props.conf?

0 Karma

jkat54
SplunkTrust

I just checked older copies I have and found that there aren't any props.conf settings for the mscs:storage:blob sourcetype. You'll have to create your own based on the data you're storing in the blob.

http://docs.splunk.com/Documentation/AddOns/released/MSCloudServices/Sourcetypes

We made it easier on ourselves by prepending a string to every event we store in blobs. This way our LINE_BREAKER can be "THIS_IS_OUR_LINE_BREAKER" or whatever we like.
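The marker approach described above might look like the following sketch. It assumes the process writing to the blob puts the marker string at the start of a new line before each event; the marker text is the placeholder from the post, not a Splunk keyword.

```ini
# Sketch only; assumes every event is preceded by a newline and the marker string.
[mscs:storage:blob]
SHOULD_LINEMERGE = false
# Break wherever the marker appears. The capture group (newlines plus the
# marker itself) is discarded, so the marker is not indexed with the event.
LINE_BREAKER = ([\r\n]+THIS_IS_OUR_LINE_BREAKER)
```

Because breaks happen only at the marker, multi-line JSON events stay intact. A marker at the very start of a blob has no preceding newline, so under this pattern it would remain in the first event.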

0 Karma

Kendo213
Communicator

Yeah, and I confirmed with Splunk support (there is a single line in the documentation about this) that the props.conf isn't built for blob because there are too many types of data.

The issue we're encountering now is that with a props.conf configured, data seems about 5-10 minutes delayed. It will ingest about 10 minutes of data, and then approximately 10 minutes later, backfill the next 10 minutes. I was hoping others were seeing the same issue 😕 I've tried two different heavy forwarders, and neither is taxed, so it isn't a resource issue.

0 Karma

Kendo213
Communicator

I am testing it there -- however we saw some really weird behavior.

We have an index cluster, and then some standalone search heads. We're using one of the search heads for this application, it's only installed there, and not on the indexers. Once we had a props.conf in the directory you mentioned, we get really random / slow indexing of cloud services inputs. For example, it may work for a bit, and then slow down to the point of perhaps only once every 15 minutes. I'm interested to see how far you get with it.

0 Karma

Log_wrangler
Builder

Ok thanks. Yes, I decreased the interval on inputs to 30 seconds for testing; then I will dial it back up to 3600 (1 hr).

I wish I could just use AWS, for the last few years every time I have to deal with Azure there is a problem...

0 Karma