I'm wondering if there are any best practices (or reference architectures) for running Splunk against an Azure (or other cloud) solution that has, for example, multiple web servers and a very large number of worker nodes. There could also be any number of these deployments, so essentially LOTS of cloud VM instances. All the logs are automatically transferred to Azure Table Storage.
We don't want to have to transfer all this data on-premises, as it could get a little unwieldy.
Would the best approach be to run Splunk on a VM in the cloud and have it download the logs to local storage? This could be problematic if the VM were recycled, as the local storage could (and eventually will) get wiped...
Appreciate any guidance.
Thanks,
Dave
Quite some time has passed since this was originally posted, but here are some pointers:
To monitor Azure VMs, in many cases you do not need to install and configure a Universal Forwarder (UF) on each node. Instead, you can rely on Azure Diagnostics data (performance metrics, logs, etc.), which is collected out of the box and stored in an Azure Storage account. You can then ingest this data into Splunk (whether on-prem or on Azure) in various ways, including the Splunk Add-on for Microsoft Azure.
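To make the "ingest from storage" idea concrete, here is a rough Python sketch of one possible route: turning rows already read from a WAD log table into event payloads for Splunk's HTTP Event Collector (HEC). It is only an illustration, not how any particular add-on works. The function names, the index and sourcetype values, and the exact Timestamp format are assumptions of mine, and actually reading the table and POSTing to HEC are left out.

```python
import json
from datetime import datetime, timezone

# Assumption: each row is a dict shaped like a WADLogsTable entity, with a
# "Timestamp" in the form "YYYY-MM-DDTHH:MM:SSZ" (UTC). Real rows may carry
# fractional seconds and would need a more tolerant parser.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def wad_row_to_hec_event(row, index="azure_diag", sourcetype="azure:wad:logs"):
    """Map one table row to an HEC event payload (index/sourcetype are hypothetical)."""
    ts = datetime.strptime(row["Timestamp"], TS_FORMAT).replace(tzinfo=timezone.utc)
    return {
        "time": ts.timestamp(),                      # epoch seconds
        "host": row.get("RoleInstance", "unknown"),  # instance that wrote the row
        "source": "WADLogsTable",
        "sourcetype": sourcetype,
        "index": index,
        "event": {k: v for k, v in row.items() if k != "Timestamp"},
    }


def build_hec_body(rows, checkpoint):
    """Return (request body, new checkpoint) for rows newer than the checkpoint.

    Timestamps in the fixed format above sort lexicographically, so plain
    string comparison is enough to find the rows not yet ingested.
    """
    fresh = sorted(
        (r for r in rows if r["Timestamp"] > checkpoint),
        key=lambda r: r["Timestamp"],
    )
    body = "\n".join(json.dumps(wad_row_to_hec_event(r)) for r in fresh)
    next_checkpoint = fresh[-1]["Timestamp"] if fresh else checkpoint
    return body, next_checkpoint
```

The body would be sent to the collector's event endpoint with a `Authorization: Splunk <token>` header. The checkpoint should be kept somewhere durable (not the VM's local disk), so a recycled instance can resume where it left off.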
I just wrote an Azure Diagnostics App for Splunk and submitted it to Splunkbase yesterday for approval. I tested it on both Windows and Linux. It pulls the Azure diagnostics data from the Azure WAD tables and populates the Splunk indexes with it. Currently it doesn't do any grooming of the Azure tables, but that is something I plan on adding later. It can run on- or off-premises; some due diligence is needed to determine what makes the most sense in different scenarios (paying for instances vs. paying for data transfers). If you do decide to give it a try, do let me know. I'd love to hear some feedback.
Thanks,
Michel
There has definitely been some work done on this before, and I believe Splunk's SEs have used Amazon-based instances for demos from time to time.
The following would be a good starting point:
Setting up Universal Forwarders on each node should work just fine. However, since the Azure diagnostics logs may contain additional information, I want to have this data indexed as well. Did you figure out the best way to forward Azure diagnostics logs to a Splunk indexer (on-prem or on Azure)?
Thanks. I did watch a couple of those. After doing further research I believe using Universal Forwarders (I wasn't aware of these when posting) on each node and hosting one or more indexers in the cloud is the way to go. Would certainly appreciate any comments on this approach though.
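For what it's worth, with many identical nodes the forwarder side mostly comes down to shipping the same outputs.conf to every instance. Below is a minimal Python sketch that renders such a file for a list of cloud-hosted indexers. The helper name, the group name, and the hostnames are made up for illustration; 9997 is just the conventional receiving port and has to match whatever the indexers actually listen on.

```python
def render_outputs_conf(indexers, group="cloud_indexers"):
    """Render a basic outputs.conf for a Universal Forwarder.

    indexers: list of (host, port) tuples for the receiving indexers.
    group: target group name (hypothetical; any label works).
    """
    if not indexers:
        raise ValueError("at least one indexer is required")
    servers = ", ".join(f"{host}:{port}" for host, port in indexers)
    return (
        "[tcpout]\n"
        f"defaultGroup = {group}\n"
        "\n"
        f"[tcpout:{group}]\n"
        f"server = {servers}\n"
    )


if __name__ == "__main__":
    # Hypothetical indexer hostnames inside the cloud deployment.
    print(render_outputs_conf([
        ("idx1.example.internal", 9997),
        ("idx2.example.internal", 9997),
    ]))
```

Listing more than one server in the group lets the forwarder spread data across the indexers, which helps when the number of worker nodes grows.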