First, there are a number of Splunk Apps that will help you bring in and visualize information from the devices you mentioned. I'd start there, using the appropriate F5, Juniper, and Cisco apps. All are on http://splunkbase.splunk.com
Those apps won't help with the end-to-end monitoring you're asking about (which I'll get to as well) but what they will do is provide deeper insight into the operation of those different products.
Most (or all) of those devices also support sending information via syslog. Typically this will include hardware level events, logs of configuration changes, and so on. This is generally low volume data but worth looking at. I'd certainly want Splunk to open a ServiceNow ticket whenever a device reports a fan failure or loss of a redundant power supply!
With all of that being said, it still doesn't address your need: "I'm looking to identify top talkers and want to be alerted when a network link is approaching full capacity. When a switch port is experiencing degraded service, I want to know which servers and applications are affected."
What you'll want is the network meta-information that's available from each of your devices. Commonly called "flow" data, this is the data set that will help you answer your question. This comes in many flavors, including:
- NetFlow v5 or v9 data from older Cisco gear
- J-Flow data from your Juniper equipment
- sFlow (sampled flow) data from your switches
- IPFIX data from your VMware virtual switches and pretty much every other intelligent networking device released since 2015
Note that all of these network flow types are in binary format; Splunk cannot ingest them directly.
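To give a feel for what "binary format" means here, below is a minimal Python sketch that decodes just the 24-byte header of a NetFlow v5 export datagram. The function and variable names are my own for illustration, and the sample datagram is hand-built rather than captured from a real device. A real collector would also have to decode the flow records that follow the header, and v9/IPFIX use templates that make this considerably more involved.

```python
import struct
from collections import namedtuple

# NetFlow v5 export header: 24 bytes, network (big-endian) byte order.
V5_HEADER = struct.Struct("!HHIIIIBBH")

V5Header = namedtuple(
    "V5Header",
    "version count sys_uptime unix_secs unix_nsecs "
    "flow_sequence engine_type engine_id sampling_interval",
)


def parse_v5_header(datagram: bytes) -> V5Header:
    """Decode the header at the start of a NetFlow v5 datagram."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram too short for a NetFlow v5 header")
    header = V5Header(*V5_HEADER.unpack_from(datagram))
    if header.version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version={header.version})")
    return header


# Hand-built sample header (illustrative values only): version 5, 2 flow records.
sample = V5_HEADER.pack(5, 2, 123456, 1700000000, 0, 42, 0, 0, 0)
header = parse_v5_header(sample)
print(header.version, header.count)
```

This is the kind of decoding work that has to happen somewhere before the data is useful in Splunk.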
Wikipedia has a great write-up on each of these for those who are interested. The TLDR version: NetFlow was invented by Cisco, and other vendors had their own versions. IPFIX replaces them as a common, universal standard.
On Splunkbase, you'll find TAs (technology add-ons) from Splunk, including one for IPFIX data and one for NetFlow v5/v9 data. They'd help you bring in some of the data, but would not address your Juniper devices or the sampled flow data from your switches.
I believe that to accomplish what you're after, you'll want to use NetFlow Logic's "NetFlow Integrator". (See their app at https://splunkbase.splunk.com/app/489/)
How it works:
First, NetFlow Integrator acts as a sort of middleware. It takes in all of the different flow data types and converts them from binary format. When Integrator sees data coming in, it reaches back to the sending network device to do some SNMP-based data collection. This allows Integrator to determine data such as port speed and duplex, and other device information.
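Port speed matters because it turns a raw byte count into a "how full is this link" number. Here's a rough sketch of that arithmetic in Python. It is not how NetFlow Integrator implements it; the function names and the 80% threshold are assumptions I picked for illustration.

```python
ALERT_THRESHOLD = 0.80  # hypothetical: alert at 80% of link capacity


def link_utilization(byte_count: int, interval_s: float, port_speed_bps: int) -> float:
    """Return utilization as a fraction of the port's capacity.

    byte_count     -- bytes seen on the link during the interval (from flow data)
    interval_s     -- length of the measurement interval in seconds
    port_speed_bps -- port speed in bits per second (e.g. learned via SNMP)
    """
    if interval_s <= 0 or port_speed_bps <= 0:
        raise ValueError("interval and port speed must be positive")
    return (byte_count * 8) / (interval_s * port_speed_bps)


def is_near_capacity(utilization: float, threshold: float = ALERT_THRESHOLD) -> bool:
    """True when the link is at or above the alert threshold."""
    return utilization >= threshold


# Example: 6.75 GB observed in 60 seconds on a 1 Gb/s port is 90% of capacity.
util = link_utilization(6_750_000_000, 60, 1_000_000_000)
print(f"{util:.0%}", is_near_capacity(util))
```

Without the port speed, the same byte count could mean a saturated 1 Gb/s link or a nearly idle 10 Gb/s one.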
What Integrator does next is up to you. It can send each flow record to Splunk (converted to syslog format) or send aggregated information periodically, or both. Sending the aggregated data is the best fit for most Splunk environments. Flow data can be VERY high volume, and aggregation allows you to keep your Splunk license usage low.
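To illustrate why aggregation cuts the volume so much, here's a small Python sketch that rolls individual flow records up into a per-source "top talkers" summary and formats each result as a key=value line. The record fields (src_ip, bytes) and the output format are hypothetical, chosen for the example; they are not NetFlow Integrator's actual schema or syslog layout.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Sum bytes per source IP and return the n largest (src_ip, bytes) pairs."""
    totals = Counter()
    for flow in flows:
        totals[flow["src_ip"]] += flow["bytes"]
    return totals.most_common(n)


def to_kv_line(src_ip, total_bytes):
    """Format one summary row as a key=value line (hypothetical layout)."""
    return f"src_ip={src_ip} bytes={total_bytes}"


# Made-up flow records standing in for one reporting interval.
flows = [
    {"src_ip": "10.0.0.5", "bytes": 4000},
    {"src_ip": "10.0.0.9", "bytes": 1500},
    {"src_ip": "10.0.0.5", "bytes": 3000},
    {"src_ip": "10.0.0.7", "bytes": 200},
]

for src_ip, total in top_talkers(flows, n=2):
    print(to_kv_line(src_ip, total))
```

Four records in, two summary lines out. On a busy network the ratio is far more dramatic, which is where the license savings come from.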
Once the data gets to Splunk, you'll finally have your answer. NetFlow Logic has apps on Splunkbase that use the Splunk platform to tie all of the data together. This includes reports on top talkers, network utilization/health/saturation, and traffic flows affected by networking issues. It does this even with your VMware switches, top of rack devices, and (as I saw at VMworld this week) VMware NSX.
I hope that helps point you in the right direction!
While we're talking about the network, I'll mention "Splunk App for Stream" as well. It wouldn't help with the use case you asked about, but it's worth knowing about. Stream allows you to look at the application protocol level to analyze communication between servers via TCP or UDP. When the log files don't give the information you want, Stream allows you to bring in data for both IT Ops and Security use cases.