Our indexers are configured to use an S3-compatible remote path, and we're seeing lots of 400 status codes returned when Splunk makes calls to S3.
All of the failing URIs contain the parameter "&delimiter=guidSplunk". URIs that do not have "delimiter=guidSplunk" all have the "versions" parameter with no value, and those succeed. For example:
Failing URI:
/secsplunk-idx-sysmon?max-keys=1000&prefix=sysmon%2Fdma%2F55%2Fe5%2F1595~BED2107F-430E-49FC-8449-949FA7F70D51&delimiter=guidSplunk
Succeeding URI:
/secsplunk-idx-sysmon?versions&max-keys=1000&prefix=sysmon%2Fdb%2F55%2Fe5%2F1595~BED2107F-430E-49FC-8449-949FA7F70D51%2FguidSplunk-BED2107F-430E-49FC-8449-949FA7F70D51%2Frawdata%2Fslicesv2.dat
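To spot this pattern in access logs, a small script can split each request URI into its query parameters and flag the ones that carry a multi-character delimiter. This is only a rough sketch: the helper names `query_params` and `uses_string_delimiter` are made up for illustration, and the two URIs are the ones quoted above.

```python
from urllib.parse import urlsplit, parse_qs

FAILING = (
    "/secsplunk-idx-sysmon?max-keys=1000"
    "&prefix=sysmon%2Fdma%2F55%2Fe5%2F1595~BED2107F-430E-49FC-8449-949FA7F70D51"
    "&delimiter=guidSplunk"
)
SUCCEEDING = (
    "/secsplunk-idx-sysmon?versions&max-keys=1000"
    "&prefix=sysmon%2Fdb%2F55%2Fe5%2F1595~BED2107F-430E-49FC-8449-949FA7F70D51"
    "%2FguidSplunk-BED2107F-430E-49FC-8449-949FA7F70D51%2Frawdata%2Fslicesv2.dat"
)


def query_params(uri):
    """Return the query parameters of a request URI as a dict of lists."""
    # keep_blank_values=True keeps valueless flags such as "versions".
    return parse_qs(urlsplit(uri).query, keep_blank_values=True)


def uses_string_delimiter(uri):
    """True when the request carries a delimiter longer than one character."""
    delimiter = query_params(uri).get("delimiter", [""])[0]
    return len(delimiter) > 1


if __name__ == "__main__":
    for label, uri in (("failing", FAILING), ("succeeding", SUCCEEDING)):
        params = query_params(uri)
        print(label, sorted(params), uses_string_delimiter(uri))
```

Running it against a full log would show whether every 400 response lines up with a request where `uses_string_delimiter` is true.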
The resolution was: string delimiters are working, but it appears that there is a bug open for this issue:
http://tracker.ceph.com/issues/24821
The issue turned out to be that the S3-compatible storage does not support a delimiter string, only a single delimiter character, so Splunk needs a configuration change to disable the delimiter.
With the delimiter disabled, Splunk has to parse through many more object names to find buckets, so object listings will be less efficient. This is one of the limitations of not having full AWS S3 compatibility.
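To illustrate why listing without the delimiter is more work, here is a simplified in-memory model of how an S3 list request rolls keys up under a delimiter. It is not Splunk's code and not a real S3 client; the function `list_objects`, the second bucket ID (1596) and the file names other than slicesv2.dat are assumptions made for the example.

```python
GUID = "BED2107F-430E-49FC-8449-949FA7F70D51"

# Hypothetical object keys, modelled on the path layout in the URIs above.
KEYS = [
    f"sysmon/db/55/e5/1595~{GUID}/guidSplunk-{GUID}/rawdata/slicesv2.dat",
    f"sysmon/db/55/e5/1595~{GUID}/guidSplunk-{GUID}/rawdata/journal.zst",
    f"sysmon/db/55/e5/1595~{GUID}/receipt.json",
    f"sysmon/db/aa/bb/1596~{GUID}/guidSplunk-{GUID}/rawdata/slicesv2.dat",
    f"sysmon/db/aa/bb/1596~{GUID}/guidSplunk-{GUID}/rawdata/journal.zst",
    f"sysmon/db/aa/bb/1596~{GUID}/receipt.json",
]


def list_objects(keys, prefix="", delimiter=None):
    """Model an S3 list call: return (contents, common_prefixes)."""
    contents, common_prefixes, seen = [], [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        if delimiter:
            # Look for the delimiter only in the part after the prefix.
            idx = key.find(delimiter, len(prefix))
            if idx != -1:
                # Every key sharing this rolled-up prefix is reported once.
                rolled = key[: idx + len(delimiter)]
                if rolled not in seen:
                    seen.add(rolled)
                    common_prefixes.append(rolled)
                continue
        contents.append(key)
    return contents, common_prefixes


if __name__ == "__main__":
    with_delim = list_objects(KEYS, prefix="sysmon/db/", delimiter="guidSplunk")
    without_delim = list_objects(KEYS, prefix="sysmon/db/")
    print("entries with delimiter:   ", len(with_delim[0]) + len(with_delim[1]))
    print("entries without delimiter:", len(without_delim[0]))
```

In this model each bucket collapses to a single rolled-up entry when the delimiter is used, while the listing without it returns every object key, and that gap grows with the number of files per bucket.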
In server.conf
remote.s3.use_delimiter = true | false
* Optional.
* Specifies whether a delimiter (currently "guidSplunk") should be used to list the objects that are present on the remote storage.
* A delimiter groups objects that have the same delimiter value so that the listing process can be more efficient, as it does not need to report similar objects.
* Defaults to: true