How can I read a public dataset from a public S3 bucket?

jdunlea1
Explorer

Is there a way (using the Splunk TA for AWS or otherwise) for Splunk to connect to a publicly available S3 bucket (such as those listed at https://registry.opendata.aws/) and read in the data?

From the Splunk TA, the only buckets that I can read from are those which were created in my account.

1 Solution

jdunlea1
Explorer

After some further digging and testing, it appears that it can be done, but you need to create the input directly in the conf file (the original post linked to the relevant documentation).

The key to ingesting "old" data from a public dataset in S3 is to set initial_scan_datetime to a date that is BEFORE the last-modified time of the files in the S3 bucket. Otherwise the files are treated as older than the scan window and are skipped.

Once I did this, I was able to pull the public dataset into Splunk from the public S3 bucket.
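
To illustrate why the cutoff date matters, here is a minimal Python sketch of the filtering idea. This is not the add-on's actual code: the function names, the sample keys, the timestamp format, and the strict "later than" comparison are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a timestamp such as 2015-01-01T00:00:00Z (format assumed for this sketch)."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def keys_to_ingest(objects, initial_scan_datetime):
    """Return the keys whose last-modified time is later than the cutoff."""
    cutoff = parse_utc(initial_scan_datetime)
    return [key for key, modified in objects if parse_utc(modified) > cutoff]


# Hypothetical listing of a public bucket: (key, last-modified time).
objects = [
    ("data/2016/part-0001.csv", "2016-03-14T09:30:00Z"),
    ("data/2018/part-0002.csv", "2018-11-02T17:45:00Z"),
]

# A cutoff later than every file selects nothing, so nothing would be ingested.
print(keys_to_ingest(objects, "2024-01-01T00:00:00Z"))

# A cutoff earlier than every file selects all of them.
print(keys_to_ingest(objects, "2015-01-01T00:00:00Z"))
```

With a public dataset the files were often last modified years ago, so a recent cutoff matches none of them. Moving the cutoff to before the oldest file's modified time lets every file through.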
