REST API Modular Input: Duplicates when fetching data from Twitter

bkarambelkar
New Member

I'm pulling tweets from a list using the /lists/statuses API:
https://dev.twitter.com/rest/reference/get/lists/statuses

However, the events are being duplicated, and I think this is because I have no way to specify the since_id param in the request URL.
Ideally, when tweets are fetched, I would like to store the max(id_str) value somewhere and pass it as a request param on the next invocation.
How can I accomplish this?

Also, does the module support arrays, i.e. does it decompose a JSON array into individual events? Currently I'm passing count=1, but ideally I would like to pass count=100 (the max allowed) so that I can pull in more than one tweet per call.

Damien_Dallimor
Ultra Champion

I think the Twitter response JSON format may have changed since I wrote that response handler.

Try this instead:

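import json

class TwitterEventHandler:

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):

        if response_type == "json":
            # the endpoint returns a JSON array of tweets
            output = json.loads(raw_response_output)
            last_tweet_indexed_id = 0
            for twitter_event in output:
                # emit each tweet as its own event
                # (print_xml_stream is the helper expected in the same module)
                print_xml_stream(json.dumps(twitter_event))
                if "id_str" in twitter_event:
                    # id_str is a string, so convert it before comparing
                    tweet_id = int(twitter_event["id_str"])
                    if tweet_id > last_tweet_indexed_id:
                        last_tweet_indexed_id = tweet_id

            # only move since_id forward when this poll returned tweets
            if last_tweet_indexed_id > 0:
                if "params" not in req_args:
                    req_args["params"] = {}
                req_args["params"]["since_id"] = str(last_tweet_indexed_id)

        else:
            print_xml_stream(raw_response_output)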
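
One detail worth checking when tracking the newest tweet: id_str arrives as a string, so the ids should be compared as numbers rather than as text. Here is a minimal standalone sketch of that step, outside the modular input. The function name next_since_id and the sample tweets are made up for illustration and are not part of the app.

```python
import json


def next_since_id(raw_json, previous_since_id=0):
    """Return the largest tweet id found in a JSON array of tweets.

    Hypothetical helper for illustration only. Falls back to
    previous_since_id when the array is empty or has no id_str fields.
    """
    tweets = json.loads(raw_json)
    newest = previous_since_id
    for tweet in tweets:
        if "id_str" in tweet:
            # convert to int so "100" is treated as larger than "99"
            newest = max(newest, int(tweet["id_str"]))
    return newest


# made-up sample payload shaped like an array of tweets
sample = json.dumps([
    {"id_str": "99", "text": "older"},
    {"id_str": "100", "text": "newer"},
])

print(next_since_id(sample))        # 100
print(next_since_id("[]", 100))     # 100 (empty poll keeps the old value)
print(max("99", "100"))             # 99 (string comparison picks the wrong id)
```

Keeping the previous value on an empty poll matters because a since_id of 0 would cause the next request to return tweets that were already indexed.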

Damien_Dallimor
Ultra Champion

The app does come with an example custom response handler for Twitter.

Look at TwitterEventHandler in rest_ta/bin/responsehandlers.py.

bkarambelkar
New Member

Thanks for the pointer, Damien, but setting the Response Handler to TwitterEventHandler produces no events in the index. With the DefaultEventHandler, at least, the index was being populated.

Here are my settings:
Endpoint URL : https://api.twitter.com/1.1/lists/statuses.json
URL Arguments : slug=XXXXXX,owner_screen_name=XXXXXX,count=100
Response Type : json
Response Handler : TwitterEventHandler
Stream Request : Checked
Source Type : From list / _json

Am I missing something? I even checked Index Error Response, but there is still nothing in the index.

Thanks for helping out.
