Good evening all,
I have been fighting with the Reddit API in Splunk today and was hoping someone could help me out. I am trying to pull the topics, comments, and replies from all of a user's subscribed subreddits. I have been able to make the user's /me GET request, so the tokens are authenticating.
Thanks all,
Denym
The only way I can get a user's subreddits is by going to this endpoint:
https://api.reddit.com/user/{username}
then parsing that response's JSON with a for loop to get the permalinks to their subreddits, and then requesting those permalinks. It looks like this in Python:
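import requests

# Reddit asks for a descriptive User-Agent; this value is only a placeholder.
headers = {'User-Agent': 'splunk-reddit-test/0.1'}
req = requests.get('https://api.reddit.com/user/knightfang', headers=headers)
listing = req.json()
for child in listing['data']['children']:
    # The permalink already starts with '/', so append it to the bare host.
    r = requests.get('https://api.reddit.com' + str(child['data']['permalink']), headers=headers)
    j = r.json()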
The issue is that you have to look at the user to get their subreddits, and then look at the individual subreddits to get the comments. So I don't see the Modular REST API helping much here...
Thank you, I was starting to think a targeted crawler would be an easier way to build the user's data set.
I converted to an answer because I don't think you will solve this otherwise.
Are they your own subreddits, or those of any user on Reddit?
A configuration list for the REST plug-in would be helpful. I am not sure whether my current settings are actually refreshing the token or just running until the original token times out. Also, the event data I am pulling in is arriving unstructured, using the request URL https://oauth.reddit.com/r/news/new (an attempt to pull just the new /r/news topics as a starting point).
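One way to check the refresh step outside Splunk is to build the same request by hand. The sketch below is only an illustration, assuming Reddit's standard OAuth2 refresh flow: a form-encoded POST to the token refresh URL from the settings, with the client ID and secret sent as HTTP Basic credentials. The function name and the User-Agent value are hypothetical, and the request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request

# Token refresh URL as given in the input settings in this thread.
TOKEN_URL = "https://www.reddit.com/api/v1/access_token"


def build_refresh_request(client_id, client_secret, refresh_token,
                          user_agent="splunk-reddit-test/0.1"):
    """Build (but do not send) a refresh-token request.

    The function name and the default user_agent are placeholders.
    """
    # Form-encoded body; urlencode joins the fields with '&'.
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    # Client ID and secret are sent as HTTP Basic credentials.
    credentials = "%s:%s" % (client_id, client_secret)
    basic = base64.b64encode(credentials.encode("ascii")).decode("ascii")
    req = urllib.request.Request(TOKEN_URL, data=body, method="POST")
    req.add_header("Authorization", "Basic " + basic)
    req.add_header("User-Agent", user_agent)
    return req


req = build_refresh_request("CLIENT_ID", "CLIENT_SECRET", "REFRESH_TOKEN")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Sending it with urllib.request.urlopen(req) and reading the JSON reply should show a new access_token and an expires_in value if the refresh works. If this succeeds by hand but the input stops returning data after the first token expires, the input's refresh settings are the likelier culprit.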
I want to pull the subreddits that the user has subscribed to.
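For the authenticated user's own subscriptions, Reddit's OAuth API also documents a /subreddits/mine/subscriber listing (it needs the mysubreddits scope on the token). A minimal sketch, assuming that endpoint and a standard Listing response, of turning the result into one /new URL per subreddit; the function name and the sample data are made up for illustration:

```python
# Listing of the authenticated user's subscribed subreddits
# (assumes the token was granted the "mysubreddits" scope).
SUBSCRIBED_URL = "https://oauth.reddit.com/subreddits/mine/subscriber"


def subreddit_new_urls(listing):
    """Turn a subreddit Listing into one /r/<name>/new URL per subreddit."""
    urls = []
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t5":  # t5 is Reddit's type prefix for subreddits
            continue
        name = child["data"]["display_name"]
        urls.append("https://oauth.reddit.com/r/%s/new" % name)
    return urls


# Hand-written sample in the shape of a Listing response (not real output).
sample = {"kind": "Listing", "data": {"children": [
    {"kind": "t5", "data": {"display_name": "news"}},
    {"kind": "t5", "data": {"display_name": "Splunk"}},
]}}
print(subreddit_new_urls(sample))
```

Listings are paginated, so a user with many subscriptions would need follow-up requests using the "after" value from each response. This only covers the token owner's subscriptions, not those of an arbitrary user.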
Using the following settings, I am getting title and link data (example output below):
Endpoint URL: https://oauth.reddit.com/r/news/new
HTTP Method: (no value shown; defaults to GET)
Authentication Type: (no value shown)
OAUTH 2 Token Type: (no value shown; defaults to "Bearer")
OAUTH 2 Access Token: TOKEN
OAUTH 2 Refresh Token: REFRESH_TOKEN
OAUTH 2 Token Refresh URL: https://www.reddit.com/api/v1/access_token
OAUTH 2 Token Refresh Properties: grant_type=refresh_token&refresh_token=72173017-lOP4Hi9dsgYImqOlv-yKGgIRgIg (help text: key=value,key2=value2)
OAUTH 2 Client ID: CLIENT_ID
OAUTH 2 Client Secret: Client_Secret
HTTP Header Properties: (no value shown; format key=value,key2=value2)
URL Arguments: (no value shown; format key=value,key2=value2)
Response Type: (no value shown; defaults to text)
Response Handler: (no value shown; Python classname, defaults to DefaultResponseHandler)
Response Handler Arguments: (no value shown; format key=value,key2=value2)
Response Filter Pattern: (no value shown; optional Python regex, the response is indexed only if it matches)
Streaming Request?: (no value shown; defaults to false)
Index Error Responses: (no value shown; defaults to false)
HTTP Proxy Address: (no value shown)
HTTPs Proxy Address: (no value shown)
Request Timeout: (no value shown; defaults to 30 seconds)
Backoff Time: (no value shown; defaults to 10 seconds)
Polling Interval: (no value shown; seconds or CRON format, defaults to 60 seconds)
Run multiple requests sequentially?: (no value shown; defaults to false, i.e. run in parallel)
Sequential Stagger Time: (no value shown; defaults to 0)
Delimiter: (no value shown; defaults to ',')
Source type: _json
{"kind": "Listing", "data": {"modhash": null, "children": [{"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qk8w", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "That_lonely", "saved": false, "mod_reports": [], "name": "t3_64qk8w", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "theguardian.com", "hidden": false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qk8w/united_where_the_customer_is_always_wrong/", "num_reports": null, "locked": false, "stickied": false, "created": 1491943231.0, "url": "https://www.theguardian.com/world/2017/apr/11/united-airlines-boss-oliver-munoz-says-passenger-belli...", "author_flair_text": null, "quarantine": false, "title": "United: where the customer is always wrong", "created_utc": 1491914431.0, "distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qir9", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "JoeyFlynn", "saved": false, "mod_reports": [], "name": "t3_64qir9", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "news.sky.com", "hidden": false, "thumbnail": "", "subreddit_id": 
"t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qir9/russia_us_ties_are_the_worst_since_cold_war/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942689.0, "url": "http://news.sky.com/story/russia-says-us-ties-are-worst-since-cold-war-ahead-of-tillerson-visit-1083...", "author_flair_text": null, "quarantine": false, "title": "Russia: US ties are the 'worst since Cold War'", "created_utc": 1491913889.0, "distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qidh", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "Sacmo77", "saved": false, "mod_reports": [], "name": "t3_64qidh", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "wavy.com", "hidden": false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qidh/one_dead_two_injured_in_house_fire_in_virginia/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942547.0, "url": "http://wavy.com/2017/04/09/one-dead-two-injured-in-house-fire-in-virginia-beach/", "author_flair_text": null, "quarantine": false, "title": "One dead, two injured in house fire in Virginia Beach", "created_utc": 1491913747.0, "distinguished": null, "media": 
null, "num_comments": 1, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qick", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "dtha_", "saved": false, "mod_reports": [], "name": "t3_64qick", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "bloomberg.com", "hidden": false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qick/dogbite_claims_surge_18_as_children_bear_brunt_of/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942539.0, "url": "https://www.bloomberg.com/news/articles/2017-04-10/dog-bite-claims-surge-18-as-children-bear-brunt-o...", "author_flair_text": null, "quarantine": false, "title": "Dog-Bite Claims Surge 18% as Children Bear Brunt of Attacks", "created_utc": 1491913739.0, "distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qhy1", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 8, "report_reasons": null, "author": "halloweenie2", "saved": false, "mod_reports": [], "name": "t3_64qhy1", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "mlive.com", "hidden": 
false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qhy1/professional_fundraisers_pocket_61_of_michigan/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942390.0, "url": "http://www.mlive.com/news/index.ssf/2017/04/professional_fundraisers_pocke_1.html", "author_flair_text": null, "quarantine": false, "title": "Professional fundraisers pocket 61% of Michigan charitable donations", "created_utc": 1491913590.0, "distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 8}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qhvk", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "dwergje48", "saved": false, "mod_reports": [], "name": "t3_64qhvk", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "nltimes.nl", "hidden": false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qhvk/dutch_ships_linked_to_human_trafficking_ring/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942370.0, "url": "http://nltimes.nl/2017/04/11/dutch-ships-linked-human-trafficking-ring", "author_flair_text": null, "quarantine": false, "title": "Dutch ships linked to human trafficking ring", "created_utc": 1491913570.0, 
"distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html": null, "selftext": "", "likes": null, "suggested_sort": null, "user_reports": [], "secure_media": null, "link_flair_text": null, "id": "64qh7h", "gilded": 0, "secure_media_embed": {}, "clicked": false, "score": 1, "report_reasons": null, "author": "Consiliarius", "saved": false, "mod_reports": [], "name": "t3_64qh7h", "subreddit_name_prefixed": "r/news", "approved_by": null, "over_18": false, "domain": "reuters.com", "hidden": false, "thumbnail": "", "subreddit_id": "t5_2qh3l", "edited": false, "link_flair_css_class": null, "author_flair_css_class": null, "downs": 0, "brand_safe": true, "archived": false, "removal_reason": null, "is_self": false, "hide_score": true, "spoiler": false, "permalink": "/r/news/comments/64qh7h/reuters_north_korea_state_media_warns_of_nuclear/", "num_reports": null, "locked": false, "stickied": false, "created": 1491942147.0, "url": "http://www.reuters.com/article/us-northkorea-nuclear-idUSKBN17D0A4", "author_flair_text": null, "quarantine": false, "title": "Reuters: North Korea state media warns of nuclear strike if provoked as U.S. warships approach", "created_utc": 1491913347.0, "distinguished": null, "media": null, "num_comments": 0, "visited": false, "subreddit_type": "public", "ups": 1}}, {"kind": "t3", "data": {"contest_mode": false, "banned_by": null, "media_embed": {}, "subreddit": "news", "selftext_html":
Ideally I would like to be able to pull in formatted data that can display along the lines of:
created_at: TIME
author: /u/USERNAME
title: TITLE OF POST ("comment" if not a topic)
text: CONTENT (for comments/replies)
links: EXTERNAL LINKS (as seen above, news article links to external sites)
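The layout above can be produced by flattening each child of the Listing into one small record before indexing. The sketch below is one possible approach, not the plug-in's own behavior: the function name is hypothetical, the post in the sample is copied from the output above, and the comment entry (including its author and text) is invented to show the "comment" case. It assumes posts are kind t3 with title, selftext, and url fields, and comments are kind t1 with a body field.

```python
import json
from datetime import datetime, timezone


def flatten_listing(listing):
    """Reduce a Reddit Listing to records with the five fields wanted above."""
    records = []
    for child in listing.get("data", {}).get("children", []):
        data = child.get("data", {})
        is_post = child.get("kind") == "t3"  # t3 = topic/link, t1 = comment
        created = datetime.fromtimestamp(data["created_utc"], tz=timezone.utc)
        records.append({
            "created_at": created.isoformat(),
            "author": "/u/" + str(data.get("author")),
            "title": data.get("title") if is_post else "comment",
            "text": data.get("selftext", "") if is_post else data.get("body", ""),
            "links": data.get("url") if is_post else None,
        })
    return records


sample = {"kind": "Listing", "data": {"children": [
    # Post fields copied from the example output in this thread.
    {"kind": "t3", "data": {
        "author": "Sacmo77",
        "title": "One dead, two injured in house fire in Virginia Beach",
        "selftext": "",
        "url": "http://wavy.com/2017/04/09/one-dead-two-injured-in-house-fire-in-virginia-beach/",
        "created_utc": 1491913747.0}},
    # Invented comment entry, only to exercise the non-topic branch.
    {"kind": "t1", "data": {
        "author": "example_user",
        "body": "Example reply text",
        "created_utc": 1491913800.0}},
]}}

# One JSON object per line, so each record can be indexed as its own event.
for record in flatten_listing(sample):
    print(json.dumps(record))
```

Logic like this could live in a custom response handler for the REST input or in a standalone script; either way, emitting one small JSON object per event is what lets the _json source type extract the fields.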