Every since we upgraded the O365 Reporting app to 1.1.0 we have not been getting Message Trace Logs. I upgraded this morning to 1.1.3, and still the same thing. The error below shows in the logs when it tries to search.
For the below example i set it to Continuously Monitor, with a start date of 8/3/2018. However when the query runs it sets the start date to 7/5. I have tried multiple combinations of dates and the start and end date are always 7/5 when set to continuously monitor.
2018-08-04 06:06:23,695 DEBUG pid=13857 tid=MainThread file=connectionpool.py:_new_conn:809 | Starting new HTTPS connection (1): reports.office365.com
2018-08-04 06:06:24,454 DEBUG pid=13857 tid=MainThread file=connectionpool.py:_make_request:400 | https://reports.office365.com:443 "GET /ecp/reportingwebservice/reporting.svc/MessageTrace?$filter=StartDate%20eq%20datetime'2018-07-05T00:31:38Z'%20and%20EndDate%20eq%20datetime'2018-07-05T01:31:38Z' HTTP/1.1" 200 216
What i did notice is that if i select "Index once" in the config and specify a time period, it pulls logs in fine. However when switched back to "Continuously Monitor" it breaks again. Any ideas?
It seems that there is a problem with number of messages returned (it was over 10 000 messages - I missed this because we recently migrated more people). I changed query windows to 30 minutes and it started working again.
Worked and again stopped working. No it got messages and then suddenly some error occurs (messages are not indexed):
2018-09-10 22:57:16,969 DEBUG pid=2213 tid=MainThread file=base_modinput.py:log_debug:286 | Number of messages returned: 10137
2018-09-10 22:57:16,969 DEBUG pid=2213 tid=MainThread file=base_modinput.py:log_debug:286 | Max date before getting message: 2018-09-10 07:11:01.148852
2018-09-10 22:57:19,645 DEBUG pid=2213 tid=MainThread file=base_modinput.py:log_debug:286 | Max date after getting messages: 2018-09-10 08:10:57.876969
2018-09-10 22:57:19,645 DEBUG pid=2213 tid=MainThread file=binding.py:post:736 | POST request to https://127.0.0.1:8089/servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Re... (body: {'body': '[{"state": "{\\"max_date\\": \\"2018-09-10 08:10:57.876969\\"}", "_key": "o365_msgtrace_obj_checkpoint"}]'})
2018-09-10 22:57:19,647 ERROR pid=2213 tid=MainThread file=splunk_rest_client.py:request:144 | Failed to issue http request=POST to url=https://127.0.0.1:8089/servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Reporting_checkpointer/batch_save, error=Traceback (most recent call last):
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/splunk_rest_client.py", line 140, in request
verify=verify, proxies=proxies, cert=cert, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/adapters.py", line 426, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', BadStatusLine("''",))
2018-09-10 22:57:19,648 WARNING pid=2213 tid=MainThread file=utils.py:wrapper:157 | Run function: update failed: Traceback (most recent call last):
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/utils.py", line 154, in wrapper
return func(*args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/modular_input/checkpointer.py", line 208, in update
self._collection_data.batch_save(record)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/client.py", line 3719, in batch_save
return json.loads(self._post('batch_save', headers=KVStoreCollectionData.JSON_HEADER, body=data).body.read())
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/client.py", line 3610, in _post
return self.service.post(self.path + url, owner=self.owner, app=self.app, sharing=self.sharing, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 287, in wrapper
return request_fun(self, *args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 69, in new_f
val = f(*args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 738, in post
response = self.http.post(path, all_headers, **query)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 1201, in post
return self.request(url, message)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 1218, in request
response = self.handler(url, message, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/splunk_rest_client.py", line 140, in request
verify=verify, proxies=proxies, cert=cert, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/adapters.py", line 426, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', BadStatusLine("''",))
.
2018-09-10 22:57:19,649 ERROR pid=2213 tid=MainThread file=base_modinput.py:log_error:307 | Get error when collecting events.
Traceback (most recent call last):
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/modinput_wrapper/base_modinput.py", line 127, in stream_events
self.collect_events(ew)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ms_o365_message_trace.py", line 88, in collect_events
input_module.collect_events(self, ew)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/input_module_ms_o365_message_trace.py", line 259, in collect_events
get_events_continuous(helper, ew, global_microsoft_office_365_username, global_microsoft_office_365_password)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/input_module_ms_o365_message_trace.py", line 207, in get_events_continuous
helper.save_check_point(check_point_key, checkpoint_data)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/modinput_wrapper/base_modinput.py", line 529, in save_check_point
self.ckpt.update(key, state)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/utils.py", line 154, in wrapper
return func(*args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/modular_input/checkpointer.py", line 208, in update
self._collection_data.batch_save(record)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/client.py", line 3719, in batch_save
return json.loads(self._post('batch_save', headers=KVStoreCollectionData.JSON_HEADER, body=data).body.read())
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/client.py", line 3610, in _post
return self.service.post(self.path + url, owner=self.owner, app=self.app, sharing=self.sharing, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 287, in wrapper
return request_fun(self, *args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 69, in new_f
val = f(*args, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 738, in post
response = self.http.post(path, all_headers, **query)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 1201, in post
return self.request(url, message)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/splunklib/binding.py", line 1218, in request
response = self.handler(url, message, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/splunk_rest_client.py", line 140, in request
verify=verify, proxies=proxies, cert=cert, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/opt/splunk/etc/apps/TA-MS_O365_Reporting/bin/ta_ms_o365_reporting/solnlib/packages/requests/adapters.py", line 426, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', BadStatusLine("''",))
Hi!
I had a similar problem. Looks like there is some bug in the date features.
The API only supports requests up to 7 days back, and it looks like the app overrides the date settings from the continous monitor start time setting, replacing the start date with the date from the latest indexed events.
What i did:
1. Disabled the continous input temporarily.
2. Created a new input for a "index once" job. (Don't use the same input as you did for the continous monitor)
3. Set interval to "-1" and Window size and Delay throttle so you don't exceed the limitations in the API. (2000 messages or so per query)
4. Index the last 7 days worth of data using the newly created "index once" input-
5. Wait for the job to complete and catch up with today's date.
6. Disable the "index once" input.
7. Re enable the continous input leaving the "Start date/time" blank.
8. This did the trick for me
Hope it helps
Thanks for the reply. I tried that and i get the same results. After the initial bulk import, then setting it back to "continuously monitor" the logs returned "no messages returned", and the search date is set to 7-6-2018 once again. However in looking at the index the last indexed item was on 8/4 (when i did the bulk import).
2018-08-06 05:29:12,403 DEBUG pid=95026 tid=MainThread file=binding.py:new_f:71 | Operation took 0:00:00.002489
2018-08-06 05:29:12,403 DEBUG pid=95026 tid=MainThread file=base_modinput.py:log_debug:286 | Start date: 2018-07-06 23:51:38, End date: 2018-07-07 00:51:38
2018-08-06 05:29:12,404 DEBUG pid=95026 tid=MainThread file=base_modinput.py:log_debug:286 | Endpoint URL: https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$filter=StartDate eq datetime'2018-07-06T23:51:38Z' and EndDate eq datetime'2018-07-07T00:51:38Z'
2018-08-06 05:29:12,404 INFO pid=95026 tid=MainThread file=setup_util.py:log_info:114 | Proxy is not enabled!
2018-08-06 05:29:12,409 DEBUG pid=95026 tid=MainThread file=connectionpool.py:_new_conn:809 | Starting new HTTPS connection (1): reports.office365.com
2018-08-06 05:29:13,270 DEBUG pid=95026 tid=MainThread file=connectionpool.py:_make_request:400 | https://reports.office365.com:443 "GET /ecp/reportingwebservice/reporting.svc/MessageTrace?$filter=StartDate%20eq%20datetime'2018-07-06T23:51:38Z'%20and%20EndDate%20eq%20datetime'2018-07-07T00:51:38Z' HTTP/1.1" 200 216
2018-08-06 05:29:13,273 DEBUG pid=95026 tid=MainThread file=base_modinput.py:log_debug:286 | No messages returned. Setting max date to 2018-07-06 23:56:38
To bad it didn't help.
May suggest to remove the continous input totally and create a new one because it looks like the date setting is stuck there somewhere.
That actually worked! Deleting the input and re-creating it (with a new name) worked and it is now running searches in 1 hour increments from about 5 days ago. Thanks for the help!
It is actually even enough to clone old input. That's strange, because it worked, then suddenly stopped working (everytime no messages returned). After cloning and removing old one input it started pulling messages...
Unfortunately this does not resolves situation that it just stops pulling trace.