About richardhull_bjs

richardhull_bjs · ‎03-13-2014

See https://github.com/splunk/splunk-sdk-python/pull/77

richardhull_bjs · ‎03-11-2014

Not buffering is definitely the problem here. I created the following class:

    class ResponseReaderWrapper(io.RawIOBase):

        def __init__(self, responseReader):
            self.responseReader = responseReader

        def readable(self):
            return True

        def close(self):
            self.responseReader.close()

        def read(self, n):
            return self.responseReader.read(n)

        def readinto(self, b):
            sz = len(b)
            data = self.responseReader.read(sz)
            for idx, ch in enumerate(data):
                b[idx] = ch
            return len(data)

This allows me to use io.BufferedReader as follows:

    rs = job.results(count=maxRecords, offset=self._offset)
    results.ResultsReader(io.BufferedReader(ResponseReaderWrapper(rs)))

With this change, running my query and pulling the results takes ~3 seconds rather than 90+ seconds as before. It would be nice if ResponseReader implemented the readable and readinto methods so that it were more stream-like; then this ResponseReaderWrapper class wouldn't be necessary. I'm happy to provide a pull request for this if you agree.
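As a rough illustration of why the buffering helps, here is a minimal sketch that counts how often the underlying source is actually hit when a consumer reads one byte at a time through io.BufferedReader. It is written for Python 3 with bytes rather than the Python 2 setup used in the thread, and CountingSource and RawAdapter are hypothetical stand-ins for illustration, not part of the Splunk SDK.

```python
import io


class CountingSource(object):
    """Hypothetical stand-in for the SDK's response reader; counts read() calls."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)
        self.calls = 0

    def read(self, n=-1):
        self.calls += 1
        return self._inner.read(n)


class RawAdapter(io.RawIOBase):
    """Expose a read()-only source as a raw stream that BufferedReader accepts."""

    def __init__(self, source):
        self.source = source

    def readable(self):
        return True

    def readinto(self, b):
        # Fetch up to len(b) bytes in one call and copy them into the buffer.
        data = self.source.read(len(b))
        b[:len(data)] = data
        return len(data)


payload = b"x" * 20000
source = CountingSource(payload)
buffered = io.BufferedReader(RawAdapter(source))

# The consumer still asks for one byte at a time, as the XML filter does.
out = bytearray()
while True:
    c = buffered.read(1)
    if not c:
        break
    out += c

buffered_calls = source.calls
print("bytes read:", len(out), "calls to source:", buffered_calls)
```

The consumer's loop is unchanged; only the number of calls that reach the source drops, because BufferedReader fills its internal buffer in large chunks and serves the one-byte reads from memory.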

richardhull_bjs · ‎03-10-2014

I should say, we're using version 1.1.0 of the Python library here.

richardhull_bjs · ‎03-10-2014

I am experiencing the same thing. I ran my app with the -m cProfile flag and, after some munging in Excel, got the following:

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    -----------------------------------------------------------------
       410     0.01        0   94.422     0.23  results.py:204(next)
       410    0.757    0.002   94.412     0.23  results.py:207(_parse_results)
     29481    0.185        0   93.039    0.003  <string>:80(next)
        33    0.001        0   92.819    2.813  results.py:93(read)
        32    9.158    0.286   92.818    2.901  results.py:124(read)
    518047   13.097        0   83.542        0  binding.py:1142(read)
    518053   11.294        0   68.321        0  httplib.py:532(read)
    518199   24.065        0    54.89        0  socket.py:336(read)
    518764    9.899        0   19.695        0  ssl.py:235(recv)
    518764    5.646        0    9.796        0  ssl.py:154(read)
    518764     4.15        0     4.15        0  {built-in method read}
    518520    2.431        0    2.431        0  {max}
    518846    2.415        0    2.415        0  {method 'seek' of 'cStringIO.StringO' objects}
    518466    2.356        0    2.356        0  {cStringIO.StringIO}

I read this as results.py making half a million calls out to binding.py's read method, ONE character at a time. I'm guessing that it is not using any form of buffered I/O?

    def read(self, n=None):
        """Read at most *n* characters from this stream.

        If *n* is ``None``, return all available characters.
        """
        response = ""
        while n is None or n > 0:
            c = self.stream.read(1)
            if c == "":
                break
            elif c == "<":
                c += self.stream.read(1)
                if c == "<?":
                    while True:
                        q = self.stream.read(1)
                        if q == ">":
                            break
                else:
                    response += c
                    if n is not None:
                        n -= len(c)
            else:
                response += c
                if n is not None:
                    n -= 1
            return response
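To make the call-count pattern in the profile concrete, here is a small sketch of what happens when a stream is drained with read(1). It is Python 3 with bytes, not the Python 2 code from the SDK, and CountingStream and read_one_at_a_time are hypothetical names used only for this illustration.

```python
import io


class CountingStream(object):
    """Hypothetical stand-in for the HTTP response; counts read() calls."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)
        self.calls = 0

    def read(self, n=-1):
        self.calls += 1
        return self._inner.read(n)


def read_one_at_a_time(stream):
    """Drain the stream one byte per call, mirroring the read(1) loop above."""
    chunks = []
    while True:
        c = stream.read(1)
        if c == b"":
            break
        chunks.append(c)
    return b"".join(chunks)


payload = b"<results>" + b"<r/>" * 1000 + b"</results>"
stream = CountingStream(payload)
data = read_one_at_a_time(stream)
calls_unbuffered = stream.calls

print("payload bytes:", len(payload), "read() calls:", calls_unbuffered)
```

Every byte costs one call on the underlying stream, plus a final call that returns the empty string. When that stream sits on top of httplib, socket and ssl, each of those calls travels down the whole stack, which is consistent with the matching ~518,000 call counts in the profile.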

Posts	4
Solutions	1
Karma Given	0
Karma Received	9
Member Since	‎03-10-2014

Online Status	Offline
Date Last Visited	‎06-05-2020 02:04 AM

Re: Python SDK - results.ResultsReader extremely s...
