Big duration difference between loadjob query and view results function

JuGuSm
Path Finder

Hi,

I created a scheduled report generating millions of lines.

When I list my jobs and click on the name of the scheduled report, I get my results INSTANTLY.

But when I try to load the job using this query:

| loadjob savedsearch="MY_USER:MY_APP:MY_REPORT"

the search takes 433 seconds to complete!

So why is there such a difference?

Thank you.

acharlieh
Influencer

I have to admit not knowing much about your environment and your search, nor the exact particulars of these inner workings, so I need to hedge and note that this is a bit of an educated guess:

When I go to a URL such as https://mysplunk/en-US/app/myapp/search?sid=searchId, Splunk simply has to go to disk, pull the set of cached results and display them to me. I suspect this is even optimized to pull back only the page of results that is actually displayed to me as the end user. That is a relatively simple operation, with all the results already on disk.
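To make the "one page at a time" idea concrete, here is a minimal sketch that builds the REST URL for a single page of an existing job's results, using the `count` and `offset` parameters of the `search/jobs/{sid}/results` endpoint. Whether the web UI pages in exactly this way is part of my guess, not something I have confirmed. The host name, the management port 8089 and the `results_page_url` helper are assumptions made up for illustration, and authentication is left out entirely.

```python
from urllib.parse import quote, urlencode


def results_page_url(base_url, sid, page=0, page_size=50):
    """Build the REST URL for one page of an already finished job's results.

    base_url and sid are placeholders; only the URL is built here,
    no request is sent.
    """
    params = {
        "output_mode": "json",
        "count": page_size,           # rows per page
        "offset": page * page_size,   # rows to skip before this page
    }
    path = f"/services/search/jobs/{quote(sid, safe='')}/results"
    return f"{base_url}{path}?{urlencode(params)}"


# Third page (page index 2) of 50 rows for a hypothetical job id.
print(results_page_url("https://mysplunk:8089", "searchId", page=2))
```

Fetching a page this way only ever asks for `page_size` rows, no matter how many millions the job produced.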

Comparatively, if I execute a search such as | loadjob searchId, Splunk would have to set up a search pipeline, copy the millions of results that you said you had into the pipeline, write those millions of results as the output of the pipeline, tear down the pipeline, and then display the results back to me. That seems like it would be very dependent on the read/write disk speed and the memory of your search head. As you have more results, and more fields per result in the original search, the memory and disk requirements grow. Ideally you are doing some pre-statistics in your original search so that you are saving time, but of course that's a function of what your original and add-on searches are. (Specifying by saved search means there's just a lookup to find the latest executed set of results for that search, which shouldn't add much time.)
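As a rough illustration of why the two paths could scale so differently, here is a toy model in Python. It is not how Splunk is implemented; `cached_results`, `view_first_page` and `load_everything` are hypothetical names, and the only thing the sketch shows is how many cached rows each approach has to touch.

```python
from itertools import islice


def cached_results(total, counter):
    """Yield fake cached rows, counting how many are actually read."""
    for i in range(total):
        counter["read"] += 1
        yield {"row": i}


def view_first_page(total, page_size=50):
    """Model of viewing a finished job: read only the displayed page."""
    counter = {"read": 0}
    page = list(islice(cached_results(total, counter), page_size))
    return len(page), counter["read"]


def load_everything(total):
    """Model of a loadjob-style pipeline: read and re-write every row."""
    counter = {"read": 0}
    copied = [dict(row) for row in cached_results(total, counter)]
    return len(copied), counter["read"]


print(view_first_page(1_000_000))   # rows shown, rows read
print(load_everything(1_000_000))   # rows copied, rows read
```

In this model the page view reads 50 rows regardless of the job size, while the full load reads and copies every row, so its cost grows with the number of results (and, in a real system, with the number of fields per result).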

Ideally, running | loadjob is cheaper than going back to the indexers and pulling search results again, but that depends on how you're splitting your original and add-on searches, and on the size and dimensions of the data set you're pulling with loadjob compared to the original. Does that seem to make sense? (And does anyone have more specifics than the hand-waving I'm doing right here?)
