Getting Data In

help on CSV default limit issue

jip31
Motivator

hi

If I run the searches separately, I get results.
But for a few days now, I have been unable to cross-reference the data between the 2 CSV files below:

| inputlookup host.csv 
| lookup toto.csv "Computer" as host 
| stats count as "Number of machines" by flag

When I look at toto.csv, it has 98,000 events.
Can anyone confirm whether the issue comes from the CSV default limit of 10,000 events?
If so, what can I do to work around this issue?
thanks

1 Solution

nickhills
Ultra Champion

You will only have as many results as there are entries in host.csv.
The number of entries in toto.csv will not affect the number of results this query returns.
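As a quick sanity check (a sketch, assuming the file and field names from the original search), you can count the rows each file contributes on its own:

```
| inputlookup host.csv
| stats count AS hosts_in_host_csv
```

```
| inputlookup toto.csv
| stats count AS rows_in_toto_csv
```

The first count is the ceiling on how many results the combined search can return; the second file only determines which of those rows get enriched with a flag value.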

What are you trying to do?

If my comment helps, please give it a thumbs up!



jip31
Motivator

I have a host list in host.csv and another host list in toto.csv.
Starting from the host.csv list, I need to retrieve the existing data for the same hosts in toto.csv.
It was working perfectly, but since toto.csv grew past 10,000 events it no longer works.
I can't see any explanation other than this.


nickhills
Ultra Champion

Lookup files with very large row counts are not the recommended approach.
However, this is for performance reasons, not because they will not work.
The recommended approach is to use a KV store if your lookup data has more than 10,000 entries.

With that said, based on what you have described so far, that does not appear to be your problem.

It's far more likely that you have some bad data in your lookup file; a misquoted field, an extra comma, or stray quotation marks are the top candidates, and they become troublesome to spot with nearly 100,000 entries.

If I were trying to find the problem, I would split the lookup into 10 smaller files (cat/head/tail, whatever you're comfortable with) and test each small file one at a time. Manually run a query to test for the last row in each lookup; when you find a file which no longer works, you can start narrowing down the problem.

If performance is something you care about, I would also look at moving the lookup to the KV store, during which the bad data in the existing file will likely become self-evident.
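That check can also be sketched in SPL without splitting the file on disk (assuming toto.csv has the Computer and flag columns, as the original search implies): number the rows with streamstats, then scan one slice at a time for rows where a required field came out empty, which is the usual symptom of a misplaced quote or comma:

```
| inputlookup toto.csv
| streamstats count AS row
| where row <= 10000
| where isnull(Computer) OR isnull(flag)
```

Move the row window through the file (10001 to 20000, and so on) until the malformed rows show up.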

If my comment helps, please give it a thumbs up!

jip31
Motivator

It turned out to be a permissions (rights) issue on the file introduced when it was pushed to prod...
