Splunk Search

inputlookup not returning all rows

jizzmaster
Path Finder

I have a CSV file as a lookup, named "resources.csv". The actual file has about 30,000 lines, but in the Splunk search I am only getting about 15,000 results. I'm using the following command to view the lookup table:

|inputlookup resources.csv

The CSV file is updated by a script I have running each morning. I have restarted the search head that reads this lookup file, but nothing has worked. I still get only about 15,000 results from the inputlookup command.

Any suggestions?

1 Solution

jizzmaster
Path Finder

http://answers.splunk.com/answers/139821/inputlookup-not-returning-all-the-rows-in-csv-file.html

It seems that double quotes affect this. I had six entries containing double quotes; after removing them, I now get all my results. It seems there were about 15,000 lines in between the lines with quotes.

So why would double quotes affect a CSV lookup file? I assumed rows were determined by line breaks and columns by commas.
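
For anyone hitting the same thing, here is a minimal Python sketch for locating such entries before the file reaches the lookup folder. It only counts `"` characters on each physical line. The function name and the sample data are hypothetical, and it assumes (as in this case) that no value is meant to span several lines.

```python
def find_quoted_lines(text):
    """Return (line_number, quote_count, line) for each physical line containing a double quote."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        count = line.count('"')
        if count:
            hits.append((lineno, count, line))
    return hits


# Hypothetical sample standing in for resources.csv.
sample = (
    "host_name,description\n"
    "host1,web server\n"
    'host2,"19 inch rack\n'
    "host3,db server\n"
    'host4,"quoted, with comma"\n'
)

for lineno, count, line in find_quoted_lines(sample):
    # An odd count means a quote opened on this line is not closed on the same line.
    status = "unbalanced" if count % 2 else "balanced"
    print(f"line {lineno}: {count} quote(s), {status}: {line}")
```

Lines reported as unbalanced are the ones most likely to make a CSV parser keep reading past the end of the line.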

steveyz
Splunk Employee

Double quotes are used to quote values in CSV, and a quoted value may itself contain commas and line breaks. See https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules_and_examples
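
To illustrate the effect, here is a small sketch using Python's standard `csv` module. It is not Splunk's parser; the only assumption is that Splunk's CSV reader follows the same basic quoting rule. The sample data is made up.

```python
import csv
import io

# Hypothetical sample: two values start with a double quote that is never closed.
raw = (
    "host_name,description\n"
    "host1,web server\n"
    'host2,"19 inch rack\n'   # opening quote is not closed on this line
    "host3,db server\n"
    "host4,mail server\n"
    'host5,"24 inch rack\n'   # the next quote in the file ends the quoted value
    "host6,file server\n"
)

physical_lines = raw.count("\n")
rows = list(csv.reader(io.StringIO(raw)))

print(physical_lines, "physical lines")
print(len(rows), "parsed rows")
for row in rows:
    print(row)
```

With this sample, the seven physical lines parse to four rows: everything between the two stray quotes ends up inside a single value of the host2 row, so host3 through host5 never appear as rows of their own.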

woodcock
Esteemed Legend

It is either hitting a limit or you are not looking at the file that you think you are. How is the file generated? How is it put onto the Search Head (or is it)? The only other option is that you have tripped across a bug. If you have investigated it well, I would open a support case.

jizzmaster
Path Finder

I'm creating it via a bash shell script that is using the sqlcmd command provided by the Microsoft MS-SQL driver (this is on a RHEL6 box) to query the database and create the file. The output overwrites the previous csv in a lookup folder every morning.

steveyz
Splunk Employee

Does your lookup have more than 1 line per event? Splunk generated lookup files with multivalued fields often have this property. If your lookup file has a primary key, you can try to find the set difference between the lookup file and what inputlookup returns and see if there is any pattern as to which rows are missing.
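
A sketch of that set-difference check in Python, assuming the first column is the primary key and that the keys `inputlookup` returned have already been exported to a list by some means (not shown here). All names and data are hypothetical. The raw file is split naively on line breaks and commas on purpose, so that every physical line counts as one row.

```python
def keys_by_line(raw_text, key_column=0):
    """Map each key to its 1-based physical line number, skipping the header and blank lines."""
    keys = {}
    for lineno, line in enumerate(raw_text.splitlines(), start=1):
        if lineno == 1 or not line.strip():
            continue
        keys[line.split(",")[key_column]] = lineno
    return keys


def missing_rows(raw_text, returned_keys, key_column=0):
    """Return sorted (line_number, key) pairs present in the file but absent from the results."""
    file_keys = keys_by_line(raw_text, key_column)
    missing = set(file_keys) - set(returned_keys)
    return sorted((file_keys[key], key) for key in missing)


# Hypothetical data: the raw lookup file and the keys a search actually returned.
raw_text = (
    "host_name,description\n"
    "host1,web server\n"
    "host2,rack A\n"
    "host3,db server\n"
    "host4,mail server\n"
    "host5,rack B\n"
    "host6,file server\n"
)
returned = ["host1", "host2", "host6"]

for lineno, key in missing_rows(raw_text, returned):
    print(f"line {lineno}: {key}")
```

Missing keys that sit on consecutive line numbers suggest that a block of the file is being absorbed into a neighbouring row rather than dropped at random.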

jizzmaster
Path Finder

Only 1 line per event in my csv lookup file. I'm trying to find some rhyme and reason to which rows are missing.

aljohnson_splun
Splunk Employee

Edit: I thought that this might apply but it doesn't. Thanks @steveyz


First of all, I can't believe your name is @jizzmaster.

Secondly, check limits.conf - I'm wondering if this is what you're hitting?

[lookup]

max_memtable_bytes = <integer> 
* Maximum size of static lookup file to use an in-memory index for.
* Defaults to 10000000 bytes (10MB).

jizzmaster
Path Finder

I figured the username should fit with the product ...

max_memtable_bytes is 10 MB, but my CSV is only 4.5 MB.

aljohnson_splun
Splunk Employee

Then it would be more like cavemaster, I'd think, but I do (personally) appreciate your humor.

steveyz
Splunk Employee

That limits.conf setting does not affect inputlookup. It only affects the performance optimization used when performing lookups. inputlookup is basically inputcsv, but it reads from the lookup directories rather than the dispatch directory.

jizzmaster
Path Finder

Also, when performing additional searches through Splunk on this lookup file, there are missing fields. When I search for it in the actual file, the rows I'm looking for are there.

This is the command I use to search through the lookup table:

|inputlookup resources.csv |search host_name=2AH39911B

And yes, the "host_name" field exists, as does the specific host_name I'm searching for. The Splunk search still returns no results, though.
