Getting Data In

Can anyone help to get this data into Splunk properly?

adlireza
Path Finder

I have tried to index this file without much success. It's driving me nuts that the fields are never separated correctly no matter which setting I change. I'd be grateful if anyone could try indexing this file and let me know which props.conf settings you used.

date            time                field101    field102    field103    field104    field105    field106    field107  field108
2008-10-1       00:15:00.0                35.947          46.170          26.839          26.161           0.079           0.099           0.160          36.833
2008-10-1       00:30:00.0                35.608          46.024          26.791          26.210           0.078           0.098           0.159          36.736
2008-10-1       00:45:00.0                35.608          45.879          26.549          25.871           0.080           0.100           0.162          36.929
2008-10-1       01:00:00.0                35.463          45.976          26.355          25.677           0.081           0.101           0.162          37.057
2008-10-1       01:15:00.0                35.608          45.879          26.452          26.210           0.080           0.100           0.162          36.873
2008-10-1       01:30:00.0                35.754          46.218          26.549          25.967           0.080           0.101           0.163          36.945
2008-10-1       01:45:00.0                35.705          46.170          26.452          26.258           0.081           0.101           0.164          36.949
2008-10-1       02:00:00.0                35.172          45.491          26.549          25.919           0.080           0.101           0.164          37.074
2008-10-1       02:15:00.0                35.415          45.831          26.452          25.919           0.080           0.101           0.163          36.993
2008-10-1       02:30:00.0                35.511          45.637          26.549          25.628           0.082           0.103           0.165          37.170

mattymo
Splunk Employee

If these headers will remain statically defined, I would advise against using indexed extractions, as they can dramatically increase your index size.

I would recommend ingesting each line as its own event and then using search-time field extractions. That way this awfully formatted file won't have to conform to CSV, and you don't have to worry about using props to massage your data.

Also, I don't believe your TIME_FORMAT is correct: if you had truly turned the data into CSV, the date and time would be comma-separated.

I used the following props to ingest the file with timestamps, and would then use an automatic field extraction regex to pull out the fields.

[ ]
CHARSET=UTF-8
SHOULD_LINEMERGE=false
disabled=false
TIME_FORMAT=%Y-%m-%d %H:%M:%S
MAX_DAYS_AGO=10000
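
For the search-time extraction mentioned above, a minimal sketch might look like the following. It assumes the sourcetype is named somedata (borrowed from the other settings in this thread) and that every row has exactly ten whitespace-separated columns. The class name "columns" and the field names obs_date and obs_time are hypothetical, chosen only for illustration. Because \s+ matches any run of whitespace, the inconsistent spacing between values should not matter.

```
# Sketch only: search-time extraction for rows with ten whitespace-separated columns.
# "somedata", "columns", "obs_date" and "obs_time" are assumed names, not from the original answer.
[somedata]
# Note: the header row also has ten columns, so it would match this regex too
# and may need to be filtered out at search time.
EXTRACT-columns = ^(?<obs_date>\S+)\s+(?<obs_time>\S+)\s+(?<field101>\S+)\s+(?<field102>\S+)\s+(?<field103>\S+)\s+(?<field104>\S+)\s+(?<field105>\S+)\s+(?<field106>\S+)\s+(?<field107>\S+)\s+(?<field108>\S+)
```

Since this is applied at search time, it can be changed later without re-indexing the data.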

- MattyMo

adlireza
Path Finder

I've been at this for hours and have totally given up on trying to index it as-is (I think the problem might be that the amount of whitespace delimiting each value is inconsistent). My latest attempts have been to turn it into comma-separated values using SEDCMD, but with no success so far. Below are my settings for this:

[ somedata ]
HEADER_FIELD_LINE_NUMBER=1
CHARSET=UTF-8
MAX_DAYS_AGO=10000
MAX_TIMESTAMP_LOOKAHEAD=30
NO_BINARY_CHECK=true
SEDCMD-changeintocsv=s/\s+/,/g
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %H:%M:%S
category=Custom
disabled=false
pulldown_type=true
INDEXED_EXTRACTIONS=csv

The strange thing is that I can run the regular command-line sed utility on the file to turn it into a CSV file, and Splunk will then parse it just fine using the csv sourcetype.

