Field Extraction for REST API Logs

lpolo
Motivator

The log consists mostly of REST API calls. I cannot extract the fields of the API calls reliably because the order of some fields changes from event to event.
Here is an example:

2011-11-21 15:03:59,926 <query id=d2f98492-94c3-48dc-bb06-525de5ca60c9>/abc/find?house.types=condo,apt,townhome,house&avail.zip.area=area-whitelist&results.limit=7&avail.db1.location=20008|X&results.start=1&q=homes&user.account=1&avail.db.device=ipad&avail.resources=mris,lf&fields=usa,ca$id,wfi,lf$id,best.worst,time.endYear,releaseYear</query>    

2011-11-21 15:03:29,995 <query id=d2f98492-94c3-48dc-bb06-525de5ca60A9>/abc/find?house.types=condo,apt,townhome,house&results.limit=7&avail.db1.location=20008|X&results.start=1&q=homes&user.account=1&avail.db.device=ipad&avail.resources=mris,lf&fields=usa,ca$id,wfi,lf$id,best.worst,time.endYear,releaseYear&avail.zip.area=area-whitelist</query>

In this case the field "avail.zip.area" changed position. Any field can appear in any valid position.

How can I address this issue?

Thanks,
Lp

1 Solution

_d_
Splunk Employee

You can address this issue by adding EXTRACT settings to props.conf, under the stanza that matches the sourcetype of this data.
Let's assume this data's sourcetype is mytype. Unless you already have a props.conf elsewhere, create one in $SPLUNK_HOME/etc/system/local and add the following:

[mytype]
..
EXTRACT-zip = avail\.zip\.area=(?<avail_zip_area>.*?)[&<]

This creates a search-time extraction that populates a field called avail_zip_area and picks up the right value as long as the value is followed by a "&" or "<".
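
To check the pattern before touching props.conf, a throwaway search along these lines may help. It is only a sketch: the two sample events are shortened versions of the ones in the question, and the pattern uses a negated character class ([^&<]+) instead of .*? followed by a delimiter, which is a substitution and not the exact pattern above.

```
| makeresults
| eval _raw="2011-11-21 15:03:59,926 <query id=d2f98492-94c3-48dc-bb06-525de5ca60c9>/abc/find?house.types=condo,apt&avail.zip.area=area-whitelist&results.limit=7</query>"
| append
    [| makeresults
     | eval _raw="2011-11-21 15:03:29,995 <query id=d2f98492-94c3-48dc-bb06-525de5ca60A9>/abc/find?house.types=condo,apt&results.limit=7&avail.zip.area=area-whitelist</query>"]
| rex "avail\.zip\.area=(?<avail_zip_area>[^&<]+)"
| table _raw avail_zip_area
```

If the pattern behaves as intended, both rows should show the same avail_zip_area value even though the parameter sits in a different position in each event.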

Hope this helps.

> please upvote and accept answer if you find it useful - thanks!

_d_
Splunk Employee

Your props seem to be wrong. First, the stanza header needs to match the sourcetype of the data; if your sourcetype is rex-srch-qen-solr-rest_request then you are OK. Next, you have to rework your extractions:

This won't work:
[rex-srch-qen-solr-rest_request]
...
EXTRACT-results.start=(?<results_start>.*?)[\&|\<|\?]

This will:
[rex-srch-qen-solr-rest_request]
...
EXTRACT-results.start = results\.start\=(?<results_start>.*?)[\&|\<|\?]

You need to include results\.start\= in your extraction so that you tell splunk where exactly to start looking for that field.
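
The difference can be seen in a quick test search. This is a sketch under assumptions: the event is a shortened version of the sample in the question, the field name unanchored is made up for illustration, and the character classes are written as [&<?] (inside a character class, "|" is a literal pipe, not an alternation).

```
| makeresults
| eval _raw="2011-11-21 15:03:59,926 <query id=d2f98492-94c3-48dc-bb06-525de5ca60c9>/abc/find?house.types=condo&results.start=1&q=homes</query>"
| rex "(?<unanchored>.*?)[&<?]"
| rex "results\.start=(?<results_start>[^&<?]+)"
| table unanchored results_start
```

Without the literal results\.start= prefix, the first pattern starts matching at the beginning of the event and stops at the first delimiter it meets, so it should capture the text before "<query" and not the parameter value. The second pattern should capture only the value that follows results.start=.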

Likewise for this other extraction:

[rex-srch-qen-solr-rest_request]
...
EXTRACT-avail.zip.area = avail\.zip\.area\=(?<avail_zip_area>.*?)[\&|\<|\?]
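
Since any parameter can move, another option that may be worth trying is a single transform that captures every name=value pair in the query string, so there is no need for one EXTRACT per field. This is a sketch, not something tested against your data: the stanza name rest_query_kv is hypothetical, and the regex assumes values never contain "&" or "<".

```
# transforms.conf (stanza name rest_query_kv is a made-up example)
[rest_query_kv]
# group 1 = parameter name, group 2 = parameter value
REGEX = [?&]([^=&<]+)=([^&<]*)
FORMAT = $1::$2

# props.conf
[rex-srch-qen-solr-rest_request]
REPORT-rest_query_kv = rest_query_kv
```

With the default CLEAN_KEYS = true, non-alphanumeric characters in field names are replaced with underscores, so avail.zip.area should show up as avail_zip_area.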

Hope it helps.

EDIT: Escaped the dots and equal signs just to be safe

lpolo
Motivator

Yes. With the props.conf extraction, Splunk does not report anything after the first ",". This is an example:

index=main sourcetype="rex-srch-qen-solr-rest_request" | top limit=3 available_zip_area

    available_zip_area  count  percent
1   20016               81691  90.898065
2   20008               8077   8.987326
3   20045               91     0.101256

Now with rex:
index=main sourcetype="rex-srch-qen-solr-rest_request" | rex "available.zip.area=(?<available_zip_area>.*?)[&|<|?]" | top limit=3 available_zip_area

    available_zip_area  count  percent
1   20016,20015         81691  90.898065
2   20008,20815,20816   8077   8.987326
3   20045,20016         91     0.101256
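
Once the full comma-separated value is extracted, as in the rex output above, it can be split into individual zips if per-zip counts are wanted. A minimal sketch, using makeresults with one hard-coded value from the table above in place of real events:

```
| makeresults
| eval available_zip_area="20008,20815,20816"
| makemv delim="," available_zip_area
| mvexpand available_zip_area
| stats count by available_zip_area
```

makemv turns the string into a multivalue field, and mvexpand creates one result per zip, so stats counts each zip separately.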

Thanks,

_d_
Splunk Employee

Are you saying that you used avail\.zip\.area\=(?<avail_zip_area>.*?)[\&|\<|\?] with rex and props.conf and got different results?

lpolo
Motivator

D,

It partially worked. I escaped the dots and equal signs, but Splunk does not report the value consistently. For example, the logs contain:
available_zip_area=20016,20008

With the props.conf extraction, Splunk reports only the first zip:
available_zip_area=20016

If I use the same regular expression with the "rex" command, I get all the zips:
available_zip_area=20016,20008

Any idea?

Thanks,

lpolo
Motivator

let me try...

lpolo
Motivator

I did try | extract reload=t. It did not work. My props.conf is basically what you recommended. I think the problem is that the way Splunk handles the "." character conflicts with my props.conf.
If I remove props.conf, Splunk tries to extract the fields, but with some inconsistencies. When I test my regular expression with the "rex" command, it works fine.

_d_
Splunk Employee

Did you try a | extract reload=t in your search? What does your props.conf stanza for this extraction look like?
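
For reference, a search using that option might look like the following. The sourcetype and field name are the ones used elsewhere in this thread; adjust them to your data. reload=true asks the extract command to re-read props.conf and transforms.conf before applying extractions.

```
index=main sourcetype="rex-srch-qen-solr-rest_request"
| extract reload=true
| top limit=3 avail_zip_area
```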

lpolo
Motivator

It works if you run it via the "rex" search command. If you configure it in $SPLUNK_HOME/etc/system/local/props.conf, it does not work. I have never been able to configure REST API logs in Splunk.
