Getting Data In

Correctly ingest Puma / Rails logs

aa123s
Explorer

Hello,

I have been fighting with this issue for the past few says and still cannot manage to solve it. After reading documentation, searching this forum and an abundant amount of googling, here I am.
My Puma log looks like this:

I, [2018-01-23T15:04:18.058127 #28] INFO -- : [cad63e92-848e-4f93-ac49-041569978447] Started GET "/api/oe/recent_searches.json" for 100.96.20.4 at 2018-01-23 15:04:18 +0000
I, [2018-01-23T15:04:18.059429 #28] INFO -- : [cad63e92-848e-4f93-ac49-041569978447] Processing by Api::Oe::RecentSearchesController#index as JSON
I, [2018-01-23T15:04:18.059492 #28] INFO -- : [cad63e92-848e-4f93-ac49-041569978447] Parameters: {"recent_search"=>{}}
D, [2018-01-23T15:04:18.063726 #28] DEBUG -- : [cad63e92-848e-4f93-ac49-041569978447] User Load (1.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 ORDER BY "users"."id" ASC LIMIT $2 [["id", 4], ["LIMIT", 1]]
D, [2018-01-23T15:04:18.067015 #28] DEBUG -- : [cad63e92-848e-4f93-ac49-041569978447] Search Load (1.1ms) SELECT "searches".* FROM "searches" WHERE "searches"."id" = 151
I, [2018-01-23T15:04:18.070320 #28] INFO -- : [cad63e92-848e-4f93-ac49-041569978447] [active_model_serializers] Rendered ActiveModel::Serializer::CollectionSerializer with ActiveModelSerializers::Adapter::Attributes (1.9ms)
I, [2018-01-23T15:04:18.070584 #28] INFO -- : [cad63e92-848e-4f93-ac49-041569978447] Completed 200 OK in 11ms (Views: 2.4ms | ActiveRecord: 2.4ms)

My first problem is that I cannot manage Splunk to split events correctly. Whatever I try I get one event per line. Could anybody provide a working example of a stanza that puts the above log in a single event?

The second problem is that sometimes log entries overlap with one another. In other words as one entry begins (Started GET...) another might start as well before the first one completes (Competed 200 OK...). When that happens the only way to discern one from the other is using the transaction id. Would it be possible to correctly split also these entries at index time?

Thank you

Tags (1)
1 Solution

qi3ber
Explorer

For your first problem. There is a difference between what splunk considers an event, and what you are considering an event. Or more correctly, the splunk concept of an event doesn't line up with how rails generates it's logs. An event in splunk parlance is an atomic data point in a logfile. As far as splunk is concerned, the single rails event of 'page load' is comprised of several splunk 'single line' events. There is no way, without writing a pre-parser, to get all the constituent lines comprising a single rails event to be joined into a single Splunk event.

All is not doom and gloom however. Once splunk ingests the many lines of log data from your rails logs. You can join those lines together at search time into a single result. You can either use the splunk 'transaction' command, or the considerably faster 'stats' command to join them together. For starters I'd recommend using the 'transaction' command, as it's easier to figure out how to join the lines together.

Fortunately, rails includes their event id field (I forget it's true name) in the lines of their logs, so it's easy to join them together, the hardest part is going to be writing the regex to extract that field the first time. I haven't tested, but something like this should work:

| rex field=_raw ":\s\[(?P<event_id>[a-z0-9-]+)\]"

You could then take that a step further and move that setting into props.conf for the sourcetype in question, so that the event_id is extracted automatically.

EXTRACT-eventid = :\s\[(?P<event_id>[a-z0-9-]+)\]

Sorry I couldn't give you the answer you were hoping for, but hopefully that helps a bit.

View solution in original post

0 Karma

qi3ber
Explorer

For your first problem. There is a difference between what splunk considers an event, and what you are considering an event. Or more correctly, the splunk concept of an event doesn't line up with how rails generates it's logs. An event in splunk parlance is an atomic data point in a logfile. As far as splunk is concerned, the single rails event of 'page load' is comprised of several splunk 'single line' events. There is no way, without writing a pre-parser, to get all the constituent lines comprising a single rails event to be joined into a single Splunk event.

All is not doom and gloom however. Once splunk ingests the many lines of log data from your rails logs. You can join those lines together at search time into a single result. You can either use the splunk 'transaction' command, or the considerably faster 'stats' command to join them together. For starters I'd recommend using the 'transaction' command, as it's easier to figure out how to join the lines together.

Fortunately, rails includes their event id field (I forget it's true name) in the lines of their logs, so it's easy to join them together, the hardest part is going to be writing the regex to extract that field the first time. I haven't tested, but something like this should work:

| rex field=_raw ":\s\[(?P<event_id>[a-z0-9-]+)\]"

You could then take that a step further and move that setting into props.conf for the sourcetype in question, so that the event_id is extracted automatically.

EXTRACT-eventid = :\s\[(?P<event_id>[a-z0-9-]+)\]

Sorry I couldn't give you the answer you were hoping for, but hopefully that helps a bit.

0 Karma

aa123s
Explorer

Hello qi3ber,

Thank you for your answer!
It would have been nice to group those lines at index time but if that's impossible we will settle for search time.

I tried your suggestions. The 'transaction' / 'rex' command work and do exactly what I would expect them to do.

On the other hand the 'EXTRACT-...' command is ignored and has no effect when introduced in props.conf (I can confirm that changes in props.conf are being evaluated as I'm successfully updating the sourcetype each time I try). Do you have any clue on why EXTRACT doesn't work? Thanks again!

0 Karma

qi3ber
Explorer

I haven't tested, but I believe the EXTRACT line above should work as is, especially if the rex is working. The only thing I can think of that would prevent it from working then, is something you're probably already doing.

The EXTRACT line (specifically), and the rest of the sourcetype stuffs (if wanted), need to be on the search head in order for them to work. Of course that only matters if you don't have an all-in-one server.

0 Karma

aa123s
Explorer

I am afraid I don't have access to the search head... so never mind.
Thank you!

Alain

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...