Getting Data In

Size limit for an event?

erydberg
Splunk Employee

Hi!

Is there a size limit for how big an event can be before it's split into two? I'm trying to index p4 data, and these events can get really big, especially for big integrates or branches, and I'd like to know if I will run into any errors and how I can avoid them.

Thanks, Elin

2 Solutions

gkanapathy
Splunk Employee

You can increase the 10,000-character limit on a single line by setting TRUNCATE in props.conf. This can be applied per-host, per-source, or per-sourcetype as usual with props.conf. Lines are by default (but not necessarily) traditional "lines", separated by the configurable LINE_BREAKER sequence. You are more likely to need to raise this higher if you use a non-default LINE_BREAKER. I have raised this to over 250,000 with no problems.

Note that if you use line merging (SHOULD_LINEMERGE = true), you can combine, by default, up to 257 lines into a single event. This limit can also be increased (MAX_EVENTS, not a very accurate name for what it actually does), and I have raised it to over 10,000 without problems. By raising one or both of these parameters as appropriate, you can have extremely large events.

I do not know of a "hard" limit on either of these settings.
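
Purely as an illustration (the sourcetype name and the numbers are made up, not recommendations), a props.conf stanza for line-merged events might look something like:

    # props.conf -- illustrative values only, tune for your own data
    [p4:changelist]
    # raise the per-line truncation limit from the 10,000-character default
    TRUNCATE = 250000
    # merge multiple lines into one event, and raise the merge cap (MAX_EVENTS)
    SHOULD_LINEMERGE = true
    MAX_EVENTS = 10000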


Lowell
Super Champion

There are some other limits you should keep in mind too:

  • Splunk limits how many lines you can see of an event within the web UI. I think you start running into issues at around 500 lines.
  • If you use any custom search commands, then the python interface bumps the CSV "cell" limit to 10 MB (this would be a single "field" in Splunk terms; in your case the concern is the size of the "_raw" field), so if you have any single events larger than 10 MB you could run into problems there.
  • I think I've heard about issues when trying to export large events. (I don't remember the exact context, to be honest. If I remember correctly, it has something to do with the number of lines in the event.)

These may or may not be things that you would run into, but it may be worth the effort to do some testing to be sure. You will certainly want to load a variety of test events (of various sizes) into your system to make sure you have your event breaking vs event truncation policies working the way you want.
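
If it helps with that testing, here is a rough props.conf sketch of the LINE_BREAKER-style alternative to the line merging gkanapathy mentions; the sourcetype name and the regex (which assumes each p4 event starts with a line like "Change 12345") are just assumptions you would need to adapt:

    # props.conf -- sketch only; the regex assumes events start with "Change <number>"
    [p4:changelist]
    SHOULD_LINEMERGE = false
    # break into a new event only where a "Change NNN" record begins
    LINE_BREAKER = ([\r\n]+)Change\s+\d+
    # with this approach the whole event counts as one "line", so TRUNCATE must cover it
    TRUNCATE = 500000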

Best of luck!


sjnorman
Explorer

Splunk limits how many lines you can see of an event within the web UI. I think you start running into issues at around 500 lines.

Is there a way to configure/limit the number of lines displayed in the Web UI for the search results? We have large, multi-line events being fed in and would prefer it if you had to expand the particular search result that you're interested in.


lmyrefelt
Builder

I am interested in the same thing. I have not yet started on it, but I think a good start could be hidden here: http://docs.splunk.com/Documentation/Splunk/6.0.3/AdvancedDev/EventRendering

But that, I guess, is another discussion/question 🙂
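
I have not tested this at all, but from skimming that page it looks like the starting point would be an event_renderers.conf stanza in your app plus some CSS. Purely as a guess, something along these lines (all names are made up):

    # event_renderers.conf -- untested guess, all names are hypothetical
    [event_renderer_p4_collapse]
    # an eventtype you define to match the large events
    eventtype = p4_large_event
    priority = 100
    # a CSS class shipped in your app that collapses or scrolls the event body
    css_class = p4_collapse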


erydberg
Splunk Employee

Okay, I successfully indexed events of around 30 MB, so it doesn't seem like there's a size limit. The only problem I ran into is that I get warnings in my splunkd.log about the pipelines being filled up, but the events get indexed correctly.
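
For the record, one thing I am considering for the pipeline warnings (untested, so treat it as a sketch and the value as a guess) is bumping the parsing queue size in server.conf on the indexer:

    # server.conf on the indexer -- untested sketch, the value is a guess
    [queue=parsingQueue]
    maxSize = 10MB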

Lowell
Super Champion

Copied from the docs: Multi-line event linebreaking and segmentation limitations


Splunk does apply limitations to extremely large events when it comes to linebreaking and segmentation:

  • Lines over 10,000 bytes: Splunk breaks lines over 10,000 bytes into multiple lines of 10,000 bytes each when it indexes them. It appends the field meta::truncated to the end of each truncated section. However, Splunk still groups these lines into a single event.
  • Segmentation for events over 100,000 bytes: Splunk only displays the first 100,000 bytes of an event in the search results. Segments after those first 100,000 bytes of a very long line are still searchable, however.
  • Segmentation for events over 1,000 segments: Splunk displays the first 1,000 individual segments of an event as segments separated by whitespace and highlighted on mouseover. It displays the rest of the event as raw text without interactive formatting.


Based on this, I think searching for meta::truncated would be the way of finding events that were not included in their entirety.

Hmm. The more I read the docs on this point, the less confident I am in my understanding of what it actually means. I don't actually see a real size limit mentioned here. For example, we know that lines over 10,000 bytes are split up (and yet somehow still kept together as a single event), but there is no mention of a maximum number of "lines" per se.

And the segmentation stuff all has to do with indexed terms, which is important if you need to search on any term in your really big events. Based on this, I would say you should be able to search on any of the first 1,000 unique terms in your event, but again this doesn't say much in terms of max size.

And the "mouseover" stuff is, I think, a remnant of Splunk 3.x, which had a nice feature that highlighted the same word across events when you hovered over a term with your mouse, but it seems to have been removed in Splunk 4.0. So that's not helpful at all.

Do you have events over 97k (100,000 bytes)? If not, you should be fine. If you do, then hopefully someone else here can give you a better answer.

So the bottom line is: never mind, I don't have an answer for you at all. Sorry.

gkanapathy
Splunk Employee

You can increase the 10,000-character limit on a single line by setting TRUNCATE in props.conf. This can be applied per-host, per-source, or per-sourcetype as usual with props.conf. Lines are by default (but not necessarily) traditional "lines", separated by the configurable LINE_BREAKER sequence.

Note that if you use line merging, you can combine, by default, up to 257 lines into a single event, but this limit can also be increased.

By raising one or both of these parameters as appropriate, you can have extremely large events. I do not know of a "hard" limit on either of these settings.


Lowell
Super Champion

For whatever it's worth, it seems like meta::truncated is no longer placed on events as of Splunk 6.x.
