Why are events being cut off at 257 lines in XML data?

mwcooley
Explorer

Hi,

I have XML data that can run to 500+ lines per event, but Splunk is truncating it at 257 lines. I've been trying combinations of LINE_BREAKER and BREAK_ONLY_BEFORE, but no luck. I'm not sure whether the problem is my regex, my config files, or something else.

thanks,
mike

I defined the stanza in inputs.conf:
[monitor:///app/freeswitch/cdrs/*.xml]
sourcetype = conf_cdr_xml

Here's my props.conf:
[conf_cdr_xml]
KV_MODE = xml
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = \<\/cdr\>
MAX_EVENTS = 100000
TRUNCATE=100000
NO_BINARY_CHECK = true
pulldown_type = true

And here is an example event:

    <cdr>
      <conference>
        <name>5551231234-1234567</name>
        <hostname>test@test.net</hostname>
        <rate>8000</rate>
        <interval>20</interval>
        <start_time type="UNIX-epoch">1521040386</start_time>
        <end_time endconference_forced="false" type="UNIX-epoch">1521040388</end_time>
        <members>
          <member type="caller">
            <join_time type="UNIX-epoch">1521040386</join_time>
            <leave_time type="UNIX-epoch">1521040388</leave_time>
            <flags>
              <is_moderator>true</is_moderator>
              <end_conference>true</end_conference>
              <was_kicked>false</was_kicked>
              <is_ghost>false</is_ghost>
            </flags>
            <caller_profile>
              <username>5553214321</username>
              <dialplan>XML</dialplan>
              <caller_id_name>DEMO SITE</caller_id_name>
              <caller_id_number>5553214321</caller_id_number>
              <callee_id_name></callee_id_name>
              <callee_id_number></callee_id_number>
              <ani>5553214321</ani>
              <aniii></aniii>
              <network_addr>10.1.1.165</network_addr>
              <rdnis></rdnis>
              <destination_number>5551231234;conf=555;mod;tone=NO_SOUNDS</destination_number>
              <uuid>2dccfdde-279a-11e8-99a6-5903ab961f76</uuid>
              <source>mod_sofia</source>
              <context>public</context>
              <chan_name>sofia/internal/5553214321@10.1.1.125</chan_name>
            </caller_profile>
          </member>
          <member type="caller">
            <join_time type="UNIX-epoch">1521040386</join_time>
            <leave_time type="UNIX-epoch">1521040388</leave_time>
            <flags>
              <is_moderator>true</is_moderator>
              <end_conference>true</end_conference>
              <was_kicked>false</was_kicked>
              <is_ghost>false</is_ghost>
            </flags>
            <caller_profile>
              <username>5553214321</username>
              <dialplan>XML</dialplan>
              <caller_id_name>DEMO SITE</caller_id_name>
              <caller_id_number>5553214321</caller_id_number>
              <callee_id_name></callee_id_name>
              <callee_id_number></callee_id_number>
              <ani>5553214321</ani>
              <aniii></aniii>
              <network_addr>10.1.1.165</network_addr>
              <rdnis></rdnis>
              <destination_number>5551231234;conf=555;mod;tone=NO_SOUNDS</destination_number>
              <uuid>2dccfdde-279a-11e8-99a6-5903ab961f76</uuid>
              <source>mod_sofia</source>
              <context>public</context>
              <chan_name>sofia/internal/5553214321@10.1.1.125</chan_name>
            </caller_profile>
          </member>
        </members>
        <rejected></rejected>
      </conference>
    </cdr>
1 Solution

tiagofbmm
Influencer

The Universal Forwarder doesn't have those parsing capabilities at all, so what leaves your UF is just blocks of uncooked data, not events.

You must put these props configurations on a full Splunk instance: either your heavy forwarder or your indexer.

Remember that data only goes through the parsing pipeline once, so I think the HF is the solution here.
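
A minimal sketch of how the stanza might be placed on the heavy forwarder (default install path assumed; the settings are carried over from the question, not a verified fix):

```
# $SPLUNK_HOME/etc/system/local/props.conf on the HF (or indexer)
[conf_cdr_xml]
KV_MODE = xml
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = \<\/cdr\>
MAX_EVENTS = 100000
TRUNCATE = 100000
```

One caveat worth noting: BREAK_ONLY_BEFORE only takes effect when SHOULD_LINEMERGE is true, so as written above the breaker line is effectively ignored.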


keishamtcs
Explorer

Hi Mike,

I am also facing the same issue. Were you able to fix your issue?

Regards


mwcooley
Explorer

Hi @keishamtcs ,

I added the props.conf file to the indexer as suggested in the answer by @tiagofbmm. Well, he actually suggested the heavy forwarder, but that's all controlled at the corporate level, so they chose to add it to the indexer instead.


niketn
Legend

@mwcooley, with SHOULD_LINEMERGE turned on (i.e. true), an event break at 257 lines is not an indication of MAX_EVENTS being hit. Rather, it indicates that Splunk is unable to identify a timestamp in the lines it has parsed. Since the timestamp in your log is a Unix epoch timestamp, you should add timestamp-extraction settings to props.conf, i.e.

TIME_FORMAT=%s
TIME_PREFIX=\<start_time type=\"UNIX-epoch\"\>
MAX_TIMESTAMP_LOOKAHEAD=10

Further, try with the following LINE_BREAKER

LINE_BREAKER=[\>\s]((?=\<cdr\>))
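
Splunk discards whatever the first capture group of LINE_BREAKER matches and starts a new event at that point. As a quick sanity check outside Splunk (using Python's `re` module, which is close to, but not identical to, the PCRE flavor Splunk uses), the pattern above should match just before each `<cdr>`:

```python
import re

# Same pattern as the suggested LINE_BREAKER. Group 1 is a zero-width
# lookahead, so nothing is discarded and the break lands right before <cdr>.
pattern = re.compile(r"[>\s]((?=<cdr>))")

sample = "</cdr>\n<cdr>\n  <conference>"
m = pattern.search(sample)

print(m is not None)                      # True: the breaker matches
print(sample[m.start(1):m.start(1) + 5])  # <cdr>: the new event starts here
```

Python's `re` is only a rough stand-in here; always confirm the breaker against real data in a Splunk test index before rolling it out.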

The following are the other required settings, which you already have:

SHOULD_LINEMERGE=true
KV_MODE = xml

Please try out and confirm.
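
Pulled together, the suggestion above would look roughly like this (MAX_EVENTS and TRUNCATE carried over from the original question; a sketch to try, not a verified fix):

```
[conf_cdr_xml]
SHOULD_LINEMERGE = true
KV_MODE = xml
LINE_BREAKER = [\>\s]((?=\<cdr\>))
TIME_PREFIX = \<start_time type=\"UNIX-epoch\"\>
TIME_FORMAT = %s
MAX_TIMESTAMP_LOOKAHEAD = 10
MAX_EVENTS = 100000
TRUNCATE = 100000
```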



mwcooley
Explorer

Thanks, @tiagofbmm. Looks like you are correct. I've opened a ticket with the Splunk group and will update once I've confirmed.


mwcooley
Explorer

Yep, just as you stated. Thanks again.


tiagofbmm
Influencer

You're welcome.


cmerriman
Super Champion
[conf_cdr_xml]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
BREAK_ONLY_BEFORE=\<\/cdr\>
MAX_EVENTS=100000
disabled=false

I think something like this should work.


mwcooley
Explorer

Hi,

Didn't work. I've been reading more, and now I'm wondering if my events are being truncated further down the pipeline. I'm doing all this work on a universal forwarder. The next hop is a heavy forwarder, then the indexer. Could it be that one of those is truncating my events?
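
If you can't inspect the downstream configs yourself, whoever administers the heavy forwarder or indexer can check which props settings are actually in effect there. A sketch of the standard btool invocation (default install path assumed):

```
$SPLUNK_HOME/bin/splunk btool props list conf_cdr_xml --debug
```

This prints the merged settings for the conf_cdr_xml sourcetype along with the file each one comes from, which would show whether a default TRUNCATE or MAX_EVENTS is overriding your stanza.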


mwcooley
Explorer

And in case you're wondering why I don't just check there: it's a corporate instance of Splunk, and I have no access to the HF or the indexer. I did reach out to that group as well.


mwcooley
Explorer

After a bit more reading, I think I should be using LINE_BREAKER instead of BREAK_ONLY_BEFORE. I tried it by simply substituting LINE_BREAKER for BREAK_ONLY_BEFORE in the props.conf example above. Still no luck.
