multikv missing column values causing parsing issues

stanwin
Contributor

Hi

Is there any workaround in multikv.conf for this? Columns with missing values are being assigned the values of the next column that has one.

 Subsystem/Job  User        Number   User        Type Pool Pty     CPU   Int    Rsp  AuxIO   CPU%  Function       Status   Threads
   JDENET_K     ONEWORLD    01267   ONEWORLD    BCI    8  20   15884.2                  1    1.9  jvmStart  DEQW         33
   QSRVERR      QUSER       00129   ONEWORLD    PJ     2  20   18277.8               3832     .9                  CNDW          1

Space works fine as the delimiter as long as no field value is missing.

Int and Rsp are blank, so they get the values of AuxIO and CPU% respectively. Likewise, if any other column is empty, every following value shifts left by one position.
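
The shift is easy to reproduce outside Splunk. Below is a minimal Python sketch, assuming the row is tokenized purely on whitespace. It only mimics the delimiter behaviour described above and is not how multikv is implemented. The name Current_User for the second User column is an assumption made so the keys stay unique.

```python
# Column names from the header line, in order.
# "Current_User" is a made-up name for the second "User" column.
HEADER = ["Subsystem/Job", "User", "Number", "Current_User", "Type", "Pool",
          "Pty", "CPU", "Int", "Rsp", "AuxIO", "CPU%", "Function", "Status",
          "Threads"]

# Second sample row: Int, Rsp and Function are blank.
ROW = ("QSRVERR QUSER 00129 ONEWORLD PJ 2 20 18277.8"
       "               3832     .9                  CNDW          1")

# Splitting on whitespace collapses the blank columns entirely,
# so only 12 tokens remain for 15 header names.
tokens = ROW.split()
naive = dict(zip(HEADER, tokens))

print(len(tokens))                                   # 12
print(naive["Int"], naive["Rsp"], naive["AuxIO"])    # 3832 .9 CNDW
```

Int picks up the AuxIO value, Rsp picks up the CPU% value, and the last three header names get nothing at all.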


woodcock
Esteemed Legend

OK, add this to $SPLUNK_HOME/etc/system/local/multikv.conf:

[myScriptedInputFieldsByPositionToHandleGaps]
#pre.start     = *
pre.linecount = 10
# List our preferred column names
header.tokens = _token_list_, Subsystem_Job, User, Number, Current_User, Type, Pool, Pty, CPU, Elapsed_Int, Elapsed_Rsp, Elapsed_AuxIO, Elapsed_CPU_PCT, Function, Status, Threads
body.tokens = _chop_, (0,15),(0,12),(0,9),(0,12),(0,5),(0,5),(0,8),(0,6),(0,7),(0,5),(0,8),(0,6),(0,15),(0,9),(0,5)

Then you use it like this:

... | multikv conf=myScriptedInputFieldsByPositionToHandleGaps

Keep in mind that I could not find a single example of how to use the _chop_ syntax correctly, so my body.tokens line is a HUGE guess and probably will not work on the first try. For example, is the offset mentioned here relative to the previous token (so always 0), or 0-based from the first character of the line? I assumed the former, but the spec only says:

<chopper> = _chop_, <int-list>               
* Transform each string into a list of tokens specified by <int-list>.
* <int-list> is a list of (offset, length) tuples.

You will have to play around but I am sure this is pretty close to what is meant here:

http://docs.splunk.com/Documentation/Splunk/6.2.5/Admin/Multikvconf


hire_vladimir
Explorer

The .spec file for _chop_ and the other processors is a tad misleading. The correct syntax is shown below. Note the comma separation for the tuples: the list is flat, with no parentheses.

multikv.conf
[lsof_fixed]
body.ignore = _regex_, "^(?:COMMAND|---\sOpen files:)"

header.tokens = _token_list_, COMMAND, PID, TID, USER, FD, TYPE, DEVICE, SIZE_OFF, NODE, NAME

# ^(?<COMMAND>.{9})\s(?<PID>.{4})\s(?<TID>.{4})\s(?<USER>.{7})\s(?<FD>.{5})\s(?<TYPE>.{8})\s(?<DEVICE>.{18})\s(?<SIZE_OFF>.{9})\s(?<NODE>.{10})\s(?<NAME>.+)

# COMMAND, PID, TID, USER, FD, TYPE, DEVICE, SIZE_OFF, NODE, NAME

body.tokens = _chop_, 0, 9, 10, 4, 15, 4, 20, 7, 28, 5, 34, 8, 43, 18, 62, 9, 72, 10, 83, 100

sample data:
--- Open files: 'lsof' started on Fri Jul 12 20:57:38 UTC 2019 ---
COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root cwd DIR 202,1 224 34 /
systemd 1 root txt REG 202,1 1644360 4447583 /usr/lib/systemd/systemd
systemd 1 root 22u unix 0xffff9e22222cdc00 0t0 32250 /run/systemd/journal/stdout
auditd 1221 1532 root 7u unix 0xffff9e2222458800 0t0 13340 socket
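
To see how the body.tokens numbers in the [lsof_fixed] stanza line up, here is a minimal Python sketch that reads the list as (offset, length) pairs counted from the start of the line. That reading is inferred from the values themselves (0, 10, 15, 20, ...), not from Splunk's source. The padded sample line is also an assumption, since the spacing in the pasted sample was collapsed.

```python
NAMES = ["COMMAND", "PID", "TID", "USER", "FD", "TYPE", "DEVICE",
         "SIZE_OFF", "NODE", "NAME"]

# Same numbers as the body.tokens line, without the processor name.
CHOP = [0, 9, 10, 4, 15, 4, 20, 7, 28, 5, 34, 8, 43, 18, 62, 9, 72, 10, 83, 100]


def chop(line, spec):
    """Slice line by consecutive (offset, length) pairs taken from a flat list."""
    pairs = zip(spec[0::2], spec[1::2])
    return [line[offset:offset + length].strip() for offset, length in pairs]


# Hypothetical fixed-width rendering of the first systemd row (TID is blank):
# each column padded to its width, with one space between columns.
values = ["systemd", "1", "", "root", "cwd", "DIR", "202,1", "224", "34"]
widths = [9, 4, 4, 7, 5, 8, 18, 9, 10]
line = " ".join(v.ljust(w) for v, w in zip(values, widths)) + " " + "/"

row = dict(zip(NAMES, chop(line, CHOP)))
print(row["PID"], repr(row["TID"]), row["USER"], row["NAME"])   # 1 '' root /
```

Because each field is cut by position, the empty TID column stays empty instead of swallowing the USER value.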


woodcock
Esteemed Legend

Your data is position-oriented, so don't use multikv. Do this instead:

In props.conf:

[mySourceType]
REPORT-mySourceType = position_based_fields

In transforms.conf:

[position_based_fields]
REGEX = ^(.{15})(.{12})(.{9})(.{12})(.{5})(.{5})(.{8})(.{6})(.{7})(.{5})(.{8})(.{6})(.{15})(.{9})(.*)$
FORMAT = Subsystem_Job::$1 User1::$2 Number::$3 User2::$4 Type::$5 Pool::$6 Pty::$7 CPU::$8 Int::$9 Rsp::$10 AuxIO::$11 CPU_PCT::$12 Function::$13 Status::$14 Threads::$15
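
The regex can be tried out before it goes into transforms.conf. Below is a minimal Python sketch that applies the same pattern to one row. The row is padded to the widths used in the regex, which is an assumption: the real command output has to be measured, and values wider than their group (for example a 7-character CPU value against `.{6}`) would break the alignment.

```python
import re

# Same pattern as the REGEX line, split across two string literals.
POSITIONAL = re.compile(
    r"^(.{15})(.{12})(.{9})(.{12})(.{5})(.{5})(.{8})(.{6})"
    r"(.{7})(.{5})(.{8})(.{6})(.{15})(.{9})(.*)$"
)
NAMES = ["Subsystem_Job", "User1", "Number", "User2", "Type", "Pool", "Pty",
         "CPU", "Int", "Rsp", "AuxIO", "CPU_PCT", "Function", "Status",
         "Threads"]
WIDTHS = [15, 12, 9, 12, 5, 5, 8, 6, 7, 5, 8, 6, 15, 9]

# Hypothetical row padded to WIDTHS; Int, Rsp and Function are blank.
values = ["QSQSRVR", "QUSER", "059453", "ONEWORLD", "PJ", "2", "20",
          "738.3", "", "", "18", ".0", "", "CNDW", "1"]
line = "".join(v.ljust(w) for v, w in zip(values, WIDTHS)) + values[-1]

match = POSITIONAL.match(line)
fields = {name: group.strip() for name, group in zip(NAMES, match.groups())}
print(repr(fields["Int"]), fields["AuxIO"], fields["Status"])   # '' 18 CNDW
```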

stanwin
Contributor

Thanks a lot woodcock!

I will assess and let you know!


stanwin
Contributor

This does not work exactly for my case.

I have output like this:

 5770SS1 V7R1M0 100423                  Work with Active Jobs                                    8/21/15  1:16:47        Page    1
 Reset . . . . . . . . . . . . . . . . . :   *NO
 Subsystems  . . . . . . . . . . . . . . :   *ALL
 CPU Percent Limit . . . . . . . . . . . :   *NONE
 Response Time Limit . . . . . . . . . . :   *NONE
 Sequence  . . . . . . . . . . . . . . . :   *CPU
 Job name  . . . . . . . . . . . . . . . :   *ALL
 CPU %  . . . :    23.1          Elapsed time . . . . . . . :   00:22:51           Active jobs . . . . . . :   2531
                                     Current                             --------Elapsed---------
 Subsystem/Job  User        Number   User        Type Pool Pty     CPU   Int    Rsp  AuxIO   CPU%  Function       Status   Threads
   JDENET_K     ONEWORLD    006767   ONEWORLD    BCI    8  20   38584.2                  1    1.9  PGM-jvmStartPa  DEQW         33
   QSQSRVR      QUSER       005759   ONEWORLD    PJ     2  20   18277.8               3832     .9                  CNDW          1
   JDENET_K     ONEWORLD    006829   ONEWORLD    BCI    8  20   14312.4                  0    1.4  PGM-JDENET_K    DEQW          1
   QDBFSTCCOL   QSYS        004843   QSYS        SYS    2  50    4121.9                  1     .0                  EVTW          7
   JDENET_K     ONEWORLD    006773   ONEWORLD    BCI    8  20    2781.9                  1     .3  PGM-JDENET_K    DEQW          1
   JDENET_K     ONEWORLD    006771   ONEWORLD    BCI    8  20    1864.7                 18     .1  PGM-jvmStartPa  DEQW         46
   DCM          BMCAGENT    007807   BMCUSER     BCI    2  25    1623.2                 69     .1  PGM-DCM         SELW          1
   PATROLAGEN   BMCAGENT    007792   BMCAGENT    BCH    2  40    1042.6               3049     .0  PGM-PATROLAGEN  SELW          1
   QZDASOINIT   QUSER       060190   AJKINGS     PJ     2  20    1004.1                  0     .0                  TIMW          1
   JDENET_K     ONEWORLD    061116   ONEWORLD    BCI    8  20     923.6                  0     .2  PGM-jvmStartPa  DEQW         55
   QSQSRVR      QUSER       059453   ONEWORLD    PJ     2  20     738.3                 18     .0                  CNDW          1
   JDENET_N     ONEWORLD    006826   ONEWORLD    BCI    8  20     632.6                 99     .0  PGM-JDENET_N    SELW          1
   JDENET_N     ONEWORLD    006822   ONEWORLD    BCI    8  20     596.3                 50     .0  PGM-JDENET_N    SELW          1
   QSQSRVR      QUSER       007352   ONEWORLD    PJ     2  20     523.1                  0     .0                  CNDW          1
   LOG6110      OWUSER      005817   OWUSER      BCH    2  10     508.0                  9     .0  PGM-LOG6110     EVTW          1
   AS400COLLE   BMCAGENT    007808   BMCUSER     BCI    2  25     477.2                 63     .0  PGM-COLLECT     SELW          1
   QSQSRVR      QUSER       013311   OWUSER      PJ     2  20     453.5                  0     .0                  CNDW          1
   JDENET_N     ONEWORLD    006783   ONEWORLD    BCI    8  20     390.1                 14     .0  PGM-JDENET_N    SELW          1

So I need each row extracted as its own set of key-value pairs.

With the solution above, only a single line is extracted, and the last column captures all of the remaining log data.

Also, with multikv I can use a custom conf stanza to skip the headers and the other preamble information.
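
The requirement can be stated as a small Python sketch: drop everything up to and including the column-header line, then treat each remaining non-blank line as one record. The function name body_rows and the shortened sample text are hypothetical. This only illustrates the behaviour wanted from the pre, header and body sections of a multikv stanza.

```python
def body_rows(event_text, header_prefix="Subsystem/Job"):
    """Yield the data rows that follow the column-header line."""
    seen_header = False
    for line in event_text.splitlines():
        if not seen_header:
            # Everything before the header (and the header itself) is skipped.
            seen_header = line.strip().startswith(header_prefix)
            continue
        if line.strip():
            yield line


# Shortened stand-in for one multiline event.
event = "\n".join([
    " CPU %  . . . :    23.1",
    " Subsystem/Job  User        Number",
    "   JDENET_K     ONEWORLD    006767",
    "   QSQSRVR      QUSER       005759",
])

rows = list(body_rows(event))
print(len(rows))   # 2
```

Each yielded row can then be cut by position, so a blank column in one row does not affect any other row.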


woodcock
Esteemed Legend

So this is a scripted input? You did not mention that these were multiline events; that makes all the difference in the world.


stanwin
Contributor

Hi Woodcock

Yes, it's command output that produces the multiline data. Sorry if that wasn't clear earlier.
