Hello everyone,
I have the attached file that is generated every night via my client's internal system and I need to index the information to collect metrics.
Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName Canal Ramal ID do Logger Dia Chamadas
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------- -------------------------------------------------- ------------ ------------------------------ -----------
agent1 333 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Sala de Reuniao - Sala 123 161 333333333 11111111 31/03/2020 1
Sala de Reuniao - Sala 545 157 44444444 11111111 31/03/2020 1
agent2 106 66666666 11111111 31/03/2020 11
agent2 106 88888888 11111111 31/03/2020 11
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
(9 rows(s) affected)
However, the file it generates is completely unstructured, and due to the company's business rules I cannot use Python or any other language on the server where the system is hosted.
I need to extract the fields (Agent_Name, Canal, Ramal, Id do Logger, Dia); an example is below.
Keep in mind that the file size varies: some days it generates many lines, other days only a few.
agent1 33 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Is there a way to do this with Splunk?
props.conf
[mergecsv]
LINE_BREAKER = affected\)([\r\n]+)
REPORT-mergecsv = merge_csv
SEDCMD-footer = s/\(.*\)//g
SEDCMD-header = s/(?ms).*---\s+//g
SHOULD_LINEMERGE = 0
TIME_FORMAT = %F %T
TIME_PREFIX = Executing\s
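As a quick sanity check outside Splunk, here is an illustrative Python sketch of what the TIME_PREFIX/TIME_FORMAT pair does on the job banner line (note that Python's strptime does not accept %F/%T, so the format is written out in full):

```python
import re
from datetime import datetime

# Illustrative stand-in for Splunk's TIME_PREFIX + TIME_FORMAT pair.
# TIME_PREFIX = Executing\s  skips ahead to the text after "Executing ",
# and TIME_FORMAT = %F %T is shorthand for %Y-%m-%d %H:%M:%S.
header = ("Job 'NICE Kaizen Job' : Step 1, 'Run script' : "
          "Began Executing 2020-03-31 22:00:00")

m = re.search(r"Executing\s(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", header)
timestamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
print(timestamp.isoformat())  # 2020-03-31T22:00:00
```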
transforms.conf
[merge_csv]
REGEX = (?-ms)(?<Agent_Name>\w.*)\s+(?<Canal>\d+)\s+?(?<Ramal>\d+)\s+?(?<Id_do_Logger>\d+)\s+?(?<Dia>[\d\/]+)\s+?(?<Chamadas>\d+)
MV_ADD = true
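The transforms.conf REGEX can also be sanity-checked outside Splunk with a short Python sketch (Python spells named groups `(?P<...>)`, and the `(?-ms)` flag toggle is dropped because those modes are off by default in `re`; sample spacing is illustrative):

```python
import re

# Python port of the merge_csv REGEX from transforms.conf.
PATTERN = re.compile(
    r"(?P<Agent_Name>\w.*)\s+(?P<Canal>\d+)\s+?(?P<Ramal>\d+)"
    r"\s+?(?P<Id_do_Logger>\d+)\s+?(?P<Dia>[\d/]+)\s+?(?P<Chamadas>\d+)"
)

rows = [
    "agent1                         333   22222222    00000000   31/03/2020   17",
    "Sala de Reuniao - Sala 123     161   333333333   11111111   31/03/2020   1",
]

for row in rows:
    m = PATTERN.search(row)
    # strip() mirrors Splunk's trim(): the greedy \w.* can keep
    # trailing padding spaces inside Agent_Name
    print({name: value.strip() for name, value in m.groupdict().items()})
```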
Separating the combined events was pretty tough for me, but I did it.
I learned a lot.
thank you @leandromatpereira
@to4kawa thanks a lot.
The way it was indexed is perfect for me; it is possible to extract the fields properly.
The only problem is with files that have more fields, where Splunk discards many of them; the larger screenshots and files are below:
Your CSV has only the AgentName field.
Are there always 6 fields?
The 6 fields are always listed; what changes is the number of affected lines, which is tied to the number of agents.
The fields are always these 6: Agent Name, Channel, Extension, Logger ID, Day, Calls.
If you want I can add more examples here.
Add this to props.conf:
TRUNCATE = 0
| makeresults
| eval _raw=" Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName Canal Ramal ID do Logger Dia Chamadas
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------- -------------------------------------------------- ------------ ------------------------------ -----------
agent1 333 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Sala de Reuniao - Sala 123 161 333333333 11111111 31/03/2020 1
Sala de Reuniao - Sala 545 157 44444444 11111111 31/03/2020 1
agent2 106 66666666 11111111 31/03/2020 11
agent2 106 88888888 11111111 31/03/2020 11
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
(9 rows(s) affected)"
| rename COMMENT as "This is your sample"
| rex mode=sed "s/^\s+//g"
| makemv delim="
" _raw
| rename _raw as raw
| mvexpand raw
| rex field=raw "(?ms)(?<Agent_Name>.*?)\s{3,}(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+(?<Dia>\S+)\s+(?<Chamadas>\d+)"
| fields - _* raw
| table Agent_Name, Canal, Ramal, Id_do_Logger, Dia, Chamadas
| foreach *
[ eval <<FIELD>> = trim('<<FIELD>>')]
| search Agent_Name=*
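The rex above leans on \s{3,} (a run of three or more spaces) as the column separator, which is what lets multi-word agent names with single internal spaces survive intact. A minimal Python sketch of the same idea, with illustrative spacing:

```python
import re

# Same idea as the SPL rex: a lazy name followed by a run of 3+ spaces
# as the column boundary, so single spaces inside the agent name are kept.
PATTERN = re.compile(
    r"(?P<Agent_Name>.*?)\s{3,}(?P<Canal>\d+)\s+(?P<Ramal>\d+)"
    r"\s+(?P<Id_do_Logger>\d+)\s+(?P<Dia>\S+)\s+(?P<Chamadas>\d+)"
)

line = "TI Count FL 545454 (2)      134   999999999   11111111   31/03/2020   6"
m = PATTERN.search(line)
print(m.group("Agent_Name"))  # TI Count FL 545454 (2)
print(m.group("Canal"))       # 134
```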
Please ask admin to edit props.conf and extract appropriate fields.
props.conf
[simplecsv]
SHOULD_LINEMERGE = 0
TRUNCATE = 0
TIME_FORMAT = %d/%m/%Y
TIME_PREFIX = (?ms)(?<Agent_Name>.*?)\s{3,}(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+
TRANSFORMS-nulls = null1, null2, null3, null4
REPORT-csv = simple_csv
category = Splunk App Add-on Builder
pulldown_type = 1
transforms.conf
[null1]
REGEX = ---
DEST_KEY = queue
FORMAT = nullQueue
[null2]
REGEX = AgentName
DEST_KEY = queue
FORMAT = nullQueue
[null3]
REGEX = affected
DEST_KEY = queue
FORMAT = nullQueue
[null4]
REGEX = Began
DEST_KEY = queue
FORMAT = nullQueue
[simple_csv]
REGEX = (?<Agent_Name>\w.*)\s+(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+(?<Dia>\S+)\s+(?<Chamadas>\d+)
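To see how the pieces fit together, here is a hedged Python sketch of the whole pipeline: the four null-queue REGEXes drop the banner, header, separator, and footer lines; the simple_csv regex extracts the six fields; and Dia parses with %d/%m/%Y. Sample lines and spacing are illustrative:

```python
import re
from datetime import datetime

# Lines matching any of these are what the null1..null4 transforms
# would send to nullQueue (separator, header, footer, job banner).
NULLQUEUE = [re.compile(p) for p in (r"---", r"AgentName", r"affected", r"Began")]

# The simple_csv extraction regex, ported to Python named groups.
CSV = re.compile(
    r"(?P<Agent_Name>\w.*)\s+(?P<Canal>\d+)\s+(?P<Ramal>\d+)"
    r"\s+(?P<Id_do_Logger>\d+)\s+(?P<Dia>\S+)\s+(?P<Chamadas>\d+)"
)

raw = """Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName   Canal   Ramal   ID do Logger   Dia   Chamadas
---------   -----   -----   ------------   ---   --------
agent1      333     22222222    00000000    31/03/2020    17
(9 rows(s) affected)"""

for line in raw.splitlines():
    if any(p.search(line) for p in NULLQUEUE):
        continue                      # dropped, as by the null queue
    m = CSV.search(line)
    if m:
        dia = datetime.strptime(m.group("Dia"), "%d/%m/%Y")  # TIME_FORMAT
        print(m.group("Agent_Name").strip(), m.group("Canal"), dia.date())
```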
Why make a shell script?
You don't need to go to that much trouble.
Thanks for the answer,
but I have some doubts:
What configuration should I put in my props.conf file to index the logs the way I need?
I tried it with examples of slightly larger files and it didn't work correctly. I couldn't post my example log file in txt format here, which I think would make the analysis easier; is it possible to send it by e-mail?
When I paste part of the text from the log file, it "breaks" because of the number of characters.
"index"? I don't know your index.
"breaks" at a certain line number? Isn't that TRUNCATE?
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf
props.conf does only the simple line breaking and the time extraction;
transforms.conf does the null-queue filtering and the field extraction.
Good luck.
Here is my Dropbox link with the file.
The file is complete, in its full format; just open it with Notepad or Notepad++.
thank you @leandromatpereira
I checked your log and tested it.
On my Splunk, both confs are OK.
Please check my updated answer and try it.
thank you @to4kawa
Let me see and I will return here.
@to4kawa thanks a lot.
I would just like to check whether the timestamp can be taken from the value "Began Executing 2020-03-31 22:00:00".
I noticed that the timestamp is the current time, but it should be the time the routine was executed.
Is it also possible to have all the fields on the same line?
I don't think I quite understood, so I'm sending a link to the original file.
See if you can index this file in Splunk under a new sourcetype with the regex settings.
The file should only be opened in Notepad or Notepad++.
https://www.splunk.com/
I see; my answer is updated.
On my Splunk, the field extractions are good.
Check it and try it.