Hello everyone,
I have the attached file that is generated every night via my client's internal system and I need to index the information to collect metrics.
Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName Canal Ramal ID do Logger Dia Chamadas
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------- -------------------------------------------------- ------------ ------------------------------ -----------
agent1 333 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Sala de Reuniao - Sala 123 161 333333333 11111111 31/03/2020 1
Sala de Reuniao - Sala 545 157 44444444 11111111 31/03/2020 1
agent2 106 66666666 11111111 31/03/2020 11
agent2 106 88888888 11111111 31/03/2020 11
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
(9 rows(s) affected)
However, the file it generates is completely unstructured, and due to the company's business rules I cannot use Python or any other language on the server where the system is hosted.
I need to extract the fields (Agent_Name, Canal, Ramal, Id do Logger, Dia); an example is below.
Keep in mind that the file size varies: some days it generates many lines, other days only a few.
agent1 33 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Is there a way to do this with Splunk?
props.conf
[mergecsv]
LINE_BREAKER = affected\)([\r\n]+)
REPORT-mergecsv = merge_csv
SEDCMD-footer = s/\(.*\)//g
SEDCMD-header = s/(?ms).*---\s+//g
SHOULD_LINEMERGE = 0
TIME_FORMAT = %F %T
TIME_PREFIX = Executing\s
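As a quick sanity check outside Splunk, here is an illustrative Python sketch of what the TIME_PREFIX/TIME_FORMAT pair does on the job banner line (note that Python's strptime does not accept %F/%T, so the format is written out in full):

```python
import re
from datetime import datetime

# Illustrative stand-in for Splunk's TIME_PREFIX + TIME_FORMAT pair.
# TIME_PREFIX = Executing\s  skips ahead to the text after "Executing ",
# and TIME_FORMAT = %F %T is shorthand for %Y-%m-%d %H:%M:%S.
header = ("Job 'NICE Kaizen Job' : Step 1, 'Run script' : "
          "Began Executing 2020-03-31 22:00:00")

m = re.search(r"Executing\s(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", header)
timestamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
print(timestamp.isoformat())  # 2020-03-31T22:00:00
```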
transforms.conf
[merge_csv]
REGEX = (?-ms)(?<Agent_Name>\w.*)\s+(?<Canal>\d+)\s+?(?<Ramal>\d+)\s+?(?<Id_do_Logger>\d+)\s+?(?<Dia>[\d\/]+)\s+?(?<Chamadas>\d+)
MV_ADD = true
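The transforms.conf REGEX can also be sanity-checked outside Splunk with a short Python sketch (Python spells named groups `(?P<...>)`, and the `(?-ms)` flag toggle is dropped because those modes are off by default in `re`; sample spacing is illustrative):

```python
import re

# Python port of the merge_csv REGEX from transforms.conf.
PATTERN = re.compile(
    r"(?P<Agent_Name>\w.*)\s+(?P<Canal>\d+)\s+?(?P<Ramal>\d+)"
    r"\s+?(?P<Id_do_Logger>\d+)\s+?(?P<Dia>[\d/]+)\s+?(?P<Chamadas>\d+)"
)

rows = [
    "agent1                         333   22222222    00000000   31/03/2020   17",
    "Sala de Reuniao - Sala 123     161   333333333   11111111   31/03/2020   1",
]

for row in rows:
    m = PATTERN.search(row)
    # strip() mirrors Splunk's trim(): the greedy \w.* can keep
    # trailing padding spaces inside Agent_Name
    print({name: value.strip() for name, value in m.groupdict().items()})
```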
Separating the combined events was pretty tough for me, but I did it.
I learned a lot.
thank you @leandromatpereira
@to4kawa thanks a lot.
The way it was indexed is perfect for me; it is possible to extract the fields properly.
The only problem is with files that have more fields, where Splunk discards many of them; the larger screenshots and files are below:
Your CSV has only the AgentName field.
Are there always 6 fields?
The 6 fields are always listed; what changes is the number of affected lines, which is tied to the number of agents.
The fields are always these 6: Agent Name, Channel, Extension, Logger ID, Day, Calls.
If you want I can add more examples here.
Add this to props.conf:
TRUNCATE = 0
| makeresults
| eval _raw=" Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName Canal Ramal ID do Logger Dia Chamadas
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------- -------------------------------------------------- ------------ ------------------------------ -----------
agent1 333 22222222 00000000 31/03/2020 17
agent1 334 22222222 00000000 31/03/2020 2
Sala de Reuniao - Sala 123 161 333333333 11111111 31/03/2020 1
Sala de Reuniao - Sala 545 157 44444444 11111111 31/03/2020 1
agent2 106 66666666 11111111 31/03/2020 11
agent2 106 88888888 11111111 31/03/2020 11
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
TI Count FL 545454 (2) 134 999999999 11111111 31/03/2020 6
(9 rows(s) affected)"
| rename COMMENT as "This is your sample"
| rex mode=sed "s/^\s+//g"
| makemv delim="
" _raw
| rename _raw as raw
| mvexpand raw
| rex field=raw "(?ms)(?<Agent_Name>.*?)\s{3,}(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+(?<Dia>\S+)\s+(?<Chamadas>\d+)"
| fields - _* raw
| table Agent_Name, Canal, Ramal, Id_do_Logger, Dia, Chamadas
| foreach *
[ eval <<FIELD>> = trim('<<FIELD>>')]
| search Agent_Name=*
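The rex above leans on \s{3,} (a run of three or more spaces) as the column separator, which is what lets multi-word agent names with single internal spaces survive intact. A minimal Python sketch of the same idea, with illustrative spacing:

```python
import re

# Same idea as the SPL rex: a lazy name followed by a run of 3+ spaces
# as the column boundary, so single spaces inside the agent name are kept.
PATTERN = re.compile(
    r"(?P<Agent_Name>.*?)\s{3,}(?P<Canal>\d+)\s+(?P<Ramal>\d+)"
    r"\s+(?P<Id_do_Logger>\d+)\s+(?P<Dia>\S+)\s+(?P<Chamadas>\d+)"
)

line = "TI Count FL 545454 (2)      134   999999999   11111111   31/03/2020   6"
m = PATTERN.search(line)
print(m.group("Agent_Name"))  # TI Count FL 545454 (2)
print(m.group("Canal"))       # 134
```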
Please ask admin to edit props.conf and extract appropriate fields.
props.conf
[simplecsv]
SHOULD_LINEMERGE = 0
TRUNCATE = 0
TIME_FORMAT = %d/%m/%Y
TIME_PREFIX = (?ms)(?<Agent_Name>.*?)\s{3,}(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+
TRANSFORMS-nulls = null1, null2, null3, null4
REPORT-csv = simple_csv
category = Splunk App Add-on Builder
pulldown_type = 1
transforms.conf
[null1]
REGEX = ---
DEST_KEY = queue
FORMAT = nullQueue
[null2]
REGEX = AgentName
DEST_KEY = queue
FORMAT = nullQueue
[null3]
REGEX = affected
DEST_KEY = queue
FORMAT = nullQueue
[null4]
REGEX = Began
DEST_KEY = queue
FORMAT = nullQueue
[simple_csv]
REGEX = (?<Agent_Name>\w.*)\s+(?<Canal>\d+)\s+(?<Ramal>\d+)\s+(?<Id_do_Logger>\d+)\s+(?<Dia>\S+)\s+(?<Chamadas>\d+)
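To see how the pieces fit together, here is a hedged Python sketch of the whole pipeline: the four null-queue REGEXes drop the banner, header, separator, and footer lines; the simple_csv regex extracts the six fields; and Dia parses with %d/%m/%Y. Sample lines and spacing are illustrative:

```python
import re
from datetime import datetime

# Lines matching any of these are what the null1..null4 transforms
# would send to nullQueue (separator, header, footer, job banner).
NULLQUEUE = [re.compile(p) for p in (r"---", r"AgentName", r"affected", r"Began")]

# The simple_csv extraction regex, ported to Python named groups.
CSV = re.compile(
    r"(?P<Agent_Name>\w.*)\s+(?P<Canal>\d+)\s+(?P<Ramal>\d+)"
    r"\s+(?P<Id_do_Logger>\d+)\s+(?P<Dia>\S+)\s+(?P<Chamadas>\d+)"
)

raw = """Job 'NICE Kaizen Job' : Step 1, 'Run script' : Began Executing 2020-03-31 22:00:00
AgentName   Canal   Ramal   ID do Logger   Dia   Chamadas
---------   -----   -----   ------------   ---   --------
agent1      333     22222222    00000000    31/03/2020    17
(9 rows(s) affected)"""

for line in raw.splitlines():
    if any(p.search(line) for p in NULLQUEUE):
        continue                      # dropped, as by the null queue
    m = CSV.search(line)
    if m:
        dia = datetime.strptime(m.group("Dia"), "%d/%m/%Y")  # TIME_FORMAT
        print(m.group("Agent_Name").strip(), m.group("Canal"), dia.date())
```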
Why make a shell script?
You don't need to go to that much trouble.
Thanks for the answer,
but I have some doubts:
What configuration should I put in my props.conf file to index the logs the way I need?
I tried it with examples of slightly larger files and it didn't work correctly. I couldn't post my example log file in txt format here, which I think would make the analysis easier; is it possible to send it by e-mail?
When I paste part of the text from the log file, it "breaks" because of the number of characters.
"index"? I don't know your index.
"breaks" at a certain line number? Isn't that TRUNCATE?
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf
props.conf does only the simple line breaking and the time extraction;
transforms.conf does the null-queue filtering and the field extraction.
Good luck.
Here is my Dropbox link with the file.
The file is complete, in its full format; just open it with Notepad or Notepad++.
thank you @leandromatpereira
I checked your log and tested it.
On my Splunk, both confs are OK.
Please check my updated answer and try it.
thank you @to4kawa
Let me see and I will return here.
@to4kawa thanks a lot.
I would just like to check whether the timestamp can be taken from the value "Began Executing 2020-03-31 22:00:00".
I noticed that the timestamp is the current time, but it should be the time the routine was executed.
Is it also possible to have all the fields on the same line?
I don't think I quite understood, so I'm sending a link to the original file.
See if you can index this file in Splunk under a new sourcetype with the regex settings.
The file should only be opened in Notepad or Notepad++.
https://www.splunk.com/
I see; my answer is updated.
On my Splunk, the field extractions are good.
Check it and try it.