Hello all,
I have been banging my head on a problem for the past 24 hours and I am in great need of your help.
I am processing survey data. I have a tabular data set that contains three multivalue fields:
- one multivalue field that contains the questions that were shown to the user (example: q1, q2, q3)
- one multivalue field that contains the questions that were answered by the user (example: q2, q3)
- one multivalue field that contains the time it took the respondent to answer those questions, in the format <question-code>:<duration>
All this data is dynamic (including question references), and I cannot make any assumptions about the fields' content or names.
What I am interested in is reconciling each answered question with the time it took to answer it. Ultimately, I want to check that the duration for every answered question is above a certain static threshold (example: 2000ms).
I am at the stage where, using rex and mvfilter, I am able to generate the data below from raw events, and to know whether a question has been answered and whether it has a duration. Unfortunately, I have not been able to extract the duration and compare it with the threshold, because rex and match in mvfilter do not support dynamic values. My idea was to mvexpand the shown questions, and then determine for each shown question whether it has been answered, and how long it took. So far my search looks like this:
| rex field=_raw max_match=0 "question_answer_(?<answered_questions>[a-zA-Z_]*)\""
| rex field=_raw max_match=0 "question_duration_(?<question_durations>[a-zA-Z_]*\":\d*),"
| replace "*\"*" with "**" in question_durations
| table shown_questions, answered_questions, question_durations, *
| mvexpand shown_questions
| eval is_answered=if(match(answered_questions, shown_questions), "true", "false")
| eval has_duration=if(match(question_durations, shown_questions), "true", "false")
Here I am a bit lost, as it is unclear to me how I can work with dynamic values to extract the question duration, either from the multivalue field question_durations or from the individual fields question_duration_<question-code>. I originally thought of using mvfilter(match(question_durations, shown_questions)) to extract the entry for the question, but it does not work because of the dynamic shown_questions parameter. Putting a static value in there works, but unfortunately that is not an option for me.
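One possible direction is sketched below. It is untested and continues the search above. It avoids the dynamic match by also expanding question_durations, splitting each value into its question code and duration, and keeping only the row where the code equals the shown question. It assumes each question_durations value looks like <question-code>:<digits> after the replace step, and that 2000 is the threshold in milliseconds. The field names duration_question, duration_ms and above_threshold are made up for illustration.

```spl
| where is_answered == "true"
| mvexpand question_durations
| rex field=question_durations "^(?<duration_question>[a-zA-Z_]+):(?<duration_ms>\d+)$"
| where duration_question == shown_questions
| eval above_threshold=if(tonumber(duration_ms) > 2000, "true", "false")
```

The where clause compares the two fields directly, so no regular expression has to be built from a field value. An answered question with no matching duration entry is dropped by that comparison and would need separate handling.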
Ultimately, my objective is to be able to check if the duration of all questions that have been answered is above a certain threshold.
Would you have an idea on how I could achieve this?
Example of data (before the mvexpand):