I have a response from one of the client applications like this:
{ "employees": { "2023-03-16": {
  "1": { "id": 1, "name": "Michael Scott", "email": "demo@desktime.com", "groupId": 1, "group": "Accounting", "profileUrl": "url.com", "isOnline": false, "arrived": false, "left": false, "late": false, "onlineTime": 0, "offlineTime": 0, "desktimeTime": 0, "atWorkTime": 0, "afterWorkTime": 0, "beforeWorkTime": 0, "productiveTime": 0, "productivity": 0, "efficiency": 0, "work_starts": "23:59:59", "work_ends": "00:00:00", "notes": { "Skype": "Find.me", "Slack": "MichielS" }, "activeProject": [] },
  "2": { "id": 2, "name": "Andy Bernard", "email": "demo3@desktime.com", "groupId": 106345, "group": "Marketing", "profileUrl": "url.com", "isOnline": true, "arrived": "2023-03-16 09:17:00", "left": "2023-03-16 10:58:00", "late": true, "onlineTime": 6027, "offlineTime": 0, "desktimeTime": 6027, "atWorkTime": 6060, "afterWorkTime": 0, "beforeWorkTime": 0, "productiveTime": 4213, "productivity": 69.9, "efficiency": 14.75, "work_starts": "09:00:00", "work_ends": "18:00:00", "notes": { "Background": "Law and accounting" }, "activeProject": { "project_id": 67973, "project_title": "Blue Book", "task_id": 42282, "task_title": "Blue Book task", "duration": 6027 } }
  ..... } },
"__request_time": "1678957028" }
I am facing a problem with the date field "2023-03-16", as this field changes every day. I want to create statistics based on all employee IDs, late employees, email, etc. for the last 7 days. I have used spath but cannot run a wildcard search for all late employees across all days. Thanks
Maybe you can explain the significance of that top-level key "2023-03-16"? There ought to be some semantic significance to the fact that it changes every day. Also, is that the only top-level key? If so, I'd say the developers made a poor design choice. The same can be said about the second-level keys ("1", "2", ...), which seem semantically redundant with the third-level key "id". If you have any influence over the developers, maybe suggest that they get rid of the second-level keys and just make an array of the third-level objects.
Anyway, I am not going to assume any semantic significance of the top-level key(s?) for now. I also assume that your desire for a wildcard search is about searching those third-level keys such as "name" and "email". To my understanding, you want a simple search such as name="michael *" late="true" without having to confront field names such as 2023-03-16.1.name.
This is something you can try:
| spath path=employees
| eval date = json_array_to_mv(json_keys(employees))
| mvexpand date
| eval day_employees = json_extract(employees, date)
| eval employee_id = json_array_to_mv(json_keys(day_employees))
| mvexpand employee_id
| eval day_employees = json_extract(day_employees, employee_id)
| spath input=day_employees
Your sample data (a single date, two ids) would give these field values. (There are too many fields, so the following is transposed.)
fieldname | 1 | 2 |
_mkv_child | 0 | 1 |
activeProject.duration | 6027 | |
activeProject.project_id | 67973 | |
activeProject.project_title | Blue Book | |
activeProject.task_id | 42282 | |
activeProject.task_title | Blue Book task | |
afterWorkTime | 0 | 0 |
arrived | false | 2023-03-16 09:17:00 |
atWorkTime | 0 | 6060 |
beforeWorkTime | 0 | 0 |
date | 2023-03-16 | 2023-03-16 |
desktimeTime | 0 | 6027 |
efficiency | 0 | 14.75 |
email | demo@desktime.com | demo3@desktime.com |
employee_id | 1 | 2 |
group | Accounting | Marketing |
groupId | 1 | 106345 |
isOnline | false | true |
late | false | true |
left | false | 2023-03-16 10:58:00 |
name | Michael Scott | Andy Bernard |
notes.Background | Law and accounting | |
notes.Skype | Find.me | |
notes.Slack | MichielS | |
offlineTime | 0 | 0 |
onlineTime | 0 | 6027 |
productiveTime | 0 | 4213 |
productivity | 0 | 69.9 |
profileUrl | url.com | url.com |
work_ends | 00:00:00 | 18:00:00 |
work_starts | 23:59:59 | 09:00:00 |
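With the JSON flattened like this, the statistics you described become a plain stats search. As a sketch (the where clause assumes late extracts as the string "true", and the aggregations are just an example of the report you might want), you can append to the search above:

| where late="true"
| stats dc(employee_id) AS late_count values(name) AS late_employees BY date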
Here is an emulation that you can play with and compare with real data:
| makeresults
| eval _raw = "{ \"employees\": { \"2023-03-16\": { \"1\": { \"id\": 1, \"name\": \"Michael Scott\", \"email\": \"demo@desktime.com\", \"groupId\": 1, \"group\": \"Accounting\", \"profileUrl\": \"url.com\", \"isOnline\": false, \"arrived\": false, \"left\": false, \"late\": false, \"onlineTime\": 0, \"offlineTime\": 0, \"desktimeTime\": 0, \"atWorkTime\": 0, \"afterWorkTime\": 0, \"beforeWorkTime\": 0, \"productiveTime\": 0, \"productivity\": 0, \"efficiency\": 0, \"work_starts\": \"23:59:59\", \"work_ends\": \"00:00:00\", \"notes\": { \"Skype\": \"Find.me\", \"Slack\": \"MichielS\" }, \"activeProject\": [] }, \"2\": { \"id\": 2, \"name\": \"Andy Bernard\", \"email\": \"demo3@desktime.com\", \"groupId\": 106345, \"group\": \"Marketing\", \"profileUrl\": \"url.com\", \"isOnline\": true, \"arrived\": \"2023-03-16 09:17:00\", \"left\": \"2023-03-16 10:58:00\", \"late\": true, \"onlineTime\": 6027, \"offlineTime\": 0, \"desktimeTime\": 6027, \"atWorkTime\": 6060, \"afterWorkTime\": 0, \"beforeWorkTime\": 0, \"productiveTime\": 4213, \"productivity\": 69.9, \"efficiency\": 14.75, \"work_starts\": \"09:00:00\", \"work_ends\": \"18:00:00\", \"notes\": { \"Background\": \"Law and accounting\" }, \"activeProject\": { \"project_id\": 67973, \"project_title\": \"Blue Book\", \"task_id\": 42282, \"task_title\": \"Blue Book task\", \"duration\": 6027 } } } }, \"__request_time\": \"1678957028\" }"
``` data emulation above ```
Thanks for your response. Yes, it is indeed a bad design.
There is only one parent node, "2023-03-16". It changes every day ("2023-03-16", "...-17", "...-18", .....), but the structure under each date, i.e. the employee ids (1, 2, .....), remains the same. The problem arises when we create analytics/alerts for, say, the last 7 days: the top-level key changes.
Your response worked, but I am getting events from all time even though I have selected a time range of 24 hours.
It sounds like each event is a daily summary report.
Your response worked, but I am getting events from all time even though I have selected a time range of 24 hours.
Do you mean that when your time selector is set to the last 24 hours, Splunk still returns multiple daily summaries? If _time and the date key do not agree, and your intention is to search only those summaries that fall within your search window, you can filter by that key, e.g.:
| spath path=employees
| eval date = json_array_to_mv(json_keys(employees))
| mvexpand date ``` skip this if each employees record has only one top level key ```
| addinfo
| eval date_start = strptime(date, "%F")
| where info_min_time <= date_start AND relative_time(date_start, "+1d") < info_max_time
| eval day_employees = json_extract(employees, date)
| eval employee_id = json_array_to_mv(json_keys(day_employees))
| mvexpand employee_id
| eval day_employees = json_extract(day_employees, employee_id)
| spath input=day_employees
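Putting it together, a last-7-days late report could look like the following. This is a sketch built from the searches above; the final two lines are an assumption about the report you want (dc and values are standard stats functions):

| spath path=employees
| eval date = json_array_to_mv(json_keys(employees))
| mvexpand date ``` skip this if each employees record has only one top level key ```
| addinfo
| eval date_start = strptime(date, "%F")
| where info_min_time <= date_start AND relative_time(date_start, "+1d") < info_max_time
| eval day_employees = json_extract(employees, date)
| eval employee_id = json_array_to_mv(json_keys(day_employees))
| mvexpand employee_id
| eval day_employees = json_extract(day_employees, employee_id)
| spath input=day_employees
| where late="true"
| stats dc(employee_id) AS late_count values(name) AS late_employees BY date

Run it with a time picker of "Last 7 days" so info_min_time/info_max_time from addinfo match the window you want.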