Hello,
I need your help with a field extraction.
I have this type of data, and I'd like to extract the following fields with a rex command:
The syntax is as follows:
("data": ["from" :
"2024-04-25T11: 30Z",
"to": "2024-04-2512:00Z",
"intensity": ("forecast": 152,
"actual": null, "index": "moderate"}), ("from": "2024-04-25T12:002",
"intensity": {"forecast": 152, "actual": null, "index": "moderate"}), ("from": "2024-04-25T12:30Z",
"to":
{"from": "2024-04-25T13:00Z", "to":
"2024-04-25T12: 30Z",
("forecast": 164,
"actual": null,
"2024-04-2513: 30Z", "intensity": ("forecast": 154,
"to": "2024-04-25T13: 002",
"intensity": ("forecast": 154,
"actual": null,
"index":
"actual": null, "index": "moderate"}), ("from": "2024-04-25T13:30Z*, "to": "2024-04-25T14:002",
"moderate"}},
"intensity":
04-25T14: 30Z",
"to" :
"index": "moderate"3}, ("from": "2024-04-25T14:002*, "to": "2024-04-25T14:30Z", "intensity": ("forecast": 166, "actual": null, "index": "moderate"}), ("from":
" 2024-04-25T15:00Z"
"actual": nu11,
"index"
"intensity": {"forecast": 170, "actual": null, "index": "moderate"}), {"from": "2024-04-2515: 00Z",
2024-
"to" :
"moderate"}), {"from": "2024-04-25T15:30Z", "to": "2024-04-25T16:00Z", "intensity": ("forecast": 175,
"to": "2024-04-25T15: 30Z",
"intensity": {"forecast": 172,
"2024-04-25T16: 30Z",
"index": "moderate"}}, ("from": "2024-04-25T17:00Z", "to":
"intensity": ("forecast": 177, "actual": null, "index": "moderate"?), ("from": "2024-04-2516: 302",
"actual" : nu11,
"index"
"moderate"}}, ("from": "2024-04-2516: 00Z",
"to": "2024-04-25T17:002",
"intensity": ("forecast": 179,
"actual": null,
2024-04-2517: 30Z", "intensity": ("forecast": 181, "actual": null,
25T18:00Z", "intensity": ("forecast": 184,
"index": "moderate"}}, {"from": "2024-04-25T17: 30Z",
"actual": null, "index": "moderate"}), ("from": "2024-04-25T18:002", "to": "2024-04-25T18: 30Z",
"to": "2024-04-
"moderate"}}, ("from": "2024-04-2518: 30Z", "to": "2024-04-25T19:002",
"intensity": ("forecast": 187, "actual": null,
"intensity": ("forecast": 190,
"actua1": nul1,
"index":
"index":
"high"}}, ("from":
"intensity": {"forecast": 193,
"actual": null, "index":
"2024-04-25T19: 00Z", "to":
"2024-04-25T19: 30Z"
"high"}}, ("from": "2024-04-25T19:30Z", "to": "2024-04-25T20:00Z", "intensity":
{"forecast": 194,
"2024-04-2520: 00Z", "to": "2024-04-25T20:30Z", "intensity": {"forecast": 195, "actual": null, "index": "high"3}, ("from": "2024-04-25T20:30Z",
"actual": null, "index": "high")}, ("from":
"2024-04-25T21:00Z", "intensity": ("forecast":
198, "actual": null, "index": "high"'), ("from": "2024-04-25T21: 002",
"2024-04-25T22: 00Z", "intensity": {"forecast": 187, "actual": null,
"to": "2024-04-25T21: 30Z", "intensity": {"forecast": 196,
'actual": null,
"index": "high"}}, {"from": "2024-04-25T21:302"
"to"
"index": "moderate"}}, ("from": "2024-04-25T22:00Z", "to":
"2024-04-25T22: 30Z",
"intensity": ("forecast": 181, "actual": null,
"index": "moderate"}}, {"from": "2024-04-25T22:30Z", "to": "2024-04-25T23:002", "intensity": ("forecast": 180, "actual": null,
"index'
moderate"}},{"from":
25T23:30Z", "intensity": {"forecast": 172, "actual": null, "index": "moderate"}}, {"from": "2024-04-25T23: 30Z",
"2024-04-25T23:002",
"to":
" 2024-04-
"moderate"}}, {"from": "2024-04-26T00:00Z", "to": "2024-04-2600: 30Z", "intensity": ("forecast": 150,
"to": "2024-04-2600: 00Z",
"intensity": ("forecast": 150,
"actual": null,
"index":
"actual": null, "index": "moderate")}, ("from": "2024-04-26T00: 302",
"to": "2024-04-26T01:00Z"
"intensity": {"forecast": 149,
"actual": null,
"index": "moderate"}}, ("from": "2024-04-26T01:002",
"to": "2024-04-26T01:30Z", "intensity": {"forecast": 149,
"actual": null,
"index":
"moderate"}}, ...
Thank you very much
Two words: Don't. The data you show is clearly a fragment of a JSON object. Do not treat structured data such as JSON as plain text: the developer can change the formatting (whitespace, line breaks, key order) at any time without changing the JSON itself, and that renders your rex useless. Splunk has robust, QA-tested commands such as spath. Follow @ITWhisperer's advice and share valid, raw JSON data (anonymize as needed). If your raw data is a mix of free text and JSON, show examples of how they are mixed so we can extract the valid JSON and then handle it with spath or fromjson (9.0+).
Specific questions:
Lastly, a common logging practice is to append JSON data at the end of the event, after other informational text that does not contain an opening curly bracket. If this is the case, you can easily extract the JSON part with the following and handle it robustly with spath:
| rex "^[^{]*(?<json_data>.+)"
| spath input=json_data path=data{}
| mvexpand data{}
| spath input=data{}
After this, your highlighted values would be in fields from, to, and intensity.forecast, respectively.
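To try this without indexed data, here is a run-anywhere sketch of the same pipeline. The event text is a hand-written sample loosely modelled on your data, not your real events, and the text before the first curly bracket is a made-up log prefix; the final table command is only there to display the result.

```
| makeresults
| eval _raw = "2024-04-25 11:30:00 INFO made-up prefix text {\"data\": [{\"from\": \"2024-04-25T11:30Z\", \"to\": \"2024-04-25T12:00Z\", \"intensity\": {\"forecast\": 152, \"actual\": null, \"index\": \"moderate\"}}, {\"from\": \"2024-04-25T12:00Z\", \"to\": \"2024-04-25T12:30Z\", \"intensity\": {\"forecast\": 164, \"actual\": null, \"index\": \"moderate\"}}]}"
| rex "^[^{]*(?<json_data>.+)"
| spath input=json_data path=data{}
| mvexpand data{}
| spath input=data{}
| table from to intensity.forecast intensity.index
```

Because mvexpand turns each array element into its own result, the from, to and intensity.forecast values on a given row belong to the same entry.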
Please can you repost your sample data in the correct format? What you posted does not match the structure shown in your screen grab and is not valid JSON. Also, please paste it into a code block </> to preserve the formatting.
Hi @anissabnk,
this looks like JSON, so you could use INDEXED_EXTRACTIONS=json or the spath command: https://docs.splunk.com/Documentation/Splunk/9.2.1/SearchReference/Spath
If you want to use regexes anyway, you need one per field, like the following:
| rex "from\"\s*:\s*\"(?<from>[^\"]+)\""
that you can test at https://regex101.com/r/6NQsEb/1
| rex "to\"\s*:\s*\"(?<to>[^\"]+)\""
that you can test at https://regex101.com/r/6NQsEb/2
| rex "intensity\"\s*:\s*\(\"\w+\"\s*:\s*(?<intensity>\d+)"
that you can test at https://regex101.com/r/6NQsEb/3
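One thing to keep in mind: rex stops after the first match by default, so when one event holds many entries each of the searches above returns a single value. A minimal sketch with max_match=0, assuming the whole array arrives as one event and that the raw data uses curly brackets as valid JSON does (so the third regex here has \{ where the one above has \(). The sample event is hand-written for illustration.

```
| makeresults
| eval _raw = "{\"data\": [{\"from\": \"2024-04-25T11:30Z\", \"to\": \"2024-04-25T12:00Z\", \"intensity\": {\"forecast\": 152, \"actual\": null, \"index\": \"moderate\"}}, {\"from\": \"2024-04-25T12:00Z\", \"to\": \"2024-04-25T12:30Z\", \"intensity\": {\"forecast\": 164, \"actual\": null, \"index\": \"moderate\"}}]}"
| rex max_match=0 "from\"\s*:\s*\"(?<from>[^\"]+)\""
| rex max_match=0 "to\"\s*:\s*\"(?<to>[^\"]+)\""
| rex max_match=0 "intensity\"\s*:\s*\{\"\w+\"\s*:\s*(?<intensity>\d+)"
| table from to intensity
```

Each field then becomes multivalued. The values line up only by position, and only as long as every entry contains all three keys in the same order, which is one more reason to prefer spath when the data is valid JSON.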
Ciao.
Giuseppe