ARIMA convert integer machine learning

nsantiago17
Explorer

I'm trying to run the query below:

(index=A sourcetype=jobs_info JOB_NAME IN (ACQUA)) OR (index=B sourcetype=FIRE) OR (index=C sourcetype=EARTH)

| eval _time = strftime(_time, "%Y-%m-%d")
| eval START_TIME = strptime(START_TIME,"%Y%m%d%H%M%S")
| eval END_TIME = strptime(END_TIME,"%Y%m%d%H%M%S")
| eval EXECUTION_TIME = END_TIME-START_TIME

| eventstats avg(EXECUTION_TIME) as avg stdev(EXECUTION_TIME) as stdev

| eval lowerBound=(avg-stdev*exact(1.5)), upperBound=(avg+stdev*exact(1.5))
| eval isOutlier=if(EXECUTION_TIME < lowerBound OR EXECUTION_TIME > upperBound, 1, 0)

| stats values(EXECUTION_TIME) as EXECUTION_TIME sum(TNeg) as neg by _time
| where isnotnull(EXECUTION_TIME)
| table _time neg EXECUTION_TIME
| sort - _time

| fit RandomForestRegressor EXECUTION_TIME from "_time" "neg" n_estimators=15 into "teste"
| apply "teste"
| eval predicted(EXECUTION_TIME) = round('predicted(EXECUTION_TIME)', 2)

| stats values(neg) as neg, values(EXECUTION_TIME) as REALEXEC, values(predicted(EXECUTION_TIME)) as EXEC by _time
| eval erro = round(((EXEC/REALEXEC)-1)*100, 2)
| eval _time = tonumber(_time)
| table _time neg REALEXEC EXEC
| sort _time
| fit ARIMA _time EXEC holdback=3 conf_interval=95 order=12-0-1 forecast_k=5 as prediction | forecastviz(5, 3, "EXEC", 95)

I'm getting this error: Error in 'fit' command: Error while fitting "ARIMA" model: cannot convert float NaN to integer.
How can I fix it, and is there an easier way to run my code?


quincybatten
New Member

The ValueError "cannot convert float NaN to integer" is raised because a standard (NumPy-backed) integer column in pandas cannot store NaN values. Since v0.24, pandas provides nullable integer data types, which allow integers to coexist with missing values. These are pandas extension dtypes (such as "Int32", with a capital I), not the NumPy integer dtypes.

df['column_name'].astype(float).astype("Int32")
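
To make the point above concrete, here is a minimal sketch of the error and of the nullable dtype. The series values are made up for illustration and do not come from the query in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: whole-number values with one missing entry
s = pd.Series([1.0, 2.0, np.nan])

# Converting NaN to a plain integer fails with the message from the question
try:
    int(float("nan"))
    message = ""
except ValueError as exc:
    message = str(exc)

# A NumPy-backed integer column cannot hold the missing value either
try:
    s.astype("int64")
    numpy_cast_failed = False
except ValueError:
    numpy_cast_failed = True

# The pandas nullable dtype keeps the missing entry as <NA>
nullable = s.astype("Int32")
print(message)
print(nullable)
```

Note that this only changes how the missing value is stored. If the model itself cannot handle missing values, the rows with NaN would still have to be dropped or filled first.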

 


hkeswani_splunk
Splunk Employee

Either _time or EXEC could be in float format, in which case it would need to be changed to an integer type.
Could you show the table for _time and EXEC just before the fit ARIMA command?
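
As a rough illustration of how a NaN could end up in a time column, the Python sketch below mimics converting a date label such as "2021-03-15" to a number. This is an assumption about one possible cause, not a confirmed diagnosis of the query above; `to_number` and the sample date are hypothetical stand-ins, not Splunk functions.

```python
import math
from datetime import datetime, timezone


def to_number(text):
    """Hypothetical numeric conversion: returns NaN when text is not a number."""
    try:
        return float(text)
    except ValueError:
        return math.nan


# A date formatted as text is not a number, so the conversion yields NaN
date_label = "2021-03-15"
as_number = to_number(date_label)

# Parsing the same label as a date gives an integer epoch value instead
parsed = datetime.strptime(date_label, "%Y-%m-%d").replace(tzinfo=timezone.utc)
epoch = int(parsed.timestamp())

print(as_number)
print(epoch)
```

If the table before the fit command shows empty or non-numeric values in _time or EXEC, keeping the time field numeric (for example as epoch seconds) is one thing to try.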
