.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_forecasting.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code or to run this example in your browser via JupyterLite.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_forecasting.py:


======================================
Forecasting with (Nonlinear) AR models
======================================

.. currentmodule:: fastcan.narx

In this example, we demonstrate how to use :func:`make_narx` to build (nonlinear)
AutoRegressive (AR) models for time-series forecasting.
The time series used is the monthly average atmospheric CO2 concentration
from 1958 to 2001.
The objective is to forecast the CO2 concentration up to the present day,
using only the first 18 months of data as initial conditions.

.. rubric:: Credit

* The majority of the code is adapted from the scikit-learn tutorial
  `Forecasting of CO2 level on Mona Loa dataset using Gaussian process regression (GPR) `_.

.. GENERATED FROM PYTHON SOURCE LINES 21-25

.. code-block:: Python


    # Authors: The fastcan developers
    # SPDX-License-Identifier: MIT

.. GENERATED FROM PYTHON SOURCE LINES 26-31

Prepare data
------------
We use a dataset consisting of the monthly average atmospheric CO2
concentrations (in parts per million by volume (ppm)) collected at the
Mauna Loa Observatory in Hawaii between 1958 and 2001.

.. GENERATED FROM PYTHON SOURCE LINES 31-38

.. code-block:: Python

    from sklearn.datasets import fetch_openml

    # Fetch the Mauna Loa CO2 dataset from OpenML as a pandas DataFrame
    co2 = fetch_openml(data_id=41187, as_frame=True)
    co2 = co2.frame
    co2.head()

.. raw:: html
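
    <table>
      <thead>
        <tr><th></th><th>year</th><th>month</th><th>day</th><th>weight</th><th>flag</th><th>station</th><th>co2</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>1958</td><td>3</td><td>29</td><td>4</td><td>0</td><td>MLO</td><td>316.1</td></tr>
        <tr><th>1</th><td>1958</td><td>4</td><td>5</td><td>6</td><td>0</td><td>MLO</td><td>317.3</td></tr>
        <tr><th>2</th><td>1958</td><td>4</td><td>12</td><td>4</td><td>0</td><td>MLO</td><td>317.6</td></tr>
        <tr><th>3</th><td>1958</td><td>4</td><td>19</td><td>6</td><td>0</td><td>MLO</td><td>317.5</td></tr>
        <tr><th>4</th><td>1958</td><td>4</td><td>26</td><td>2</td><td>0</td><td>MLO</td><td>316.4</td></tr>
      </tbody>
    </table>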


.. GENERATED FROM PYTHON SOURCE LINES 39-42

First, we process the original dataframe to create a date column and select it
along with the CO2 column. Here, the date column is used only for plotting, as
AR models require no inputs (including time or date).

.. GENERATED FROM PYTHON SOURCE LINES 42-51

.. code-block:: Python

    import pandas as pd

    # Assemble a datetime column from the year/month/day columns
    co2_data = co2[["year", "month", "day", "co2"]].assign(
        date=lambda x: pd.to_datetime(x[["year", "month", "day"]])
    )[["date", "co2"]]
    co2_data.head()

.. raw:: html
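
    <table>
      <thead>
        <tr><th></th><th>date</th><th>co2</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>1958-03-29</td><td>316.1</td></tr>
        <tr><th>1</th><td>1958-04-05</td><td>317.3</td></tr>
        <tr><th>2</th><td>1958-04-12</td><td>317.6</td></tr>
        <tr><th>3</th><td>1958-04-19</td><td>317.5</td></tr>
        <tr><th>4</th><td>1958-04-26</td><td>316.4</td></tr>
      </tbody>
    </table>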
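
To make the point that an AR model needs no inputs more concrete, the following
is a minimal sketch of a linear AR(2) recursion in plain NumPy. It is only an
illustration and not how :class:`NARX` is implemented: the helper name
``ar_forecast`` and the coefficients are hypothetical and chosen just to give a
stable recursion. Each new sample is computed from previous samples only, so
neither time nor date enters the model.

.. code-block:: Python

    import numpy as np


    def ar_forecast(y_init, coef, intercept, n_steps):
        """Iterate y[k] = intercept + sum_i coef[i] * y[k-1-i].

        Hypothetical helper for illustration; not part of fastcan.
        Assumes n_steps >= len(coef) and len(y_init) >= len(coef).
        """
        order = len(coef)
        y = np.empty(n_steps)
        y[:order] = y_init[:order]  # the first `order` samples are given
        for k in range(order, n_steps):
            # Only past outputs are used, newest first: y[k-1], y[k-2], ...
            past = y[k - order : k][::-1]
            y[k] = intercept + np.dot(coef, past)
        return y


    # Assumed coefficients, not fitted to any data
    y_demo = ar_forecast(y_init=[1.0, 0.5], coef=[0.5, 0.25], intercept=0.0, n_steps=5)
    print(y_demo)

Here the two initial samples play the same role as ``y_init`` later in this
example: they start the recursion, and every later value is generated by the
model itself.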


.. GENERATED FROM PYTHON SOURCE LINES 52-54

The CO2 concentrations span March 1958 to December 2001, as shown in the plot
below.

.. GENERATED FROM PYTHON SOURCE LINES 54-62

.. code-block:: Python

    import matplotlib.pyplot as plt

    plt.plot(co2_data["date"], co2_data["co2"])
    plt.xlabel("date")
    plt.ylabel("CO$_2$ concentration (ppm)")
    _ = plt.title("Raw air samples measurements from the Mauna Loa Observatory")



.. image-sg:: /auto_examples/images/sphx_glr_plot_forecasting_001.png
   :alt: Raw air samples measurements from the Mauna Loa Observatory
   :srcset: /auto_examples/images/sphx_glr_plot_forecasting_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 63-69

We preprocess the dataset by taking a monthly average to smooth the data.
Months in which no measurements were collected should not be dropped, because
AR models require a constant time interval between neighboring measurements.
As a result, the NaN values are kept as placeholders to maintain the time
intervals, and :class:`NARX` can handle the missing values properly.

.. GENERATED FROM PYTHON SOURCE LINES 69-78

.. code-block:: Python

    # Month-end resampling keeps empty months as NaN rows
    co2_data = co2_data.set_index("date").resample("ME")["co2"].mean().reset_index()
    plt.plot(co2_data["date"], co2_data["co2"])
    plt.xlabel("date")
    plt.ylabel("Monthly average of CO$_2$ concentration (ppm)")
    _ = plt.title(
        "Monthly average of air samples measurements\nfrom the Mauna Loa Observatory"
    )



.. image-sg:: /auto_examples/images/sphx_glr_plot_forecasting_002.png
   :alt: Monthly average of air samples measurements from the Mauna Loa Observatory
   :srcset: /auto_examples/images/sphx_glr_plot_forecasting_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 79-82

For plotting, the time axis for training spans March 1958 to December 2001 and
is converted into numeric values, e.g., March 1958 is converted to 1958.25.
The time axis for testing spans March 1958 to the present.

.. GENERATED FROM PYTHON SOURCE LINES 82-93

.. code-block:: Python

    import datetime

    import numpy as np

    today = datetime.datetime.now(tz=datetime.UTC)
    current_month = today.year + today.month / 12

    # Numeric time axes in fractional years (year + month / 12)
    x_train = (co2_data["date"].dt.year + co2_data["date"].dt.month / 12).to_numpy()
    x_test = np.arange(start=1958.25, stop=current_month, step=1 / 12)

.. GENERATED FROM PYTHON SOURCE LINES 94-100

Nonlinear AR model
------------------
We can use :func:`make_narx` to easily build a nonlinear AR model, which does
not have an input. Therefore, the input ``X`` is set to ``None``.
:func:`make_narx` will search for 10 polynomial terms, whose maximum degree is 2
and maximum delay is 9.

.. GENERATED FROM PYTHON SOURCE LINES 100-115

.. code-block:: Python

    from fastcan.narx import make_narx, print_narx

    max_delay = 9
    model = make_narx(
        None,
        co2_data["co2"],
        n_terms_to_select=10,
        max_delay=max_delay,
        poly_degree=2,
        verbose=0,
    )
    model.fit(None, co2_data["co2"], coef_init="one_step_ahead")
    print_narx(model, term_space=27)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    | yid |           Term            |   Coef   |
    |-----|---------------------------|----------|
    |  0  |         Intercept         | -14.881  |
    |  0  |       y_hat[k-1,0]        |  1.291   |
    |  0  |       y_hat[k-2,0]        |  -0.203  |
    |  0  |       y_hat[k-3,0]        |  -0.438  |
    |  0  |       y_hat[k-5,0]        |  0.327   |
    |  0  |       y_hat[k-8,0]        |  -0.578  |
    |  0  |       y_hat[k-9,0]        |  0.688   |
    |  0  | y_hat[k-2,0]*y_hat[k-6,0] |  -0.000  |
    |  0  | y_hat[k-6,0]*y_hat[k-6,0] |  -0.046  |
    |  0  | y_hat[k-6,0]*y_hat[k-7,0] |  0.091   |
    |  0  | y_hat[k-7,0]*y_hat[k-7,0] |  -0.046  |


.. GENERATED FROM PYTHON SOURCE LINES 116-129

Forecasting performance
-----------------------
As the AR model does not require input data, the input ``X`` in :func:`predict`
is used to indicate the total number of steps to forecast. The initial
conditions ``y_init`` are the first 18 months of data from the ground truth,
which contain missing values.
If ``y_init`` contains no missing values, only ``max_delay`` samples are needed
as the initial conditions.
However, if missing values are given to ``y_init``, :class:`NARX` will break
the data into multiple time series according to the missing values. For each
time series, at least ``max_delay`` samples without missing values are
required for proper forecasting.
The results show that our fitted model is capable of forecasting future years
with only the first 18 months of data.

.. GENERATED FROM PYTHON SOURCE LINES 129-143

.. code-block:: Python

    # X is the number of steps to forecast; y_init seeds the recursion
    y_pred = model.predict(
        len(x_test),
        y_init=co2_data["co2"][:18],
    )
    plt.plot(x_test, y_pred, label="Predicted", c="tab:orange")
    plt.plot(x_train, co2_data["co2"], label="Actual", linestyle="dashed", c="tab:blue")
    plt.legend()
    plt.xlabel("Year")
    plt.ylabel("Monthly average of CO$_2$ concentration (ppm)")
    _ = plt.title(
        "Monthly average of air samples measurements\nfrom the Mauna Loa Observatory"
    )
    plt.show()



.. image-sg:: /auto_examples/images/sphx_glr_plot_forecasting_003.png
   :alt: Monthly average of air samples measurements from the Mauna Loa Observatory
   :srcset: /auto_examples/images/sphx_glr_plot_forecasting_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 5.718 seconds)


.. _sphx_glr_download_auto_examples_plot_forecasting.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../lite/lab/index.html?path=auto_examples/plot_forecasting.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_forecasting.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_forecasting.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_forecasting.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_