standard_average_of_similar_days
Purpose
The rule estimates data points using historical data from the same days of the week.
Flow chart

Detailed description
- The rule is triggered only for interval channels, i.e. channels whose channel_classifier_id contains INT15 or similar, e.g. ACTIVE_DELIVERY_INT 15T_VEE, ACTIVE_REDELIVERY_INT 60T_VEE.
- The estimation is performed only for data points having annotations defined in the parameter validation_rules_to_estimate.
- The estimation is performed only for data points where the number of consecutive annotated data points is between the parameters min_gap_size and max_gap_size. These parameters are optional; if they are not defined, all annotated data points are selected for estimation.
- Historic data is loaded using the parameter max_number_of_days_to_load, which defines the number of days to load. Day counting starts from the start time of the ingested data. Historic data is loaded 7 days at a time until there is no more data to load or max_number_of_days_to_load is reached.
- Historic data with annotations defined in the parameter non_allowed_annotations is removed from the calculations.
- The holiday calendar is used to determine any holidays within the historical data or the data to be estimated. Holiday_Calendar should be defined in the datasource tag. The tag name and property for the holiday calendar are defined in standard_constants. If the holiday calendar is not defined, holidays are not treated differently during estimation.
- Any holidays found are treated as Sundays, both in the historical data and in the data to be estimated.
- If there is not enough historical data, estimation of the data point is skipped. The minimum number of historical values to be used for averaging is defined in the parameter min_no_values.
- The arithmetic average is calculated over all historic data points with the same day of the week and the same time of day. Daylight saving time is not taken into account.
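The selection and averaging steps above can be sketched in Python as follows. This is a minimal illustration, not the rule's actual implementation: the function names select_gaps, day_key and estimate are hypothetical, the gap bounds are assumed to be inclusive, and loading the history and removing points with non_allowed_annotations are assumed to have happened before these functions are called.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import fmean

SUNDAY = 6  # value returned by datetime.weekday() for Sunday


def select_gaps(flagged, min_gap_size=None, max_gap_size=None):
    """Return indices of flagged points whose run length is within the bounds.

    flagged: list of bools, True where a data point carries an annotation
    listed in validation_rules_to_estimate. Bounds are assumed inclusive;
    a bound left as None is not applied.
    """
    selected, i, n = [], 0, len(flagged)
    while i < n:
        if not flagged[i]:
            i += 1
            continue
        j = i
        while j < n and flagged[j]:
            j += 1  # extend the run of consecutive flagged points
        size = j - i
        above_min = min_gap_size is None or size >= min_gap_size
        below_max = max_gap_size is None or size <= max_gap_size
        if above_min and below_max:
            selected.extend(range(i, j))
        i = j
    return selected


def day_key(ts, holidays):
    """Return (day of week, time of day); holidays are treated as Sundays."""
    dow = SUNDAY if ts.date() in holidays else ts.weekday()
    return dow, ts.time()


def estimate(targets, history, holidays=frozenset(), min_no_values=1):
    """Average historic values sharing the target's day of week and time.

    targets: timestamps to estimate (naive datetimes, so no DST handling).
    history: iterable of (timestamp, value) pairs, already filtered.
    Returns {timestamp: estimate}; targets with fewer than min_no_values
    matching historic values are skipped.
    """
    buckets = defaultdict(list)
    for ts, value in history:
        buckets[day_key(ts, holidays)].append(value)

    estimates = {}
    for ts in targets:
        values = buckets.get(day_key(ts, holidays), [])
        if len(values) >= min_no_values:
            estimates[ts] = fmean(values)
    return estimates
```

With two historic Mondays at 08:00 holding the values 10.0 and 14.0, a missing Monday 08:00 point would be estimated as 12.0. If that Monday is listed in the holidays set, the same point is instead averaged from historic Sunday 08:00 values.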
Common use cases
- The rule is suited to cases where no more than 2 months of actual data is missing and the data has a weekly pattern. Otherwise the estimated values are influenced by potential yearly seasonality: the calculated average includes data points that carry a yearly seasonal effect, which lowers the accuracy of the estimated value.
- If the datasource has no yearly seasonal pattern, the rule can also be used to estimate data missing for a longer period, up to almost a year.