Preprocessing package

The pre-processing module is responsible for analysing the raw energy data provided by the NILMTK API. It contains a pre-processing sub-module that defines the data transformations to apply to the input data (e.g., data normalisation). It handles only the input data; the output data is processed in the data loader, since some models require state generation.

Pre_processing module

deep_nilmtk.preprocessing.pre_processing.data_preprocessing(aggregate, targets=None, feature_type='mains', alpha=0.1, normalize=None, main_mu=329, main_std=450, q_filter={'q': 50, 'w': 10}, main_min=0, main_max=1500)

Default pre-processing function. It normalizes the input but leaves the normalization of the target output to the data loader, as some loaders also need to generate the states from the original data.

Parameters

aggregate (list of DataFrames) -- The aggregate power
targets (list of DataFrames, optional) -- The target power, defaults to None
feature_type (str, optional) -- the type of input features to derive from the aggregate power, defaults to 'mains'
alpha (float, optional) -- reflection rate, defaults to 0.1
normalize ([type], optional) -- normalization type, defaults to None
main_mu (int, optional) -- the mean of the aggregate power data, defaults to 329
main_std (int, optional) -- the std of the aggregate power data, defaults to 450
q_filter (dict, optional) -- quantile filters, defaults to {"q":50, "w":10}
main_min (int, optional) -- the min of the aggregate power data, defaults to 0
main_max (int, optional) -- the max of the aggregate power data, defaults to 1500

Returns

aggregate power, submetered data all in one DataFrame, submetered data as separate DataFrames

Return type

tuple
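
The normalization step can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper names `z_score` and `min_max` are hypothetical, and the accepted values of `normalize` are not listed above, so the two variants shown are assumptions that simply reuse the documented defaults (main_mu=329, main_std=450, main_min=0, main_max=1500).

```python
import numpy as np

# Hypothetical helpers; they only reuse the default statistics documented above.
def z_score(power, mu=329.0, std=450.0):
    """Standardize the aggregate power with a fixed mean and std."""
    return (np.asarray(power, dtype=float) - mu) / std

def min_max(power, lo=0.0, hi=1500.0):
    """Rescale the aggregate power to [0, 1] for values between lo and hi."""
    return (np.asarray(power, dtype=float) - lo) / (hi - lo)

mains = np.array([329.0, 779.0, 1500.0])  # made-up readings in watts
standardized = z_score(mains)
rescaled = min_max(mains)
```

Only the input is scaled here; as stated above, target normalization is left to the data loader.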

deep_nilmtk.preprocessing.pre_processing.get_differential_power(data)

The differences between consecutive elements of an array.

Parameters: data (np.array) -- the input data
Returns: The differences.
Return type: np.array
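
As an illustration, consecutive differences can be computed with `numpy.diff`. This is a sketch of the operation described above, not the library's code; whether the library pads the result to keep the original length is not stated here, so no padding is assumed.

```python
import numpy as np

power = np.array([100.0, 150.0, 120.0, 120.0])  # made-up power readings

# np.diff returns power[i + 1] - power[i], so the result is one element shorter.
delta = np.diff(power)
```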

deep_nilmtk.preprocessing.pre_processing.get_percentile(data, p=50)

Calculates the percentile p of the data

Parameters

data (np.array) -- The power data
p (int, optional) -- The percentile, defaults to 50

Returns

The quantile values of the power data

Return type

np.array
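
For reference, a percentile of a power series can be obtained with `numpy.percentile`. The sketch below is only an illustration of the statistic; the axis along which the library function operates is not documented here and is an assumption (a flat 1-D array is used).

```python
import numpy as np

power = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # made-up power readings

median = np.percentile(power, 50)  # p=50 is the median
p90 = np.percentile(power, 90)     # linearly interpolated between 40 and 50
```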

deep_nilmtk.preprocessing.pre_processing.get_temporal_info(data)

Generates the temporal information related to power consumption.

Parameters: data (list(DatetimeIndex)) -- a list of temporal information
Returns: Temporal contextual information of the energy data
Return type: np.array

deep_nilmtk.preprocessing.pre_processing.get_variant_power(data, alpha=0.1)

Generates the variant power, which reduces noise that may negatively affect pattern identification.

Parameters

data (np.array) -- power signal
alpha (float, optional) -- reflection rate, defaults to 0.1

Returns

The variant power generated

Return type

np.array

deep_nilmtk.preprocessing.pre_processing.over_lapping_sliding_window(data, seq_len=4, step_size=1)

Generates overlapping sequences using the sliding-window approach.

Parameters

data (np.array) -- Power data
seq_len (int, optional) -- The length of the sequences. Defaults to 4.
step_size (int, optional) -- The step size. Defaults to 1.

Returns

An array of the generated sequences.

Return type

np.array
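
The sliding-window idea can be sketched with plain NumPy indexing. The function name `sliding_windows` is hypothetical, and details such as how the library treats trailing samples that do not fill a complete window are assumptions (they are dropped here).

```python
import numpy as np

def sliding_windows(data, seq_len=4, step_size=1):
    """Return overlapping windows of length seq_len, one every step_size samples."""
    data = np.asarray(data)
    n_windows = (len(data) - seq_len) // step_size + 1
    starts = np.arange(n_windows) * step_size
    # Row i holds the indices starts[i], starts[i] + 1, ..., starts[i] + seq_len - 1.
    idx = starts[:, None] + np.arange(seq_len)[None, :]
    return data[idx]

windows = sliding_windows(np.arange(6), seq_len=4, step_size=1)
strided = sliding_windows(np.arange(6), seq_len=4, step_size=2)
```

With `step_size=1`, consecutive windows share `seq_len - 1` samples; a larger step reduces the overlap and the number of windows.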

deep_nilmtk.preprocessing.pre_processing.quantile_filter(data: numpy.array, sequence_length: int = 10, p: int = 50)

Applies a quantile filter to the input data.

Parameters

data (np.array) -- The input power data.
sequence_length (int, optional) -- The length of sequence, defaults to 10
p (int, optional) -- The percentile. Defaults to 50.

Returns

Array of values for the corresponding percentile

Return type

np.array
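
A quantile filter is commonly implemented as a rolling percentile. The sketch below shows that general technique under stated assumptions: the name `rolling_percentile` is hypothetical, and the edge padding used to keep the output the same length as the input may differ from what the library does.

```python
import numpy as np

def rolling_percentile(data, sequence_length=10, p=50):
    """Replace each sample by the p-th percentile of a window around it."""
    data = np.asarray(data, dtype=float)
    left = sequence_length // 2
    right = sequence_length - 1 - left
    # Assumption: repeat the edge values so every sample has a full window.
    padded = np.pad(data, (left, right), mode="edge")
    idx = np.arange(len(data))[:, None] + np.arange(sequence_length)[None, :]
    return np.percentile(padded[idx], p, axis=1)

noisy = np.array([10.0, 10.0, 500.0, 10.0, 10.0])  # a single spike
smoothed = rolling_percentile(noisy, sequence_length=3, p=50)
```

With `p=50` this is a median filter, so the isolated spike is removed while the baseline is unchanged.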

States module

deep_nilmtk.preprocessing.states.compute_status(appliances, thresholds=None, min_off=None, min_on=None, threshold_std=True, return_means=False, appliances_labels=[], threshold_method='at')

Calculates the operational status of appliances using the specified thresholding method

Parameters

appliances (np.array) -- Power consumption of target appliances
thresholds (np.array, optional) -- Threshold of each appliance, defaults to None
min_off (np.array, optional) -- Minimum off duration, defaults to None
min_on (np.array, optional) -- Minimum on duration, defaults to None
threshold_std (bool, optional) -- Whether to use the standard deviation to calculate the thresholds, defaults to True
return_means (bool, optional) -- Specifies whether the mean consumption of each status is required, defaults to False
appliances_labels (list, optional) -- Labels of the considered appliances, defaults to []
threshold_method (str, optional) -- The thresholding method to be used for status derivation, defaults to 'at'

Returns

Operational states with the thresholds used and the power consumption of each state

Return type

tuple

deep_nilmtk.preprocessing.states.get_status(ser, thresholds)

Derives the binary ON/OFF status of each meter from the given thresholds.

Parameters

ser (np.array) -- Target power consumption with shape = (num_series, series_len, num_meters)
thresholds (np.array) -- Thresholds of target power with shape = (num_meters,)

Returns

An array (num_series, series_len, num_meters) with binary values indicating ON (1) and OFF (0) states.

Return type

np.array
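
Thresholding arrays of these shapes can be sketched with NumPy broadcasting. This is an illustration, not the library's code; in particular, whether a reading exactly equal to the threshold counts as ON is an assumption (`>=` is used here).

```python
import numpy as np

# Shape (num_series=1, series_len=3, num_meters=2); values are made up.
ser = np.array([[[5.0, 0.0],
                 [60.0, 2500.0],
                 [10.0, 1900.0]]])
thresholds = np.array([50.0, 2000.0])  # shape (num_meters,)

# thresholds broadcasts along the last axis, one threshold per meter.
status = (ser >= thresholds).astype(int)
```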

deep_nilmtk.preprocessing.states.get_status_by_duration(ser, thresholds, min_off, min_on)

Calculates operational status of multiple meters using thresholds

Parameters

ser (np.array) -- Power consumption with shape = (num_series, series_len, num_meters), where num_series is the number of time series, series_len is the length of each time series, and num_meters is the number of meters contained in the array.
thresholds (np.array) -- Thresholds of power consumption shape = (num_meters,)
min_off (np.array) -- Minimum off duration with shape = (num_meters,)
min_on (np.array) -- Minimum on duration with shape = (num_meters,)

Returns

Operational status with binary values indicating ON (1) and OFF (0) states with shape (num_series, series_len, num_meters).

Return type

np.array
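
The role of the minimum durations can be sketched for a single 1-D series. This is a simplified illustration under stated assumptions, not the library's algorithm: the helper names are hypothetical, durations are counted in samples, and the order of the two steps (fill short OFF gaps first, then drop short ON runs) is an assumption.

```python
import numpy as np

def true_runs(mask):
    """Return (start, end) pairs of consecutive True runs, end exclusive."""
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges.reshape(-1, 2)

def status_by_duration_1d(power, threshold, min_off, min_on):
    on = np.asarray(power) >= threshold
    # Step 1: an OFF gap between two ON runs that is shorter than min_off
    # is treated as part of the same activation.
    for start, end in true_runs(~on):
        if start > 0 and end < len(on) and end - start < min_off:
            on[start:end] = True
    # Step 2: an ON run shorter than min_on is treated as noise.
    for start, end in true_runs(on):
        if end - start < min_on:
            on[start:end] = False
    return on.astype(int)

power = np.array([0, 100, 100, 0, 100, 100, 100, 0, 0, 0, 100, 0])
status = status_by_duration_1d(power, threshold=50, min_off=2, min_on=2)
```

In this example the one-sample gap at index 3 is bridged, and the isolated one-sample activation at index 10 is discarded.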

deep_nilmtk.preprocessing.states.get_status_means(ser, status)

Get means of ON/OFF status.

Parameters

ser (np.array) -- Power data
status (np.array) -- The operational status of the target power

Returns

Mean power consumption of each state

Return type

np.array
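
Per-state means can be sketched with boolean masks. The layout of the returned array (one row per meter, OFF mean then ON mean) is an assumption made for this example, as is the requirement that every meter has at least one ON and one OFF sample.

```python
import numpy as np

# Shape (series_len=4, num_meters=2); values are made up.
power = np.array([[0.0, 10.0],
                  [100.0, 2000.0],
                  [120.0, 30.0],
                  [4.0, 2200.0]])
status = np.array([[0, 0],
                   [1, 1],
                   [1, 0],
                   [0, 1]])

on = status.astype(bool)
# Sum the power in each state per meter and divide by the sample count.
mean_on = (power * on).sum(axis=0) / on.sum(axis=0)
mean_off = (power * ~on).sum(axis=0) / (~on).sum(axis=0)
means = np.stack([mean_off, mean_on], axis=1)  # assumed layout: [OFF, ON] per meter
```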

Threshold module

deep_nilmtk.preprocessing.threshold.get_threshold_params(appliances, threshold_method='at')

Given the method name and a list of appliances, this function returns the arguments needed to use the method in ukdale_data.load_ukdale_meter.

Parameters

appliances (list) -- List of appliances
threshold_method (str, optional) -- Thresholding method, defaults to 'at'

Raises

ValueError -- Wrong thresholding method
ValueError -- Missing parameters of an appliance

Returns

thresholds, min_off, min_on, threshold_std

Return type

tuple

deep_nilmtk.preprocessing.threshold.get_thresholds(ser, use_std=True, return_mean=False)

Returns the estimated thresholds that split the ON and OFF appliance states.

Parameters

ser (np.array) -- An array with shape = (num_series, series_len, num_meters), where num_series is the number of time series, series_len is the length of each time series, and num_meters is the number of meters contained in the array.
use_std (bool, optional) -- Consider the standard deviation of each cluster when computing the threshold. If False, the threshold is set at the midpoint between the cluster centroids. Defaults to True
return_mean (bool, optional) -- If True, return the means as the second returned value. Defaults to False

Returns

thresholds and mean consumption for each appliance

Return type

tuple

Note

The mean values are only returned when return_mean is True (default False)
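
The midpoint case (`use_std=False`) can be sketched for one meter with a small two-cluster k-means in plain NumPy. This is an illustration only: the helper `two_means_1d` is hypothetical, the clustering method actually used by the library is not stated here, and the standard-deviation weighting applied when `use_std=True` is not reproduced.

```python
import numpy as np

def two_means_1d(values, n_iter=20):
    """Tiny 1-D k-means with two clusters, initialised at the min and max."""
    values = np.asarray(values, dtype=float)
    centroids = np.array([values.min(), values.max()])
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid, then update the centroids.
        labels = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = values[labels == k].mean()
    return centroids, labels

power = np.array([0.0, 2.0, 4.0, 98.0, 100.0, 102.0])  # made-up readings
centroids, labels = two_means_1d(power)
threshold = centroids.mean()  # midpoint between the OFF and ON centroids
```

The two centroids also serve as the mean consumption of the OFF and ON states, which is the kind of value the function returns when `return_mean` is True.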