Preprocessing package¶
The preprocessing package is responsible for analysing the raw energy data provided by the NILMTK API. It contains a pre-processing sub-module that defines the data transformations applied to the input data (e.g., data normalisation). It focuses on the input data; the output data is handled in the data loader because some models require state generation.
Pre_processing module¶
- deep_nilmtk.preprocessing.pre_processing.data_preprocessing(aggregate, targets=None, feature_type='mains', alpha=0.1, normalize=None, main_mu=329, main_std=450, q_filter={'q': 50, 'w': 10}, main_min=0, main_max=1500)[source]¶
Default pre-processing function. It normalizes the input. However, it leaves the normalization of the target output to the data loader, as some loaders also need to generate the states from the original data.
- Parameters
aggregate (list of DataFrames) -- The aggregate power
targets (list of DataFrames, optional) -- The target power, defaults to None
feature_type (str, optional) -- the type of input features to derive from the aggregate power, defaults to 'mains'
alpha (float, optional) -- reflection rate, defaults to 0.1
normalize ([type], optional) -- normalization type, defaults to None
main_mu (int, optional) -- the mean of the aggregate power data, defaults to 329
main_std (int, optional) -- the std of the aggregate power data, defaults to 450
q_filter (dict, optional) -- quantile filters, defaults to {"q":50, "w":10}
main_min (int, optional) -- the min of the aggregate power data, defaults to 0
main_max (int, optional) -- the max of the aggregate power data, defaults to 1500
- Returns
aggregate power, submetered data all in one DataFrame, submetered data as separate DataFrames
- Return type
tuple
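The main_mu, main_std, main_min and main_max parameters suggest two common scalings of the aggregate signal. The following is a minimal sketch of both, not the library's implementation: the helper name normalize_mains and the option strings "z-norm" and "min-max" are assumptions made for illustration, while the numeric defaults are taken from the signature above.

```python
import numpy as np

def normalize_mains(power, method="z-norm", mu=329.0, std=450.0,
                    p_min=0.0, p_max=1500.0):
    """Hypothetical helper: scale aggregate power with fixed statistics."""
    power = np.asarray(power, dtype=float)
    if method == "z-norm":      # assumed option name: standard score
        return (power - mu) / std
    if method == "min-max":     # assumed option name: rescale to [0, 1]
        return (power - p_min) / (p_max - p_min)
    return power                # no normalization requested

mains = np.array([329.0, 779.0, 1500.0])
z = normalize_mains(mains, "z-norm")
mm = normalize_mains(mains, "min-max")
```

Using fixed statistics rather than per-batch ones keeps the scaling identical between training and inference.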
- deep_nilmtk.preprocessing.pre_processing.get_differential_power(data)[source]¶
The differences between consecutive elements of an array.
- Parameters
data (np.array) -- the input data
- Returns
The differences.
- Return type
np.array
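As an illustration of the idea, consecutive differences can be computed with np.diff. This is a sketch only: np.diff returns one element fewer than its input, and whether the library pads the result back to the original length is not stated here, so the padding step below is an assumption.

```python
import numpy as np

power = np.array([100.0, 100.0, 350.0, 340.0, 90.0])

# Difference between each sample and the previous one.
diff = np.diff(power)

# Assumed padding: prepend a zero so the feature aligns with the input.
padded = np.concatenate(([0.0], diff))
```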
- deep_nilmtk.preprocessing.pre_processing.get_percentile(data, p=50)[source]¶
Calculates the percentile p of the data
- Parameters
data (np.array) -- The power data
p (int, optional) -- The percentile to compute, defaults to 50
- Returns
The quantile values of the power data
- Return type
np.array
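A percentile of a power signal can be sketched with np.percentile. The axis along which the library computes it is not documented here, so this example works on a flat 1-D array.

```python
import numpy as np

power = np.array([10.0, 20.0, 30.0, 40.0, 1000.0])

# The 50th percentile is the median, which ignores the 1000 W outlier.
median = np.percentile(power, 50)
```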
- deep_nilmtk.preprocessing.pre_processing.get_temporal_info(data)[source]¶
Generates the temporal information related to power consumption
- Parameters
data (list(DatetimeIndex)) -- a list of temporal information
- Returns
Temporal contextual information of the energy data
- Return type
np.array
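The exact features returned are not listed above. One plausible set, shown here purely as an assumption, is a cyclic encoding of the hour of day plus a scaled day of the week, computed from a single pandas DatetimeIndex. The helper name temporal_features is hypothetical.

```python
import numpy as np
import pandas as pd

def temporal_features(index):
    """Hypothetical helper: encode time of day and day of week."""
    hour = (np.asarray(index.hour, dtype=float)
            + np.asarray(index.minute, dtype=float) / 60.0)
    dow = np.asarray(index.dayofweek, dtype=float)
    return np.stack([
        np.sin(2 * np.pi * hour / 24.0),  # cyclic hour, sine part
        np.cos(2 * np.pi * hour / 24.0),  # cyclic hour, cosine part
        dow / 6.0,                        # Monday = 0.0, Sunday = 1.0
    ], axis=1)

index = pd.DatetimeIndex([
    "2021-01-04 00:00", "2021-01-04 06:00",
    "2021-01-04 12:00", "2021-01-04 18:00",
])
features = temporal_features(index)
```

The sine/cosine pair keeps 23:00 and 00:00 close together, which a raw hour value would not.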
- deep_nilmtk.preprocessing.pre_processing.get_variant_power(data, alpha=0.1)[source]¶
Generates the variant power, which reduces noise that may negatively affect pattern identification
- Parameters
data (np.array) -- power signal
alpha (float, optional) -- reflection rate, defaults to 0.1
- Returns
The variant power generated
- Return type
np.array
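The recurrence behind the variant power is not given above. One reading consistent with a "reflection rate" alpha is exponential smoothing, where each output moves a fraction alpha of the way toward the new sample. The sketch below assumes that reading and a zero initial value. It is not taken from the library source.

```python
import numpy as np

def variant_power_sketch(data, alpha=0.1):
    """Assumed recurrence: out[i] = out[i-1] + alpha * (data[i] - out[i-1])."""
    data = np.asarray(data, dtype=float)
    out = np.zeros(len(data))
    for i in range(1, len(data)):
        out[i] = out[i - 1] + alpha * (data[i] - out[i - 1])
    return out

signal = np.array([0.0, 100.0, 100.0, 100.0])
smoothed = variant_power_sketch(signal, alpha=0.5)
# smoothed is [0.0, 50.0, 75.0, 87.5]
```

A smaller alpha smooths more aggressively but reacts more slowly to real appliance switch-on events.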
- deep_nilmtk.preprocessing.pre_processing.over_lapping_sliding_window(data, seq_len=4, step_size=1)[source]¶
Generates overlapping sequences using the sliding-window approach.
- Parameters
data (np.array) -- Power data
seq_len (int, optional) -- The length of the sequences. Defaults to 4.
step_size (int, optional) -- The step size. Defaults to 1.
- Returns
An array of the generated sequences.
- Return type
np.array
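To make the roles of seq_len and step_size concrete, here is a minimal NumPy sketch of overlapping windows. It drops any trailing samples that do not fill a complete window. Whether the library pads the input instead is not documented here, so treat that behaviour as an assumption.

```python
import numpy as np

def sliding_windows(data, seq_len=4, step_size=1):
    """Sketch: stack windows of length seq_len taken every step_size samples."""
    data = np.asarray(data)
    starts = np.arange(0, len(data) - seq_len + 1, step_size)
    idx = starts[:, None] + np.arange(seq_len)[None, :]
    return data[idx]

windows = sliding_windows(np.arange(6), seq_len=4, step_size=1)
# windows is [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
```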
- deep_nilmtk.preprocessing.pre_processing.quantile_filter(data: numpy.array, sequence_length: int = 10, p: int = 50)[source]¶
Applies quantile filter on the input data.
- Parameters
data (np.array) -- The input power data.
sequence_length (int, optional) -- The length of sequence, defaults to 10
p (int, optional) -- The percentile. Defaults to 50.
- Returns
Array of values for the corresponding percentile
- Return type
np.array
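A quantile filter can be read as a rolling percentile: each output sample is the p-th percentile of a window of sequence_length samples around it. The sketch below assumes edge padding so that the output keeps the input length. The library's actual padding and alignment may differ.

```python
import numpy as np

def quantile_filter_sketch(data, sequence_length=10, p=50):
    """Sketch: rolling percentile with edge padding (assumed behaviour)."""
    data = np.asarray(data, dtype=float)
    left = sequence_length // 2
    right = sequence_length - 1 - left
    padded = np.pad(data, (left, right), mode="edge")
    idx = np.arange(len(data))[:, None] + np.arange(sequence_length)[None, :]
    return np.percentile(padded[idx], p, axis=1)

power = np.array([10.0, 10.0, 500.0, 10.0, 10.0])
filtered = quantile_filter_sketch(power, sequence_length=3, p=50)
# The isolated 500 W spike is replaced by the local median of 10 W.
```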
States module¶
- deep_nilmtk.preprocessing.states.compute_status(appliances, thresholds=None, min_off=None, min_on=None, threshold_std=True, return_means=False, appliances_labels=[], threshold_method='at')[source]¶
Calculates the operational status of appliances using the specified thresholding method
- Parameters
appliances (np.array) -- Power consumption of target appliances
thresholds (np.array, optional) -- Threshold of each appliance, defaults to None
min_off (np.array, optional) -- Minimum off duration, defaults to None
min_on (np.array, optional) -- Minimum on duration, defaults to None
threshold_std (bool, optional) -- Whether to use the standard deviation to calculate the thresholds, defaults to True
return_means (bool, optional) -- Specifies whether the mean consumption of each status is required, defaults to False
appliances_labels (list, optional) -- Labels of the considered appliances, defaults to []
threshold_method (str, optional) -- The thresholding method to be used for status derivation, defaults to 'at'
- Returns
Operational states, along with the thresholds used and the power consumption of each state
- Return type
tuple
- deep_nilmtk.preprocessing.states.get_status(ser, thresholds)[source]¶
Derives the binary ON/OFF status of each meter from the given thresholds.
- Parameters
ser (np.array) -- Target power consumption with shape = (num_series, series_len, num_meters)
thresholds (np.array) -- Thresholds of target power with shape = (num_meters,)
- Returns
An array (num_series, series_len, num_meters) with binary values indicating ON (1) and OFF (0) states.
- Return type
np.array
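The shapes above work directly with NumPy broadcasting: a thresholds array of shape (num_meters,) is compared against the last axis of ser. In this sketch the comparison is inclusive (>=), which is an assumption. The library may use a strict comparison.

```python
import numpy as np

# One series, three time steps, two meters: shape (1, 3, 2).
ser = np.array([[[5.0, 200.0],
                 [60.0, 10.0],
                 [80.0, 300.0]]])
thresholds = np.array([50.0, 100.0])  # one threshold per meter

# Broadcasting applies each threshold to its own meter column.
status = (ser >= thresholds).astype(int)
```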
- deep_nilmtk.preprocessing.states.get_status_by_duration(ser, thresholds, min_off, min_on)[source]¶
Calculates operational status of multiple meters using thresholds
- Parameters
ser (np.array) -- Power consumption with shape = (num_series, series_len, num_meters), where num_series is the number of time series, series_len is the length of each time series, and num_meters is the number of meters contained in the array.
thresholds (np.array) -- Thresholds of power consumption shape = (num_meters,)
min_off (np.array) -- Minimum off duration with shape = (num_meters,)
min_on (np.array) -- Minimum on duration with shape = (num_meters,)
- Returns
Operational status with binary values indicating ON (1) and OFF (0) states with shape (num_series, series_len, num_meters).
- Return type
np.array
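To illustrate how minimum durations change a plain threshold status, the sketch below handles a single 1-D series. It first fills OFF gaps shorter than min_off, then removes ON runs shorter than min_on. The order of these two steps, the inclusive comparison and the helper names are assumptions, and the library operates on 3-D arrays rather than one series at a time.

```python
import numpy as np

def _runs(mask):
    """Return (starts, ends) of the True runs in a boolean mask; ends are exclusive."""
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges[0::2], edges[1::2]

def status_by_duration_1d(power, threshold, min_off, min_on):
    """Hypothetical helper: thresholding with minimum ON/OFF durations."""
    on = np.asarray(power, dtype=float) >= threshold
    # Step 1 (assumed order): merge ON runs separated by a short OFF gap.
    starts, ends = _runs(on)
    for gap_start, gap_end in zip(ends[:-1], starts[1:]):
        if gap_end - gap_start < min_off:
            on[gap_start:gap_end] = True
    # Step 2: discard ON runs that are still too short.
    starts, ends = _runs(on)
    for start, end in zip(starts, ends):
        if end - start < min_on:
            on[start:end] = False
    return on.astype(int)

power = [0, 100, 100, 0, 100, 100, 0, 0, 0, 100, 0, 0]
status_1d = status_by_duration_1d(power, threshold=50, min_off=2, min_on=2)
# status_1d is [0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
```

The one-sample dip at index 3 is absorbed into the surrounding activation, while the isolated one-sample spike at index 9 is dropped.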
Threshold module¶
- deep_nilmtk.preprocessing.threshold.get_threshold_params(appliances, threshold_method='at')[source]¶
Given the method name and a list of appliances, this function returns the arguments needed to use the method in ukdale_data.load_ukdale_meter
- Parameters
appliances (list) -- List of appliances
threshold_method (str, optional) -- Thresholding method, defaults to 'at'
- Raises
ValueError -- Wrong thresholding method
ValueError -- Missing parameters of an appliance
- Returns
thresholds, min_off, min_on, threshold_std
- Return type
tuple
- deep_nilmtk.preprocessing.threshold.get_thresholds(ser, use_std=True, return_mean=False)[source]¶
Returns the estimated thresholds that split the ON and OFF appliance states.
- Parameters
ser (np.array) -- An array with shape = (num_series, series_len, num_meters), where num_series is the number of time series, series_len is the length of each time series, and num_meters is the number of meters contained in the array.
use_std (bool, optional) -- Consider the standard deviation of each cluster when computing the threshold. If not, the threshold is set at the midpoint between the cluster centroids. Defaults to True
return_mean (bool, optional) -- If True, return the means as a second return value. Defaults to False
- Returns
thresholds and mean consumption for each appliance
- Return type
tuple
Note
The mean values are only returned when return_mean is True (default False)
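The use_std description implies a two-cluster view of each meter: an OFF cluster and an ON cluster, with the threshold placed between their centroids. The sketch below shows that idea for a single 1-D signal using a simple two-means loop. The clustering routine, the std-weighted formula and the helper name are assumptions, not the library's code, and the input is assumed to contain both OFF and ON samples.

```python
import numpy as np

def two_cluster_threshold(power, use_std=True, n_iter=20):
    """Hypothetical helper: threshold between OFF and ON cluster centroids."""
    power = np.asarray(power, dtype=float)
    lo, hi = power.min(), power.max()        # initial centroids
    for _ in range(n_iter):                  # plain 1-D two-means
        is_on = np.abs(power - hi) < np.abs(power - lo)
        lo, hi = power[~is_on].mean(), power[is_on].mean()
    std_off, std_on = power[~is_on].std(), power[is_on].std()
    total = std_off + std_on
    if use_std and total > 0:
        # Assumed weighting: shift the threshold toward the tighter cluster.
        threshold = lo + (std_off / total) * (hi - lo)
    else:
        threshold = (lo + hi) / 2.0          # midpoint between centroids
    return threshold, (lo, hi)

power = np.array([0.0, 2.0, 4.0, 80.0, 100.0, 120.0])
t_mid, means = two_cluster_threshold(power, use_std=False)
t_std, _ = two_cluster_threshold(power, use_std=True)
```

Here the OFF cluster is much tighter than the ON cluster, so the std-weighted threshold sits far closer to the OFF centroid than the midpoint does.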