Algorithm: split_data_with_data_with_balanced_test_set

Input:
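eeg_data: EEG data, shape (samples, features)
labels: Label for each sample
timestamps: Timestamp for each sample
n_per_class: Number of samples per class for the test set
time_window_test_sample: Minimum time separation between test samples of the same class
time_window_train_sample: Minimum time separation between training and test samples
log_file: File object to log excluded timestamps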
Output:
$X_{\text{train}}$: Time-separated training data
$X_{\text{test}}$: Balanced test data
$y_{\text{train}}$: Training labels
$y_{\text{test}}$: Test labels
Training timestamps
Test timestamps

Initialize:
$\text{test\_data} \leftarrow [\,], \quad \text{test\_labels} \leftarrow [\,], \quad \text{test\_timestamps} \leftarrow [\,]$
For each unique class label $l$ in labels:
a. Filter class-specific data:
$\text{class\_data} \leftarrow \text{eeg\_data}[\text{labels} == l], \quad \text{class\_labels} \leftarrow \text{labels}[\text{labels} == l], \quad \text{class\_timestamps} \leftarrow \text{timestamps}[\text{labels} == l]$
b. Reverse timestamps:
$\text{reversed\_timestamps} \leftarrow \text{class\_timestamps[::-1]}$
c. Select test timestamps:
$\text{selected\_timestamps} \leftarrow [\,]$
For each timestamp $t$ in $\text{reversed\_timestamps}$:
If $\forall s \in \text{selected\_timestamps},\ |t - s| > \text{time\_window\_test\_sample}$: $\text{selected\_timestamps}.\text{append}(t)$
Stop if $|\text{selected\_timestamps}| = \text{n\_per\_class}$.
d. Sort $\text{selected\_timestamps}$.
e. Find indices of selected test samples:
$\text{test\_indices} \leftarrow [\, i \text{ for } i, t \text{ in enumerate(class\_timestamps) if } t \in \text{selected\_timestamps} \,]$
f. Append test data:
$\text{test\_data}.\text{append}(\text{class\_data[test\_indices]}), \quad \text{test\_labels}.\text{append}(\text{class\_labels[test\_indices]}), \quad \text{test\_timestamps}.\text{append}(\text{class\_timestamps[test\_indices]})$
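Steps b through e can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the original implementation: the helper names `select_test_timestamps` and `select_test_indices` are hypothetical, timestamps are assumed to be numeric and already in chronological order, and the example values are made up.

```python
import numpy as np

def select_test_timestamps(class_timestamps, n_per_class, time_window_test_sample):
    """Greedy pick, newest first: keep a timestamp only if it is more than
    time_window_test_sample away from every timestamp already kept."""
    selected = []
    for t in class_timestamps[::-1]:  # step b: iterate in reversed order
        if all(abs(t - s) > time_window_test_sample for s in selected):
            selected.append(t)
        if len(selected) == n_per_class:  # stop once the class quota is met
            break
    return np.sort(np.asarray(selected))  # step d

def select_test_indices(class_timestamps, selected_timestamps):
    """Step e: positions in class_timestamps whose value was selected."""
    return np.flatnonzero(np.isin(class_timestamps, selected_timestamps))

# Made-up timestamps for a single class
class_timestamps = np.array([0.0, 1.0, 2.0, 5.0, 9.0, 10.0])
selected = select_test_timestamps(class_timestamps, n_per_class=3,
                                  time_window_test_sample=2.0)
test_indices = select_test_indices(class_timestamps, selected)
print(selected)      # [ 2.  5. 10.]
print(test_indices)  # [2 3 5]
```

Two consequences of the steps as written are worth keeping in mind. If a class has too few well-separated timestamps, fewer than n_per_class samples are selected and the test set is no longer balanced. Because step e matches by timestamp value, duplicate timestamps within a class would all be picked up.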
Concatenate and Sort Test Data:
$X_{\text{test}} \leftarrow \text{concat}(\text{test\_data}), \quad y_{\text{test}} \leftarrow \text{concat}(\text{test\_labels}), \quad \text{test\_timestamps} \leftarrow \text{concat}(\text{test\_timestamps})$
Sort $X_{\text{test}}, y_{\text{test}}, \text{test\_timestamps}$ by timestamps.
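One way to keep the three test arrays aligned while sorting is to compute a single permutation from the timestamps and apply it to all of them. The sketch below assumes NumPy arrays; the function name `concat_and_sort` and the sample values are hypothetical.

```python
import numpy as np

def concat_and_sort(test_data, test_labels, test_timestamps):
    """Concatenate the per-class pieces, then reorder all three arrays
    with the same permutation so rows stay matched to labels and times."""
    X_test = np.concatenate(test_data)
    y_test = np.concatenate(test_labels)
    ts = np.concatenate(test_timestamps)
    order = np.argsort(ts, kind="stable")  # stable: ties keep their input order
    return X_test[order], y_test[order], ts[order]

# Made-up per-class pieces: class 0 has two samples, class 1 has one
test_data = [np.array([[1.0, 1.0], [3.0, 3.0]]), np.array([[2.0, 2.0]])]
test_labels = [np.array([0, 0]), np.array([1])]
test_timestamps = [np.array([10.0, 30.0]), np.array([20.0])]

X_test, y_test, ts_test = concat_and_sort(test_data, test_labels, test_timestamps)
print(ts_test)  # [10. 20. 30.]
print(y_test)   # [0 1 0]
```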
Create Training Mask:
For each $t \in \text{timestamps}$, the mask entry is true only if $t$ is far enough from every test timestamp:
$\text{mask} \leftarrow [\, \forall t_{\text{test}} \in \text{test\_timestamps}:\ |t - t_{\text{test}}| > \text{time\_window\_train\_sample} \ \text{ for } t \in \text{timestamps} \,]$
Exclude test samples:
$\text{mask} \leftarrow \text{mask} \cap (\text{timestamps} \notin \text{test\_timestamps})$
Prepare Training Data:
$X_{\text{train}} \leftarrow \text{eeg\_data}[\text{mask}], \quad y_{\text{train}} \leftarrow \text{labels}[\text{mask}], \quad \text{train\_timestamps} \leftarrow \text{timestamps}[\text{mask}]$
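The training mask can be computed without explicit loops by broadcasting. The following is a rough sketch, assuming numeric NumPy timestamps; `build_train_mask` is a hypothetical name and the data are invented for illustration.

```python
import numpy as np

def build_train_mask(timestamps, test_timestamps, time_window_train_sample):
    """True where a sample is more than time_window_train_sample away
    from every test timestamp and is not itself a test sample."""
    # Pairwise |t - t_test|, shape (n_samples, n_test)
    gaps = np.abs(timestamps[:, None] - test_timestamps[None, :])
    mask = np.all(gaps > time_window_train_sample, axis=1)
    # Explicit exclusion of test samples, as in the "Exclude test samples" step
    mask &= ~np.isin(timestamps, test_timestamps)
    return mask

# Made-up data: seven samples, one test sample at t = 3
timestamps = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
eeg_data = np.arange(14.0).reshape(7, 2)
labels = np.array([0, 1, 0, 1, 0, 1, 0])
test_timestamps = np.array([3.0])

mask = build_train_mask(timestamps, test_timestamps, time_window_train_sample=1.0)
X_train, y_train, train_timestamps = eeg_data[mask], labels[mask], timestamps[mask]
print(train_timestamps)  # [0. 1. 5. 6.]
```

The pairwise array needs memory proportional to the number of samples times the number of test samples, so for long recordings a chunked or sorted-search variant may be preferable. With a non-negative window, a test sample already fails the distance check (its gap to itself is 0), so the explicit exclusion acts as a safeguard rather than changing the result.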
Excluded timestamps:
$\text{excluded\_samples} \leftarrow \text{timestamps}[\neg \text{mask}]$
Log Excluded Samples: Convert $\text{excluded\_samples}$ to string and write to $\text{log\_file}$.
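The logging step might look like the sketch below. It is illustrative only: the function name `log_excluded` is hypothetical, the string format (a Python list literal on one line) is an assumption since the page does not specify one, and `io.StringIO` stands in for a real file object.

```python
import io
import numpy as np

def log_excluded(timestamps, mask, log_file):
    """Write the timestamps dropped from training to log_file as one line."""
    excluded_samples = timestamps[~mask]
    log_file.write(str(excluded_samples.tolist()) + "\n")
    return excluded_samples

# Made-up data: the mask keeps samples at t = 0, 1, 5, 6
timestamps = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
mask = np.array([True, True, False, False, False, True, True])

log_file = io.StringIO()  # stand-in for an open text file
excluded_samples = log_excluded(timestamps, mask, log_file)
print(log_file.getvalue())  # [2.0, 3.0, 4.0]
```

Because the mask is also false for the test samples themselves, the logged timestamps include the test timestamps as well as the samples dropped for being too close to them.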
Return:
$X_{\text{train}}, X_{\text{test}}, y_{\text{train}}, y_{\text{test}}, \text{train\_timestamps}, \text{test\_timestamps}$