Algorithm: split_data_with_data_with_balanced_test_set
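Input:
eeg_data: EEG data, shape (samples, features)
labels: Class label for each sample
timestamps: Timestamp for each sample
n_per_class: Number of samples per class for the test set
time_window_test_sample: Minimum time separation between test samples
time_window_train_sample: Minimum time separation between training and test samples
log_file: File object to log excluded timestamps
Output:
Xtrain, ytrain, train_timestamps: Time-separated training data
Xtest, ytest, test_timestamps: Balanced test data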
Initialize:
test_data←[], test_labels←[], test_timestamps←[]
For each unique class label l in labels:
  a. Filter class-specific data:
     class_data ← eeg_data[labels == l]
     class_labels ← labels[labels == l]
     class_timestamps ← timestamps[labels == l]
  b. Reverse the class timestamps:
     reversed_timestamps ← class_timestamps[::-1]
  c. Select test timestamps:
     selected_timestamps ← []
     For each timestamp t in reversed_timestamps:
       If ∀ s ∈ selected_timestamps, |t - s| > time_window_test_sample:
         selected_timestamps.append(t)
       Stop if |selected_timestamps| = n_per_class.
  d. Sort selected_timestamps.
  e. Find indices of selected test samples:
     test_indices ← [i for i, t in enumerate(class_timestamps) if t ∈ selected_timestamps]
     test_data.append(class_data[test_indices])
     test_labels.append(class_labels[test_indices])
     test_timestamps.append(class_timestamps[test_indices])
Concatenate and Sort Test Data:
Xtest ← concat(test_data), ytest ← concat(test_labels), test_timestamps ← concat(test_timestamps)
Sort Xtest, ytest, test_timestamps by timestamps.
Select Training Data:
mask ← [∀ t_test ∈ test_timestamps, |t - t_test| > time_window_train_sample, for each t ∈ timestamps]
mask ← mask ∩ (timestamps ∉ test_timestamps)
Xtrain ← eeg_data[mask], ytrain ← labels[mask], train_timestamps ← timestamps[mask]
excluded_samples ← timestamps[¬mask]
Log Excluded Samples:
Convert excluded_samples to string and write to log_file.
Return:
Xtrain, Xtest, ytrain, ytest, train_timestamps, test_timestamps
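
The steps above can be sketched in Python with NumPy. This is a minimal illustration, not a definitive implementation. It rests on several assumptions that the pseudocode does not state: the function name split_with_balanced_test_set is hypothetical, the input is taken to be in chronological order (so reversing a class walks it from latest to earliest), timestamps are taken to be unique (so tracking indices is equivalent to matching timestamp values), and the log format (one space-separated line) is chosen only for the example.

```python
import io
import numpy as np


def split_with_balanced_test_set(eeg_data, labels, timestamps, n_per_class,
                                 time_window_test_sample,
                                 time_window_train_sample, log_file):
    """Hypothetical sketch: balanced, time-separated test set plus filtered training set."""
    eeg_data = np.asarray(eeg_data)
    labels = np.asarray(labels)
    timestamps = np.asarray(timestamps, dtype=float)

    test_idx = []
    for l in np.unique(labels):
        # Steps a-b: indices of this class, walked in reverse order.
        # Assumption: samples are already sorted chronologically.
        reversed_idx = np.flatnonzero(labels == l)[::-1]
        # Step c: greedily keep a sample only if it is farther than the
        # window from every sample already selected for this class.
        selected = []
        for i in reversed_idx:
            if all(abs(timestamps[i] - timestamps[j]) > time_window_test_sample
                   for j in selected):
                selected.append(i)
            if len(selected) == n_per_class:
                break
        test_idx.extend(selected)

    # Concatenate across classes and sort the test set by timestamp.
    test_idx = np.array(test_idx, dtype=int)
    test_idx = test_idx[np.argsort(timestamps[test_idx], kind="stable")]
    X_test = eeg_data[test_idx]
    y_test = labels[test_idx]
    test_timestamps = timestamps[test_idx]

    # Pairwise |t - t_test| via broadcasting; a sample stays in training only
    # if it is farther than the window from every test sample.
    gaps = np.abs(timestamps[:, None] - test_timestamps[None, :])
    mask = np.all(gaps > time_window_train_sample, axis=1)
    mask &= ~np.isin(timestamps, test_timestamps)

    # Log every timestamp left out of training (test samples included).
    excluded_samples = timestamps[~mask]
    log_file.write(" ".join(str(t) for t in excluded_samples) + "\n")

    return (eeg_data[mask], X_test, labels[mask], y_test,
            timestamps[mask], test_timestamps)


# Toy usage: 20 samples, two alternating classes, one sample per time unit.
timestamps = np.arange(20, dtype=float)
labels = np.array([0, 1] * 10)
eeg_data = np.arange(40, dtype=float).reshape(20, 2)
log = io.StringIO()  # stands in for an open log file

X_train, X_test, y_train, y_test, t_train, t_test = split_with_balanced_test_set(
    eeg_data, labels, timestamps, n_per_class=2,
    time_window_test_sample=3.0, time_window_train_sample=1.0, log_file=log)
```

In this toy run the test set holds two samples per class taken from the end of the recording, and every training sample closer than the training window to any test sample is dropped and logged. The pairwise gap matrix has shape (samples, test samples), so for long recordings a loop over test timestamps may be preferable to keep memory use down.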