QTab: Inter-Case-Aware Remaining Time Prediction via Queuing Networks and Tabular Machine Learning

This is the repository for our paper "QTab: Inter-Case-Aware Remaining Time Prediction via Queuing Networks and Tabular Machine Learning", submitted to the CAiSE 2026 conference.

Installation

First, clone this GitHub repository to your local machine:

git clone https://github.com/keyvan-amiri/QTab

To install and set up the required environment on a Linux system, run the following commands:

conda create -n QTab python=3.11
conda activate QTab
pip install -r requirements.txt
conda clean --all

QTab Approach

During training, QTab processes the event log in three main steps: (1) it augments the log by identifying enabler activities and discovering resource roles; (2) it constructs a queuing network from the augmented log to model inter-case dependencies; and (3) it extracts intra- and inter-case features from both the augmented log and the queuing network to create a tabular dataset. A tabular machine learning model is then trained on this dataset. During inference, the queuing network is updated with new events, and the same feature extraction procedure is applied to ongoing cases to produce the test dataset, on which the trained model predicts remaining time.
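To make the idea of a tabular dataset with intra- and inter-case features concrete, here is a minimal sketch on a toy event log. The data, the function name `build_rows`, and the two features are illustrative assumptions, not the features QTab extracts: `elapsed_h` depends only on the case itself, `active_other_cases` counts other cases open at the same moment, and `remaining_h` is the prediction target.

```python
from datetime import datetime

# Toy event log (hypothetical data): (case id, activity, completion timestamp).
EVENTS = [
    ("c1", "A", datetime(2024, 1, 1, 9, 0)),
    ("c2", "A", datetime(2024, 1, 1, 9, 30)),
    ("c1", "B", datetime(2024, 1, 1, 10, 0)),
    ("c2", "B", datetime(2024, 1, 1, 12, 0)),
    ("c1", "C", datetime(2024, 1, 1, 11, 0)),
]


def build_rows(events):
    """Return one row per event, ordered by timestamp.

    Illustrative only: QTab derives its inter-case features from a
    queuing network, which this sketch does not model.
    """
    start, end = {}, {}
    for case, _, ts in events:
        start[case] = min(ts, start.get(case, ts))
        end[case] = max(ts, end.get(case, ts))

    rows = []
    for case, activity, ts in sorted(events, key=lambda e: e[2]):
        # Inter-case feature: other cases that are open at time ts.
        active_others = sum(
            1 for c in start if c != case and start[c] <= ts <= end[c]
        )
        rows.append({
            "case": case,
            "activity": activity,
            # Intra-case feature: hours since the case started.
            "elapsed_h": (ts - start[case]).total_seconds() / 3600,
            "active_other_cases": active_others,
            # Target: hours until the last event of the case.
            "remaining_h": (end[case] - ts).total_seconds() / 3600,
        })
    return rows


if __name__ == "__main__":
    for row in build_rows(EVENTS):
        print(row)
```

A tabular regressor would then be fitted on the feature columns with `remaining_h` as the target.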

Event Log Augmentation

QTab takes an event log and augments each event with information required to construct a queuing network. The augmentation procedure involves three steps: (1) identifying enabler activities, (2) discovering roles, and (3) optionally estimating start timestamps.

If the event log includes both start and complete timestamps, specify the dataset name and run the Prepare_log.py script as shown below:

python Prepare_log.py --dataset BPIC_2017_W

If the event log includes only complete timestamps, specify the dataset name and run the augment.py and Prepare_log.py scripts sequentially, in that order, as shown below:

python augment.py --dataset BPIC15_1
python Prepare_log.py --dataset BPIC15_1

Each event log has its own configuration file that controls the augmentation behavior (e.g., BPIC15_1.yaml for the BPIC 15-1 event log).
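The two cases above can be wrapped in a small helper that selects the scripts to run. This is a sketch under assumptions: the helper names and the `has_start_timestamps` flag are hypothetical, and the script order follows the commands listed above. Only the script names and the `--dataset` option come from this README.

```python
import subprocess
import sys


def augmentation_commands(dataset, has_start_timestamps):
    """Build the command lines for the augmentation step.

    `has_start_timestamps` is a hypothetical flag supplied by the caller;
    the scripts themselves are not inspected here.
    """
    commands = []
    if not has_start_timestamps:
        # Logs with only complete timestamps go through augment.py first.
        commands.append([sys.executable, "augment.py", "--dataset", dataset])
    commands.append([sys.executable, "Prepare_log.py", "--dataset", dataset])
    return commands


def run_augmentation(dataset, has_start_timestamps):
    """Run the commands in order, stopping at the first failure."""
    for cmd in augmentation_commands(dataset, has_start_timestamps):
        subprocess.run(cmd, check=True)
```

For example, `run_augmentation("BPIC15_1", has_start_timestamps=False)` would call augment.py and then Prepare_log.py from the repository root.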

Training Tabular Models

For training the tabular models after feature extraction, proceed as follows:

  1. Put the extracted datasets and metadata in a directory called 'data/raw'. The file structure should be as follows:
data/
├── processed/
└── raw/
    ├── CG columns/
    ├── CG data/
    ├── Queue columns/
    └── Queue data/
  2. Run clean_data.py to store preprocessed versions of the datasets in 'data/processed', ready to be used for modeling.
  3. Install tabarena by following the installation instructions: https://github.com/autogluon/tabarena
  4. Run each model on each dataset using: 'python run_tabarena.py --dat_name DAT_NAME --model_name MODEL_NAME'
  5. Run 'get_res_tables.py' to obtain result tables with predictions and time measurements.
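The steps above can be partly scripted. The sketch below creates the expected directory layout and builds one run_tabarena.py command per dataset and model pair. The helper names are hypothetical, and the dataset and model names in the usage note are placeholders; valid values depend on run_tabarena.py and are not listed in this README.

```python
from itertools import product
from pathlib import Path

# Sub-directories of data/raw, as listed in the file structure above.
RAW_SUBDIRS = ["CG columns", "CG data", "Queue columns", "Queue data"]


def make_layout(root="."):
    """Create data/processed and the data/raw sub-directories under root."""
    data_dir = Path(root) / "data"
    (data_dir / "processed").mkdir(parents=True, exist_ok=True)
    for name in RAW_SUBDIRS:
        (data_dir / "raw" / name).mkdir(parents=True, exist_ok=True)
    return data_dir


def tabarena_commands(dat_names, model_names):
    """Return one run_tabarena.py command line per (dataset, model) pair."""
    return [
        ["python", "run_tabarena.py", "--dat_name", dat, "--model_name", model]
        for dat, model in product(dat_names, model_names)
    ]
```

Calling `tabarena_commands(["DAT_A", "DAT_B"], ["MODEL_X"])` yields two command lines that can be passed to `subprocess.run` one after another.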

Training Baseline Models

To train and evaluate the baseline approaches (LS-ICE, PGTNet, and Congestion Graphs), follow the instructions here.

About

Predictive Process Monitoring using Social Network Analysis and Tabular Machine Learning
