orthoPE

A Python library for computing Orthographic Prediction Error (oPE) estimates based on predictive coding models of visual word recognition.

This library implements multiple oPE models, including several beyond those described in the original publication, to analyze how visual and orthographic features predict behavioral and neural responses during word recognition.

Features

Multiple oPE Estimators: 7 different estimation methods including L1/L2 norms, Mahalanobis distance, and Kalman-weighted prediction error
Flexible Text Rendering: Pixel-based rendering with TrueType fonts or abstract letter-based encoding
Noise Simulation: Configurable Gaussian noise to model perceptual uncertainty
Multi-Language Support: German (fully tested), with experimental support for English, French, and Dutch
Analysis Tools: Built-in correlation analysis and visualization functions for behavioral (RT) and neural (EEG) data

Installation

pip install numpy pandas scipy matplotlib seaborn pillow tqdm

Clone the repository:

git clone https://github.com/yourusername/orthoPE.git
cd orthoPE

Fonts

The library supports rendering with TrueType fonts. Sample fonts used in testing include Courier, Cambria, and Verdana. Due to licensing restrictions, fonts are not included in this repository.

To use font-based rendering:

Download TrueType fonts (.ttf files) from legitimate sources
Place them in a ./fonts/ directory in the project root
Additional fonts can be added but have not been systematically tested

Note: The package can run without proprietary fonts using the letter-based encoding mode (font='word'), which uses abstract orthographic representations instead of pixel rendering.
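The fallback described in the note can be automated with a small check. The sketch below is not part of orthoPE: `choose_font` is a hypothetical helper, and it assumes font files are stored as `<font name>.ttf` inside ./fonts/, a naming scheme this README does not specify.

```python
from pathlib import Path


def choose_font(font_name, fonts_dir="./fonts"):
    """Return font_name if a matching .ttf file exists, else 'word'.

    Hypothetical helper: assumes files are named '<font_name>.ttf'.
    """
    candidate = Path(fonts_dir) / f"{font_name}.ttf"
    return font_name if candidate.is_file() else "word"


# Falls back to letter-based encoding when ./fonts/courier.ttf is missing.
font = choose_font("courier")
```

The returned value could then be passed as the font argument, for example to orthope.run_all_oPEs('german', font).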

Data

Corpus and experimental data are not included in this repository. The data used in this project are available at https://osf.io/d8yjc/. They are part of the research paper https://doi.org/10.1016/j.neuroimage.2020.116727, which should be cited when using the data.

Required data structure:

Corpus files: CSV with word frequencies (e.g., ger5_fin.csv)
Behavioral data: CSV with reaction time data from lexical decision tasks
EEG data (optional): CSV with neural responses at different time windows

Place data files in the ./data_repository/ directory.
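Before running the pipeline, it may help to confirm that the expected files are in place. The following is a minimal sketch, not a function provided by orthoPE: `missing_data_files` is a hypothetical helper, and only ger5_fin.csv is a file name taken from this README; names for the RT and EEG files would need to be filled in from the OSF download.

```python
from pathlib import Path


def missing_data_files(filenames, data_dir="./data_repository"):
    """Return the names from `filenames` that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in filenames if not (root / name).is_file()]


# 'ger5_fin.csv' is the corpus file named above; add your own RT/EEG files.
missing = missing_data_files(["ger5_fin.csv"])
if missing:
    print("Missing data files:", ", ".join(missing))
```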

Quick Start

Compute oPE estimates

import orthope

# Pixel-based rendering with specific font
orthope.run_all_oPEs('german', 'courier')

# Letter-based encoding (no font required)
orthope.run_all_oPEs('german', 'word')

Run full analysis pipeline

import pipelines

# Compute models and generate visualizations
pipelines.compute_all_models('german', fonts=['courier'])
pipelines.compute_all_models('german', fonts=['word'])

# Integrate multiple models into single CSV
modeldict = {
    'pred_err_l2': 'PE2',
    'mahalanobis': 'Mahalanobis',
    'kalmanw_pred_err': 'Kalman'
}
pipelines.integrate_models_in_csv(modeldict, 'all_models')

Load and analyze results

import orthope

# Initialize estimator
gg = orthope.OrthopeEstimator('german', 'courier', noise=0.1)

# Load computed oPE estimates
opes_df = gg.load_opes()

# Load behavioral data
rt_df = gg.load_data_rt()

# Load EEG data (German only)
eeg_df = gg.load_data_eeg()

# Compute correlations
correlations = orthope.compute_oPE_RT_correlations('german', 'courier')

# Generate visualizations
orthope.plot_oPE_RT_scatterplots('german', 'courier')
orthope.plot_oPE_RT_rhos('german', 'courier')
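The function names above suggest rank correlations (rho) between oPE estimates and reaction times, although this README does not state how orthoPE computes them. As an illustration only, the sketch below computes Spearman's rho in pure Python as the Pearson correlation of average ranks. The helper names and the toy values are assumptions and are not taken from the library or the data set.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy values, not real data: oPE and RT rise together, so rho is 1.
ope = [10.0, 12.5, 15.0, 20.0]
rt = [520.0, 540.0, 555.0, 610.0]
rho = spearman_rho(ope, rt)
```

In practice, scipy.stats.spearmanr (SciPy is already listed as a dependency) gives the same statistic together with a p-value.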

oPE Estimator Types

The library implements 7 oPE estimation methods:

n_pixels_l1 - L1 norm of pixel differences
n_pixels_l2 - L2 norm of pixel differences
pred_err_l1 - L1 prediction error from corpus statistics
pred_err_l2 - L2 prediction error from corpus statistics
pw_pred_err - Pixel-wise prediction error
mahalanobis - Mahalanobis distance using corpus covariance
kalmanw_pred_err - Kalman-weighted prediction error
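The list above names the estimators without giving formulas. To make the general idea concrete, the sketch below shows one way such quantities can be computed, under stated assumptions: the prediction is taken to be the pixel-wise mean of all corpus images, the L1 and L2 errors are norms of the difference between a word image and that prediction, and the Mahalanobis distance is simplified to a diagonal covariance (per-pixel variance only). None of these functions are orthoPE's own, and the library's actual definitions may differ.

```python
from math import sqrt


def pixel_mean(images):
    """Pixel-wise mean over equally sized, flattened images."""
    n = len(images)
    return [sum(px) / n for px in zip(*images)]


def pixel_variance(images, mean):
    """Pixel-wise population variance around `mean`."""
    n = len(images)
    return [sum((v - m) ** 2 for v in px) / n for px, m in zip(zip(*images), mean)]


def l1_error(image, prediction):
    """Sum of absolute pixel differences."""
    return sum(abs(v - p) for v, p in zip(image, prediction))


def l2_error(image, prediction):
    """Euclidean norm of the pixel differences."""
    return sqrt(sum((v - p) ** 2 for v, p in zip(image, prediction)))


def diagonal_mahalanobis(image, mean, variance, eps=1e-9):
    """Mahalanobis distance under a diagonal-covariance assumption."""
    return sqrt(sum((v - m) ** 2 / (s + eps) for v, m, s in zip(image, mean, variance)))


# Toy "corpus" of three 4-pixel images (illustrative values only).
corpus = [
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
]
prediction = pixel_mean(corpus)
variance = pixel_variance(corpus, prediction)
word = [1.0, 1.0, 0.0, 0.0]
errors = (l1_error(word, prediction), l2_error(word, prediction))
```

Pixels that never vary in the corpus have zero variance, so the small `eps` term keeps the division defined; a deviation at such a pixel then dominates the distance.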

Testing

Run basic regression tests to ensure nothing is broken:

python test_basic.py

This runs 19 fast tests (~0.5 seconds) covering core functionality.

Language Support

German: Fully tested and validated
English, French, Dutch: Experimental support (not systematically tested)

Use non-German languages with caution and validate results independently.

Output

The library generates:

Models: CSV files with oPE estimates (./models/)
Results: Correlation plots and scatterplots (./results/)
Integrated data: Combined model outputs (all_models.csv)

Advanced Usage

Custom noise levels

# Initialize with specific noise level
gg = orthope.OrthopeEstimator('german', 'courier', noise=1.5)

# Available noise levels: [0.1, 0.2, 0.5, 0.8, 1.0, 1.5, 2.0]
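How the noise parameter is applied internally is not described here. As a rough illustration, assuming the value is the standard deviation of zero-mean Gaussian noise added independently to each pixel, the effect can be sketched as follows. `add_gaussian_noise` is a hypothetical helper, not part of orthoPE.

```python
import random


def add_gaussian_noise(pixels, noise, seed=None):
    """Add zero-mean Gaussian noise with standard deviation `noise` per pixel."""
    rng = random.Random(seed)  # a fixed seed makes the noise reproducible
    return [value + rng.gauss(0.0, noise) for value in pixels]


clean = [0.0, 1.0, 1.0, 0.0]
noisy = add_gaussian_noise(clean, noise=0.1, seed=42)
```

Larger noise values blur the difference between word images, which is one way to model perceptual uncertainty.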

Letter-based encoding

# Use abstract orthographic representation instead of pixels
wg = orthope.LetterOrthopeEstimator('german', noise=0.1)
result = wg.__render_text__('hello')

Compute single oPE estimate

gg = orthope.OrthopeEstimator('german', 'courier', noise=0.1)
gg.corpus_stats = gg.__estimate_corpus_stats__()
ope_value = gg.__estimate_ope__('example', estimator='mahalanobis')
