Futrell2018 SPRT benchmark using GAMs + control predictors #107

Open

hans wants to merge 7 commits into main from hans/reading-times-gam

Contributor

hans commented Nov 4, 2022 (edited)

I'm starting a benchmark implementation for reading-time evaluation that uses control predictors (word length and frequency, plus spillover effects from the previous word(s)) as well as a more advanced statistical model (GAMs).

FWIW this PR is also a fun test case of a benchmark with Conda dependencies (needs R and an R package, which obviously can't be installed via pip).

Still to-do (& happy to accept help if anyone is interested):

- Predict both RT mean and variance. Recent studies have argued that between-subject RT variance is meaningfully related to surprisal. Use the Gaussian location-scale implementation included in mgcv.
- Held-out evaluation. Currently the benchmark evaluates on the training data, yikes.
- Test code.
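For reference, the mean-and-variance item above can be written down as a Gaussian location-scale model. This is only a sketch: which predictors enter the scale term is an assumption, the log link on the standard deviation is chosen for illustration, and the smooth functions f and g are placeholders. In mgcv this kind of model is specified with the gaulss family and a list of two formulae (one for the mean, one for the scale); gaulss uses its own link for the scale, so the mgcv documentation should be checked for the exact parameterization.

```latex
\begin{aligned}
\mathrm{RT}_i &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_i^{2}\right) \\
\mu_i &= \beta_0 + f_1(\mathrm{surprisal}_i) + f_2(\mathrm{prev\_surp}_i)
         + f_3(\mathrm{len}_i) + f_4(\mathrm{freq}_i) \\
\log \sigma_i &= \gamma_0 + g_1(\mathrm{surprisal}_i)
\end{aligned}
```

Under this formulation, a relationship between surprisal and RT variance would show up as a non-constant g_1.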

hans and others added 5 commits November 4, 2022 17:09

draft GAM benchmark -- doesn't crash! returns scores! but no held-out evaluation yet
eae79e8

rename folder to be a valid python module
219173d

use environment.yml to match #113
6fba657

Merge branch 'main' into hans/reading-times-gam
9e8cc36

update environment to resolve dependencies more smoothly
caadb00

mschrimpf reviewed


brainscore_language/benchmarks/futrell2018_gam/benchmark.py

+                      data_mask = ~data.isna().any(axis=1)
+                      data = data[data_mask]
+                      # TODO check that columns match formula variable names

Member

mschrimpf Dec 31, 2022

todo
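One way the column check from the TODO above could look is sketched below. It assumes the formula is available as a plain string before it is handed to R. The helper names formula_variables and missing_columns are hypothetical, and the parsing is a regex heuristic rather than a full R parser.

```python
import re

# R-style identifier that is not the tail of a number or a longer name.
_NAME = re.compile(r"(?<![A-Za-z0-9_.])[A-Za-z_.][A-Za-z0-9_.]*")
# R constants that can appear as argument values but are not data columns.
_R_CONSTANTS = {"TRUE", "FALSE", "T", "F", "NULL", "NA"}


def formula_variables(formula):
    """Heuristically collect the variable names used in an R formula string."""
    # Remove quoted strings (e.g. bs="cr") so their contents are not parsed.
    unquoted = re.sub(r"\"[^\"]*\"|'[^']*'", "", formula)
    names = set()
    for match in _NAME.finditer(unquoted):
        rest = unquoted[match.end():].lstrip()
        if rest.startswith("("):
            continue  # function call such as s(...) or te(...)
        if rest.startswith("=") and not rest.startswith("=="):
            continue  # keyword argument name such as k=10
        if match.group() in _R_CONSTANTS:
            continue
        names.add(match.group())
    return names


def missing_columns(formula, columns):
    """Return the formula variables that are absent from the given columns."""
    return sorted(formula_variables(formula) - set(columns))
```

Called as missing_columns(formula, data.columns) right before the fit, a non-empty result could be turned into an error that is easier to read than one raised from inside R.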

brainscore_language/benchmarks/futrell2018_gam/benchmark.py

+                      data["prev_surp"] = data["surprisal"].shift(1)
+                      data["len"] = self.data[data_mask].word_core.str.len()
+                      data["prev_len"] = data["len"].shift(1)
+                      data["freq"] = surprisals  # HACK need to look this up.

Member

mschrimpf Dec 31, 2022

todo?
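A sketch of how the control predictors in this hunk might be built without the frequency placeholder. Several things here are assumptions: the sentence_id column, the word_counts table (raw counts from some reference corpus) and the function name add_control_predictors are all hypothetical. The sketch also shifts within sentences, because a plain shift(1) takes the value from the preceding row even when that row belongs to a different sentence.

```python
import math

import pandas as pd


def add_control_predictors(data, word_counts, total_count):
    """Add length, log-frequency and previous-word (spillover) columns.

    `data` needs the columns word_core, surprisal and sentence_id
    (sentence_id is an assumed column name, not taken from the PR).
    `word_counts` maps lowercased words to corpus counts (hypothetical table).
    """
    data = data.copy()
    data["len"] = data["word_core"].str.len()

    def log_freq(word):
        # Words missing from the count table get NaN instead of a guess.
        count = word_counts.get(word)
        return math.log(count / total_count) if count else float("nan")

    data["freq"] = data["word_core"].str.lower().map(log_freq)

    # Shift within each sentence so spillover never crosses a boundary.
    data["prev_surp"] = data.groupby("sentence_id")["surprisal"].shift(1)
    data["prev_len"] = data.groupby("sentence_id")["len"].shift(1)
    data["prev_freq"] = data.groupby("sentence_id")["freq"].shift(1)
    return data
```

Rows whose previous-word columns are NaN (the first word of each sentence) are then dropped by the existing NaN mask.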

brainscore_language/benchmarks/futrell2018_gam/benchmark.py

+                      r_mgcv = importr("mgcv")
+                      model = r_mgcv.gam(formula, data=data)
+                      # TODO held out data

Member

mschrimpf Dec 31, 2022

todo
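For the held-out TODO, one option is grouped k-fold cross-validation: fit the GAM on the training rows of each fold and score its predictions on the test rows. The sketch below covers only the split. Grouping by sentence (or story) so that neighbouring words never straddle the train/test boundary is an assumption about what is wanted here, and grouped_k_fold is a hypothetical helper, not part of this PR.

```python
import random


def grouped_k_fold(groups, n_splits=5, seed=0):
    """Yield (train_indices, test_indices) with no group on both sides.

    `groups` holds one group label per row, e.g. a sentence or story id.
    """
    unique = sorted(set(groups))
    if n_splits < 2 or n_splits > len(unique):
        raise ValueError("n_splits must be between 2 and the number of groups")
    # Shuffle group labels reproducibly, then deal them out round-robin.
    random.Random(seed).shuffle(unique)
    fold_of = {group: i % n_splits for i, group in enumerate(unique)}
    for fold in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test
```

Each yielded pair could then index the data frame (for example with data.iloc[train] and data.iloc[test]), with the per-fold scores averaged at the end.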

brainscore_language/benchmarks/futrell2018_gam/benchmark.py

Comment on lines +81 to +89

+                      surprisals = candidate.digest_text(stimuli)['behavior']
+                      attach_presentation_meta(surprisals, self.data['presentation'])
+                      # exclude first words
+                      surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
+                      data_mask = self.data['word_within_sentence_id'] != 1
+                      # Fit and evaluate GAM model
+                      model, predictions, targets = self.fit(surprisals, data_mask)

Member

mschrimpf Dec 31, 2022

Suggested change

-                      surprisals = candidate.digest_text(stimuli)['behavior']
-                      attach_presentation_meta(surprisals, self.data['presentation'])
-                      # exclude first words
-                      surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
-                      data_mask = self.data['word_within_sentence_id'] != 1
-                      # Fit and evaluate GAM model
-                      model, predictions, targets = self.fit(surprisals, data_mask)
+                      model_reading_times = candidate.digest_text(stimuli)['behavior']
+                      attach_presentation_meta(model_reading_times, self.data['presentation'])
+                      # exclude first words
+                      model_reading_times = model_reading_times[model_reading_times['word_within_sentence_id'] != 1]
+                      data_mask = self.data['word_within_sentence_id'] != 1
+                      # Fit and evaluate GAM model
+                      model, predictions, targets = self.fit(model_reading_times, data_mask)

brainscore_language/benchmarks/futrell2018_gam/benchmark.py

		return score


		class SplitHalvesConsistency:

Member

mschrimpf Dec 31, 2022

Could import SplitHalvesConsistency from the existing futrell2018 benchmark (language/brainscore_language/benchmarks/futrell2018/benchmark.py, line 55 in 01b229c: class SplitHalvesConsistency:), since the two classes are identical. Or we put both benchmarks inside the benchmarks/futrell2018 plugin? I'm fine with either, slightly leaning towards adding this to the futrell2018 plugin.

Member

mschrimpf commented May 1, 2023

Hi @hans just checking in on this PR

mschrimpf added 2 commits April 30, 2023 21:04

Merge branch 'main' into hans/reading-times-gam
e420983

Merge branch 'main' into hans/reading-times-gam
2291eb6
