This repository contains a collection of notebooks offering an initial, surface-level analysis of the dataset. The notebooks were generated by Claude Code (CC), so they may contain mistakes!
Feel free to take these notebooks and build on them to create your own submission to the RSE Data Competition. There is a £250 prize for the best submission!
These notebooks were created by typing the following command into Claude Code:
execute PROGRAM.md
The files in this repository are the following:
- PROGRAM.md - the instructions for Claude Code to create the notebooks.
- DATA.md - a description of the dataset and how to access it.
- CLAUDE.md - some small personal preferences for CC.
- contents.md - a list of the contents of the repository, generated by CC when running PROGRAM.md.
- notebooks/ - the generated notebooks containing the analysis of the dataset. Also contains HTML versions of the notebooks for easier viewing.
- pyproject.toml and uv.lock - used to install dependencies for the notebooks.
NOTE: You will need to change the data location in DATA.md to point to where you have the dataset downloaded on your own file system.
Here are some initial ideas for further analysis from my own head:
- Which country pays its RSEs the best, and has this changed over time? This would have to control for inflation and for cost-of-living differences between countries.
- Programming languages used by RSEs: how this has changed over time and how it compares to the rest of the software industry. For example, there is a lot less JavaScript in the RSE world than in the software industry as a whole, but presumably there are many other differences.
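As a rough starting point for the salary idea, here is a minimal sketch of how nominal salaries could be put on a comparable footing. Everything in it is an assumption for illustration: the `real_salary` helper, the tuple layout, and the numbers are invented and do not come from the dataset, and the price index and purchasing-power factor would need to come from an external source of your choice.

```python
from collections import defaultdict


def real_salary(nominal, price_index, ppp_factor):
    """Convert a nominal local-currency salary into a comparable real figure.

    price_index: price level for that country and year relative to a base
                 year (base year = 1.0), used to remove inflation.
    ppp_factor:  local currency units per unit of a common reference
                 currency, used to remove cost-of-living differences.
    """
    return nominal / price_index / ppp_factor


# Invented rows for illustration only (NOT real survey data):
# (country, year, nominal salary, price index, PPP factor)
rows = [
    ("A", 2018, 40000.0, 1.00, 1.0),
    ("A", 2022, 46200.0, 1.10, 1.0),
    ("B", 2018, 60000.0, 1.00, 1.5),
    ("B", 2022, 72000.0, 1.20, 1.5),
]

# adjusted[country][year] -> salary in base-year, reference-currency terms
adjusted = defaultdict(dict)
for country, year, nominal, price_index, ppp_factor in rows:
    adjusted[country][year] = real_salary(nominal, price_index, ppp_factor)

for country, by_year in sorted(adjusted.items()):
    for year, value in sorted(by_year.items()):
        print(f"{country} {year}: {value:,.0f}")
```

In this toy example country B's nominal salary grows faster than country A's, but after adjustment only A shows real growth, which is why the comparison needs both corrections before any ranking is drawn.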
Rename the CLAUDE.md file to AGENTS.md, then run PROGRAM.md the same way with your coding agent of choice and pray! The results won't be the same, but maybe they will uncover some different insights! :)