CLI: refactor(isla): rewrite assembler as ELF pipeline with linksem#101

Open
febyeji wants to merge 1 commit into main from feature/assembler-elf-pipeline

@febyeji commented Apr 2, 2026

Summary

Replace the old assembler pipeline (echo | llvm-mc → .o, objcopy → .bin) with a structured ELF pipeline: .s + .ld generation → clang+lld → linksem ELF parsing.

The old pipeline returned raw bytes only, losing all address information. The new pipeline takes an assembly_input and returns an assembly_result that carries both the machine code bytes and the linker-assigned addresses.

Notes

Feedback wanted!

  1. Single clang invocation instead of separate llvm-mc + ld.lld: clang handles assemble + link in one step, reducing temp files and config keys (assemble + extract → single command).

  2. Linksem for ELF parsing instead of objcopy: We need symbol addresses and section addresses, not just raw bytes. Linksem gives us the full ELF structure as OCaml types.

  3. Delete symbols.ml: the manual bump allocator is replaced by real linker address resolution. Each symbol gets its own page so that it can have independent page-table attributes.

  4. Relocatable vs fixed-addr sections: relocatable sections share .text (the linker places them), while fixed-addr sections get their own named ELF sections at the specified addresses (e.g. for exception handlers).
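
To make point 4 concrete, here is a minimal sketch of how a linker script for such a layout could be generated. The function name `linker_script`, its parameters, and the `.vectors` section are hypothetical and not taken from the PR. The sketch assumes fixed sections are given as (name, address) pairs and that all relocatable code lands in a single `.text` output section.

```ocaml
(* Hypothetical sketch: emit a linker script that pins each fixed section
   at its address and places the shared .text section at [text_base].
   Output sections are sorted by address so the layout is monotonic. *)
let linker_script ~(text_base : int) ~(fixed : (string * int) list) : string =
  let all = (".text", text_base) :: fixed in
  let sorted = List.sort (fun (_, a) (_, b) -> compare a b) all in
  (* One output section per entry: "<name> <address> : { *(<name>) }" *)
  let line (name, addr) =
    Printf.sprintf "  %s 0x%x : { *(%s) }\n" name addr name
  in
  "SECTIONS\n{\n" ^ String.concat "" (List.map line sorted) ^ "}\n"

(* Example: one fixed section for a handler, relocatable code at 0x1000. *)
let () =
  print_string (linker_script ~text_base:0x1000 ~fixed:[ (".vectors", 0x400) ])
```

A single invocation along the lines of `clang -nostdlib -fuse-ld=lld -Wl,-T,layout.ld test.s -o test.elf` could then consume the script. The exact flags, file names, and target used by the PR are not shown in this thread, so treat these as assumptions.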

@febyeji force-pushed the feature/assembler-elf-pipeline branch 2 times, most recently from cb2163b to a4869ad, on April 6, 2026 11:59
@febyeji force-pushed the feature/test-cleanup branch 3 times, most recently from c07e534 to 8a3cecd, on April 15, 2026 08:36
@febyeji force-pushed the feature/assembler-elf-pipeline branch from a4869ad to d3b7fd0 on April 15, 2026 09:20
@febyeji marked this pull request as ready for review on April 21, 2026 06:43
@febyeji changed the base branch from feature/test-cleanup to main on April 29, 2026 02:00
@febyeji force-pushed the feature/assembler-elf-pipeline branch 4 times, most recently from 8ac5cbf to eb03070, on April 30, 2026 08:25

@tperami left a comment

Some nit-picks; also, the allocation of symbols can be simplified to avoid conflicts.

Comment thread cli/lib/isla/assembler.mli
Comment thread cli/lib/isla/assembler.ml Outdated
Comment thread cli/lib/isla/assembler.ml Outdated
Comment thread cli/lib/isla/assembler.ml Outdated

(** Compute (section_name, address) pairs for all sections. *)
let compute_section_layout (input : assembly_input) : (string * int) list =
  let base = base_addr () in

I would recommend not putting this in the config at all: just start at the next page after the fixed sections, or at the first non-zero page if there are none (e.g., 0x1000 for page_bits = 12).
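
One way to read this suggestion is sketched below. The names `next_page` and `relocatable_base` are hypothetical, and the sketch assumes fixed sections are described as (start address, size in bytes) pairs with positive sizes. It is not the code in the PR.

```ocaml
(* First page boundary strictly above [addr]. *)
let next_page ~(page_bits : int) (addr : int) : int =
  let size = 1 lsl page_bits in
  ((addr / size) + 1) * size

(* Hypothetical sketch: choose the base for relocatable sections without a
   config key. [fixed] lists the fixed sections as (start, size) pairs. *)
let relocatable_base ~(page_bits : int) (fixed : (int * int) list) : int =
  match fixed with
  | [] -> 1 lsl page_bits (* first non-zero page, 0x1000 for page_bits = 12 *)
  | _ ->
    (* Highest byte occupied by any fixed section. *)
    let last_byte =
      List.fold_left
        (fun acc (start, size) -> max acc (start + size - 1))
        0 fixed
    in
    next_page ~page_bits last_byte
```

With `page_bits = 12`, an empty list gives 0x1000, and a single fixed section occupying 0x1000 to 0x1fff gives 0x2000.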

Comment thread cli/lib/isla/assembler.ml
Comment thread cli/lib/isla/assembler.ml
@febyeji force-pushed the feature/assembler-elf-pipeline branch from eb03070 to 592370a on May 7, 2026 11:59
CLI: refactor(assembler): expose full ELF symbol table in assembly_result

Replace linked_symbol with direct data + symbols fields.
symbols now contains the full ELF symbol table for uniform
address lookup (data symbols, code labels, sections).
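
As an illustration of what uniform address lookup could look like, here is a minimal sketch. The field names `data` and `symbols` come from the commit message above. Their types, the helper `address_of`, and the example values are assumptions made for illustration only.

```ocaml
(* Hypothetical shape of the result record; the real definition is in
   assembler.mli and may differ. *)
type assembly_result = {
  data : (int * string) list;    (* assumed: (address, raw bytes) *)
  symbols : (string * int) list; (* assumed: symbol name -> address *)
}

(* One lookup path for data symbols, code labels, and section names. *)
let address_of (result : assembly_result) (name : string) : int option =
  List.assoc_opt name result.symbols

(* Made-up example values, with placeholder bytes. *)
let example = {
  data = [ (0x1000, "\x00\x00\x00\x00") ];
  symbols = [ (".text", 0x1000); ("handler", 0x400); ("x", 0x2000) ];
}

let () =
  match address_of example "handler" with
  | Some addr -> Printf.printf "handler is at 0x%x\n" addr
  | None -> print_endline "handler not found"
```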
@febyeji force-pushed the feature/assembler-elf-pipeline branch from 592370a to 9277e33 on May 7, 2026 12:09