I am an AI Research Engineer focused on building reliable data and evaluation pipelines for Large Language Models.
I currently work on research and engineering for Brazilian Portuguese language models, covering dataset curation, tokenization workflows, model evaluation, post-training datasets, and reproducible tooling for LLM development.
My background combines machine learning engineering, graph algorithms, competitive programming, backend systems, and applied research in heterogeneous GPU scheduling for AI workloads.

- Python library and CLI for token counting, token distribution analysis, and dataset inspection in LLM data workflows. Focus: LLM data pipelines, tokenization, Hugging Face datasets, and reproducible reports.
- Comparative study of list scheduling heuristics for heterogeneous GPU environments, inspired by AI training workloads. Algorithms: DLS, HEFT, HEFT LA, PEFT, IHEFT, and IPEFT.
- Repository with implementations, experiments, reports, and study material related to NLP, deep learning, and modern AI systems. Focus: Transformers, multimodal models, NLP experiments, and reproducible learning projects.
- Public datasets and resources related to Brazilian Portuguese LLM development, instruction data, evaluation, and data-centric AI workflows. Focus: dataset curation, pretraining data, instruction tuning data, and LLM evaluation.

I have experience in software engineering, machine learning engineering, and academic research.
My work has involved:

| Area | Focus |
| --- | --- |
| Artificial Intelligence | LLM data pipelines, dataset curation, model evaluation, post-training datasets, tokenization workflows, and reproducible ML tooling. |
| Research | Scheduling algorithms, graph theory, heterogeneous GPU environments, performance evaluation, and AI workload optimization. |
| Software Engineering | Backend systems, Python tooling, APIs, automation, JavaScript applications, TypeScript applications, and production-oriented development. |
| Competitive Programming | Algorithms, data structures, optimization, problem solving, and ICPC-style contests. |



