daVinci-LLM-3B

daVinci-LLM-3B is a research-oriented base model whose focus is open pretraining science rather than instruction tuning. The model card describes a two-stage curriculum over about 8T tokens that combines broad web-scale pretraining with reasoning-heavy QA data. It reports an overall score of 51.72 across 19 benchmarks, which approaches the results of some 7B-class models. Its main differentiator is transparency: the team publishes the pipeline, checkpoints, processing logic, and large-scale ablation results, which makes the model especially relevant for researchers studying data quality, training dynamics, and evaluation design.

Overview

daVinci-LLM-3B is a 3B base language model built to make pretraining transparent and reproducible. Its release includes not only the weights, but also training trajectories, intermediate checkpoints, data-processing decisions, and more than 200 ablation studies.

📚 Non-interactive language modeling · 🤖 AI research assistance · 🔍 Data quality control · 🧠 Model training

About Sii GAIR-NLP

Last updated: March 27, 2026
