Overview
This project addresses educational inequality with an offline AI tutor that runs on secondhand computers. The system converts any text article into an interactive lesson plan, which a reasoning model emits as structured JSON.
Core Innovation
The approach fine-tunes DeepSeek-R1-Distill-Llama-8B, a compact 8-billion-parameter model that can run locally without internet connectivity. The fine-tuned model transforms articles into structured learning modules containing key concepts, quizzes, and discussion prompts.
Technical Architecture
JSON Schema Structure
The lesson format includes the key concepts, quizzes, and discussion prompts described above.
All outputs are validated against a defined schema.
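As a rough illustration of what schema validation could look like, the sketch below checks a generated lesson against a simple field-to-type map using only the Python standard library. The field names (`title`, `key_concepts`, `quiz`, `discussion_prompts`) and the helper `validate_lesson` are hypothetical stand-ins, not the project's actual schema or code.

```python
import json

# Hypothetical lesson schema: field name -> expected Python type.
# These field names are illustrative assumptions, not the project's real schema.
LESSON_SCHEMA = {
    "title": str,
    "key_concepts": list,
    "quiz": list,
    "discussion_prompts": list,
}


def validate_lesson(raw: str) -> list[str]:
    """Return a list of validation errors; an empty list means the lesson passed."""
    try:
        lesson = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(lesson, dict):
        return ["top-level value must be an object"]

    errors = []
    for field, expected in LESSON_SCHEMA.items():
        if field not in lesson:
            errors.append(f"missing field: {field}")
        elif not isinstance(lesson[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors
```

A caller could regenerate or discard any model output for which `validate_lesson` returns a non-empty list, so malformed lessons never reach the classroom frontend.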
Chain of Thought Training
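Rather than learning a simple input-output mapping, the model learns pedagogical reasoning by training on thought traces that show the logic of curriculum development. The internal reasoning tokens are kept in the training examples rather than stripped, so the model is trained on the reasoning as well as the final lesson.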
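One way such a training example might be assembled is sketched below. It wraps the reasoning trace in the `<think>` tags used by the DeepSeek-R1 model family and appends the lesson JSON after it. The function name, the prompt wording, and the `prompt`/`completion` keys are assumptions for illustration, not the project's actual data format.

```python
def build_training_example(article: str, reasoning: str, lesson_json: str) -> dict:
    """Pair an article with a completion that keeps the reasoning trace."""
    # Hypothetical instruction wording; the real prompt template is not given.
    prompt = f"Create a lesson plan for the following article:\n\n{article}"
    # The reasoning stays inside <think> tags instead of being stripped,
    # so it remains part of the text the model is trained to produce.
    completion = f"<think>\n{reasoning}\n</think>\n{lesson_json}"
    return {"prompt": prompt, "completion": completion}
```

Keeping the trace and the JSON in one completion string means the same example teaches both the curriculum-design steps and the final structured output.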
Practical Implementation
The deployment targets Pakistani classrooms using:
- ThinkCentre PC (i5-7500T, 8GB RAM)
- Quantized 4-bit model for CPU efficiency
- Internet-in-a-Box offline content (Wikipedia, Khan Academy, TED)
Generating one lesson currently takes about two minutes on a Mac M3 Pro.
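A back-of-the-envelope estimate shows why 4-bit quantization matters on a machine with 8GB of RAM. The sketch below counts weight storage only; it ignores the KV cache, activations, and operating system overhead, and real 4-bit formats store extra scaling metadata, so actual usage will be somewhat higher. The function name is illustrative, and the parameter count is the nominal 8 billion.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 8e9  # nominal 8 billion parameters

fp16_gb = weight_memory_gb(PARAMS, 16)  # 16.0 GB: exceeds 8GB of RAM
q4_gb = weight_memory_gb(PARAMS, 4)     # 4.0 GB: leaves headroom on 8GB

print(f"16-bit weights: {fp16_gb:.1f} GB, 4-bit weights: {q4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights alone would not fit in memory, while the 4-bit weights occupy roughly half of it.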
Future Optimization
Planned experiments include:
Open Source Vision
Following successful classroom trials, I intend to release datasets, training code, and frontend tools for community adaptation and localization.