ClassIQ
Building a Classroom Monitor with Face Recognition and AI-Powered Lecture Notes
This was my 3rd-year college minor project. I built it to explore how AI could be used for classroom engagement, automating attendance, and helping out with note generation.
What I Was Trying to Do
The basic idea was to build an AI system on a server that uses computer vision, speech recognition, and an LLM to:
- Detect faces and recognize student identities.
- Track attentiveness during a class.
- Transcribe and summarize the lecture into structured markdown notes.
- Store and show all this data on a cloud dashboard.
The system is designed to work with a device in the classroom (like a Raspberry Pi or a local camera), but I built a simple Streamlit app to demo the functionality on a single video right on your laptop.
How It Works
The core system is designed as a server-based pipeline. Once it’s deployed:
- An edge device (like a Pi) records a classroom session (video + audio).
- The device is configured with the public URL of the processing server (exposed through ngrok) so it knows where to send recordings.
- Every time a new video is sent, the server:
  - Detects and recognizes faces
  - Tracks attentiveness
  - Transcribes the audio
  - Uses an LLM to generate structured notes
  - Saves all the output to cloud storage
The Streamlit UI (apps/demo_app.py) just runs this whole process for a single demo video, giving a preview of how it all fits together.
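The per-video flow above can be sketched as a single function that chains the stages in order. This is only an illustration, not the project's actual code: `SessionResult`, `process_session`, and the stage callables are hypothetical names, and each stage is passed in as a plain function so the real models (or stubs) can be swapped freely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SessionResult:
    """Outputs produced for one uploaded classroom recording (hypothetical)."""
    attendance: List[str]            # students recognized in at least one frame
    attentiveness: Dict[str, float]  # per-student score
    transcript: str                  # raw speech-to-text output
    notes_markdown: str              # LLM-generated structured notes


def process_session(
    video_path: str,
    recognize: Callable[[str], Dict[str, List[int]]],
    score_attention: Callable[[Dict[str, List[int]]], Dict[str, float]],
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    save: Callable[[SessionResult], None],
) -> SessionResult:
    """Run the stages in the order listed above, then persist the result."""
    # Face detection + recognition: student name -> frame indices where seen.
    sightings = recognize(video_path)
    # Attentiveness tracking based on those sightings.
    scores = score_attention(sightings)
    # Audio transcription, then LLM summarization into markdown notes.
    transcript = transcribe(video_path)
    notes = summarize(transcript)

    result = SessionResult(
        attendance=sorted(name for name, frames in sightings.items() if frames),
        attentiveness=scores,
        transcript=transcript,
        notes_markdown=notes,
    )
    # Persistence is delegated to the caller (for example, a cloud upload).
    save(result)
    return result
```

Passing the stages in as arguments keeps the orchestration testable with stubs, and it would let the Streamlit demo and the server share the same function.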
Core Features
- Face Detection and Recognition for Attendance
- Frame-Based Attentiveness Tracking
- Audio Transcription via Vosk
- LLM-Powered Summarization (Groq’s llama3-8b-8192)
- Streamlit-Based UI for Demos & Data Exploration
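As a rough idea of what the Vosk transcription step might look like, here is a minimal sketch. It assumes the audio has already been extracted from the video into a mono 16-bit PCM WAV file and that a Vosk model has been downloaded to `model_dir`. The function names `transcribe_wav` and `join_vosk_results` are made up for this example and are not taken from the project.

```python
import json
import wave
from typing import Iterable


def join_vosk_results(results: Iterable[str]) -> str:
    """Concatenate the "text" fields of Vosk's JSON result strings."""
    parts = (json.loads(r).get("text", "") for r in results)
    return " ".join(p for p in parts if p)


def transcribe_wav(wav_path: str, model_dir: str) -> str:
    """Transcribe a mono 16-bit PCM WAV file with a local Vosk model."""
    # Imported lazily so the helper above works without Vosk installed.
    from vosk import KaldiRecognizer, Model

    results = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform is truthy once a full utterance has been decoded.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_vosk_results(results)
```

The returned string is the plain transcript that would then be handed to the LLM for summarization.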
Tech Stack
| Layer | Tools & Libraries |
|---|---|
| Vision | YOLOv11, YuNet (OpenCV) |
| Transcription | Vosk Small EN |
| Summarization | Groq LLM (llama3-8b-8192) |
| Backend | Flask, tmux, ngrok |
| UI/Dashboard | Streamlit |
| Deployment | Lightning AI |
Some Results
- Face Recognition Accuracy: 91.67% (Top-1)
- Attentiveness Metric: I used a simple metric for this: `Attentiveness = (frames student is visible) / (total frames)`
- Lecture Notes Quality: This was pretty subjective and depends heavily on the LLM and the prompt.
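The attentiveness formula above is easy to express in code. The sketch below assumes the recognition step yields, for each student, the frame indices in which they were recognized. That input shape and the name `attentiveness_scores` are assumptions made for illustration.

```python
from typing import Dict, Iterable


def attentiveness_scores(
    visible_frames: Dict[str, Iterable[int]], total_frames: int
) -> Dict[str, float]:
    """Attentiveness = (frames student is visible) / (total frames)."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    scores = {}
    for student, frames in visible_frames.items():
        # Count each frame once and ignore indices outside the video.
        unique = {f for f in frames if 0 <= f < total_frames}
        scores[student] = len(unique) / total_frames
    return scores
```

For example, a student recognized in 3 distinct frames of a 4-frame clip gets a score of 0.75. Note that this measures visibility only, so a student who is in frame but looking away still counts as attentive.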
What I’d Do Next
- Shift to a custom facial-points-based recognition model.
- Add real-time inattentiveness alerts.
- Refine the LLM prompts for more structured results.
- Switch from CSV to PostgreSQL for a more scalable dashboard.
About Me
Manodeep Ray – I’m really into deep learning, LLMs, CV, and building real-world systems. I built this as my 3rd-year college minor project, and it was a fun way to blend CV and NLP in an educational setting.
Thanks for checking it out! If you liked this, drop a ⭐ on GitHub or connect with me to chat about AI + education.