Project: MasterCamp 2025
Project Objective
The MasterCamp 2025 project aims to develop an intelligent web platform capable of automatically detecting the condition of urban trash bins from images (full, overflowing, empty). This digital solution helps anticipate overflow risks and reduce illegal dumping in public spaces. It is part of an AI for Good approach and promotes eco-responsible waste management through visual data collection, annotation, and automatic extraction of simple visual features (size, color, contrast, etc.).
The Team Behind the Project
A highly motivated team combining data science, web development, and environmental commitment: Alexandre, Adam, Xavier, Nicolas, and Paul.

Algorithms & System Architecture
How does the site work?
The site allows users (with personal accounts) to upload an image of an urban trash bin, provide its
geolocation, and validate its cleanliness status (this step helps evaluate the accuracy of our AI
models).
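The upload step described above can be pictured as one record per submission. The sketch below is only an illustration: the class name `BinSubmission`, its fields, and the `clean` / `dirty` label set are assumptions made for this example, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed label set, based on the clean/dirty status mentioned on this page.
VALID_LABELS = {"clean", "dirty"}


@dataclass
class BinSubmission:
    """Hypothetical record for one upload: image, location, optional user status."""
    user_id: str
    image_path: str
    latitude: float
    longitude: float
    user_label: Optional[str] = None  # filled in when the user validates the status

    def validate(self) -> None:
        """Raise ValueError if the coordinates or the label are not plausible."""
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")
        if self.user_label is not None and self.user_label not in VALID_LABELS:
            raise ValueError(f"unknown label: {self.user_label}")


# Illustrative values only (coordinates roughly in Montevideo).
submission = BinSubmission("user-42", "uploads/bin_001.jpg", -34.9011, -56.1645, "dirty")
submission.validate()
```

Keeping the user's status optional reflects that the validation step is a separate action from the upload itself.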
The analysis relies on three complementary levels combining AI, computer vision,
and user feedback:
1. Classification with ViT (Vision Transformer)
- Main model based on Vision Transformer (ViT).
- Achieved accuracy: 91% on our test set.
- View model on Hugging Face
- Training Notebook on Google Colab
2. Detection with YOLOv8
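A classifier of this kind typically outputs one raw score (logit) per class, which is then turned into a label and a confidence. The following is a minimal sketch of that last step only, assuming a two-class `clean` / `dirty` head in that order; the label order, the function names, and the example logits are assumptions, and the model itself is not reproduced here.

```python
import math

# Assumed class order; the real model's label mapping may differ.
LABELS = ("clean", "dirty")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration.
label, confidence = predict_label([0.3, 2.1])
print(label, round(confidence, 3))
```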
- Used to automatically localize the trash bin in the image.
- Analyzes pixel saturation around the bin (visible waste).
- Complements and reinforces the ViT classification.
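The page does not specify how saturation around the bin is measured, so the sketch below shows one plausible reading: take the bounding box returned by the detector, widen it by a margin, and average the HSV saturation of the pixels in the surrounding ring. The function name `mean_saturation_around`, the `margin` parameter, and the plain nested-list image format are assumptions chosen to keep the example dependency-free.

```python
import colorsys


def mean_saturation_around(pixels, box, margin):
    """Average HSV saturation in a ring of `margin` pixels around `box`.

    pixels: list of rows, each row a list of (r, g, b) tuples in 0..255.
    box: (x1, y1, x2, y2) with x2 and y2 exclusive, e.g. from a detector.
    """
    height, width = len(pixels), len(pixels[0])
    x1, y1, x2, y2 = box
    # Expand the box by the margin and clamp it to the image bounds.
    ox1, oy1 = max(0, x1 - margin), max(0, y1 - margin)
    ox2, oy2 = min(width, x2 + margin), min(height, y2 + margin)

    total, count = 0.0, 0
    for y in range(oy1, oy2):
        for x in range(ox1, ox2):
            if x1 <= x < x2 and y1 <= y < y2:
                continue  # skip the bin itself, keep only its surroundings
            r, g, b = pixels[y][x]
            _, saturation, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            total += saturation
            count += 1
    return total / count if count else 0.0


# Toy 6x6 gray image with one saturated (red) pixel next to the box.
image = [[(128, 128, 128) for _ in range(6)] for _ in range(6)]
image[1][1] = (255, 0, 0)
score = mean_saturation_around(image, box=(2, 2, 4, 4), margin=1)
```

A high score would suggest colorful objects near the bin, which could then be compared against a threshold tuned on labeled images; that threshold is not given in the source.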
3. User Feedback (as seen above)
- A checkbox allows users to manually annotate the image (clean / dirty).
- Useful for validating or correcting the automatic prediction and improving overall performance.
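One simple way to use this feedback is to compare each user annotation with the model's prediction and track how often they agree. The sketch below assumes the user label is treated as the reference; the function name `agreement_report` and the sample pairs are made up for illustration, and the result is unrelated to the 91% figure reported on the team's own test set.

```python
from collections import Counter


def agreement_report(pairs):
    """Summarize (user_label, model_label) pairs.

    Returns the share of pairs where both labels match, plus the count
    of each (user_label, model_label) combination.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    agree = sum(n for (user, model), n in counts.items() if user == model)
    return {
        "accuracy": agree / total if total else 0.0,
        "confusion": dict(counts),
    }


# Made-up feedback: (user annotation, model prediction).
pairs = [
    ("clean", "clean"),
    ("dirty", "dirty"),
    ("dirty", "clean"),
    ("clean", "clean"),
]
report = agreement_report(pairs)
print(report["accuracy"])
```

The disagreeing pairs are the interesting ones: they are candidates for manual review or for later retraining.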
Full technical details available in our report:
Download Technical Report (FR)
Visual Demonstration

Top: accuracy of our ViT model (clean/dirty classification); the left side shows the user input, the right side shows the model prediction.
Bottom: bins analyzed (Montevideo geolocation).

Post-analysis interface: uploaded image, AI-generated label, and visual feedback.

Visual metadata extraction (size, colors, contours) around the bin using YOLO.
A video is worth a thousand words...