Food Vision Transformer is an image classification project that uses a Vision Transformer (ViT) model for food image recognition. Key features of the project include:

1. Model Development: Developed a Vision Transformer model in PyTorch that reached 92% accuracy on a curated dataset.
2. Model Fine-Tuning: Fine-tuned pre-trained models from Hugging Face to optimize performance across diverse image recognition tasks.
3. Interactive Web Interface: Built an interactive interface with Gradio for real-time model inference and visualization of classification results.
4. Optimized Performance: Reduced model complexity by 30% using transfer learning techniques, improving computational efficiency without compromising accuracy.

This project demonstrates expertise in deep learning, transformer-based models, and web-based model deployment, and provides an efficient solution for real-time image classification.
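To illustrate the Gradio inference step described above, here is a minimal sketch of how a classifier's raw scores might be turned into the label-to-confidence mapping that a Gradio `Label` output displays. It is not the project's actual code: the class names, the helper names (`softmax`, `to_label_scores`, `build_demo`), and the `predict_logits` callable standing in for the fine-tuned ViT forward pass are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical class names; the project's real label set is not stated.
CLASS_NAMES: List[str] = ["pizza", "steak", "sushi"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_label_scores(
    logits: Sequence[float], class_names: Sequence[str] = CLASS_NAMES
) -> Dict[str, float]:
    """Map each class name to its probability (a label-to-confidence dict)."""
    if len(logits) != len(class_names):
        raise ValueError("expected exactly one logit per class name")
    return dict(zip(class_names, softmax(logits)))


def build_demo(predict_logits: Callable[[object], Sequence[float]]):
    """Wrap a logits-producing function in a Gradio interface.

    `predict_logits` is a placeholder for the model's forward pass: it takes
    a PIL image and returns one raw score per entry in CLASS_NAMES.
    """
    # Imported lazily so the helpers above can be used without Gradio installed.
    import gradio as gr

    def classify(image):
        return to_label_scores(predict_logits(image))

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(type="pil"),
        outputs=gr.Label(num_top_classes=3),
    )
```

Under these assumptions, calling `build_demo(my_predict_fn).launch()` would start a local web page where an uploaded image is classified and the top classes are shown with their confidences. Keeping the score conversion separate from the interface wiring makes it possible to check the probabilities without starting a server.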
Built with