Document Classifier (GED)

Cut manual document sorting by up to 80% — an ML pipeline that auto-classifies and routes enterprise documents so teams stop hand-filing contracts, invoices, and reports.

  • Reduced manual document sorting time by up to 80% with automated classification
  • Routes contracts, invoices, reports, and correspondence to the right workflow automatically
  • Exposed as a FastAPI service in Docker for drop-in integration with existing systems
App Desktop | Completed

Project Overview

The Document Classifier (GED) is a machine learning system for enterprise document management. Built with Python and scikit-learn, it uses natural language processing to analyze, categorize, and route documents automatically.

Key Features:

• Intelligent document classification using machine learning algorithms

• Automated routing system for efficient document processing

• Export-ready reports with detailed analytics

• Interactive labeling tools for training data preparation

• RESTful API for integration with existing workflows

• Docker containerization for easy deployment and scaling

The system processes various document types including contracts, invoices, reports, and correspondence, automatically assigning them to appropriate departments or workflows. This reduces manual sorting time by up to 80% and ensures consistent document handling across the organization.
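The routing step described above can be sketched as a lookup from predicted label to destination. This is a minimal illustration, not the project's actual code: the queue names, the `route_document` function, and the confidence threshold are all assumptions introduced for the example.

```python
# Hypothetical mapping from a predicted document type to a destination queue.
# The four document types come from the project description; the queue names
# are invented for illustration.
ROUTES = {
    "contract": "legal",
    "invoice": "accounts-payable",
    "report": "management",
    "correspondence": "front-office",
}

# Assumed fallback for documents the classifier is unsure about.
FALLBACK_QUEUE = "manual-review"


def route_document(label: str, confidence: float, threshold: float = 0.6) -> str:
    """Return the destination queue for a classified document.

    Low-confidence or unknown labels go to the fallback queue so a person
    can sort them by hand.
    """
    if confidence < threshold:
        return FALLBACK_QUEUE
    return ROUTES.get(label, FALLBACK_QUEUE)


if __name__ == "__main__":
    print(route_document("invoice", 0.92))
    print(route_document("contract", 0.41))
```

Sending uncertain predictions to a manual queue is one plausible design choice: it keeps misrouted documents out of automated workflows at the cost of some remaining hand sorting.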

Technical Implementation:

The classifier combines TF-IDF vectorization with ensemble learning methods to categorize documents with high accuracy. The FastAPI backend exposes endpoints for document submission and retrieval, and Docker containerization keeps deployments consistent across environments.
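As a rough sketch of how TF-IDF features and an ensemble can be combined in scikit-learn, the example below chains a `TfidfVectorizer` with a soft-voting `VotingClassifier`. The choice of base models, the hyperparameters, and the tiny training set are assumptions made for illustration; the project description does not say which ensemble method or data it uses.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (invented). A real system would train on many labeled documents.
train_texts = [
    "This agreement is entered into by the parties and governs termination",
    "The contractor agrees to the terms and conditions of this contract",
    "Invoice number 1042 total amount due payment within 30 days",
    "Invoice for services rendered, amount due upon receipt",
    "Quarterly report summarizing revenue growth and key findings",
    "Annual report with analysis of results and recommendations",
    "Dear Sir, thank you for your letter, kind regards",
    "Dear Madam, I am writing in reply to your email, best regards",
]
train_labels = [
    "contract", "contract",
    "invoice", "invoice",
    "report", "report",
    "correspondence", "correspondence",
]

# Soft voting averages the class probabilities predicted by each base model.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("nb", MultinomialNB()),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)

# The pipeline turns raw text into TF-IDF vectors, then feeds them to the ensemble.
classifier = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    ensemble,
)
classifier.fit(train_texts, train_labels)

new_docs = ["Please find attached the invoice, amount due in 30 days"]
predictions = classifier.predict(new_docs)
probabilities = classifier.predict_proba(new_docs)

# Report the predicted label and the ensemble's averaged probability for it.
print(predictions[0], round(float(probabilities[0].max()), 2))
```

Keeping the vectorizer and the model in one pipeline means the fitted object can be saved and loaded as a single unit, which suits serving it behind an API.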

Key Features

• Automated document classification
• Export-ready analytics reports
• Interactive labeling interface
• RESTful API integration
• Docker containerization
• High accuracy ML models

Technologies Used

Python, scikit-learn, FastAPI, Docker, Machine Learning, NLP
