Upload photos of your bookshelf → OCR extracts titles → deduplicates → builds your library with metadata. Rust backend, Bun/React frontend.
  • Rust 70%
  • TypeScript 26.6%
  • CSS 1.6%
  • JavaScript 0.9%
  • HTML 0.9%
Claudius 0e4ac34587 Polish pass: Harvard Classics theme, Lucide icons, clear library functionality
- Updated color palette to Harvard Classics inspired theme (deep forest green #1B4332, rich gold #C5A55A, cream background #F5F0E8)
- Replaced all SVG icons and emojis with Lucide React components (Camera, Trash2, BookOpen, X, Pencil, Loader2)
- Added clear library button with confirmation dialog in library header
- Added DELETE /api/library backend endpoint for clearing all books
- Updated all color references across App.tsx, UploadZone.tsx, LibraryView.tsx, ProcessingStatus.tsx, and index.css
- Improved book placeholder with BookOpen icon
- Enhanced processing status with animated loading icon
- Maintained Tailwind v4 syntax (@import "tailwindcss")
- All functionality tested and verified
2026-02-10 22:10:05 +00:00
frontend Polish pass: Harvard Classics theme, Lucide icons, clear library functionality 2026-02-10 22:10:05 +00:00
src Polish pass: Harvard Classics theme, Lucide icons, clear library functionality 2026-02-10 22:10:05 +00:00
.gitignore Fix major issues in ShelfWise app 2026-02-10 21:53:54 +00:00
Cargo.lock Fix compilation errors and warnings 2026-02-10 21:27:19 +00:00
Cargo.toml Fix compilation errors and warnings 2026-02-10 21:27:19 +00:00
PROJECT_STATUS.md 📋 Project completion status - ShelfWise fully implemented 2026-02-10 21:31:29 +00:00
README.md Initial ShelfWise implementation 2026-02-10 21:25:17 +00:00

📚 ShelfWise - Book Shelf Scanner

Upload photos of your bookshelf → OCR extracts titles → deduplicates → builds your library with metadata.

Tech Stack: Rust backend with Axum, Bun/React frontend, Tesseract OCR, SQLite database, Open Library API integration.

Features

  • 🔍 Smart OCR: Uses Tesseract to extract book titles and authors from photos of book spines
  • 🎯 Intelligent Deduplication: Fuzzy string matching to avoid duplicates from overlapping photos
  • 🌐 Rich Metadata: Automatically fetches book covers, ISBNs, page counts, subjects from Open Library API
  • 📱 Clean UI: Simple drag-and-drop interface to upload multiple bookshelf photos
  • 🗑️ Library Management: View your library and remove false positives
  • Fast Processing: Async Rust backend handles multiple images efficiently

Quick Start

Prerequisites

  • Rust (installed via rustup)
  • Bun (JavaScript runtime)
  • Tesseract OCR (on Debian/Ubuntu: apt install tesseract-ocr tesseract-ocr-eng)

Installation

# Clone the repository
git clone <repository-url>
cd shelfwise

# Build the frontend
cd frontend
bun install
bun run build

# Build and run the backend
cd ..
cargo build --release
cargo run --release

The app will be available at http://localhost:3001

Development

Backend (Rust)

# Run in development mode, rebuilding and restarting on file changes
# (requires cargo-watch: cargo install cargo-watch)
cargo watch -x run

# Run tests
cargo test

Frontend (Bun/React)

cd frontend

# Development server with hot reload
bun run dev

# Build for production
bun run build

Architecture

Backend (/src/)

  • main.rs: Axum HTTP server, routes, database setup
  • ocr.rs: Tesseract integration and text parsing heuristics
  • openlibrary.rs: Open Library API client and metadata fetching
  • dedupe.rs: Fuzzy string matching for duplicate detection

Frontend (/frontend/src/)

  • App.tsx: Main application component
  • components/UploadZone.tsx: Drag & drop file upload interface
  • components/LibraryView.tsx: Book library display with covers
  • components/ProcessingStatus.tsx: Upload progress indicator

Data Flow

  1. Upload: User drags bookshelf photos to upload zone
  2. OCR: Backend runs Tesseract on each image to extract text
  3. Parse: Text is analyzed to identify book titles and authors using regex patterns
  4. Dedupe: Fuzzy string matching removes duplicates from overlapping photos
  5. Enrich: Open Library API adds metadata (covers, ISBNs, subjects, etc.)
  6. Store: Books are saved to SQLite database
  7. Display: Frontend shows the library with covers and metadata
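
Step 4 can be illustrated with a small, std-only sketch. This is not the code in dedupe.rs; it assumes a normalized Levenshtein similarity and a threshold of 0.85, and the function names (normalize, levenshtein, similarity, dedupe) are hypothetical.

```rust
/// Lowercase, replace punctuation with spaces, and collapse whitespace,
/// so that "The  Hobbit!" and "the hobbit" compare equal.
fn normalize(s: &str) -> String {
    let cleaned: String = s
        .to_lowercase()
        .chars()
        .map(|c| if c.is_alphanumeric() { c } else { ' ' })
        .collect();
    cleaned.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Two-row Levenshtein edit distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in [0.0, 1.0]; 1.0 means identical after normalization.
fn similarity(a: &str, b: &str) -> f64 {
    let (a, b) = (normalize(a), normalize(b));
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(&a, &b) as f64 / max_len as f64
}

/// Keep the first occurrence of each title; drop later near-duplicates.
fn dedupe(titles: &[&str], threshold: f64) -> Vec<String> {
    let mut kept: Vec<String> = Vec::new();
    for &t in titles {
        if !kept.iter().any(|k| similarity(k, t) >= threshold) {
            kept.push(t.to_string());
        }
    }
    kept
}
```

Calling dedupe(&["The Great Gatsby", "THE GREAT GATSBY", "The Great Gatsbv", "Moby Dick"], 0.85) keeps only the first Gatsby spelling and "Moby Dick". A lower threshold merges more aggressively and risks collapsing distinct titles that differ by a word or a volume number.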

OCR Text Processing

The OCR module uses several heuristics to extract book information:

  • "Title by Author" patterns
  • Vertical spine layouts (author on top, title below)
  • All-caps author names (common on book spines)
  • Noise filtering to remove OCR artifacts
  • Length validation for reasonable titles/authors
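
The heuristics above might look roughly like the following sketch. It is an illustration, not the contents of ocr.rs: it uses plain string operations instead of the regex patterns the project mentions, and the type and function names, the 3 to 80 character bounds, and the "at least half letters" noise rule are all assumptions.

```rust
#[derive(Debug, PartialEq)]
struct Candidate {
    title: String,
    author: Option<String>,
}

/// Noise filter plus length validation (assumed bounds: 3..=80 characters,
/// and at least half of the characters must be letters).
fn is_plausible(text: &str) -> bool {
    let len = text.chars().count();
    let letters = text.chars().filter(|c| c.is_alphabetic()).count();
    (3..=80).contains(&len) && letters * 2 >= len
}

/// True when the line has letters, all of them uppercase, and two or more words.
fn looks_like_caps_author(text: &str) -> bool {
    let has_letters = text.chars().any(|c| c.is_alphabetic());
    let all_upper = text
        .chars()
        .filter(|c| c.is_alphabetic())
        .all(|c| c.is_uppercase());
    has_letters && all_upper && text.split_whitespace().count() >= 2
}

/// Handle the "Title by Author" pattern, splitting on the last " by ".
fn parse_line(line: &str) -> Option<Candidate> {
    let line = line.trim();
    if !is_plausible(line) {
        return None;
    }
    // ASCII lowercasing keeps byte offsets aligned with the original line.
    let lower = line.to_ascii_lowercase();
    if let Some(idx) = lower.rfind(" by ") {
        let title = line[..idx].trim();
        let author = line[idx + 4..].trim();
        if is_plausible(title) && is_plausible(author) {
            return Some(Candidate {
                title: title.to_string(),
                author: Some(author.to_string()),
            });
        }
    }
    Some(Candidate { title: line.to_string(), author: None })
}

/// Walk OCR lines, pairing an all-caps author line with the title below it.
fn parse_spine(lines: &[&str]) -> Vec<Candidate> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        let current = lines[i].trim();
        if looks_like_caps_author(current) && i + 1 < lines.len() {
            let next = lines[i + 1].trim();
            if is_plausible(current) && is_plausible(next) && !looks_like_caps_author(next) {
                out.push(Candidate {
                    title: next.to_string(),
                    author: Some(current.to_string()),
                });
                i += 2;
                continue;
            }
        }
        if let Some(c) = parse_line(current) {
            out.push(c);
        }
        i += 1;
    }
    out
}
```

Splitting on the last " by " rather than the first keeps titles such as "Stand by Me by Stephen King" intact, at the cost of mis-splitting author names that themselves contain " by ".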

API Endpoints

  • GET / - Serve frontend
  • POST /api/upload - Upload bookshelf images
  • GET /api/library - Get all books
  • DELETE /api/books/:id - Delete a book
  • DELETE /api/library - Clear all books
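
As a compact restatement of the route table, the sketch below maps a method and path to a handler name using only the standard library. It is not the Axum router from main.rs; the Route enum and match_route function are hypothetical, and the clear-library route is taken from the commit notes at the top of this page.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    ServeFrontend,
    Upload,
    ListBooks,
    ClearLibrary,
    DeleteBook(String),
}

/// Resolve an HTTP method and path to one of the documented routes.
fn match_route(method: &str, path: &str) -> Option<Route> {
    match (method, path) {
        ("GET", "/") => Some(Route::ServeFrontend),
        ("POST", "/api/upload") => Some(Route::Upload),
        ("GET", "/api/library") => Some(Route::ListBooks),
        ("DELETE", "/api/library") => Some(Route::ClearLibrary),
        // DELETE /api/books/:id, where :id is a single non-empty path segment.
        ("DELETE", p) => p
            .strip_prefix("/api/books/")
            .filter(|id| !id.is_empty() && !id.contains('/'))
            .map(|id| Route::DeleteBook(id.to_string())),
        _ => None,
    }
}
```

For example, match_route("DELETE", "/api/books/abc-123") yields Route::DeleteBook("abc-123".to_string()), while a GET on the same path yields None.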

Database Schema

CREATE TABLE books (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    author TEXT,
    isbn TEXT,
    cover_url TEXT,
    page_count INTEGER,
    publish_date TEXT,
    subjects TEXT, -- JSON array
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
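
A Rust struct mirroring this table could look like the sketch below. The Book struct and subjects_column function are assumptions for illustration, not the project's actual types. Nullable columns become Option fields, and subjects is serialized by hand to the JSON array stored in the TEXT column; a real project would more likely use a JSON library such as serde_json.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Book {
    id: String,
    title: String,
    author: Option<String>,
    isbn: Option<String>,
    cover_url: Option<String>,
    page_count: Option<i64>,
    publish_date: Option<String>,
    subjects: Vec<String>,
    created_at: Option<String>,
}

/// Encode the subjects list as a JSON array string for the `subjects` column.
fn subjects_column(book: &Book) -> String {
    let items: Vec<String> = book
        .subjects
        .iter()
        .map(|s| {
            let mut out = String::from("\"");
            for c in s.chars() {
                match c {
                    '"' => out.push_str("\\\""),
                    '\\' => out.push_str("\\\\"),
                    '\n' => out.push_str("\\n"),
                    '\r' => out.push_str("\\r"),
                    '\t' => out.push_str("\\t"),
                    // Remaining control characters use the \uXXXX form.
                    c if (c as u32) < 0x20 => {
                        out.push_str(&format!("\\u{:04x}", c as u32))
                    }
                    c => out.push(c),
                }
            }
            out.push('"');
            out
        })
        .collect();
    format!("[{}]", items.join(","))
}
```

Storing subjects as JSON text keeps the schema to a single table; the trade-off is that filtering by subject in SQL requires string matching or SQLite's JSON functions.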

Configuration

  • Database: SQLite file at shelfwise.db
  • Uploads: Temporary files in uploads/ directory
  • Port: Backend runs on port 3001
  • OCR Language: English (tesseract -l eng)
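
The OCR language setting corresponds to Tesseract's -l flag. A minimal sketch of invoking the Tesseract CLI from Rust follows, assuming the backend shells out to the tesseract binary (it may instead use a binding crate). The function names and the uploads/shelf.jpg path are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) `tesseract <image> stdout -l <lang>`.
/// Passing "stdout" as the output base makes Tesseract print the text.
fn tesseract_command(image: &Path, lang: &str) -> Command {
    let mut cmd = Command::new("tesseract");
    cmd.arg(image).arg("stdout").arg("-l").arg(lang);
    cmd
}

/// Run Tesseract and return the recognized text, or an error message.
#[allow(dead_code)]
fn run_ocr(image: &Path, lang: &str) -> Result<String, String> {
    let output = tesseract_command(image, lang)
        .output()
        .map_err(|e| format!("failed to start tesseract: {}", e))?;
    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

With this shape, run_ocr(Path::new("uploads/shelf.jpg"), "eng") returns the raw text for the parsing step. Supporting another language would mean installing the matching tesseract-ocr language package and passing its code instead of "eng".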

Tips for Best Results

  • 📸 Photo Quality: Take clear, well-lit photos of book spines
  • 🔍 Text Size: Ensure book titles are readable in the photo
  • 📐 Angle: Try to photograph spines straight-on when possible
  • 📚 Multiple Shots: Overlap photos to ensure all books are captured
  • ✂️ Cropping: Focus on just the book spines, avoid other objects

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Run cargo test in the repository root and bun run build in frontend/
  6. Submit a pull request

License

MIT License - see LICENSE file for details.

Acknowledgments