Metadata-Version: 2.4
Name: fastpluggy-docmanager
Version: 0.1.5
Summary: Document management plugin for FastPluggy — file browser, multi-mount indexing, dedup, cleaning
Author: FastPluggy Team
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: FastPluggy>=0.4.33
Requires-Dist: loguru
Requires-Dist: xxhash
Requires-Dist: python-magic
Requires-Dist: send2trash
Provides-Extra: xattr
Requires-Dist: pyxattr; extra == "xattr"
Provides-Extra: watcher
Requires-Dist: watchdog; extra == "watcher"
Provides-Extra: thumbnails
Requires-Dist: Pillow; extra == "thumbnails"
Provides-Extra: s3
Requires-Dist: minio; extra == "s3"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: minio; extra == "dev"
Provides-Extra: e2e
Requires-Dist: fastpluggy-cli; extra == "e2e"

# fastpluggy-docmanager

![Doc Manager](https://img.shields.io/badge/FastPluggy-Doc%20Manager-blue)
[![docmanager](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/-/badges/release.svg)](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/-/releases)
[![Pipeline Status](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/badges/main/pipeline.svg?key_text=CI)](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/-/pipelines?ignore_skipped=true)
[![Coverage](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/badges/main/coverage.svg)](https://gitlab.ggcorp.fr/open/fastpluggy/plugins/doc-manager/-/pipelines)

File browser and document management plugin for [FastPluggy](https://fastpluggy.xyz).

## Features

- **Multi-mount indexing** — local filesystem and S3/MinIO backends
- **xattr hash cache** — re-scans of millions of files: 2 syscalls per file, zero reads
- **6-case dedup engine** — handles moves, DB wipes, modified files, duplicates
- **Exclusion rules** — glob, regex, `.docmanagerignore` with priority chain
- **File upload** — drag-and-drop with duplicate-on-upload detection
- **Cleaning dashboard** — empty dirs, orphaned records, untracked files
- **Near-duplicate detection** — filename similarity + file size proximity
- **Prometheus metrics** — 7 gauges with TTL-cached FS metrics
- **Event-driven** — emits events for downstream plugins (e.g. `docmanager_ai`)
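
As an illustration of the near-duplicate idea listed above (filename similarity combined with file size proximity), here is a minimal sketch. It is not the plugin's implementation: the `FileEntry` type, the helper names, and both thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import PurePath


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical record for this sketch: a path and its size in bytes."""
    path: str
    size: int


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two filenames, ignoring case and extension."""
    stem_a = PurePath(a).stem.lower()
    stem_b = PurePath(b).stem.lower()
    return SequenceMatcher(None, stem_a, stem_b).ratio()


def size_proximity(a: int, b: int) -> float:
    """1.0 for identical sizes, falling toward 0.0 as the sizes diverge."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def near_duplicates(entries, min_name=0.8, min_size=0.95):
    """Return path pairs whose names and sizes both pass the thresholds.

    The thresholds are illustrative defaults, not values used by the plugin.
    """
    pairs = []
    for left, right in combinations(entries, 2):
        if (name_similarity(left.path, right.path) >= min_name
                and size_proximity(left.size, right.size) >= min_size):
            pairs.append((left.path, right.path))
    return pairs
```

Comparing every pair is quadratic in the number of files, so a real index would likely bucket candidates by size range before comparing names.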

## Install

```bash
pip install fastpluggy-docmanager
```

Optional extras:
```bash
pip install "fastpluggy-docmanager[xattr]"      # xattr hash cache (pyxattr)
pip install "fastpluggy-docmanager[watcher]"    # Filesystem watcher (watchdog)
pip install "fastpluggy-docmanager[thumbnails]" # Thumbnail generation (Pillow)
```

### S3/MinIO support

S3 mounts use the `minio_tools` plugin (soft dependency — no extra pip install needed). Install and enable the `minio_tools` plugin, then create S3 mounts with per-mount endpoint and credentials.

## Configuration

Settings are managed through `DocManagerSettings` and stored in the database:

| Setting | Default | Description |
|---------|---------|-------------|
| `hash_algorithm` | `xxhash` | `xxhash` (fast) or `sha256` (standard) |
| `scan_workers` | `4` | Concurrent scan workers |
| `soft_delete_days` | `7` | Retention before hard delete |
| `enable_thumbnails` | `true` | Generate image thumbnails |
| `enable_metrics` | `true` | Expose Prometheus metrics |

See `docs/docmanager.md` for the full settings reference.
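
To make the `hash_algorithm` choice concrete, the sketch below hashes a file in chunks with either algorithm. It is an illustration only: `new_hasher`, `hash_file`, and `CHUNK_SIZE` are hypothetical names, and the use of `xxhash.xxh64` is an assumption, since the README does not say which xxHash variant the plugin uses.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def new_hasher(algorithm: str):
    """Return a hasher object for a `hash_algorithm` setting value."""
    if algorithm == "sha256":
        return hashlib.sha256()
    if algorithm == "xxhash":
        # Third-party package, imported lazily so sha256 works without it.
        import xxhash
        # Assumption: 64-bit xxHash; the plugin may use a different variant.
        return xxhash.xxh64()
    raise ValueError(f"unsupported hash_algorithm: {algorithm!r}")


def hash_file(path, algorithm: str = "xxhash") -> str:
    """Hash a file in fixed-size chunks so large files are never fully in memory."""
    hasher = new_hasher(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

xxHash is a non-cryptographic hash, which is why it is the fast option; `sha256` is the safer pick when hashes must resist deliberate collisions.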

## API Summary

### Frontend routes
| Route | Description |
|-------|-------------|
| `GET /` | Dashboard with KPI cards |
| `GET /browse/{mount}[/{path}]` | File browser |
| `GET /document/{id}` | Document detail |
| `GET /upload/{mount}[/{path}]` | Upload form |
| `GET /mounts` | Mount management |
| `GET /duplicates` | Duplicate management |
| `GET /cleaning` | Cleaning dashboard |
| `GET /jobs` | Scan job list |

### JSON API
| Route | Description |
|-------|-------------|
| `GET /api/mounts` | List mounts |
| `GET /api/browse/{mount}` | Directory listing |
| `POST /api/scan/full/{mount}` | Trigger full scan |
| `GET /api/documents/{id}/file` | Download file |
| `GET /api/documents/search` | Search documents |
| `GET /api/documents/{id}/extracted-info` | Get extracted info |
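
A minimal sketch of calling these routes from a script, using only the standard library. The `BASE_URL` value (host, port, and plugin prefix) is an assumption about the deployment, the helper names are hypothetical, and the README does not state whether `{mount}` is a name or an ID, so it is treated here as an opaque string.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: where the plugin's router is mounted; adjust to your deployment.
BASE_URL = "http://localhost:8000/docmanager"


def api_request(route: str, method: str = "GET", base_url: str = BASE_URL) -> Request:
    """Build a request for one of the JSON API routes listed above."""
    return Request(f"{base_url.rstrip('/')}{route}", method=method)


def list_mounts_request() -> Request:
    return api_request("/api/mounts")


def full_scan_request(mount: str) -> Request:
    # Percent-encode the mount so spaces or slashes cannot alter the route.
    return api_request(f"/api/scan/full/{quote(mount, safe='')}", method="POST")


def download_request(document_id: int) -> Request:
    return api_request(f"/api/documents/{document_id}/file")


def fetch_json(request: Request, timeout: float = 10.0):
    """Send the request and decode a JSON response body."""
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json(list_mounts_request())` would return the decoded mount list, assuming the server is reachable and requires no authentication.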

## Events

Events are emitted on the FastPluggy event bus, and other plugins can subscribe to them:

- `docmanager.document_new` — new file indexed
- `docmanager.document_changed` — file modified
- `docmanager.document_deleted` — file deleted
- `docmanager.scan_complete` — scan finished
- `docmanager.duplicates_found` — duplicates detected
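
The sketch below shows one way a downstream plugin might organize its handlers around these event names. `LocalEventRouter` is a stand-in written for this example and is not the FastPluggy event bus API; the `document_id` payload key is likewise an assumption, since payload shapes are not documented here.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class LocalEventRouter:
    """Stand-in dispatcher for this sketch; NOT the FastPluggy event bus API."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event_name: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler for one event name."""
        def register(handler: Handler) -> Handler:
            self._handlers[event_name].append(handler)
            return handler
        return register

    def dispatch(self, event_name: str, payload: dict[str, Any]) -> int:
        """Call every handler for the event and return how many ran."""
        handlers = self._handlers.get(event_name, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


router = LocalEventRouter()
reindex_queue: list[dict[str, Any]] = []


@router.on("docmanager.document_new")
@router.on("docmanager.document_changed")
def queue_for_reindex(payload: dict[str, Any]) -> None:
    # New and modified files both need downstream reprocessing.
    reindex_queue.append(payload)


@router.on("docmanager.document_deleted")
def drop_from_queue(payload: dict[str, Any]) -> None:
    # "document_id" is a hypothetical payload key used only in this sketch.
    doc_id = payload.get("document_id")
    reindex_queue[:] = [p for p in reindex_queue if p.get("document_id") != doc_id]
```

In a real plugin, the body of `dispatch` would be replaced by whatever subscription mechanism the FastPluggy event bus provides, with the handlers left unchanged.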

## Development

```bash
pip install -e ".[dev]"
pytest tests/
```

## License

See FastPluggy license.

