feat: add full text search

Michael Barz
2025-05-30 21:52:07 +02:00
parent a833b658af
commit 7cf21ce099
3 changed files with 44 additions and 1 deletions


@@ -139,8 +139,12 @@ START_ADDITIONAL_SERVICES="notifications"
### Apache Tika Content Analysis Toolkit ###
# Tika (search) is disabled by default due to performance reasons.
# Tika is used to extract metadata and text from various file formats.
# Enable it by adding the following to the COMPOSE_FILE variable:
# search/tika.yml or by using the following command:
# docker compose -f docker-compose.yml -f search/tika.yml up -d
# Set the desired docker image tag or digest.
# Defaults to "latest" # Defaults to "apache/tika:latest-full"
TIKA_IMAGE=
### IMPORTANT Note for Online Office Apps ###


@@ -10,6 +10,7 @@ OpenCloud Compose offers a modular approach to deploying OpenCloud with several
- **External proxy** support for environments with existing reverse proxies (like Nginx, Caddy, etc.)
- **Collabora Online** integration for document editing
- **Keycloak and LDAP** integration for centralized identity management
- **Full text search** with Apache Tika for content extraction and metadata analysis
## Quick Start Guide
@@ -127,6 +128,25 @@ Add to `/etc/hosts` for local development:
127.0.0.1 wopiserver.opencloud.test
```
### With Full Text Search
Enable full text search capabilities with Apache Tika using either method:
Using `-f` flags:
```bash
docker compose -f docker-compose.yml -f search/tika.yml -f traefik/opencloud.yml up -d
```
Or by setting `COMPOSE_FILE` in `.env`:
```
COMPOSE_FILE=docker-compose.yml:search/tika.yml:traefik/opencloud.yml
```
This setup includes:
- Apache Tika for text extraction and metadata analysis from various file formats
- Full text search functionality in the OpenCloud interface
- Support for documents, PDFs, images, and other file types
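
The two methods above are equivalent because `COMPOSE_FILE` is simply the `-f` list joined by a path separator. Below is a minimal sketch of building that value in a shell session. It assumes the default `:` separator used on Linux and macOS (Windows uses `;`). The helper name `join_compose_files` is hypothetical and not part of this repository.

```bash
# Join compose file paths with ":" (the default COMPOSE_FILE separator on Linux/macOS).
# join_compose_files is an illustrative helper, not something this repository ships.
join_compose_files() {
  # "$*" joins all arguments using the first character of IFS;
  # the subshell keeps the IFS change local.
  ( IFS=:; printf '%s\n' "$*" )
}

COMPOSE_FILE=$(join_compose_files docker-compose.yml search/tika.yml traefik/opencloud.yml)
export COMPOSE_FILE
echo "COMPOSE_FILE=$COMPOSE_FILE"
```

With the variable exported (or written to `.env`), a plain `docker compose up -d` should load the same three files as the `-f` form.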
### Behind External Proxy
If you already have a reverse proxy (Nginx, Caddy, etc.), use either method:
@@ -175,6 +195,7 @@ Key variables:
| `INSECURE` | Skip certificate validation | true |
| `COLLABORA_DOMAIN` | Collabora domain | collabora.opencloud.test |
| `WOPISERVER_DOMAIN` | WOPI server domain | wopiserver.opencloud.test |
| `TIKA_IMAGE` | Apache Tika image (name with tag or digest) | apache/tika:latest-full |
| `KEYCLOAK_DOMAIN` | Keycloak domain | keycloak.opencloud.test |
| `KEYCLOAK_ADMIN` | Keycloak admin username | kcadmin |
| `KEYCLOAK_ADMIN_PASSWORD` | Keycloak admin password | admin |
@@ -206,6 +227,7 @@ This repository uses a modular approach with multiple compose files:
- `docker-compose.yml` - Core OpenCloud service
- `weboffice/` - Web office integrations (Collabora Online)
- `storage/` - Storage backend configurations (decomposeds3)
- `search/` - Search and content analysis services (Apache Tika)
- `idm/` - Identity management configurations (Keycloak & LDAP)
- `traefik/` - Traefik reverse proxy configurations
- `external-proxy/` - Configuration for external reverse proxies

search/tika.yml Normal file

@@ -0,0 +1,17 @@
---
services:
  tika:
    image: ${TIKA_IMAGE:-apache/tika:latest-full}
    # release notes: https://tika.apache.org
    networks:
      opencloud-net:
    restart: always
    logging:
      driver: ${LOG_DRIVER:-local}
  opencloud:
    environment:
      # fulltext search
      SEARCH_EXTRACTOR_TYPE: tika
      SEARCH_EXTRACTOR_TIKA_TIKA_URL: http://tika:9998
      FRONTEND_FULL_TEXT_SEARCH_ENABLED: "true"
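
The `image:` line relies on Compose's `${VAR:-default}` interpolation, which falls back to the default when the variable is unset or empty. That is why a blank `TIKA_IMAGE=` entry in the environment file still resolves to `apache/tika:latest-full`. POSIX shell expansion follows the same rule, so the behaviour can be sketched outside Compose. The helper `resolve_tika_image` and the override value below are hypothetical and only for illustration.

```bash
# Mimic the ${TIKA_IMAGE:-apache/tika:latest-full} lookup from search/tika.yml in plain shell.
# resolve_tika_image is an illustrative helper, not part of the repository.
resolve_tika_image() {
  printf '%s\n' "${TIKA_IMAGE:-apache/tika:latest-full}"
}

unset TIKA_IMAGE
resolve_tika_image   # unset: falls back to the default

TIKA_IMAGE=""
resolve_tika_image   # empty, as with a blank "TIKA_IMAGE=" entry: still the default

TIKA_IMAGE="registry.example/tika:pinned"   # hypothetical image reference
resolve_tika_image   # non-empty: the override is used
```

Setting `TIKA_IMAGE` to a specific tag or digest rather than relying on the `latest-full` default is one way to keep deployments reproducible.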