Initial commit
961
references/ELASTICSEARCH_SETUP.md
Normal file
# Elasticsearch Setup Guide

This guide documents the Elasticsearch search engine setup for full-text search, multilingual content indexing, and location-based queries in the application.

## SECURITY

**Elasticsearch is a DATABASE and must NEVER be exposed to the internet without proper security!**

### Default Security Risks

By default, Elasticsearch 7.x ships with:

- **NO authentication** - anyone can read/write/delete all data
- **NO encryption** - all data transmitted in plain text
- **NO access control** - full admin access for anyone who connects

### Required Security Configuration

**For Development (Local Machine Only):**

```yaml
# /etc/elasticsearch/elasticsearch.yml
network.host: 127.0.0.1  # ONLY localhost - NOT 0.0.0.0!
http.port: 9200
xpack.security.enabled: false  # OK for localhost-only
```

**For Production/Remote Servers:**

```yaml
# /etc/elasticsearch/elasticsearch.yml
network.host: 127.0.0.1  # ONLY localhost - use reverse proxy if needed
http.port: 9200
xpack.security.enabled: true  # REQUIRED for any server accessible remotely
xpack.security.transport.ssl.enabled: true  # needs a key and certificate configured under xpack.security.transport.ssl.*
```

### Verify Your Server Is NOT Exposed

**Check what interface Elasticsearch is listening on:**

```bash
ss -tlnp | grep 9200
```

**SAFE** - Should show ONLY localhost addresses:

```
127.0.0.1:9200             # IPv4 localhost
[::1]:9200                 # IPv6 localhost
[::ffff:127.0.0.1]:9200    # IPv4-mapped IPv6 localhost (also safe!)
```

**Note**: The `[::ffff:127.0.0.1]` format is the IPv4-mapped IPv6 form of `127.0.0.1` - it's still localhost-only and secure.

**DANGER** - If you see any of these, YOU ARE EXPOSED:

```
0.0.0.0:9200           # Listening on ALL interfaces - EXPOSED!
*:9200                 # Listening on ALL interfaces - EXPOSED!
YOUR_PUBLIC_IP:9200    # Listening on public IP - EXPOSED!
```

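The safe/exposed distinction above can also be checked mechanically. The following is a minimal sketch, not part of the application: `classify_listen_addr` is a hypothetical helper name, and it assumes addresses in the `host:port` form that `ss` prints.

```bash
# Hypothetical helper (not part of the application): classify a listening
# address, as printed by `ss`, as loopback-only ("safe") or "exposed".
classify_listen_addr() {
  # "${1%:*}" strips the trailing ":port" so only the host part is inspected.
  case "${1%:*}" in
    127.*|"[::1]"|"[::ffff:127."*"]") echo "safe" ;;
    *) echo "exposed" ;;
  esac
}
```

For example, the local addresses in the fourth column of `ss -tln | grep 9200` could each be passed to the function; anything reported as `exposed` should be treated as described above.
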
**Test external accessibility:**

```bash
# From another machine or from the internet
curl http://YOUR_SERVER_IP:9200

# Should get: Connection refused, or a timeout if a firewall drops the traffic (GOOD!)
# If you get a JSON response - YOU ARE EXPOSED TO THE INTERNET!
```

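When this external test is scripted, the verdict can be derived from curl's exit status instead of reading the output by eye. This is a sketch under assumptions: `interpret_curl_exit` is a hypothetical helper name, and only curl's documented exit codes 0 (success), 7 (failed to connect) and 28 (operation timed out) are interpreted.

```bash
# Hypothetical helper: translate the exit status of the external curl test
# into a verdict. Only curl exit codes 0, 7 and 28 are interpreted.
interpret_curl_exit() {
  case "$1" in
    0)  echo "EXPOSED: the server answered" ;;
    7)  echo "OK: could not connect" ;;
    28) echo "OK: timed out" ;;
    *)  echo "UNKNOWN: curl exit code $1" ;;
  esac
}
```

A possible use is `curl -s -o /dev/null --max-time 5 http://YOUR_SERVER_IP:9200; interpret_curl_exit $?`. Note that exit code 0 means any HTTP answer was received, including an authentication error, so the port is reachable from outside either way.
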
### What Happens If Exposed?

If Elasticsearch is exposed to the internet without authentication:

1. Attackers can **read all your data** (users, emails, private information)
2. Attackers can **delete all your indices** (all search data gone)
3. Attackers can **modify data** (corrupt your search results)
4. Attackers can **execute scripts** (potential remote code execution)

**Real-world attacks:**

- Ransomware attacks encrypting Elasticsearch data
- Mass data exfiltration of exposed databases
- Bitcoin mining malware installation
- Complete data deletion with ransom demands

### Immediate Actions If You Discover Exposure

1. **IMMEDIATELY stop Elasticsearch:**
   ```bash
   sudo systemctl stop elasticsearch
   ```

2. **Fix the configuration:**
   ```bash
   sudo nano /etc/elasticsearch/elasticsearch.yml
   # Set: network.host: 127.0.0.1
   # Set: xpack.security.enabled: true
   ```

3. **Restart with fixed configuration:**
   ```bash
   sudo systemctl start elasticsearch
   ```

4. **Set passwords for the built-in users** (the node must be running with security enabled):
   ```bash
   sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
   ```

5. **Verify it's no longer accessible:**
   ```bash
   curl http://YOUR_SERVER_IP:9200
   # Should show: Connection refused
   ```

6. **Review logs for unauthorized access:**
   ```bash
   sudo grep -i "unauthorized\|access denied\|failed\|401\|403" /var/log/elasticsearch/*.log
   ```

---

## Overview

The application uses **Elasticsearch 7.17.24** with Laravel Scout for:

- Full-text search across Users, Organizations, Banks, and Posts
- Multilingual search with language-specific analyzers (EN, NL, DE, ES, FR)
- Location-based search with edge n-gram tokenization
- Skill and tag matching with boost factors
- Autocomplete suggestions
- Custom search optimization with configurable boost factors

**Scout Driver**: `matchish/laravel-scout-elasticsearch` v7.12.0
**Elasticsearch Client**: `elasticsearch/elasticsearch` v8.19.0

## Prerequisites

- PHP 8.3+ with required extensions
- MySQL/MariaDB database (primary data source)
- Redis server (for Scout queue)
- Java: the Elasticsearch 7.x packages bundle their own JDK, so a separate Java Runtime Environment (JRE 11+) is only needed if you override the bundled one
- At least 4GB RAM available for Elasticsearch (8GB+ recommended for production)

## Installation

### 1. Install Elasticsearch

#### On Ubuntu/Debian:

```bash
# Import the Elasticsearch GPG key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

# Add the Elasticsearch repository
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list

# Update package list and install
sudo apt-get update
sudo apt-get install elasticsearch=7.17.24

# Hold the package to prevent unwanted upgrades
sudo apt-mark hold elasticsearch
```

#### On CentOS/RHEL:

```bash
# Import the GPG key
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

# Create repository file
cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

# Install specific version
sudo yum install elasticsearch-7.17.24
```

### 2. Configure Elasticsearch

#### Basic Configuration

Edit `/etc/elasticsearch/elasticsearch.yml`:

```yaml
# Cluster name (single-node setup)
cluster.name: elasticsearch

# Node name
node.name: node-1

# Network settings for local development
network.host: 127.0.0.1
http.port: 9200

# Discovery settings (single-node)
discovery.type: single-node

# Path settings (default, can be customized)
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

# Security (disabled for local development, enable for production)
xpack.security.enabled: false
```

#### Memory Configuration

Configure JVM heap size in `/etc/elasticsearch/jvm.options.d/heap.options`:

```
# Development: 2-4GB
-Xms2g
-Xmx2g

# Production: 8-16GB (50% of system RAM, max 32GB)
# -Xms16g
# -Xmx16g
```

**Important Memory Guidelines:**

- Set `-Xms` and `-Xmx` to the same value
- Never exceed 50% of total system RAM
- Stay below roughly 32GB so the JVM can keep using compressed object pointers (compressed oops)
- Leave at least 50% of RAM for the OS file cache

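The guidelines above reduce to a small calculation. The sketch below is illustrative only: `recommended_heap_mb` is a hypothetical helper name, and the 31GB cap is an assumption chosen to keep a margin under the roughly 32GB compressed-oops limit.

```bash
# Hypothetical helper: suggest a heap size in MB from total RAM in MB,
# following the guidelines above (half of RAM, capped below 32GB).
recommended_heap_mb() {
  total_mb="$1"
  cap_mb=$((31 * 1024))      # assumption: 31GB leaves a margin under ~32GB
  heap_mb=$((total_mb / 2))  # never more than 50% of system RAM
  if [ "$heap_mb" -gt "$cap_mb" ]; then
    heap_mb="$cap_mb"
  fi
  echo "$heap_mb"
}
```

On Linux, total RAM in MB can be read with `awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo`; the result `N` would then be written to the heap options file as `-XmsNm` and `-XmxNm`.
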
#### System Limits

The systemd service already configures these limits:

```
LimitNOFILE=65535
LimitNPROC=4096
LimitAS=infinity
```

If running manually, also set in `/etc/security/limits.conf`:

```
elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
elasticsearch soft nproc 4096
elasticsearch hard nproc 4096
```

### 3. Start and Enable Elasticsearch

```bash
# Start Elasticsearch
sudo systemctl start elasticsearch

# Enable to start on boot
sudo systemctl enable elasticsearch

# Check status
sudo systemctl status elasticsearch

# View logs
sudo journalctl -u elasticsearch -f
```

### 4. Verify Installation

```bash
# Test connection
curl http://localhost:9200

# Expected output:
# {
#   "name" : "node-1",
#   "cluster_name" : "elasticsearch",
#   "version" : {
#     "number" : "7.17.24",
#     ...
#   },
#   "tagline" : "You Know, for Search"
# }

# Check cluster health (URLs are quoted so the shell does not interpret "?")
curl 'http://localhost:9200/_cluster/health?pretty'

# Check available indices
curl 'http://localhost:9200/_cat/indices?v'
```

## Laravel Application Configuration

### 1. Environment Variables

Configure Elasticsearch connection in `.env`:

```env
# Search configuration
SCOUT_DRIVER=matchish-elasticsearch
SCOUT_QUEUE=true
SCOUT_PREFIX=

# Elasticsearch connection
ELASTICSEARCH_HOST=localhost:9200
# ELASTICSEARCH_USER=elastic              # Uncomment for production with auth
# ELASTICSEARCH_PASSWORD=your_password    # Uncomment for production with auth

# Queue for background indexing (recommended)
QUEUE_CONNECTION=redis
```

### 2. Configuration Files

The application has extensive Elasticsearch configuration:

**`config/scout.php`**
- Driver: `matchish-elasticsearch`
- Queue enabled for async indexing
- Chunk size: 500 records per batch
- Soft deletes: Not kept in search index

**`config/elasticsearch.php`**
- Index mappings for all searchable models (825 lines!)
- Language-specific analyzers (NL, EN, FR, DE, ES)
- Custom analyzers for names and locations
- Date format handling
- Field boost configuration

**`config/timebank-cc.php`** (search section)
- Boost factors for fields and models
- Search behavior (type, fragment size, highlighting)
- Maximum results and caching
- Model indices to search
- Suggestion count

### 3. Searchable Models

The following models use Scout's `Searchable` trait:

- **User** → `users_index`
- **Organization** → `organizations_index`
- **Bank** → `banks_index`
- **Post** → `posts_index`
- **Transaction** → `transactions_index`
- **Tag** → `tags_index`

Each model defines:
- `searchableAs()`: Index name
- `toSearchableArray()`: Data structure for indexing

## Index Management

### Creating Indices

Indices are automatically created when you import data:

```bash
# Import all models (creates indices with timestamps)
php artisan scout:import "App\Models\User"
php artisan scout:import "App\Models\Organization"
php artisan scout:import "App\Models\Bank"
php artisan scout:import "App\Models\Post"

# Queue-based import (recommended for large datasets)
php artisan scout:queue-import "App\Models\User"
```

**Index Naming**: Indices are created with timestamps (e.g., `users_index_1758826582`) and aliases are used for stable names.

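Because each import creates a new timestamped index, it can be handy to find the newest one for a given base name. This is a minimal sketch under assumptions: `latest_index` is a hypothetical helper, and it assumes the numeric suffix is a timestamp, so the largest suffix is the newest index.

```bash
# Hypothetical helper: read index names on stdin (one per line) and print the
# newest timestamped index for the given base name, e.g. users_index.
# Assumption: the numeric suffix is a timestamp, so the largest is the newest.
latest_index() {
  base="$1"
  # Keep only "<base>_<digits>" names, reduce them to the numeric suffix,
  # pick the largest, then put the base name back in front.
  sed -n "s/^${base}_\([0-9][0-9]*\)\$/\1/p" | sort -n | tail -n 1 | sed "s/^/${base}_/"
}
```

It could be fed from the cat API, for example `curl -s 'http://localhost:9200/_cat/indices?h=index' | latest_index users_index`. If no index matches, nothing is printed.
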
### Reindexing Script

The application includes a comprehensive reindexing script at `re-index-search.sh`:

```bash
# Run the reindexing script
./re-index-search.sh
```

**What it does:**
1. Cleans up old indices and removes conflicts
2. Waits for cluster health
3. Imports all models (Users, Organizations, Banks, Posts)
4. Creates stable aliases pointing to latest timestamped indices
5. Shows final index and alias status

**Important**: The script uses `SCOUT_QUEUE=false` to force immediate indexing, bypassing the queue for reliable completion.

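The script itself is not reproduced here. As an illustration of what step 2 (waiting for cluster health) can look like, the following sketch polls until the status is usable. `cluster_status` and `wait_for_cluster` are hypothetical names, and treating `yellow` as acceptable is an assumption that fits a single-node setup.

```bash
# Print the "status" value from cluster-health JSON read on stdin.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Hypothetical sketch of step 2: poll until the cluster is green or yellow.
#   $1 = name of a command or function that prints cluster-health JSON
#   $2 = maximum attempts (default 30)
#   $3 = seconds to sleep between attempts (default 2)
wait_for_cluster() {
  fetch="$1"; attempts="${2:-30}"; delay="${3:-2}"; i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$fetch" 2>/dev/null | cluster_status)
    case "$status" in
      green|yellow) echo "$status"; return 0 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  return 1  # gave up: the cluster never reported green or yellow
}
```

With a function such as `fetch_health() { curl -s 'http://localhost:9200/_cluster/health'; }`, the call `wait_for_cluster fetch_health` blocks until the cluster is ready or the attempts run out. Passing the fetch command as a parameter keeps the polling logic testable without a running cluster. Elasticsearch can also wait server-side via the `wait_for_status` and `timeout` parameters of `_cluster/health`.
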
### Manual Index Operations

```bash
# Flush: remove all of a model's records from its index
php artisan scout:flush "App\Models\User"

# Delete a specific index
php artisan scout:delete-index users_index_1758826582

# Delete all indices
php artisan scout:delete-all-indexes

# Create a new index
php artisan scout:index users_index

# Check indices via curl
curl 'http://localhost:9200/_cat/indices?v'

# Check aliases
curl 'http://localhost:9200/_cat/aliases?v'
```

## Search Features

### Multilingual Search

The configuration supports 5 languages with dedicated analyzers:

**Language Analyzers:**
- `analyzer_nl`: Dutch (stop words + stemming)
- `analyzer_en`: English (stop words + stemming)
- `analyzer_fr`: French (stop words + stemming)
- `analyzer_de`: German (stop words + stemming)
- `analyzer_es`: Spanish (stop words + stemming)

**Special Analyzers:**
- `name_analyzer`: For profile names with edge n-grams (autocomplete)
- `locations_analyzer`: For cities/districts with custom stop words
- `analyzer_general`: Generic tokenization for general text

### Boost Configuration

Field boost factors (configured in `config/timebank-cc.php`):

**Profile Fields:**
```php
'name' => 1,
'full_name' => 1,
'cyclos_skills' => 1.5,
'tags' => 2,              // Highest boost
'tag_categories' => 1.4,
'motivation' => 1,
'about_short' => 1,
'about' => 1,
```

**Post Fields:**
```php
'title' => 2,               // Highest boost
'excerpt' => 1.5,
'content' => 1,
'post_category_name' => 2,  // High boost
```

**Model Boost (score multipliers):**
```php
'user' => 1,          // Baseline
'organization' => 3,  // 3x boost
'bank' => 3,          // 3x boost
'post' => 4,          // 4x boost (highest)
```

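This guide does not show how the application turns these factors into a query. One common way to apply per-field boosts in Elasticsearch is the `field^boost` syntax accepted in the `fields` list of a `multi_match` query. The sketch below only builds that list; `boosted_fields_json` is a hypothetical helper, not the application's code.

```bash
# Hypothetical helper: turn name=boost pairs into a JSON array of
# "field^boost" strings, the per-field boost syntax of multi_match queries.
boosted_fields_json() {
  out=""
  for pair in "$@"; do
    field="${pair%%=*}"   # text before the first "="
    boost="${pair#*=}"    # text after the first "="
    out="${out:+$out,}\"${field}^${boost}\""
  done
  printf '[%s]\n' "$out"
}
```

For example, `boosted_fields_json name=1 tags=2 cyclos_skills=1.5` prints `["name^1","tags^2","cyclos_skills^1.5"]`, which could be placed in the `fields` key of a `multi_match` query for manual testing.
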
### Location-Based Search

The application has advanced location boost factors:

```php
'same_district' => 5.0,      // Highest boost
'same_city' => 3.0,          // High boost
'same_division' => 2.0,      // Medium boost
'same_country' => 1.5,       // Base boost
'different_country' => 1.0,  // Neutral
'no_location' => 0.9,        // Slight penalty
```

### Search Highlighting

Search results include highlighted matches:

```php
'fragment_size' => 80,        // Characters per fragment
'number_of_fragments' => 2,   // Max fragments
'pre-tags' => '<span class="font-semibold text-white leading-tight">',
'post-tags' => '</span>',
```

### Caching

Search results are cached for performance:

```php
'cache_results' => 5,  // TTL in minutes
```

## Index Structure Examples

### Users Index Mapping

```json
{
  "users_index": {
    "properties": {
      "id": { "type": "keyword" },
      "name": {
        "type": "text",
        "analyzer": "name_analyzer",
        "fields": {
          "keyword": { "type": "keyword" },
          "suggest": { "type": "completion" }
        }
      },
      "about_nl": { "type": "text", "analyzer": "analyzer_nl" },
      "about_en": { "type": "text", "analyzer": "analyzer_en" },
      "about_fr": { "type": "text", "analyzer": "analyzer_fr" },
      "about_de": { "type": "text", "analyzer": "analyzer_de" },
      "about_es": { "type": "text", "analyzer": "analyzer_es" },
      "locations": {
        "properties": {
          "district": { "type": "text", "analyzer": "locations_analyzer" },
          "city": { "type": "text", "analyzer": "locations_analyzer" },
          "division": { "type": "text", "analyzer": "locations_analyzer" },
          "country": { "type": "text", "analyzer": "locations_analyzer" }
        }
      },
      "tags": {
        "properties": {
          "contexts": {
            "properties": {
              "tags": {
                "properties": {
                  "name_nl": { "type": "text", "analyzer": "analyzer_nl" },
                  "name_en": { "type": "text", "analyzer": "analyzer_en" }
                  // ... other languages
                }
              }
            }
          }
        }
      }
    }
  }
}
```

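The `name.suggest` sub-field above has type `completion`, so it can be queried with Elasticsearch's completion suggester. The sketch below builds such a request body for manual testing. `suggest_body` and the suggestion name `name-suggest` are hypothetical, and the prefix is assumed to contain no characters that need JSON escaping.

```bash
# Hypothetical helper: build a completion-suggester request body for the
# name.suggest sub-field shown in the mapping above.
#   $1 = prefix typed by the user (assumed to need no JSON escaping)
#   $2 = number of suggestions to return (default 5)
suggest_body() {
  prefix="$1"
  size="${2:-5}"
  printf '{"suggest":{"name-suggest":{"prefix":"%s","completion":{"field":"name.suggest","size":%s}}}}\n' "$prefix" "$size"
}
```

The body could then be sent with `curl -s -H 'Content-Type: application/json' 'http://localhost:9200/users_index/_search?pretty' -d "$(suggest_body an)"`.
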
### Posts Index Mapping

```json
{
  "posts_index": {
    "properties": {
      "id": { "type": "keyword" },
      "category_id": { "type": "integer" },
      "status": { "type": "keyword" },
      "featured": { "type": "boolean" },
      "post_translations": {
        "properties": {
          "title_nl": {
            "type": "text",
            "analyzer": "analyzer_nl",
            "fields": {
              "keyword": { "type": "keyword" },
              "suggest": { "type": "completion" }
            }
          },
          "title_en": {
            "type": "text",
            "analyzer": "analyzer_en",
            "fields": {
              "keyword": { "type": "keyword" },
              "suggest": { "type": "completion" }
            }
          },
          "content_nl": { "type": "text", "analyzer": "analyzer_nl" },
          "content_en": { "type": "text", "analyzer": "analyzer_en" },
          "from_nl": {
            "type": "date",
            "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||strict_date_optional_time||epoch_millis"
          }
          // ... other languages and fields
        }
      }
    }
  }
}
```

## Troubleshooting

### Elasticsearch Won't Start

**Problem**: Service fails to start

**Solutions**:

1. Check memory settings:
   ```bash
   # View JVM settings
   cat /etc/elasticsearch/jvm.options.d/heap.options

   # Check available system memory
   free -h

   # Ensure heap size doesn't exceed 50% of RAM
   ```

2. Check disk space:
   ```bash
   df -h /var/lib/elasticsearch
   ```

3. Check logs:
   ```bash
   sudo journalctl -u elasticsearch -n 100 --no-pager
   sudo tail -f /var/log/elasticsearch/elasticsearch.log
   ```

4. Check Java installation (only relevant if you override the JDK bundled with Elasticsearch):
   ```bash
   java -version
   ```

### Connection Refused

**Problem**: Cannot connect to Elasticsearch

**Solutions**:

1. Verify Elasticsearch is running:
   ```bash
   sudo systemctl status elasticsearch
   ```

2. Check port binding:
   ```bash
   ss -tlnp | grep 9200
   ```

3. Check configuration:
   ```bash
   sudo grep -E "^network.host|^http.port" /etc/elasticsearch/elasticsearch.yml
   ```

4. Test connection:
   ```bash
   curl http://localhost:9200
   ```

### Index Not Found

**Problem**: `index_not_found_exception` when searching

**Solutions**:

1. Check if indices exist:
   ```bash
   curl 'http://localhost:9200/_cat/indices?v'
   ```

2. Check if aliases exist:
   ```bash
   curl 'http://localhost:9200/_cat/aliases?v'
   ```

3. Reimport the model:
   ```bash
   php artisan scout:import "App\Models\User"
   ```

4. Or run the full reindex script:
   ```bash
   ./re-index-search.sh
   ```

### Slow Indexing / High Memory Usage

**Problem**: Indexing takes too long or uses excessive memory

**Solutions**:

1. Enable queue for async indexing in `.env`:
   ```env
   SCOUT_QUEUE=true
   QUEUE_CONNECTION=redis
   ```

2. Start queue worker:
   ```bash
   php artisan queue:work --queue=high,default
   ```

3. Reduce chunk size in `config/scout.php`:
   ```php
   'chunk' => [
       'searchable' => 250,  // Reduced from 500
   ],
   ```

4. Monitor Elasticsearch memory:
   ```bash
   curl 'http://localhost:9200/_nodes/stats/jvm?pretty'
   ```

### Search Results Are Incorrect

**Problem**: Search doesn't return expected results

**Solutions**:

1. Check index mapping:
   ```bash
   curl 'http://localhost:9200/users_index/_mapping?pretty'
   ```

2. Test query directly:
   ```bash
   curl -X GET "localhost:9200/users_index/_search?pretty" -H 'Content-Type: application/json' -d'
   {
     "query": {
       "match": {
         "name": "test"
       }
     }
   }
   '
   ```

3. Clear and rebuild index:
   ```bash
   php artisan scout:flush "App\Models\User"
   php artisan scout:import "App\Models\User"
   ```

4. Check Scout queue jobs:
   ```bash
   php artisan queue:failed
   php artisan queue:retry all
   ```

### Out of Memory Errors

**Problem**: `OutOfMemoryError` in Elasticsearch logs

**Solutions**:

1. Increase JVM heap (but respect limits):
   ```bash
   # Edit /etc/elasticsearch/jvm.options.d/heap.options
   -Xms4g
   -Xmx4g
   ```

2. Restart Elasticsearch:
   ```bash
   sudo systemctl restart elasticsearch
   ```

3. Monitor memory usage (the URL is quoted so the inner shell does not treat `&` as a background operator):
   ```bash
   watch -n 1 "curl -s 'http://localhost:9200/_cat/nodes?v&h=heap.percent,ram.percent'"
   ```

4. Clear fielddata cache:
   ```bash
   curl -X POST "localhost:9200/_cache/clear?fielddata=true"
   ```

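Watching the heap by hand does not scale, so the same cat API output can drive a simple check. This is a sketch only: `heap_over_threshold` is a hypothetical helper and the default threshold of 85 percent is an arbitrary assumption, not a value from this setup.

```bash
# Hypothetical helper: succeed (exit 0) if any heap.percent value read on
# stdin, one per line, is above the threshold in $1 (default 85, arbitrary).
heap_over_threshold() {
  awk -v limit="${1:-85}" '$1 + 0 > limit + 0 { hot = 1 } END { exit(hot ? 0 : 1) }'
}
```

A cron job could run `curl -s 'http://localhost:9200/_cat/nodes?h=heap.percent' | heap_over_threshold 90 && echo "Elasticsearch heap pressure"` and forward the message to whatever alerting is in place.
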
### Shards Unassigned

**Problem**: Yellow or red cluster health

**Solutions**:

1. Check cluster health:
   ```bash
   curl 'http://localhost:9200/_cluster/health?pretty'
   ```

2. Check shard allocation:
   ```bash
   curl 'http://localhost:9200/_cat/shards?v'
   ```

3. For single-node setup, set replicas to 0:
   ```bash
   curl -X PUT "localhost:9200/_settings" -H 'Content-Type: application/json' -d'
   {
     "index": {
       "number_of_replicas": 0
     }
   }
   '
   ```

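To see at a glance whether unassigned shards are the cause, the shard listing can be reduced to a count. This is a minimal sketch: `count_unassigned` is a hypothetical helper, and it assumes the input columns are `index`, `shard`, `prirep`, `state`, so that the state is the fourth column.

```bash
# Hypothetical helper: count UNASSIGNED shards in `_cat/shards` output read
# on stdin. Assumes the columns index, shard, prirep, state (state is 4th).
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n++ } END { print n + 0 }'
}
```

Requesting the columns explicitly keeps the assumption true: `curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state' | count_unassigned`. On a single node, a non-zero count usually comes from replica shards that have nowhere to go, which step 3 above resolves.
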
## Production Recommendations

### Security

1. **Enable X-Pack Security**:

   Edit `/etc/elasticsearch/elasticsearch.yml`:
   ```yaml
   xpack.security.enabled: true
   xpack.security.transport.ssl.enabled: true
   ```

2. **Set passwords** (with Elasticsearch running):
   ```bash
   sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto
   ```

3. **Update `.env`**:
   ```env
   ELASTICSEARCH_USER=elastic
   ELASTICSEARCH_PASSWORD=generated_password
   ```

### Performance Optimization

1. **Increase file descriptors**:
   ```
   # /etc/security/limits.conf
   elasticsearch soft nofile 65535
   elasticsearch hard nofile 65535
   ```

2. **Disable swapping**:
   ```yaml
   # /etc/elasticsearch/elasticsearch.yml
   bootstrap.memory_lock: true
   ```

   Edit `/etc/systemd/system/elasticsearch.service.d/override.conf`, then run `sudo systemctl daemon-reload` and restart Elasticsearch:
   ```ini
   [Service]
   LimitMEMLOCK=infinity
   ```

3. **Use SSD for data directory**:
   ```yaml
   # /etc/elasticsearch/elasticsearch.yml
   path.data: /mnt/ssd/elasticsearch
   ```

4. **Set appropriate refresh interval**:
   ```bash
   curl -X PUT "localhost:9200/users_index/_settings" -H 'Content-Type: application/json' -d'
   {
     "index": {
       "refresh_interval": "30s"
     }
   }
   '
   ```

### Backup and Restore

1. **Configure snapshot repository** (the `location` must first be listed under `path.repo` in `/etc/elasticsearch/elasticsearch.yml`, followed by a restart, and must be writable by the `elasticsearch` user):
   ```bash
   curl -X PUT "localhost:9200/_snapshot/backup_repo" -H 'Content-Type: application/json' -d'
   {
     "type": "fs",
     "settings": {
       "location": "/var/backups/elasticsearch",
       "compress": true
     }
   }
   '
   ```

2. **Create snapshot**:
   ```bash
   curl -X PUT "localhost:9200/_snapshot/backup_repo/snapshot_1?wait_for_completion=true"
   ```

3. **Restore snapshot** (existing indices with the same names must be closed or deleted first):
   ```bash
   curl -X POST "localhost:9200/_snapshot/backup_repo/snapshot_1/_restore"
   ```

### Monitoring

1. **Check cluster stats**:
   ```bash
   curl 'http://localhost:9200/_cluster/stats?pretty'
   ```

2. **Monitor node stats**:
   ```bash
   curl 'http://localhost:9200/_nodes/stats?pretty'
   ```

3. **Check index stats**:
   ```bash
   curl 'http://localhost:9200/_stats?pretty'
   ```

4. **Set up monitoring with Kibana** (optional):
   ```bash
   sudo apt-get install kibana=7.17.24
   sudo systemctl enable kibana
   sudo systemctl start kibana
   ```

## Quick Reference

### Essential Commands

```bash
# Service management
sudo systemctl start elasticsearch
sudo systemctl stop elasticsearch
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch

# Check health
curl http://localhost:9200
curl 'http://localhost:9200/_cluster/health?pretty'
curl 'http://localhost:9200/_cat/indices?v'

# Laravel Scout commands
php artisan scout:import "App\Models\User"
php artisan scout:flush "App\Models\User"
php artisan scout:delete-all-indexes

# Reindex everything
./re-index-search.sh

# Queue worker for async indexing
php artisan queue:work --queue=high,default
```

### Configuration Files

- `.env` - Connection and driver configuration
- `config/scout.php` - Laravel Scout settings
- `config/elasticsearch.php` - Index mappings and analyzers (825 lines!)
- `config/timebank-cc.php` - Search boost factors and behavior
- `/etc/elasticsearch/elasticsearch.yml` - Elasticsearch server config
- `/etc/elasticsearch/jvm.options.d/heap.options` - JVM memory settings
- `/usr/lib/systemd/system/elasticsearch.service` - systemd service

### Important Paths

- **Data**: `/var/lib/elasticsearch`
- **Logs**: `/var/log/elasticsearch`
- **Config**: `/etc/elasticsearch`
- **Binary**: `/usr/share/elasticsearch`

## Additional Resources

- **Elasticsearch Documentation**: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/
- **Laravel Scout**: https://laravel.com/docs/10.x/scout
- **Matchish Scout Elasticsearch**: https://github.com/matchish/laravel-scout-elasticsearch
- **Elasticsearch DSL**: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl.html
- **Language Analyzers**: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-lang-analyzer.html

## Notes

- This application uses a multilingual search setup with custom analyzers
- The `config/elasticsearch.php` file is extensive (825 lines) with detailed field mappings
- Location-based search uses edge n-grams for autocomplete functionality
- Tags and categories have hierarchical support with multilingual translations
- The reindexing script handles index versioning and aliasing automatically
- Memory requirements are significant during indexing (plan accordingly)