# Installing Sherpa AI Server

### Unpacking Client Files

Unpack the archive with client files and prepare the system for installation.

#### Unpacking the Archive with Client Files

```bash
# Find and unpack the archive (the latest version is automatically selected)
tar -xvzf "$(ls client_files_*.tgz | sort -V | tail -n 1)"
```

<details>

<summary>💡 Comments on unpacking the archive</summary>

**tar -xvzf "$(ls client\_files\_\*.tgz | sort -V | tail -n 1)"** - unpacks the client files archive

* `tar -xvzf` - unpacks the archive with detailed output
* `ls client_files_*.tgz` - finds all archive files
* `sort -V` - sorts versions naturally (1.0 < 1.1 < 1.10)
* `tail -n 1` - selects the latest file

**Expected result:** A directory `sh_scripts/` with the installation scripts is created, along with the other necessary files.

</details>
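
If no archive matches the pattern, the one-liner above passes an empty string to `tar`, and the resulting error does not say what went wrong. The following is a minimal sketch of a guarded variant. The function name `latest_archive` is an assumption made for illustration and is not part of the delivered scripts.

```bash
# Print the newest client_files_*.tgz in a directory (default: current directory).
# Returns non-zero if no archive is found.
latest_archive() {
  local dir="${1:-.}" newest
  # ls prints nothing when no file matches; sort -V orders versions naturally
  newest="$(ls "$dir"/client_files_*.tgz 2>/dev/null | sort -V | tail -n 1)"
  if [ -z "$newest" ]; then
    echo "No client_files_*.tgz archive found in $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}
```

A possible usage: `archive="$(latest_archive)" && tar -xvzf "$archive"`.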

#### Preparing Scripts for Execution

```bash
# Navigate to the scripts directory
cd sh_scripts/

# Make all scripts executable
chmod +x *.sh

# Return to the root directory of the project
cd ..
```

<details>

<summary>💡 Comments on preparing scripts</summary>

**cd sh\_scripts/** - navigates to the installation scripts directory

**chmod +x \*.sh** - sets execution permissions for all shell scripts

* `chmod +x` - adds execution permission
* `*.sh` - all files with the .sh extension

**cd ..** - returns to the root directory of the project

**Why this is needed:**

* The scripts must be executable before they can be run in the following installation stages

</details>

#### Structure of the Unpacked Archive:

After unpacking, you should see the following files and directories:

* `sh_scripts/` - directory with installation scripts
  * `download_all_latest_docker_images.sh` - script for downloading Docker images
  * `load_all_docker_images.sh` - script for loading images into Docker
  * `extract_models.sh` - script for unpacking AI models
  * `extract_vllm.sh` - script for unpacking LLM models
* `docker-compose.yml` - Docker Compose configuration for client installation
* `.env` - file with environment variables for system configuration

#### Checking the Success of Unpacking:

```bash
# Check the contents of the directory
ls -la

# Ensure that the scripts are executable
ls -la sh_scripts/*.sh
```

<details>

<summary>💡 Comments on checking unpacking</summary>

**ls -la** - shows detailed contents of the directory

* `-l` - long format
* `-a` - shows hidden files

**ls -la sh\_scripts/\*.sh** - checks scripts in the sh\_scripts directory

</details>
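
The manual check above can also be scripted. The sketch below assumes the archive layout listed in this section. The function name `check_unpacked` is hypothetical and not part of the delivered scripts.

```bash
# Report any expected file that is missing and any script that is not executable.
# The file list mirrors the archive structure described above.
check_unpacked() {
  local root="${1:-.}" status=0 f
  for f in docker-compose.yml .env; do
    [ -f "$root/$f" ] || { echo "Missing file: $f" >&2; status=1; }
  done
  for f in download_all_latest_docker_images.sh load_all_docker_images.sh \
           extract_models.sh extract_vllm.sh; do
    [ -x "$root/sh_scripts/$f" ] || { echo "Missing or not executable: sh_scripts/$f" >&2; status=1; }
  done
  return "$status"
}
```

Run `check_unpacked` from the project root. It returns zero only when every listed file is present and every script is executable.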

### Executing Scripts for Unpacking

#### Loading Docker Images

```bash
# Run the script to load Docker images
sudo ./sh_scripts/load_all_docker_images.sh
```

<details>

<summary>💡 Comments on loading Docker images</summary>

**sudo ./sh\_scripts/load\_all\_docker\_images.sh** - runs the script that loads the Docker images

**What the script does:**

1. Reads the Docker images from the previously downloaded .tar.gz files
2. Imports the images into the local Docker image store
3. Verifies that the images were loaded successfully

</details>

#### Unpacking AI Models

```bash
# Run the script to unpack the main models
sudo ./sh_scripts/extract_models.sh
```

<details>

<summary>💡 Comments on unpacking main models</summary>

**sudo ./sh\_scripts/extract\_models.sh** - runs the script to unpack models

**What the script does:**

1. Unpacks the Whisper model for speech recognition
2. Unpacks the BGE Reranker model for improved search
3. Unpacks models for generating embeddings
4. Creates necessary directories
5. Checks the success of the unpacking

</details>

```bash
# Run the script to unpack LLM models
sudo ./sh_scripts/extract_vllm.sh
```

<details>

<summary>💡 Comments on unpacking the LLM model</summary>

**sudo ./sh\_scripts/extract\_vllm.sh** - runs the script to unpack the LLM model

**What the script does:**

1. Unpacks the LLM model files
2. Places files directly into the models directory
3. Checks the contents after unpacking

</details>

#### Directory Structure After Unpacking (Approximate):

```
./whisper/
└── models/
    ├── base.pt
    └── ...

./bge_reranker/
└── models/
    └── bge-reranker-large/
        ├── config.json
        ├── model.bin
        └── ...

./embed-server/app/
└── model-store/
    └── sentence-transformers/
        └── paraphrase-multilingual-MiniLM-L12-v2/
            ├── config.json
            ├── pytorch_model.bin
            └── ...

./llm-server/models/
├── meta-llama/
│   └── Meta-Llama-3-8B-Instruct/
│       ├── config.json
│       ├── model-00001-of-00004.safetensors
│       ├── model-00002-of-00004.safetensors
│       └── ...
└── tokenizer.json
```
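
Because the tree above is approximate, a quick check can confirm that each model directory exists and is not empty instead of checking for exact file names. This is a minimal sketch; `check_nonempty_dirs` is a hypothetical helper.

```bash
# Succeed only if every given directory exists and contains at least one entry.
check_nonempty_dirs() {
  local status=0 d
  for d in "$@"; do
    if [ ! -d "$d" ] || [ -z "$(ls -A "$d")" ]; then
      echo "Missing or empty: $d" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example: `check_nonempty_dirs ./whisper/models ./bge_reranker/models ./embed-server/app/model-store ./llm-server/models`.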

### Configuring System Settings

Sherpa AI Server requires configuring environment variables in the `.env` file before starting.

#### Opening the Configuration File

```bash
# Open the .env file in a text editor
nano ./.env
```

Or use another text editor:

```bash
# Vim
vim ./.env

# VS Code (if installed)
code ./.env
```

<details>

<summary>💡 Comments on opening the configuration file</summary>

**nano ./.env** - opens the .env file in the nano editor

* `nano` - simple text editor
* `./.env` - path to the configuration file

**vim ./.env** - opens the file in the Vim editor

**code ./.env** - opens the file in VS Code (if installed)

**Recommendation:** Use the editor you are familiar with

</details>

#### Main Configuration Parameters

**Main server settings (aiserver):**

```bash
# Server IP address (change to your static IP)
HOST_IP=127.0.0.1

# Domain name (change to your domain)
NGINX_DOMAIN_NAME=aiserver.sherparpa.ru

# Maximum message length (in tokens)
MAX_TOKENS_MESSAGE=32000
```

**PostgreSQL Database Settings:**

```bash
# PostgreSQL password (SET YOUR SECURE PASSWORD)
POSTGRES_PASSWORD=password
```

**LLM Server Settings:**

**Choosing an AI model:** Select one of the available models by uncommenting the desired line and commenting out the others:

```bash
# === AI MODEL SELECTION ===
# Uncomment ONLY ONE of the models below:

# Llama 3.1 model (recommended for general use)
LLM_COMPLETION_MODEL_NAME=/model-store/meta-llama/Meta-Llama-3-8B-Instruct
LLM_CHAT_TEMPLATE=/model-templates/tool_chat_template_llama3.1_json.jinja
LLM_TOOL_CALL_PARSER=llama3_json

# Qwen model (alternative model)
# LLM_COMPLETION_MODEL_NAME=/model-store/Qwen3-30B-A3B-AWQ
# LLM_CHAT_TEMPLATE=/model-templates/tool_chat_template_qwen3coder.jinja
# LLM_TOOL_CALL_PARSER=hermes

# OCR model (specialized for text recognition)
# LLM_COMPLETION_MODEL_NAME=/model-store/olmOCR-2-7B-1025-FP8

# === END OF MODEL SELECTION ===

```
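
Since only one model may be active at a time, it can help to confirm that exactly one `LLM_COMPLETION_MODEL_NAME` line is uncommented after editing. The following is a sketch; `check_single_model` is a hypothetical helper, not part of the product.

```bash
# Succeed only if exactly one LLM_COMPLETION_MODEL_NAME line is uncommented.
check_single_model() {
  local env_file="${1:-.env}" count
  [ -f "$env_file" ] || { echo "File not found: $env_file" >&2; return 1; }
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
  count="$(grep -c '^[[:space:]]*LLM_COMPLETION_MODEL_NAME=' "$env_file" || true)"
  if [ "$count" -ne 1 ]; then
    echo "Expected exactly 1 active LLM_COMPLETION_MODEL_NAME, found $count" >&2
    return 1
  fi
}
```

Call `check_single_model .env` before starting the system.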

#### Security and Passwords

{% hint style="danger" %}
**Critically important:** Change all default passwords to secure ones:
{% endhint %}

```bash
# Generate secure passwords
openssl rand -base64 32

# Or use pwgen if installed
pwgen -s 32 1
```

**Password Recommendations:**

* Minimum of 32 characters
* Use letters, numbers, and special characters
* Do not use dictionary words
* Store passwords in a secure place
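
A generated password still has to be written into `.env`. Base64 output can contain `/`, `+`, and `=`, which are awkward to substitute with `sed`. The sketch below passes the value to `awk` through the environment instead. The function name `set_env_value` is an assumption made for illustration.

```bash
# Replace the value of KEY in an env file, or append KEY=value if it is absent.
# The value is passed through the environment so awk does not interpret it.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  KEY="$key" VALUE="$value" awk '
    BEGIN { k = ENVIRON["KEY"]; v = ENVIRON["VALUE"] }
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

A possible usage: `set_env_value .env POSTGRES_PASSWORD "$(openssl rand -base64 32)"`. The file is rewritten in place with `cat`, so its existing permissions are kept.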

#### Checking Configuration

After editing the `.env` file, check the correctness of the settings:

```bash
# Count the variable definitions in the file
cat .env | grep -v '^#' | grep '=' | wc -l

# Check for required variables
grep -E "(POSTGRES_PASSWORD|HOST_IP|NGINX_DOMAIN_NAME)" .env
```

<details>

<summary>💡 Comments on checking configuration</summary>

**cat .env | grep -v '^#' | grep '=' | wc -l** - counts the number of environment variables

* `cat .env` - outputs the contents of the file
* `grep -v '^#'` - excludes comments (lines starting with #)
* `grep '='` - keeps only lines with variables
* `wc -l` - counts the number of lines

**grep -E "(POSTGRES\_PASSWORD|HOST\_IP|NGINX\_DOMAIN\_NAME)" .env** - checks for the presence of required variables

* `-E` - extended regular expressions
* Lists required variables separated by |

</details>
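
The `grep` check above confirms that the variables are present, but not that they have values. The sketch below also rejects empty values and the placeholder PostgreSQL password. The function name `check_required_env` is hypothetical.

```bash
# Check that each required variable is defined with a non-empty value,
# and that POSTGRES_PASSWORD is no longer the placeholder "password".
check_required_env() {
  local env_file="${1:-.env}" status=0 name value
  for name in HOST_IP NGINX_DOMAIN_NAME POSTGRES_PASSWORD; do
    # Take the last uncommented definition of the variable
    value="$(grep -E "^${name}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    if [ -z "$value" ]; then
      echo "Not set: $name" >&2
      status=1
    fi
  done
  if grep -qx 'POSTGRES_PASSWORD=password' "$env_file"; then
    echo "POSTGRES_PASSWORD still has the default value" >&2
    status=1
  fi
  return "$status"
}
```

Run `check_required_env .env`. A non-zero exit status means at least one problem was reported.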

#### Creating a Backup

```bash
# Create a backup of the settings
cp .env .env.backup
```

<details>

<summary>💡 Comments on creating a backup</summary>

**cp .env .env.backup** - creates a backup of the configuration file

* `cp` - copy
* `.env` - source file
* `.env.backup` - backup file

</details>
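
A fixed backup name is overwritten each time the command is repeated. If several revisions should be kept, a timestamped copy is one option. This is a sketch; the function name `backup_env` and the file name format are assumptions.

```bash
# Copy the env file to a timestamped backup and print the backup path.
backup_env() {
  local env_file="${1:-.env}" backup
  backup="${env_file}.backup.$(date +%Y%m%d-%H%M%S)"
  # -p preserves the mode and timestamps of the original file
  cp -p "$env_file" "$backup" && printf '%s\n' "$backup"
}
```

Calling `backup_env` without arguments backs up `.env` in the current directory.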

{% hint style="warning" %}
**Important:** Without proper configuration of the `.env` file, the system will not start correctly.
{% endhint %}

#### Copying SSL Certificates

To ensure a secure HTTPS connection, you need to copy SSL certificates to the directory `./oais/backend/config/certs/`:

```bash
# Create a directory for certificates (if it does not exist)
mkdir -p ./oais/backend/config/certs/

# Copy your SSL certificates
# Replace with paths to your actual certificates
cp /path/to/your/certificate.crt ./oais/backend/config/certs/aiserver.crt
cp /path/to/your/private.key ./oais/backend/config/certs/aiserver.key

# Or if you have a wildcard certificate:
cp /path/to/your/wildcard.crt ./oais/backend/config/certs/aiserver.crt
cp /path/to/your/wildcard.key ./oais/backend/config/certs/aiserver.key
```

<details>

<summary>💡 Comments on copying SSL certificates</summary>

**mkdir -p ./oais/backend/config/certs/** - creates a directory for certificates

* `-p` - creates parent directories if they do not exist

**cp /path/to/your/certificate.crt ./oais/backend/config/certs/aiserver.crt** - copies the certificate

**cp /path/to/your/private.key ./oais/backend/config/certs/aiserver.key** - copies the private key

**Certificate Requirements:**

* The certificate must be in `.crt` or `.pem` format
* The private key must be in `.key` format
* File names must be `aiserver.crt` and `aiserver.key`

</details>

{% hint style="warning" %}
**Important:** Ensure that the certificates have the correct permissions:
{% endhint %}

```bash
# Set the correct permissions on the certificates
chmod 644 ./oais/backend/config/certs/*.crt
chmod 600 ./oais/backend/config/certs/*.key
```
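
The file names and permissions described above can be verified before startup. The sketch below assumes GNU `stat`, which is standard on Linux. The function name `check_cert_files` is hypothetical.

```bash
# Verify that aiserver.crt and aiserver.key exist and that the key is mode 600.
# Assumes GNU stat (standard on Linux).
check_cert_files() {
  local dir="${1:-./oais/backend/config/certs}" mode
  [ -f "$dir/aiserver.crt" ] || { echo "Missing: $dir/aiserver.crt" >&2; return 1; }
  [ -f "$dir/aiserver.key" ] || { echo "Missing: $dir/aiserver.key" >&2; return 1; }
  mode="$(stat -c '%a' "$dir/aiserver.key")"
  if [ "$mode" != "600" ]; then
    echo "aiserver.key has mode $mode, expected 600" >&2
    return 1
  fi
}
```

Run `check_cert_files` from the project root to use the default certificate directory.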

{% hint style="info" %}
**NOTE**: Obtain certificates from your network administrator or your corporate certificate authority (CA). If neither option is available, refer to the article on obtaining certificates.
{% endhint %}

### Starting the System

After completing all the preparatory steps, you can start Sherpa AI Server. The system will run in the background as a set of Docker containers.

{% hint style="warning" %}
**Important:** The `docker-compose.yml` file delivered with the client files contains the configuration of all services. Make sure you use this file to start the system.
{% endhint %}

#### Starting All Services

```bash
# Start the basic services in the background
docker compose up -d
```

<details>

<summary>💡 Comments on starting services</summary>

**docker compose up -d** - starts all services in the background

* `docker compose` - uses Docker Compose to manage containers
* `up` - creates and starts all services from docker-compose.yml
* `-d` - runs containers in detached mode (background)

**Expected startup time:** 2-5 minutes, depending on system performance.

</details>

#### Starting Services with Additional Features

In the `docker-compose.yml` file, some services have profiles and are started only when explicitly specified:

```bash
# Start with the Whisper speech recognition service
docker compose --profile whisper up -d

# Start with the BGE Reranker service
docker compose --profile reranker up -d

# Start all services (Whisper + BGE Reranker + basic)
docker compose --profile full up -d
```

{% hint style="warning" %}
**Important:** Consider the amount of available video memory (VRAM) on your system. If VRAM is limited, start only the services you need by using the corresponding profiles. With insufficient memory, the system may become unstable or fail to start.
{% endhint %}

**Available profiles:**

* `whisper` - includes the speech recognition service (port 3005)
* `reranker` - includes the service for re-ranking search results (port 8001)
* `full` - includes all additional services (Whisper + BGE Reranker)

{% hint style="warning" %}
**Important:** Without specifying profiles, the Whisper and BGE Reranker services will not start. Choose the appropriate profile based on your functional requirements.
{% endhint %}
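
Docker Compose accepts the `--profile` flag more than once, so individual profiles can be combined in a single command. The sketch below validates the profile names listed above and prints the resulting command without running it. The function name `compose_up_command` is an assumption made for illustration.

```bash
# Print the docker compose command for the given profiles without running it.
# Only the profiles documented above are accepted.
compose_up_command() {
  local cmd="docker compose" p
  for p in "$@"; do
    case "$p" in
      whisper|reranker|full) cmd="$cmd --profile $p" ;;
      *) echo "Unknown profile: $p" >&2; return 1 ;;
    esac
  done
  printf '%s up -d\n' "$cmd"
}
```

For example, `compose_up_command whisper reranker` prints `docker compose --profile whisper --profile reranker up -d`.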

#### Checking the Status of Containers

```bash
# Check the status of all running containers
docker compose ps

# Or use docker ps for detailed information
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```

<details>

<summary>💡 Comments on checking status</summary>

**docker compose ps** - shows the status of all Docker Compose containers

* Outputs names, status, and ports of containers

**docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"** - shows detailed information about containers

* `--format` - sets a custom output format
* `table` - tabular format
* `{{.Names}}` - container names
* `{{.Status}}` - container status
* `{{.Ports}}` - open ports

</details>

**Expected output (depends on the selected profile):**

**Basic services (without profiles):**

```
NAME                        STATUS              PORTS
aiserver                    Up 2 minutes        0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp, 0.0.0.0:4500->4500/tcp
aiserver-db                 Up 2 minutes        0.0.0.0:3306->3306/tcp
aiserver-pg                 Up 2 minutes        0.0.0.0:5432->5432/tcp
aiserver-embed              Up 2 minutes        0.0.0.0:3004->443/tcp
aiserver-llm-server         Up 2 minutes        0.0.0.0:3003->8000/tcp
aiserver-code_interpreter   Up 2 minutes        0.0.0.0:3001->3001/tcp
```

**With the `whisper` profile (added):**

```
aiserver-whisper        Up About a minute   0.0.0.0:3005->8000/tcp
```

**With the `reranker` profile (added):**

```
aiserver-bge_reranker   Up About a minute   0.0.0.0:8001->8000/tcp
```

**With the `full` profile (both added):**

```
aiserver-whisper        Up About a minute   0.0.0.0:3005->8000/tcp
aiserver-bge_reranker   Up About a minute   0.0.0.0:8001->8000/tcp
```

All running containers should have the status "Up" and show open ports.
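
The "Up" check can be automated by filtering the output of `docker ps`. The sketch below reads tab-separated name and status pairs from standard input. The function name `report_not_up` is hypothetical.

```bash
# Read "name<TAB>status" lines on stdin and list containers whose status
# does not start with "Up". Returns non-zero if any are found.
report_not_up() {
  awk -F '\t' '
    NF == 0 { next }                                 # skip blank lines
    $2 !~ /^Up/ { print $1 ": " $2; bad = 1 }
    END { exit bad }
  '
}
```

A possible usage: `docker ps -a --format '{{.Names}}\t{{.Status}}' | report_not_up`. The `-a` flag includes stopped containers, which would otherwise be missing from the list.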

#### Checking Container Logs

```bash
# View logs of the main server
docker compose logs aiserver

# View logs of all services
docker compose logs

# Monitor logs in real time
docker compose logs -f aiserver
```

<details>

<summary>💡 Comments on checking logs</summary>

**docker compose logs aiserver** - shows logs of a specific service

* `aiserver` - service name

**docker compose logs** - shows logs of all services

**docker compose logs -f aiserver** - follows logs in real time

* `-f` - follow (watch for new messages)

**Check for errors:**

* Look for messages about database connection errors
* Check the loading of AI models
* Ensure the correctness of SSL certificates

</details>

#### Checking Service Availability

**Checking the main web interface:**

```bash
# Check HTTP availability (replace with your domain)
curl -I http://aiserver.sherparpa.ru

# Check HTTPS availability (replace with your domain)
curl -I https://aiserver.sherparpa.ru

# Expected response: HTTP/2 200 or redirect to /login
```

<details>

<summary>💡 Comments on checking the web interface</summary>

**curl -I http://aiserver.sherparpa.ru** - checks HTTP availability

* `-I` - shows only response headers
* `http://aiserver.sherparpa.ru` - URL to check

**curl -I https://aiserver.sherparpa.ru** - checks HTTPS availability

**Expected response:** HTTP/2 200 or redirect to /login

</details>
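
For scripted checks, reading only the status code is easier than parsing headers. The sketch below treats `200` and the common redirect codes as a healthy response, matching the expected result described above. Both function names are assumptions made for illustration.

```bash
# Treat 200 and common redirects as a healthy response from the web interface.
is_healthy_status() {
  case "$1" in
    200|301|302|303|307|308) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch only the status code for a URL; curl prints 000 if the connection fails.
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}
```

A possible usage: `is_healthy_status "$(http_status https://aiserver.sherparpa.ru)" && echo "Web interface is reachable"`.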

**Checking AI Services:**

```bash
# Check the LLM server
curl -X POST "http://localhost:3003/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "prompt": "Hello", "max_tokens": 10}'
```

<details>

<summary>💡 Comments on checking the LLM server</summary>

**curl -X POST "http://localhost:3003/v1/completions"** - checks the LLM server

* `-X POST` - HTTP POST method
* `"http://localhost:3003/v1/completions"` - API endpoint URL
* `-H "Content-Type: application/json"` - Content-Type header
* `-d '{"model": "...", "prompt": "Hello", "max_tokens": 10}'` - request data in JSON format

</details>

**Checking Database Connections:**

```bash
# Check connection to PostgreSQL
docker compose exec aiserver-pg psql -U postgres -d postgres -c "SELECT version();"
```
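
Because startup can take several minutes, the service checks above may fail on the first attempt even when nothing is wrong. The following is a minimal sketch of a retry helper. The function name `retry` and the attempt counts are assumptions made for illustration.

```bash
# Run a command until it succeeds, up to a maximum number of attempts,
# sleeping a fixed number of seconds between attempts.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # No need to wait after the final attempt
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  echo "Command failed after $attempts attempts: $*" >&2
  return 1
}
```

A possible usage: `retry 10 5 docker compose exec aiserver-pg psql -U postgres -d postgres -c "SELECT version();"` tries the database check up to 10 times, 5 seconds apart.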

#### Testing Basic Functions

**Web Interface:**

1. Open a browser and go to `https://aiserver.sherparpa.ru`
2. The login page should open
3. Verify that you can register and log in

#### Managing the System

**Stopping the System:**

```bash
# Stop all services (considering running profiles)
docker compose down
```

**Restarting Services:**

```bash
# Restart a specific service
docker compose restart aiserver

# Restart all running services
docker compose restart

# Restart services with a specific profile
docker compose --profile whisper restart aiserver-whisper
```

**Viewing Resources:**

```bash
# Check resource usage
docker stats

# Check GPU usage
nvidia-smi
```

<details>

<summary>💡 Comments on system management</summary>

**Stopping the system:**

* `docker compose down` - stops all services and removes containers
* `docker compose down -v` - stops services and removes volumes (data will be lost!)

**Restarting services:**

* `docker compose restart aiserver` - restarts a specific service
* `docker compose restart` - restarts all services
* `docker compose --profile whisper restart aiserver-whisper` - restarts the service with a specific profile

**Viewing resources:**

* `docker stats` - shows CPU, memory, and network usage for containers
* `nvidia-smi` - shows NVIDIA GPU usage

</details>
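
`nvidia-smi` can also print free GPU memory in a script-friendly form, which helps when deciding whether to enable an additional profile. The sketch below reads one free-memory value in MiB per line. The function name `has_free_vram` and the threshold used in the usage example are hypothetical, since this guide does not state the VRAM requirements of each service.

```bash
# Read free-memory values in MiB (one per GPU) on stdin; succeed if at least
# one GPU has the requested amount free.
has_free_vram() {
  awk -v need="$1" '$1 + 0 >= need + 0 { ok = 1 } END { exit !ok }'
}
```

A possible usage: `nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | has_free_vram 8000 && echo "Enough free VRAM"`.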

#### Possible Issues During Startup:

* **Containers do not start**: Check logs with `docker compose logs`
* **SSL issues**: Check that the certificate and key files are valid, correctly named, and have the correct permissions
* **Database connection errors**: Check environment variables in `.env`
* **GPU issues**: Check the `CUDA_VISIBLE_DEVICES` settings

After successfully starting and testing the system, the installation of Sherpa AI Server is complete.


