This is the Ilab API Server, a temporary set of APIs for developing apps against InstructLab. It provides endpoints for model management, data generation, training, job tracking, and job logging.
- Ensure that the required directories (`base-dir` and `taxonomy-path`) exist and are accessible, and that Go is installed and in the `$PATH`.
To install the necessary dependencies, run:
```bash
# gcc in $PATH required for sqlite
go mod download
go build -o ilab-api-server
```

To run the server on macOS, pass `--osx` and choose a pipeline:

```bash
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --osx --pipeline simple

# Or the full pipeline
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --osx --pipeline full
```

When the device type is CUDA, only the `accelerated` pipeline option is available, and it is set as the default:

```bash
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --cuda
```

- If you're operating on a Red Hat Enterprise Linux AI (RHEL AI) machine and the `ilab` binary is already available in your `$PATH`, you don't need to specify `--base-dir`. Additionally, enable CUDA support with `--cuda`. The `accelerated` pipeline is the only option here and is also the default.

```bash
./ilab-api-server --taxonomy-path ~/.local/share/instructlab/taxonomy/ --rhelai --cuda
```

The `--rhelai` flag indicates that the `ilab` binary is available in the system's `$PATH` and does not require a virtual environment. When using `--rhelai`, the `--base-dir` flag is not required since the base directory is in a known location, at least for now.
It is recommended that you make a temporary directory for yourself to facilitate the install: `mkdir -p ~/temp-apiserver-install && cd ~/temp-apiserver-install`.
We have provided some scripts that should facilitate installation of the API server on RHEL AI. First, we will download and run a script that installs glibc-devel as a dependency and reboots the system.
```bash
bash -c "$(curl -fsSL https://raw.githubusercontent.com/instructlab/ui/refs/heads/main/api-server/rhelai-install/install-glibc-devel.sh)"
```

After the reboot has finished, we can download and run the `rhelai-install.sh` script. Make sure to return to your directory before you start:

```bash
cd ~/temp-apiserver-install
bash -c "$(curl -fsSL https://raw.githubusercontent.com/instructlab/ui/refs/heads/main/api-server/rhelai-install/rhelai-install.sh)"
```

After this, we can clean up our temp directory as it is no longer required: `rm -rf ~/temp-apiserver-install`.
Here's an example command for running the server on a macOS machine with Metal support and debugging enabled:
```bash
./ilab-api-server --base-dir /Users/<USERNAME>/<PATH_TO_ILAB>/instructlab/ --taxonomy-path ~/.local/share/instructlab/taxonomy/ --pipeline simple --osx --debug
```

Endpoint: `GET /models`
Fetches the list of available models.
- Response:

```json
[ { "name": "model-name", "last_modified": "timestamp", "size": "size-string" } ]
```
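As a minimal sketch, this endpoint can be queried from Python's standard library. The `http://localhost:8080` base URL and the `list_models` helper name are illustrative assumptions, not documented values:

```python
import json
import urllib.request

# Assumed base URL; ilab-api-server's listen address is deployment-specific.
BASE_URL = "http://localhost:8080"

def list_models():
    """GET /models and return the parsed JSON list of models."""
    with urllib.request.urlopen(BASE_URL + "/models") as resp:
        return json.load(resp)
```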
Endpoint: `GET /data`

Fetches the list of datasets.

- Response:

```json
[ { "dataset": "dataset-name", "created_at": "timestamp", "file_size": "size-string" } ]
```
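The same pattern works for datasets; again, the base URL and helper name are assumptions for illustration:

```python
import json
import urllib.request

# Assumed base URL; adjust to where ilab-api-server listens.
BASE_URL = "http://localhost:8080"

def list_datasets():
    """GET /data and return the parsed JSON list of datasets."""
    with urllib.request.urlopen(BASE_URL + "/data") as resp:
        return json.load(resp)
```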
Endpoint: `POST /data/generate`

Starts a data generation job.

- Request: None
- Response:

```json
{ "job_id": "generated-job-id" }
```
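Since this endpoint takes no request body, a client only needs an empty POST. A sketch (base URL and function name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def start_data_generation():
    """POST /data/generate with no body; returns {"job_id": ...}."""
    req = urllib.request.Request(BASE_URL + "/data/generate", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```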
Endpoint: `GET /jobs`

Fetches the list of all jobs.

- Response:

```json
[ { "job_id": "job-id", "status": "running/finished/failed", "cmd": "command", "branch": "branch-name", "start_time": "timestamp", "end_time": "timestamp" } ]
```
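A client will often want to filter this list by status. A hedged sketch (base URL and helper name are mine, not the API's):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def running_jobs():
    """GET /jobs and keep only the jobs whose status is "running"."""
    with urllib.request.urlopen(BASE_URL + "/jobs") as resp:
        jobs = json.load(resp)
    return [j for j in jobs if j["status"] == "running"]
```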
Endpoint: `GET /jobs/{job_id}/status`

Fetches the status of a specific job.

- Response:

```json
{ "job_id": "job-id", "status": "running/finished/failed", "branch": "branch-name", "command": "command" }
```
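The job ID is a path parameter, so the client interpolates it into the URL. A sketch under the same assumed base URL:

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def job_status(job_id):
    """GET /jobs/{job_id}/status for a single job."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}/status") as resp:
        return json.load(resp)
```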
Endpoint: `GET /jobs/{job_id}/logs`

Fetches the logs of a specific job.

- Response: Plain-text logs of the job.
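Unlike the other endpoints, this one returns plain text rather than JSON, so the client should read the body as a string. A sketch (base URL assumed):

```python
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def job_logs(job_id):
    """GET /jobs/{job_id}/logs; the body is plain text, not JSON."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}/logs") as resp:
        return resp.read().decode("utf-8")
```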
Endpoint: `POST /model/train`

Starts a training job.

- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch", "epochs": 10 }
```

Parameters:

- `modelName` (string, required): The name of the model. Can be provided with or without the `models/` prefix. Examples:
  - Without prefix: `"granite-7b-lab-Q4_K_M.gguf"`
  - With prefix: `"models/granite-7b-starter"`
- `branchName` (string, required): The name of the branch to train on.
- `epochs` (integer, optional): The number of training epochs. Must be a positive integer.

- Response:

```json
{ "job_id": "training-job-id" }
```
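Because `epochs` is optional, a client should omit the key entirely rather than send a null. A sketch of building that request (base URL and helper name are illustrative assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def start_training(model_name, branch_name, epochs=None):
    """POST /model/train; the optional epochs key is omitted when not given."""
    payload = {"modelName": model_name, "branchName": branch_name}
    if epochs is not None:
        payload["epochs"] = epochs
    req = urllib.request.Request(
        BASE_URL + "/model/train",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```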
Endpoint: `POST /pipeline/generate-train`

Combines data generation and training into a single pipeline job.

- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch", "epochs": 10 }
```

Parameters:

- `modelName` (string, required): The name of the model. Can be provided with or without the `models/` prefix. Examples:
  - Without prefix: `"granite-7b-lab-Q4_K_M.gguf"`
  - With prefix: `"models/granite-7b-starter"`
- `branchName` (string, required): The name of the branch to train on.
- `epochs` (integer, optional): The number of training epochs. Must be a positive integer.

- Response:

```json
{ "pipeline_job_id": "pipeline-job-id" }
```
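The request shape is the same as `/model/train`; only the path and the response key differ. A sketch returning just the pipeline job ID (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def generate_and_train(model_name, branch_name, epochs=10):
    """POST /pipeline/generate-train; returns the pipeline_job_id string."""
    body = json.dumps({"modelName": model_name, "branchName": branch_name, "epochs": epochs})
    req = urllib.request.Request(
        BASE_URL + "/pipeline/generate-train",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["pipeline_job_id"]
```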
Endpoint: `POST /model/serve-latest`

Serves the latest model checkpoint on port 8001.

- Request:

```json
{ "checkpoint": "samples_12345" }
```

Parameters:

- `checkpoint` (string, optional): Name of the checkpoint directory (e.g., `"samples_12345"`). If omitted, the server uses the latest checkpoint.

- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
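Since `checkpoint` is optional, a client can send an empty JSON object to serve the latest checkpoint. A sketch (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def serve_latest(checkpoint=None):
    """POST /model/serve-latest; sends {} when no checkpoint is specified."""
    payload = {} if checkpoint is None else {"checkpoint": checkpoint}
    req = urllib.request.Request(
        BASE_URL + "/model/serve-latest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```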
Endpoint: `POST /model/serve-base`

Serves the base model on port 8000.

- Request: None
- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
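Like `/data/generate`, this endpoint takes no body, so an empty POST suffices. A sketch (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def serve_base():
    """POST /model/serve-base with no body; serves the base model on port 8000."""
    req = urllib.request.Request(BASE_URL + "/model/serve-base", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```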
Endpoint: `POST /qna-eval`

Performs QnA evaluation using a specified model and YAML configuration.

- Request:

```json
{ "model_path": "/path/to/model", "yaml_file": "/path/to/config.yaml" }
```

Parameters:

- `model_path` (string, required): The file path to the model.
- `yaml_file` (string, required): The file path to the YAML configuration.

- Response (success):

```json
{ "result": "evaluation results..." }
```

- Response (error):

```json
{ "error": "error message" }
```
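Because the response carries either a `result` or an `error` key, a client should branch on which one is present. A sketch that raises on the error shape (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def qna_eval(model_path, yaml_file):
    """POST /qna-eval; returns the result string or raises on an error response."""
    body = json.dumps({"model_path": model_path, "yaml_file": yaml_file})
    req = urllib.request.Request(
        BASE_URL + "/qna-eval",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        out = json.load(resp)
    if "error" in out:
        raise RuntimeError(out["error"])
    return out["result"]
```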
Endpoint: `GET /checkpoints`

Lists all available checkpoints.

- Response:

```json
[ "checkpoint1", "checkpoint2", "checkpoint3" ]
```
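The response is a bare JSON array of names, so the parsed value is a plain Python list. A sketch (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def list_checkpoints():
    """GET /checkpoints and return the JSON array of checkpoint names."""
    with urllib.request.urlopen(BASE_URL + "/checkpoints") as resp:
        return json.load(resp)
```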
Endpoint: `GET /vllm-containers`

Fetches the list of vLLM containers.

- Response:

```json
{
  "containers": [
    { "container_id": "container-id-1", "served_model_name": "pre-train", "status": "running", "port": "8000" },
    { "container_id": "container-id-2", "served_model_name": "post-train", "status": "running", "port": "8001" }
  ]
}
```
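A convenient client-side view of this response is a mapping from served model name to port. A sketch (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def container_ports():
    """GET /vllm-containers and map each served model name to its port."""
    with urllib.request.urlopen(BASE_URL + "/vllm-containers") as resp:
        data = json.load(resp)
    return {c["served_model_name"]: c["port"] for c in data["containers"]}
```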
Endpoint: `POST /vllm-unload`

Unloads a specific vLLM container.

- Request:

```json
{ "model_name": "pre-train" }
```

- Response:

```json
{ "status": "success", "message": "Model 'pre-train' unloaded successfully", "modelName": "pre-train" }
```

- Error Response:

```json
{ "error": "Failed to unload model 'pre-train': error details..." }
```
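A sketch of sending the unload request; as elsewhere, the base URL and helper name are illustrative assumptions:

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def vllm_unload(model_name):
    """POST /vllm-unload with {"model_name": ...}; returns the status/error JSON."""
    req = urllib.request.Request(
        BASE_URL + "/vllm-unload",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```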
Endpoint: `GET /vllm-status`

Fetches the status of a specific vLLM model.

- Query Parameters:
  - `model_name` (string, required): The name of the model. Must be either `"pre-train"` or `"post-train"`.

- Response:

```json
{ "status": "running" }
```
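Since `model_name` is a query parameter restricted to two values, a client can validate it and URL-encode it before the request. A sketch (base URL and helper name assumed):

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def vllm_status(model_name):
    """GET /vllm-status?model_name=...; model_name must be pre-train or post-train."""
    if model_name not in ("pre-train", "post-train"):
        raise ValueError("model_name must be 'pre-train' or 'post-train'")
    query = urllib.parse.urlencode({"model_name": model_name})
    with urllib.request.urlopen(f"{BASE_URL}/vllm-status?{query}") as resp:
        return json.load(resp)["status"]
```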
Endpoint: `GET /gpu-free`

Retrieves the number of free and total GPUs available.

- Response:

```json
{ "free_gpus": 2, "total_gpus": 4 }
```
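A client might use this endpoint to gate job submission on GPU availability. A sketch returning the two counts (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def gpus_available():
    """GET /gpu-free and return (free_gpus, total_gpus)."""
    with urllib.request.urlopen(BASE_URL + "/gpu-free") as resp:
        counts = json.load(resp)
    return counts["free_gpus"], counts["total_gpus"]
```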