This is the Ilab API Server, a temporary set of APIs for developing apps against InstructLab. It provides endpoints for model management, data generation, training, job tracking, and job logging.
- Ensure that the required directories (`base-dir` and `taxonomy-path`) exist and are accessible, and that Go is installed and in the `$PATH`.
To install the necessary dependencies, run:
```bash
# gcc in $PATH required for sqlite
go mod download
go build -o ilab-api-server
```

To run the server on macOS, pass `--osx` and choose a pipeline:

```bash
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --osx --pipeline simple

# Or the full pipeline
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --osx --pipeline full
```

When the device type is CUDA, only the `accelerated` pipeline option is available, and it is set as the default:

```bash
./ilab-api-server --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --cuda
```

- If you're operating on a Red Hat Enterprise Linux AI (RHEL AI) machine and the `ilab` binary is already available in your `$PATH`, you don't need to specify `--base-dir`. Additionally, enable CUDA support with `--cuda`. The `accelerated` pipeline is the only option here and is also the default.

```bash
./ilab-api-server --taxonomy-path ~/.local/share/instructlab/taxonomy/ --rhelai --cuda
```

The `--rhelai` flag indicates that the `ilab` binary is available in the system's `$PATH` and does not require a virtual environment. When using `--rhelai`, the `--base-dir` flag is not required since the base directory is in a known location, at least for now.
It is recommended that you make a temporary directory for yourself to facilitate the install: `mkdir -p ~/temp-apiserver-install && cd ~/temp-apiserver-install`.
We have provided some scripts that should facilitate installation of the API server on RHEL AI. First, we will download and run a script that installs glibc-devel as a dependency and reboots the system.
```bash
bash -c "$(curl -fsSL https://raw.githubusercontent.com/instructlab/ui/refs/heads/main/api-server/rhelai-install/install-glibc-devel.sh)"
```

After the reboot has finished, we can download and run the `rhelai-install.sh` script. Make sure to return to your directory before you start:

```bash
cd ~/temp-apiserver-install
bash -c "$(curl -fsSL https://raw.githubusercontent.com/instructlab/ui/refs/heads/main/api-server/rhelai-install/rhelai-install.sh)"
```

After this, we can clean up our temp directory as it is no longer required: `rm -rf ~/temp-apiserver-install`.
Here's an example command for running the server on a macOS machine with Metal support and debugging enabled:
```bash
./ilab-api-server --base-dir /Users/<USERNAME>/<PATH_TO_ILAB>/instructlab/ --taxonomy-path ~/.local/share/instructlab/taxonomy/ --pipeline simple --osx --debug
```

Endpoint: `GET /models`
Fetches the list of available models.
- Response:

```json
[ { "name": "model-name", "last_modified": "timestamp", "size": "size-string" } ]
```
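As a minimal sketch, this endpoint can be queried from Python's standard library. The `http://localhost:8080` base URL and the `list_models` helper name are illustrative assumptions, not documented values:

```python
import json
import urllib.request

# Assumed base URL; ilab-api-server's listen address is deployment-specific.
BASE_URL = "http://localhost:8080"

def list_models():
    """GET /models and return the parsed JSON list of models."""
    with urllib.request.urlopen(BASE_URL + "/models") as resp:
        return json.load(resp)
```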
Endpoint: `GET /data`

Fetches the list of datasets.

- Response:

```json
[ { "dataset": "dataset-name", "created_at": "timestamp", "file_size": "size-string" } ]
```
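The same pattern works for datasets; again, the base URL and helper name are assumptions for illustration:

```python
import json
import urllib.request

# Assumed base URL; adjust to where ilab-api-server listens.
BASE_URL = "http://localhost:8080"

def list_datasets():
    """GET /data and return the parsed JSON list of datasets."""
    with urllib.request.urlopen(BASE_URL + "/data") as resp:
        return json.load(resp)
```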
Endpoint: `POST /data/generate`

Starts a data generation job.

- Request: None
- Response:

```json
{ "job_id": "generated-job-id" }
```
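Since this endpoint takes no request body, a client only needs an empty POST. A sketch (base URL and function name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def start_data_generation():
    """POST /data/generate with no body; returns {"job_id": ...}."""
    req = urllib.request.Request(BASE_URL + "/data/generate", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```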
Endpoint: `GET /jobs`

Fetches the list of all jobs.

- Response:

```json
[ { "job_id": "job-id", "status": "running/finished/failed", "cmd": "command", "branch": "branch-name", "start_time": "timestamp", "end_time": "timestamp" } ]
```
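A client will often want to filter this list by status. A hedged sketch (base URL and helper name are mine, not the API's):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def running_jobs():
    """GET /jobs and keep only the jobs whose status is "running"."""
    with urllib.request.urlopen(BASE_URL + "/jobs") as resp:
        jobs = json.load(resp)
    return [j for j in jobs if j["status"] == "running"]
```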
Endpoint: `GET /jobs/{job_id}/status`

Fetches the status of a specific job.

- Response:

```json
{ "job_id": "job-id", "status": "running/finished/failed", "branch": "branch-name", "command": "command" }
```
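The job ID is a path parameter, so the client interpolates it into the URL. A sketch under the same assumed base URL:

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def job_status(job_id):
    """GET /jobs/{job_id}/status for a single job."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}/status") as resp:
        return json.load(resp)
```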
Endpoint: `GET /jobs/{job_id}/logs`

Fetches the logs of a specific job.

- Response: Plain-text logs of the job.
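Unlike the other endpoints, this one returns plain text rather than JSON, so the client should read the body as a string. A sketch (base URL assumed):

```python
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def job_logs(job_id):
    """GET /jobs/{job_id}/logs; the body is plain text, not JSON."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}/logs") as resp:
        return resp.read().decode("utf-8")
```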
Endpoint: `POST /model/train`

Starts a training job.

- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch", "epochs": 10 }
```

Parameters:

- `modelName` (string, required): The name of the model. Can be provided with or without the `models/` prefix. Examples:
  - Without prefix: `"granite-7b-lab-Q4_K_M.gguf"`
  - With prefix: `"models/granite-7b-starter"`
- `branchName` (string, required): The name of the branch to train on.
- `epochs` (integer, optional): The number of training epochs. Must be a positive integer.

- Response:

```json
{ "job_id": "training-job-id" }
```
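Because `epochs` is optional, a client should omit the key entirely rather than send a null. A sketch of building that request (base URL and helper name are illustrative assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def start_training(model_name, branch_name, epochs=None):
    """POST /model/train; the optional epochs key is omitted when not given."""
    payload = {"modelName": model_name, "branchName": branch_name}
    if epochs is not None:
        payload["epochs"] = epochs
    req = urllib.request.Request(
        BASE_URL + "/model/train",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```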
Endpoint: `POST /pipeline/generate-train`

Combines data generation and training into a single pipeline job.

- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch", "epochs": 10 }
```

Parameters:

- `modelName` (string, required): The name of the model. Can be provided with or without the `models/` prefix. Examples:
  - Without prefix: `"granite-7b-lab-Q4_K_M.gguf"`
  - With prefix: `"models/granite-7b-starter"`
- `branchName` (string, required): The name of the branch to train on.
- `epochs` (integer, optional): The number of training epochs. Must be a positive integer.

- Response:

```json
{ "pipeline_job_id": "pipeline-job-id" }
```
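The request shape is the same as `/model/train`; only the path and the response key differ. A sketch returning just the pipeline job ID (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def generate_and_train(model_name, branch_name, epochs=10):
    """POST /pipeline/generate-train; returns the pipeline_job_id string."""
    body = json.dumps({"modelName": model_name, "branchName": branch_name, "epochs": epochs})
    req = urllib.request.Request(
        BASE_URL + "/pipeline/generate-train",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["pipeline_job_id"]
```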
Endpoint: `POST /model/serve-latest`

Serves the latest model checkpoint on port 8001.

- Request:

```json
{ "checkpoint": "samples_12345" }
```

Parameters:

- `checkpoint` (string, optional): Name of the checkpoint directory (e.g., `"samples_12345"`). If omitted, the server uses the latest checkpoint.

- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
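Since `checkpoint` is optional, a client can send an empty JSON object to serve the latest checkpoint. A sketch (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def serve_latest(checkpoint=None):
    """POST /model/serve-latest; sends {} when no checkpoint is specified."""
    payload = {} if checkpoint is None else {"checkpoint": checkpoint}
    req = urllib.request.Request(
        BASE_URL + "/model/serve-latest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```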
Endpoint: `POST /model/serve-base`

Serves the base model on port 8000.

- Request: None
- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
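Like `/data/generate`, this endpoint takes no body, so an empty POST suffices. A sketch (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def serve_base():
    """POST /model/serve-base with no body; serves the base model on port 8000."""
    req = urllib.request.Request(BASE_URL + "/model/serve-base", method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```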
Endpoint: `POST /qna-eval`

Performs QnA evaluation using a specified model and YAML configuration.

- Request:

```json
{ "model_path": "/path/to/model", "yaml_file": "/path/to/config.yaml" }
```

Parameters:

- `model_path` (string, required): The file path to the model.
- `yaml_file` (string, required): The file path to the YAML configuration.

- Response (success):

```json
{ "result": "evaluation results..." }
```

- Response (error):

```json
{ "error": "error message" }
```
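Because the response carries either a `result` or an `error` key, a client should branch on which one is present. A sketch that raises on the error shape (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def qna_eval(model_path, yaml_file):
    """POST /qna-eval; returns the result string or raises on an error response."""
    body = json.dumps({"model_path": model_path, "yaml_file": yaml_file})
    req = urllib.request.Request(
        BASE_URL + "/qna-eval",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        out = json.load(resp)
    if "error" in out:
        raise RuntimeError(out["error"])
    return out["result"]
```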
Endpoint: `GET /checkpoints`

Lists all available checkpoints.

- Response:

```json
[ "checkpoint1", "checkpoint2", "checkpoint3" ]
```
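The response is a bare JSON array of names, so the parsed value is a plain Python list. A sketch (base URL and helper name assumed):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def list_checkpoints():
    """GET /checkpoints and return the JSON array of checkpoint names."""
    with urllib.request.urlopen(BASE_URL + "/checkpoints") as resp:
        return json.load(resp)
```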
Endpoint: `GET /vllm-containers`

Fetches the list of vLLM containers.

- Response:

```json
{
  "containers": [
    { "container_id": "container-id-1", "served_model_name": "pre-train", "status": "running", "port": "8000" },
    { "container_id": "container-id-2", "served_model_name": "post-train", "status": "running", "port": "8001" }
  ]
}
```
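A convenient client-side view of this response is a mapping from served model name to port. A sketch (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def container_ports():
    """GET /vllm-containers and map each served model name to its port."""
    with urllib.request.urlopen(BASE_URL + "/vllm-containers") as resp:
        data = json.load(resp)
    return {c["served_model_name"]: c["port"] for c in data["containers"]}
```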
Endpoint: `POST /vllm-unload`

Unloads a specific vLLM container.

- Request:

```json
{ "model_name": "pre-train" }
```

- Response:

```json
{ "status": "success", "message": "Model 'pre-train' unloaded successfully", "modelName": "pre-train" }
```

- Error Response:

```json
{ "error": "Failed to unload model 'pre-train': error details..." }
```
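A sketch of sending the unload request; as elsewhere, the base URL and helper name are illustrative assumptions:

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def vllm_unload(model_name):
    """POST /vllm-unload with {"model_name": ...}; returns the status/error JSON."""
    req = urllib.request.Request(
        BASE_URL + "/vllm-unload",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```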
Endpoint: `GET /vllm-status`

Fetches the status of a specific vLLM model.

- Query Parameters:
  - `model_name` (string, required): The name of the model. Must be either `"pre-train"` or `"post-train"`.

- Response:

```json
{ "status": "running" }
```
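Since `model_name` is a query parameter restricted to two values, a client can validate it and URL-encode it before the request. A sketch (base URL and helper name assumed):

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def vllm_status(model_name):
    """GET /vllm-status?model_name=...; model_name must be pre-train or post-train."""
    if model_name not in ("pre-train", "post-train"):
        raise ValueError("model_name must be 'pre-train' or 'post-train'")
    query = urllib.parse.urlencode({"model_name": model_name})
    with urllib.request.urlopen(f"{BASE_URL}/vllm-status?{query}") as resp:
        return json.load(resp)["status"]
```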
Endpoint: `GET /gpu-free`

Retrieves the number of free and total GPUs available.

- Response:

```json
{ "free_gpus": 2, "total_gpus": 4 }
```
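A client might use this endpoint to gate job submission on GPU availability. A sketch returning the two counts (base URL and helper name are assumptions):

```python
import json
import urllib.request

# Assumed base URL; adjust for your deployment.
BASE_URL = "http://localhost:8080"

def gpus_available():
    """GET /gpu-free and return (free_gpus, total_gpus)."""
    with urllib.request.urlopen(BASE_URL + "/gpu-free") as resp:
        counts = json.load(resp)
    return counts["free_gpus"], counts["total_gpus"]
```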