Requirements
Command-line Tools
To provision and manage a Dynamo AI cluster, you will need:
- Terraform – If using Terraform to create the cluster or supporting infrastructure.
- Helm – For installing and managing Dynamo AI in Kubernetes.
- Docker – For container image management.
- Kubectl – For interacting with the Kubernetes cluster.
- Cloud-specific CLI (optional, but typically needed when provisioning or configuring infrastructure directly):
  - AWS CLI
  - Azure CLI
  - gcloud CLI
  - oc CLI for OpenShift (optional).
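As a quick sanity check, a short shell loop can confirm the required tools are on the PATH. The tool list below mirrors the bullets above; adjust it for your environment (for example, drop terraform if you are not using it, or add aws, az, gcloud, or oc):

```shell
# Verify that the required CLI tools are installed and on the PATH.
# Edit this list to match your environment.
required_tools="terraform helm docker kubectl"
missing=""
for tool in $required_tools; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
    missing="$missing $tool"
  fi
done
[ -z "$missing" ] || echo "Install the missing tools before provisioning."
```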
Dependent Services
- Container Registry: A container registry is required to store the Dynamo AI container images used by the deployment. Dynamo AI images are copied to this registry so that the cluster can pull them.
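The copy step is typically a pull/tag/push sequence. The sketch below uses placeholder registry and image names (the actual image list comes from Dynamo AI); the Docker commands are commented out because they require Docker and registry credentials:

```shell
# Placeholder values -- substitute your registry and the image names
# provided by Dynamo AI.
REGISTRY="registry.example.com/dynamoai"
IMAGE="dynamoai/api-server:1.0.0"

TARGET="$REGISTRY/${IMAGE##*/}"   # strip the source repo prefix, keep name:tag
echo "would mirror $IMAGE -> $TARGET"

# Requires Docker and push access to the target registry:
# docker pull "$IMAGE"
# docker tag  "$IMAGE" "$TARGET"
# docker push "$TARGET"
```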
- Model Storage: A storage service is required for storing Dynamo AI models. There are three options:
  - Option 1: Grant your cluster access to HuggingFace to download the models directly from the Dynamo AI repo.
  - Option 2: Use in-cluster MinIO to store the models. For this option, a PersistentVolume must be configured in the Kubernetes cluster.
  - Option 3: Copy the models to an external storage service the cluster can access, such as AWS S3.
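For the in-cluster MinIO option, the PersistentVolume requirement usually means creating a PersistentVolumeClaim that MinIO can bind to. The sketch below is illustrative only: the claim name, namespace, and size are placeholders, and your cluster's storage class and capacity needs will differ:

```shell
# Write a placeholder PersistentVolumeClaim manifest for MinIO model storage.
# Name, namespace, and size are illustrative, not Dynamo AI defaults.
cat > minio-models-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-models
  namespace: dynamoai
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 200Gi
EOF
echo "wrote minio-models-pvc.yaml"

# Apply it with a configured kubectl context:
# kubectl apply -f minio-models-pvc.yaml
```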
- Utility LLM: A utility LLM is required for advanced Dynamo AI features, such as synthetic data generation in the custom policy creation workflow. There are two options for supporting this:
  - Option 1: Leverage an existing LLM service. Dynamo AI can use the following services as the utility LLM:
    - OpenAI (GPT-3.5, GPT-4, GPT-4o)
    - Mistral
    - Databricks
    - Any sufficiently performant LLM service (minimum Llama 8B class) whose API is compliant with the schemas above.
  - Option 2: Deploy the utility LLM inside the same cluster. This requires provisioning additional GPU resources.
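If you take the external-service route, it is worth smoke-testing the utility LLM endpoint before deployment. The sketch below builds a minimal OpenAI-schema chat request; the endpoint URL, model name, and API-key variable are placeholders for whatever your provider uses, and the curl call is commented out because it needs network access and credentials:

```shell
# Build a minimal chat-completions request body (OpenAI schema).
# The model name here is a placeholder for your provider's model.
cat > utility-llm-ping.json <<'EOF'
{"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}
EOF
echo "would POST utility-llm-ping.json to the provider's chat completions endpoint"

# Requires network access and a valid API key:
# curl -s https://api.openai.com/v1/chat/completions \
#   -H "Authorization: Bearer $OPENAI_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @utility-llm-ping.json
```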
- PostgreSQL Database (Optional): Dynamo AI offers an in-cluster PostgreSQL database. Alternatively, integrate with an external managed PostgreSQL database for enhanced data management.
- Target AI System (Post-deployment): Integrate the AI system you want to evaluate or run guardrails against. This integration typically occurs after cluster deployment. If using DynamoEval or DynamoGuard in Managed Inference Mode, the cluster must be able to call your target AI system via API.
  - External AI Systems: If your AI system is OpenAI, Azure OpenAI, Databricks, mistral.ai, AWS Bedrock, or its API is compliant with the OpenAI schema, Dynamo AI supports out-of-the-box integration in the UI via the "Connect AI System" and "Remote Model" options.
  - Local AI Systems: If the AI system runs within the same cluster, configure it in the UI via the "Connect AI System" and "Local Model" options.
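In Managed Inference Mode the calls originate inside the cluster, so it can help to verify outbound connectivity from a pod rather than from your workstation. A hedged sketch with a placeholder URL; the kubectl command is commented out because it needs a live cluster:

```shell
# Placeholder target endpoint -- replace with your AI system's API URL.
TARGET_URL="https://api.example.com/v1/models"
echo "would probe $TARGET_URL from inside the cluster"

# Launch a throwaway curl pod to test outbound connectivity:
# kubectl run conn-test --rm -i --restart=Never --image=curlimages/curl -- \
#   curl -sI "$TARGET_URL"
```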
GPU Requirements
GPU Requirements by Feature
| Feature | A10, A10G, or L4 GPU | A100 GPU (40 GB accelerator memory) | No GPU |
|---|---|---|---|
| Basic API and UI Servers | None | None | 32 vCPUs and 128 GB memory |
| DynamoGuard Input Content Guardrails (e.g., toxicity, “no financial advice”), Prompt Injection Guardrails, PII Guardrails | 10 input guardrails per GPU | 20 input guardrails per GPU | 8 vCPUs, 16 GB memory per input guardrail |
| DynamoGuard Output Content Guardrails (e.g., toxicity, “no financial advice”) | 1 output guardrail per GPU | 2 output guardrails per GPU | Not supported |
| DynamoGuard Hallucination Guardrails | Option 1: 2 GPUs for retrieval relevance + faithfulness; Option 2: 3 GPUs for retrieval relevance + faithfulness + response relevance | Option 1: 1 GPU for retrieval relevance + faithfulness; Option 2: 1.5 GPUs for retrieval relevance + faithfulness + response relevance | You can use third-party model endpoints (e.g., OpenAI or Mistral) to run hallucination guardrails |
| DynamoGuard Fine-Tuning Custom Content Guardrails (e.g., creating your own No Financial Advice or No Legal Advice guardrails) | 2 GPUs (on the same node) | 1 GPU | You can use the Dynamo AI SaaS platform to fine-tune custom content policies and import the models into your cluster |
| DynamoGuard Synthetic Data Generation for Custom Content Guardrails | 1 GPU for lower-quality in-cluster synthetic data generation | Option 1: 1/2 GPU for lower-quality in-cluster synthetic data generation; Option 2: 4 GPUs for highest-quality in-cluster synthetic data generation | You can leverage an external LLM as an alternative to hosting the model in-cluster |
| DynamoEval In-Cluster Judge Model (required to run most evaluations) | 1 GPU | 1/2 GPU | Not supported |
Note:
- The listed requirements may change as we develop new models. Please contact Dynamo AI for the most recent information.
- The listed requirements are the minimum resources needed to run the models. In production, you may need more resources to achieve high performance. Dynamo AI supports auto-scaling based on traffic volume.
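To compare the table above against your cluster's actual capacity, kubectl can report allocatable NVIDIA GPUs per node. This assumes the NVIDIA device plugin is installed (it is what exposes the `nvidia.com/gpu` resource); the command is commented out since it needs a live cluster:

```shell
# Report allocatable NVIDIA GPUs per node; note that dots inside the
# resource name must be escaped in kubectl's custom-columns syntax.
GPU_COLUMNS='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
printf 'would run: kubectl get nodes -o custom-columns=%s\n' "$GPU_COLUMNS"

# Requires a configured kubectl context:
# kubectl get nodes -o custom-columns="$GPU_COLUMNS"
```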