# Quickstart Guide

Get started with DeepFix in minutes. This guide walks you through the basic workflow of ingesting a dataset and running diagnostics.

## Prerequisites

- DeepFix installed (see the Installation Guide)
- DeepFix server running (see Server Setup)
- Python 3.11 or higher
## Server Setup

Before using the SDK, start the DeepFix server:

```bash
# Navigate to the server directory
cd deepfix-server

# Create a .env file with the LLM configuration (if it does not exist)
# See the Configuration Guide for details

# Launch the server
uv run deepfix-server launch -e .env -port 8844 -host 127.0.0.1
```

The server will start and display: `Starting DeepFix server on 127.0.0.1:8844`
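To confirm the server is actually reachable before running the examples, a quick sanity check is a plain TCP probe. This is just a socket test, not a DeepFix API call; the host and port match the launch command above:

```python
import socket

def is_server_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(is_server_listening("127.0.0.1", 8844))
```

If this prints `False`, revisit the launch command and the Configuration Guide before continuing.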
## Basic Workflow

The typical DeepFix workflow consists of two steps:

1. **Initialize a client**: create a client connection to the server
2. **Get a diagnosis**: upload your dataset and run AI-powered analysis in one call
### Example 1: Image Classification Dataset

This example shows how to diagnose an image classification dataset.

```python
from deepfix_sdk.client import DeepFixClient
from deepfix_sdk.zoo.datasets.foodwaste import load_train_and_val_datasets
from deepfix_sdk.data.datasets import ImageClassificationDataset

# Step 1: Initialize client
client = DeepFixClient(
    api_url="http://localhost:8844",
    timeout=120,  # Increase timeout for large datasets
)

# Step 2: Load and wrap dataset
dataset_name = "cafetaria-foodwaste"
train_data, val_data = load_train_and_val_datasets(
    image_size=448,
    batch_size=8,
    num_workers=4,
    pin_memory=False,
)

# Wrap datasets for DeepFix
train_dataset = ImageClassificationDataset(
    dataset_name=dataset_name,
    dataset=train_data,
)
val_dataset = ImageClassificationDataset(
    dataset_name=dataset_name,
    dataset=val_data,
)

# Step 3: Get diagnosis (ingests and diagnoses in one call)
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=val_dataset,
    batch_size=8,
    language="english",
)

# Step 4: View results
print(result.to_text())
```
### Example 2: Tabular Dataset

This example shows how to work with tabular (structured) data.

```python
import pandas as pd
from deepfix_sdk.client import DeepFixClient
from deepfix_sdk.data.datasets import TabularDataset

# Step 1: Initialize client
client = DeepFixClient(api_url="http://localhost:8844")

# Step 2: Load tabular data
df_train = pd.read_csv("train_data.csv")
df_test = pd.read_csv("test_data.csv")

# Step 3: Wrap datasets
train_dataset = TabularDataset(
    dataset_name="my-tabular-dataset",
    data=df_train,
)
test_dataset = TabularDataset(
    dataset_name="my-tabular-dataset",
    data=df_test,
)

# Step 4: Get diagnosis (ingests and diagnoses in one call)
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=test_dataset,
)
print(result.to_text())
```
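If your tabular data lives in a single CSV rather than separate train/test files, a reproducible split can be sketched with pandas before wrapping. The filename and the 80/20 ratio below are placeholders, not DeepFix requirements:

```python
import pandas as pd

def split_frame(df: pd.DataFrame, train_frac: float = 0.8, seed: int = 42):
    """Return a reproducible (train, test) split of a DataFrame."""
    df_train = df.sample(frac=train_frac, random_state=seed)
    df_test = df.drop(df_train.index)
    return df_train, df_test

# Usage with a single CSV (filename is a placeholder):
# df_train, df_test = split_frame(pd.read_csv("data.csv"))
```

Fixing `random_state` keeps the split stable across runs, so repeated diagnoses compare like with like.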
### Example 3: NLP Dataset

This example shows how to work with natural language processing datasets.

```python
from datasets import load_dataset
from deepfix_sdk.client import DeepFixClient
from deepfix_sdk.data.datasets import NLPDataset

# Step 1: Initialize client
client = DeepFixClient(api_url="http://localhost:8844")

# Step 2: Load NLP dataset
train_data = load_dataset("imdb", split="train")
test_data = load_dataset("imdb", split="test")

# Step 3: Wrap datasets
train_dataset = NLPDataset(
    dataset_name="imdb-sentiment",
    dataset=train_data,
)
test_dataset = NLPDataset(
    dataset_name="imdb-sentiment",
    dataset=test_data,
)

# Step 4: Get diagnosis (ingests and diagnoses in one call)
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=test_dataset,
    batch_size=16,
)
print(result.to_text())
```
## Understanding Results

The `get_diagnosis()` method returns an `APIResponse` object containing:

- **Agent Results**: detailed findings from each specialized analyzer agent
- **Summary**: a cross-artifact summary synthesizing all insights
- **Recommendations**: prioritized suggestions for improvement
### Working with Results

```python
# Get formatted text output
result_text = result.to_text()
print(result_text)

# Access agent-specific results
for agent_name, agent_result in result.agent_results.items():
    print(f"\n{agent_name}:")
    print(agent_result.findings)

# Get summary
print(f"\nSummary: {result.summary}")

# Access additional outputs
if result.additional_outputs:
    print(f"\nAdditional outputs: {result.additional_outputs}")
```
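To keep a record of each run, the formatted text can also be written to disk. A minimal sketch; it assumes only the `result.to_text()` method shown above, and the output directory name is arbitrary:

```python
from datetime import datetime
from pathlib import Path

def save_diagnosis(result, out_dir: str = "diagnoses") -> Path:
    """Write the formatted diagnosis text to a timestamped file and return its path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    out_file = out / f"diagnosis-{datetime.now():%Y%m%d-%H%M%S}.txt"
    out_file.write_text(result.to_text(), encoding="utf-8")
    return out_file
```

Timestamped filenames keep successive diagnoses side by side instead of overwriting each other.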
## Common Patterns

### Using MLflow Integration

```python
from deepfix_sdk.client import DeepFixClient
from deepfix_sdk.config import MLflowConfig

# Configure MLflow
mlflow_config = MLflowConfig(
    tracking_uri="http://localhost:5000",
    experiment_name="my-experiment",
    run_name="run-1",
)

# Initialize the client with MLflow tracking
client = DeepFixClient(
    api_url="http://localhost:8844",
    mlflow_config=mlflow_config,
)

# Get diagnosis - results are automatically tracked in MLflow
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=test_dataset,
)
```
### Handling Large Datasets

```python
# Increase the timeout for large datasets
client = DeepFixClient(
    api_url="http://localhost:8844",
    timeout=300,  # 5 minutes
)

# Use smaller batch sizes under memory constraints
result = client.get_diagnosis(
    train_data=train_dataset,
    batch_size=4,  # Reduce batch size
)
```
### Overwriting Existing Datasets

```python
# get_diagnosis automatically overwrites existing datasets with the same name
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=test_dataset,
)
```
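If you want to keep earlier uploads instead of overwriting them, one simple convention (a naming scheme, not a DeepFix feature) is to timestamp the dataset name:

```python
from datetime import datetime

def versioned_name(base: str) -> str:
    """Append a timestamp so repeated runs upload under distinct dataset names."""
    return f"{base}-{datetime.now():%Y%m%d-%H%M%S}"

# e.g. pass dataset_name=versioned_name("my-tabular-dataset") when wrapping datasets
```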
## Next Steps
- Image Classification Guide - Deep dive into image classification
- Tabular Data Guide - Advanced tabular data workflows
- NLP Datasets Guide - Working with NLP datasets
- MLflow Integration - Integrate with MLflow
- API Reference - Complete API documentation
- Configuration Guide - Configure DeepFix for your needs
## Troubleshooting
### Connection Errors

**Problem**: cannot connect to the server.

```python
# Solution: verify the server is running and the URL is correct (see Server Setup)
client = DeepFixClient(api_url="http://localhost:8844")
```
### Timeout Errors

**Problem**: requests are timing out.

```python
# Solution: increase the client timeout
client = DeepFixClient(
    api_url="http://localhost:8844",
    timeout=120,  # seconds
)
```
### Dataset Not Found

**Problem**: dataset name not found.

```python
# Solution: use get_diagnosis, which handles ingestion automatically
result = client.get_diagnosis(
    train_data=train_dataset,
    test_data=test_dataset,
)
```
See the dedicated Troubleshooting guide for more issues and solutions.