SDK API Reference
Complete API documentation for the DeepFix SDK client library.
DeepFixClient
Main client for interacting with the DeepFix server.
This client provides a high-level interface for diagnosing ML datasets, ingesting data with quality checks, and leveraging AI-powered recommendations to improve your ML workflows.
Attributes:
| Name | Type | Description |
|---|---|---|
| `mlflow_config` | `MLflowConfig` | Configuration for MLflow integration. |
| `api_url` | `str` | Base URL of the DeepFix server. |
| `timeout` | `int` | Request timeout in seconds. |
Source code in deepfix-sdk\src\deepfix_sdk\client.py
__init__(api_url='http://localhost:8844', mlflow_config=None, artifact_config=None, timeout=30)
Initialize the DeepFixClient.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `api_url` | `str` | URL of the DeepFix server. Defaults to "http://localhost:8844". | `'http://localhost:8844'` |
| `mlflow_config` | `MLflowConfig` | MLflow configuration for experiment tracking. If not provided, a default MLflowConfig is created. Defaults to None. | `None` |
| `artifact_config` | `ArtifactConfig` | Artifact cache configuration used to discover stored datasets/models. Defaults to None. | `None` |
| `timeout` | `int` | Request timeout in seconds. Defaults to 30. | `30` |
Example
    from deepfix_sdk.client import DeepFixClient

    client = DeepFixClient(
        api_url="http://localhost:8844",
        timeout=120,
    )
Source code in deepfix-sdk\src\deepfix_sdk\client.py
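One way to avoid hard-coding the constructor arguments is to read them from the environment. The sketch below is a hypothetical helper, not part of the SDK: the variable names `DEEPFIX_API_URL` and `DEEPFIX_TIMEOUT` are assumptions made for illustration, and only the `api_url` and `timeout` parameters and their defaults come from the signature above.

```python
import os

# Defaults copied from the documented __init__ signature.
DEFAULT_API_URL = "http://localhost:8844"
DEFAULT_TIMEOUT = 30


def client_kwargs_from_env(env=None):
    """Build keyword arguments for DeepFixClient from environment variables.

    DEEPFIX_API_URL and DEEPFIX_TIMEOUT are hypothetical names chosen for
    this sketch; the SDK is not documented as reading them itself.
    """
    env = os.environ if env is None else env
    timeout = int(env.get("DEEPFIX_TIMEOUT", DEFAULT_TIMEOUT))
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return {
        "api_url": env.get("DEEPFIX_API_URL", DEFAULT_API_URL),
        "timeout": timeout,
    }
```

The result can then be unpacked into the constructor, for example `DeepFixClient(**client_kwargs_from_env())`.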
diagnose(dataset_name, language='english', model_name=None)
Analyze a previously ingested dataset and return diagnostic results with recommendations.
This method analyzes the specified dataset to identify potential issues and quality problems, and returns AI-powered recommendations for improvement.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_name` | `str` | Name of the dataset to analyze. Must match a dataset that has been previously ingested. | required |
| `language` | `str` | Language for analysis output. Defaults to "english". | `'english'` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
Returns:
| Type | Description |
|---|---|
| `APIResponse` | Response object containing analysis results and findings, quality metrics, actionable recommendations, and dataset statistics. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If dataset artifacts cannot be found for the specified dataset. |
| `Exception` | If the analysis request fails (non-200 status code). |
Example
    response = client.diagnose(dataset_name="my-dataset")
    print(response.to_text())
Source code in deepfix-sdk\src\deepfix_sdk\client.py
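Because `diagnose` is documented to raise `ValueError` when no artifacts exist for the dataset, callers may want to treat that case separately from a failed request. The following is a minimal sketch under that assumption; `diagnose_or_none` is a hypothetical helper name, and `client` can be any object that exposes the documented `diagnose()` method.

```python
def diagnose_or_none(client, dataset_name, language="english"):
    """Return the diagnosis as text, or None if the dataset is not ingested.

    Hypothetical convenience wrapper; not part of the SDK.
    """
    try:
        response = client.diagnose(dataset_name=dataset_name, language=language)
    except ValueError:
        # Documented case: dataset artifacts cannot be found.
        return None
    # Any other exception (for example a non-200 response) propagates
    # to the caller unchanged.
    return response.to_text()
```

Returning `None` only for the missing-dataset case keeps server-side failures visible instead of silently swallowing them.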
get_dataset_names(status=None)
Convenience method returning only dataset names for UI dropdowns.
Source code in deepfix-sdk\src\deepfix_sdk\client.py
get_diagnosis(train_data, test_data=None, model=None, model_name=None, batch_size=8, language='english')
Ingest and diagnose a model in a single operation.
This convenience method combines ingestion and diagnosis into a single call. It first ingests the dataset and model (if provided), then immediately runs diagnosis on them to get analysis results and recommendations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `train_data` | `BaseDataset` | Training dataset to ingest. Must be an instance of an appropriate dataset class (e.g., ImageClassificationDataset, TabularDataset, NLPDataset). | required |
| `test_data` | `BaseDataset` | Test/validation dataset. If provided, enables cross-dataset validation checks. Defaults to None. | `None` |
| `model` | `Any` | Model to ingest. Must be an instance of a model class. Defaults to None. | `None` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
| `batch_size` | `int` | Batch size for processing the dataset. Defaults to 8. | `8` |
| `language` | `str` | Language for analysis output. Defaults to "english". | `'english'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `APIResponse` | `APIResponse` | Response object containing analysis results and findings, and actionable recommendations. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a dataset with the same name exists and overwrite=False, or if dataset artifacts cannot be found after ingestion. |
| `Exception` | If ingestion fails, or if the analysis request fails (non-200 status code). |
Example
    from deepfix_sdk.data import TabularDataset
    import pandas as pd

    df = pd.read_csv("train.csv")
    label = "target"
    cat_features = ["cat_feature1", "cat_feature2"]
    dataset_name = "my-dataset"

    train_dataset = TabularDataset(
        dataset=df,
        dataset_name=dataset_name,
        label=label,
        cat_features=cat_features,
    )
    response = client.get_diagnosis(
        model_name="my-model",
        train_data=train_dataset,
        batch_size=16,
    )
    print(response.to_text())
Source code in deepfix-sdk\src\deepfix_sdk\client.py
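The description above says this method ingests first and then diagnoses. A rough equivalent written against the two documented methods might look like the sketch below. It is an illustration based on the documented signatures only, and the SDK's actual implementation may differ; `ingest_then_diagnose` is a hypothetical name.

```python
def ingest_then_diagnose(client, train_data, test_data=None, model=None,
                         model_name=None, batch_size=8, language="english"):
    """Two-step equivalent of get_diagnosis, using ingest() then diagnose().

    Sketch only: `client` is any object with the documented ingest() and
    diagnose() methods.
    """
    # Step 1: upload the data (and the model, if one is given).
    client.ingest(
        train_data=train_data,
        test_data=test_data,
        model=model,
        model_name=model_name,
        batch_size=batch_size,
    )
    # Step 2: diagnose by name. The name comes from the dataset object's
    # dataset_name attribute, as described for ingest().
    return client.diagnose(
        dataset_name=train_data.dataset_name,
        language=language,
        model_name=model_name,
    )
```

Splitting the call this way can be useful when ingestion and diagnosis need to happen at different times, for example ingesting once and diagnosing in several languages.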
ingest(train_data, test_data=None, model=None, model_name=None, batch_size=8, overwrite=False)
Ingest a dataset with optional quality validation.
This method uploads a dataset to the DeepFix server and optionally performs validation checks on the data. It supports multiple data types, including images, tabular data, NLP text, and general vision datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `train_data` | `BaseDataset` | Training dataset to ingest. Must be an instance of an appropriate dataset class (e.g., ImageClassificationDataset, TabularDataset, NLPDataset). The dataset name is extracted from the dataset_name attribute of this object. | required |
| `test_data` | `BaseDataset` | Test/validation dataset. If provided, enables cross-dataset validation checks. Defaults to None. | `None` |
| `model` | `Any` | Model to ingest. Must be an instance of a model class. Defaults to None. | `None` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
| `batch_size` | `int` | Batch size for processing the dataset. Defaults to 8. | `8` |
| `overwrite` | `bool` | If True, overwrite an existing dataset with the same name. If False, raise an error if the dataset exists. Defaults to False. | `False` |
|
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a dataset with the same name exists and overwrite=False. |
| `Exception` | If data validation fails or ingestion fails. |
Example
    from deepfix_sdk.data.datasets import TabularDataset
    import pandas as pd

    df = pd.read_csv("train.csv")
    train_dataset = TabularDataset(
        dataset_name="my-dataset",
        data=df,
    )
    client.ingest(
        train_data=train_dataset,
        batch_size=16,
    )
Source code in deepfix-sdk\src\deepfix_sdk\client.py
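Since `ingest` raises `ValueError` when the name is already taken and `overwrite=False`, a caller can check for the name first instead of catching the error. The sketch below assumes that `get_dataset_names()` returns an iterable of name strings, which the reference implies but does not spell out; `ingest_if_missing` is a hypothetical helper.

```python
def ingest_if_missing(client, train_data, **ingest_kwargs):
    """Ingest train_data only if its name is not already registered.

    Returns True if ingestion ran, False if it was skipped.
    Assumes client.get_dataset_names() yields plain name strings.
    """
    if train_data.dataset_name in client.get_dataset_names():
        # Name already registered: skip rather than trigger the
        # documented ValueError for overwrite=False.
        return False
    client.ingest(train_data=train_data, **ingest_kwargs)
    return True
```

Note that the check and the upload are two separate requests, so this does not protect against another process ingesting the same name in between.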
list_datasets(status=None)
List datasets that have been ingested and are available for diagnosis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `status` | `ArtifactStatus \| str \| None` | Optional filter by artifact status. | `None` |
|
Returns:
| Type | Description |
|---|---|
| `list[dict[str, Any]]` | List of dictionaries describing available datasets. Each record contains: `dataset_name` (registered run/dataset name); `status` (artifact registration status); `mlflow_run_id` (associated MLflow run, if any); `local_path` (path to the cached artifact on disk, if downloaded); `updated_at` / `created_at` (ISO 8601 timestamps for auditing). |
Source code in deepfix-sdk\src\deepfix_sdk\client.py
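The records returned here are plain dictionaries, so they can be filtered and sorted with ordinary Python. The sketch below works on the documented keys only and assumes that `updated_at` is always present and that all timestamps are either timezone-aware or all naive; `most_recent` and `cached_only` are hypothetical helper names.

```python
from datetime import datetime


def _parse_iso(timestamp):
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def most_recent(records, limit=5):
    """Return up to `limit` records, newest `updated_at` first."""
    ordered = sorted(records, key=lambda r: _parse_iso(r["updated_at"]),
                     reverse=True)
    return ordered[:limit]


def cached_only(records):
    """Keep only records whose artifact is already on local disk."""
    return [r for r in records if r.get("local_path")]
```

For example, `most_recent(client.list_datasets(), limit=3)` would give the three most recently updated datasets.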
Configuration
MLflowConfig
Bases: BaseModel
Configuration for MLflow integration.
Attributes:
| Name | Type | Description |
|---|---|---|
| `tracking_uri` | `str` | MLflow tracking server URI. Must start with http://, https://, or file://. |
| `run_id` | `Optional[str]` | Optional MLflow run ID to analyze. |
| `download_dir` | `str` | Local directory for downloading artifacts. |
| `create_run_if_not_exists` | `bool` | Whether to create the run if it doesn't exist. Defaults to False. |
| `experiment_name` | `str` | MLflow experiment name for deepfix. |
| `trace_dspy` | `bool` | Whether to trace dspy requests. Defaults to True. |
Source code in deepfix-sdk\src\deepfix_sdk\config.py
validate_tracking_uri(v)
classmethod
Validate tracking URI format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `v` | `str` | Tracking URI string to validate. | required |
Returns:
| Type | Description |
|---|---|
| `str` | Validated tracking URI. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If URI doesn't start with http://, https://, or file://. |
Source code in deepfix-sdk\src\deepfix_sdk\config.py
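The rule described above can be checked before a config object is ever constructed. The following standalone function is a sketch of that prefix check, written from the description only; it is not the SDK's validator, and the name `check_tracking_uri` is hypothetical.

```python
# Prefixes listed in the documentation for tracking_uri.
ALLOWED_PREFIXES = ("http://", "https://", "file://")


def check_tracking_uri(uri: str) -> str:
    """Return `uri` unchanged if it has an accepted prefix, else raise.

    Standalone re-statement of the documented rule, for illustration.
    """
    # str.startswith accepts a tuple and matches any of its members.
    if not uri.startswith(ALLOWED_PREFIXES):
        raise ValueError(
            f"tracking URI must start with one of {ALLOWED_PREFIXES}: {uri!r}"
        )
    return uri
```

Under this rule a bare filesystem path such as `./mlruns` is rejected, so local stores would need the `file://` form.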
ArtifactConfig
Bases: BaseModel
Configuration for artifact management.
Attributes:
| Name | Type | Description |
|---|---|---|
| `load_training` | `bool` | Whether to load training artifacts. Defaults to False. |
| `load_checks` | `bool` | Whether to load Deepchecks artifacts. Defaults to True. |
| `load_dataset_metadata` | `bool` | Whether to load dataset metadata. Defaults to True. |
| `load_model_checkpoint` | `bool` | Whether to load model checkpoint. Defaults to True. |
| `download_if_missing` | `bool` | Whether to download artifacts if not locally cached. Defaults to True. |
| `cache_enabled` | `bool` | Whether to enable local caching. Defaults to True. |
| `sqlite_path` | `str` | Path to SQLite database for artifact caching. |
Source code in deepfix-sdk\src\deepfix_sdk\config.py
Datasets
BaseDataset
Bases: Protocol
Source code in deepfix-sdk\src\deepfix_sdk\data\datasets.py
ImageClassificationDataset
Bases: VisionDataset
Source code in deepfix-sdk\src\deepfix_sdk\data\datasets.py
TabularDataset
Bases: BaseDataset
Source code in deepfix-sdk\src\deepfix_sdk\data\datasets.py
NLPDataset
Bases: BaseDataset
Source code in deepfix-sdk\src\deepfix_sdk\data\datasets.py
Pipelines
ArtifactLoadingPipeline
Bases: Pipeline
Source code in deepfix-sdk\src\deepfix_sdk\pipelines\factory.py
Examples
See the Quickstart Guide for usage examples.