SDK API Reference
Complete API documentation for the DeepFix SDK client library.
DeepFixClient
Main client for interacting with the DeepFix server.
This client provides a high-level interface for diagnosing ML datasets, ingesting data with quality checks, and leveraging AI-powered recommendations to improve your ML workflows.
Attributes:
| Name | Type | Description |
|---|---|---|
| `mlflow_config` | `MLflowConfig` | Configuration for MLflow integration. |
| `api_url` | `str` | Base URL of the DeepFix server. |
| `timeout` | `int` | Request timeout in seconds. |
Source code in deepfix-sdk\src\deepfix_sdk\client.py
__init__(api_url='http://localhost:8844', mlflow_config=None, artifact_config=None, timeout=30)
Initialize the DeepFixClient.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `api_url` | `str` | URL of the DeepFix server. Defaults to "http://localhost:8844". | `'http://localhost:8844'` |
| `mlflow_config` | `MLflowConfig` | MLflow configuration for experiment tracking. If not provided, a default MLflowConfig is created. Defaults to None. | `None` |
| `artifact_config` | `ArtifactConfig` | Artifact cache configuration used to discover stored datasets/models. Defaults to None. | `None` |
| `timeout` | `int` | Request timeout in seconds. Defaults to 30. | `30` |
Example

    from deepfix_sdk.client import DeepFixClient

    client = DeepFixClient(
        api_url="http://localhost:8844",
        timeout=120,
    )
diagnose(dataset_name, language='english', model_name=None)
Analyze a run and return diagnostic results with recommendations.
This method performs a comprehensive analysis of the specified run to identify potential issues and quality problems, and returns AI-powered recommendations for improvement.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_name` | `str` | Name of the dataset to analyze. Must match a dataset that has been previously ingested. | required |
| `language` | `str` | Language for analysis output. Defaults to "english". | `'english'` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `APIResponse` | `APIResponse` | Response object containing analysis results and findings, quality metrics, actionable recommendations, and dataset statistics. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If dataset artifacts cannot be found for the specified dataset. |
| `Exception` | If the analysis request fails (non-200 status code). |
Example

    response = client.diagnose(dataset_name="my-dataset")
    print(response.to_text())
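The two failure modes listed under Raises can be handled separately. The sketch below is a minimal illustration and not part of the SDK: `diagnose_or_none` is a hypothetical helper, and `_StubClient` is a stand-in for `DeepFixClient` so the snippet runs without a server. It assumes only what the Raises table states, namely that `diagnose` raises `ValueError` when artifacts for the dataset cannot be found.

```python
from typing import Any, Optional


def diagnose_or_none(client: Any, dataset_name: str, language: str = "english") -> Optional[Any]:
    """Return the diagnosis, or None if the dataset has not been ingested.

    Hypothetical helper: only ValueError (missing artifacts) is swallowed;
    any other exception, such as a failed request, still propagates.
    """
    try:
        return client.diagnose(dataset_name=dataset_name, language=language)
    except ValueError:
        return None


class _StubClient:
    """Stand-in for DeepFixClient, used only to make this sketch runnable."""

    def __init__(self, known_datasets):
        self._known = set(known_datasets)

    def diagnose(self, dataset_name, language="english", model_name=None):
        if dataset_name not in self._known:
            raise ValueError(f"No artifacts found for dataset {dataset_name!r}")
        return {"dataset_name": dataset_name, "language": language}


stub = _StubClient(known_datasets=["my-dataset"])
found = diagnose_or_none(stub, "my-dataset")
missing = diagnose_or_none(stub, "not-ingested")
```

With a real client, a `None` result would suggest calling `ingest` first, while a propagated exception points at the request itself.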
get_dataset_names(status=None)
Convenience method returning only dataset names for UI dropdowns.
get_diagnosis(train_data, test_data=None, model=None, model_name=None, batch_size=8, language='english')
Ingest and diagnose a model in a single operation.
This convenience method combines ingestion and diagnosis into a single call. It first ingests the dataset and model (if provided), then immediately runs diagnosis on them to get analysis results and recommendations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `train_data` | `BaseDataset` | Training dataset to ingest. Must be an instance of an appropriate dataset class (e.g., ImageClassificationDataset, TabularDataset, NLPDataset). | required |
| `test_data` | `BaseDataset` | Test/validation dataset. If provided, enables cross-dataset validation checks. Defaults to None. | `None` |
| `model` | `Any` | Model to ingest. Must be an instance of a model class. Defaults to None. | `None` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
| `batch_size` | `int` | Batch size for processing the dataset. Defaults to 8. | `8` |
| `language` | `str` | Language for analysis output. Defaults to "english". | `'english'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `APIResponse` | `APIResponse` | Response object containing analysis results and findings, and actionable recommendations. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If dataset with the same name exists and overwrite=False, or if dataset artifacts cannot be found after ingestion. |
| `Exception` | If ingestion fails, or if the analysis request fails (non-200 status code). |
Example

    from deepfix_sdk.data import TabularDataset
    import pandas as pd

    df = pd.read_csv("train.csv")
    label = "target"
    cat_features = ["cat_feature1", "cat_feature2"]
    dataset_name = "my-dataset"
    train_dataset = TabularDataset(
        dataset=df,
        dataset_name=dataset_name,
        label=label,
        cat_features=cat_features,
    )
    response = client.get_diagnosis(
        model_name="my-model",
        train_data=train_dataset,
        batch_size=16,
    )
    print(response.to_text())
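Since `get_diagnosis` is described as ingestion followed by diagnosis, the same flow can be spelled out as two calls. The sketch below is an assumption about the equivalent sequence, not the SDK's actual implementation: `ingest_then_diagnose` is a hypothetical helper, `_RecordingClient` is a stand-in that records calls instead of contacting a server, and the `SimpleNamespace` object is a placeholder for a real dataset. How `get_diagnosis` sets `overwrite` internally is not documented, so the sketch leaves it at the `ingest` default.

```python
from types import SimpleNamespace
from typing import Any, Optional


def ingest_then_diagnose(
    client: Any,
    train_data: Any,
    test_data: Optional[Any] = None,
    model: Optional[Any] = None,
    model_name: Optional[str] = None,
    batch_size: int = 8,
    language: str = "english",
) -> Any:
    """Assumed two-step equivalent of get_diagnosis (not the SDK's own code)."""
    client.ingest(
        train_data=train_data,
        test_data=test_data,
        model=model,
        model_name=model_name,
        batch_size=batch_size,
    )
    # The dataset name is read from the dataset object, as ingest() documents.
    return client.diagnose(
        dataset_name=train_data.dataset_name,
        language=language,
        model_name=model_name,
    )


class _RecordingClient:
    """Stand-in for DeepFixClient that records calls in order."""

    def __init__(self):
        self.calls = []

    def ingest(self, **kwargs):
        self.calls.append(("ingest", kwargs["train_data"].dataset_name))

    def diagnose(self, dataset_name, language="english", model_name=None):
        self.calls.append(("diagnose", dataset_name))
        return {"dataset_name": dataset_name, "model_name": model_name}


recorder = _RecordingClient()
train = SimpleNamespace(dataset_name="my-dataset")  # placeholder for a real dataset object
result = ingest_then_diagnose(recorder, train, model_name="my-model", batch_size=16)
```

Calling the two steps separately is mainly useful when the same ingested dataset will be diagnosed more than once, for example in several output languages.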
ingest(train_data, test_data=None, model=None, model_name=None, batch_size=8, overwrite=False)
Ingest a dataset with optional quality validation.
This method uploads a dataset to the DeepFix server and optionally performs validation checks on the data. Supports multiple data types including images, tabular data, NLP text, and general vision datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `train_data` | `BaseDataset` | Training dataset to ingest. Must be an instance of an appropriate dataset class (e.g., ImageClassificationDataset, TabularDataset, NLPDataset). The dataset name is extracted from the dataset_name attribute of this object. | required |
| `test_data` | `BaseDataset` | Test/validation dataset. If provided, enables cross-dataset validation checks. Defaults to None. | `None` |
| `model` | `Any` | Model to ingest. Must be an instance of a model class. Defaults to None. | `None` |
| `model_name` | `str` | Name of the model. Defaults to None. | `None` |
| `batch_size` | `int` | Batch size for processing the dataset. Defaults to 8. | `8` |
| `overwrite` | `bool` | If True, overwrite existing dataset with the same name. If False, raise an error if dataset exists. Defaults to False. | `False` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If dataset with the same name exists and overwrite=False. |
| `Exception` | If data validation fails or ingestion fails. |
Example

    from deepfix_sdk.data.datasets import TabularDataset
    import pandas as pd

    df = pd.read_csv("train.csv")
    train_dataset = TabularDataset(
        dataset_name="my-dataset",
        data=df,
    )
    client.ingest(
        train_data=train_dataset,
        batch_size=16,
    )
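The `overwrite` rule is easiest to see in isolation. The following toy registry is not SDK code; it only mimics the documented behavior (a second ingest under the same name raises `ValueError` unless `overwrite=True`). The class name, its methods, and the string payloads are all made up for illustration.

```python
from typing import Any, Dict


class _InMemoryRegistry:
    """Toy registry that mimics the documented overwrite rule; not SDK code."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Any] = {}

    def ingest(self, dataset_name: str, payload: Any, overwrite: bool = False) -> None:
        if dataset_name in self._datasets and not overwrite:
            raise ValueError(
                f"Dataset {dataset_name!r} already exists; pass overwrite=True to replace it"
            )
        self._datasets[dataset_name] = payload

    def get(self, dataset_name: str) -> Any:
        return self._datasets[dataset_name]


registry = _InMemoryRegistry()
registry.ingest("my-dataset", payload="v1")

# A second ingest under the same name is rejected by default.
try:
    registry.ingest("my-dataset", payload="v2")
    second_ingest_error = None
except ValueError as exc:
    second_ingest_error = exc

# With overwrite=True the existing entry is replaced.
registry.ingest("my-dataset", payload="v3", overwrite=True)
```

In practice this means re-running an ingestion script against the same `dataset_name` needs either `overwrite=True` or a new name.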
list_datasets(status=None)
List datasets that have been ingested and are available for diagnosis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `status` | `ArtifactStatus \| str \| None` | Optional filter by artifact status. | `None` |
Returns:
| Type | Description |
|---|---|
| `list[dict[str, Any]]` | List of dictionaries describing available datasets. Each record contains: `dataset_name` (registered run/dataset name), `status` (artifact registration status), `mlflow_run_id` (associated MLflow run, if any), `local_path` (path to cached artifact on disk, if downloaded), and `updated_at` / `created_at` (ISO8601 timestamps for auditing). |
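Given records in this shape, reducing them to a list of names (roughly what `get_dataset_names` is described as doing) takes one comprehension. The sketch below is an illustration only: `dataset_names` is a hypothetical helper, and the sample records, including the status strings `"registered"` and `"downloaded"`, are invented values rather than the SDK's actual `ArtifactStatus` members.

```python
from typing import Any, Dict, List, Optional


def dataset_names(records: List[Dict[str, Any]], status: Optional[str] = None) -> List[str]:
    """Extract dataset names, optionally keeping only records with a given status."""
    return [
        record["dataset_name"]
        for record in records
        if status is None or record.get("status") == status
    ]


# Hypothetical records in the documented shape; all values are made up.
sample_records = [
    {"dataset_name": "train-v1", "status": "registered", "mlflow_run_id": None,
     "local_path": None, "created_at": "2024-01-01T00:00:00", "updated_at": "2024-01-01T00:00:00"},
    {"dataset_name": "train-v2", "status": "downloaded", "mlflow_run_id": "abc123",
     "local_path": "/tmp/train-v2", "created_at": "2024-01-02T00:00:00", "updated_at": "2024-01-03T00:00:00"},
]

all_names = dataset_names(sample_records)
downloaded_names = dataset_names(sample_records, status="downloaded")
```

With a live client, `client.list_datasets()` would supply `records` in place of the sample list.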
Configuration
MLflowConfig
Bases: BaseModel
Configuration for MLflow integration.
Attributes:
| Name | Type | Description |
|---|---|---|
| `tracking_uri` | `str` | MLflow tracking server URI. Must start with http://, https://, or file://. |
| `run_id` | `Optional[str]` | Optional MLflow run ID to analyze. |
| `download_dir` | `str` | Local directory for downloading artifacts. |
| `create_run_if_not_exists` | `bool` | Whether to create the run if it doesn't exist. Defaults to False. |
| `experiment_name` | `str` | MLflow experiment name for deepfix. |
| `trace_dspy` | `bool` | Whether to trace dspy requests. Defaults to True. |
Source code in deepfix-sdk\src\deepfix_sdk\config.py
validate_tracking_uri(v)
classmethod
Validate tracking URI format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `v` | `str` | Tracking URI string to validate. | required |
Returns:
| Type | Description |
|---|---|
| `str` | Validated tracking URI. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If URI doesn't start with http://, https://, or file://. |
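The prefix rule described here can be reproduced with a plain function. This is a standalone sketch of the documented check, not the validator from `config.py`: the real one is a classmethod on a `BaseModel` subclass, and both the `ALLOWED_SCHEMES` constant and the error message wording below are assumptions.

```python
ALLOWED_SCHEMES = ("http://", "https://", "file://")


def validate_tracking_uri(v: str) -> str:
    """Return the URI unchanged if it uses an allowed scheme, else raise ValueError."""
    # str.startswith accepts a tuple and matches any of its prefixes.
    if not v.startswith(ALLOWED_SCHEMES):
        raise ValueError(
            f"tracking_uri must start with one of {ALLOWED_SCHEMES}, got {v!r}"
        )
    return v


ok_uri = validate_tracking_uri("http://localhost:5000")

# A URI with any other scheme is rejected.
try:
    validate_tracking_uri("sqlite:///mlflow.db")
    rejected = False
except ValueError:
    rejected = True
```

One consequence of the rule as documented is that bare local paths and database URIs would be rejected, so a local store would need the `file://` form.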
ArtifactConfig
Bases: BaseModel
Configuration for artifact management.
Attributes:
| Name | Type | Description |
|---|---|---|
| `load_training` | `bool` | Whether to load training artifacts. Defaults to False. |
| `load_checks` | `bool` | Whether to load Deepchecks artifacts. Defaults to True. |
| `load_dataset_metadata` | `bool` | Whether to load dataset metadata. Defaults to True. |
| `load_model_checkpoint` | `bool` | Whether to load model checkpoint. Defaults to True. |
| `download_if_missing` | `bool` | Whether to download artifacts if not locally cached. Defaults to True. |
| `cache_enabled` | `bool` | Whether to enable local caching. Defaults to True. |
| `sqlite_path` | `str` | Path to SQLite database for artifact caching. |
Datasets
BaseDataset
Bases: Protocol
Source code in deepfix-sdk\src\deepfix_sdk\data\datasets.py
ImageClassificationDataset
Bases: VisionDataset
TabularDataset
Bases: BaseDataset
NLPDataset
Bases: BaseDataset
Pipelines
ArtifactLoadingPipeline
Bases: Pipeline
Source code in deepfix-sdk\src\deepfix_sdk\pipelines\factory.py
Examples
See the Quickstart Guide for usage examples.