import os
from deepfix_sdk import DeepFixClient

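# API key for the DeepFix service, set through an environment variable
os.environ["DEEPFIX_API_KEY"] = "sk-empty"

# Create the client used for every diagnosis request below
client = DeepFixClient(api_url="https://deepfix.delcaux.com", timeout=120)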

Computer vision

Image classification

from deepfix_sdk.data.datasets import ImageClassificationDataset
from deepfix_sdk.zoo.datasets.foodwaste import load_train_and_val_datasets

dataset_name = "cafetaria-foodwaste-lstroetmann"
# Load image datasets
train_data, val_data = load_train_and_val_datasets(
    image_size=448,
    batch_size=8,
    num_workers=4,
    pin_memory=False,
)
# Wrap the loaded datasets so the client can analyze them
train_data = ImageClassificationDataset(dataset_name=dataset_name, dataset=train_data)
val_data = ImageClassificationDataset(dataset_name=dataset_name, dataset=val_data)

Getting label mapping: 100%|██████████| 375/375 [00:03<00:00, 107.85it/s]

result = client.get_diagnosis(
    train_data=train_data,
    test_data=val_data,
    language="english",
)

Computing dataset base statistics: 100%|██████████| 215/215 [00:06<00:00, 33.33it/s]
Computing dataset base statistics: 100%|██████████| 160/160 [00:11<00:00, 14.06it/s]

# Visualize results
result.to_text()

╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                               DEEPFIX ANALYSIS RESULT                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────────────────────────── Summary ───────────────────────────────────────────────────────╮
│ The cross-artifact analysis reveals catastrophic data quality issues that invalidate the current machine learning    │
│ setup. The test set suffers from severe label distribution drift (Cramer's V=0.92) and contains 75% new labels not   │
│ seen in training, indicating fundamental problems with data partitioning. Additionally, significant differences in   │
│ image properties suggest inconsistent acquisition conditions. These issues collectively mean that any model          │
│ evaluation would be unreliable. Immediate remediation of the data splitting methodology and image standardization is │
│ required before proceeding with model development.                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

                                      Summary Statistics                                      
 Metric                          Value                                                        
 Total Findings                  3                                                            
 Severity Distribution           HIGH: 2  MEDIUM: 1

                                  HIGH Severity Issues (2)                                   
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Critical data partitioning failure       │ Immediately halt model development and   │
│     │ causing unrepresentative test set        │ recreate the train-test split using      │
│     │ Evidence: Combined evidence from         │ proper stratified sampling techniques    │
│     │ Deepchecks: Label drift check failed     │ The test set is fundamentally invalid    │
│     │ with Cramer's V score of 0.92 (far       │ for evaluation due to severe label       │
│     │ exceeding 0.15 threshold) and 75% of     │ distribution mismatch and leakage,       │
│     │ test set labels were not present in      │ making any model performance metrics     │
│     │ training                                 │ meaningless                              │
│ 2   │ Systematic differences in image          │ Standardize image collection protocols   │
│     │ acquisition conditions between datasets  │ and apply normalization techniques to    │
│     │ Evidence: Multiple image property drift  │ align visual characteristics             │
│     │ failures: Brightness (KS=0.42), RMS      │ Large differences in brightness,         │
│     │ Contrast (KS=0.5), Red Intensity         │ contrast, and color properties will      │
│     │ (KS=0.83), Green Intensity (KS=0.82),    │ cause models to learn dataset-specific   │
│     │ Blue Intensity (KS=0.96)                 │ artifacts rather than generalizable      │
│     │                                          │ features                                 │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘
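The first HIGH finding rests on two numbers: a Cramer's V of 0.92 between the train and test label distributions, and 75% of test labels missing from training. The sketch below shows one way to reproduce both quantities from plain label lists. It is a minimal, uncorrected Cramer's V on a 2 x k contingency table (split x label); Deepchecks may apply a bias correction, so its values can differ. The function names `cramers_v` and `unseen_label_fraction` are hypothetical helpers, not part of `deepfix_sdk`, and the 75% figure is read here as the share of distinct test labels, which is an assumption.

```python
from collections import Counter
from math import sqrt


def cramers_v(train_labels, test_labels):
    """Uncorrected Cramer's V between the split (train/test) and the label."""
    train_counts = Counter(train_labels)
    test_counts = Counter(test_labels)
    categories = set(train_counts) | set(test_counts)
    n_train = sum(train_counts.values())
    n_test = sum(test_counts.values())
    n = n_train + n_test
    # With an empty split or a single label there is no association to measure.
    if n_train == 0 or n_test == 0 or len(categories) < 2:
        return 0.0
    chi2 = 0.0
    for category in categories:
        column_total = train_counts[category] + test_counts[category]
        for observed, row_total in (
            (train_counts[category], n_train),
            (test_counts[category], n_test),
        ):
            expected = row_total * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    # For a 2 x k table, min(rows - 1, cols - 1) is 1, so V = sqrt(chi2 / n).
    return sqrt(chi2 / n)


def unseen_label_fraction(train_labels, test_labels):
    """Share of distinct test labels that never occur in the training labels."""
    test_set = set(test_labels)
    if not test_set:
        return 0.0
    return len(test_set - set(train_labels)) / len(test_set)


if __name__ == "__main__":
    # Hypothetical toy labels, not taken from the food waste dataset.
    train = ["rice", "rice", "bread", "bread"]
    test = ["rice", "soup", "salad", "pasta"]
    print(cramers_v(train, test))
    print(unseen_label_fraction(train, test))
```

A value near 0 means both splits share the same label proportions, while a value near 1 means the label almost determines which split a sample came from.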

                                 MEDIUM Severity Issues (1)                                  
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Incomplete data quality assessment       │ Implement comprehensive data validation  │
│     │ framework                                │ pipeline including outlier detection,    │
│     │ Evidence: DatasetArtifactsAnalyzer       │ label consistency checks, and metadata   │
│     │ failed due to technical issues, and      │ validation                               │
│     │ Deepchecks data integrity section was    │ Current assessment gaps prevent          │
│     │ incomplete                               │ identification of additional data        │
│     │                                          │ quality issues that could impact model   │
│     │                                          │ reliability and performance              │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘
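The report above recommends recreating the train-test split with stratified sampling. As a rough illustration of that idea, the sketch below splits sample indices so that every class contributes the same proportion to each side. `stratified_split` is a hypothetical helper written with the standard library only; it is not a `deepfix_sdk` function, and in practice a library routine such as scikit-learn's `train_test_split(..., stratify=labels)` is the usual choice.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    rng = random.Random(seed)  # local generator keeps the split reproducible
    train_indices, test_indices = [], []
    for label in sorted(indices_by_class, key=str):
        indices = indices_by_class[label]
        rng.shuffle(indices)
        if len(indices) < 2:
            # A class with a single sample stays in training, so the test
            # set never contains a label the model has not seen.
            n_test = 0
        else:
            n_test = round(len(indices) * test_fraction)
            # Keep at least one sample of the class on each side.
            n_test = min(max(n_test, 1), len(indices) - 1)
        test_indices.extend(indices[:n_test])
        train_indices.extend(indices[n_test:])
    return sorted(train_indices), sorted(test_indices)


if __name__ == "__main__":
    # Hypothetical toy labels: 80 samples of one class, 20 of another.
    toy_labels = ["plate"] * 80 + ["tray"] * 20
    train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.25)
    print(len(train_idx), len(test_idx))
```

Because each class is split on its own, every label in the test indices also appears in the training indices, which addresses the unseen-label problem flagged in the first HIGH finding.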

Object detection

from deepfix_sdk.data.datasets import ObjectDetectionDataset

dataset_name = "general_dataset"
train_data = ObjectDetectionDataset.from_coco(
    dataset_name=dataset_name,
    images_directory_path=r"D:\workspace\general_dataset\coco\train",
    annotations_path=r"D:\workspace\general_dataset\coco\annotations\annotations_train.json",
)
val_data = ObjectDetectionDataset.from_coco(
    dataset_name=dataset_name,
    images_directory_path=r"D:\workspace\general_dataset\coco\val",
    annotations_path=r"D:\workspace\general_dataset\coco\annotations\annotations_val.json",
)

result = client.get_diagnosis(
    train_data=train_data,
    test_data=val_data,
    language="english",
)

Computing dataset base statistics: 100%|██████████| 1356/1356 [00:20<00:00, 65.22it/s]
Computing base box statistics: 100%|██████████| 1356/1356 [00:00<00:00, 307780.52it/s]
Computing dataset base statistics: 100%|██████████| 668/668 [00:08<00:00, 76.87it/s]
Computing base box statistics: 100%|██████████| 668/668 [00:00<?, ?it/s]

UserWarning: Properties that have class_id as output_type will be skipped.

# Visualize results
result.to_text()

╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                               DEEPFIX ANALYSIS RESULT                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────────────────────────── Summary ───────────────────────────────────────────────────────╮
│ The consolidated analysis indicates that the primary risk of overfitting stems from data quality issues identified   │
│ by the Deepchecks analyzer, specifically an unstable feature-label relationship and significant brightness drift     │
│ between train and test sets. These issues suggest the model may overfit to spurious correlations and specific        │
│ lighting conditions. The failure of the DatasetArtifacts analyzer and the incomplete data integrity assessment in    │
│ the Deepchecks results mean that the overall data quality picture is incomplete, presenting a secondary,             │
│ medium-severity risk. Recommendations focus on addressing the identified instabilities and completing the missing    │
│ data integrity checks to ensure the model learns robust, generalizable patterns.                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

                                      Summary Statistics                                      
 Metric                          Value                                                        
 Total Findings                  2                                                            
 Severity Distribution           HIGH: 1  MEDIUM: 1

                                  HIGH Severity Issues (1)                                   
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Critical data quality and stability      │ Prioritize mitigating the identified     │
│     │ issues identified as primary overfitting │ feature-label instability and brightness │
│     │ risks                                    │ drift. Investigate the technical error   │
│     │ Evidence: Deepchecks analysis revealed a │ to enable a full DatasetArtifacts        │
│     │ high-severity unstable feature-label     │ analysis for a comprehensive view.       │
│     │ relationship (PPS difference: 0.21) and  │ The model is at high risk of overfitting │
│     │ significant image brightness drift (KS   │ due to learning non-generalizable        │
│     │ score: 0.29). The DatasetArtifacts       │ correlations and being sensitive to      │
│     │ analysis was unavailable due to a system │ lighting variations. A complete dataset  │
│     │ error, preventing a complete dataset     │ analysis is needed to rule out other     │
│     │ assessment.                              │ underlying data issues.                  │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘
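The brightness drift above is reported as a KS score (0.29). For readers unfamiliar with it, the two-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical cumulative distributions of two samples. The sketch below computes it for per-image brightness values, taking brightness as the mean grayscale intensity of an image. Both `mean_brightness` and `ks_statistic` are hypothetical helpers, and the exact brightness definition used by Deepchecks is an assumption here.

```python
from bisect import bisect_right


def mean_brightness(gray_image):
    """Mean intensity of a grayscale image given as rows of pixel values."""
    pixels = [value for row in gray_image for value in row]
    if not pixels:
        raise ValueError("image has no pixels")
    return sum(pixels) / len(pixels)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the maximum difference.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Hypothetical 2x2 grayscale images, darker in train than in val.
    train_images = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
    val_images = [[[110, 120], [130, 140]], [[150, 160], [170, 180]]]
    train_brightness = [mean_brightness(img) for img in train_images]
    val_brightness = [mean_brightness(img) for img in val_images]
    print(ks_statistic(train_brightness, val_brightness))
```

The statistic ranges from 0 (identical distributions) to 1 (no overlap at all), which is why the image classification report's values of 0.83 to 0.96 for the color channels indicate a far larger shift than the 0.29 seen here.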

                                 MEDIUM Severity Issues (1)                                  
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Incomplete data integrity validation     │ Run a full suite of data integrity       │
│     │ obscures potential data quality issues   │ checks, including outlier detection and  │
│     │ Evidence: The Deepchecks analysis noted  │ label validation, to identify any hidden │
│     │ an empty or missing data integrity       │ data quality problems.                   │
│     │ section, leaving outlier detection and   │ Unassessed data integrity issues can     │
│     │ label consistency unassessed. Combined   │ silently contribute to overfitting. A    │
│     │ with the failed DatasetArtifacts         │ complete assessment is crucial for       │
│     │ analysis, there is a gap in the overall  │ building a robust model.                 │
│     │ data quality evaluation.                 │                                          │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘

Semantic segmentation

from deepfix_sdk.data.datasets import SemanticSegmentationDataset
from deepfix_sdk.zoo.datasets import load_segmentation_dataset

dataset_name = "coco_segmentation"
train_data, val_data = load_segmentation_dataset(
    batch_size=8,
    shuffle=False,
    pin_memory=False,
)
train_data = SemanticSegmentationDataset(
    dataset_name=dataset_name, dataset=train_data.dataset
)
val_data = SemanticSegmentationDataset(
    dataset_name=dataset_name, dataset=val_data.dataset
)

result = client.get_diagnosis(
    train_data=train_data,
    test_data=val_data,
    language="english",
)

Computing dataset base statistics: 100%|██████████| 48/48 [00:03<00:00, 14.42it/s]
Computing dataset base statistics: 100%|██████████| 49/49 [00:03<00:00, 14.83it/s]

UserWarning: Properties that have class_id as output_type will be skipped.

# Visualize results
result.to_text()

╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                               DEEPFIX ANALYSIS RESULT                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────────────────────────── Summary ───────────────────────────────────────────────────────╮
│ The analysis reveals critical data quality issues primarily identified through Deepchecks validation. The most       │
│ severe problems involve significant distribution mismatches between training and test datasets, including class      │
│ imbalance (0.17 categorical drift) and color property differences (red: 0.2, green: 0.24 drift scores). These        │
│ distribution inconsistencies threaten model reliability and generalization. Additionally, gaps in the data           │
│ validation framework (evidenced by incomplete integrity checks and analyzer failures) suggest a need for stronger    │
│ quality assurance processes. Immediate attention should focus on rebalancing datasets and implementing comprehensive │
│ validation to ensure model performance reflects true capabilities rather than dataset artifacts.                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

                                      Summary Statistics                                      
 Metric                          Value                                                        
 Total Findings                  2                                                            
 Severity Distribution           HIGH: 1  MEDIUM: 1

                                  HIGH Severity Issues (1)                                   
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Critical data distribution               │ Implement comprehensive data             │
│     │ inconsistencies between training and     │ distribution analysis and rebalance      │
│     │ validation sets                          │ training/validation splits to ensure     │
│     │ Evidence: Deepchecks analysis shows      │ consistent class and feature             │
│     │ categorical drift score of 0.17 for      │ distributions                            │
│     │ 'Samples Per Class' (exceeding 0.15      │ Distribution mismatches between datasets │
│     │ threshold) and color property drifts     │ lead to unreliable model evaluation and  │
│     │ (Mean Red: 0.2, Mean Green: 0.24         │ poor generalization to real-world data   │
│     │ exceeding 0.2 threshold), indicating     │                                          │
│     │ significant distribution mismatches      │                                          │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘
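The color drift in this finding is reported on per-channel properties (Mean Red, Mean Green). To get a first feel for such a shift before running a full drift check, one can compare average channel intensities between the two splits. The sketch below does that for images given as rows of `(r, g, b)` tuples with values from 0 to 255. The helpers `channel_means` and `channel_gap` are hypothetical, and the resulting gap is a crude sanity check, not the drift score that Deepchecks reports.

```python
def channel_means(image):
    """Mean red, green and blue intensity of one image (rows of (r, g, b))."""
    totals = [0, 0, 0]
    pixel_count = 0
    for row in image:
        for red, green, blue in row:
            totals[0] += red
            totals[1] += green
            totals[2] += blue
            pixel_count += 1
    if pixel_count == 0:
        raise ValueError("image has no pixels")
    return tuple(total / pixel_count for total in totals)


def channel_gap(train_images, val_images):
    """Absolute difference of average channel means, scaled to the 0..1 range."""
    def average(images):
        per_image = [channel_means(image) for image in images]
        return [sum(m[c] for m in per_image) / len(per_image) for c in range(3)]

    if not train_images or not val_images:
        raise ValueError("both image lists must be non-empty")
    train_avg = average(train_images)
    val_avg = average(val_images)
    return tuple(abs(t - v) / 255 for t, v in zip(train_avg, val_avg))


if __name__ == "__main__":
    # Hypothetical single-pixel images: reddish train set, greenish val set.
    train_images = [[[(200, 50, 50)]], [[(220, 40, 60)]]]
    val_images = [[[(60, 210, 50)]], [[(40, 190, 60)]]]
    print(channel_gap(train_images, val_images))
```

A large gap in one channel suggests differing lighting or camera settings between the splits and is a cue to inspect the full per-image distributions.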

                                 MEDIUM Severity Issues (1)                                  
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ #   ┃ Finding                                  ┃ Action                                   ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1   │ Insufficient data quality validation     │ Establish robust data validation         │
│     │ framework                                │ pipeline with comprehensive integrity    │
│     │ Evidence: Deepchecks analysis indicates  │ checks, outlier detection, and automated │
│     │ incomplete data integrity validation     │ quality monitoring                       │
│     │ section, and DatasetArtifactsAnalyzer    │ Missing or incomplete validation         │
│     │ failed due to technical issues,          │ increases risk of undetected data        │
│     │ suggesting gaps in the data validation   │ quality issues that can compromise model │
│     │ pipeline                                 │ performance and reliability              │
└─────┴──────────────────────────────────────────┴──────────────────────────────────────────┘
