Computing dataset base statistics: 100%|██████████| 215/215 [00:06<00:00, 33.33it/s]
Computing dataset base statistics: 100%|██████████| 160/160 [00:11<00:00, 14.06it/s]
# Visualize results
result.to_text()
DEEPFIX ANALYSIS RESULT

Summary
The cross-artifact analysis reveals catastrophic data quality issues that invalidate the current machine learning
setup. The test set suffers from severe label distribution drift (Cramer's V=0.92) and contains 75% new labels not
seen in training, indicating fundamental problems with data partitioning. Additionally, significant differences in
image properties suggest inconsistent acquisition conditions. These issues collectively mean that any model
evaluation would be unreliable. Immediate remediation of the data splitting methodology and image standardization is
required before proceeding with model development.

Summary Statistics
Metric                 Value
Total Findings         3
Severity Distribution  HIGH: 2, MEDIUM: 1
HIGH Severity Issues (2)

# 1
Finding: Critical data partitioning failure causing unrepresentative test set
Evidence: Combined evidence from Deepchecks: Label drift check failed with Cramer's V score of 0.92 (far exceeding 0.15 threshold) and 75% of test set labels were not present in training
Action: Immediately halt model development and recreate the train-test split using proper stratified sampling techniques. The test set is fundamentally invalid for evaluation due to severe label distribution mismatch and leakage, making any model performance metrics meaningless.

# 2
Finding: Systematic differences in image acquisition conditions between datasets
Evidence: Multiple image property drift failures: Brightness (KS=0.42), RMS Contrast (KS=0.5), Red Intensity (KS=0.83), Green Intensity (KS=0.82), Blue Intensity (KS=0.96)
Action: Standardize image collection protocols and apply normalization techniques to align visual characteristics. Large differences in brightness, contrast, and color properties will cause models to learn dataset-specific artifacts rather than generalizable features.
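The label drift evidence above is reported as a Cramer's V score plus a share of unseen labels. The following is a minimal sketch of how such numbers can be computed from two plain lists of labels, assuming one label per sample. The function names `cramers_v` and `unseen_label_share` are hypothetical, and Deepchecks' own implementation may differ (for example by applying a bias correction or by counting samples rather than distinct classes).

```python
import math
from collections import Counter


def cramers_v(train_labels, test_labels):
    """Cramer's V between split membership (train/test) and label."""
    train_counts = Counter(train_labels)
    test_counts = Counter(test_labels)
    classes = set(train_counts) | set(test_counts)
    n_train, n_test = len(train_labels), len(test_labels)
    n = n_train + n_test
    if n_train == 0 or n_test == 0 or len(classes) < 2:
        return 0.0

    # Chi-squared statistic over the 2 x k contingency table (split x class).
    chi2 = 0.0
    for cls in classes:
        col_total = train_counts[cls] + test_counts[cls]
        for observed, row_total in (
            (train_counts[cls], n_train),
            (test_counts[cls], n_test),
        ):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected

    # With two rows, min(rows - 1, cols - 1) is 1, so V = sqrt(chi2 / n).
    return math.sqrt(chi2 / n)


def unseen_label_share(train_labels, test_labels):
    """Share of distinct test labels that never occur in training (assumed reading of the 75% figure)."""
    test_classes = set(test_labels)
    if not test_classes:
        return 0.0
    return len(test_classes - set(train_labels)) / len(test_classes)
```

A score of 0 means both splits have the same label proportions, and a score of 1 means the label fully determines which split a sample belongs to.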
MEDIUM Severity Issues (1)

# 1
Finding: Incomplete data quality assessment framework
Evidence: DatasetArtifactsAnalyzer failed due to technical issues, and Deepchecks data integrity section was incomplete
Action: Implement comprehensive data validation pipeline including outlier detection, label consistency checks, and metadata validation. Current assessment gaps prevent identification of additional data quality issues that could impact model reliability and performance.
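The first HIGH-severity action in this report recommends recreating the train-test split with stratified sampling. Below is a minimal standard-library sketch of that idea, assuming a classification dataset with exactly one label per image. The helper name `stratified_split` and its parameters are hypothetical and are not part of DeepFix or Deepchecks.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class at test_fraction."""
    rng = random.Random(seed)

    # Group sample indices by their label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        # Keep at least one sample of every class on the training side,
        # so the test set never contains a label unseen in training.
        n_test = min(n_test, len(indices) - 1)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```

Because every class is split separately, the label proportions of the two splits stay close to each other, and the guard on `n_test` keeps very small classes out of the test set instead of leaving them unseen in training.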
Computing dataset base statistics: 100%|██████████| 1356/1356 [00:20<00:00, 65.22it/s]
Computing base box statistics: 100%|██████████| 1356/1356 [00:00<00:00, 307780.52it/s]
Computing dataset base statistics: 100%|██████████| 668/668 [00:08<00:00, 76.87it/s]
Computing base box statistics: 100%|██████████| 668/668 [00:00<?, ?it/s]
UserWarning: Properties that have class_id as output_type will be skipped.
# Visualize results
result.to_text()
DEEPFIX ANALYSIS RESULT

Summary
The consolidated analysis indicates that the primary risk of overfitting stems from data quality issues identified
by the Deepchecks analyzer, specifically an unstable feature-label relationship and significant brightness drift
between train and test sets. These issues suggest the model may overfit to spurious correlations and specific
lighting conditions. The failure of the DatasetArtifacts analyzer and the incomplete data integrity assessment in
the Deepchecks results mean that the overall data quality picture is incomplete, presenting a secondary,
medium-severity risk. Recommendations focus on addressing the identified instabilities and completing the missing
data integrity checks to ensure the model learns robust, generalizable patterns.

Summary Statistics
Metric                 Value
Total Findings         2
Severity Distribution  HIGH: 1, MEDIUM: 1
HIGH Severity Issues (1)

# 1
Finding: Critical data quality and stability issues identified as primary overfitting risks
Evidence: Deepchecks analysis revealed a high-severity unstable feature-label relationship (PPS difference: 0.21) and significant image brightness drift (KS score: 0.29). The DatasetArtifacts analysis was unavailable due to a system error, preventing a complete dataset assessment.
Action: Prioritize mitigating the identified feature-label instability and brightness drift. Investigate the technical error to enable a full DatasetArtifacts analysis for a comprehensive view. The model is at high risk of overfitting due to learning non-generalizable correlations and being sensitive to lighting variations. A complete dataset analysis is needed to rule out other underlying data issues.
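The brightness drift above is reported as a KS score. As a rough illustration, a two-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical cumulative distributions of a property (here, per-image brightness) in the two splits. The sketch below assumes the brightness values have already been extracted into plain lists. The function name `ks_statistic` is hypothetical, and the exact drift computation inside Deepchecks is not shown in this output.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)

    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to compare them at every observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

The result lies between 0 (identical empirical distributions) and 1 (no overlap at all), which is the scale on which the 0.29 score in this report would be read under this assumption.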
MEDIUM Severity Issues (1)

# 1
Finding: Incomplete data integrity validation obscures potential data quality issues
Evidence: The Deepchecks analysis noted an empty or missing data integrity section, leaving outlier detection and label consistency unassessed. Combined with the failed DatasetArtifacts analysis, there is a gap in the overall data quality evaluation.
Action: Run a full suite of data integrity checks, including outlier detection and label validation, to identify any hidden data quality problems. Unassessed data integrity issues can silently contribute to overfitting. A complete assessment is crucial for building a robust model.
Computing dataset base statistics: 100%|██████████| 48/48 [00:03<00:00, 14.42it/s]
Computing dataset base statistics: 100%|██████████| 49/49 [00:03<00:00, 14.83it/s]
UserWarning: Properties that have class_id as output_type will be skipped.
# Visualize results
result.to_text()
DEEPFIX ANALYSIS RESULT

Summary
The analysis reveals critical data quality issues primarily identified through Deepchecks validation. The most
severe problems involve significant distribution mismatches between training and test datasets, including class
imbalance (0.17 categorical drift) and color property differences (red: 0.2, green: 0.24 drift scores). These
distribution inconsistencies threaten model reliability and generalization. Additionally, gaps in the data
validation framework (evidenced by incomplete integrity checks and analyzer failures) suggest a need for stronger
quality assurance processes. Immediate attention should focus on rebalancing datasets and implementing comprehensive
validation to ensure model performance reflects true capabilities rather than dataset artifacts.

Summary Statistics
Metric                 Value
Total Findings         2
Severity Distribution  HIGH: 1, MEDIUM: 1
HIGH Severity Issues (1)

# 1
Finding: Critical data distribution inconsistencies between training and validation sets
Evidence: Deepchecks analysis shows categorical drift score of 0.17 for 'Samples Per Class' (exceeding 0.15 threshold) and color property drifts (Mean Red: 0.2, Mean Green: 0.24 exceeding 0.2 threshold), indicating significant distribution mismatches
Action: Implement comprehensive data distribution analysis and rebalance training/validation splits to ensure consistent class and feature distributions. Distribution mismatches between datasets lead to unreliable model evaluation and poor generalization to real-world data.
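Before rebalancing the splits as the action above suggests, it can help to see which classes drive the 'Samples Per Class' mismatch. The following is a small sketch that compares the relative frequency of each class in two label lists. The name `class_share_gaps` is hypothetical, and the input is assumed to be a flat list of class labels (one entry per sample or per annotation), which may not match how Deepchecks builds this property internally.

```python
from collections import Counter


def class_share_gaps(train_labels, test_labels):
    """Per-class gap in relative frequency (test share minus train share)."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    train_counts, test_counts = Counter(train_labels), Counter(test_labels)
    n_train, n_test = len(train_labels), len(test_labels)

    gaps = {}
    for cls in set(train_counts) | set(test_counts):
        gaps[cls] = test_counts[cls] / n_test - train_counts[cls] / n_train

    # Largest absolute gap first, so the worst-matched classes are listed first.
    return dict(sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True))
```

A positive gap means the class is over-represented in the second split relative to the first, and a negative gap means it is under-represented.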
MEDIUM Severity Issues (1)

# 1
Finding: Insufficient data quality validation framework
Evidence: Deepchecks analysis indicates incomplete data integrity validation section, and DatasetArtifactsAnalyzer failed due to technical issues, suggesting gaps in the data validation pipeline
Action: Establish robust data validation pipeline with comprehensive integrity checks, outlier detection, and automated quality monitoring. Missing or incomplete validation increases risk of undetected data quality issues that can compromise model performance and reliability.