model_evaluation.model_evaluation_plotting
model_evaluation.model_evaluation_plotting(
pipeline,
X_test,
y_test,
display_labels=None,
)
Compute classification metrics and generate confusion matrix visualizations.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline | sklearn estimator or sklearn.pipeline.Pipeline | Fitted model object that implements the .predict() and .score() methods. | required |
| X_test | pandas.DataFrame | Test feature matrix. Must not contain NaN values. | required |
| y_test | array-like of shape (n_samples,) | True labels for the test data. Can be a pandas Series, numpy array, or list. Must not contain NaN values. | required |
| display_labels | array-like of shape (n_classes,) | Target class labels for the confusion matrix display. If None, numeric labels are used. | None |
Returns
| Name | Type | Description |
|---|---|---|
| metrics | dict | Dictionary with the keys 'accuracy' (float, classification accuracy on the test data), 'f2' (float, F2 score with beta=2), and 'y_pred' (numpy.ndarray, predicted labels). |
| cm_table | pandas.DataFrame | Confusion matrix as a crosstab (rows = true labels, columns = predicted labels). |
| cm_display | sklearn.metrics.ConfusionMatrixDisplay | Confusion matrix display object for visualization. |
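The three return values can be approximated with standard pandas and scikit-learn calls. The following is a minimal sketch, not the function's actual implementation: the toy labels, the `average="macro"` setting, and the placeholder display labels are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix, fbeta_score

# Hypothetical labels standing in for y_test and pipeline.predict(X_test).
y_true = pd.Series([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])

# Accuracy as the fraction of matching labels.
accuracy = float((y_true.to_numpy() == y_pred).mean())

# F2 score; average="macro" is an assumption for this multiclass toy data.
f2 = fbeta_score(y_true, y_pred, beta=2, average="macro")

metrics = {"accuracy": accuracy, "f2": f2, "y_pred": y_pred}

# Crosstab with true labels as rows and predicted labels as columns.
cm_table = pd.crosstab(y_true, y_pred, rownames=["Actual"], colnames=["Predicted"])

# Display object; calling cm_display.plot() renders it (requires matplotlib).
cm_display = ConfusionMatrixDisplay(
    confusion_matrix=confusion_matrix(y_true, y_pred),
    display_labels=["a", "b", "c"],  # placeholder class names
)
```

Each cell of `cm_table` can be read with `.loc`, for example `cm_table.loc[1, 2]` for samples whose true label is 1 and whose predicted label is 2.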
Raises
| Type | Description |
|---|---|
| TypeError | If pipeline is not fitted, or if input types are invalid. |
| ValueError | If X_test and y_test have different lengths, or contain NaN values. |
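The documented error conditions could be checked along the following lines. This is only a sketch under assumptions: `validate_inputs` is a hypothetical helper, not part of the documented API, and converting scikit-learn's `NotFittedError` into a `TypeError` is one possible way to match the behavior described above.

```python
import pandas as pd
from sklearn.exceptions import NotFittedError
from sklearn.svm import SVC
from sklearn.utils.validation import check_is_fitted


def validate_inputs(pipeline, X_test, y_test):
    """Hypothetical helper mirroring the documented TypeError/ValueError cases."""
    if not isinstance(X_test, pd.DataFrame):
        raise TypeError("X_test must be a pandas DataFrame")
    try:
        check_is_fitted(pipeline)
    except NotFittedError as exc:
        # Re-raise as TypeError to match the documented behavior.
        raise TypeError("pipeline must be fitted before evaluation") from exc
    if len(X_test) != len(y_test):
        raise ValueError("X_test and y_test must have the same length")
    if X_test.isna().any().any() or pd.isna(pd.Series(y_test)).any():
        raise ValueError("X_test and y_test must not contain NaN values")


# Toy data: a fitted model with matching, NaN-free inputs passes silently.
X = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0]})
y = [0, 1, 0, 1]
fitted = SVC().fit(X, y)
validate_inputs(fitted, X, y)
```

Passing an unfitted `SVC()` to this helper raises `TypeError`, while mismatched lengths or NaN values raise `ValueError`.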
Examples
>>> from sklearn.datasets import load_iris
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.svm import SVC
>>> import pandas as pd
>>> from model_evaluation import model_evaluation_plotting
>>>
>>> X, y = load_iris(return_X_y=True)
>>> X = pd.DataFrame(X)
>>> y = pd.Series(y)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
>>>
>>> model = SVC().fit(X_train, y_train)
>>> metrics, cm_table, cm_display = model_evaluation_plotting(
... model, X_test, y_test,
... display_labels=['setosa', 'versicolor', 'virginica']
... )
>>>
>>> print(round(metrics['accuracy'], 4))
0.9667
>>> print(cm_table)
Predicted   0  1   2
Actual
0          10  0   0
1           0  9   1
2           0  0  10
>>> cm_display.plot() # Shows confusion matrix visualization
Notes
- The model must be fitted before calling this function.
- The F2 score weights recall more heavily than precision (beta=2).
- For binary classification, the F2 score is computed with pos_label="Y".
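To make the recall emphasis in the notes above concrete, the sketch below compares scikit-learn's `fbeta_score` with the general formula F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The binary labels are invented for illustration, with "Y" as the positive class.

```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Hypothetical binary labels with "Y" as the positive class.
y_true = ["Y", "Y", "Y", "Y", "N", "N"]
y_pred = ["Y", "Y", "N", "N", "Y", "N"]

# 2 of the 3 predicted "Y" are correct; 2 of the 4 actual "Y" are found.
precision = precision_score(y_true, y_pred, pos_label="Y")
recall = recall_score(y_true, y_pred, pos_label="Y")

beta = 2
# General F-beta formula applied by hand.
f2_manual = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Same quantity computed by scikit-learn, plus F1 for comparison.
f2_sklearn = fbeta_score(y_true, y_pred, beta=beta, pos_label="Y")
f1 = fbeta_score(y_true, y_pred, beta=1, pos_label="Y")

print(f"precision={precision:.4f} recall={recall:.4f}")
print(f"F2={f2_sklearn:.4f} F1={f1:.4f}")
```

Because recall (0.5) is lower than precision (about 0.667) in this toy data, the F2 score comes out below the F1 score: the metric penalizes missed positives more than false alarms.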