Tutorial 11 - CometML


This tutorial is adapted from a blog post titled “Getting Started with Comet ML” by Angelica Lo Duca.

Overview

CometML is an online experimentation platform for tracking and testing machine learning projects, similar to Neptune.ai and Guild.ai. Its main advantage is that it makes it easy to build reporting dashboards and monitor machine learning projects.

CometML integrates with most popular ML libraries, such as scikit-learn and Keras. Experiments can be logged from Python, JavaScript, Java, or R, or through the REST API. This tutorial covers only the Python SDK; the SDKs for the other languages should be similar enough.

Features of CometML

  • Build and compare the results of different experiments for the same project.

  • Monitor a model from the early stages of development through debugging.

  • Collaborate with other project developers (note that this feature is not included with the free account).

  • Build reports and panels.

  • Share projects publicly through the platform.

Working with CometML

Step 1: Create a free account, log in, and create a new project.

Step 2: After logging in to your account, click “To the Quickstart Guide”.

Step 3: The webpage then generates a snippet of code to place in the project you are working on (shown in the figure below).

Step 4: Install the Comet ML library by running: pip install comet_ml

Step 5: Copy and paste the second block of code shown below into your Python project.

[Figures: Quickstart code snippets generated by the CometML webpage]

The default “Project name” is “Uncategorized Experiments”, as shown in the following figure from the CometML webpage. To log to a different project, create a new project on the CometML webpage, give it a name, and set the “project_name” argument to that name.
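Instead of editing the snippet by hand each time the project changes, the settings can be collected in one place. The following is a minimal sketch, not part of the Quickstart snippet: `comet_settings` is a hypothetical helper, and it assumes the settings are stored in the environment variables `COMET_API_KEY`, `COMET_WORKSPACE`, and `COMET_PROJECT_NAME`. The fallback project name "general" simply mirrors the value used later in this tutorial.

```python
import os


def comet_settings(default_project="general", environ=None):
    """Collect Experiment keyword arguments from environment variables.

    Hypothetical helper for illustration. The variable names are an
    assumption about how the settings are stored on your machine.
    """
    env = os.environ if environ is None else environ
    api_key = env.get("COMET_API_KEY")
    if not api_key:
        raise RuntimeError("Set COMET_API_KEY before creating an Experiment.")

    settings = {
        "api_key": api_key,
        # Fall back to a default project when no name is configured.
        "project_name": env.get("COMET_PROJECT_NAME", default_project),
    }
    workspace = env.get("COMET_WORKSPACE")
    if workspace:
        settings["workspace"] = workspace
    return settings


# Dry run with a stand-in environment, so no real key is needed here.
demo = comet_settings(
    environ={"COMET_API_KEY": "dummy-key", "COMET_WORKSPACE": "my-workspace"}
)
print(demo["project_name"])  # general
```

The resulting dictionary is shaped so that it could be unpacked into the constructor, for example `Experiment(**comet_settings())`. This also keeps the API key out of the notebook itself.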

[Figure: default project name on the CometML webpage]

Experiment is the core class of CometML. An Experiment represents a unit of measurable research that defines a single execution of code with some associated data; for example, training a model using a single set of hyperparameters. Use Experiment to log new data to the CometML UI.

An Experiment automatically logs the script's output (stdout/stderr), code, and command-line arguments for any script. For the supported libraries, it also logs hyperparameters, metrics, and model configuration.

[ ]:
from comet_ml import Experiment
from comet_ml.integration.pytorch import log_model

experiment = Experiment(
  api_key="YOUR_API_KEY",  # placeholder: use the key from your Comet account
  project_name="general",
  workspace="llz"
)
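Values that are not captured automatically can also be logged explicitly. The sketch below is a hedged illustration: `log_run_details` and `RecordingExperiment` are hypothetical names introduced here, and the stand-in class only imitates the two logging methods it needs (`log_parameters` and `log_metrics`) so that the example runs without a Comet account.

```python
def log_run_details(experiment, params, metrics, step=None):
    """Log hyperparameters and metrics explicitly on an Experiment-like object."""
    experiment.log_parameters(params)
    experiment.log_metrics(metrics, step=step)


class RecordingExperiment:
    """Stand-in for a dry run; it only records what would be logged."""

    def __init__(self):
        self.parameters = {}
        self.metrics = []

    def log_parameters(self, parameters):
        self.parameters.update(parameters)

    def log_metrics(self, dic, prefix=None, step=None):
        self.metrics.append((dict(dic), step))


# Made-up values, used only to exercise the helper.
dry_run = RecordingExperiment()
log_run_details(dry_run, {"max_depth": 2}, {"accuracy": 0.75}, step=1)
print(dry_run.parameters)  # {'max_depth': 2}
```

With a real experiment, the same helper would be called as `log_run_details(experiment, ...)`, passing the `experiment` object created above in place of the stand-in.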

Example usage

To show how to use the library, let’s consider an example project using the heart attack dataset from Kaggle. The task is to predict whether a patient has a high chance of a heart attack from features such as age, sex, and resting blood pressure.

[10]:
import pandas as pd

# Load data:
df = pd.read_csv('data/heart.csv')
df.head()
[10]:
   age  sex  cp  trtbps  chol  fbs  restecg  thalachh  exng  oldpeak  slp  caa  thall  output
0   63    1   3     145   233    1        0       150     0      2.3    0    0      1       1
1   37    1   2     130   250    0        1       187     0      3.5    0    0      2       1
2   41    0   1     130   204    0        0       172     0      1.4    2    0      2       1
3   56    1   1     120   236    0        1       178     0      0.8    2    0      2       1
4   57    0   0     120   354    0        1       163     1      0.6    2    0      2       1
[11]:
# separate data into features and targets
x = df.drop(columns=['output'])
y = df['output']
[ ]:
x.head()
   age  sex  cp  trtbps  chol  fbs  restecg  thalachh  exng  oldpeak  slp  caa  thall
0   63    1   3     145   233    1        0       150     0      2.3    0    0      1
1   37    1   2     130   250    0        1       187     0      3.5    0    0      2
2   41    0   1     130   204    0        0       172     0      1.4    2    0      2
3   56    1   1     120   236    0        1       178     0      0.8    2    0      2
4   57    0   0     120   354    0        1       163     1      0.6    2    0      2
[12]:
y.head()
[12]:
0    1
1    1
2    1
3    1
4    1
Name: output, dtype: int64
[13]:
from sklearn.model_selection import train_test_split

# create train/test split
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=42)
[14]:
from sklearn.preprocessing import MinMaxScaler

# scale inputs
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

Next, we show an example of logging confusion matrix information with CometML. Many other logging methods are documented in the Experiment reference (see Reference).
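As a reminder of what a confusion matrix contains, the counts can be computed by hand. The following plain-Python sketch is for illustration only: `confusion_counts` is a hypothetical helper, the labels are made up, and the layout (rows are true labels, columns are predicted labels) follows the convention of scikit-learn's `confusion_matrix`.

```python
def confusion_counts(y_true, y_pred, labels):
    """Return a matrix whose entry [i][j] counts the samples with true
    label labels[i] that were predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Made-up labels for illustration only (not taken from the heart dataset).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(confusion_counts(y_true, y_pred, labels=[0, 1]))  # [[2, 1], [1, 2]]
```

The diagonal entries count correct predictions, and the off-diagonal entries count the two kinds of error.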


[15]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
import numpy as np

# create model:
model = DecisionTreeClassifier(random_state=42, max_depth=2)
min_samples = 5

target_names = ['class 0', 'class 1']

# examine how the number of training samples affects model performance:
for step in range(min_samples, len(x_train)):

    model.fit(x_train[:step], y_train[:step])

    y_pred = model.predict(x_test)
    report = classification_report(y_test, y_pred, target_names=target_names, output_dict=True)

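    for label, metric in report.items():
        if isinstance(metric, dict):
            # per-class and averaged entries map metric names to values
            experiment.log_metrics(metric, prefix=label, step=step)
        else:
            # 'accuracy' is stored as a single float
            experiment.log_metric(label, metric, step=step)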

    experiment.log_confusion_matrix(y_test.tolist(), y_pred.tolist())

experiment.display(tab='confusion-matrices')
experiment.end()
COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO: Comet.ml Experiment Summary
COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO:   Data:
COMET INFO:     display_summary_level : 1
COMET INFO:     url                   : https://www.comet.com/llz/general/3076c8951afd4185a68717788a151ffb
COMET INFO:   Metrics [count] (min, max):
COMET INFO:     accuracy [222]               : (0.39473684210526316, 0.7631578947368421)
COMET INFO:     class 0_f1-score [222]       : (0.37837837837837845, 0.7272727272727272)
COMET INFO:     class 0_precision [222]      : (0.358974358974359, 0.8421052631578947)
COMET INFO:     class 0_recall [222]         : (0.34285714285714286, 0.9714285714285714)
COMET INFO:     class 0_support              : 35
COMET INFO:     class 1_f1-score [222]       : (0.046511627906976744, 0.7906976744186047)
COMET INFO:     class 1_precision [222]      : (0.43243243243243246, 0.8064516129032258)
COMET INFO:     class 1_recall [222]         : (0.024390243902439025, 0.926829268292683)
COMET INFO:     class 1_support              : 41
COMET INFO:     macro avg_f1-score [222]     : (0.33518241945807553, 0.7589852008456659)
COMET INFO:     macro avg_precision [222]    : (0.39570339570339574, 0.7648745519713261)
COMET INFO:     macro avg_recall [222]       : (0.3951219512195122, 0.7574912891986063)
COMET INFO:     macro avg_support            : 76
COMET INFO:     weighted avg_f1-score [222]  : (0.31239262012509406, 0.7614888171803716)
COMET INFO:     weighted avg_precision [222] : (0.3986030564977934, 0.7641388417279759)
COMET INFO:     weighted avg_recall [222]    : (0.39473684210526316, 0.7631578947368421)
COMET INFO:     weighted avg_support         : 76
COMET INFO:   Parameters:
COMET INFO:     ccp_alpha                : 0.0
COMET INFO:     class_weight             : 1
COMET INFO:     clip                     : False
COMET INFO:     copy                     : True
COMET INFO:     criterion                : gini
COMET INFO:     feature_range            : (0, 1)
COMET INFO:     max_depth                : 2
COMET INFO:     max_features             : 1
COMET INFO:     max_leaf_nodes           : 1
COMET INFO:     min_impurity_decrease    : 0.0
COMET INFO:     min_samples_leaf         : 1
COMET INFO:     min_samples_split        : 2
COMET INFO:     min_weight_fraction_leaf : 0.0
COMET INFO:     random_state             : 42
COMET INFO:     splitter                 : best
COMET INFO:   Uploads:
COMET INFO:     confusion-matrix    : 222
COMET INFO:     environment details : 1
COMET INFO:     filename            : 1
COMET INFO:     installed packages  : 1
COMET INFO:     notebook            : 2
COMET INFO:     os packages         : 1
COMET INFO:     source_code         : 1
COMET INFO:
COMET INFO: Please wait for metadata to finish uploading (timeout is 3600 seconds)
COMET INFO: Uploading 3154 metrics, params and output messages
COMET INFO: Uploading 2036 metrics, params and output messages
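The metric names in the summary above, such as “class 0_f1-score”, combine a label from the classification report with a metric name. The sketch below imitates that naming on a hand-written report so the structure is easier to see. It is an illustration under assumptions: `flatten_report` is a hypothetical helper, the numbers are made up, and the underscore separator is inferred from the names in the summary rather than taken from the Comet documentation.

```python
def flatten_report(report, sep="_"):
    """Flatten a nested classification report into {metric_name: value}."""
    flat = {}
    for label, value in report.items():
        if isinstance(value, dict):
            # Per-class and averaged entries hold several metrics each.
            for name, number in value.items():
                flat[f"{label}{sep}{name}"] = number
        else:
            # 'accuracy' is stored as a single float.
            flat[label] = value
    return flat


# Hand-written report with made-up numbers, shaped like the output of
# classification_report(..., output_dict=True).
example_report = {
    "class 0": {"precision": 0.8, "recall": 0.5, "f1-score": 0.62, "support": 35},
    "accuracy": 0.7,
    "macro avg": {"precision": 0.75, "recall": 0.7, "f1-score": 0.7, "support": 76},
}
for name in sorted(flatten_report(example_report)):
    print(name)
```

This is why the loop treats dictionary entries and the single “accuracy” value differently when logging.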

Reference

  1. Blank, D. (2023, November 9). Overview - Comet Docs. https://www.comet.com/docs/v2/api-and-sdk/python-sdk/reference/Experiment/
