rhapsody.train package

This subpackage contains modules for training Rhapsody classifiers and assessing their accuracy.

rhapsody.train.calcScoreMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs): Compute accuracy metrics for continuous-valued predictions, optionally with bootstrapping.
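To illustrate what a bootstrapped metric on continuous scores can look like, here is a minimal NumPy sketch. It is not Rhapsody's implementation: the helper names `auroc` and `bootstrap_auroc`, the choice of AUROC as the metric, and the `n_boot` and `seed` parameters are all assumptions made for this example.

```python
import numpy as np


def auroc(y_true, y_score):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Rank the scores, then average the ranks of tied scores
    order = np.argsort(y_score, kind="mergesort")
    ranks = np.empty(y_score.size, dtype=float)
    ranks[order] = np.arange(1, y_score.size + 1)
    for value in np.unique(y_score):
        tied = y_score == value
        ranks[tied] = ranks[tied].mean()
    u_stat = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def bootstrap_auroc(y_true, y_score, n_boot=200, seed=0):
    """Mean and standard deviation of AUROC over bootstrap resamples."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    auroc(y_true, y_score)  # raises early if only one class is present
    rng = np.random.default_rng(seed)
    values = []
    while len(values) < n_boot:
        idx = rng.integers(0, y_true.size, size=y_true.size)
        if y_true[idx].min() == y_true[idx].max():
            continue  # resample lacks one of the classes; draw again
        values.append(auroc(y_true[idx], y_score[idx]))
    values = np.array(values)
    return values.mean(), values.std(ddof=1)


# Hypothetical toy data, for illustration only
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
point_estimate = auroc(y_true, y_score)  # 0.75
mean_auc, std_auc = bootstrap_auroc(y_true, y_score, n_boot=100)
```

Resampling with replacement gives a spread for the metric rather than a single number, which is presumably what the `bootstrap` argument controls in the function above.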

rhapsody.train.calcClassMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs): Compute accuracy metrics for binary class labels, optionally with bootstrapping.
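As a rough illustration of metrics on binary labels, the sketch below derives several common ones from the confusion matrix. The helper name `class_metrics` and the particular set of metrics returned are assumptions for this example; the metrics actually reported by `calcClassMetrics` may differ.

```python
import numpy as np


def class_metrics(y_true, y_pred):
    """Common binary-classification metrics computed from the confusion matrix."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN
        return num / den if den else float("nan")

    mcc_den = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical labels: 2 true positives, 2 true negatives, 1 FP, 1 FN
metrics = class_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```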

rhapsody.train.calcPathogenicityProbs(CV_info, num_bins=15, ppred_reliability_cutoff=200, pred_distrib_fig='predictions_distribution.png', path_prob_fig='pathogenicity_prob.png', **kwargs): Compute pathogenicity probabilities from the predictions made on cross-validation (CV) test sets.
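One way to turn classifier scores into a pathogenicity probability is to bin the scores and compare the per-class histograms. The sketch below is only a guess at the general idea, not the library's code: the helper name `pathogenicity_by_bin`, the per-class normalization, and the reading of the cutoff as a minimum number of predictions per bin are all assumptions.

```python
import numpy as np


def pathogenicity_by_bin(scores, labels, num_bins=15, min_count=200):
    """Per-bin fraction of deleterious cases, with a reliability mask.

    scores : classifier scores in [0, 1]
    labels : 1 for deleterious, 0 for neutral
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    n_del, _ = np.histogram(scores[labels], bins=edges)
    n_neu, _ = np.histogram(scores[~labels], bins=edges)
    # Normalize each class histogram so class imbalance does not bias the ratio
    f_del = n_del / max(n_del.sum(), 1)
    f_neu = n_neu / max(n_neu.sum(), 1)
    total = f_del + f_neu
    prob = np.full(num_bins, np.nan)  # empty bins stay NaN
    np.divide(f_del, total, out=prob, where=total > 0)
    # Assumption: a bin is "reliable" if it holds at least min_count predictions
    reliable = (n_del + n_neu) >= min_count
    return edges, prob, reliable


# Hypothetical toy data: two neutral low scores, two deleterious high scores
edges, prob, reliable = pathogenicity_by_bin(
    [0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], num_bins=2, min_count=2
)
```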

rhapsody.train.RandomForestCV(feat_matrix, n_estimators=1500, max_features=2, **kwargs)
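The signature suggests a cross-validated Random Forest with `n_estimators` trees and `max_features` candidate features per split. The following scikit-learn sketch shows that general procedure under stated assumptions: it takes plain `X` and `y` arrays rather than Rhapsody's `feat_matrix`, and the helper name `random_forest_cv`, the stratified folds, the `n_splits` and `seed` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def random_forest_cv(X, y, n_estimators=1500, max_features=2, n_splits=10, seed=0):
    """Return the AUROC obtained on each held-out fold."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aurocs = []
    for train_idx, test_idx in cv.split(X, y):
        clf = RandomForestClassifier(
            n_estimators=n_estimators,
            max_features=max_features,
            random_state=seed,
        )
        clf.fit(X[train_idx], y[train_idx])
        # Score the held-out fold with the predicted probability of class 1
        y_score = clf.predict_proba(X[test_idx])[:, 1]
        aurocs.append(roc_auc_score(y[test_idx], y_score))
    return np.array(aurocs)


# Synthetic data and a small forest keep the example fast
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
fold_aurocs = random_forest_cv(X, y, n_estimators=50, n_splits=5)
```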

rhapsody.train.trainRFclassifier(feat_matrix, n_estimators=1500, max_features=2, pickle_name='trained_classifier.pkl', feat_imp_fig='feat_importances.png', **kwargs)
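The `pickle_name` and `feat_imp_fig` arguments suggest that the trained classifier is saved to disk and its feature importances are plotted. A minimal scikit-learn sketch of training on a full dataset, pickling the model, and reading the importances might look like the following; the helper name `train_and_pickle`, the plain `X` and `y` inputs, and the synthetic data are assumptions, and the figure is left out.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def train_and_pickle(X, y, pickle_path, n_estimators=1500, max_features=2, seed=0):
    """Fit a Random Forest on all samples and save it with pickle."""
    clf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_features=max_features,
        random_state=seed,
    )
    clf.fit(X, y)
    with open(pickle_path, "wb") as handle:
        pickle.dump(clf, handle)
    return clf


# Synthetic data and a small forest keep the example fast
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "trained_classifier.pkl"
    clf = train_and_pickle(X, y, path, n_estimators=50)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)

# The reloaded model reproduces the original predictions
same_predictions = bool((restored.predict(X) == clf.predict(X)).all())
# One importance value per feature column
importances = clf.feature_importances_
```

Only unpickle classifier files from sources you trust, since loading a pickle can execute arbitrary code.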

rhapsody.train.extendDefaultTrainingDataset(names, arrays, base_default_featset='full')

base : array
    Input array to extend.

names : string, sequence
    String or sequence of strings corresponding to the names of the new fields.

data : array or sequence of arrays
    Array or sequence of arrays storing the fields to add to the base.
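The parameter descriptions above match those of NumPy's `numpy.lib.recfunctions.append_fields`, so adding named fields to a structured array can be sketched with that function. Whether `extendDefaultTrainingDataset` calls it internally is an assumption, and the field names and values below are hypothetical placeholders.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical base dataset: one record per variant
base = np.array(
    [("sav_1", 1, 0.25), ("sav_2", 0, 0.75)],
    dtype=[("id", "U10"), ("label", "i4"), ("feat_a", "f8")],
)

# Add a single new field
extended = rfn.append_fields(
    base, names="feat_b", data=np.array([1.5, 2.5]), usemask=False
)

# Add several fields at once: one name per array
extended_multi = rfn.append_fields(
    base,
    names=["feat_b", "feat_c"],
    data=[np.array([1.5, 2.5]), np.array([10, 20])],
    usemask=False,
)
```

Passing `usemask=False` returns a plain structured array instead of a masked array, which is usually easier to feed to downstream code.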

rhapsody.train.print_pred_distrib_figure(filename, bins, histo, dx, J_opt)

rhapsody.train.print_path_prob_figure(filename, bins, histo, dx, path_prob, smooth_plot=None, cutoff=200)

rhapsody.train.print_ROC_figure(filename, fpr, tpr, auc_stat)

rhapsody.train.print_feat_imp_figure(filename, feat_imp, featset)

Submodules

rhapsody.train.RFtraining module

This module defines functions for training Random Forest classifiers implementing Rhapsody’s classification schemes.

rhapsody.train.RFtraining.calcScoreMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs): Compute accuracy metrics for continuous-valued predictions, optionally with bootstrapping.

rhapsody.train.RFtraining.calcClassMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs): Compute accuracy metrics for binary class labels, optionally with bootstrapping.

rhapsody.train.RFtraining.calcPathogenicityProbs(CV_info, num_bins=15, ppred_reliability_cutoff=200, pred_distrib_fig='predictions_distribution.png', path_prob_fig='pathogenicity_prob.png', **kwargs): Compute pathogenicity probabilities from the predictions made on cross-validation (CV) test sets.

rhapsody.train.RFtraining.RandomForestCV(feat_matrix, n_estimators=1500, max_features=2, **kwargs)

rhapsody.train.RFtraining.trainRFclassifier(feat_matrix, n_estimators=1500, max_features=2, pickle_name='trained_classifier.pkl', feat_imp_fig='feat_importances.png', **kwargs)

rhapsody.train.RFtraining.extendDefaultTrainingDataset(names, arrays, base_default_featset='full')

base : array
    Input array to extend.

names : string, sequence
    String or sequence of strings corresponding to the names of the new fields.

data : array or sequence of arrays
    Array or sequence of arrays storing the fields to add to the base.

rhapsody.train.figures module

This module defines functions for generating figures summarizing results from the training process.

rhapsody.train.figures.print_pred_distrib_figure(filename, bins, histo, dx, J_opt)

rhapsody.train.figures.print_path_prob_figure(filename, bins, histo, dx, path_prob, smooth_plot=None, cutoff=200)

rhapsody.train.figures.print_ROC_figure(filename, fpr, tpr, auc_stat)

rhapsody.train.figures.print_feat_imp_figure(filename, feat_imp, featset)