rhapsody.train package

This subpackage contains modules for training Rhapsody classifiers and assessing their accuracy.

rhapsody.train.calcScoreMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs)

Compute accuracy metrics for continuous prediction scores (optionally bootstrapped).
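As an illustration of what a bootstrapped metric on continuous scores can look like, here is a minimal standard-library sketch. It is not Rhapsody's implementation: the helper names `auroc` and `bootstrap_metric`, the choice of AUROC as the metric, and the `n_boot` and `seed` parameters are all assumptions made for this example.

```python
import random
from statistics import mean, stdev


def auroc(y_true, y_score):
    """Area under the ROC curve, computed by pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A positive scoring above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_metric(y_true, y_score, metric, n_boot=0, seed=0):
    """Hypothetical helper: metric on the full sample, plus the mean and
    standard deviation over n_boot resamples drawn with replacement."""
    result = {"value": metric(y_true, y_score)}
    if n_boot < 2:
        return result  # no bootstrapping requested
    rng = random.Random(seed)
    n = len(y_true)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_true = [y_true[i] for i in idx]
        if len(set(resampled_true)) < 2:
            continue  # skip resamples that lost one of the two classes
        samples.append(metric(resampled_true, [y_score[i] for i in idx]))
    result["mean"] = mean(samples)
    result["std"] = stdev(samples)
    return result


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
stats = bootstrap_metric(y_true, y_score, auroc, n_boot=100)
```

With `n_boot=0` the sketch returns only the point estimate, which mirrors the `bootstrap=0` default in the signature above; whether Rhapsody reports a mean and standard deviation in the same way is not stated here.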

rhapsody.train.calcClassMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs)

Compute accuracy metrics for binary class labels (optionally bootstrapped).
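For binary labels, accuracy metrics are typically derived from the confusion matrix. The sketch below shows one way to do that with the standard library only. The function name `class_metrics` and the particular set of metrics returned are assumptions for illustration and may not match what `calcClassMetrics` actually reports; bootstrapping is left out to keep the example short.

```python
from math import sqrt


def class_metrics(y_true, y_pred):
    """Hypothetical helper: common metrics from binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominators) are reported as NaN.
        return num / den if den else float("nan")

    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


metrics = class_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```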

rhapsody.train.calcPathogenicityProbs(CV_info, num_bins=15, ppred_reliability_cutoff=200, pred_distrib_fig='predictions_distribution.png', path_prob_fig='pathogenicity_prob.png', **kwargs)

Compute pathogenicity probabilities from predictions on the cross-validation (CV) test sets.
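One plausible reading of this description is that prediction scores are binned and the fraction of pathogenic cases is estimated per bin, with sparsely populated bins flagged as unreliable. The following sketch illustrates that idea only. The function name `binned_positive_fraction`, the assumption that scores lie in [0, 1], and the interpretation of `min_count` as the analogue of `ppred_reliability_cutoff` are all guesses for this example, and Rhapsody's actual computation may differ.

```python
def binned_positive_fraction(scores, labels, num_bins=15, min_count=200):
    """Hypothetical helper: bin scores in [0, 1] into equal-width bins and
    return one (count, positive_fraction, reliable) tuple per bin."""
    counts = [0] * num_bins
    positives = [0] * num_bins
    for score, label in zip(scores, labels):
        # A score of exactly 1.0 is assigned to the last bin.
        b = min(int(score * num_bins), num_bins - 1)
        counts[b] += 1
        positives[b] += label
    rows = []
    for count, pos in zip(counts, positives):
        fraction = pos / count if count else None  # None for empty bins
        rows.append((count, fraction, count >= min_count))
    return rows


rows = binned_positive_fraction(
    scores=[0.05, 0.1, 0.9, 0.95, 1.0],
    labels=[0, 0, 1, 1, 0],
    num_bins=2,
    min_count=3,
)
```

In this toy call the lower bin holds two neutral cases and is flagged as unreliable, while the upper bin holds three cases, two of them pathogenic.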

rhapsody.train.RandomForestCV(feat_matrix, n_estimators=1500, max_features=2, **kwargs)
rhapsody.train.trainRFclassifier(feat_matrix, n_estimators=1500, max_features=2, pickle_name='trained_classifier.pkl', feat_imp_fig='feat_importances.png', **kwargs)
rhapsody.train.extendDefaultTrainingDataset(names, arrays, base_default_featset='full')

base : array
    Input array to extend.

names : string, sequence
    String or sequence of strings corresponding to the names of the new fields.

data : array or sequence of arrays
    Array or sequence of arrays storing the fields to add to the base.

rhapsody.train.print_pred_distrib_figure(filename, bins, histo, dx, J_opt)
rhapsody.train.print_path_prob_figure(filename, bins, histo, dx, path_prob, smooth_plot=None, cutoff=200)
rhapsody.train.print_ROC_figure(filename, fpr, tpr, auc_stat)
rhapsody.train.print_feat_imp_figure(filename, feat_imp, featset)

Submodules

rhapsody.train.RFtraining module

This module defines functions for training Random Forest classifiers that implement Rhapsody’s classification schemes.

rhapsody.train.RFtraining.calcScoreMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs)

Compute accuracy metrics for continuous prediction scores (optionally bootstrapped).

rhapsody.train.RFtraining.calcClassMetrics(y_test, y_pred, bootstrap=0, **resample_kwargs)

Compute accuracy metrics for binary class labels (optionally bootstrapped).

rhapsody.train.RFtraining.calcPathogenicityProbs(CV_info, num_bins=15, ppred_reliability_cutoff=200, pred_distrib_fig='predictions_distribution.png', path_prob_fig='pathogenicity_prob.png', **kwargs)

Compute pathogenicity probabilities from predictions on the cross-validation (CV) test sets.

rhapsody.train.RFtraining.RandomForestCV(feat_matrix, n_estimators=1500, max_features=2, **kwargs)
rhapsody.train.RFtraining.trainRFclassifier(feat_matrix, n_estimators=1500, max_features=2, pickle_name='trained_classifier.pkl', feat_imp_fig='feat_importances.png', **kwargs)
rhapsody.train.RFtraining.extendDefaultTrainingDataset(names, arrays, base_default_featset='full')

base : array
    Input array to extend.

names : string, sequence
    String or sequence of strings corresponding to the names of the new fields.

data : array or sequence of arrays
    Array or sequence of arrays storing the fields to add to the base.

rhapsody.train.figures module

This module defines functions for generating figures that summarize results from the training process.

rhapsody.train.figures.print_pred_distrib_figure(filename, bins, histo, dx, J_opt)
rhapsody.train.figures.print_path_prob_figure(filename, bins, histo, dx, path_prob, smooth_plot=None, cutoff=200)
rhapsody.train.figures.print_ROC_figure(filename, fpr, tpr, auc_stat)
rhapsody.train.figures.print_feat_imp_figure(filename, feat_imp, featset)