API

This is the full API documentation of the research package.

`research.active_learning`

This submodule contains the code developed for experiments related to Active Learning.

`active_learning.ALWrapper`([classifier, …])	Class to perform Active Learning experiments.

`active_learning.entropy`(unlabeled_ids, …)	Sample selection based on the entropy selection criterion.
`active_learning.breaking_ties`(unlabeled_ids, …)	Sample selection based on the breaking ties selection criterion.
`active_learning.random`(unlabeled_ids, increment)	Random sample selection.
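The three selection criteria listed above can be illustrated with a minimal, standard-library sketch. It is not the package's implementation: the helper names (`entropy_score`, `breaking_ties_score`, `select_samples`), the `probabilities` argument and the `seed` parameter are assumptions made for this example, and only `unlabeled_ids` and `increment` are taken from the signatures above.

```python
import math
import random


def entropy_score(probs):
    """Shannon entropy of one predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def breaking_ties_score(probs):
    """Gap between the two largest class probabilities (lower = more uncertain)."""
    first, second = sorted(probs, reverse=True)[:2]
    return first - second


def select_samples(unlabeled_ids, probabilities, increment, criterion="entropy", seed=0):
    """Return `increment` ids chosen from `unlabeled_ids` (hypothetical helper).

    `probabilities[i]` is assumed to hold the predicted class probabilities
    for the sample whose id is `unlabeled_ids[i]`.
    """
    if criterion == "random":
        # Random selection ignores the classifier's predictions entirely.
        return random.Random(seed).sample(list(unlabeled_ids), increment)
    if criterion == "entropy":
        # Most uncertain first: sort by descending entropy.
        def key(pair):
            return -entropy_score(pair[1])
    elif criterion == "breaking_ties":
        # Most uncertain first: sort by ascending top-two gap.
        def key(pair):
            return breaking_ties_score(pair[1])
    else:
        raise ValueError(f"Unknown criterion: {criterion!r}")
    ranked = sorted(zip(unlabeled_ids, probabilities), key=key)
    return [sample_id for sample_id, _ in ranked[:increment]]
```

Both uncertainty-based criteria rank the unlabeled samples by how unsure the classifier is about them, whereas random selection serves as the usual baseline.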

`research.datasets`

Download, transform and simulate various datasets.

These classes were extracted from the utils.py script from AlgoWit’s publications repo, to which I have also contributed.

Link to related repo: https://github.com/AlgoWit/publications

`datasets.Datasets`([names])	Class to download and save datasets.
`datasets.BinaryDatasets`([names])	Class to download, transform and save binary class datasets.
`datasets.ImbalancedBinaryDatasets`([names])	Class to download, transform and save binary class imbalanced datasets.
`datasets.ContinuousCategoricalDatasets`([names])	Class to download, transform and save datasets with both continuous and categorical features.
`datasets.RemoteSensingDatasets`([names, …])	Class to download, transform and save remote sensing datasets.

`research.metrics`

This submodule contains performance metrics and scorers that are not included in scikit-learn’s dictionary of scorers. It also provides an expanded version of that dictionary.

Parts of this code were taken from the utils.py script in AlgoWit’s publications repo, to which I have also contributed.

Link to related repo: https://github.com/AlgoWit/publications

`metrics.geometric_mean_score_macro`(y_true, …)	Geometric mean score with macro average.
`metrics.area_under_learning_curve`(…)	Area under the learning curve.
`metrics.data_utilization_rate`(test_scores, …)	Data Utilization Rate.
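As a rough illustration of the learning-curve metrics listed above, the sketch below computes a normalized area under a learning curve with the trapezoidal rule, plus a threshold-based ratio in the spirit of a data utilization rate. The exact definitions used by the package are not stated here, so both formulas, the function names (`area_under_curve`, `fraction_to_reach`, `utilization_ratio`) and all parameters are assumptions.

```python
def area_under_curve(data_fractions, scores):
    """Trapezoidal area under (data fraction, score) points, normalized by the x-range."""
    points = list(zip(data_fractions, scores))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / (data_fractions[-1] - data_fractions[0])


def fraction_to_reach(data_fractions, scores, threshold):
    """Smallest data fraction at which the score reaches `threshold`, or None."""
    for fraction, score in zip(data_fractions, scores):
        if score >= threshold:
            return fraction
    return None


def utilization_ratio(data_fractions, strategy_scores, baseline_scores, threshold):
    """Data needed by a strategy relative to a baseline (assumed definition).

    Values below 1 mean the strategy reached `threshold` with less labeled
    data than the baseline. Returns None if either curve never reaches it.
    """
    needed = fraction_to_reach(data_fractions, strategy_scores, threshold)
    baseline = fraction_to_reach(data_fractions, baseline_scores, threshold)
    if needed is None or baseline is None:
        return None
    return needed / baseline
```

Normalizing the area by the range of data fractions keeps the result on the same scale as the underlying score, which makes curves with different starting points easier to compare.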

`metrics.ALScorer`(score_func[, sign])

`research.utils`

This submodule contains a variety of general utility functions as well as tools used to format and prepare tables to incorporate into LaTeX code.

An expanded dictionary of scorers (as compared with scikit-learn’s) is also provided.

This code was taken from the utils.py script from AlgoWit’s publications repo, to which I have also contributed.

Link to related repo: https://github.com/AlgoWit/publications

`utils.generate_mean_std_tbl`(mean_vals, std_vals)	Generate table that combines mean and sem values.
`utils.generate_pvalues_tbl`(tbl)	Format p-values.
`utils.sort_tbl`(tbl[, ds_order, ovrs_order, …])	Sort table rows and columns.
`utils.generate_paths`(filepath)	Generate data, results and analysis paths.
`utils.make_bold`(row[, maximum, …])	Make bold the lowest or highest value(s).
`utils.generate_mean_std_tbl_bold`(mean_vals, …)	Generate table that combines mean and sem values.
`utils.img_array_to_pandas`(X, y)	Convert an image stored as a NumPy array (with ground truth) to a pandas DataFrame.
`utils.load_datasets`(data_dir[, suffix, …])	Load datasets from sqlite database and/or csv files.
`utils.check_pipelines`(objects_list, …)	Extract estimators and parameters grids.
`utils.check_pipelines_wrapper`(objects_list, …)
`utils.load_plt_sns_configs`([font_size])	Load LaTeX style configurations for Matplotlib/Seaborn Visualizations.
`utils.val_to_color`(col[, cmap])	Converts a column of values to hex-type colors.
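To give a feel for the LaTeX table helpers listed above, here is a minimal sketch that combines mean and spread values into `mean ± spread` cells and wraps the best cell in `\textbf{}`. It is a standalone illustration, not the package's code: `format_row` and its `decimals` parameter are hypothetical, and `maximum` only mirrors the parameter name shown for `utils.make_bold`.

```python
def format_row(means, spreads, maximum=True, decimals=3):
    """Format one table row as LaTeX cells, with the best mean in bold.

    `spreads` may hold standard deviations or standard errors; this sketch
    only formats them and does not compute them.
    """
    best = max(means) if maximum else min(means)
    cells = []
    for mean, spread in zip(means, spreads):
        cell = f"{mean:.{decimals}f} $\\pm$ {spread:.{decimals}f}"
        if mean == best:
            # Ties are all made bold, since every tied mean equals `best`.
            cell = f"\\textbf{{{cell}}}"
        cells.append(cell)
    return cells


# Example: join the cells into a LaTeX table row.
row = " & ".join(format_row([0.812, 0.845], [0.010, 0.020])) + r" \\"
```

Setting `maximum=False` highlights the lowest value instead, which suits error-type metrics where smaller is better.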
