mlresearch.datasets.ContinuousCategoricalDatasets

class mlresearch.datasets.ContinuousCategoricalDatasets(names: str | list = 'all', data_home: str = None, download_if_missing: bool = True)[source]

Class to download, transform and save datasets with both continuous and categorical features.


download()[source]

Download the datasets.

fetch_abalone()[source]

Download and transform the Abalone Data Set.

https://archive.ics.uci.edu/ml/datasets/Abalone

fetch_acute()[source]

Download and transform the Acute Inflammations Data Set.

https://archive.ics.uci.edu/ml/datasets/Acute+Inflammations

fetch_adult()[source]

Download and transform the Adult Data Set.

https://archive.ics.uci.edu/ml/datasets/Adult

fetch_annealing()[source]

Download and transform the Annealing Data Set.

https://archive.ics.uci.edu/ml/datasets/Annealing

fetch_census()[source]

Download and transform the Census-Income (KDD) Data Set.

https://archive.ics.uci.edu/dataset/117/census+income+kdd

fetch_contraceptive()[source]

Download and transform the Contraceptive Method Choice Data Set.

https://archive.ics.uci.edu/ml/datasets/Contraceptive+Method+Choice

fetch_covertype()[source]

Download and transform the Covertype Data Set.

https://archive.ics.uci.edu/ml/datasets/Covertype

fetch_credit_approval()[source]

Download and transform the Credit Approval Data Set.

https://archive.ics.uci.edu/ml/datasets/Credit+Approval

fetch_dermatology()[source]

Download and transform the Dermatology Data Set.

https://archive.ics.uci.edu/ml/datasets/Dermatology

fetch_echocardiogram()[source]

Download and transform the Echocardiogram Data Set.

https://archive.ics.uci.edu/ml/datasets/Echocardiogram

fetch_flags()[source]

Download and transform the Flags Data Set.

https://archive.ics.uci.edu/ml/datasets/Flags

fetch_german_credit()[source]

Download and transform the German Credit Data Set.

https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

fetch_heart()[source]

Download and transform the Heart Data Set.

http://archive.ics.uci.edu/ml/datasets/statlog+(heart)

fetch_heart_disease()[source]

Download and transform the Heart Disease Data Set.

https://archive.ics.uci.edu/ml/datasets/Heart+Disease

fetch_hepatitis()[source]

Download and transform the Hepatitis Data Set.

https://archive.ics.uci.edu/ml/datasets/Hepatitis

fetch_thyroid()[source]

Download and transform the Thyroid Disease Data Set. Label 0 corresponds to no disease found. Label 1 corresponds to one or multiple diseases found.

https://archive.ics.uci.edu/ml/datasets/Thyroid+Disease

imbalance_datasets(imbalance_ratio: float, random_state: int = None)

Appends imbalanced versions of datasets with predefined imbalance ratios to self.content_.

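\[IR = \frac{|C_{maj}|}{|C_{min}|}\]
where \(C_{maj}\) and \(C_{min}\) are the sets of samples in the majority and minority classes, respectively.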
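As a minimal sketch of how the imbalance ratio above can be computed from a target vector, the snippet below counts the samples per class and divides the largest count by the smallest. The helper name `imbalance_ratio` is hypothetical and is not part of mlresearch; it only illustrates the formula.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return |C_maj| / |C_min| for an iterable of class labels.

    Hypothetical helper for illustration only; not part of mlresearch.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("At least two classes are needed to compute IR.")
    # Largest class size divided by smallest class size.
    return max(counts.values()) / min(counts.values())


# Toy target vector: 90 majority samples and 10 minority samples.
y = [0] * 90 + [1] * 10
ir = imbalance_ratio(y)
```

With more than two classes, this sketch compares the largest class with the smallest one, which is one common reading of the formula.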
Parameters:
imbalance_ratio : float

Target imbalance ratio of the resulting datasets.

random_state : int, RandomState instance, default=None

Control the randomization of the algorithm.

  • If int, random_state is the seed used by the random number generator;

  • If RandomState instance, random_state is the random number generator;

  • If None, the random number generator is the RandomState instance used by np.random.
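The three cases listed above follow the convention used by scikit-learn's `check_random_state`. The sketch below reproduces that convention with NumPy only; `resolve_random_state` is a hypothetical helper written for illustration, and whether mlresearch resolves `random_state` in exactly this way is an assumption.

```python
import numpy as np


def resolve_random_state(random_state=None):
    """Turn random_state into a np.random.RandomState instance.

    Hypothetical helper mirroring the three documented cases.
    """
    if random_state is None:
        # The global RandomState instance behind np.random's legacy functions.
        return np.random.mtrand._rand
    if isinstance(random_state, (int, np.integer)):
        # An int is used as the seed of a new generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used as-is.
        return random_state
    raise ValueError(f"Cannot use {random_state!r} as a random state.")


# Two generators built from the same seed yield the same permutation.
sample_a = resolve_random_state(0).permutation(10)
sample_b = resolve_random_state(0).permutation(10)
same_seed_matches = bool((sample_a == sample_b).all())

# Passing an existing RandomState returns that very object.
shared = np.random.RandomState(42)
reused_is_same = resolve_random_state(shared) is shared
```

Passing an int is the usual choice when the imbalanced datasets need to be reproducible across runs.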

Returns:
self : Datasets
items()
keys()
save(path, db_name)

Save datasets.

summarize_datasets()[source]

Create a summary of the downloaded datasets.

Returns:
datasets_summary : pd.DataFrame

Dataframe with summary statistics of all datasets.

values()