mlresearch.datasets.BinaryDatasets
- class mlresearch.datasets.BinaryDatasets(names: str | list = 'all', data_home: str = None, download_if_missing: bool = True)
Class to download, transform and save binary class datasets.
- download(keep_index=False)
Download the datasets.
- fetch_banknote_authentication()
Download and transform the Banknote Authentication Data Set.
https://archive.ics.uci.edu/ml/datasets/banknote+authentication
- fetch_breast_cancer()
Download and transform the Breast Cancer Wisconsin Data Set.
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
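- imbalance_datasets(imbalance_ratio: float, random_state: int = None)
Appends imbalanced versions of the datasets, with predefined imbalance ratios, to self.content_. The imbalance ratio (IR) is the number of majority-class samples divided by the number of minority-class samples:
\[IR = \frac{|C_{maj}|}{|C_{min}|}\]
- Parameters:
- imbalance_ratio : float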
Final Imbalance Ratio expected in the datasets.
- random_state : int, RandomState instance, default=None
Control the randomization of the algorithm.
If int, random_state is the seed used by the random number generator.
If RandomState instance, random_state is the random number generator.
If None, the random number generator is the RandomState instance used by np.random.
- Returns:
- self : Datasets
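The imbalance ratio defined above can be illustrated with a small, standard-library-only sketch. It is not the library's implementation: it assumes that imbalance is induced by randomly dropping minority-class samples, and it uses `random.Random` as a stand-in for a NumPy `RandomState`. The helper names `imbalance_ratio` and `undersample_minority` are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return IR = |C_maj| / |C_min| for a sequence of class labels."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def undersample_minority(samples, labels, target_ir, seed=None):
    """Randomly drop minority-class samples until IR is close to target_ir.

    Assumption: only minority samples are removed. The strategy actually
    used by the library is not described in this reference.
    """
    rng = random.Random(seed)  # stand-in for a NumPy RandomState
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    n_maj = max(counts.values())

    # Number of minority samples to keep so that n_maj / n_min ~ target_ir.
    # Never keep fewer than 1 or more than are available.
    n_keep = min(max(1, int(n_maj / target_ir)), counts[minority])

    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    keep = set(rng.sample(minority_idx, n_keep))

    # Preserve the original order of the retained samples.
    kept = [i for i, y in enumerate(labels) if y != minority or i in keep]
    return [samples[i] for i in kept], [labels[i] for i in kept]


labels = [0] * 60 + [1] * 40
samples = list(range(100))
X_res, y_res = undersample_minority(samples, labels, target_ir=3.0, seed=0)
print(imbalance_ratio(labels), imbalance_ratio(y_res))  # prints: 1.5 3.0
```

With 60 majority and 40 minority samples the starting IR is 1.5. Keeping 20 minority samples yields the requested IR of 3.0, and fixing the seed makes the selection reproducible.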
- items()
- keys()
- save(path, db_name)
Save datasets.
- summarize_datasets()
Create a summary of the downloaded datasets.
- Returns:
- datasets_summary : pd.DataFrame
Dataframe with summary statistics of all datasets.
- values()
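One plausible way to chain the methods listed above is sketched below. The call order (download, then imbalance, then summarize, then save) is an assumption inferred from the method descriptions, not something this reference states. The wrapper name `build_imbalanced_collection` is hypothetical, and the meaning of `path` and `db_name` is taken only from the `save(path, db_name)` signature.

```python
def build_imbalanced_collection(datasets, imbalance_ratio, path, db_name,
                                random_state=None):
    """Run download -> imbalance -> summarize -> save on a BinaryDatasets-like object.

    `datasets` is expected to expose the methods documented on this page.
    """
    # Fetch and transform the selected datasets.
    datasets.download()

    # Append imbalanced versions to the collection (documented to return self).
    datasets.imbalance_datasets(imbalance_ratio, random_state=random_state)

    # Documented to return a pd.DataFrame of summary statistics.
    summary = datasets.summarize_datasets()

    # Persist the collection; the storage format is not specified here.
    datasets.save(path, db_name)
    return summary


# Intended usage (requires mlresearch and network access, so left commented out):
# from mlresearch.datasets import BinaryDatasets
# summary = build_imbalanced_collection(
#     BinaryDatasets(names="all"), imbalance_ratio=10.0,
#     path=".", db_name="binary_datasets", random_state=0,
# )
```

Passing the dataset object in as an argument keeps the sketch independent of how the collection is constructed, so the same function would work with any object exposing these four methods.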