mlresearch.datasets.MultiClassDatasets
- class mlresearch.datasets.MultiClassDatasets(names: str | list = 'all', data_home: str = None, download_if_missing: bool = True)
Class to download, transform and save multi-class datasets.
- download(keep_index=False)
Download the datasets.
- fetch_gesture_segmentation()
Download and transform the Gesture Phase Segmentation Data Set.
- fetch_mfeat_zernike()
Download and transform the Multiple Features Dataset: Zernike Data Set.
- fetch_pendigits()
Download and transform the Pen-Based Recognition of Handwritten Digits Data Set.
- fetch_vehicle()
Download and transform the Vehicle Silhouettes Data Set.
https://archive.ics.uci.edu/ml/datasets/Statlog+(Vehicle+Silhouettes)
- fetch_waveform()
Download and transform the Waveform Database Generator (version 2) Data Set.
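- imbalance_datasets(imbalance_ratio: float, random_state: int = None)
Appends imbalanced versions of the datasets, with a predefined imbalance ratio, to self.content_. The imbalance ratio is defined as
\[IR = \frac{|C_{maj}|}{|C_{min}|}\]
where \(C_{maj}\) and \(C_{min}\) are the majority and minority classes.
- Parameters: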
- imbalance_ratio : float
Final imbalance ratio expected in the datasets.
- random_state : int, RandomState instance, default=None
Control the randomization of the algorithm.
If int, random_state is the seed used by the random number generator;
If RandomState instance, random_state is the random number generator;
If None, the random number generator is the RandomState instance used by np.random.
- Returns:
- self : Datasets
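To make the imbalance ratio definition concrete, here is a minimal sketch that computes IR for a list of labels and subsamples the smallest class until a target IR is reached. It is an illustration only: the helper names `imbalance_ratio` and `undersample_minority` are hypothetical, the strategy (shrinking only the smallest class) is an assumption, and nothing here claims to mirror how `imbalance_datasets` is implemented. It uses the standard library `random.Random` rather than a NumPy `RandomState`.

```python
from collections import Counter
import random


def imbalance_ratio(labels):
    """Return IR = |C_maj| / |C_min| for a sequence of class labels."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def undersample_minority(labels, target_ir, random_state=None):
    """Return sorted indices of a subsample whose IR approaches target_ir.

    Hypothetical strategy (an assumption, not the library's method):
    keep one majority class whole and shrink the smallest class(es) to
    about |C_maj| / target_ir samples. Classes are never grown or emptied.
    """
    rng = random.Random(random_state)
    counts = Counter(labels)
    maj_cls = max(counts, key=counts.get)
    n_maj, n_min = counts[maj_cls], min(counts.values())
    new_min = max(1, min(n_min, int(n_maj / target_ir)))

    kept = []
    for cls, n in counts.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        if cls != maj_cls and n == n_min:
            # Randomly keep only new_min samples of the smallest class.
            idx = rng.sample(idx, new_min)
        kept.extend(idx)
    return sorted(kept)


# Toy labels: 40 / 30 / 20 samples per class.
labels = [0] * 40 + [1] * 30 + [2] * 20
kept = undersample_minority(labels, target_ir=4, random_state=0)
subsample = [labels[i] for i in kept]
```

Passing the same `random_state` twice selects the same indices, which is the reproducibility behaviour the `random_state` parameter above is meant to provide.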
- items()
- keys()
- save(path, db_name)
Save datasets.
- summarize_datasets()
Create a summary of the downloaded datasets.
- Returns:
- datasets_summary : pd.DataFrame
DataFrame with summary statistics of all datasets.
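The reference does not list which statistics the summary contains. As a rough sketch of what a per-dataset summary could look like, the snippet below builds one row per dataset from plain Python lists. The function name `summarize`, the `(name, (X, y))` input layout, and the column names are all assumptions made for illustration; they are not taken from the library.

```python
from collections import Counter


def summarize(datasets):
    """Build one summary row per (name, (X, y)) pair.

    Column names are illustrative guesses, not the library's output.
    """
    rows = []
    for name, (X, y) in datasets:
        counts = Counter(y)
        rows.append({
            "Dataset": name,
            "Observations": len(X),
            "Features": len(X[0]) if X else 0,
            "Classes": len(counts),
            # Imbalance ratio: majority class size over minority class size.
            "IR": max(counts.values()) / min(counts.values()),
        })
    return rows


# Toy dataset: 3 observations, 2 features, labels 0/0/1.
toy = [("toy", ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0, 0, 1]))]
summary = summarize(toy)
```

A list of dictionaries like `summary` can be passed to `pandas.DataFrame` to obtain a tabular view comparable in shape to the documented return type.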
- values()