mlresearch.datasets.MultiClassDatasets

class mlresearch.datasets.MultiClassDatasets(names: str | list = 'all', data_home: str = None, download_if_missing: bool = True)

Class to download, transform and save multi-class datasets.


download(keep_index=False)

Download the datasets.

fetch_asp_potassco()

Download and transform the ASP-POTASSCO Data Set.

https://www.openml.org/d/41705

fetch_autouniv_au4()

Download and transform the AutoUniv au4 Data Set.

https://www.openml.org/d/1548

fetch_autouniv_au7()

Download and transform the AutoUniv au7 Data Set.

https://www.openml.org/d/1552

fetch_baseball()

Download and transform the Baseball Hall of Fame Data Set.

https://www.openml.org/d/185

fetch_cardiotocography()

Download and transform the Cardiotocography Data Set.

https://www.openml.org/d/1560

fetch_first_order_theorem()

Download and transform the First Order Theorem Data Set.

https://www.openml.org/d/1475

fetch_gas_drift()

Download and transform the Gas Drift Data Set.

https://www.openml.org/d/1476

fetch_gesture_segmentation()

Download and transform the Gesture Phase Segmentation Data Set.

https://www.openml.org/d/4538

fetch_image_segmentation()

Download and transform the Image Segmentation Data Set.

https://www.openml.org/d/40984

fetch_mfeat_zernike()

Download and transform the Multiple Features Dataset: Zernike Data Set.

https://www.openml.org/d/22

fetch_mice_protein()

Download and transform the Mice Protein Data Set.

https://www.openml.org/d/40966

fetch_pendigits()

Download and transform the Pen-Based Recognition of Handwritten Digits Data Set.

https://www.openml.org/d/32

fetch_steel_plates()

Download and transform the Steel Plates Fault Data Set.

https://www.openml.org/d/40982

fetch_texture()

Download and transform the Texture Data Set.

https://www.openml.org/d/40499

fetch_usps()

Download and transform the USPS Data Set.

https://www.openml.org/data/get_csv/19329737/usps.arff

fetch_vehicle()

Download and transform the Vehicle Silhouettes Data Set.

https://archive.ics.uci.edu/ml/datasets/Statlog+(Vehicle+Silhouettes)

fetch_volkert()

Download and transform the Volkert Data Set.

https://www.openml.org/d/41166

fetch_vowels()

Download and transform the Vowels Data Set.

https://www.openml.org/d/375

fetch_waveform()

Download and transform the Waveform Database Generator (version 2) Data Set.

https://www.openml.org/d/60

fetch_wine_quality()

Download and transform the Wine Quality Data Set.

https://www.openml.org/d/40691

imbalance_datasets(imbalance_ratio: float, random_state: int = None)

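Appends imbalanced versions of the datasets, with the given imbalance ratio, to self.content_. The imbalance ratio (IR) is the number of samples in the majority class divided by the number of samples in the minority class:

\[IR = \frac{|C_{maj}|}{|C_{min}|}\]

Parameters:
imbalance_ratio : float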

Final Imbalance Ratio expected in the datasets.

random_state : int, RandomState instance, default=None

Control the randomization of the algorithm.

  • If int, random_state is the seed used by the random number generator;

  • If RandomState instance, random_state is the random number generator;

  • If None, the random number generator is the RandomState instance used by np.random.
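
The three cases above follow the usual NumPy/scikit-learn convention. A minimal sketch of how such an argument could be resolved is shown below; `resolve_random_state` is a hypothetical helper written for illustration, not a function of this package, and the library's own handling may differ.

```python
import numpy as np


def resolve_random_state(random_state=None):
    """Turn a random_state argument into a RandomState instance.

    Hypothetical helper for illustration only.
    """
    if random_state is None:
        # The global RandomState instance used by the np.random functions.
        return np.random.mtrand._rand
    if isinstance(random_state, (int, np.integer)):
        # An integer is used as the seed of a new generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used as-is.
        return random_state
    raise ValueError(f"Cannot use {random_state!r} as a random state.")


# The same integer seed gives the same sequence of draws.
first = resolve_random_state(0).permutation(5)
second = resolve_random_state(0).permutation(5)
print((first == second).all())
```

Passing an integer is the simplest way to make repeated calls reproducible.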

Returns:
self : Datasets
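
To make the imbalance ratio concrete, the sketch below computes IR from a list of class labels and then drops minority samples at random until a target IR is reached. Both functions are hypothetical and use only the standard library. The undersampling strategy (shrinking only the single smallest class) is an assumption made for illustration; it is not necessarily how imbalance_datasets resamples multi-class data.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """IR = |C_maj| / |C_min| for a sequence of class labels."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("At least two classes are needed.")
    return max(counts.values()) / min(counts.values())


def undersample_minority(labels, target_ir, seed=None):
    """Return the indices to keep so the smallest class shrinks to
    roughly |C_maj| / target_ir samples (assumes a unique smallest class)."""
    counts = Counter(labels)
    majority_size = max(counts.values())
    minority_class = min(counts, key=counts.get)
    n_keep = max(1, int(majority_size / target_ir))
    if n_keep >= counts[minority_class]:
        # Already at least as imbalanced as requested: keep everything.
        return list(range(len(labels)))
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_class]
    kept = set(rng.sample(minority_idx, n_keep))
    return [i for i, y in enumerate(labels) if y != minority_class or i in kept]


labels = ["a"] * 90 + ["b"] * 30 + ["c"] * 10
print(imbalance_ratio(labels))  # 9.0

keep = undersample_minority(labels, target_ir=18, seed=0)
resampled = [labels[i] for i in keep]
print(imbalance_ratio(resampled))  # 18.0
```

In this example the majority class has 90 samples, so a target IR of 18 leaves 5 samples in the minority class.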

items()

keys()

save(path, db_name)

Save datasets.

summarize_datasets()

Create a summary of the downloaded datasets.

Returns:
datasets_summary : pd.DataFrame

Dataframe with summary statistics of all datasets.
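
The exact columns of the summary are not listed here. As a rough sketch of what per-dataset summary statistics might look like, the hypothetical `summarize` function below builds one row per dataset from a mapping of names to `(X, y)` pairs. The input layout and the chosen statistics (sample, feature and class counts, plus the imbalance ratio) are assumptions for illustration.

```python
from collections import Counter


def summarize(datasets):
    """Return one summary row (a dict) per dataset.

    `datasets` maps a name to an (X, y) pair, where X is a list of
    feature rows and y is a list of class labels. Hypothetical layout.
    """
    rows = []
    for name, (X, y) in datasets.items():
        class_counts = Counter(y)
        rows.append(
            {
                "dataset": name,
                "n_samples": len(X),
                "n_features": len(X[0]) if X else 0,
                "n_classes": len(class_counts),
                "imbalance_ratio": max(class_counts.values())
                / min(class_counts.values()),
            }
        )
    return rows


# A toy dataset with four samples, two features and two classes.
toy = {
    "toy_a": (
        [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
        [0, 0, 0, 1],
    )
}
summary_rows = summarize(toy)
print(summary_rows)
```

A list of dictionaries like this can be passed to `pandas.DataFrame(summary_rows)` to obtain a table with one row per dataset.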

values()