How to Download the Mushroom Data Set from UCI

UCI download/process software repository

Open source Python repository for downloading, processing, folding and describing supervised machine learning datasets from UCI and other raw data repositories.

This GitHub repository is a set of scripts for downloading supervised machine learning datasets from the UCI Machine Learning Repository and processing them into a common format. Originally, it was a fork of the Julia repository JackDunnNZ/uci-data, from which the configuration files were extracted. The UCI ML repository is a useful source of machine learning datasets for testing and benchmarking, but the format of the datasets is not consistent. This means effort is required to make use of new datasets, since each one needs to be read differently.

The main goal of this repository is to process the datasets into a format that can be read by PyRidge, where each row of the final data is as follows:

              attribute_1 attribute_2 ... attribute_n class                          

This makes it easy to switch out datasets in ML problems, which is great when automating things.
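As an illustration of this row layout, the sketch below splits each row into its attributes and its class label using only the standard library. It assumes whitespace-separated fields, numeric attributes and no header row; the actual delimiter and label encoding of the processed files are not stated here, and `parse_rows` is a hypothetical helper rather than part of the repository.

```python
from typing import Iterable, List, Tuple


def parse_rows(lines: Iterable[str]) -> Tuple[List[List[float]], List[str]]:
    """Split rows of 'attribute_1 ... attribute_n class' into (X, y).

    Assumption: fields are whitespace-separated and attributes are numeric.
    """
    features: List[List[float]] = []
    labels: List[str] = []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # Everything except the last field is an attribute; the last is the class.
        *attributes, label = fields
        features.append([float(value) for value in attributes])
        labels.append(label)
    return features, labels


# Made-up sample rows, for illustration only.
sample = ["5.1 3.5 1.4 0", "6.2 2.9 4.3 1"]
X, y = parse_rows(sample)
```

Because the class is always the last field, the same helper would work for any dataset in this layout, whatever its number of attributes.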

Converting to common format

The datasets are not checked in to git, in order to minimise the size of the repository and to avoid rehosting the data. As such, the script downloads any missing datasets directly from UCI as it runs.
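The "download only what is missing" behaviour can be sketched roughly as follows. This is a minimal illustration, not the repository's own code: `download_if_missing` and its `fetch` parameter are hypothetical names, and the real script may decide what counts as missing in a different way.

```python
import os
from typing import Callable
from urllib.request import urlretrieve


def download_if_missing(
    url: str,
    destination: str,
    fetch: Callable[[str, str], object] = urlretrieve,
) -> bool:
    """Download url to destination unless the file already exists.

    Returns True when a download was performed, False when it was skipped.
    """
    if os.path.exists(destination):
        return False
    folder = os.path.dirname(destination)
    if folder:
        # Create the target folder on first use.
        os.makedirs(folder, exist_ok=True)
    fetch(url, destination)
    return True
```

Passing the fetch function as a parameter keeps the existence check easy to try out without network access.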

Running the code

There are two ways of running the code. The easy (if opaque) way is to first run the install_requirements.sh script using bash:

    bash install_requirements.sh

This installs the Python 3 requirements from requirements.txt. The packages required by this library are:

  • numpy
  • pandas
  • sklearn
  • rarfile
  • PyLaTeX
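One quick way to see whether these packages are available in the current environment is to look up their import names. The sketch below is only an illustration: `missing_modules` is a hypothetical helper, and the import names in `REQUIRED` (for example `pylatex` for PyLaTeX) are the usual ones for these packages rather than something taken from the repository.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the names of top-level modules that cannot be found."""
    return [name for name in module_names if find_spec(name) is None]


# Import names assumed for the packages listed above.
REQUIRED = ["numpy", "pandas", "sklearn", "rarfile", "pylatex"]

missing = missing_modules(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```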

After that, the main script

However, it is recommended to use a virtual environment for Python 3, which can be done easily by following the explanation here. The requirements listed above must be installed in this virtual environment. Then you only have to run the scripts in the main directory:

    python download_data.py
    python process_data.py
    python fold_data.py
    python describe_data.py
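If you prefer to launch the four steps from a single Python entry point, something along these lines could work. It is a sketch under assumptions: `build_commands` and `run_pipeline` are hypothetical helpers, not part of the repository, and the script names are the four listed above.

```python
import subprocess
import sys
from typing import List, Sequence

SCRIPTS = ["download_data.py", "process_data.py", "fold_data.py", "describe_data.py"]


def build_commands(scripts: Sequence[str]) -> List[List[str]]:
    """Build one command per script, using the current Python interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_pipeline(scripts: Sequence[str] = SCRIPTS) -> None:
    """Run each script in order; check=True raises at the first failure."""
    for command in build_commands(scripts):
        subprocess.run(command, check=True)


# Call run_pipeline() from the repository's main directory to execute all steps.
```

Using `sys.executable` keeps the scripts inside the same interpreter, and therefore the same virtual environment, that launched the wrapper.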

The data will be downloaded, processed, k-folded and described, in that order. Customizable parameters, such as the folders to process and the number of folds, are found in parameter_config.ini:

    [DOWNLOAD]
    config_folders = datafiles/regression,datafiles/classification
    raw_folder = raw_data
    remove_older = True

    [PROCESS]
    config_folders = datafiles/regression,datafiles/classification
    processed_folder = processed_data
    remove_older = True

    [FOLD]
    processed_folders = processed_data/regression,processed_data/classification
    data_folder = data
    remove_older = True
    n_fold = 10

    [DESCRIBE]
    data_folders = data/regression,data/classification
    description_folder = description
    remove_older = True
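A file in this INI style can be read with Python's standard `configparser` module. How the repository's scripts actually load it is not shown here, so the snippet below is only a sketch; it inlines part of the [FOLD] section so that it runs on its own.

```python
import configparser

# Part of the [FOLD] section, inlined so the example is self-contained.
CONFIG_TEXT = """
[FOLD]
processed_folders = processed_data/regression,processed_data/classification
remove_older = True
n_fold = 10
"""

parser = configparser.ConfigParser()
# For the real file, parser.read("parameter_config.ini") would be used instead.
parser.read_string(CONFIG_TEXT)

fold = parser["FOLD"]
# Comma-separated values become a list of folder paths.
processed_folders = fold["processed_folders"].split(",")
n_fold = fold.getint("n_fold")
remove_older = fold.getboolean("remove_older")
```

`getint` and `getboolean` convert the raw strings, so `n_fold` can be used directly as a number of folds and `remove_older` as a flag.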

Citation policy

Perales-González, Carlos (2020). UCI download-process, v1.3, GitHub repository, https://github.com/cperales/uci-download-process

    @misc{UCI-download-process,
      author = {Perales-González, Carlos},
      title = {UCI download/process},
      year = {2020},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished = {\url{https://github.com/cperales/uci-download-process}},
      tag = {1.3}
    }

Source: https://github.com/cperales/uci-download-process

Posted by: monarrezyousses.blogspot.com
