Powerful, Flexible Tools for a Data-Driven World

As the data deluge continues in today's world, the need to master data mining, predictive analytics, and business analytics has never been greater. These techniques and tools provide unprecedented insights into data, enabling better decision making and forecasting, and ultimately the solution of increasingly complex problems.

Learn from the Creators of the RapidMiner Software

Written by leaders in the data mining community, including the developers of the RapidMiner software, RapidMiner: Data Mining Use Cases and Business Analytics Applications provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors. It presents the most powerful and flexible open source software solutions: RapidMiner and RapidAnalytics. The software and their extensions can be freely downloaded at www.RapidMiner.com.

Understand Each Stage of the Data Mining Process

The book and software tools cover all relevant steps of the data mining process, from data loading, transformation, integration, aggregation, and visualization to automated feature selection, automated parameter and process optimization, and integration with other tools, such as R packages or your IT infrastructure via web services. The book and software also extensively discuss the analysis of unstructured data, including text and image mining.

Easily Implement Analytics Approaches Using RapidMiner and RapidAnalytics

Each chapter describes an application, how to approach it with data mining methods, and how to implement it with RapidMiner and RapidAnalytics. These application-oriented chapters give you not only the necessary analytics to solve problems and tasks, but also reproducible, step-by-step descriptions of using RapidMiner and RapidAnalytics. The case studies serve as blueprints for your own data mining applications, enabling you to effectively solve similar problems.

Cited By

Di Martino S, Landolfi E, Mazzocca N, Rocco di Torrepadula F and Starace L (2024). A visual-based toolkit to support mobility data analytics, Expert Systems with Applications: An International Journal , 238 :PC , Online publication date: 15-Mar-2024 .

Barron-Lugo J, Gonzalez-Compean J, Lopez-Arevalo I, Carretero J and Martinez-Rodriguez J (2023). Xel, Future Generation Computer Systems , 145 :C , (87-103), Online publication date: 1-Aug-2023 .

Tabakhi S and Moradi P (2023). Universal feature selection tool (UniFeat), Neurocomputing , 535 :C , (156-165), Online publication date: 28-May-2023 .

Lughofer E and Pratama M (2023). Evolving multi-user fuzzy classifier system with advanced explainability and interpretability aspects, Information Fusion , 91 :C , (458-476), Online publication date: 1-Mar-2023 .

de Paula Vidal G, Caiado R, Scavarda L, Ivson P and Garza-Reyes J (2022). Decision support framework for inventory management combining fuzzy multicriteria methods, genetic algorithm, and artificial neural network, Computers and Industrial Engineering , 174 :C , Online publication date: 1-Dec-2022 .

Karmaker (“Santu”) S, Hassan M, Smith M, Xu L, Zhai C and Veeramachaneni K (2021). AutoML to Date and Beyond: Challenges and Opportunities, ACM Computing Surveys , 54 :8 , (1-36), Online publication date: 30-Nov-2022 .

Lughofer E (2022). Evolving multi-user fuzzy classifier systems integrating human uncertainty and expert knowledge, Information Sciences: an International Journal , 596 :C , (30-52), Online publication date: 1-Jun-2022 .

Masson M, Cayèré C, Bessagnet M, Sallaberry C, Roose P and Faucher C An ETL-like platform for the processing of mobility data Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, (547-555)

Saadallah A, Finkeldey F, Buß J, Morik K, Wiederkehr P and Rhode W (2022). Simulation and sensor data fusion for machine learning application, Advanced Engineering Informatics , 52 :C , Online publication date: 1-Apr-2022 .

Siegert I, Weißkirchen N, Krüger J, Akhtiamov O and Wendemuth A (2022). Admitting the addressee detection faultiness of voice assistants to improve the activation performance using a continuous learning framework, Cognitive Systems Research , 70 :C , (65-79), Online publication date: 1-Dec-2021 .

Françoise J, Caramiaux B and Sanchez T Marcelle: Composing Interactive Machine Learning Workflows and Interfaces The 34th Annual ACM Symposium on User Interface Software and Technology, (39-53)

Cabero I, Epifanio I, Piérola A and Ballester A (2021). Archetype analysis, Knowledge-Based Systems , 217 :C , Online publication date: 6-Apr-2021 .

Jafarian T, Masdari M, Ghaffari A and Majidzadeh K (2021). SADM-SDNC: security anomaly detection and mitigation in software-defined networking using C-support vector classification, Computing , 103 :4 , (641-673), Online publication date: 1-Apr-2021 .

Bjaoui M, Sakly H, Said M, Kraiem N and Bouhlel M Depth insight for data scientist with RapidMiner « an innovative tool for AI and big data towards medical applications» Proceedings of the 2nd International Conference on Digital Tools & Uses Congress, (1-6)

Bielby J, Kuhn S, Colreavy-Donnelly S, Caraffini F, O’Connor S and Anastassi Z Identifying Parkinson’s Disease Through the Classification of Audio Recording Data 2020 IEEE Congress on Evolutionary Computation (CEC), (1-7)

Meşecan İ, Çiço B and Ömür Bucak İ (2020). Feature vector for underground object detection using B-scan images from GprMax, Microprocessors & Microsystems , 76 :C , Online publication date: 1-Jul-2020 .

Bolón-Canedo V and Alonso-Betanzos A (2019). Ensembles for feature selection, Information Fusion , 52 :C , (1-12), Online publication date: 1-Dec-2019 .

Toivonen T and Jormanainen I Evolution of Decision Tree Classifiers in Open Ended Educational Data Mining Proceedings of the Seventh International Conference on Technological Ecosystems for Enhancing Multiculturality, (290-296)

Haynes M, Groen J, Sturzinger E, Zhu D, Shafer J and McGee T Integrating Data Science into a General Education Information Technology Course Proceedings of the 20th Annual SIG Conference on Information Technology Education, (183-188)

Borrison R, Klöpper B and Saini S Industrial Event Log Analyzer - Self-service Data Mining for Domain Experts Machine Learning and Knowledge Discovery in Databases, (794-798)

Sohrabi M and Hemmatian F (2019). An efficient preprocessing method for supervised sentiment analysis by converting sentences to numerical vectors: a twitter case study, Multimedia Tools and Applications , 78 :17 , (24863-24882), Online publication date: 1-Sep-2019 .

Akhtiamov O, Fedotov D and Minker W A Comparative Study of Classical and Deep Classifiers for Textual Addressee Detection in Human-Human-Machine Conversations Speech and Computer, (20-30)

Khan J, Alam A, Hussain J and Lee Y (2019). EnSWF, Applied Intelligence , 49 :8 , (3123-3145), Online publication date: 1-Aug-2019 .

Blincoe K, Dehghan A, Salaou A, Neal A, Linaker J and Damian D (2019). High-level software requirements and iteration changes, Empirical Software Engineering , 24 :3 , (1610-1648), Online publication date: 1-Jun-2019 .

Altakrori M, Iqbal F, Fung B, Ding S and Tubaishat A (2018). Arabic Authorship Attribution, ACM Transactions on Asian and Low-Resource Language Information Processing , 18 :1 , (1-51), Online publication date: 31-Mar-2019 .

Flores V, Keith B and Domingo R (2019). Gradient Boosted Trees Predictive Models for Surface Roughness in High-Speed Milling in the Steel and Aluminum Metalworking Industry, Complexity , 2019 , Online publication date: 1-Jan-2019 .

Kravvaris D and Kermanidis K (2019). Automatic point of interest detection for open online educational video lectures, Multimedia Tools and Applications , 78 :2 , (2465-2479), Online publication date: 1-Jan-2019 .

Abuzaid F, Bailis P, Ding J, Gan E, Madden S, Narayanan D, Rong K and Suri S (2018). MacroBase, ACM Transactions on Database Systems , 43 :4 , (1-45), Online publication date: 16-Dec-2018 .

Camara C, Peris-Lopez P, Gonzalez-Manzano L and Tapiador J (2018). Real-time electrocardiogram streams for continuous authentication, Applied Soft Computing , 68 :C , (784-794), Online publication date: 1-Jul-2018 .

Ahmed M (2018). Reservoir-based network traffic stream summarization for anomaly detection, Pattern Analysis & Applications , 21 :2 , (579-599), Online publication date: 1-May-2018 .

Colucci S, Donini F and Di Sciascio E (2017). Logical comparison over RDF resources in bio-informatics, Journal of Biomedical Informatics , 76 :C , (87-101), Online publication date: 1-Dec-2017 .

Usaphapanus P and Piromsopa K Performance Analysis of Computer Virus Detection from Binary Code using Ensemble Classifier Proceedings of the 9th International Conference on Signal Processing Systems, (8-12)

Mivule K (2017). Data Swapping for Private Information Sharing of Web Search Logs, Procedia Computer Science , 114 :C , (149-158), Online publication date: 1-Nov-2017 .

Ahmad N and Siddique J (2017). Personality Assessment using Twitter Tweets, Procedia Computer Science , 112 :C , (1964-1973), Online publication date: 1-Sep-2017 .

Baralis E, Cagliero L, Cerquitelli T, Garza P and Pulvirenti F (2017). Discovering profitable stocks for intraday trading, Information Sciences: an International Journal , 405 :C , (91-106), Online publication date: 1-Sep-2017 .

Raasveldt M and Mühleisen H (2017). Don't hold my data hostage, Proceedings of the VLDB Endowment , 10 :10 , (1022-1033), Online publication date: 1-Jun-2017 .

Dehghan A, Neal A, Blincoe K, Linaker J and Damian D Predicting likelihood of requirement implementation within the planned iteration Proceedings of the 14th International Conference on Mining Software Repositories, (124-134)

Bailis P, Gan E, Madden S, Narayanan D, Rong K and Suri S MacroBase Proceedings of the 2017 ACM International Conference on Management of Data, (541-556)

Neto R, Jorge Adeodato P and Carolina Salgado A (2017). A framework for data transformation in Credit Behavioral Scoring applications based on Model Driven Development, Expert Systems with Applications: An International Journal , 72 :C , (293-305), Online publication date: 15-Apr-2017 .

Ahmed M (2017). Thwarting DoS Attacks: A Framework for Detection based on Collective Anomalies and Clustering, Computer , 50 :9 , (76-82), Online publication date: 1-Jan-2017 .

Sinnott R, Thomas N, Bansal H and Zhao Z My ever changing moods Proceedings of the 9th International Conference on Utility and Cloud Computing, (175-184)

Vyas R, Bapat S, Jain E, Karthikeyan M, Tambe S and Kulkarni B (2016). Building and analysis of protein-protein interactions related to diabetes mellitus using support vector machine, biomedical text mining and network analysis, Computational Biology and Chemistry , 65 :C , (37-44), Online publication date: 1-Dec-2016 .

Dehghan A, Blincoe K and Damian D A hybrid model for task completion effort estimation Proceedings of the 2nd International Workshop on Software Analytics, (22-28)

Bolt A, Leoni M and Aalst W (2016). Scientific workflows for process mining, International Journal on Software Tools for Technology Transfer (STTT) , 18 :6 , (607-628), Online publication date: 1-Nov-2016 .

Khan J, Jeong B, Lee Y and Alam A Sentiment analysis at sentence level for heterogeneous datasets Proceedings of the Sixth International Conference on Emerging Databases: Technologies, Applications, and Theory, (159-163)

Borg M (2016). TuneR, Journal of Software: Evolution and Process , 28 :6 , (427-459), Online publication date: 1-Jun-2016 .

Hu X, Zhang Y, Chu S and Ke X Towards personalizing an e-quiz bank for primary school students Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, (25-29)

Altair Engineering Gmbh


Reviews

Reviewer: Robert M. Lynch

Hofmann and Klinkenberg have produced a fine collection of essays on data mining and analytic models, presented in several cross-disciplinary cases. This book describes data mining and case applications using RapidMiner models and analytic techniques (http://www.Rapidminer.com). RapidMiner is a system for the design and documentation of an overall data mining process. The system offers a comprehensive set of operators and structures that can be used to express the control flow of the process using drag-and-drop tools. The book focuses on the fine details of RapidMiner, from data preparation to model development to evaluating and visualizing the results; further support is freely offered at the website.

The book represents the work of more than 30 contributors. Managing the writing styles of so many contributors is a challenging task, and the editors are to be commended for their effort. The material flows well, is very readable, and easily transitions from chapter to chapter and section to section.

The book is divided into ten sections, each focusing on a different disciplinary area and a different analytic and mining model. Each section includes one or more cases. Section 1 introduces RapidMiner and data mining in general. Section 2 discusses basic classification, using cases in credit approval, teaching assistant selection, and nursery school selection or rejection. The classification models consider k-nearest neighbor classification and naïve Bayesian classification. Section 3 explores cases in marketing, cross-selling, and recommender systems for higher education programs. Section 4 focuses on medical and educational cases, with a focus on clustering algorithms. Section 5 discusses the mining of text rather than numerical variable data, such as in spam detection, language identification, and customer feedback. RapidMiner includes a text processing extension.

Section 6 covers feature selection and classification in astroparticle physics and carpal tunnel syndrome in medicine. A feature selection extension is available in RapidMiner. Section 7 presents models and mining results for datasets in molecular structure and the modeling of property-activity relationships in biochemistry and medicine. Section 8 focuses on image mining, including feature extraction, segmentation, and classification. Section 9 examines anomaly detection, instance selection, and the construction of prototypes. Finally, Section 10 models meta-learning and automated learner selection.

In each section, the authors introduce a mining activity, the related model, and the analytic techniques used. The datasets are described, the RapidMiner requirements are enumerated, and the analysis is summarized. The reader can refer to the companion web site for downloadable code and datasets for each case (http://www.rapidminerbook.com). If you would like more information about the book, this is the place to look.

Data mining requires a basic knowledge of clustering and classification algorithms, linear models, principal component analysis, and factor analysis, among other grouping and discrimination techniques. A background in such topics would be helpful for reading the book. Similarly, some technical savvy will be useful. Tutorials, manuals, and other support material for RapidMiner are available at http://rapid-i.com.

This is a good book. If you are looking for some very interesting data mining cases, or if you would like to learn RapidMiner, it will not disappoint. The bibliographic references are lengthy and the indices are well done.


Recommendations

Mining the Web of Linked Data with RapidMiner

Lots of data from different domains are published as Linked Open Data (LOD). While there are quite a few browsers for such data, as well as intelligent tools for particular purposes, a versatile tool for deriving additional knowledge by mining the Web .

A Middle-School Module for Introducing Data-Mining, Big-Data, Ethics and Privacy Using RapidMiner and a Hollywood Theme

SIGCSE '18: Proceedings of the 49th ACM Technical Symposium on Computer Science Education

Today's organizations, including online businesses, use the art of data-driven decision-making i.e. business-intelligence (BI) to benefit from all the data out in the open. Given the current market demand for BI skill-sets, including the knowledge of .

Diabetes Data Analysis and Prediction Model Discovery Using RapidMiner

FGCN '08: Proceedings of the 2008 Second International Conference on Future Generation Communication and Networking - Volume 03

Data mining techniques have been extensively applied in bioinformatics to analyze biomedical data. In this paper, we choose the Rapid-I’s RapidMiner as our tool to analyze a Pima Indians Diabetes Data Set, which collects the information of patients with .