Ensemble model — Bagging

October 27, 2023
By Kunihiro TADA, in Data Science


If you simply apply a machine learning library's algorithms to a real problem exactly as the textbook describes, you will rarely get beyond 70% to 80% performance. Making improvements from there is why the field is called data science. There are generally three kinds of improvement to machine learning methods:

  1. Devise a completely new learning algorithm
  2. Improve part of an existing learning algorithm
  3. Combine existing algorithms

If you can do 1, you are an authority. If you can do 2 or 3, you are probably working at a doctoral level. 2 is demanding because it involves rewriting the library source itself. 3 may be possible by coding in Python, though there are limits to that approach. In some cases it may be more productive for most users to adopt commercial software that already incorporates these improvements.

Among the various techniques our predecessors have used to improve the performance of machine learning algorithms, the methods that combine multiple models are collectively called ensemble models. Currently, there are three main types of ensemble model:

  1. Bagging
  2. Boosting
  3. Stacking

My sense is that these categories did not exist first, with methods then developed to fit them; rather, they are what emerged when the various improvements our predecessors had made were classified after the fact. Bagging is a method in which multiple models are placed in parallel and their results are decided by majority vote or averaged. I think it is almost certain that this arrangement is where the term “ensemble” comes from.
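
As a concrete illustration, here is a minimal bagging sketch in Python. It is not taken from any particular library; the function names and the choice of decision stumps as base models are my own. Each stump is trained on a bootstrap resample of the data, and the ensemble predicts by majority vote.

```python
import random
import statistics

def fit_stump(X, y):
    """Brute-force a one-feature threshold classifier (a decision stump)."""
    best = None  # (accuracy, feature index, threshold, sign)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - t) > 0 else 0 for row in X]
                acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0

def bagging_fit(X, y, n_models=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of (X, y),
    then predict by majority vote over all members."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: statistics.mode(m(row) for m in models)
```

With 25 resamples the vote is already quite stable on small data; real bagging implementations typically also offer out-of-bag error estimates on top of this scheme.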

Boosting, by contrast, is a method of connecting models in series. It involves creating one model and using measures such as the residuals obtained from its results to determine the parameters of the next model. The picture is of a model being improved through repeated rounds of model building.
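
The series arrangement can be sketched in the same toy spirit: each round fits a small regression stump to the residuals left by the ensemble built so far. This illustrates the residual-fitting idea only; it is not a production boosting implementation, and the names are mine.

```python
import statistics

def fit_reg_stump(xs, ys):
    """1-D regression stump: split at a threshold, predict each side's mean."""
    best = None  # (sum of squared errors, threshold, left mean, right mean)
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm = statistics.fmean(left) if left else 0.0
        rm = statistics.fmean(right) if right else 0.0
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boosting_fit(xs, ys, n_rounds=50, lr=0.5):
    """Boosting: each round fits a new stump to the residuals of the
    ensemble so far, then adds it (scaled by lr) to the series."""
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        s = fit_reg_stump(xs, residuals)
        stumps.append(s)
        preds = [p + lr * s(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)
```

Because each stump only has to correct what the previous ones left behind, the training error shrinks geometrically on this kind of toy data.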

Stacking is a method that adds the results obtained from a model built with one technique to the input of a model built with another technique, as new features. As mentioned in an earlier article, an example is including the distances to the cluster centroids obtained by K-means as new features in decision tree learning.
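
A minimal sketch of that K-means example might look like this. It is a toy Lloyd's K-means, initialized from the first k points for simplicity (real implementations use random restarts), and the second-stage decision tree is left out; only the feature augmentation step is shown. The names are my own.

```python
import math
import statistics

def kmeans(points, k=2, iters=20):
    """Plain Lloyd's K-means on tuples of floats."""
    centroids = list(points[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep it if the cluster is empty.
        centroids = [
            tuple(statistics.fmean(dim) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def stack_features(points, centroids):
    """Stacking-style augmentation: append the distance to every K-means
    centroid to each point's original features, ready for a second-stage
    learner such as a decision tree."""
    return [list(p) + [math.dist(p, c) for c in centroids] for p in points]
```

The second-stage learner then sees both the raw coordinates and "how far is this point from each cluster", which a decision tree can split on directly.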

I would like to think a little more about Bagging here. Think of it as an ensemble, or a chorus. A solo by a professional singer is wonderful, but a chorus performed by a group of average people has a quality of its own that a soloist cannot match, even though every member is far from being a professional. For example, if you were the singing director of a J-Pop idol group, what would you do? Probably something like this:

  1. Gather a large pool of idol candidates.
  2. Give them tough lessons.
  3. Promote those who reach a certain level through the lessons to regular members.
  4. Carefully examine the character of each member's voice and have them sing only the parts they are good at.

Bagging as it is widely practised today performs 1 and 2 and then takes a majority vote or an average. There may be implementations that go as far as 3, but unfortunately I do not think there are many that go as far as 4. More concretely, ordinary Bagging creates multiple models by varying the selection of attributes used in each model, but with Self-Organizing Maps you can also vary the weighting of the attributes. Moreover, SOMs make it possible to do 4 by rejecting the results of nodes with high error rates.
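
The selection step (3) can be hinted at in a few lines: given ensemble members paired with their held-out accuracies, keep only those that clear a quality bar before voting. This is my own sketch, not how SOMs do it; the per-region rejection of step 4 that SOMs enable would need the map structure itself, and only the global cut is shown here.

```python
import statistics

def selective_majority(members, row, min_acc=0.6):
    """Step 3 in miniature: drop members whose held-out accuracy is
    below min_acc, then take a majority vote among the survivors."""
    kept = [model for model, acc in members if acc >= min_acc]
    if not kept:
        raise ValueError("no member passed the quality bar")
    return statistics.mode(model(row) for model in kept)
```

With members [(m1, 0.92), (m2, 0.85), (m3, 0.40), (m4, 0.35)], only m1 and m2 get to vote, whereas a naive vote over all four could end in a tie.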

Viscovery SOMine also uses techniques such as Boosting and Stacking in various places, even though it does not call them by those names.

Written by:

Kunihiro TADA

He has watched the industrial booms from the early 1980s to the present day. In 1982 he planned high-tech seminars at the Japan Technology and Economy Centre, and seminars and research projects at JMA Consulting; in 1986 he organised AI chip seminars on fuzzy inference and other topics, triggering the fuzzy boom. After freelance writing on CG and multimedia, he founded the Mindware Research Institute, which has sold the Japanese version of Viscovery SOMine in Japan since 2000, and Hugin and XLSTAT since 2003.
