by Tyler
Last Updated October 02, 2018 10:19 AM

I've come to know that normalization (min-max scaling) and standardization (z-score normalization) respond differently to outliers in the data. In About Feature Scaling and Normalization, the author Sebastian Raschka says:

"(Min-Max scaling...) The cost of having this bounded range - in contrast to standardization - is that we will end up with smaller standard deviations, which can suppress the effect of outliers."

In his book Python Machine Learning, 2nd Edition, he mentioned:

"Furthermore, standardization maintains useful information about outliers and makes the algorithm less sensitive to them in contrast to min-max scaling, which scales the data to a limited range of values."

**What does he mean when he says that min-max scaling can suppress the effect of outliers compared to standardization?**

**What does he mean when he says that standardization maintains useful information about the outliers and makes the algorithm less sensitive to them, in contrast to min-max scaling?**

*Can this be explained from the perspective of the equations?*

Standardization: $z=\frac{x-\mu}{\sigma}$

Normalization: $x_{norm}=\frac{x-x_{min}}{x_{max}-x_{min}}$
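To make the question concrete, here is a small NumPy sketch (the data values are made up for illustration) comparing the two equations on a feature with one large outlier. Under min-max scaling the outlier becomes $x_{max}$, so the remaining points are squeezed into a tiny sub-range of $[0, 1]$; under standardization the outlier keeps a large z-score, so its "extremeness" is still visible:

```python
import numpy as np

# Hypothetical 1-D feature: four inliers and one large outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-max scaling: output is bounded to [0, 1], but the outlier (100)
# defines x_max, so the four inliers collapse into roughly [0, 0.03].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standardization: output is unbounded; the outlier retains a z-score
# of about 2, so information about how far out it lies is preserved.
x_std = (x - x.mean()) / x.std()

print("min-max:     ", x_minmax)
print("standardized:", x_std)
```

Note that whether the squeezing of the inliers counts as "suppressing the outlier" or as "distorting the inliers" depends on what the downstream algorithm consumes, which is presumably the ambiguity the question is asking about.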
