AI/ML, Data Science

Naïve Bayes Classification Model for Natural Language Processing Problem using Python

In this article, learn how to apply a Naïve Bayes classification model to solve a Natural Language Processing (NLP) problem in Python. Here are the steps we will cover: download a sample dataset; split the dataset into train and test data; vectorize the data; build the model and measure its accuracy. References. Further reference …
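The steps listed above can be sketched end to end with the standard library alone. This is a minimal illustration, not the article's code: the toy `DATA` rows, the every-fourth-row split, and the helper names (`train_test_split`, `vectorize`, `fit`, `predict`) are all assumptions made for this sketch, and the classifier is a hand-written multinomial Naïve Bayes with Laplace smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy dataset of (text, label) pairs; not the article's sample dataset.
DATA = [
    ("free prize click now", "spam"),
    ("win money free offer", "spam"),
    ("claim your free prize", "spam"),
    ("limited offer win now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("see you at lunch", "ham"),
    ("project meeting notes attached", "ham"),
    ("lunch tomorrow with the team", "ham"),
]

def train_test_split(rows, test_every=4):
    """Deterministic split: every `test_every`-th row goes to the test set."""
    train = [r for i, r in enumerate(rows) if (i + 1) % test_every != 0]
    test = [r for i, r in enumerate(rows) if (i + 1) % test_every == 0]
    return train, test

def vectorize(text):
    """Bag-of-words vector: a mapping from token to its count."""
    return Counter(text.lower().split())

def fit(train_rows):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in train_rows:
        class_counts[label] += 1
        counts = vectorize(text)
        word_counts[label].update(counts)
        vocab.update(counts)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word, n in vectorize(text).items():
            if word not in vocab:
                continue  # ignore words never seen in training
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += n * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

train_rows, test_rows = train_test_split(DATA)
model = fit(train_rows)
correct = sum(predict(model, text) == label for text, label in test_rows)
accuracy = correct / len(test_rows)
print(f"accuracy on {len(test_rows)} test rows: {accuracy:.2f}")
```

With a real dataset, a library vectorizer and classifier would normally replace these hand-written helpers; the sketch only makes the order of the steps concrete.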

Outliers

Handling Outliers in Python. How to detect outliers. Histogram: a histogram also displays these outliers clearly. Scatter plot: if there is more than one variable, a scatter plot is also useful for detecting outliers visually. Handling outliers: if we can’t rectify the outliers, we may consider some of the following methods to handle them. Doing …
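To see why a histogram makes outliers visible, here is a minimal text-only sketch. The `values` list, the bin width of 5, and the `histogram` helper are assumptions chosen for illustration, and the helper handles integers only; a plotting library would normally draw the chart.

```python
from collections import Counter

# Hypothetical sample: most values sit between 10 and 14, plus one extreme value.
values = [10, 11, 12, 12, 13, 11, 10, 14, 12, 13, 11, 48]

def histogram(data, bin_width):
    """Count how many integer values fall into each bin of width `bin_width`."""
    counts = Counter((v // bin_width) * bin_width for v in data)
    return dict(sorted(counts.items()))

BIN_WIDTH = 5
bins = histogram(values, BIN_WIDTH)

# Print every bin between the smallest and largest, including empty ones,
# so the gap between the bulk of the data and the extreme value is visible.
for start in range(min(bins), max(bins) + 1, BIN_WIDTH):
    bar = "#" * bins.get(start, 0)
    print(f"{start:>3}-{start + BIN_WIDTH - 1:<3} {bar}")
```

The single bar separated from the rest by a run of empty bins is the kind of pattern a histogram exposes.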

Feature Selection: Filter method, Wrapper method and Embedded method

In this post, let us explore: what feature selection is, why we need to perform it, and the methods. What is Feature Selection? Feature selection means selecting and retaining only the most important features in the model. Feature selection is different from feature extraction. In feature selection, we subset the features, whereas in feature extraction, we …
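As a small illustration of "subsetting the features", the sketch below keeps only the columns whose variance exceeds a threshold. A variance threshold is one simple filter-style criterion chosen here as an assumption; the post itself may use other methods. The `features` data and the `select_by_variance` helper are hypothetical.

```python
from statistics import pvariance

# Hypothetical dataset: each key is a feature name, each list its column of values.
features = {
    "age": [25, 32, 47, 51, 38],
    "income": [30, 42, 80, 95, 60],
    "constant_flag": [1, 1, 1, 1, 1],
}

def select_by_variance(columns, threshold):
    """Keep only the columns whose population variance exceeds `threshold`."""
    return {
        name: col for name, col in columns.items() if pvariance(col) > threshold
    }

selected = select_by_variance(features, threshold=0.0)
print(sorted(selected))
```

The retained columns are unchanged copies of the originals; nothing new is computed, which is what distinguishes selection from creating new features.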

Principal Component Analysis

Principal Component Analysis (PCA) explained with examples. In this post, let us understand what Principal Component Analysis (PCA) is, when to use it and what its advantages are, and how to perform PCA in Python with an example. What is Principal Component Analysis (PCA)? Principal Component Analysis is an unsupervised data analysis technique. It is …

Random Forest

In this post, let us explore Random Forest: when to use it, its advantages, disadvantages, hyperparameters, and examples. When to use: random forest can be used for both classification and regression tasks. If a single decision tree is overfitting the data, a random forest can help reduce the overfitting and improve accuracy. Advantages …

Basic concepts

Time Series – Basic concepts. Resampling: A) Downsampling: in simple terms, it is like aggregating, for example converting daily data to monthly data, or quarterly data to yearly data. In the following example, I have converted daily data to weekly data. B) Upsampling: here we increase the frequency of the time series data. It is …
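A minimal standard-library sketch of both directions is shown here; it is not the post's own example. The 14-day `daily` series, the choice of the weekly mean as the aggregate, and the forward-fill rule used for upsampling are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# Hypothetical daily series: 14 consecutive days starting on Monday 2024-01-01.
start = date(2024, 1, 1)
daily = {start + timedelta(days=i): float(i + 1) for i in range(14)}

def downsample_weekly(series):
    """Aggregate daily values into weekly means, keyed by each week's Monday."""
    buckets = defaultdict(list)
    for day, value in series.items():
        week_start = day - timedelta(days=day.weekday())
        buckets[week_start].append(value)
    return {week: mean(vals) for week, vals in sorted(buckets.items())}

def upsample_daily(series):
    """Expand weekly values to daily rows by repeating each value (forward fill)."""
    out = {}
    for week_start, value in series.items():
        for offset in range(7):
            out[week_start + timedelta(days=offset)] = value
    return out

weekly = downsample_weekly(daily)
for week_start, value in weekly.items():
    print(week_start.isoformat(), value)

refilled = upsample_daily(weekly)
print(len(refilled), "daily rows after upsampling")
```

Downsampling loses detail (14 values become 2), and upsampling cannot restore it: the refilled daily rows only repeat the weekly value unless some interpolation rule is applied.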

Feature Engineering for machine learning

In this post, let us explore: the difference between Feature Selection, Feature Extraction, Feature Engineering and Feature Learning; the process of Feature Engineering; and examples of Feature Engineering. Feature engineering and feature extraction are similar: both refer to creating new features from the existing features. Feature engineering refers to creating a new feature when we could …
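As one hedged illustration of "creating new features from the existing features", the sketch below derives two features from a date column. The `orders` records, the field names, and the `add_date_features` helper are hypothetical and not taken from the post.

```python
from datetime import date

# Hypothetical raw records; `order_date` is the existing feature we build on.
orders = [
    {"order_date": date(2024, 1, 5), "amount": 120.0},  # a Friday
    {"order_date": date(2024, 1, 6), "amount": 80.0},   # a Saturday
]

def add_date_features(record):
    """Return a copy of `record` with features derived from `order_date`."""
    day = record["order_date"]
    return {
        **record,
        "day_of_week": day.weekday(),       # Monday is 0, Sunday is 6
        "is_weekend": day.weekday() >= 5,   # Saturday or Sunday
    }

engineered = [add_date_features(r) for r in orders]
for row in engineered:
    print(row["order_date"].isoformat(), row["day_of_week"], row["is_weekend"])
```

The original columns are kept and the derived ones are added alongside them, so a later feature selection step can still decide which to retain.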

Ensemble Models

In this post, let us explore: Ensemble Models, Bagging, Boosting, and Stacking. Ensemble Models/Methods/Learning: ensembling means combining predictions from different models to get better performance. How do we combine the predictions from different models into a single prediction? There are different ways of doing it: Bagging, Boosting, and Stacking. These are the major types of ensembles. To …
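The simplest way to turn several models' predictions into one is a majority vote per sample, which is the combination rule commonly used for classification in bagging-style ensembles. The sketch below assumes three hypothetical models whose predictions are already available; it does not train anything, and `majority_vote` is an illustrative helper rather than a library function.

```python
from collections import Counter

# Hypothetical predictions from three models for the same five samples.
predictions = {
    "model_a": ["cat", "dog", "dog", "cat", "dog"],
    "model_b": ["cat", "cat", "dog", "cat", "dog"],
    "model_c": ["dog", "dog", "dog", "cat", "cat"],
}

def majority_vote(per_model):
    """Combine per-model label lists into one list by majority vote per sample."""
    combined = []
    for votes in zip(*per_model.values()):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

ensemble = majority_vote(predictions)
print(ensemble)
```

With an odd number of models and two labels there are no ties; with more labels or an even number of models, a tie-breaking rule would need to be chosen explicitly.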
