# AI/ML, Data Science

## Heatmap

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

A heatmap depicts two-dimensional data (in matrix form) as a graph.

Data requirement: the data can be in the form of:

- A matrix, such as a correlation matrix
- A pandas cross-tabulated dataframe

Example:

- Importing the data
- Cross-tabulate the data using `pd.crosstab`
- Plot the heatmap using seaborn …

## Importing-Data

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Importing data into Python

In this post, we will learn:

- How to import data into Python
- How to import time series data
- How to handle different time series formats while importing

A) Importing Normal Data

Suppose you have a data file saved in CSV format on …

## Train-Test split

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Train-Test split and Cross-validation

Building an optimum model that neither underfits nor overfits the dataset takes effort. To estimate the performance of our model on unseen data, we can split the dataset into train and test sets and also perform cross-validation.

Train-Test split

To know the performance of …

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Occam’s Razor, Bias-Variance Tradeoff, No Free Lunch Theorem and The Curse of Dimensionality

In this post, let us discuss some of the basic concepts/theorems used in Machine Learning:

- Occam’s Razor (Law of Parsimony)
- What is the Bias-Variance Tradeoff
- No Free Lunch Theorem
- The Curse of Dimensionality

Occam’s …

## Exploratory Data Analysis

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

“A picture is worth a thousand words”

A complex idea can be understood effectively with the help of visual representations. Exploratory Data Analysis (EDA) helps us understand the nature of the data through summary statistics and visualizations, which capture details that numbers alone can’t. …

## Pre-processing

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Data Preprocessing: Creating Dummy Variables and Converting Ordinal Variables to Numbers with Examples

Data cleaning is a critical step before fitting any statistical model. It includes:

- Handling missing values
- Handling outliers
- Transforming nominal variables to dummy variables (discussed in this post)
- Converting ordinal data to numbers …

## Confusion Matrix

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Confusion Matrix, Accuracy, Precision, Recall, F score explained with an example

In this post, we will learn:

- What accuracy is
- What precision, recall, specificity and F score are
- How to manually calculate these measures
- How to interpret these measures
- What a confusion matrix is and how …

## Outliers

By Dr. Shripad Bhat, Data Scientist, September 4, 2019

Handling Outliers in Python

How to detect outliers

Histogram

A histogram also displays these outliers clearly.

Scatter Plot

If there is more than one variable, a scatter plot is also useful for detecting outliers visually.

Handling Outliers

If we can’t rectify the outliers, then we may think …
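As a rough illustration of the histogram check mentioned in this excerpt, the sketch below bins a small synthetic sample with NumPy and prints a text histogram. The data, the seed, the two injected extreme values, and the bin count are all assumptions made for this example; they are not taken from the original post.

```python
import numpy as np

# Hypothetical sample: 200 values centred near 50, plus two injected extremes.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(loc=50, scale=5, size=200), [120.0, 125.0]])

# Bin the data. Outliers appear as sparsely filled bins that are separated
# from the main mass of the data by a run of empty bins.
counts, edges = np.histogram(values, bins=20)

# Print a simple text histogram, one row per bin.
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:7.1f} to {right:7.1f} | {'#' * int(count)}")
```

In the printed output, the bulk of the sample fills the first few bins, while the injected values sit alone at the far right after a gap of empty bins. The same picture can be drawn graphically by passing the values to a plotting library's histogram function.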