Hi, I'm Md. Zubair. Prospective Ph.D. Student.

I am a Computer Science graduate currently working as a full-time faculty member at Uttara University. Additionally, I am a Technical Writer (Data Science, Computer Vision) at Deepnote Inc. My professional pursuits align with my research interests, which span Data Science, Machine Learning, Computer Vision, Bioinformatics, and Precision Agriculture. I am actively searching for a Ph.D. opportunity that aligns with these interests.

About me

Information About Me


I am a lecturer at Uttara University, Bangladesh. I have more than two years of teaching experience in an academic setting and currently teach Statistics, Machine Learning, Computer Vision, and Artificial Intelligence classes. I also supervise undergraduate theses. In addition, I work remotely as a Technical Writer at Deepnote Inc., USA.

I completed my four-year B.Sc. degree in Computer Science and Engineering (CSE) at Chittagong University of Engineering & Technology (CUET), Bangladesh. I was first introduced to Machine Learning and Data Science in the third year of my undergraduate studies. Since then, I have believed that data is beautiful and drives the modern era. That is why I am interested in the fields of Data Science, Machine Learning, Computer Vision, Bioinformatics, and Precision Agriculture. To explore these fields in depth, I conduct research under the supervision of professors from home and abroad. My publications and ongoing research are listed in the Research and Publications section. I write to learn. My write-ups are linked in the Blog section.

In my graduation year, I was a research coordinator at "CUET Computer Club" and mentored three teams on Machine Learning. Additionally, I have three months of working experience as a research assistant in machine learning. I spent another three months as a community moderator at Dataquest, where I still help learners as a learning assistant. Moreover, I run a Data Science and ML space on Quora named "ABC of DataScience and ML" with 88.5k+ followers. I like to share knowledge by writing articles on Medium (most of them have been published in the Towards Data Science publication). More than 42 articles have been published so far, and counting.

I am actively pursuing a graduate study opportunity (especially Ph.D.) with a teaching and/or research assistantship position. Also, I am interested in collaborating on different fascinating and challenging Computer Vision, Machine Learning, and Data Science projects. I am self-motivated toward research and confident that my experience and insights can bring a positive change to any research project and expedite the progress toward the goal.

I consider myself communicative, organized, responsible, studious, and proactive. I have strong academic writing skills and can work well with teammates.

Programming Skills

Python

90%

C++

50%

Java

40%

JavaScript

30%

Job Experience

2021 August - Present

Lecturer, Department of CSE, Uttara University

I have been working for 1 year and 1 month

2023 August - Present

Technical Writer (Remote), Deepnote Inc., USA

Writing top-notch Data Science articles.

Voluntary Experience

2022 March - Present

Tier-2 Learning Assistant, Dataquest

I help enthusiastic Data Science learners.

2020 March - 2020 August

Community Moderator, Dataquest

I served the community for more than 3 months.

2019 September - 2020 December

Research Coordinator, CUET Computer Club

Coordinating Undergraduate Students' Research

Fields of Interest

Machine Learning

Data Science

Computer Vision

Bioinformatics

Precision Agriculture

Artificial Intelligence

Developed by Md. Zubair © Copyright 2022. All rights reserved.

My Portfolio

Here is some of the work I've done.

Computer vision helps us recognize images. Using the face recognition technique of computer vision, I developed this automated attendance system. The system automatically detects faces and creates an Excel sheet in the backend containing the name, ID, and time.

Project Source

This project is based on hand gesture recognition. It controls the mouse pointer and other tasks with specific hand gestures. I use MediaPipe to track the hand.

Project Source

This website shows my profile at a glance. I designed it with basic HTML, CSS, and JavaScript. I have tried to integrate an overview of my expertise, degrees, publications, and other achievements into a single website.

Project Source

I frequently write scientific articles on different online platforms. Most of the articles are based on different projects or data science techniques, so I have collected all the code in a single repository.

Project Source

I have implemented some machine learning algorithms from scratch to gain insight into the main techniques behind them.

Project Source

Contact Me

Contact Me here

I am a Computer Science graduate currently working as a university faculty member. I'm actively searching for a Ph.D. opportunity in the same discipline.

Location

Uttara, Dhaka-1230, Bangladesh.

Phone

+8801766722703

Email

zubairhossain773@gmail.com

Language

Bangla, English.

My Scientific Articles

Statistics for Data Science

Ultimate Guide to Statistics for Data Science

Statistics at a glance for data science: standard guidelines

Less is More; the ‘Art’ of Sampling

Increase your power to analyze vast datasets with sampling.

Get Familiar with the Most Powerful Weapon of Data Science ~Variables

Basic concepts of variable types, levels of measurement, and different representation techniques with Python.

To Increase Data Analysing Power You Must Know Frequency Distribution.

Data plays a key role in every organization because it helps business leaders to make suitable decisions based on facts, statistical numbers, and trends.

Find the Patterns of a Dataset by Visualizing Frequency Distribution

Get insights into the Dataset by Visualizing the frequency distribution.

Compare Multiple Frequency Distributions to Extract Valuable Information from a Dataset.

How to Compare Multiple Frequency Distributions and Get Important Information.

Eliminate Your Misconception about Mean with a Brief Discussion

A frequency distribution is an overview of all distinct values of a variable together with the number of times each occurs.

Increase Your Data Science Model Efficiency With Normalization.

Normalization is a key part of data preprocessing, used to transform features onto a similar scale.

Basic Probability Concepts for Data Science

Probability is a numerical concept used to measure the chance of any specific event or outcome occurring. The value of the probability ranges from 0 to 1.

Road Map from Naive Bayes Theorem to Naive Bayes Classifier

Complete Guideline for Naive Bayes Classifier with Implementation from Scratch.

All You Need To Know About Hypothesis Testing for Data Science Enthusiasts

Do Hypothesis Tests and Draw Conclusions About Population Parameters from Sample Data

Statistical Comparison Among Multiple Groups With ANOVA (Stat-11)

Are You Thinking How to Compare Among the Multiple Groups? No Worries! Just Use ANOVA

Compare Dependency of Categorical Variables with Chi-Square Test (Stat-12)

Complete Guideline to Find Dependencies among Categorical Variables with Chi-Square Test

Machine Learning

Deep Understanding of Simple Linear Regression

Linear regression is one of the simplest and most widely used regression algorithms. Regression algorithms predict continuous values.

Multiple Linear Regression: A Deep Dive

There are many machine learning algorithms for solving different problems. This article will introduce a machine learning algorithm which can solve the regression problem (prediction of continuous value) with multiple variables.

Turn Linear Regression into Logistic Regression

Sometimes we need to classify an object or data based on its features. Linear regression algorithms can’t solve these problems. This is where logistic regression comes in.

K-means Clustering from Scratch

Machine learning problems can be supervised or unsupervised. This article focuses on an unsupervised machine learning algorithm called ‘K-means’ clustering.

KNN Algorithm from Scratch

Implementation and Detailed Explanation of the KNN Algorithm

Efficient K-means Clustering Algorithm with Optimum Iteration and Execution Time

Implementation of Efficient K-means Clustering Algorithm; A Research Outcome

Natural Language Processing

Tips and Tricks to Work with Text Files in Python (Part-1)

Work with Text Files and Get Familiar with Awesome Techniques in Python.

Manipulate PDF Files, Extract Information with PyPDF2 and Regular Expression (Part-2)

Make Your PDF Manipulation Task Easy with PyPDF2 and Regular Expression.

A Complete Guideline to Natural Language Processing (NLP) (Part-3)

How machines recognize human language and act accordingly

Complete Guideline to Implementation of Basic NLP Techniques with spaCy (Part-4)

Learn How to Implement Basic NLP Techniques with Python Libraries

Simple Text Classifier With Basic Machine Learning Model (Part-5)

The easiest way to create a text classifier with a basic Machine Learning model for beginners

Build Your Personal Chatbot with ChatGPT Prompt Engineering

ChatGPT (Chat Generative Pre-Trained Transformer) is a generative AI model for carrying on conversations. It was developed by OpenAI and launched last year.

Computer Vision

Getting Started with NumPy and OpenCV for Computer Vision

We, human beings, perceive the environment and surroundings with our vision system. The human eye, brain, and limbs work together to perceive the environment and act accordingly. An intelligent system can perform .......

A Comprehensive Guide on Color Representation in Computer Vision

The eye is such a beautiful creation of the creators, which can perceive the color of an object in an aesthetically pleasing and harmonious way. Color is a visual perception based on the electromagnetic spectrum.

The Easiest Guideline on Image Blending

This article discusses an important image processing technique: blending and pasting images. This knowledge is essential for both image processing and computer vision. Though the techniques are simple, they are among the core basics of computer vision.

Thresholding — a Way to Make Images More Visible

Image thresholding works on a grayscale image. It is a way of segmenting the grayscale image into a binary image. For thresholding, a particular pixel intensity value is considered a threshold value.

Morphological Operations with Simulation

Every day, we deal with a lot of images. The images come with different intensities and resolutions. Sometimes, we can’t extract proper information from an image because of its quality.

Histogram Equalization: A Step-by-Step Guideline

A histogram is a visual representation of a frequency distribution as a bar plot. In computer vision, an image histogram represents the frequency of intensity values as a bar plot.

Computer Vision-Based Smart Attendance System

If you read the article till the end, you will be able to create your own. No complex or higher-level coding or mathematics knowledge is needed. Let’s move on.

Write a Few Lines of Code and Detect Faces, Draw Landmarks from Complex Images ~MediaPipe

Detecting faces in complex images is easy and fun with MediaPipe

Data Visualization

Ultimate Guide to Data Visualization for Data Science

Data visualization itself is a universal language. I consider it a universal language because of its ability to present information to people from every walk of life.

Basic Guide to Data Visualization for Data Science

Data visualization is a way to represent data and information graphically. In another sense, it can be described as translating data into a visual context using charts, plots, animations, infographics, etc.

Intermediate Guideline to Data Visualization

Without visualization, data is a collection of some numerical or categorical values. Proper visualization paves the way to finding valuable information.

11 Aesthetic Data Visualization for Data Science

Some Unique Data Visualization Techniques for Getting High-Level Insight into the Data

Spread of COVID-19 with Interactive Data Visualization

A Complete Guideline for Bar Chart Race and Interactive Choropleth Map of COVID-19 Pandemic with Python

Miscellaneous

Grab All the Data Science Resources at a Glance and Grow Yourself

A Huge Collection of the Best Online Data Science Resources, Books, and Much More

Say Goodbye to Screenshot and Use Datapane for Data Science Report

A complete tutorial of Datapane in 7 min read

Tips and Tricks of Exploring Qualitative Data

How Qualitative Data Can Be Analyzed and Interpreted in the Easiest Way

Research and Publications

Ongoing Research

Weather Forecasting with Bi-LSTM and Important Decision Making for Agriculture Sector in Context of Bangladesh.

Bangladesh is a developing country whose economy depends on agriculture. Unfortunately, modern technologies are not used in the agricultural field. We have collected all the weather station data from the Meteorological Department of Bangladesh. Our target is to forecast the weather and generate suggestions for farmers based on the prediction model.

Published Research Articles

Journal Article

An Improved K-means Clustering Algorithm Towards an Efficient Data-Driven Modeling

Year: 2022
Synopsis: One of the K-means algorithm’s main concerns is to find out the initial optimal centroids of clusters. It is the most challenging task to determine the optimum position of the initial clusters’ centroids at the very first iteration. This paper proposes an approach to find the optimal initial centroids efficiently to reduce the number of iterations and execution time.
Journal: Annals of Data Science
Publisher: Springer Berlin Heidelberg
Link: Click here.

Conference Articles

An Intelligent Model to Suggest Top Productive Seasonal Crops Based on User Location in the Context of Bangladesh

Year: 2021
Synopsis: This paper proposes an approach to suggest the best crops for cultivation according to the season at a selected location in Bangladesh. The model is built on data from the agricultural and meteorological departments of Bangladesh.
Conference: 3rd International Conference on Smart Systems: Innovations in Computing
Publisher: Springer
Link: Click here.

An Efficient K-Means Clustering Algorithm for Analysing COVID-19

Year: 2021
Synopsis: In this paper, we propose an efficient K-means clustering method that determines the initial centroids of the clusters efficiently. Based on this proposed method, we have determined health care quality clusters of countries utilizing the COVID-19 datasets.
Conference: International Conference on Hybrid Intelligent Systems
Publisher: Springer
Link: Click here.

A Modified Naïve Bayesian-based Spam Filter using Support Vector Machine

Year: 2019
Synopsis: In this research work, we propose a hybrid model for spam filtering. The model is based on the Naïve Bayes algorithm integrated with a Support Vector Machine (SVM).
Conference: International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT)
Publisher: IEEE
Link: Click here.

Undergraduate Thesis Supervision

Detection of Stage of Pneumonia with Deep-CNN and Preliminary Medical Suggestion.

The students are training different CNN models on X-ray images so that the machine can detect affected images automatically. They will then select the best model for the final prediction. The system will also provide preliminary suggestions.

Hybrid Model for Product Recommendation

The main purpose of the thesis is to integrate different models into a product recommendation system. It is a hybrid approach that combines multiple models to achieve better accuracy and recommendations.

Automatic Bangla Text Generation with LSTM.

Many automatic text generation models are available for different languages, but few exist for Bangla. So, my students collected a large Bangla text corpus to create an efficient automatic text generation model.

Education and Certifications

Education

Chittagong University of Engineering and Technology (CUET)

B.Sc. in Computer Science and Engineering (CSE)
February 2016 - June 2021
Top 10% in the CSE Department.

Rajuk Uttara Model College (RUMC)

Higher Secondary School Certificate (HSC)
2013 - 2015
Passed with GPA 5 (out of 5)

Milestone School and College

Secondary School Certificate (SSC)
2011 - 2013
Passed with GPA 5 (out of 5)

Online Certifications

Deep Learning Specialization

Issuing Organization: Coursera
Issue Date: July 2020
Expiration Date: This certification does not expire
Credential ID: ZH3RFQUMB56S
Credential URL: Click Here

Introduction to Data Science in Python

Issuing Organization: Coursera
Issue Date: August 2019
Expiration Date: This certification does not expire
Credential ID: K6H5ETX8MHNH
Credential URL: Click Here

Machine Learning with Python

Issuing Organization: Coursera
Issue Date: April 2020
Expiration Date: This certification does not expire
Credential ID: 55W5RF8LNKDF
Credential URL: Click Here

Machine Learning

Issuing Organization: Coursera
Issue Date: July 2019
Expiration Date: This certification does not expire
Credential ID: LTFQWD83AABN
Credential URL: Click Here

Data Scientist in Python

Issuing Organization: Dataquest
Issue Date: February 2020
Expiration Date: This certification does not expire
Credential ID: 5BTZ8HLUKLOOSLKVX0VO
Credential URL: Click Here