Working process of stemming and lemmatization

Machine learning and deep learning models cannot work with raw text directly. We therefore pre-process the text and convert it into a numerical representation. These numerical representations are called vectors. Stemming and lemmatization are two of the pre-processing steps applied before that conversion.
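As a rough illustration of what a numerical representation can look like, the sketch below turns two short sentences into word-count vectors using only the standard library. The helper names `build_vocabulary` and `to_count_vector` and the sample sentences are made up for this example; real projects usually rely on a library vectorizer instead.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect every distinct lowercase word, sorted for a stable column order."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def to_count_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical sample sentences, chosen only for illustration
documents = ["I love this website", "this website is awesome"]
vocabulary = build_vocabulary(documents)
vectors = [to_count_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each position in a vector corresponds to one vocabulary word, so words that differ only by a suffix would occupy separate positions unless they are first reduced to a common form.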

What is Stemming?

Stemming is the process of removing prefixes and suffixes from words so that they are reduced to a simpler form called a stem.

For example:
A stemmer may reduce words like history and historical to a stem such as histori, and words like finally, final and finalized to a stem such as fina. Similarly, go and goes are meant to reduce to go. The exact stem depends on the stemming algorithm, and it is often not a dictionary word.

In short, stemming strips the prefix or suffix from a word and converts it into a simpler form.
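The idea of stripping a suffix can be sketched in a few lines of plain Python. This is a toy, not the Porter algorithm used by NLTK: the suffix list `SUFFIXES`, the function `toy_stem` and its `min_stem_length` parameter are all assumptions made for this illustration.

```python
# Hypothetical suffix list, checked in order, chosen only for this illustration
SUFFIXES = ("ization", "ized", "ical", "ally", "ing", "ly", "ed", "es", "s")

def toy_stem(word, min_stem_length=2):
    """Strip the first matching suffix, keeping at least min_stem_length characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[: -len(suffix)]
    return word

for word in ["learning", "finalized", "websites", "goes", "go"]:
    print(word, "->", toy_stem(word))
```

Note that `websites` ends up as `websit`, which is not a real word. This is the typical weakness of rule-based suffix stripping.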

Example:
Input
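import nltk
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords

# One-time downloads: tokenizer models and the stop word list
# (newer NLTK releases may ask for "punkt_tab" instead of "punkt")
nltk.download("punkt")
nltk.download("stopwords")

paragraph = """This website is an awesome website for learning programming. I love this website. Everyone should try this. Here you can learn a lot."""

sentences = nltk.sent_tokenize(paragraph)
stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

for i in range(len(sentences)):
    words = nltk.word_tokenize(sentences[i])
    # Stem every word that is not an English stop word
    words = [stemmer.stem(word) for word in words if word.lower() not in stop_words]
    sentences[i] = " ".join(words)

print(sentences)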

Output

What is Lemmatization?

Lemmatization is the process of mapping the different forms of a word to one single form, its root form. This root form is called the lemma.

For example:
The aim is to map words like history and historical to history, and words like finally, final and finalized to final. What a given lemmatizer actually returns depends on its dictionary and on the part of speech it is given.
The words we get from stemming are often not meaningful: looking at a stem, we may not be able to tell what it means. Lemmatization has the same goal as stemming but produces more meaningful words, so after lemmatization we can still tell the meaning of the converted word.
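The contrast with suffix stripping can be sketched as a dictionary lookup. The table `LEMMA_TABLE` and the function `toy_lemmatize` below are hypothetical and hold only a handful of entries. A real lemmatizer such as NLTK's `WordNetLemmatizer` looks words up in the much larger WordNet dictionary and can take the part of speech into account.

```python
# Hypothetical lookup table mapping inflected forms to their lemma
LEMMA_TABLE = {
    "goes": "go",
    "went": "go",
    "gone": "go",
    "websites": "website",
    "better": "good",
}

def toy_lemmatize(word):
    """Return the dictionary form if the word is known, otherwise the word itself."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)

for word in ["goes", "went", "websites", "better", "history"]:
    print(word, "->", toy_lemmatize(word))
```

Because the result always comes from the table (or is the unchanged input), the output stays a real word. Irregular forms such as `went` can also be handled, which no suffix rule would catch.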

Example:
Input
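import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords

# One-time downloads: tokenizer models, stop word list and the WordNet dictionary
# (newer NLTK releases may ask for "punkt_tab" instead of "punkt")
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

paragraph = """This website is an awesome website for learning programming. I love this website. Everyone should try this. Here you can learn a lot."""

sentences = nltk.sent_tokenize(paragraph)
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

for i in range(len(sentences)):
    words = nltk.word_tokenize(sentences[i])
    # Lemmatize every word that is not an English stop word
    words = [lemmatizer.lemmatize(word) for word in words if word.lower() not in stop_words]
    sentences[i] = " ".join(words)

print(sentences)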
Output

Difference between stemming and lemmatization:

1. Lemmatization always returns a meaningful word, while stemming sometimes returns a meaningful word and sometimes does not.
2. Lemmatization takes more time than stemming.

© Copyright All rights reserved www.CodersAim.com. Developed by CodersAim.