mirror of https://github.com/01-edu/Branch-AI.git
Exercise 6: Text preprocessing
The goal of this exercise is to learn how to write a function that preprocesses and cleans a text using NLTK.
Put this text in a variable:
01 Edu System presents an innovative curriculum in software engineering and programming. With a renowned industry-leading reputation, the curriculum has been rigorously designed for learning skills of the digital world and technology industry. Taking a different approach than the classic teaching methods today, learning is facilitated through a collective and co-créative process in a professional environment.
- Write a function that takes as input the text and returns it preprocessed.
The preprocessing is composed of:
1. Lowercase
2. Removing Punctuation
3. Tokenization
4. Stopword Filtering
5. Stemming
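
The five steps above can be sketched as a single function. This is a minimal sketch under stated assumptions, not the reference solution: the name `preprocess` and the optional `stop_words` parameter are hypothetical, `wordpunct_tokenize` is used because it needs no extra NLTK data (`word_tokenize` is a common alternative but requires downloading tokenizer data), and the default stopword list assumes the NLTK `stopwords` corpus has already been downloaded with `nltk.download("stopwords")`.

```python
import string

from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize


def preprocess(text, stop_words=None):
    """Lowercase, strip punctuation, tokenize, drop stopwords, and stem.

    `stop_words` is a hypothetical convenience parameter: when omitted,
    the English stopword list from NLTK is loaded (corpus must be installed).
    """
    if stop_words is None:
        # Assumes nltk.download("stopwords") has been run beforehand.
        from nltk.corpus import stopwords
        stop_words = set(stopwords.words("english"))

    # 1. Lowercase
    text = text.lower()
    # 2. Remove ASCII punctuation (hyphenated words are merged, not split)
    text = text.translate(str.maketrans("", "", string.punctuation))
    # 3. Tokenization
    tokens = wordpunct_tokenize(text)
    # 4. Stopword filtering
    tokens = [token for token in tokens if token not in stop_words]
    # 5. Stemming with the Porter algorithm
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokens]
```

Calling `preprocess(text)` on the variable defined earlier returns a list of stemmed tokens. Because punctuation is removed before tokenization, a hyphenated word such as "industry-leading" ends up as one token; replacing punctuation with a space instead would split it into two.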
Resources: https://towardsdatascience.com/nlp-preprocessing-with-nltk-3c04ee00edc0