feat(nlp-scraper): fix broken link and remove optional part with broken URL

pull/2504/head
nprimo 2 months ago committed by Niccolò Primo
parent
commit
187ca1884b
  1. 7
      subjects/ai/nlp-scraper/README.md

@@ -56,7 +56,7 @@ SpaCy](https://towardsdatascience.com/named-entity-recognition-with-nltk-and-spa
 The goal is to detect what the article is dealing with: Tech, Sport, Business,
 Entertainment or Politics. To do so, a labelled dataset is provided: [training
-data](bbc_news_train.csv) and [test data](bbc_news_test.csv). From this
+data](bbc_news_train.csv) and [test data](bbc_news_tests.csv). From this
 dataset, build a classifier that learns to detect the right topic in the
 article. Save the training process to a python file because the audit requires
 the auditor to test the model.
@@ -68,11 +68,6 @@ that the model is trained correctly and not overfitted.
 - Learning constraints: **Score on test: > 95%**
-- **Optional**: If you want to train a news' topic classifier based on a more
-challenging dataset, you can use the
-[following](https://www.kaggle.com/rmisra/news-category-dataset) which is
-based on 200k news headlines.
 #### **3. Sentiment analysis:**
 The goal is to detect the sentiment (positive, negative or neutral) of the news
