readme update
parent 46541cfc05
commit e9b4ee22ce
1 changed files with 29 additions and 1 deletions
30
README.md
@@ -1 +1,29 @@
For now, the wkhtmltopdf package needs to be installed on Debian.
# Headline
Monitor how article titles change over time on news websites.

___
This tool is probably not production-ready because it was written in two afternoons by an amateur (I'm not a professional programmer). If you want to run it, at least put a reverse proxy between it and the public network, or run it locally.

I didn't do any research on the legality of analysing RSS feeds, and it's possible you could get into legal trouble by presenting the results publicly.
___

## Architecture
The "processor" script fetches the RSS feeds configured in `processor/config.yaml` every 5 minutes (the interval is set in `processor/crontab`), stores the articles in Redis, and compares new and old articles to find changes in titles.
When a change is found, it generates a visual diff and stores it along with other information (detection time, article link, new and old title, etc.) in a permanent database (sqlite3 for now).

The "view" script reads data from the permanent database (sqlite3) and presents it to the user.
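
The compare-and-diff step described above could look roughly like the sketch below. It is not the code from this repository: a plain `dict` stands in for Redis, the diff is built with Python's standard `difflib`, and the names `title_diff_html` and `check_article` are hypothetical.

```python
import difflib
import html


def title_diff_html(old: str, new: str) -> str:
    """Return a word-level HTML diff: removed words in <del>, added words in <ins>."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(html.escape(" ".join(old_words[i1:i2])))
        if tag in ("delete", "replace"):
            parts.append("<del>" + html.escape(" ".join(old_words[i1:i2])) + "</del>")
        if tag in ("insert", "replace"):
            parts.append("<ins>" + html.escape(" ".join(new_words[j1:j2])) + "</ins>")
    return " ".join(parts)


def check_article(cache: dict, link: str, title: str):
    """Compare a fetched title with the cached one.

    `cache` is an assumption standing in for Redis (article link -> last seen title).
    Returns a change record, or None if the article is new or unchanged.
    """
    old = cache.get(link)
    cache[link] = title  # remember the latest title for the next run
    if old is None or old == title:
        return None
    return {"link": link, "old": old, "new": title, "diff": title_diff_html(old, title)}
```

Calling `check_article` once per feed entry on every run would yield a record only when a known article's title differs from the cached one. That record is the kind of data that would then go into the permanent database.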
## Installation
Run `docker-compose up -d` and everything should start. You can edit `./processor/config.yaml` to change the RSS sources.
After the first start, you have to wait about 5 minutes for the "processor" to create the first empty database. The web server will throw an error until then.
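
One way to avoid the startup error would be to wait for the database file before serving requests. The sketch below is only an illustration: `wait_for_database` is a hypothetical helper, and the database path is an assumption you would replace with the actual location.

```python
import time
from pathlib import Path


def wait_for_database(path: str, timeout: float = 600.0, poll_interval: float = 5.0) -> bool:
    """Poll until the database file exists; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    db_file = Path(path)
    while not db_file.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

The default timeout of 10 minutes is chosen to cover the roughly 5-minute wait mentioned above with some margin.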
## to-do
* Collect the creation time of the original and new article, write it to permanent storage (sqlite3 for now), and display it.
* Write a better readme and a little more documentation.
* Create a view with more info and stats (list of feeds, articles in Redis, etc.).
* Create a routine to clear old articles from Redis (otherwise it will fill up the disk at some point...).
* IDEA: Figure out how to monitor changes in article descriptions (maybe just compare hashes?) and how to present them. (Right now, the code can store descriptions in Redis, but nothing else.)
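
For the hash-comparison idea in the last item, a minimal sketch could look like this. It is not implemented in the project: the function names are hypothetical, and the whitespace normalisation is an assumption about which changes should be ignored.

```python
import hashlib


def description_fingerprint(description: str) -> str:
    """Hash a whitespace-normalised description so that pure reflows are ignored."""
    normalised = " ".join(description.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def description_changed(stored_fingerprint: str, description: str) -> bool:
    """Return True if the description no longer matches the stored fingerprint."""
    return stored_fingerprint != description_fingerprint(description)
```

Storing only the fingerprint keeps the Redis entries small, but it can only tell that a description changed, not what changed. Presenting a diff would still require keeping the old text.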