Slide 2

Introducing

BlogSpider is a website project that lets users crawl pages, find the RSS channels on them, and store them.
The main goal of the project was to learn new technologies and dive into Akka.NET.
Slide 3

Project structure

The project consists of four main parts:
Lighthouse
Crawler
Tracker
Web application

Slide 4

Basic crawling algorithm

Here you can see the basic idea of the crawling algorithm.
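The slide's diagram is not reproduced in this text, so the following is only a minimal sketch of one way such a crawl loop could look, not the project's actual implementation. It assumes a breadth-first traversal, assumes that RSS channels are advertised through `<link type="application/rss+xml">` tags, and takes a hypothetical `fetch` function as a parameter so the example runs without network access. The names `PageParser`, `crawl`, and the `blog.example` site are all illustrative.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects outgoing links and RSS feed URLs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)  # resolve relative URLs
        if tag == "a":
            self.links.append(url)
        elif tag == "link" and attrs.get("type") == "application/rss+xml":
            self.feeds.append(url)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. `fetch(url)` returns HTML text, or None on failure."""
    queue = deque([start_url])
    seen = {start_url}   # URLs already queued, so no page is visited twice
    feeds = set()        # RSS channels found so far
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages += 1
        if html is None:
            continue
        parser = PageParser(url)
        parser.feed(html)
        feeds.update(parser.feeds)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return feeds


# A two-page in-memory "site" stands in for real HTTP requests.
SITE = {
    "http://blog.example/": (
        '<link rel="alternate" type="application/rss+xml" href="/rss.xml">'
        '<a href="/about">About</a>'
    ),
    "http://blog.example/about": '<a href="/">Home</a>',
}

if __name__ == "__main__":
    print(sorted(crawl("http://blog.example/", SITE.get)))
```

Passing `fetch` in as a parameter keeps the traversal logic separate from the I/O, which is also roughly how the work would be split between roles in a cluster.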


Slide 5

Basic concept of the crawler cluster

Here you can see the basic roles that must be present in the crawler cluster.
Web - the web application that starts crawl jobs.
Tracker - the service that tells the cluster what needs to be crawled.
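To make the split between the Web and Tracker roles concrete, here is a rough single-process sketch. In the real project these roles would presumably be Akka.NET actors exchanging messages across cluster nodes; the `Tracker` class, its method names, and the rule that a URL is never handed out twice are assumptions made for illustration only.

```python
from collections import deque


class Tracker:
    """Decides what to crawl next and never hands out the same URL twice."""

    def __init__(self):
        self._pending = deque()  # URLs waiting for a crawler
        self._known = set()      # every URL ever submitted

    def submit(self, url):
        """Queue a URL. Returns False if it was already known."""
        if url in self._known:
            return False
        self._known.add(url)
        self._pending.append(url)
        return True

    def next_job(self):
        """Return the next URL to crawl, or None when nothing is pending."""
        return self._pending.popleft() if self._pending else None


if __name__ == "__main__":
    tracker = Tracker()
    # The Web role starts a job by submitting a start URL.
    tracker.submit("http://blog.example/")
    # A crawler asks the tracker for work and reports newly found links.
    job = tracker.next_job()
    tracker.submit("http://blog.example/about")
    tracker.submit("http://blog.example/")  # ignored, already known
    print(job, tracker.next_job(), tracker.next_job())
```

Keeping the "what to crawl" decision in one place means crawlers can stay stateless and be added or removed freely.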
Slide 6

What is Lighthouse?

Lighthouse is a dedicated seed node tool for our cluster. It is not deployed as part of your application, so it should never have to be redeployed when you change the application; it only needs to be upgraded when the cluster itself is upgraded.
Slide 7

Let's look at how it works