Lesson 3 of 5 · 45 min

Module 18: Final Capstone Projects

Capstone 3: Multi-threaded Web Scraper

Scrape URLs concurrently, respect rate limits, and persist structured data.

Coordinate tasks with ExecutorService, throttle requests, and honor robots.txt.
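One way the coordination described above might look is sketched below. It assumes a fixed thread pool, a Semaphore capping in-flight requests, and a fixed pause after each request as a crude throttle. The class name ThrottledScraper, the allowed predicate (a stand-in for a real robots.txt check), and the fetcher function (a stand-in for the actual HTTP call) are hypothetical names introduced for illustration, not part of the lesson.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper: runs fetches on a pool while limiting request pressure.
final class ThrottledScraper {
    private final ExecutorService pool;  // worker threads that run the fetch tasks
    private final Semaphore permits;     // caps how many requests are in flight at once
    private final long delayMillis;      // pause after each request before freeing the permit

    ThrottledScraper(int threads, int maxInFlight, long delayMillis) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.permits = new Semaphore(maxInFlight);
        this.delayMillis = delayMillis;
    }

    // Fetches every URL accepted by `allowed` and returns url -> body in input order.
    // URLs whose fetch throws are left out of the result.
    Map<String, String> fetchAll(List<String> urls,
                                 Predicate<String> allowed,
                                 Function<String, String> fetcher) {
        List<String> accepted = new ArrayList<>();
        List<Callable<String>> tasks = new ArrayList<>();
        for (String url : urls) {
            if (!allowed.test(url)) {
                continue; // assumption: the predicate encodes the robots.txt rules
            }
            accepted.add(url);
            tasks.add(() -> {
                permits.acquire();
                try {
                    String body = fetcher.apply(url);
                    Thread.sleep(delayMillis); // crude throttle between requests
                    return body;
                } finally {
                    permits.release();
                }
            });
        }

        Map<String, String> results = new LinkedHashMap<>();
        try {
            // invokeAll blocks until all tasks finish and keeps the task order.
            List<Future<String>> futures = pool.invokeAll(tasks);
            for (int i = 0; i < futures.size(); i++) {
                try {
                    results.put(accepted.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    // the fetch failed; skip this URL in the sketch
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
        return results;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

The semaphore is kept separate from the pool size so the number of worker threads and the number of simultaneous requests can be tuned independently. A per-host limiter would be a natural refinement for a real scraper.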

Parse HTML with Jsoup and serialize results to JSON/DB.
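The serialization half of this step can be sketched with the standard library alone. The example below assumes the url, title, and links values have already been extracted, for instance with Jsoup's Document.title() and select("a[href]"). The class PageJson and its field layout are hypothetical, and a JSON library such as Jackson or Gson would normally replace the hand-written escaping.

```java
import java.util.List;

// Hypothetical helper: turns one scraped page into a compact JSON object.
final class PageJson {

    // Wraps a string in quotes and escapes the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // remaining control characters use the \\uXXXX form
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Builds {"url":...,"title":...,"links":[...]} with no extra whitespace.
    static String toJson(String url, String title, List<String> links) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"url\":").append(quote(url))
          .append(",\"title\":").append(quote(title))
          .append(",\"links\":[");
        for (int i = 0; i < links.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(links.get(i)));
        }
        return sb.append("]}").toString();
    }
}
```

Writing one JSON object per line (JSON Lines) is a simple way to persist results from several worker threads, provided the writes themselves are synchronized.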

Implement retries, back-off strategies, and graceful shutdown.
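A minimal sketch of these two ideas follows, assuming exponential back-off (the delay doubles after each failed attempt) and a two-phase executor shutdown. The class name Resilience and both method names are hypothetical. A production scraper would likely add jitter to the delay and retry only on errors known to be transient.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helpers for retrying a task and stopping a pool cleanly.
final class Resilience {

    // Runs `task` up to maxAttempts times, doubling the wait after each failure.
    // Throws IllegalStateException if every attempt fails or the thread is interrupted.
    static <T> T withRetry(Callable<T> task, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = baseDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted during attempt", e);
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during back-off", e);
                }
                delay *= 2; // exponential back-off
            }
        }
        throw new IllegalStateException("all " + maxAttempts + " attempts failed", last);
    }

    // Stops accepting new tasks, waits for running ones, then forces cancellation.
    // Returns true if the pool terminated within the allowed time.
    static boolean shutdownGracefully(ExecutorService pool, long timeoutMillis) {
        pool.shutdown();
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow(); // interrupt tasks that are still running
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

For example, `Resilience.withRetry(() -> fetch(url), 4, 500)` would wait 500 ms, 1 s, and 2 s between its four attempts, where fetch is whatever HTTP call the scraper uses.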


Lesson check

Which component controls concurrency?
