Anti-Asian Hate Crime Tracker

2023/01/15 Python Scrapy Docker Nginx
Time: Jan.2022 - Jan.2023   Location: Seattle, USA

Volunteered with a team of four engineers to develop a crawler that feeds hate-incident data to the backend of Anti-Asian Hate Crime Tracker, a website designed to raise awareness of anti-Asian hate.

Built distributed web crawlers with Scrapy. Wrote Dockerfiles to run the crawlers and a proxy service as a portable multi-container application. Protected the crawler services by setting up authentication on an Nginx server.

Developed integrated Scrapy item pipelines to support a message push service across multiple platforms, including a Slack app, Google Drive and Algolia.
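As a rough illustration of the fan-out idea (not the project's actual code), a Scrapy item pipeline is a plain class with a `process_item(item, spider)` method, so one pipeline can forward each item to several destinations. The class name `FanOutPipeline`, the sink names and the error list below are hypothetical; real sinks would wrap the Slack, Google Drive and Algolia clients, which are deliberately left out here.

```python
from typing import Any, Callable, Dict, List

# A sink is any callable that accepts one scraped item (hypothetical type alias).
Sink = Callable[[Dict[str, Any]], None]


class FanOutPipeline:
    """Scrapy-style item pipeline that forwards each item to several sinks."""

    def __init__(self, sinks: Dict[str, Sink]) -> None:
        self.sinks = sinks
        self.errors: List[str] = []  # failures recorded as "<sink name>: <message>"

    def process_item(self, item: Dict[str, Any], spider: Any = None) -> Dict[str, Any]:
        for name, send in self.sinks.items():
            try:
                send(item)
            except Exception as exc:
                # One failing platform should not block delivery to the others.
                self.errors.append(f"{name}: {exc}")
        # Return the item so any later pipeline stage still receives it.
        return item


if __name__ == "__main__":
    received: List[Dict[str, Any]] = []
    # Stand-in sinks: both simply append to a list instead of calling a real API.
    pipeline = FanOutPipeline({"chat": received.append, "search_index": received.append})
    pipeline.process_item({"title": "example report", "url": "https://example.org/a"})
    print(len(received))
```

Isolating each sink in its own `try` block is one possible design choice: an outage on one platform is logged while the remaining platforms still get the item.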

Implemented distributed crawling and URL/URI deduplication, and established the data access layer, using Scrapy-Redis. This filtered out duplicate content and improved crawler efficiency by 42%.
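The deduplication idea can be sketched in a few lines, under simplifying assumptions. Scrapy-Redis keeps request fingerprints in a shared Redis set so that every crawler node sees the same "already visited" state; the sketch below substitutes an in-memory Python set for Redis and fingerprints only the URL, whereas Scrapy's own request fingerprint also covers details such as the HTTP method and body. The names `canonicalize`, `fingerprint` and `SeenUrls` are hypothetical.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Normalize a URL so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    # Sort query parameters and drop the fragment, which never reaches the server.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))


def fingerprint(url: str) -> str:
    """Fixed-length digest of the canonical URL."""
    return hashlib.sha1(canonicalize(url).encode("utf-8")).hexdigest()


class SeenUrls:
    """In-memory stand-in for a shared set of fingerprints (assumption: Redis in production)."""

    def __init__(self) -> None:
        self._seen = set()

    def add(self, url: str) -> bool:
        """Return True if the URL is new, False if it was already recorded."""
        fp = fingerprint(url)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


if __name__ == "__main__":
    seen = SeenUrls()
    for url in (
        "https://example.org/news?b=2&a=1#top",
        "HTTPS://Example.org/news?a=1&b=2",  # same page, different spelling
        "https://example.org/other",
    ):
        print(url, "-> crawl" if seen.add(url) else "-> skip duplicate")
```

With a Redis-backed set the membership check and the insert can be a single `SADD` call, which reports whether the member was newly added; that is what allows several crawler processes to share one view of visited URLs.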
