README

Using Raku to scrape and analyze Ukraine Ministry of Defense Data

News about combat losses of the Russian invaders is periodically published by the Ukrainian Ministry of Defense. This is a Raku module that extracts information from those pages, for instance this one.

Note: English reports are updated less frequently than the Ukrainian ones, which are updated daily. Scraping the Ukrainian reports is left for future work.

Installing

This repo uses both Raku and Python for the whole downloading/scraping workflow. You will need a reasonably recent version of each. Additionally, install poetry globally.

When that's done, clone this repo or install via zef (once I upload it to the ecosystem, shortly). If you want to run it directly from here, run

zef install --deps-only .

and

poetry install

Running

You can always check the examples in the t directory. For convenience, an Akefile is also included. It contains several targets which automate some tasks:

  • ake CSV: generates a CSV file

  • ake download: invokes the Python script to download data

  • ake prescrape: checks whether any downloaded file can't be scraped
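
The idea behind a check like prescrape can be sketched in a few lines. This is only an illustration in Python, not the code behind the Akefile target, which relies on the Raku module: unscrapable and toy_scrape are hypothetical names, and toy_scrape is a stand-in for a real scraper that raises an error on pages it cannot handle.

```python
from pathlib import Path


def toy_scrape(text):
    """Hypothetical stand-in for a real scraper: rejects pages without data."""
    if "losses" not in text:
        raise ValueError("no losses found")
    return text


def unscrapable(directory, scrape):
    """Return (file name, error message) pairs for files that `scrape` rejects."""
    failures = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        try:
            scrape(path.read_text(encoding="utf-8"))
        except Exception as error:  # any failure marks the file as unscrapable
            failures.append((path.name, str(error)))
    return failures
```

Calling unscrapable("raw-pages", toy_scrape) would list the cached pages that the stand-in scraper rejects, together with the reason for each.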

See also

Failed tests for scraping are included in the bin directory. scrapy.py is functional, but you will need to install the corresponding Python and Chrome dependencies.

  • Download chromedriver from here. You'll need to copy it by hand to the directory named in the script, or put it anywhere else and change the script accordingly. Please bear in mind that there are specific chromedriver binaries for every version of Chrome; the two versions need to be exactly the same.
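
A version mismatch can be caught before running the scraper. The following is a minimal sketch and not part of this repo: the function names are made up for illustration, and it assumes both binaries print a dotted version number when called with --version, as in "ChromeDriver 114.0.5735.90 (...)".

```python
import re
import subprocess

# A dotted version number such as 114.0.5735.90
VERSION_RE = re.compile(r"\d+(?:\.\d+)+")


def parse_version(text):
    """Return the first dotted version number found in `text`, or None."""
    match = VERSION_RE.search(text)
    return match.group(0) if match else None


def installed_version(command):
    """Run `command --version` and return the version it reports."""
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)


def versions_match(chrome, driver, components=4):
    """Compare the first `components` fields of two dotted versions."""
    if chrome is None or driver is None:
        return False
    return chrome.split(".")[:components] == driver.split(".")[:components]
```

With both binaries on the PATH, versions_match(installed_version("google-chrome"), installed_version("chromedriver")) should be true before scrapy.py is run. The command name google-chrome is an assumption and differs between platforms.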

The raw content of the pages used as source is included in the raw-pages directory, mainly for caching purposes. They are (c) the Ministry of Defense of Ukraine, and the source is this page.

License

This module is licensed under the Artistic 2.0 License (the same as Raku itself). See LICENSE for terms.

Data::UkraineWar::MoD v0.0.2

Raku distribution template

Authors

  • JJ Merelo

License

Artistic-2.0

Dependencies

Ake:ver<0.1.2+>

Test Dependencies

Provides

  • Data::UkraineWar::MoD::Daily
  • Data::UkraineWar::MoD::Scrape
