🚧 This is a working draft and will change often. Do not cite! Use the latest published version instead. 🚧

28.1. Data sources

This page describes options for obtaining images from Trove.

Documentation

These sections of the Trove Data Guide explain how to access images from different parts of Trove:

Pre-harvested data sets

The GLAM Workbench provides datasets containing images harvested from Trove.

Editorial cartoons from The Bulletin, 1886 to 1952

This dataset contains 3,471 full-page editorial cartoons downloaded from issues of The Bulletin published between 1886 and 1952. In most cases there is one cartoon per issue.

Creating datasets

These tools and examples can help you create your own collections of images from Trove.

GLAM Workbench notebooks

Get covers (or any other pages) from a digitised journal in Trove

This notebook shows how to download all the cover images from a specified periodical. With some minor modifications you could download any page, or range of pages.

Harvest illustrations from periodicals

This notebook shows you how to harvest illustrations from Trove’s digitised periodicals. It makes use of layout information generated by Trove’s OCR processing to find the coordinates of illustrations on a digitised page. Using these coordinates, the illustrations can be cropped from the page image and saved.
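As a rough illustration of the cropping step, the sketch below converts a region found in OCR layout data into a pixel box for the page image. It is not the notebook's code: the `Zone` fields, the `zone_to_crop_box` function, and the idea that OCR coordinates may need scaling to match the downloaded image size are all assumptions made for this example.

```python
from typing import NamedTuple, Tuple


class Zone(NamedTuple):
    """A rectangular region in OCR coordinates (hypothetical field names)."""
    x: int
    y: int
    width: int
    height: int


def zone_to_crop_box(
    zone: Zone,
    ocr_size: Tuple[int, int],
    image_size: Tuple[int, int],
) -> Tuple[int, int, int, int]:
    """Convert an OCR zone to a (left, top, right, bottom) pixel box.

    ocr_size is the (width, height) of the coordinate space the zone was
    measured in; image_size is the (width, height) of the page image that
    will be cropped. Both are assumptions about how the data is supplied.
    """
    scale_x = image_size[0] / ocr_size[0]
    scale_y = image_size[1] / ocr_size[1]

    # Scale each edge, then clamp it to the image bounds.
    left = max(0, round(zone.x * scale_x))
    top = max(0, round(zone.y * scale_y))
    right = min(image_size[0], round((zone.x + zone.width) * scale_x))
    bottom = min(image_size[1], round((zone.y + zone.height) * scale_y))

    if right <= left or bottom <= top:
        raise ValueError("zone lies outside the image")
    return (left, top, right, bottom)


# Made-up values: a zone measured on a 2000x3000 page, cropped from a
# 1000x1500 copy of the same page.
box = zone_to_crop_box(
    Zone(x=200, y=400, width=1000, height=600),
    ocr_size=(2000, 3000),
    image_size=(1000, 1500),
)
```

The resulting tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop()` expects, so it could be passed to that method once the page image has been opened.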

Save a Trove newspaper article as an image

This notebook grabs the page on which an article was published, and then crops the page image to the boundaries of the article. The result is a complete, intact image which presents the article as it was originally published. And if the article is split across multiple pages, you’ll get one image per page.
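The "one image per page" behaviour can be pictured as grouping an article's regions by page and taking the smallest box that encloses them. The sketch below shows that idea only. The zone dictionaries, their keys, and the `article_boxes_by_page` function are hypothetical and do not reflect how the notebook actually reads article positions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def article_boxes_by_page(zones: Iterable[dict]) -> Dict[str, Box]:
    """Return one enclosing box per page for the zones of a single article.

    Each zone is a dict with hypothetical keys: page_id, left, top,
    width, and height.
    """
    grouped = defaultdict(list)
    for zone in zones:
        grouped[zone["page_id"]].append(zone)

    boxes = {}
    for page_id, page_zones in grouped.items():
        # The enclosing box runs from the smallest left/top edge to the
        # largest right/bottom edge of any zone on the page.
        boxes[page_id] = (
            min(z["left"] for z in page_zones),
            min(z["top"] for z in page_zones),
            max(z["left"] + z["width"] for z in page_zones),
            max(z["top"] + z["height"] for z in page_zones),
        )
    return boxes


# Made-up zones for an article that starts on one page and continues on another.
zones = [
    {"page_id": "p1", "left": 100, "top": 200, "width": 300, "height": 400},
    {"page_id": "p1", "left": 420, "top": 150, "width": 300, "height": 500},
    {"page_id": "p2", "left": 50, "top": 60, "width": 200, "height": 100},
]
boxes = article_boxes_by_page(zones)
```

Each entry in `boxes` would then describe the crop for one page image, which is why an article split across two pages yields two images.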

Harvest Australian Women’s Weekly covers (or the front pages of any newspaper)

Somewhat confusingly, the Australian Women’s Weekly is grouped with Trove’s digitised newspapers rather than with the other magazines. The GLAM Workbench’s journals section already has notebooks for harvesting all of a journal’s covers as images, so I thought I should do the same for the Weekly. This notebook can be easily adjusted to download the front pages of any digitised newspaper.

Download a collection of digitised images

Digitised photographs and other images are often organised into collections. While the Trove web interface does include a download option for collections, it has a number of limitations. This notebook provides an alternative method that downloads all of the available images in a collection (and any sub-collections) at the highest available resolution.

Software packages

trove-newspaper-images

This Python package includes tools to download Trove newspaper articles as complete JPEG images. If an article is printed across multiple newspaper pages, multiple images will be downloaded – one for each page. It’s intended for integration into other tools and processing workflows, or for people who like working on the command line.

trove-newspaper-harvester

The Trove Newspaper (& Gazette) Harvester makes it easy to download large quantities of digitised articles from Trove’s newspapers and gazettes. Just give it a search from the Trove web interface, and the harvester will save the metadata of all the articles in a CSV (spreadsheet) file for further analysis. You can also save the full text of every article, as well as copies of the articles as JPG images, and even PDFs.

Other tools

Save Trove newspaper article as image

A simple web app that helps you save a Trove newspaper article as an image.

Download a page image from Trove’s newspapers

The Trove web interface doesn’t provide a way of getting high-resolution page images from newspapers. This simple app lets you download page images as complete, high-resolution JPEG files.
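For readers who would rather script this than use the app, here is a minimal sketch of fetching a page image with Python's standard library. The URL pattern, the `level` parameter, and the assumption that level 7 is the largest size are not confirmed by this page and should be checked against Trove before use. The function names and the page identifier are made up for the example.

```python
import shutil
import urllib.request

# ASSUMPTION: URL pattern for newspaper page images; verify before relying on it.
PAGE_IMAGE_URL = (
    "https://trove.nla.gov.au/ndp/imageservice/nla.news-page{page_id}/level{level}"
)
# ASSUMPTION: level 7 is treated here as the highest available resolution.
MAX_LEVEL = 7


def page_image_url(page_id: int, level: int = MAX_LEVEL) -> str:
    """Build the (assumed) image URL for a newspaper page identifier."""
    if not 1 <= level <= MAX_LEVEL:
        raise ValueError(f"level must be between 1 and {MAX_LEVEL}")
    return PAGE_IMAGE_URL.format(page_id=page_id, level=level)


def download_page_image(page_id: int, path: str, level: int = MAX_LEVEL) -> None:
    """Stream the page image to a local file (requires network access)."""
    with urllib.request.urlopen(page_image_url(page_id, level)) as response:
        with open(path, "wb") as image_file:
            shutil.copyfileobj(response, image_file)


# The page identifier below is a placeholder, not a real Trove page.
url = page_image_url(12345)
```

Calling `download_page_image(12345, "page.jpg")` would attempt the actual download. Building the URL separately makes it easy to inspect what will be requested first.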