{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3f763644-38c6-49d0-83d8-63a0814952f5",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "# HOW TO: Create download links for images using `nla.obj` identifiers\n",
    "\n",
    "````{card}\n",
    "On this page\n",
    "\n",
    "```{contents}\n",
    ":local:\n",
    ":backlinks: None\n",
    "```\n",
    "````\n",
    "\n",
    "## Introduction\n",
    "\n",
    "Many of the resources digitised by the NLA and its partners are made up of images. These might be digitised copies of visual material like photos and maps, or scanned pages of print publications like books or periodicals. In Trove, each image or page has its own unique `nla.obj` identifier. You can use these identifiers to construct urls that lead directly to downloadable versions of the image file.\n",
    "\n",
    "## Method\n",
    "\n",
    "````{margin}\n",
    "```{seealso}\n",
    "While this method is particularly useful in developing computational processes for downloading and processing images, you can also use it in the web interface to make sure you're downloading the highest available resolution of an image. See [](/accessing-data/how-to/download-higher-resolution-images)\n",
    "```\n",
    "````\n",
    "\n",
    "To construct a url to an image file you just add a suffix to the identifier url. For example, this [photograph of a group of school children with gardening tools](https://nla.gov.au/nla.obj-141828112) has the identifier `nla.obj-141828112`. To create a direct link to the image, you just add `/image` to the identifier url:\n",
    "\n",
    "<https://nla.gov.au/nla.obj-141828112/image>\n",
    "\n",
    "The `/image` suffix is probably the most useful option as it provides access to the image at its highest available resolution. In many cases this will be at a higher resolution than is available through the download option provided by the web interface. There are, however, other possible image suffixes:\n",
    "\n",
    "| url suffix | description |\n",
    "|------------| ------------|\n",
    "| `/image` | leads to a higher-resolution JPEG version of the image (longest dimension is a maximum of 5000px) |\n",
    "| `-t` | leads to a thumbnail version of the image (usually around 123px wide) |\n",
    "| `/representativeImage` | leads to an image which has been selected to represent a collection |\n",
    "| `/m` | leads to a very high-resolution TIFF version of the image (only available for selected resources, mostly maps) |\n",
    "\n",
    "```{figure} /images/journal-cover-thumbnails.png\n",
    ":name: journal-cover-thumbnails\n",
    "\n",
    "An example of using the `-t` suffix to [assemble a collection of periodical cover thumbnails](digitised:periodicals:data:thumbnails)\n",
    "```\n",
    "\n",
    "There are additional parameters you can use with `/image` and `/representativeImage`, though I'm not sure how reliably they work:\n",
    "\n",
    "| parameter | description |\n",
    "|-----------|-------------|\n",
    "| `wid` | desired width in pixels |\n",
    "| `hei` | desired height in pixels |\n",
    "\n",
    "For example: <a href=\"https://nla.gov.au/nla.obj-141828112/image?wid=500\">https://nla.gov.au/nla.obj-141828112/image?wid=500</a>\n",
    "\n",
    "```{admonition} Image sizes\n",
    ":class: note\n",
    "The sizes of images downloaded using the `/image` suffix vary unpredictably. Sizes seem to range up to a maximum of 5000 pixels along the longest dimension, but some are much smaller, including many digitised photographs. However, images obtained this way are at the same, or higher, resolution than those available through Trove's built-in download option.\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87dc6823-351d-4b35-9a85-ef412865c25a",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## More examples\n",
    "\n",
    "This photograph of some angry penguins on Heard Island has the identifier `nla.obj-147135602`.\n",
    "\n",
    "```{figure} /images/nla.obj-141171021.jpg\n",
    ":width: 600px\n",
    "\n",
    "Two Rockhopper Penguins and a predatory Skua, Heard Island, Antarctica, ca. 1930 (by Frank Hurley) [http://nla.gov.au/nla.obj-141171021](http://nla.gov.au/nla.obj-141171021)\n",
    "```\n",
    "\n",
    "|  |  |\n",
    "|-------------|-----|\n",
    "|The persistent url is created by adding `http://nla.gov.au/` to the identifier|<http://nla.gov.au/nla.obj-141171021>|\n",
    "|To view the photograph in Trove's digitised image viewer, you just add `view` to the persistent url (this is where the persistent url redirects to anyway)|<http://nla.gov.au/nla.obj-141171021/view>|\n",
    "|To access a thumbnail version of the image, you add `-t` to the persistent url|<http://nla.gov.au/nla.obj-141171021-t>|\n",
    "|To access a high-resolution version of the image, you add `/image` to the persistent url|<http://nla.gov.au/nla.obj-141171021/image>|\n",
    "|To access a version of the image that is 1000 pixels wide, you add `/image?wid=1000` to the persistent url|<http://nla.gov.au/nla.obj-141171021/image?wid=1000>|\n",
    "\n",
    "This works the same way with pages in books and periodicals, however, the urls are a bit more complicated. For example, this page in *The Home* also features a photo of penguins by Frank Hurley. The page's identifier is `nla.obj-387326197`.\n",
    "\n",
    "```{figure} /images/nla.obj-387326197.jpg\n",
    ":width: 600px\n",
    "\n",
    "'Penguin pageant' by Frank Hurley, *The Home*, vol. 20, no. 1, January 1940, p. 44 [http://nla.gov.au/nla.obj-387326197](http://nla.gov.au/nla.obj-387326197)\n",
    "```\n",
    "|  |  |\n",
    "|-------------|-----|\n",
    "|The persistent url is created by adding `http://nla.gov.au/` to the identifier|<http://nla.gov.au/nla.obj-387326197>|\n",
    "|If you access the page's persistent url you are redirected to the issue, with the page identifier included as a `partId` parameter|<https://nla.gov.au/nla.obj-387284380/view?partId=nla.obj-387326197>|\n",
    "|To access a thumbnail version of the page image, you add `-t` to the persistent url|<http://nla.gov.au/nla.obj-387326197-t>|\n",
    "|To access a high-resolution version of the page image, you add `/image` to the persistent url|<http://nla.gov.au/nla.obj-387326197/image>|\n",
    "|To access a version of the image that is 500 pixels high, you add `/image?hei=500` to the persistent url|<http://nla.gov.au/nla.obj-387326197/image?hei=500>|\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8093d408-6909-4c57-a758-b3f65c71a82b",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Getting image/page identifiers\n",
    "\n",
    "````{margin}\n",
    "```{seealso}\n",
    "The GLAM Workbench notebook [Download a collection of digitised images](https://glam-workbench.net/trove-images/download-image-collection/) provides a full working example of obtaining a list of image identifiers from a collection and then downloading each image by adding the `/image` suffix.\n",
    "\n",
    "```\n",
    "````\n",
    "\n",
    "If you want to use this method in a computational process to download all the images in a collection or publication you need some way of finding all the image/page identifiers. The method for doing depends on the type of digitised resource you're dealing with.\n",
    "\n",
    "If you're downloading images from a resource that is made up of pages, such as a book or periodical, you need to:\n",
    "\n",
    "- [extract the metadata](digitised:howto:embedded:extract-metadata) embedded in the digitised book or journal viewer\n",
    "- [get a list of page identifiers](digitised:howto:embedded:pages) from the metadata\n",
    "\n",
    "If you're downloading images from a collection of photographs, maps, or manuscripts, you need to:\n",
    "\n",
    "- [harvest item identifiers](/other-digitised-resources/how-to/get-collection-items) from the digitised collection viewer\n",
    "\n",
    "In fact, the latter method will also work with books and periodicals because they're treated like collections of pages, but it's much more efficient to grab the page identifiers from the embedded metadata.\n",
    "\n",
    "## Availability of high-resolution TIFF files\n",
    "\n",
    "````{margin}\n",
    "```{seealso}\n",
    "\n",
    "According to the [Exploring digitised maps in Trove](https://glam-workbench.net/trove-maps/exploring-digitised-maps/) notebook in the GLAM Workbench there are more than 30,000 digitised maps with high-resolution TIFF downloads. The largest weighs in at more than 3gb!\n",
    "\n",
    "```\n",
    "````\n",
    "\n",
    "As noted the `/m` suffix can be used to download huge, high-resolution TIFF versions of some images. I've only come across this option amongst the digitised maps, though it could be available elsewhere. If you add the `/m` suffix to an image that doesn't have a TIFF version you'll end up downloading a jpeg placeholder image that says 'Not available online'. So how can you determine if a TIFF version is available to download? You need to:\n",
    "\n",
    "- [extract the metadata](digitised:howto:embedded:extract-metadata) embedded in the digitised map viewer\n",
    "- [inspect the `copies` metadata](digitised:howto:embedded:images) to find a verion with `copyrole` set to `m` and `access` set to `true\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}