
24.3. HOW TO: Extract additional metadata from the digitised resource viewer

The viewers you use to examine digitised resources in Trove embed some metadata that isn’t available through the Trove API. This includes a JSON-ified version of the item’s MARC record (presumably copied from the NLA catalogue), as well as structural information used by the viewer itself, such as a list of pages in a digitised book.

This metadata can be useful in a number of different contexts. For example, you can extract the number of pages in a digitised book, then use this number to automatically download the full text or a PDF. The GLAM Workbench includes an example where geospatial coordinates are extracted from the MARC data to add to a harvest of digitised maps.

What metadata is available?

The available metadata varies by viewer and format. The main differences are:

  • the image viewer includes information about digitised images in the copies field

  • the books and journals viewer includes information in the children field about individual pages and sub-sections such as chapters and articles

All viewers

All of the viewers embed some basic metadata, like id and title, at the top level of the JSON data. However, the actual fields can vary by format and viewer type, so don’t assume that a particular field exists, or has a value. Here’s an example from an issue of Walkabout.

    "id": "71404117",
    "collection": "nla.aus",
    "type": "work",
    "form": "Journal",
    "displayTitlePage": "false",
    "subType": "book",
    "issueDate": "Sun, 02 Dec 1934",
    "subUnitNo": "Vol. 1 No. 2 (2 December 1934)",
    "bibLevel": "Item",
    "bibId": "2592481",
    "holdingNumber": "Nq 919.4 WAL",
    "pid": "nla.obj-714041173",
    "title": "Walkabout.",
    "accessConditions": "Unrestricted",
    "copyrightPolicy": "Out of Copyright",
    "recordSource": "NLACat",
    "sensitiveMaterial": "No",
    "commentsExternal": "Some pages in this issue have been restricted. This may affect left/right page sequencing. Some loss of text in gutter due to page edges stitched into gutter at binding process",
    "digitalStatus": "Captured",
    "startDate": "01 January 1934",
    "creator": "",
    "extent": "v. : ill., maps. ; 34 cm.",
    "isMissingPage": "false",
    "publisherName": "Australian National Travel Association",

There’s also a topLevelCollection field that contains the nla.obj identifier of the parent record in this collection. If it’s a single item (i.e. a collection of one) then topLevelCollection will probably be the same as the item identifier in pid.
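Because fields can be missing or empty, it's safer to read them defensively. Here's a minimal sketch, assuming metadata is the dictionary parsed from the viewer's embedded JSON and that topLevelCollection is stored as a plain string. The helper names get_field and is_single_item are made up for illustration.

```python
def get_field(metadata, field, default=None):
    """Return a top-level field, treating missing or empty values as absent."""
    value = metadata.get(field)
    if value is None or value == "":
        return default
    return value


def is_single_item(metadata):
    """Guess whether an item is a 'collection of one' by comparing identifiers."""
    top = get_field(metadata, "topLevelCollection")
    return top is not None and top == get_field(metadata, "pid")


# Values copied from the Walkabout example above (creator is an empty string).
sample = {"pid": "nla.obj-714041173", "title": "Walkabout.", "creator": ""}

print(get_field(sample, "title"))
print(get_field(sample, "creator", default="unknown"))
print(is_single_item(sample))
```

Treating an empty string the same as a missing key means you only have to handle one 'no value' case later on.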

All of the viewers also embed a JSON-ified MARC record in the marcData field.
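The JSON-ified MARC data needs some care. In the example record shown later in this section, a subfield value is sometimes a single dictionary and sometimes a list, and tag and code values can be either strings or numbers. The sketch below normalises those differences before pulling out subfield values. The function names are hypothetical, and treating record and datafield the same way as subfield is an assumption on my part.

```python
def as_list(value):
    """Wrap a lone value in a list so dicts and lists can be looped over alike."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def normalise_tag(tag):
    """Turn 245 or '35' into a three-character string such as '245' or '035'."""
    return str(tag).zfill(3)


def get_marc_subfields(metadata, tag, code):
    """Collect the contents of one subfield code from every matching datafield."""
    values = []
    marc = metadata.get("marcData") or {}
    # Assumption: record and datafield may also appear as a lone dict.
    for record in as_list(marc.get("record")):
        for datafield in as_list(record.get("datafield")):
            if normalise_tag(datafield.get("tag")) != normalise_tag(tag):
                continue
            for subfield in as_list(datafield.get("subfield")):
                if str(subfield.get("code")) == str(code):
                    values.append(subfield.get("content"))
    return values


# A cut-down copy of two datafields from the example record in this section.
sample = {
    "marcData": {
        "record": [
            {
                "datafield": [
                    {"tag": "035", "subfield": {"code": "a", "content": 653766}},
                    {
                        "tag": 245,
                        "subfield": [
                            {"code": "a", "content": "Lord Robert Cecil's gold fields diary /"},
                            {"code": "c", "content": "with introduction and notes by Sir Ernest Scott."},
                        ],
                    },
                ]
            }
        ]
    }
}

print(get_marc_subfields(sample, "245", "a"))
```

Note that the function searches every record in marcData, including any holdings records, so a tag such as 852 can return several values.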

Image and map viewer

The image and map viewer includes a copies field at the top level of the JSON data. This field includes a list of the images associated with this item. Here’s an example from nla.obj-133327370:

"copies": [
    {
        "copyrole": "access",
        "blobId": 146939732,
        "filename": "314560922.jp2",
        "filesize": 6663187,
        "technicalmetadata": {
            "width": 8566,
            "height": 12449
        }
    },
    {
        "copyrole": "m",
        "access": "false",
        "filesize": 745416848
    }
],

The ‘copies’ of the image are different formats or resolutions created for specific purposes, such as access or preservation. Apparently copyrole values can be one of access, m, o, i, or fd, but I’ve only come across access and m. The m copies seem to refer to high-resolution TIFFs, and if access is set to true then these TIFF versions are made available for download. You can find downloadable TIFFs amongst the digitised maps. For example, this map has access set to true for the m copy:

"copies": [
    {
        "copyrole": "access",
        "blobId": 7682805,
        "filename": "23216230.jp2",
        "filesize": 1560253,
        "technicalmetadata": {
            "width": 4519,
            "height": 5508
        }
    },
    {
        "copyrole": "m",
        "access": "true",
        "filesize": 74685872
    }
],

The map viewer reads this value and adds a TIFF option under the download tab. If access is true you can also download the high-resolution TIFF directly by adding /m to the item identifier (though take note of the file size as the downloads can be huge!):

https://nla.gov.au/nla.obj-232162256/m
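Here's a rough sketch of that check in Python. It looks for an m copy with access set to true and, if it finds one, builds the download URL and reports the file size. The function name get_tiff_download is made up for illustration, and it assumes the item's pid is available at the top level of the metadata. The copies values come from the map example above, and the pid is taken from the example URL.

```python
def get_tiff_download(metadata):
    """Return (url, filesize) for a downloadable 'm' copy, or (None, None)."""
    pid = metadata.get("pid")
    for copy in metadata.get("copies") or []:
        # In the examples, access is the string "true" or "false";
        # str() means a real boolean would be handled as well.
        is_public = str(copy.get("access", "")).lower() == "true"
        if pid and copy.get("copyrole") == "m" and is_public:
            return f"https://nla.gov.au/{pid}/m", copy.get("filesize")
    return None, None


map_item = {
    "pid": "nla.obj-232162256",
    "copies": [
        {"copyrole": "access", "blobId": 7682805, "filename": "23216230.jp2"},
        {"copyrole": "m", "access": "true", "filesize": 74685872},
    ],
}

url, size = get_tiff_download(map_item)
if url:
    # Report the size so you can decide whether the download is worth it.
    print(f"{url} ({size / 1_000_000:.1f} MB)")
```

Checking filesize before you download is worthwhile if you're looping over a large harvest of maps.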

Books and journals viewer

The books and journals viewer has a children field in the top-level JSON data which includes page, article, and chapter fields.

Pages

The page field contains details of every page image. Here’s the metadata for a single page in the book The story of the Australian bushrangers:

{
    "id": "48661387",
    "subType": "page",
    "title": "The story of the Australian bushrangers",
    "bibId": "1068148",
    "pid": "nla.obj-486613874",
    "form": "Book",
    "accessConditions": "Unrestricted",
    "copyrightPolicy": "Out of Copyright",
    "bibLevel": "Part",
    "digitalStatus": "Captured",
    "holdingNumber": "NL 343.94 BOX",
    "copies": [
        {
            "copyrole": "access",
            "blobId": 15236579,
            "filename": "48661395.jp2",
            "filesize": 506342,
            "technicalmetadata": {
                "width": 2335,
                "height": 3495
            }
        },
        {
            "copyrole": "m",
            "access": "false",
            "filesize": 24482931
        }
    ]
}

While some of these fields duplicate what’s available at the top level of the metadata, the pid here is the identifier of this particular page. This identifier can be used to download the page image and OCR data.

Each page has a copies field describing the available image versions. The image dimensions of the access copy included in the technicalmetadata field can be useful if you want to use the OCR data to crop sections out of the page image.
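As a minimal sketch, here's one way to count the pages and pull out each page's identifier and access-copy dimensions. It assumes metadata is the dictionary parsed from the embedded JSON. The function names are hypothetical, and the sample is a trimmed copy of the page example above.

```python
def get_pages(metadata):
    """Return the list of page records, or an empty list if there are none."""
    pages = (metadata.get("children") or {}).get("page") or []
    # Assumption: a lone page might appear as a dict rather than a list.
    return pages if isinstance(pages, list) else [pages]


def get_access_dimensions(page):
    """Return (width, height) of the page's access copy, or (None, None)."""
    for copy in page.get("copies") or []:
        if copy.get("copyrole") == "access":
            tech = copy.get("technicalmetadata") or {}
            return tech.get("width"), tech.get("height")
    return None, None


sample = {
    "children": {
        "page": [
            {
                "pid": "nla.obj-486613874",
                "copies": [
                    {
                        "copyrole": "access",
                        "filename": "48661395.jp2",
                        "technicalmetadata": {"width": 2335, "height": 3495},
                    },
                    {"copyrole": "m", "access": "false", "filesize": 24482931},
                ],
            }
        ]
    }
}

pages = get_pages(sample)
print(f"{len(pages)} page(s)")
for page in pages:
    print(page.get("pid"), get_access_dimensions(page))
```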

Articles

Periodical issues can include a list of articles in the article field. Here’s an example of an article entry from Walkabout:

{
    "id": "75337488",
    "subType": "article",
    "pid": "nla.obj-753374885",
    "title": "A Visit to Lake Frome",
    "creator": "By ARTHUR W. UPFIELD",
    "bibLevel": "Section",
    "existson": [
        {
            "id": "71404264",
            "page": "nla.obj-714042646"
        },
        {
            "id": "71404251",
            "page": "nla.obj-714042515"
        },
        {
            "id": "71404232",
            "page": "nla.obj-714042324"
        },
        {
            "id": "71404219",
            "page": "nla.obj-714042196"
        }
    ]
}

Articles have their own values for pid, title, and creator (if the article has a byline). The existson field lists the pages on which this article appears. This article starts on page nla.obj-714042646.
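The sketch below summarises each article and the pages it appears on. It assumes the first entry in existson is the page where the article starts, as it is in the example above, but I haven't confirmed that this always holds. The function name summarise_articles is made up for illustration.

```python
def as_list(value):
    """Wrap a lone value in a list so dicts and lists can be looped over alike."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarise_articles(metadata):
    """Return title, creator, and page identifiers for each article in an issue."""
    summaries = []
    articles = (metadata.get("children") or {}).get("article")
    for article in as_list(articles):
        pages = [entry.get("page") for entry in as_list(article.get("existson"))]
        summaries.append(
            {
                "pid": article.get("pid"),
                "title": article.get("title"),
                # creator is only present if the article has a byline
                "creator": article.get("creator"),
                # Assumption: the first existson entry is the starting page.
                "first_page": pages[0] if pages else None,
                "pages": pages,
            }
        )
    return summaries


# The article entry from the Walkabout example above.
sample = {
    "children": {
        "article": [
            {
                "pid": "nla.obj-753374885",
                "title": "A Visit to Lake Frome",
                "creator": "By ARTHUR W. UPFIELD",
                "existson": [
                    {"id": "71404264", "page": "nla.obj-714042646"},
                    {"id": "71404251", "page": "nla.obj-714042515"},
                    {"id": "71404232", "page": "nla.obj-714042324"},
                    {"id": "71404219", "page": "nla.obj-714042196"},
                ],
            }
        ]
    }
}

for summary in summarise_articles(sample):
    print(summary["title"], summary["first_page"], len(summary["pages"]))
```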

Chapters

Books can include a list of chapters in the chapter field. Here’s an example of a chapter entry from The story of the Australian bushrangers:

{
    "id": "49622020",
    "subType": "chapter",
    "subUnitNo": "2",
    "title": "PREFACE.",
    "pid": "nla.obj-496220207",
    "bibLevel": "Section",
    "existson": [
        {
            "id": "48661510",
            "page": "nla.obj-486615102"
        },
        {
            "id": "48661523",
            "page": "nla.obj-486615233"
        }
    ]
}

Chapters have their own values for pid and title, while the subUnitNo specifies the order of the chapters. The existson field lists the pages on which this chapter appears.
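Since subUnitNo is a string in the example above, sorting on it directly would put "10" before "2". Here's a small sketch that sorts chapters numerically instead. The function names are hypothetical, and the second chapter in the sample is invented purely to show the ordering.

```python
def chapter_sort_key(chapter):
    """Sort numeric subUnitNo values first, in numeric order."""
    try:
        return (0, int(str(chapter.get("subUnitNo", "")).strip()))
    except ValueError:
        # Missing or non-numeric values go last, keeping their original order.
        return (1, 0)


def get_chapters_in_order(metadata):
    """Return a book's chapter records sorted by subUnitNo."""
    chapters = (metadata.get("children") or {}).get("chapter") or []
    # Assumption: a lone chapter might appear as a dict rather than a list.
    if isinstance(chapters, dict):
        chapters = [chapters]
    return sorted(chapters, key=chapter_sort_key)


sample = {
    "children": {
        "chapter": [
            # Invented entry, not from the real book
            {"subUnitNo": "10", "title": "Hypothetical later chapter"},
            # Entry from the example above
            {"subUnitNo": "2", "title": "PREFACE.", "pid": "nla.obj-496220207"},
        ]
    }
}

for chapter in get_chapters_in_order(sample):
    print(chapter["subUnitNo"], chapter["title"])
```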

Extracting the metadata

The function to extract the metadata is fairly straightforward. It loads the viewer’s HTML code and uses a regular expression to find and extract the embedded JSON string. It expects an nla.obj identifier, either on its own or as a full URL. For the image and map viewers, this is the identifier of an individual item. For the book and journal viewer you can use the nla.obj identifier for the book, issue, page, or article. This is because page and article identifiers are redirected to issues. Here’s a full example that extracts the embedded metadata for the book Lord Robert Cecil’s gold fields diary.

import json
import re

import requests
from IPython.display import JSON


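def get_metadata(id):
    """
    Extract work data in a JSON string from the work's HTML page.
    """
    # Accept either a bare identifier (nla.obj-...) or a full URL.
    if not id.startswith("http"):
        id = "https://nla.gov.au/" + id
    # A timeout stops the request from hanging indefinitely.
    response = requests.get(id, timeout=60)
    try:
        # The viewer embeds the metadata as a JSON string in its JavaScript.
        work_data = re.search(
            r"var work = JSON\.parse\(JSON\.stringify\((\{.*\})", response.text
        ).group(1)
    except AttributeError:
        # re.search() returned None (no match), so fall back to an empty object.
        work_data = "{}"
    return json.loads(work_data)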


book_id = "https://nla.gov.au/nla.obj-362059651/"

metadata = get_metadata(book_id)

display(metadata)
{'id': '36205965',
 'collection': 'nla.aus',
 'type': 'work',
 'form': 'Book',
 'subType': 'book',
 'bibLevel': 'Item',
 'bibId': '653766',
 'holdingNumber': 'JAFp BIO 92',
 'pid': 'nla.obj-362059651',
 'title': "Lord Robert Cecil's gold fields diary",
 'accessConditions': 'Unrestricted',
 'copyrightPolicy': 'Out of Copyright',
 'recordSource': 'NLACat',
 'digitalStatus': 'Captured',
 'startDate': '01 January 1945',
 'creator': 'Salisbury, Robert Cecil, marquess of, 1830-1903. 338373 9112d83c-f87f-5a34-a022-bea98d9ee823',
 'extent': '32 p., [20] p. of plates : ill. ; 18 cm.',
 'publisherName': 'Melbourne University Press',
 'allowSearchEngineIndexing': 'false',
 'findingAidAvailable': 'No',
 'isOriginalCopyAvaliable': 'false',
 'ocrMetsCopyAvaliable': 'true',
 'partnerNucs': [],
 'parentProjectIds': [],
 'projectIds': [],
 'marcData': {'record': [{'leader': {'type': 'Bibliographic',
     'content': '01297cam a2200289 a 4500'},
    'datafield': [{'ind2': ' ',
      'ind1': 1,
      'subfield': [{'code': 'a', 'content': 2617933},
       {'code': 'z', 'content': 9324621}],
      'tag': '019'},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 9, 'content': '(AuCNLDY)577939'},
      'tag': '035'},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 653766},
      'tag': '035'},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': 'NNCU:A'},
       {'code': 'b', 'content': 'eng'},
       {'code': 'c', 'content': 'NNCU:A'},
       {'code': 'd', 'content': 'AUC:LSM'}],
      'tag': '040'},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'anuc'},
      'tag': '042'},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'u-at-vi'},
      'tag': '043'},
     {'ind2': 4,
      'ind1': 0,
      'subfield': [{'code': 'a', 'content': '994.5/03'},
       {'code': 2, 'content': 19}],
      'tag': '082'},
     {'ind2': ' ',
      'ind1': 1,
      'subfield': [{'code': 'a', 'content': 'Salisbury, Robert Cecil,'},
       {'code': 'c', 'content': 'marquess of,'},
       {'code': 'd', 'content': '1830-1903.'},
       {'code': 0, 'content': 338373},
       {'code': 9, 'content': '9112d83c-f87f-5a34-a022-bea98d9ee823'}],
      'tag': 100},
     {'ind2': 0,
      'ind1': 1,
      'subfield': [{'code': 'a',
        'content': "Lord Robert Cecil's gold fields diary /"},
       {'code': 'c',
        'content': 'with introduction and notes by Sir Ernest Scott.'}],
      'tag': 245},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': '2nd ed.'},
      'tag': 250},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': 'Carlton, Vic. :'},
       {'code': 'b', 'content': 'Melbourne University Press,'},
       {'code': 'c', 'content': 1945}],
      'tag': 260},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': '32 p., [20] p. of plates :'},
       {'code': 'b', 'content': 'ill. ;'},
       {'code': 'c', 'content': '18 cm.'}],
      'tag': 300},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': 'Also available online'},
       {'code': 'u', 'content': 'http://nla.gov.au/nla.obj-362059651'}],
      'tag': 530},
     {'ind2': 0,
      'ind1': 1,
      'subfield': [{'code': 'a', 'content': 'Salisbury, Robert Cecil,'},
       {'code': 'c', 'content': 'marquess of,'},
       {'code': 'd', 'content': '1830-1903.'},
       {'code': 0, 'content': 338373},
       {'code': 9, 'content': '9112d83c-f87f-5a34-a022-bea98d9ee823'}],
      'tag': 600},
     {'ind2': 0,
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': 'Gold mines and mining'},
       {'code': 'z', 'content': 'Victoria.'}],
      'tag': 650},
     {'ind2': 0,
      'ind1': ' ',
      'subfield': [{'code': 'a', 'content': 'Victoria'},
       {'code': 'x', 'content': 'Description and travel.'}],
      'tag': 651},
     {'ind2': ' ',
      'ind1': 1,
      'subfield': [{'code': 'a', 'content': 'Scott, Ernest,'},
       {'code': 'd', 'content': '1868-1939.'},
       {'code': 0, 'content': 118638},
       {'code': 9, 'content': '6a97b19f-f2eb-57c3-a661-e5b83e83581c'}],
      'tag': 700},
     {'ind2': 1,
      'ind1': 4,
      'subfield': [{'code': 'z',
        'content': 'National Library of Australia digitised item. JAFp BIO 92 copy'},
       {'code': 'u', 'content': 'http://nla.gov.au/nla.obj-362059651'},
       {'code': 'x', 'content': 'fulltext'}],
      'tag': 856},
     {'ind2': 'f',
      'ind1': 'f',
      'subfield': [{'code': 'i',
        'content': '2c2be5dd-982b-5020-ac7c-f8820fc3ae34'},
       {'code': 's', 'content': 'b08c2134-2fff-5812-bdd3-a55d68f73fa1'}],
      'tag': 999}],
    'controlfield': [{'tag': '001', 'content': 653766},
     {'tag': '005', 'content': 20240325010329.3},
     {'tag': '008', 'content': '830518s1945    vraa          000 0aeng d'}]},
   {'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
    'datafield': [{'ind2': ' ',
      'ind1': 8,
      'subfield': [{'code': 'b', 'content': 'AUSP'},
       {'code': 'h', 'content': 'Np 994.5 SAL'}],
      'tag': 852},
     {'ind2': 0,
      'ind1': ' ',
      'subfield': {'code': 'z', 'content': 'N pbk'},
      'tag': 866},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'NLA'},
      'tag': 954}],
    'controlfield': [{'tag': '001', 'content': 2912182},
     {'tag': '004', 'content': 653766}]},
   {'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
    'datafield': [{'ind2': ' ',
      'ind1': 8,
      'subfield': [{'code': 'b', 'content': 'PET'},
       {'code': 'h', 'content': 'JAFp BIO 92'}],
      'tag': 852},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'NLA'},
      'tag': 954}],
    'controlfield': [{'tag': '001', 'content': 2912183},
     {'tag': '004', 'content': 653766}]},
   {'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
    'datafield': [{'ind2': ' ',
      'ind1': 8,
      'subfield': [{'code': 'b', 'content': 'PET'},
       {'code': 'h', 'content': 'JAFp GEN SAL'}],
      'tag': 852},
     {'ind2': 0,
      'ind1': ' ',
      'subfield': {'code': 'z', 'content': 'FC copy'},
      'tag': 866},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'NLA'},
      'tag': 954}],
    'controlfield': [{'tag': '001', 'content': 2912184},
     {'tag': '004', 'content': 653766}]},
   {'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
    'datafield': [{'ind2': ' ',
      'ind1': 8,
      'subfield': [{'code': 'b', 'content': 'AUSLP'},
       {'code': 'h', 'content': 'NLp 994.5 SAL'}],
      'tag': 852},
     {'ind2': 0,
      'ind1': ' ',
      'subfield': {'code': 'z', 'content': 'NL pbk'},
      'tag': 866},
     {'ind2': ' ',
      'ind1': ' ',
      'subfield': {'code': 'a', 'content': 'NLA'},
      'tag': 954}],
    'controlfield': [{'tag': '001', 'content': 4042315},
     {'tag': '004', 'content': 653766}]}]},
 'children': {'page': [{'id': '36205990',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362059904',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551350,
      'filename': '36205998.jp2',
      'filesize': 357765,
      'technicalmetadata': {'width': 1926, 'height': 2840}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 16436560}]},
   {'id': '36206003',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060036',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551354,
      'filename': '36206011.jp2',
      'filesize': 325233,
      'technicalmetadata': {'width': 1743, 'height': 2808}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14709600}]},
   {'id': '36206017',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060175',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551358,
      'filename': '36206025.jp2',
      'filesize': 332757,
      'technicalmetadata': {'width': 1766, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14976176}]},
   {'id': '36206030',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060307',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551362,
      'filename': '36206038.jp2',
      'filesize': 290972,
      'technicalmetadata': {'width': 1642, 'height': 2649}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13074028}]},
   {'id': '36206043',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060433',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551366,
      'filename': '36206051.jp2',
      'filesize': 316353,
      'technicalmetadata': {'width': 1694, 'height': 2798}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14244196}]},
   {'id': '36206056',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060563',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551370,
      'filename': '36206064.jp2',
      'filesize': 308812,
      'technicalmetadata': {'width': 1678, 'height': 2739}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13813536}]},
   {'id': '36206069',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060694',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551374,
      'filename': '36206077.jp2',
      'filesize': 315081,
      'technicalmetadata': {'width': 1688, 'height': 2792}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14165840}]},
   {'id': '36206082',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060828',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551378,
      'filename': '36206090.jp2',
      'filesize': 314714,
      'technicalmetadata': {'width': 1690, 'height': 2782}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14129776}]},
   {'id': '36206095',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362060959',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551382,
      'filename': '36206103.jp2',
      'filesize': 315411,
      'technicalmetadata': {'width': 1700, 'height': 2792}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14266964}]},
   {'id': '36206108',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061083',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551386,
      'filename': '36206116.jp2',
      'filesize': 310531,
      'technicalmetadata': {'width': 1674, 'height': 2783}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14004236}]},
   {'id': '36206121',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061212',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551390,
      'filename': '36206129.jp2',
      'filesize': 316813,
      'technicalmetadata': {'width': 1706, 'height': 2804}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14378908}]},
   {'id': '36206134',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061349',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551394,
      'filename': '36206142.jp2',
      'filesize': 309855,
      'technicalmetadata': {'width': 1670, 'height': 2796}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14036000}]},
   {'id': '36206147',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061476',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551398,
      'filename': '36206155.jp2',
      'filesize': 321690,
      'technicalmetadata': {'width': 1706, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14469512}]},
   {'id': '36206160',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061603',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551402,
      'filename': '36206168.jp2',
      'filesize': 302721,
      'technicalmetadata': {'width': 1645, 'height': 2750}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13599008}]},
   {'id': '36206173',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061731',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551406,
      'filename': '36206181.jp2',
      'filesize': 302730,
      'technicalmetadata': {'width': 1661, 'height': 2743}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13696764}]},
   {'id': '36206186',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061862',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551410,
      'filename': '36206194.jp2',
      'filesize': 309234,
      'technicalmetadata': {'width': 1669, 'height': 2783}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13962668}]},
   {'id': '36206199',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362061995',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551414,
      'filename': '36206207.jp2',
      'filesize': 318131,
      'technicalmetadata': {'width': 1694, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14369336}]},
   {'id': '36206212',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062124',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551418,
      'filename': '36206220.jp2',
      'filesize': 307502,
      'technicalmetadata': {'width': 1665, 'height': 2787}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13949196}]},
   {'id': '36206225',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062250',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551422,
      'filename': '36206233.jp2',
      'filesize': 310439,
      'technicalmetadata': {'width': 1664, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14114880}]},
   {'id': '36206238',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062384',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551426,
      'filename': '36206246.jp2',
      'filesize': 301196,
      'technicalmetadata': {'width': 1648, 'height': 2763}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13687828}]},
   {'id': '36206251',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062514',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551430,
      'filename': '36206259.jp2',
      'filesize': 309281,
      'technicalmetadata': {'width': 1666, 'height': 2775}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13897256}]},
   {'id': '36206264',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062649',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551434,
      'filename': '36206272.jp2',
      'filesize': 305751,
      'technicalmetadata': {'width': 1630, 'height': 2806}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13748248}]},
   {'id': '36206277',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062779',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551438,
      'filename': '36206285.jp2',
      'filesize': 308294,
      'technicalmetadata': {'width': 1640, 'height': 2796}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13783776}]},
   {'id': '36206290',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362062905',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551442,
      'filename': '36206298.jp2',
      'filesize': 305747,
      'technicalmetadata': {'width': 1612, 'height': 2830}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13713280}]},
   {'id': '36206303',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063031',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551446,
      'filename': '36206311.jp2',
      'filesize': 306590,
      'technicalmetadata': {'width': 1640, 'height': 2797}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13788724}]},
   {'id': '36206316',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063165',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551450,
      'filename': '36206324.jp2',
      'filesize': 299724,
      'technicalmetadata': {'width': 1625, 'height': 2789}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13623720}]},
   {'id': '36206329',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063296',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551454,
      'filename': '36206337.jp2',
      'filesize': 295007,
      'technicalmetadata': {'width': 1595, 'height': 2737}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13124464}]},
   {'id': '36206342',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063426',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551458,
      'filename': '36206350.jp2',
      'filesize': 297779,
      'technicalmetadata': {'width': 1588, 'height': 2812}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13423928}]},
   {'id': '36206355',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063555',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551462,
      'filename': '36206363.jp2',
      'filesize': 296505,
      'technicalmetadata': {'width': 1609, 'height': 2771}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13402596}]},
   {'id': '36206368',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063681',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551466,
      'filename': '36206376.jp2',
      'filesize': 295690,
      'technicalmetadata': {'width': 1601, 'height': 2749}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13230876}]},
   {'id': '36206381',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063813',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551470,
      'filename': '36206389.jp2',
      'filesize': 295336,
      'technicalmetadata': {'width': 1602, 'height': 2732}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13157308}]},
   {'id': '36206394',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362063942',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551474,
      'filename': '36206402.jp2',
      'filesize': 297440,
      'technicalmetadata': {'width': 1594, 'height': 2794}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13388024}]},
   {'id': '36206407',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064075',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551478,
      'filename': '36206415.jp2',
      'filesize': 289403,
      'technicalmetadata': {'width': 1569, 'height': 2764}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13037244}]},
   {'id': '36206420',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064205',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551482,
      'filename': '36206428.jp2',
      'filesize': 294963,
      'technicalmetadata': {'width': 1593, 'height': 2786}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13341424}]},
   {'id': '36206433',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064331',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551486,
      'filename': '36206441.jp2',
      'filesize': 300005,
      'technicalmetadata': {'width': 1599, 'height': 2784}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13381884}]},
   {'id': '36206446',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064462',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551490,
      'filename': '36206454.jp2',
      'filesize': 295397,
      'technicalmetadata': {'width': 1599, 'height': 2778}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13353752}]},
   {'id': '36206459',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064599',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551494,
      'filename': '36206467.jp2',
      'filesize': 293873,
      'technicalmetadata': {'width': 1584, 'height': 2754}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13114076}]},
   {'id': '36206472',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064729',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551498,
      'filename': '36206480.jp2',
      'filesize': 289866,
      'technicalmetadata': {'width': 1554, 'height': 2783}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13001488}]},
   {'id': '36206485',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064858',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551502,
      'filename': '36206493.jp2',
      'filesize': 300002,
      'technicalmetadata': {'width': 1639, 'height': 2761}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13603936}]},
   {'id': '36206498',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362064987',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551506,
      'filename': '36206506.jp2',
      'filesize': 317983,
      'technicalmetadata': {'width': 1738, 'height': 2754}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14387576}]},
   {'id': '36206511',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065113',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551510,
      'filename': '36206519.jp2',
      'filesize': 322407,
      'technicalmetadata': {'width': 1728, 'height': 2786}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14470684}]},
   {'id': '36206524',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065244',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551514,
      'filename': '36206532.jp2',
      'filesize': 333367,
      'technicalmetadata': {'width': 1762, 'height': 2830}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14987704}]},
   {'id': '36206537',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065373',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551518,
      'filename': '36206545.jp2',
      'filesize': 330449,
      'technicalmetadata': {'width': 1754, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14877328}]},
   {'id': '36206550',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065503',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551522,
      'filename': '36206558.jp2',
      'filesize': 324108,
      'technicalmetadata': {'width': 1752, 'height': 2786}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14669868}]},
   {'id': '36206563',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065633',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551526,
      'filename': '36206571.jp2',
      'filesize': 315480,
      'technicalmetadata': {'width': 1716, 'height': 2762}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14246788}]},
   {'id': '36206576',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065767',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551530,
      'filename': '36206584.jp2',
      'filesize': 324317,
      'technicalmetadata': {'width': 1750, 'height': 2788}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14665132}]},
   {'id': '36206589',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362065894',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551534,
      'filename': '36206597.jp2',
      'filesize': 326781,
      'technicalmetadata': {'width': 1742, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14775888}]},
   {'id': '36206602',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066023',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551538,
      'filename': '36206610.jp2',
      'filesize': 333360,
      'technicalmetadata': {'width': 1780, 'height': 2830}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 15140296}]},
   {'id': '36206615',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066153',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551542,
      'filename': '36206623.jp2',
      'filesize': 314707,
      'technicalmetadata': {'width': 1706, 'height': 2755}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14128324}]},
   {'id': '36206628',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066287',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551546,
      'filename': '36206636.jp2',
      'filesize': 324455,
      'technicalmetadata': {'width': 1758, 'height': 2790}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14742720}]},
   {'id': '36206641',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066417',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551550,
      'filename': '36206649.jp2',
      'filesize': 321925,
      'technicalmetadata': {'width': 1732, 'height': 2808}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14618624}]},
   {'id': '36206654',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066547',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551554,
      'filename': '36206662.jp2',
      'filesize': 329478,
      'technicalmetadata': {'width': 1779, 'height': 2795}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14945160}]},
   {'id': '36206667',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066673',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551558,
      'filename': '36206675.jp2',
      'filesize': 308578,
      'technicalmetadata': {'width': 1682, 'height': 2747}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 13889596}]},
   {'id': '36206680',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066807',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551562,
      'filename': '36206688.jp2',
      'filesize': 323456,
      'technicalmetadata': {'width': 1750, 'height': 2763}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14533588}]},
   {'id': '36206693',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362066935',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551566,
      'filename': '36206701.jp2',
      'filesize': 324834,
      'technicalmetadata': {'width': 1718, 'height': 2822}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 14569580}]},
   {'id': '36205977',
    'subType': 'page',
    'title': "Lord Robert Cecil's gold fields diary",
    'bibId': '653766',
    'pid': 'nla.obj-362059771',
    'form': 'Book',
    'accessConditions': 'Unrestricted',
    'copyrightPolicy': 'Out of Copyright',
    'bibLevel': 'Part',
    'digitalStatus': 'Captured',
    'holdingNumber': 'JAFp BIO 92',
    'copies': [{'copyrole': 'access',
      'blobId': 11551346,
      'filename': '36205985.jp2',
      'filesize': 361053,
      'technicalmetadata': {'width': 1938, 'height': 2801}},
     {'copyrole': 'm', 'access': 'false', 'filesize': 16310064}]}],
  'article': [],
  'chapter': [{'id': '36207811',
    'subType': 'chapter',
    'subUnitNo': '1',
    'pid': 'nla.obj-362078119',
    'bibLevel': 'Section',
    'existson': [{'id': '36205990', 'page': 'nla.obj-362059904'},
     {'id': '36206003', 'page': 'nla.obj-362060036'},
     {'id': '36206017', 'page': 'nla.obj-362060175'},
     {'id': '36206030', 'page': 'nla.obj-362060307'},
     {'id': '36206043', 'page': 'nla.obj-362060433'},
     {'id': '36206056', 'page': 'nla.obj-362060563'},
     {'id': '36206069', 'page': 'nla.obj-362060694'}]},
   {'id': '36207820',
    'subType': 'chapter',
    'subUnitNo': '2',
    'pid': 'nla.obj-362078203',
    'bibLevel': 'Section',
    'existson': [{'id': '36206095', 'page': 'nla.obj-362060959'},
     {'id': '36206082', 'page': 'nla.obj-362060828'},
     {'id': '36206108', 'page': 'nla.obj-362061083'},
     {'id': '36206121', 'page': 'nla.obj-362061212'},
     {'id': '36206134', 'page': 'nla.obj-362061349'},
     {'id': '36206147', 'page': 'nla.obj-362061476'},
     {'id': '36206160', 'page': 'nla.obj-362061603'},
     {'id': '36206173', 'page': 'nla.obj-362061731'},
     {'id': '36206186', 'page': 'nla.obj-362061862'},
     {'id': '36206199', 'page': 'nla.obj-362061995'},
     {'id': '36206212', 'page': 'nla.obj-362062124'},
     {'id': '36206225', 'page': 'nla.obj-362062250'},
     {'id': '36206238', 'page': 'nla.obj-362062384'},
     {'id': '36206251', 'page': 'nla.obj-362062514'},
     {'id': '36206264', 'page': 'nla.obj-362062649'},
     {'id': '36206277', 'page': 'nla.obj-362062779'},
     {'id': '36206290', 'page': 'nla.obj-362062905'},
     {'id': '36206303', 'page': 'nla.obj-362063031'},
     {'id': '36206316', 'page': 'nla.obj-362063165'},
     {'id': '36206329', 'page': 'nla.obj-362063296'},
     {'id': '36206342', 'page': 'nla.obj-362063426'},
     {'id': '36206355', 'page': 'nla.obj-362063555'},
     {'id': '36206368', 'page': 'nla.obj-362063681'},
     {'id': '36206381', 'page': 'nla.obj-362063813'},
     {'id': '36206394', 'page': 'nla.obj-362063942'},
     {'id': '36206407', 'page': 'nla.obj-362064075'},
     {'id': '36206420', 'page': 'nla.obj-362064205'},
     {'id': '36206433', 'page': 'nla.obj-362064331'},
     {'id': '36206446', 'page': 'nla.obj-362064462'},
     {'id': '36206459', 'page': 'nla.obj-362064599'},
     {'id': '36206472', 'page': 'nla.obj-362064729'},
     {'id': '36206485', 'page': 'nla.obj-362064858'},
     {'id': '36206498', 'page': 'nla.obj-362064987'},
     {'id': '36206511', 'page': 'nla.obj-362065113'},
     {'id': '36206524', 'page': 'nla.obj-362065244'},
     {'id': '36206537', 'page': 'nla.obj-362065373'},
     {'id': '36206550', 'page': 'nla.obj-362065503'},
     {'id': '36206563', 'page': 'nla.obj-362065633'},
     {'id': '36206576', 'page': 'nla.obj-362065767'},
     {'id': '36206589', 'page': 'nla.obj-362065894'},
     {'id': '36206602', 'page': 'nla.obj-362066023'},
     {'id': '36206615', 'page': 'nla.obj-362066153'},
     {'id': '36206628', 'page': 'nla.obj-362066287'},
     {'id': '36206641', 'page': 'nla.obj-362066417'},
     {'id': '36206654', 'page': 'nla.obj-362066547'},
     {'id': '36206667', 'page': 'nla.obj-362066673'},
     {'id': '36206680', 'page': 'nla.obj-362066807'},
     {'id': '36206693', 'page': 'nla.obj-362066935'},
     {'id': '36205977', 'page': 'nla.obj-362059771'}]}],
  'book': [],
  'volume': [],
  'other': []},
 'topLevelCollection': 'nla.obj-362059651'}

Get MARC catalogue data#

The MARC data is contained in the marcData field. This field can contain multiple records – the main metadata is contained in the record which has type set to Bibliographic in the leader field.
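Before choosing a record, it can help to see which record types are present. Here's a minimal sketch, assuming `marcData` has the shape used in the function below (a `record` list in which each record's `leader` has a `type`). The helper name `count_record_types`, the sample data, and the `Holdings` type label are made up for illustration.

```python
from collections import Counter


def count_record_types(metadata):
    """
    Count the MARC records in the embedded metadata by leader type.
    """
    records = metadata.get("marcData", {}).get("record", [])
    # Assumption: a lone record might be supplied as a dict rather than a list
    if isinstance(records, dict):
        records = [records]
    return Counter(r.get("leader", {}).get("type", "unknown") for r in records)


# Made-up sample with the same shape as the embedded metadata
sample = {
    "marcData": {
        "record": [
            {"leader": {"type": "Bibliographic", "content": "sample leader"}},
            {"leader": {"type": "Holdings", "content": "sample leader"}},
        ]
    }
}
record_types = count_record_types(sample)
print(record_types)
```

Items without any MARC data simply produce an empty count.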

Tools like PyMARC can help you get information from MARC records. However, Trove’s marcData isn’t in a format that PyMARC recognises. The function below finds the Bibliographic record and restructures the data for use with PyMARC.

import json

from pymarc import JSONReader


def parse_marc(metadata):
    """
    Parse the bibliographic MARC data in the embedded metadata.
    This produces a structure that can be loaded into PyMARC's JSON reader.
    """
    # Some nla.obj items don't have MARC data
    # For example some collections
    try:
        records = metadata["marcData"]["record"]
    except KeyError:
        # Return an empty list so the result is always a list of records
        return []

    # The metadata contains bibliographic and holdings MARC data
    # here we'll select the bib record.
    for record in records:
        if record["leader"].get("type") == "Bibliographic":
            break
    else:
        # No bibliographic record was found
        return []

    fields = []
    # Control fields only have content, no subfields
    for cf in record.get("controlfield", []):
        fields.append({str(cf["tag"]): str(cf["content"])})

    # Loop through all the fields
    for field in record["datafield"]:
        subfields = []
        # Get any subfields
        sfs = field.get("subfield", [])
        # The subfields value can be a list or dict
        # Check if it's a list
        if isinstance(sfs, list):
            # Loop through the subfields adding the values
            for sf in sfs:
                subfields.append({sf["code"]: str(sf["content"])})
        # If it's not a list just add the details from the dict
        else:
            subfields.append({sfs["code"]: str(sfs["content"])})
        fields.append(
            {
                str(field["tag"]): {
                    "subfields": subfields,
                    "ind1": field["ind1"],
                    "ind2": field["ind2"],
                }
            }
        )

    return [{"leader": record["leader"]["content"], "fields": fields}]

First you extract the MARC data and restructure it for use with PyMARC.

marc_json = parse_marc(metadata)

Then you can load the MARC data into PyMARC.

# PyMARC expects a JSON string so we dump it to a string first
reader = JSONReader(json.dumps(marc_json))

To retrieve a value from PyMARC you need to know the MARC tag and subfield for the field you’re interested in. For example, the main title of a work is in MARC tag 245, subfield a.

for record in reader:
    print(record["245"]["a"])
Lord Robert Cecil's gold fields diary /

The subfield c contains a ‘statement of responsibility’.

for record in reader:
    print(record["245"]["c"])
with introduction and notes by Sir Ernest Scott.

PyMARC also includes some handy shortcuts to save you having to remember all the codes.

for record in reader:
    print(record.title)
    print(record.author)
    print(record.publisher)
    print(record.pubyear)
Lord Robert Cecil's gold fields diary /
Salisbury, Robert Cecil, marquess of, 1830-1903. 338373 9112d83c-f87f-5a34-a022-bea98d9ee823
Melbourne University Press,
1945
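If you only need one or two values, you could also read them straight from the restructured data without loading it into PyMARC. This is a rough sketch rather than a replacement for PyMARC. It assumes the list structure returned by `parse_marc()` above. The helper name `get_subfield_values` is hypothetical, and the sample record is hand-built, with placeholder values for the leader, control field, and indicators.

```python
def get_subfield_values(marc_json, tag, code):
    """
    Collect subfield values from the structure returned by parse_marc().
    """
    values = []
    for record in marc_json:
        for field in record["fields"]:
            # Each field is a single-key dict: {tag: details}
            details = field.get(tag)
            # Control fields hold a plain string, so skip anything that isn't a dict
            if not isinstance(details, dict):
                continue
            for subfield in details["subfields"]:
                if code in subfield:
                    values.append(subfield[code])
    return values


# Hand-built sample in the restructured format (placeholder leader, control field, indicators)
sample_marc = [
    {
        "leader": "sample leader",
        "fields": [
            {"001": "placeholder"},
            {
                "245": {
                    "subfields": [
                        {"a": "Lord Robert Cecil's gold fields diary /"},
                        {"c": "with introduction and notes by Sir Ernest Scott."},
                    ],
                    "ind1": " ",
                    "ind2": " ",
                }
            },
        ],
    }
]
print(get_subfield_values(sample_marc, "245", "a"))
```

The function returns a list because MARC tags and subfields can be repeated within a record.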

Get information about pages#

Books and periodical issues should include page data in the children field. To find the number of pages, you just need to get the length of the page list.

# How many pages are there?
len(metadata["children"]["page"])
56

If you want to get the identifiers for each individual page, just loop through the list of pages saving the pid value.

page_ids = [p["pid"] for p in metadata["children"]["page"]]
page_ids[:5]
['nla.obj-362059904',
 'nla.obj-362060036',
 'nla.obj-362060175',
 'nla.obj-362060307',
 'nla.obj-362060433']

These page identifiers can be used to download images of the pages.

Here’s a function you can use to get the dimensions of the access copy of a page. Note, however, that the downloadable versions of page images seem to be limited to a maximum of 5000 pixels on the longest dimension. It’s important to know the difference between the size of the access copy and the downloaded page if you’re going to make use of the page’s OCR layout data.

def get_page_size(page_id):
    """
    Get the dimensions of a page image from embedded metadata.
    """
    metadata = get_metadata(page_id)
    for page in metadata["children"]["page"]:
        if page["pid"] == page_id:
            for copy in page["copies"]:
                if copy["copyrole"] == "access":
                    return copy["technicalmetadata"]
    # No matching page or access copy was found
    return None

get_page_size("nla.obj-362059904")
{'width': 1926, 'height': 2840}
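If the image you're working with has different dimensions from the image that a set of coordinates refers to, you'll need to rescale the coordinates. The sketch below shows the arithmetic only. The helper name `scale_box`, the `x`/`y`/`w`/`h` box format, and the second set of dimensions are assumptions for illustration, not part of Trove's data.

```python
def scale_box(box, from_size, to_size):
    """
    Rescale a bounding box from one image size to another.
    Sizes are dicts with 'width' and 'height', like technicalmetadata.
    """
    x_scale = to_size["width"] / from_size["width"]
    y_scale = to_size["height"] / from_size["height"]
    return {
        "x": round(box["x"] * x_scale),
        "y": round(box["y"] * y_scale),
        "w": round(box["w"] * x_scale),
        "h": round(box["h"] * y_scale),
    }


# Dimensions of the access copy, as returned by get_page_size()
access_size = {"width": 1926, "height": 2840}
# Made-up dimensions of another version of the same page image
other_size = {"width": 963, "height": 1420}

# Hypothetical box expressed in access copy pixels
box = {"x": 100, "y": 200, "w": 300, "h": 50}
scaled = scale_box(box, access_size, other_size)
print(scaled)
```

Check which image size your coordinates actually refer to before deciding which way to scale.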

Get a list of articles in a periodical issue#
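
A minimal sketch, assuming that entries in the article list have the same shape as the chapter entries in the example output above: a `pid`, a `subUnitNo`, and an `existson` list of the pages each section appears on. The helper name `list_sections` is hypothetical, the `title` field is an assumption, and the sample data reuses the first chapter from the example rather than a real article.

```python
def list_sections(metadata, section_type="article"):
    """
    List the sub-sections of an item along with the pages they appear on.
    """
    sections = []
    for section in metadata.get("children", {}).get(section_type, []):
        sections.append(
            {
                "pid": section.get("pid"),
                # Assumption: some sections may include a title
                "title": section.get("title"),
                "subUnitNo": section.get("subUnitNo"),
                "pages": [p["page"] for p in section.get("existson", [])],
            }
        )
    return sections


# Cut-down sample based on the chapter data in the example output
sample = {
    "children": {
        "article": [],
        "chapter": [
            {
                "id": "36207811",
                "subType": "chapter",
                "subUnitNo": "1",
                "pid": "nla.obj-362078119",
                "bibLevel": "Section",
                "existson": [
                    {"id": "36205990", "page": "nla.obj-362059904"},
                    {"id": "36206003", "page": "nla.obj-362060036"},
                ],
            }
        ],
    }
}
chapters = list_sections(sample, section_type="chapter")
articles = list_sections(sample)
print(chapters)
```

Since fields vary between items, the function uses `.get()` throughout rather than assuming a particular key exists.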

Get information about images and maps#

The digitised image and map viewers include information about digitised images in the copies field. This function returns the details of the image with the specified role – defaulting to the access version.

def get_image_copy(image_id, role="access"):
    """
    Get image copy details for a particular copy role.
    """
    metadata = get_metadata(image_id)
    for copy in metadata["copies"]:
        if copy["copyrole"] == role:
            return copy

get_image_copy("nla.obj-232162256")
{'copyrole': 'access',
 'blobId': 7682805,
 'filename': '23216230.jp2',
 'filesize': 1560253,
 'technicalmetadata': {'width': 4519, 'height': 5508}}

You might want to check whether a high-resolution TIFF version is available for download. To do this you would look for a version with a copyrole value set to m. You can then check the access value to see whether it is set to true (can be downloaded) or false (can’t be downloaded).

tiff_copy = get_image_copy("nla.obj-232162256", role="m")

# get_image_copy() returns None if there's no copy with the requested role
if tiff_copy and tiff_copy.get("access") == "true":
    download_url = "https://nla.gov.au/nla.obj-232162256/m"
    print(f"Download: {download_url}")
else:
    print("Cannot be downloaded")
Download: https://nla.gov.au/nla.obj-232162256/m
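
If you already have a list of copies, for example from a page's copies field, you could wrap the same check in a small function. This is a sketch that reuses the download url pattern shown above. The function name `tiff_download_url` is hypothetical, and the sample data is adapted from the page copies in the example output.

```python
def tiff_download_url(image_id, copies):
    """
    Return a download url for the 'm' copy if access is allowed, otherwise None.
    """
    for copy in copies:
        # The access flag is a string, not a boolean
        if copy.get("copyrole") == "m" and copy.get("access") == "true":
            return f"https://nla.gov.au/{image_id}/m"
    return None


# Copies adapted from the example output, where the 'm' copy can't be downloaded
restricted = [
    {"copyrole": "access", "filename": "36206298.jp2", "filesize": 305747},
    {"copyrole": "m", "access": "false", "filesize": 13713280},
]
# The same data with the access flag changed for illustration
downloadable = [
    {"copyrole": "access", "filename": "36206298.jp2", "filesize": 305747},
    {"copyrole": "m", "access": "true", "filesize": 13713280},
]

print(tiff_download_url("nla.obj-232162256", restricted))
print(tiff_download_url("nla.obj-232162256", downloadable))
```

Returning `None` when there is no downloadable copy lets you filter a list of images without extra error handling.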