25.2. HOW TO: Extract additional metadata from the digitised resource viewer
The viewers you use to examine digitised resources in Trove embed some metadata that isn’t available through the Trove API. This includes a JSON-ified version of the item’s MARC record (presumably copied from the NLA catalogue), as well as structural information used by the viewer itself, such as a list of pages in a digitised book.
This metadata can be useful in a number of different contexts. For example, you can extract the number of pages in a digitised book, then use this number to automatically download the full text or a PDF. The GLAM Workbench includes an example where geospatial coordinates are extracted from the MARC data to add to a harvest of digitised maps.
25.2.1. What metadata is available?
The available metadata varies by viewer and format. The main differences are:
- the image viewer includes information about digitised images in the copies field
- the books and journals viewer includes information in the children field about individual pages and sub-sections such as chapters and articles
All viewers
All of the viewers embed some basic metadata, like id and title, at the top level of the JSON data. However, the actual fields can vary by format and viewer type, so don’t assume that a particular field exists, or has a value. Here’s an example from an issue of Walkabout.
"id": "71404117",
"collection": "nla.aus",
"type": "work",
"form": "Journal",
"displayTitlePage": "false",
"subType": "book",
"issueDate": "Sun, 02 Dec 1934",
"subUnitNo": "Vol. 1 No. 2 (2 December 1934)",
"bibLevel": "Item",
"bibId": "2592481",
"holdingNumber": "Nq 919.4 WAL",
"pid": "nla.obj-714041173",
"title": "Walkabout.",
"accessConditions": "Unrestricted",
"copyrightPolicy": "Out of Copyright",
"recordSource": "NLACat",
"sensitiveMaterial": "No",
"commentsExternal": "Some pages in this issue have been restricted. This may affect left/right page sequencing. Some loss of text in gutter due to page edges stitched into gutter at binding process",
"digitalStatus": "Captured",
"startDate": "01 January 1934",
"creator": "",
"extent": "v. : ill., maps. ; 34 cm.",
"isMissingPage": "false",
"publisherName": "Australian National Travel Association",
There’s also a topLevelCollection field that contains the nla.obj identifier of the parent record in this collection. If it’s a single item (i.e. a collection of one) then topLevelCollection will probably be the same as the item identifier in pid.
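Because fields can be missing or empty, it can help to wrap lookups in a small helper. The following is a minimal sketch only: get_field and is_single_item are hypothetical names, not part of any Trove tooling, and the sample values are copied from the Walkabout example above.

```python
def get_field(metadata, field, default=None):
    """Return a top-level field, treating missing or empty values as absent."""
    value = metadata.get(field)
    if value in (None, "", [], {}):
        return default
    return value


def is_single_item(metadata):
    """True when topLevelCollection points back at the item itself."""
    pid = get_field(metadata, "pid")
    return pid is not None and get_field(metadata, "topLevelCollection") == pid


# Sample values copied from the Walkabout example above
sample = {
    "pid": "nla.obj-714041173",
    "title": "Walkabout.",
    "creator": "",
}

print(get_field(sample, "title"))
print(get_field(sample, "creator", "unknown"))
print(is_single_item(sample))
```

Treating empty strings as absent is a design choice: the Walkabout record has a creator field, but its value is empty.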
All of the viewers also embed a JSON-ified MARC record in the marcData field.
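The JSON-ified MARC data is a little irregular. In the full example output later on this page, tag and code values are sometimes strings and sometimes numbers, and subfield is a dictionary when there is only one subfield and a list when there are several. Here is a rough sketch of one way to cope with that. The function name get_marc_subfields is hypothetical, and the normalisation rules are assumptions based only on that one record.

```python
def as_list(value):
    """Wrap a single dictionary in a list so it can be looped over."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def get_marc_subfields(metadata, tag, code):
    """Collect the contents of matching subfields from every embedded MARC record."""
    # Tags below 100 appear as zero-padded strings, others as integers,
    # so compare everything as three-character strings
    wanted_tag = str(tag).zfill(3)
    values = []
    marc = metadata.get("marcData") or {}
    for record in as_list(marc.get("record")):
        for datafield in as_list(record.get("datafield")):
            if str(datafield.get("tag")).zfill(3) != wanted_tag:
                continue
            for subfield in as_list(datafield.get("subfield")):
                if str(subfield.get("code")) == str(code):
                    values.append(subfield.get("content"))
    return values


# A cut-down sample using values from the Lord Robert Cecil record
sample = {
    "marcData": {
        "record": [
            {
                "datafield": [
                    {"tag": "035", "subfield": {"code": 9, "content": "(AuCNLDY)577939"}},
                    {
                        "tag": 245,
                        "subfield": [
                            {"code": "a", "content": "Lord Robert Cecil's gold fields diary /"},
                            {"code": "c", "content": "with introduction and notes by Sir Ernest Scott."},
                        ],
                    },
                    {"tag": 250, "subfield": {"code": "a", "content": "2nd ed."}},
                ]
            }
        ]
    }
}

print(get_marc_subfields(sample, 245, "a"))
```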
Image and map viewer
The image and map viewer includes a copies field at the top level of the JSON data. This field includes a list of the images associated with this item. Here’s an example from nla.obj-133327370:
"copies": [
{
"copyrole": "access",
"blobId": 146939732,
"filename": "314560922.jp2",
"filesize": 6663187,
"technicalmetadata": {
"width": 8566,
"height": 12449
}
},
{
"copyrole": "m",
"access": "false",
"filesize": 745416848
}
],
The ‘copies’ of the image are different formats or resolutions created for specific purposes, such as access or preservation. Apparently copyrole values can be one of access, m, o, i, or fd, but I’ve only come across access and m. The m copies seem to refer to high-resolution TIFFs, and if access is set to true then these TIFF versions are made available for download. You can find downloadable TIFFs amongst the digitised maps. For example, this map has access set to true for the m copy:
"copies": [
{
"copyrole": "access",
"blobId": 7682805,
"filename": "23216230.jp2",
"filesize": 1560253,
"technicalmetadata": {
"width": 4519,
"height": 5508
}
},
{
"copyrole": "m",
"access": "true",
"filesize": 74685872
}
],
The map viewer reads this value and adds a TIFF option under the download tab. If access is true you can also download the high-resolution TIFF directly by adding /m to the item identifier (though take note of the file size as the downloads can be huge!).
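The check described above could be sketched like this. It is not a definitive implementation: get_tiff_download is a hypothetical name, the url pattern assumes the identifier is appended to https://nla.gov.au/ followed by /m, and the pid nla.obj-000000000 is a placeholder rather than a real item. The copies values are taken from the two examples above.

```python
def get_tiff_download(metadata):
    """
    Return (url, filesize) for the high-resolution copy if it is flagged
    as downloadable, otherwise None.
    """
    pid = metadata.get("pid")
    if not pid:
        return None
    for copy in metadata.get("copies") or []:
        # The access flag is the string "true", not a boolean
        if copy.get("copyrole") == "m" and str(copy.get("access")).lower() == "true":
            # Assumed url pattern: base url + identifier + /m
            return f"https://nla.gov.au/{pid}/m", copy.get("filesize")
    return None


# copies values from the map example above; the pid is a placeholder
downloadable = {
    "pid": "nla.obj-000000000",
    "copies": [
        {"copyrole": "access", "blobId": 7682805, "filename": "23216230.jp2"},
        {"copyrole": "m", "access": "true", "filesize": 74685872},
    ],
}

# copies values from nla.obj-133327370, where the TIFF is not available
restricted = {
    "pid": "nla.obj-133327370",
    "copies": [
        {"copyrole": "access", "blobId": 146939732, "filename": "314560922.jp2"},
        {"copyrole": "m", "access": "false", "filesize": 745416848},
    ],
}

for item in (downloadable, restricted):
    result = get_tiff_download(item)
    if result:
        url, filesize = result
        # Report the size before downloading, as the files can be very large
        print(f"{url} ({filesize / 1_000_000:.1f} MB)")
    else:
        print(f"No downloadable TIFF for {item['pid']}")
```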
Books and journals viewer
The books and journals viewer has a children field in the top-level JSON data which includes page, article, and chapter fields.
Pages
The page field contains details of every page image. Here’s the metadata for a single page in the book The story of the Australian bushrangers:
{
"id": "48661387",
"subType": "page",
"title": "The story of the Australian bushrangers",
"bibId": "1068148",
"pid": "nla.obj-486613874",
"form": "Book",
"accessConditions": "Unrestricted",
"copyrightPolicy": "Out of Copyright",
"bibLevel": "Part",
"digitalStatus": "Captured",
"holdingNumber": "NL 343.94 BOX",
"copies": [
{
"copyrole": "access",
"blobId": 15236579,
"filename": "48661395.jp2",
"filesize": 506342,
"technicalmetadata": {
"width": 2335,
"height": 3495
}
},
{
"copyrole": "m",
"access": "false",
"filesize": 24482931
}
]
}
While some of these fields duplicate what’s available at the top level of the metadata, the pid here is the identifier of this particular page. This identifier can be used to download the page image and OCR data.
Each page has a copies field describing the available image versions. The image dimensions of the access copy included in the technicalmetadata field can be useful if you want to use the OCR data to crop sections out of the page image.
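As an illustration, here is one way you might count the pages in a work and look up the dimensions of each access image. The helper names get_pages and get_access_size are hypothetical, and the sample is cut down from the page entry above.

```python
def get_pages(metadata):
    """Return the list of page entries, or an empty list if there are none."""
    children = metadata.get("children") or {}
    pages = children.get("page") or []
    # Assumption: a lone page might be a dictionary rather than a list
    return pages if isinstance(pages, list) else [pages]


def get_access_size(page):
    """Return (width, height) of the access copy, or None if it isn't described."""
    for copy in page.get("copies") or []:
        if copy.get("copyrole") == "access":
            tech = copy.get("technicalmetadata") or {}
            if "width" in tech and "height" in tech:
                return tech["width"], tech["height"]
    return None


# Cut-down version of the page entry shown above
sample = {
    "children": {
        "page": [
            {
                "pid": "nla.obj-486613874",
                "copies": [
                    {
                        "copyrole": "access",
                        "filename": "48661395.jp2",
                        "technicalmetadata": {"width": 2335, "height": 3495},
                    },
                    {"copyrole": "m", "access": "false", "filesize": 24482931},
                ],
            }
        ]
    }
}

pages = get_pages(sample)
print(f"{len(pages)} page(s)")
for page in pages:
    print(page["pid"], get_access_size(page))
```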
Articles
Periodical issues can include a list of articles in the article field. Here’s an example of an article entry from Walkabout:
{
"id": "75337488",
"subType": "article",
"pid": "nla.obj-753374885",
"title": "A Visit to Lake Frome",
"creator": "By ARTHUR W. UPFIELD",
"bibLevel": "Section",
"existson": [
{
"id": "71404264",
"page": "nla.obj-714042646"
},
{
"id": "71404251",
"page": "nla.obj-714042515"
},
{
"id": "71404232",
"page": "nla.obj-714042324"
},
{
"id": "71404219",
"page": "nla.obj-714042196"
}
]
}
Articles have their own values for pid, title, and creator (if the article has a byline). The existson field lists the pages on which this article appears. This article starts on page nla.obj-714042646.
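The article list can be flattened into a simple summary, for example to build a table of contents. This is a sketch under one assumption: that the first entry in existson is the page the article starts on, which matches the example above but may not hold everywhere. The function names are hypothetical.

```python
def get_page_ids(entry):
    """List the page identifiers in an entry's existson field, in the order given."""
    return [p["page"] for p in entry.get("existson") or [] if p.get("page")]


def summarise_articles(metadata):
    """Return one small dictionary per article in a periodical issue."""
    children = metadata.get("children") or {}
    articles = children.get("article") or []
    if isinstance(articles, dict):
        articles = [articles]
    summaries = []
    for article in articles:
        pages = get_page_ids(article)
        summaries.append(
            {
                "pid": article.get("pid"),
                "title": article.get("title"),
                # creator is only present if the article has a byline
                "creator": article.get("creator"),
                # Assumption: the first listed page is the start page
                "start_page": pages[0] if pages else None,
                "page_count": len(pages),
            }
        )
    return summaries


# The article entry from Walkabout shown above
issue = {
    "children": {
        "article": [
            {
                "id": "75337488",
                "subType": "article",
                "pid": "nla.obj-753374885",
                "title": "A Visit to Lake Frome",
                "creator": "By ARTHUR W. UPFIELD",
                "existson": [
                    {"id": "71404264", "page": "nla.obj-714042646"},
                    {"id": "71404251", "page": "nla.obj-714042515"},
                    {"id": "71404232", "page": "nla.obj-714042324"},
                    {"id": "71404219", "page": "nla.obj-714042196"},
                ],
            }
        ]
    }
}

for summary in summarise_articles(issue):
    print(summary)
```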
Chapters
Books can include a list of chapters in the chapter field. Here’s an example of a chapter entry from The story of the Australian bushrangers:
{
"id": "49622020",
"subType": "chapter",
"subUnitNo": "2",
"title": "PREFACE.",
"pid": "nla.obj-496220207",
"bibLevel": "Section",
"existson": [
{
"id": "48661510",
"page": "nla.obj-486615102"
},
{
"id": "48661523",
"page": "nla.obj-486615233"
}
]
}
Chapters have their own values for pid and title, while the subUnitNo specifies the order of the chapters. The existson field lists the pages on which this chapter appears.
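Note that subUnitNo is a string, so sorting on it directly would put "10" before "2". Here is a small sketch of sorting chapters numerically. The function names are hypothetical, the handling of missing or non-numeric values is an assumption, and only the PREFACE entry comes from the example above; the other two entries are invented to show the ordering.

```python
def chapter_sort_key(chapter):
    """Sort numerically on subUnitNo, pushing missing or odd values to the end."""
    try:
        return (0, int(chapter.get("subUnitNo")))
    except (TypeError, ValueError):
        return (1, 0)


def get_chapters(metadata):
    """Return the chapter entries in subUnitNo order."""
    children = metadata.get("children") or {}
    chapters = children.get("chapter") or []
    if isinstance(chapters, dict):
        chapters = [chapters]
    return sorted(chapters, key=chapter_sort_key)


book = {
    "children": {
        "chapter": [
            # Hypothetical entry to show numeric rather than alphabetical sorting
            {"subUnitNo": "10", "title": "Hypothetical later chapter"},
            # Hypothetical entry with no subUnitNo at all
            {"title": "Hypothetical unnumbered section"},
            # The chapter entry shown above
            {
                "id": "49622020",
                "subType": "chapter",
                "subUnitNo": "2",
                "title": "PREFACE.",
                "pid": "nla.obj-496220207",
            },
        ]
    }
}

for chapter in get_chapters(book):
    print(chapter.get("subUnitNo"), chapter["title"])
```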
25.2.2. Extracting the metadata
The function to extract the metadata is fairly straightforward. It loads the viewer’s HTML code and uses a regular expression to find and extract the embedded JSON string. It expects an nla.obj identifier, either on its own or as a full url. For the image and map viewers, this is the identifier of an individual item. For the book and journal viewer you can use the nla.obj identifier for the book, issue, page, or article. This is because page and article identifiers are redirected to issues. Here’s a full example that extracts the embedded metadata for the book Lord Robert Cecil’s gold fields diary.
import json
import re

import requests
from IPython.display import JSON, display


def get_metadata(id):
    """
    Extract work data in a JSON string from the work's HTML page.
    """
    # Accept either a bare nla.obj identifier or a full url
    if not id.startswith("http"):
        id = "https://nla.gov.au/" + id
    response = requests.get(id)
    try:
        # The viewer embeds the metadata in a JavaScript variable named work
        work_data = re.search(
            r"var work = JSON\.parse\(JSON\.stringify\((\{.*\})", response.text
        ).group(1)
    except AttributeError:
        # No embedded JSON was found, so fall back to an empty dictionary
        work_data = "{}"
    return json.loads(work_data)


book_id = "https://nla.gov.au/nla.obj-362059651/"
metadata = get_metadata(book_id)
display(metadata)
{'id': '36205965',
'collection': 'nla.aus',
'type': 'work',
'form': 'Book',
'subType': 'book',
'bibLevel': 'Item',
'bibId': '653766',
'holdingNumber': 'JAFp BIO 92',
'pid': 'nla.obj-362059651',
'title': "Lord Robert Cecil's gold fields diary",
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'recordSource': 'NLACat',
'digitalStatus': 'Captured',
'startDate': '01 January 1945',
'creator': 'Salisbury, Robert Cecil, marquess of, 1830-1903.',
'extent': '32 p., [20] p. of plates : ill. ; 18 cm.',
'publisherName': 'Melbourne University Press',
'allowSearchEngineIndexing': 'false',
'findingAidAvailable': 'No',
'isOriginalCopyAvaliable': 'false',
'ocrMetsCopyAvaliable': 'true',
'partnerNucs': [],
'parentProjectIds': [],
'projectIds': [],
'marcData': {'record': [{'leader': {'type': 'Bibliographic',
'content': '01297cam a2200289 a 4500'},
'datafield': [{'ind2': ' ',
'ind1': 1,
'subfield': [{'code': 'a', 'content': 2617933},
{'code': 'z', 'content': 9324621}],
'tag': '019'},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 9, 'content': '(AuCNLDY)577939'},
'tag': '035'},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 653766},
'tag': '035'},
{'ind2': ' ',
'ind1': ' ',
'subfield': [{'code': 'a', 'content': 'NNCU:A'},
{'code': 'b', 'content': 'eng'},
{'code': 'c', 'content': 'NNCU:A'},
{'code': 'd', 'content': 'AUC:LSM'}],
'tag': '040'},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'anuc'},
'tag': '042'},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'u-at-vi'},
'tag': '043'},
{'ind2': 4,
'ind1': 0,
'subfield': [{'code': 'a', 'content': '994.5/03'},
{'code': 2, 'content': 19}],
'tag': '082'},
{'ind2': ' ',
'ind1': 1,
'subfield': [{'code': 'a', 'content': 'Salisbury, Robert Cecil,'},
{'code': 'c', 'content': 'marquess of,'},
{'code': 'd', 'content': '1830-1903.'},
{'code': 0, 'content': 338373},
{'code': 9, 'content': '9112d83c-f87f-5a34-a022-bea98d9ee823'}],
'tag': 100},
{'ind2': 0,
'ind1': 1,
'subfield': [{'code': 'a',
'content': "Lord Robert Cecil's gold fields diary /"},
{'code': 'c',
'content': 'with introduction and notes by Sir Ernest Scott.'}],
'tag': 245},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': '2nd ed.'},
'tag': 250},
{'ind2': ' ',
'ind1': ' ',
'subfield': [{'code': 'a', 'content': 'Carlton, Vic. :'},
{'code': 'b', 'content': 'Melbourne University Press,'},
{'code': 'c', 'content': 1945}],
'tag': 260},
{'ind2': ' ',
'ind1': ' ',
'subfield': [{'code': 'a', 'content': '32 p., [20] p. of plates :'},
{'code': 'b', 'content': 'ill. ;'},
{'code': 'c', 'content': '18 cm.'}],
'tag': 300},
{'ind2': ' ',
'ind1': ' ',
'subfield': [{'code': 'a', 'content': 'Also available online'},
{'code': 'u', 'content': 'http://nla.gov.au/nla.obj-362059651'}],
'tag': 530},
{'ind2': 0,
'ind1': 1,
'subfield': [{'code': 'a', 'content': 'Salisbury, Robert Cecil,'},
{'code': 'c', 'content': 'marquess of,'},
{'code': 'd', 'content': '1830-1903.'},
{'code': 0, 'content': 338373},
{'code': 9, 'content': '9112d83c-f87f-5a34-a022-bea98d9ee823'}],
'tag': 600},
{'ind2': 0,
'ind1': ' ',
'subfield': [{'code': 'a', 'content': 'Gold mines and mining'},
{'code': 'z', 'content': 'Victoria.'}],
'tag': 650},
{'ind2': 0,
'ind1': ' ',
'subfield': [{'code': 'a', 'content': 'Victoria'},
{'code': 'x', 'content': 'Description and travel.'}],
'tag': 651},
{'ind2': ' ',
'ind1': 1,
'subfield': [{'code': 'a', 'content': 'Scott, Ernest,'},
{'code': 'd', 'content': '1868-1939.'},
{'code': 0, 'content': 118638},
{'code': 9, 'content': '6a97b19f-f2eb-57c3-a661-e5b83e83581c'}],
'tag': 700},
{'ind2': 1,
'ind1': 4,
'subfield': [{'code': 'z',
'content': 'National Library of Australia digitised item. JAFp BIO 92 copy'},
{'code': 'u', 'content': 'http://nla.gov.au/nla.obj-362059651'},
{'code': 'x', 'content': 'fulltext'}],
'tag': 856},
{'ind2': 'f',
'ind1': 'f',
'subfield': [{'code': 'i',
'content': '2c2be5dd-982b-5020-ac7c-f8820fc3ae34'},
{'code': 's', 'content': 'b08c2134-2fff-5812-bdd3-a55d68f73fa1'}],
'tag': 999}],
'controlfield': [{'tag': '001', 'content': 653766},
{'tag': '005', 'content': 20240325010329.3},
{'tag': '008', 'content': '830518s1945 vraa 000 0aeng d'}]},
{'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
'datafield': [{'ind2': ' ',
'ind1': 8,
'subfield': [{'code': 'b', 'content': 'PET'},
{'code': 'h', 'content': 'JAFp BIO 92'}],
'tag': 852},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'NLA'},
'tag': 954}],
'controlfield': [{'tag': '001', 'content': 2912183},
{'tag': '004', 'content': 653766}]},
{'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
'datafield': [{'ind2': ' ',
'ind1': 8,
'subfield': [{'code': 'b', 'content': 'PET'},
{'code': 'h', 'content': 'JAFp GEN SAL'}],
'tag': 852},
{'ind2': 0,
'ind1': ' ',
'subfield': {'code': 'z', 'content': 'FC copy'},
'tag': 866},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'NLA'},
'tag': 954}],
'controlfield': [{'tag': '001', 'content': 2912184},
{'tag': '004', 'content': 653766}]},
{'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
'datafield': [{'ind2': ' ',
'ind1': 8,
'subfield': [{'code': 'b', 'content': 'AUSP'},
{'code': 'h', 'content': 'Np 994.5 SAL'}],
'tag': 852},
{'ind2': 0,
'ind1': ' ',
'subfield': {'code': 'z', 'content': 'N pbk'},
'tag': 866},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'NLA'},
'tag': 954}],
'controlfield': [{'tag': '001', 'content': 2912182},
{'tag': '004', 'content': 653766}]},
{'leader': {'type': 'Holdings', 'content': '00000nam a2200000 a 4500'},
'datafield': [{'ind2': ' ',
'ind1': 8,
'subfield': [{'code': 'b', 'content': 'AUSLP'},
{'code': 'h', 'content': 'NLp 994.5 SAL'}],
'tag': 852},
{'ind2': 0,
'ind1': ' ',
'subfield': {'code': 'z', 'content': 'NL pbk'},
'tag': 866},
{'ind2': ' ',
'ind1': ' ',
'subfield': {'code': 'a', 'content': 'NLA'},
'tag': 954}],
'controlfield': [{'tag': '001', 'content': 4042315},
{'tag': '004', 'content': 653766}]}]},
'children': {'page': [{'id': '36205990',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362059904',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551350,
'filename': '36205998.jp2',
'filesize': 357765,
'technicalmetadata': {'width': 1926, 'height': 2840}},
{'copyrole': 'm', 'access': 'false', 'filesize': 16436560}]},
{'id': '36206003',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060036',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551354,
'filename': '36206011.jp2',
'filesize': 325233,
'technicalmetadata': {'width': 1743, 'height': 2808}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14709600}]},
{'id': '36206017',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060175',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551358,
'filename': '36206025.jp2',
'filesize': 332757,
'technicalmetadata': {'width': 1766, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14976176}]},
{'id': '36206030',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060307',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551362,
'filename': '36206038.jp2',
'filesize': 290972,
'technicalmetadata': {'width': 1642, 'height': 2649}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13074028}]},
{'id': '36206043',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060433',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551366,
'filename': '36206051.jp2',
'filesize': 316353,
'technicalmetadata': {'width': 1694, 'height': 2798}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14244196}]},
{'id': '36206056',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060563',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551370,
'filename': '36206064.jp2',
'filesize': 308812,
'technicalmetadata': {'width': 1678, 'height': 2739}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13813536}]},
{'id': '36206069',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060694',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551374,
'filename': '36206077.jp2',
'filesize': 315081,
'technicalmetadata': {'width': 1688, 'height': 2792}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14165840}]},
{'id': '36206082',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060828',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551378,
'filename': '36206090.jp2',
'filesize': 314714,
'technicalmetadata': {'width': 1690, 'height': 2782}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14129776}]},
{'id': '36206095',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362060959',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551382,
'filename': '36206103.jp2',
'filesize': 315411,
'technicalmetadata': {'width': 1700, 'height': 2792}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14266964}]},
{'id': '36206108',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061083',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551386,
'filename': '36206116.jp2',
'filesize': 310531,
'technicalmetadata': {'width': 1674, 'height': 2783}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14004236}]},
{'id': '36206121',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061212',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551390,
'filename': '36206129.jp2',
'filesize': 316813,
'technicalmetadata': {'width': 1706, 'height': 2804}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14378908}]},
{'id': '36206134',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061349',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551394,
'filename': '36206142.jp2',
'filesize': 309855,
'technicalmetadata': {'width': 1670, 'height': 2796}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14036000}]},
{'id': '36206147',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061476',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551398,
'filename': '36206155.jp2',
'filesize': 321690,
'technicalmetadata': {'width': 1706, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14469512}]},
{'id': '36206160',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061603',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551402,
'filename': '36206168.jp2',
'filesize': 302721,
'technicalmetadata': {'width': 1645, 'height': 2750}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13599008}]},
{'id': '36206173',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061731',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551406,
'filename': '36206181.jp2',
'filesize': 302730,
'technicalmetadata': {'width': 1661, 'height': 2743}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13696764}]},
{'id': '36206186',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061862',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551410,
'filename': '36206194.jp2',
'filesize': 309234,
'technicalmetadata': {'width': 1669, 'height': 2783}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13962668}]},
{'id': '36206199',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362061995',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551414,
'filename': '36206207.jp2',
'filesize': 318131,
'technicalmetadata': {'width': 1694, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14369336}]},
{'id': '36206212',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062124',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551418,
'filename': '36206220.jp2',
'filesize': 307502,
'technicalmetadata': {'width': 1665, 'height': 2787}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13949196}]},
{'id': '36206225',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062250',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551422,
'filename': '36206233.jp2',
'filesize': 310439,
'technicalmetadata': {'width': 1664, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14114880}]},
{'id': '36206238',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062384',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551426,
'filename': '36206246.jp2',
'filesize': 301196,
'technicalmetadata': {'width': 1648, 'height': 2763}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13687828}]},
{'id': '36206251',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062514',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551430,
'filename': '36206259.jp2',
'filesize': 309281,
'technicalmetadata': {'width': 1666, 'height': 2775}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13897256}]},
{'id': '36206264',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062649',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551434,
'filename': '36206272.jp2',
'filesize': 305751,
'technicalmetadata': {'width': 1630, 'height': 2806}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13748248}]},
{'id': '36206277',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062779',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551438,
'filename': '36206285.jp2',
'filesize': 308294,
'technicalmetadata': {'width': 1640, 'height': 2796}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13783776}]},
{'id': '36206290',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362062905',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551442,
'filename': '36206298.jp2',
'filesize': 305747,
'technicalmetadata': {'width': 1612, 'height': 2830}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13713280}]},
{'id': '36206303',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063031',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551446,
'filename': '36206311.jp2',
'filesize': 306590,
'technicalmetadata': {'width': 1640, 'height': 2797}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13788724}]},
{'id': '36206316',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063165',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551450,
'filename': '36206324.jp2',
'filesize': 299724,
'technicalmetadata': {'width': 1625, 'height': 2789}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13623720}]},
{'id': '36206329',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063296',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551454,
'filename': '36206337.jp2',
'filesize': 295007,
'technicalmetadata': {'width': 1595, 'height': 2737}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13124464}]},
{'id': '36206342',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063426',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551458,
'filename': '36206350.jp2',
'filesize': 297779,
'technicalmetadata': {'width': 1588, 'height': 2812}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13423928}]},
{'id': '36206355',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063555',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551462,
'filename': '36206363.jp2',
'filesize': 296505,
'technicalmetadata': {'width': 1609, 'height': 2771}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13402596}]},
{'id': '36206368',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063681',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551466,
'filename': '36206376.jp2',
'filesize': 295690,
'technicalmetadata': {'width': 1601, 'height': 2749}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13230876}]},
{'id': '36206381',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063813',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551470,
'filename': '36206389.jp2',
'filesize': 295336,
'technicalmetadata': {'width': 1602, 'height': 2732}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13157308}]},
{'id': '36206394',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362063942',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551474,
'filename': '36206402.jp2',
'filesize': 297440,
'technicalmetadata': {'width': 1594, 'height': 2794}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13388024}]},
{'id': '36206407',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064075',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551478,
'filename': '36206415.jp2',
'filesize': 289403,
'technicalmetadata': {'width': 1569, 'height': 2764}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13037244}]},
{'id': '36206420',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064205',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551482,
'filename': '36206428.jp2',
'filesize': 294963,
'technicalmetadata': {'width': 1593, 'height': 2786}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13341424}]},
{'id': '36206433',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064331',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551486,
'filename': '36206441.jp2',
'filesize': 300005,
'technicalmetadata': {'width': 1599, 'height': 2784}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13381884}]},
{'id': '36206446',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064462',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551490,
'filename': '36206454.jp2',
'filesize': 295397,
'technicalmetadata': {'width': 1599, 'height': 2778}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13353752}]},
{'id': '36206459',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064599',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551494,
'filename': '36206467.jp2',
'filesize': 293873,
'technicalmetadata': {'width': 1584, 'height': 2754}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13114076}]},
{'id': '36206472',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064729',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551498,
'filename': '36206480.jp2',
'filesize': 289866,
'technicalmetadata': {'width': 1554, 'height': 2783}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13001488}]},
{'id': '36206485',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064858',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551502,
'filename': '36206493.jp2',
'filesize': 300002,
'technicalmetadata': {'width': 1639, 'height': 2761}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13603936}]},
{'id': '36206498',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362064987',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551506,
'filename': '36206506.jp2',
'filesize': 317983,
'technicalmetadata': {'width': 1738, 'height': 2754}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14387576}]},
{'id': '36206511',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065113',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551510,
'filename': '36206519.jp2',
'filesize': 322407,
'technicalmetadata': {'width': 1728, 'height': 2786}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14470684}]},
{'id': '36206524',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065244',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551514,
'filename': '36206532.jp2',
'filesize': 333367,
'technicalmetadata': {'width': 1762, 'height': 2830}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14987704}]},
{'id': '36206537',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065373',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551518,
'filename': '36206545.jp2',
'filesize': 330449,
'technicalmetadata': {'width': 1754, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14877328}]},
{'id': '36206550',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065503',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551522,
'filename': '36206558.jp2',
'filesize': 324108,
'technicalmetadata': {'width': 1752, 'height': 2786}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14669868}]},
{'id': '36206563',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065633',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551526,
'filename': '36206571.jp2',
'filesize': 315480,
'technicalmetadata': {'width': 1716, 'height': 2762}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14246788}]},
{'id': '36206576',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065767',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551530,
'filename': '36206584.jp2',
'filesize': 324317,
'technicalmetadata': {'width': 1750, 'height': 2788}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14665132}]},
{'id': '36206589',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362065894',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551534,
'filename': '36206597.jp2',
'filesize': 326781,
'technicalmetadata': {'width': 1742, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14775888}]},
{'id': '36206602',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066023',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551538,
'filename': '36206610.jp2',
'filesize': 333360,
'technicalmetadata': {'width': 1780, 'height': 2830}},
{'copyrole': 'm', 'access': 'false', 'filesize': 15140296}]},
{'id': '36206615',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066153',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551542,
'filename': '36206623.jp2',
'filesize': 314707,
'technicalmetadata': {'width': 1706, 'height': 2755}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14128324}]},
{'id': '36206628',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066287',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551546,
'filename': '36206636.jp2',
'filesize': 324455,
'technicalmetadata': {'width': 1758, 'height': 2790}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14742720}]},
{'id': '36206641',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066417',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551550,
'filename': '36206649.jp2',
'filesize': 321925,
'technicalmetadata': {'width': 1732, 'height': 2808}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14618624}]},
{'id': '36206654',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066547',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551554,
'filename': '36206662.jp2',
'filesize': 329478,
'technicalmetadata': {'width': 1779, 'height': 2795}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14945160}]},
{'id': '36206667',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066673',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551558,
'filename': '36206675.jp2',
'filesize': 308578,
'technicalmetadata': {'width': 1682, 'height': 2747}},
{'copyrole': 'm', 'access': 'false', 'filesize': 13889596}]},
{'id': '36206680',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066807',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551562,
'filename': '36206688.jp2',
'filesize': 323456,
'technicalmetadata': {'width': 1750, 'height': 2763}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14533588}]},
{'id': '36206693',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362066935',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551566,
'filename': '36206701.jp2',
'filesize': 324834,
'technicalmetadata': {'width': 1718, 'height': 2822}},
{'copyrole': 'm', 'access': 'false', 'filesize': 14569580}]},
{'id': '36205977',
'subType': 'page',
'title': "Lord Robert Cecil's gold fields diary",
'bibId': '653766',
'pid': 'nla.obj-362059771',
'form': 'Book',
'accessConditions': 'Unrestricted',
'copyrightPolicy': 'Out of Copyright',
'bibLevel': 'Part',
'digitalStatus': 'Captured',
'holdingNumber': 'JAFp BIO 92',
'copies': [{'copyrole': 'access',
'blobId': 11551346,
'filename': '36205985.jp2',
'filesize': 361053,
'technicalmetadata': {'width': 1938, 'height': 2801}},
{'copyrole': 'm', 'access': 'false', 'filesize': 16310064}]}],
'article': [],
'chapter': [{'id': '36207811',
'subType': 'chapter',
'subUnitNo': '1',
'pid': 'nla.obj-362078119',
'bibLevel': 'Section',
'existson': [{'id': '36205990', 'page': 'nla.obj-362059904'},
{'id': '36206003', 'page': 'nla.obj-362060036'},
{'id': '36206017', 'page': 'nla.obj-362060175'},
{'id': '36206030', 'page': 'nla.obj-362060307'},
{'id': '36206043', 'page': 'nla.obj-362060433'},
{'id': '36206056', 'page': 'nla.obj-362060563'},
{'id': '36206069', 'page': 'nla.obj-362060694'}]},
{'id': '36207820',
'subType': 'chapter',
'subUnitNo': '2',
'pid': 'nla.obj-362078203',
'bibLevel': 'Section',
'existson': [{'id': '36206095', 'page': 'nla.obj-362060959'},
{'id': '36206082', 'page': 'nla.obj-362060828'},
{'id': '36206108', 'page': 'nla.obj-362061083'},
{'id': '36206121', 'page': 'nla.obj-362061212'},
{'id': '36206134', 'page': 'nla.obj-362061349'},
{'id': '36206147', 'page': 'nla.obj-362061476'},
{'id': '36206160', 'page': 'nla.obj-362061603'},
{'id': '36206173', 'page': 'nla.obj-362061731'},
{'id': '36206186', 'page': 'nla.obj-362061862'},
{'id': '36206199', 'page': 'nla.obj-362061995'},
{'id': '36206212', 'page': 'nla.obj-362062124'},
{'id': '36206225', 'page': 'nla.obj-362062250'},
{'id': '36206238', 'page': 'nla.obj-362062384'},
{'id': '36206251', 'page': 'nla.obj-362062514'},
{'id': '36206264', 'page': 'nla.obj-362062649'},
{'id': '36206277', 'page': 'nla.obj-362062779'},
{'id': '36206290', 'page': 'nla.obj-362062905'},
{'id': '36206303', 'page': 'nla.obj-362063031'},
{'id': '36206316', 'page': 'nla.obj-362063165'},
{'id': '36206329', 'page': 'nla.obj-362063296'},
{'id': '36206342', 'page': 'nla.obj-362063426'},
{'id': '36206355', 'page': 'nla.obj-362063555'},
{'id': '36206368', 'page': 'nla.obj-362063681'},
{'id': '36206381', 'page': 'nla.obj-362063813'},
{'id': '36206394', 'page': 'nla.obj-362063942'},
{'id': '36206407', 'page': 'nla.obj-362064075'},
{'id': '36206420', 'page': 'nla.obj-362064205'},
{'id': '36206433', 'page': 'nla.obj-362064331'},
{'id': '36206446', 'page': 'nla.obj-362064462'},
{'id': '36206459', 'page': 'nla.obj-362064599'},
{'id': '36206472', 'page': 'nla.obj-362064729'},
{'id': '36206485', 'page': 'nla.obj-362064858'},
{'id': '36206498', 'page': 'nla.obj-362064987'},
{'id': '36206511', 'page': 'nla.obj-362065113'},
{'id': '36206524', 'page': 'nla.obj-362065244'},
{'id': '36206537', 'page': 'nla.obj-362065373'},
{'id': '36206550', 'page': 'nla.obj-362065503'},
{'id': '36206563', 'page': 'nla.obj-362065633'},
{'id': '36206576', 'page': 'nla.obj-362065767'},
{'id': '36206589', 'page': 'nla.obj-362065894'},
{'id': '36206602', 'page': 'nla.obj-362066023'},
{'id': '36206615', 'page': 'nla.obj-362066153'},
{'id': '36206628', 'page': 'nla.obj-362066287'},
{'id': '36206641', 'page': 'nla.obj-362066417'},
{'id': '36206654', 'page': 'nla.obj-362066547'},
{'id': '36206667', 'page': 'nla.obj-362066673'},
{'id': '36206680', 'page': 'nla.obj-362066807'},
{'id': '36206693', 'page': 'nla.obj-362066935'},
{'id': '36205977', 'page': 'nla.obj-362059771'}]}],
'book': [],
'volume': [],
'other': []},
'topLevelCollection': 'nla.obj-362059651'}
25.2.3. Get MARC catalogue data#
The MARC data is contained in the marcData field. This field can contain multiple records; the main metadata is in the record whose leader field has type set to Bibliographic.
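To see which records are present before picking one, you can list the type recorded in each leader. This is only a minimal sketch: list_marc_record_types is a hypothetical helper, the sample dictionary is made up, and the "Holdings" label in particular is a guess at what a non-bibliographic record might be called.

```python
def list_marc_record_types(metadata):
    """Return the leader type of each record in marcData (hypothetical helper)."""
    # Items without MARC data simply produce an empty list
    records = metadata.get("marcData", {}).get("record", [])
    return [record.get("leader", {}).get("type") for record in records]


# Made-up sample with the same general shape as the embedded metadata.
# The "Holdings" value is an assumption, not a confirmed Trove label.
sample = {
    "marcData": {
        "record": [
            {"leader": {"type": "Bibliographic", "content": "00000nam a2200000 a 4500"}},
            {"leader": {"type": "Holdings", "content": "00000nx  a2200000zn 4500"}},
        ]
    }
}
print(list_marc_record_types(sample))
```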
Tools like PyMARC can help you get information from MARC records. However, Trove’s marcData isn’t in a format that PyMARC recognises. The function below finds the Bibliographic record and restructures the data for use with PyMARC.
import json
from pymarc import JSONReader
def parse_marc(metadata):
    """
    Parse the bibliographic MARC data in the embedded metadata.
    This produces a structure that can be loaded into PyMARC's JSON reader.
    """
    # Some nla.obj items don't have MARC data
    # For example some collections
    try:
        records = metadata["marcData"]["record"]
    except KeyError:
        # Return an empty list so the result can still be loaded and iterated
        return []
    # The metadata contains bibliographic and holdings MARC data
    # here we'll select the bib record.
    for record in records:
        if record["leader"].get("type") == "Bibliographic":
            break
    else:
        # No bibliographic record was found
        return []
    fields = []
    # Control fields only have content, no subfields
    for cf in record.get("controlfield", []):
        fields.append({str(cf["tag"]): str(cf["content"])})
    # Loop through all the data fields
    for field in record["datafield"]:
        subfields = []
        # Get any subfields
        sfs = field.get("subfield", [])
        # The subfields value can be a list or dict
        # Check if it's a list
        if isinstance(sfs, list):
            # Loop through the subfields adding the values
            for sf in sfs:
                subfields.append({sf["code"]: str(sf["content"])})
        # If it's not a list just add the details from the dict
        else:
            subfields.append({sfs["code"]: str(sfs["content"])})
        fields.append(
            {
                str(field["tag"]): {
                    "subfields": subfields,
                    "ind1": field["ind1"],
                    "ind2": field["ind2"],
                }
            }
        )
    return [{"leader": record["leader"]["content"], "fields": fields}]
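To make the restructuring concrete, here is a minimal sketch of the same transformation applied to a single data field. The sample field is invented for illustration, and convert_datafield is a hypothetical helper, not part of PyMARC or Trove.

```python
def convert_datafield(field):
    """Convert one Trove-style datafield into a tag-keyed dict (hypothetical helper)."""
    sfs = field.get("subfield", [])
    # A single subfield arrives as a dict rather than a list
    if isinstance(sfs, dict):
        sfs = [sfs]
    subfields = [{sf["code"]: str(sf["content"])} for sf in sfs]
    return {
        str(field["tag"]): {
            "subfields": subfields,
            "ind1": field["ind1"],
            "ind2": field["ind2"],
        }
    }


# Made-up field with a single subfield stored as a dict
sample_field = {
    "tag": 245,
    "ind1": "1",
    "ind2": "0",
    "subfield": {"code": "a", "content": "An example title /"},
}
print(convert_datafield(sample_field))
# {'245': {'subfields': [{'a': 'An example title /'}], 'ind1': '1', 'ind2': '0'}}
```

Each subfield becomes its own single-key dictionary, which keeps repeated subfield codes in their original order.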
First you extract the MARC data and restructure it for use with PyMARC.
marc_json = parse_marc(metadata)
Then you can load the MARC data into PyMARC.
# PyMARC expects a JSON string so we dump it to a string first
reader = JSONReader(json.dumps(marc_json))
To retrieve a value from PyMARC you need to know the MARC tag and subfield for the field you’re interested in. For example, the main title of a work is in MARC tag 245, subfield a.
for record in reader:
    print(record["245"]["a"])
Lord Robert Cecil's gold fields diary /
The subfield c contains a ‘statement of responsibility’.
for record in reader:
    print(record["245"]["c"])
with introduction and notes by Sir Ernest Scott.
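If you only need a value or two, you can also read it straight from the restructured data without loading it into PyMARC. The sketch below assumes a list of records where each record has a fields list of tag-keyed dictionaries, as returned by parse_marc. The helper get_subfield_values is hypothetical and the sample record is made up.

```python
def get_subfield_values(marc_json, tag, code):
    """Collect every value of a subfield from restructured MARC data (hypothetical helper)."""
    values = []
    for record in marc_json:
        for field in record.get("fields", []):
            data = field.get(tag)
            # Other tags give None and control fields hold a plain string, so skip both
            if not isinstance(data, dict):
                continue
            for subfield in data.get("subfields", []):
                if code in subfield:
                    values.append(subfield[code])
    return values


# Made-up record in the restructured format
sample_marc = [
    {
        "leader": "00000nam a2200000 a 4500",
        "fields": [
            {"001": "000000"},
            {
                "245": {
                    "subfields": [
                        {"a": "An example title /"},
                        {"c": "by an example author."},
                    ],
                    "ind1": "1",
                    "ind2": "0",
                }
            },
        ],
    }
]
print(get_subfield_values(sample_marc, "245", "c"))
# ['by an example author.']
```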
PyMARC also includes some handy shortcuts to save you having to remember all the codes.
for record in reader:
    print(record.title)
    print(record.author)
    print(record.publisher)
    print(record.pubyear)
Lord Robert Cecil's gold fields diary /
Salisbury, Robert Cecil, marquess of, 1830-1903. 338373 9112d83c-f87f-5a34-a022-bea98d9ee823
Melbourne University Press,
1945
25.2.4. Get information about pages#
Books and periodical issues should include page data in the children field. To find the number of pages, you just need to get the length of the page list.
# How many pages are there?
len(metadata["children"]["page"])
56
If you want to get the identifiers for each individual page, just loop through the list of pages saving the pid value.
page_ids = [p["pid"] for p in metadata["children"]["page"]]
page_ids[:5]
['nla.obj-362059904',
'nla.obj-362060036',
'nla.obj-362060175',
'nla.obj-362060307',
'nla.obj-362060433']
These page identifiers can be used to download images of the pages.
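The chapter entries in children each carry an existson list of page identifiers, so the page list can also tell you where each chapter sits in the page sequence. This is a minimal sketch, assuming every existson entry has a page value that matches a page pid. The helper chapter_page_positions is hypothetical and the sample data is a made-up, cut-down version of the structure.

```python
def chapter_page_positions(metadata):
    """Map each chapter to the 1-based positions of its pages (hypothetical helper)."""
    pages = metadata["children"]["page"]
    # Look up table from page pid to position in the page list
    positions = {page["pid"]: i for i, page in enumerate(pages, start=1)}
    result = {}
    for chapter in metadata["children"].get("chapter", []):
        found = [
            positions[entry["page"]]
            for entry in chapter.get("existson", [])
            if entry.get("page") in positions
        ]
        # existson isn't always in page order, so sort the positions
        result[chapter.get("subUnitNo", chapter["pid"])] = sorted(found)
    return result


# Made-up identifiers, for illustration only
sample_book = {
    "children": {
        "page": [{"pid": "p1"}, {"pid": "p2"}, {"pid": "p3"}, {"pid": "p4"}],
        "chapter": [
            {"pid": "c1", "subUnitNo": "1", "existson": [{"page": "p1"}, {"page": "p2"}]},
            {"pid": "c2", "subUnitNo": "2", "existson": [{"page": "p4"}, {"page": "p3"}]},
        ],
    }
}
print(chapter_page_positions(sample_book))
# {'1': [1, 2], '2': [3, 4]}
```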
Here’s a function you can use to get the dimensions of the access copy of a page. Note, however, that the downloadable versions of page images seem to be limited to a maximum of 5000 pixels on the longest dimension. It’s important to know the difference between the size of the access copy and the downloaded page if you’re going to make use of the page’s OCR layout data.
def get_page_size(page_id):
    """
    Get the dimensions of a page image from embedded metadata.
    Returns None if the page or its access copy can't be found.
    """
    metadata = get_metadata(page_id)
    for page in metadata["children"]["page"]:
        if page["pid"] == page_id:
            for copy in page["copies"]:
                if copy["copyrole"] == "access":
                    return copy["technicalmetadata"]
    return None
get_page_size("nla.obj-362059904")
{'width': 1926, 'height': 2840}
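If a downloaded image has been scaled down, coordinates measured against the access copy need to be scaled by the same factor. The sketch below works through that arithmetic. It assumes downloads are resized proportionally so that the longest side is at most 5000 pixels, which is only an observation, not documented behaviour. The helper download_scale is hypothetical.

```python
def download_scale(size, max_side=5000):
    """Estimate the access-to-download scale factor (hypothetical helper)."""
    longest = max(size["width"], size["height"])
    # Images already within the limit are assumed to be left at full size
    return min(1.0, max_side / longest)


# The page above is smaller than the limit, so no scaling is expected
print(download_scale({"width": 1926, "height": 2840}))
# 1.0

# A larger image would be shrunk, so layout coordinates need the same factor
large = {"width": 4519, "height": 5508}
scale = download_scale(large)
scaled_width = round(large["width"] * scale)
scaled_height = round(large["height"] * scale)
print(scale, scaled_width, scaled_height)
```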
25.2.5. Get a list of articles in a periodical issue#
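Periodical issues list their articles in the article list of the children field. This example displays them as a Markdown list of links.
from IPython.display import Markdown, display
issue_id = "nla.obj-714041173"
issue_metadata = get_metadata(issue_id)
md = ""
for article in issue_metadata["children"]["article"]:
    md += f"* [{article['title']}](https://nla.gov.au/{article['pid']})\n"
display(Markdown(md))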
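Outside a notebook you may prefer plain data to rendered Markdown. This is a minimal sketch, assuming each article entry has a pid and usually a title, as in the loop above. The helper article_links is hypothetical and the sample entries use made-up identifiers.

```python
def article_links(metadata):
    """Return (title, url) pairs for the articles in an issue (hypothetical helper)."""
    links = []
    for article in metadata.get("children", {}).get("article", []):
        # Don't assume every article has a title
        title = article.get("title", "[untitled]")
        links.append((title, f"https://nla.gov.au/{article['pid']}"))
    return links


# Made-up sample, not real Trove identifiers
sample_issue = {
    "children": {
        "article": [
            {"pid": "nla.obj-0000000001", "title": "First example article"},
            {"pid": "nla.obj-0000000002"},
        ]
    }
}
for title, url in article_links(sample_issue):
    print(f"{title}: {url}")
```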
25.2.6. Get information about images and maps#
The digitised image and map viewers include information about digitised images in the copies field. This function returns the details of the image with the specified role, defaulting to the access version.
def get_image_copy(image_id, role="access"):
    """
    Get image copy details for a particular copy role.
    Returns None if there's no copy with that role.
    """
    metadata = get_metadata(image_id)
    for copy in metadata["copies"]:
        if copy["copyrole"] == role:
            return copy
    return None
get_image_copy("nla.obj-232162256")
{'copyrole': 'access',
'blobId': 7682805,
'filename': '23216230.jp2',
'filesize': 1560253,
'technicalmetadata': {'width': 4519, 'height': 5508}}
You might want to check whether a high-resolution TIFF version is available for download. To do this, look for a copy with a copyrole value of m. You can then check its access value to see whether it is set to true (can be downloaded) or false (can’t be downloaded).
image_id = "nla.obj-232162256"
tiff_copy = get_image_copy(image_id, role="m")
# get_image_copy returns None if there is no copy with this role
if tiff_copy and tiff_copy.get("access") == "true":
    download_url = f"https://nla.gov.au/{image_id}/m"
    print(f"Download: {download_url}")
else:
    print("Cannot be downloaded")
Download: https://nla.gov.au/nla.obj-232162256/m
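If you have already fetched the metadata, the same check can be wrapped in a function that works on the dictionary directly and avoids a second request. This is a minimal sketch: tiff_download_url is a hypothetical helper, and the access value of the m copy in the sample is made up.

```python
def tiff_download_url(metadata, image_id):
    """Return the TIFF download url if the 'm' copy is accessible, else None (hypothetical helper)."""
    for copy in metadata.get("copies", []):
        if copy.get("copyrole") == "m":
            # The access flag is stored as a string, not a boolean
            if copy.get("access") == "true":
                return f"https://nla.gov.au/{image_id}/m"
            return None
    # No copy with the 'm' role was found
    return None


# Sample based on the copy details shown above; the 'm' entry is invented
sample_image = {
    "copies": [
        {"copyrole": "access", "blobId": 7682805, "filename": "23216230.jp2"},
        {"copyrole": "m", "access": "true"},
    ]
}
print(tiff_download_url(sample_image, "nla.obj-232162256"))
# https://nla.gov.au/nla.obj-232162256/m
```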