Customize the Search Index
Under the hood, Canopy uses FlexSearch to power the search index. FlexSearch is a fast, memory-efficient, full-text search library that is highly customizable, and Canopy exposes several of its options so you can tailor the search index.
This guide assumes you have a Canopy IIIF project. See the Create a Project guide to get started.
Use Case
You'd like to use Canopy IIIF to create a digital exhibit featuring Arabic manuscripts, for example the Arabic Manuscripts from West Africa collection provided by Northwestern University. The IIIF Manifest data contains both Arabic script and English text in its label and summary properties.
You'd like to customize the search configuration in the following three ways:
- Support the "Arabic" character set in search (in addition to the default "Latin").
- Include text from Manifest summary values in search results.
- Include additional Manifest metadata in search results. In our example Manifests, "Contributor" and "Alternate Title" are included as metadata items, and we would like to surface these in search results.
Implementation
Add search configuration
Set up a Canopy IIIF project with the following configuration, which includes the search property with its default values.
{
  "collection": "https://api.dc.library.northwestern.edu/api/v2/collections/59ec43f9-a96c-4314-9b44-9923790b371c?as=iiif&size=100",
  "search": {
    "enabled": true,
    "flexSearch": {
      "bidirectional": false,
      "charset": "latin:extra",
      "document": {
        "index": [
          {
            "bidirectional": true,
            "depth": 3,
            "field": "label",
            "resolution": 9,
            "tokenize": "full"
          },
          {
            "field": "metadata",
            "resolution": 2
          },
          {
            "field": "summary",
            "resolution": 1
          }
        ]
      },
      "optimize": true,
      "tokenize": "strict"
    },
    "index": {
      "metadata": {
        "all": false,
        "enabled": true
      },
      "summary": {
        "enabled": false
      }
    }
  }
}
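To make the shape of the search block easier to reason about, it can be described with TypeScript types. This is only a sketch: the type and function names (FieldIndexOptions, SearchConfig, defaultSearch, indexedProperties) are hypothetical and are not exported by Canopy IIIF, and the helper assumes that label is always indexed because the configuration shows no toggle for it.

```typescript
// Hypothetical types describing the "search" block shown above.
interface FieldIndexOptions {
  field: string;
  resolution?: number;
  tokenize?: string;
  depth?: number;
  bidirectional?: boolean;
}

interface SearchConfig {
  enabled: boolean;
  flexSearch: {
    bidirectional: boolean;
    charset: string | string[];
    document: { index: FieldIndexOptions[] };
    optimize: boolean;
    tokenize: string;
  };
  index: {
    metadata: { all: boolean; enabled: boolean };
    summary: { enabled: boolean };
  };
}

// Default values copied from the configuration above.
const defaultSearch: SearchConfig = {
  enabled: true,
  flexSearch: {
    bidirectional: false,
    charset: "latin:extra",
    document: {
      index: [
        { bidirectional: true, depth: 3, field: "label", resolution: 9, tokenize: "full" },
        { field: "metadata", resolution: 2 },
        { field: "summary", resolution: 1 },
      ],
    },
    optimize: true,
    tokenize: "strict",
  },
  index: {
    metadata: { all: false, enabled: true },
    summary: { enabled: false },
  },
};

// Lists which Manifest properties would be indexed under a given config.
// Assumption: label is always indexed when search is enabled.
function indexedProperties(search: SearchConfig): string[] {
  if (!search.enabled) return [];
  const properties = ["label"];
  if (search.index.metadata.enabled) properties.push("metadata");
  if (search.index.summary.enabled) properties.push("summary");
  return properties;
}

console.log(indexedProperties(defaultSearch)); // [ 'label', 'metadata' ]
```

With the defaults, summary is absent from the result because search.index.summary.enabled is false.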
Support additional language charsets
Edit config/canopy.json and add the additional language encoding, arabic:extra, to the search.flexSearch.charset property. Because we are now using multiple language encodings, the value must be an array of strings.
{
  "collection": "https://api.dc.library.northwestern.edu/api/v2/collections/59ec43f9-a96c-4314-9b44-9923790b371c?as=iiif&size=100",
  "search": {
    "enabled": true,
    "flexSearch": {
      "bidirectional": false,
      "charset": ["latin:extra", "arabic:extra"],
      "document": {
        "index": [
          {
            "bidirectional": true,
            "depth": 3,
            "field": "label",
            "resolution": 9,
            "tokenize": "full"
          },
          {
            "field": "metadata",
            "resolution": 2
          },
          {
            "field": "summary",
            "resolution": 1
          }
        ]
      },
      "optimize": true,
      "tokenize": "strict"
    },
    "index": {
      "metadata": {
        "all": false,
        "enabled": true
      },
      "summary": {
        "enabled": false
      }
    }
  }
}
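If you are unsure whether a collection needs the extra charset, one option is to check a few Manifest labels for Arabic script before editing the configuration. The sketch below is not part of Canopy IIIF; containsArabic and suggestCharset are hypothetical helper names, and the check only covers the Arabic, Arabic Supplement, and Arabic Extended-A Unicode blocks.

```typescript
// Matches characters in the Arabic (U+0600-06FF), Arabic Supplement
// (U+0750-077F), and Arabic Extended-A (U+08A0-08FF) Unicode blocks.
const ARABIC_BLOCKS = /[\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF]/;

// Returns true when the text contains at least one Arabic-script character.
function containsArabic(text: string): boolean {
  return ARABIC_BLOCKS.test(text);
}

// Suggests a value for search.flexSearch.charset: the default string when the
// sampled text is Latin-only, or an array once Arabic script is present.
function suggestCharset(samples: string[]): string | string[] {
  return samples.some(containsArabic)
    ? ["latin:extra", "arabic:extra"]
    : "latin:extra";
}

// Label and metadata value taken from the example Manifest in this guide.
const sampleText = [
  "محبة مع خاتم النبي يوسف.",
  "Translated title: Love fāʼidah with the amulet of Prophet Yūsuf",
];

console.log(suggestCharset(sampleText)); // [ 'latin:extra', 'arabic:extra' ]
```

Transliterated text such as "Yūsuf" uses Latin letters with diacritics, so it does not trigger the Arabic check on its own.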
Include summary in search results
The default search configuration indexes only Manifest label and metadata values.
To include Manifest summary values in the search index, set search.index.summary.enabled to true.
{
  "collection": "https://api.dc.library.northwestern.edu/api/v2/collections/59ec43f9-a96c-4314-9b44-9923790b371c?as=iiif&size=100",
  "search": {
    "enabled": true,
    "flexSearch": {
      "bidirectional": false,
      "charset": ["latin:extra", "arabic:extra"],
      "document": {
        "index": [
          {
            "bidirectional": true,
            "depth": 3,
            "field": "label",
            "resolution": 9,
            "tokenize": "full"
          },
          {
            "field": "metadata",
            "resolution": 2
          },
          {
            "field": "summary",
            "resolution": 1
          }
        ]
      },
      "optimize": true,
      "tokenize": "strict"
    },
    "index": {
      "metadata": {
        "all": false,
        "enabled": true
      },
      "summary": {
        "enabled": true
      }
    }
  }
}
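In IIIF Presentation 3, summary is a language map (language keys pointing to arrays of strings), so its text has to be flattened before it can be indexed as a single field. The following is a minimal sketch of that idea, not Canopy's actual implementation; LanguageMap and flattenLanguageMap are hypothetical names.

```typescript
// A IIIF Presentation 3 language map, e.g. { "none": ["..."] } or { "en": ["..."] }.
type LanguageMap = { [language: string]: string[] };

// Joins every string under every language key into one searchable string.
// Returns an empty string when the Manifest has no summary.
function flattenLanguageMap(map: LanguageMap | undefined): string {
  if (!map) return "";
  const parts: string[] = [];
  for (const language of Object.keys(map)) {
    for (const value of map[language]) {
      parts.push(value);
    }
  }
  return parts.join(" ");
}

// Summary text taken from the validation example in this guide.
const summary: LanguageMap = {
  none: ["Fāʼidah of Prophet Yūsuf on gaining people's love and respect."],
};

console.log(flattenLanguageMap(summary));
```

Flattening across all language keys means a search can match summary text regardless of which language tag the source Manifest used.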
Curate metadata labels for indexing
Implementers may choose to index all, part, or none of the metadata in Manifests. By default, Canopy IIIF indexes only the values whose labels are listed in the metadata property of the config/canopy.json file.
Our source IIIF Collection has Manifests with specific metadata content to index, and we want to limit indexing to the Date, Subject, Contributor, and Alternate Title labels. In this example Manifest, the Alternate Title value "Translated title: Love fāʼidah with the amulet of Prophet Yūsuf" and the Contributor value "Falke, ʻUmar, 1893-1962 (Collector)" would be included in the index.
{
  "@context": "http://iiif.io/api/presentation/3/context.json",
  "id": "https://api.dc.library.northwestern.edu/api/v2/works/2ca1b09b-cbad-43dd-82bf-a7fa807269d8?as=iiif",
  "type": "Manifest",
  "label": {
    "none": [
      "محبة مع خاتم النبي يوسف."
    ]
  },
  "metadata": [
    {
      "label": {
        "none": [
          "Alternate Title"
        ]
      },
      "value": {
        "none": [
          "Translated title: Love fāʼidah with the amulet of Prophet Yūsuf"
        ]
      }
    },
    {
      "label": {
        "none": [
          "Contributor"
        ]
      },
      "value": {
        "none": [
          "Falke, ʻUmar, 1893-1962 (Collector)"
        ]
      }
    },
    ...
  ],
  ...
}
Update the config/canopy.json file to include these labels in the metadata property.
{
  "collection": "https://api.dc.library.northwestern.edu/api/v2/collections/59ec43f9-a96c-4314-9b44-9923790b371c?as=iiif&size=100",
  "metadata": ["Date", "Subject", "Contributor", "Alternate Title"],
  "search": {
    "enabled": true,
    "flexSearch": {
      "bidirectional": false,
      "charset": ["latin:extra", "arabic:extra"],
      "document": {
        "index": [
          {
            "bidirectional": true,
            "depth": 3,
            "field": "label",
            "resolution": 9,
            "tokenize": "full"
          },
          {
            "field": "metadata",
            "resolution": 2
          },
          {
            "field": "summary",
            "resolution": 1
          }
        ]
      },
      "optimize": true,
      "tokenize": "strict"
    },
    "index": {
      "metadata": {
        "all": false,
        "enabled": true
      },
      "summary": {
        "enabled": true
      }
    }
  }
}
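The effect of the metadata label list can be pictured as a filter over each Manifest's metadata entries: only entries whose label appears in the list contribute values to the index. The sketch below illustrates that filtering under stated assumptions; it is not Canopy's internal code, and MetadataEntry, languageMapValues, and pickMetadataValues are hypothetical names. Label matching here is exact and case-sensitive, which may differ from what Canopy does.

```typescript
// IIIF Presentation 3 language map and metadata entry shapes.
type LanguageMap = { [language: string]: string[] };

interface MetadataEntry {
  label: LanguageMap;
  value: LanguageMap;
}

// Collects the strings from every language key of a language map.
function languageMapValues(map: LanguageMap): string[] {
  const values: string[] = [];
  for (const language of Object.keys(map)) {
    for (const value of map[language]) {
      values.push(value);
    }
  }
  return values;
}

// Returns the values of entries whose label is in allowedLabels.
function pickMetadataValues(
  metadata: MetadataEntry[],
  allowedLabels: string[]
): string[] {
  const picked: string[] = [];
  for (const entry of metadata) {
    const labels = languageMapValues(entry.label);
    const isAllowed = labels.some((label) => allowedLabels.indexOf(label) !== -1);
    if (isAllowed) {
      for (const value of languageMapValues(entry.value)) {
        picked.push(value);
      }
    }
  }
  return picked;
}

// The two metadata entries shown in the example Manifest above.
const manifestMetadata: MetadataEntry[] = [
  {
    label: { none: ["Alternate Title"] },
    value: { none: ["Translated title: Love fāʼidah with the amulet of Prophet Yūsuf"] },
  },
  {
    label: { none: ["Contributor"] },
    value: { none: ["Falke, ʻUmar, 1893-1962 (Collector)"] },
  },
];

const configuredLabels = ["Date", "Subject", "Contributor", "Alternate Title"];
console.log(pickMetadataValues(manifestMetadata, configuredLabels));
```

Narrowing the list to ["Contributor"] would keep only the "Falke, ʻUmar, 1893-1962 (Collector)" value.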
Tip: To confirm text is being indexed for search, open the .canopy/index.json file and verify that your custom data has been added to the index.
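For larger collections, the check described in the tip can be scripted rather than done by eye. This guide does not describe the internal structure of .canopy/index.json, so the sketch below makes no assumption about it and simply walks any parsed JSON value looking for a substring. jsonContainsText is a hypothetical helper, and the sample object is an invented stand-in for the real file.

```typescript
// Recursively searches any parsed JSON value for a case-sensitive substring.
function jsonContainsText(value: unknown, needle: string): boolean {
  if (typeof value === "string") {
    return value.indexOf(needle) !== -1;
  }
  if (Array.isArray(value)) {
    return value.some((item) => jsonContainsText(item, needle));
  }
  if (value !== null && typeof value === "object") {
    const record = value as { [key: string]: unknown };
    return Object.keys(record).some((key) => jsonContainsText(record[key], needle));
  }
  return false; // numbers, booleans, and null never match
}

// Invented stand-in for parsed index data. In a Node script you would load
// the real file instead, e.g.:
//   JSON.parse(fs.readFileSync(".canopy/index.json", "utf8"))
const sampleIndex: unknown = [
  {
    label: "محبة مع خاتم النبي يوسف.",
    metadata: ["Falke, ʻUmar, 1893-1962 (Collector)"],
  },
];

console.log(jsonContainsText(sampleIndex, "Falke")); // true
console.log(jsonContainsText(sampleIndex, "not in the index")); // false
```

Because the match is an exact substring test, it confirms only that the text was written to the file, not that FlexSearch will return it for a given query.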
Validate search customizations
Verify your customizations are working by searching for:
- An Arabic phrase (e.g. "مجموع الفوائد.")
- A Manifest summary value (e.g. "Fāʼidah of Prophet Yūsuf on gaining people's love and respect.")
- A Manifest metadata value (e.g. "Falke" or "Prophet Yūsuf")