@@ -96,31 +96,116 @@ curl -X POST https://data.ldaca.edu.au/api/search \
 }'
 ```
 
-## 3. Downloading Files from Entities
+## 3. Retrieving RO-Crate Metadata
 
-### Getting File Information
+### Get Complete RO-Crate JSON-LD
 
-First, get entity details to see available files:
+Access the raw RO-Crate metadata for any entity:
 
 ```bash
-curl "https://data.ldaca.edu.au/api/entity/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001"
+curl "https://data.ldaca.edu.au/api/entity/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001/crate"
 ```
 
-### Downloading Files
+**Response:**
+
+```json
+{
+  "@context": "https://w3id.org/ro/crate/1.1/context",
+  "@graph": [
+    {
+      "@id": "ro-crate-metadata.json",
+      "@type": "CreativeWork",
+      "conformsTo": {
+        "@id": "https://w3id.org/ro/crate/1.1"
+      },
+      "about": {
+        "@id": "./"
+      }
+    },
+    {
+      "@id": "./",
+      "@type": "Dataset",
+      "name": "Recordings of West Alor languages",
+      "description": "A compilation of recordings featuring various West Alor languages"
+    }
+  ]
+}
+```
+
+This is useful for:
+
+- Validating RO-Crate compliance
+- Accessing extended metadata not exposed in the entity API
+- Archival and preservation workflows
+- Integration with RO-Crate tools
 
-Download a specific file:
+## 4. Working with Files
+
+### Understanding Entities vs Files
+
+The API provides two ways to work with files:
+
+- **`/entities`** - Returns RO-Crate entities including MediaObjects (files that are part of the RO-Crate metadata)
+- **`/files`** - Returns files from the repository's file system
+
+**Important**: Not all files are represented as RO-Crate entities. MediaObject entities are typically a subset of all files in the repository.
+
+### Listing Files from the File System
+
+List all files in the repository:
 
 ```bash
-# Direct download
-wget "https://data.ldaca.edu.au/api/entity/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001/file/recording.wav"
+# List all files
+curl "https://data.ldaca.edu.au/api/files"
+
+# List files attached to a specific entity
+curl "https://data.ldaca.edu.au/api/files?memberOf=https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001"
+
+# Paginate through files
+curl "https://data.ldaca.edu.au/api/files?limit=50&offset=0"
 ```
 
-### Getting Download URLs
+### Listing MediaObject Entities
 
-Instead of direct download, get the file location:
+List files that are part of the RO-Crate:
 
 ```bash
-curl "https://data.ldaca.edu.au/api/entity/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001/file/recording.wav?noRedirect=true"
+curl "https://data.ldaca.edu.au/api/entities?entityType=http://schema.org/MediaObject"
+```
+
+MediaObject entities include a `fileId` field that references the file in the `/files` endpoint:
+
+```json
+{
+  "id": "https://catalog.paradisec.org.au/repository/LRB/001/recording.wav",
+  "name": "recording.wav",
+  "entityType": "http://schema.org/MediaObject",
+  "fileId": "https://catalog.paradisec.org.au/repository/LRB/001/recording.wav",
+  ...
+}
+```
+
+### Accessing File Content
+
+```bash
+# Direct file download
+curl "https://data.ldaca.edu.au/api/file/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001%2Frecording.wav" -o recording.wav
+```
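
The download URL above embeds the file ID with every reserved character percent-encoded. The following is a minimal sketch of building such a URL from a raw `fileId`. `urlencode` and `file_url` are hypothetical helpers introduced here for illustration, not part of the API, and the sketch assumes ASCII-only IDs.

```shell
#!/bin/sh
# Base URL taken from the examples in this guide.
API_BASE="https://data.ldaca.edu.au/api"

# Percent-encode every character outside the RFC 3986 "unreserved" set.
# Assumption: the input is ASCII only; multi-byte characters are not handled.
urlencode() {
  input=$1
  encoded=""
  while [ -n "$input" ]; do
    rest=${input#?}          # everything after the first character
    char=${input%"$rest"}    # the first character on its own
    case $char in
      [A-Za-z0-9._~-]) encoded="$encoded$char" ;;
      *) encoded="$encoded$(printf '%%%02X' "'$char")" ;;
    esac
    input=$rest
  done
  printf '%s\n' "$encoded"
}

# Build the /file/ URL for a raw file ID (hypothetical helper).
file_url() {
  printf '%s/file/%s\n' "$API_BASE" "$(urlencode "$1")"
}

# Usage (performs a network request, so it is left commented out):
# curl "$(file_url "https://catalog.paradisec.org.au/repository/LRB/001/recording.wav")" -o recording.wav
```

Encoding the whole ID, including `:` and `/`, keeps it inside a single path segment, which matches the shape of the URLs shown in this guide.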
+
+### Download as Attachment
+
+Force download with a custom filename:
+
+```bash
+curl "https://data.ldaca.edu.au/api/file/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001%2Frecording.wav?disposition=attachment&filename=my-recording.wav"
+```
+
+### Getting File Location
+
+Get the file location without downloading:
+
+```bash
+curl "https://data.ldaca.edu.au/api/file/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001%2Frecording.wav?noRedirect=true"
 ```
 
 **Response:**
@@ -131,22 +216,32 @@ curl "https://data.ldaca.edu.au/api/entity/https%3A%2F%2Fcatalog.paradisec.org.a
 }
 ```
 
-## 4. Paginating Through Large Result Sets
+### Partial Content Download
+
+Use HTTP range requests for streaming or resuming downloads:
+
+```bash
+# Download first 1KB
+curl -H "Range: bytes=0-1023" \
+  "https://data.ldaca.edu.au/api/file/https%3A%2F%2Fcatalog.paradisec.org.au%2Frepository%2FLRB%2F001%2Frecording.wav"
+```
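
The range request above fetches one fixed window. As a rough sketch, the header for any fixed-size chunk can be computed from a zero-based chunk index. `range_header` is a hypothetical helper, and the sketch assumes the server honours byte ranges as the text describes.

```shell
#!/bin/sh
# Print the Range header for zero-based chunk $1 with a chunk size of $2 bytes.
# HTTP byte ranges are inclusive, so the end offset is start + size - 1.
range_header() {
  start=$(( $1 * $2 ))
  end=$(( start + $2 - 1 ))
  printf 'Range: bytes=%s-%s\n' "$start" "$end"
}

# Usage (performs a network request, so it is left commented out):
# curl -H "$(range_header 1 1024)" "$FILE_URL" -o chunk-1.bin
```

Here `FILE_URL` stands for any of the `/api/file/...` URLs shown earlier. Under standard HTTP range semantics, a range that runs past the end of the file yields only the bytes that exist, so the last chunk may be shorter than the others.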
+
+## 5. Paginating Through Large Result Sets
 
 ### Basic Pagination
 
 ```bash
 # First page
 curl "https://data.ldaca.edu.au/api/entities?limit=100&offset=0"
 
-# Second page
+# Second page
 curl "https://data.ldaca.edu.au/api/entities?limit=100&offset=100"
 
 # Third page
 curl "https://data.ldaca.edu.au/api/entities?limit=100&offset=200"
 ```
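
The three requests above follow a fixed pattern: each page advances `offset` by `limit`. A minimal sketch of generating those URLs in a loop is shown here. `page_urls` is a hypothetical helper, and the total record count is assumed to be known in advance; how the API reports a total is not covered in this section.

```shell
#!/bin/sh
# Print one URL per page for endpoint $1, page size $2, and total record count $3.
# Assumption: the caller already knows the total number of records.
page_urls() {
  base=$1
  limit=$2
  total=$3
  offset=0
  while [ "$offset" -lt "$total" ]; do
    printf '%s?limit=%s&offset=%s\n' "$base" "$limit" "$offset"
    offset=$((offset + limit))
  done
}

# Usage (performs network requests, so it is left commented out):
# page_urls "https://data.ldaca.edu.au/api/entities" 100 250 | while read -r url; do
#   curl "$url"
# done
```

With a page size of 100 and 250 records, this prints the same three URLs as the manual example.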
 
-## 5. Working with Communication Modes
+## 6. Working with Communication Modes
 
 Language archives often categorise data by communication mode:
 