publish_my_data 1.2.4 → 1.3.0

@@ -1,939 +1,650 @@
1
- %section
2
- %h2 Introduction
3
-
4
- :markdown
5
- This page describes the current version of our production API, which was deployed on 26th of June 2013. This release included only backward-compatible changes. See our [Change Log](/changelog) for more details of what changed.
6
-
7
-
8
- %nav.contents
9
- %h2 Contents
10
-
11
- %h3
12
- 1 -
13
- %strong Linked Data API
14
-
15
- %ul
16
- %li
17
- %a(href="#uri-dereferencing")
18
- 1.1
19
- %strong URI Dereferencing
20
- %li
21
- %a(href="#resource-formats")
22
- 1.2
23
- %strong Resource Formats
24
- %li
25
- %a(href="#ruby-dereferencing")
26
- 1.3
27
- %strong Example: Dereferencing URIs with Ruby
28
- %li
29
- %a(href="#curl-dereferencing")
30
- 1.4
31
- %strong Example: Dereferencing URIs with cURL
32
-
33
- %h3
34
- 2 -
35
- %strong Other Resource APIs
36
- %ul
37
- %li
38
- %a(href="#individual-datasets")
39
- 2.1
40
- %strong Individual Datasets
41
- %li
42
- %a(href="#themes")
43
- 2.2
44
- %strong Themes
45
- %li
46
- %a(href="#dataset-collections")
47
- 2.3
48
- %strong Collections of Datasets
49
- %li
50
- %a(href="#individual-resources")
51
- 2.4
52
- %strong Individual Resources
53
- %li
54
- %a(href="#resource-collections")
55
- 2.5
56
- %strong Collections of Resources
57
- %li
58
- %a(href="#resource-options-and-limits")
59
- 2.6
60
- %strong Options and Limits
61
- %li
62
- %a(href="#ruby-resources-example")
63
- 2.7
64
- %strong Example: Using Ruby to get a filtered list of resources
65
- %li
66
- %a(href="#js-resources-example")
67
- 2.8
68
- %strong Example: Using JavaScript to get a filtered list of resources
69
- %li
70
- %a(href="#curl-resources-example")
71
- 2.9
72
- %strong Example: Using cURL to get the list of datasets in a theme
73
-
74
- %h3
75
- 3 -
76
- %strong SPARQL
77
- %ul
78
- %li
79
- %a(href="#sparql-introduction")
80
- 3.1
81
- %strong Introduction to SPARQL
82
- %li
83
- %a(href="#sparql-results-formats")
84
- 3.2
85
- %strong SPARQL Results Formats
86
- %li
87
- %a(href="#sparql-results-pagination")
88
- 3.3
89
- %strong SPARQL Results Pagination
90
- %li
91
- %a(href="#sparql-errors")
92
- 3.4
93
- %strong SPARQL Errors
94
- %li
95
- %a(href="#sparql-json-p")
96
- 3.5
97
- %strong JSON-P
98
- %li
99
- %a(href="#named-graphs")
100
- 3.6
101
- %strong Use of Named Graphs
102
- %li
103
- %a(href="#parameter-interpolation")
104
- 3.7
105
- %strong Parameter Substitution
106
- %li
107
- %a(href="#ruby-sparql-example")
108
- 3.8
109
- %strong Example: Using Ruby to request data from the SPARQL Endpoint
110
- %li
111
- %a(href="#js-sparql-example")
112
- 3.9
113
- %strong Example: Using JavaScript to request data from the SPARQL Endpoint
114
-
115
- %h3
116
- 4 -
117
- %strong General
118
- %ul
119
- %li
120
- %a(href="#response-size-limits")
121
- 4.1
122
- %strong Response Size Limits
123
- %li
124
- %a(href="#errors")
125
- 4.2
126
- %strong Errors
127
- %li
128
- %a(href="#cors")
129
- 4.3
130
- %strong CORS
131
- %li
132
- %a(href="#discontinued-datasets")
133
- 4.4
134
- %strong Discontinued Datasets
135
- %li
136
- %a(href="#api-keys")
137
- 4.5
138
- %strong API Keys
139
-
140
- %section
141
- %h2 1 - Linked Data API
142
-
143
- .subsection#uri-dereferencing
144
- %h3
145
- 1.1
146
- %strong URI Dereferencing
147
- %p Following the standard practices for Linked Data, we distinguish between a 'real-world' resource and documents about that resource. <strong>Identifiers (URIs)</strong> for the resources follow the pattern:
148
- %code.block http://{data-site-domain}/<strong>id</strong>/{...}
149
- %p When you look them up you get redirected to the corresponding document about that thing. The <strong>document URLs</strong> follow the pattern:
150
- %code.block http://{data-site-domain}/<strong>doc</strong>/{...}
151
- %p For example, for a resource identified by the URI:
152
- %code.block http://{data-site-domain}/<strong>id</strong>/my/resource
153
- %p If you enter it in your browser, you are redirected, with an HTTP status code of 303 ("See Other"), to an HTML page about that resource:
154
- %code.block http://{data-site-domain}/<strong>doc</strong>/my/resource
155
-
156
- :markdown
157
- In cases where a URI identifies something that is essentially a document (an 'information resource'), we respond with a 200, as its URI and document page URL are one and the same. This includes [Datasets](#individual-datasets) as well as ontology terms and concept schemes.
158
-
159
- .subsection#resource-formats
160
- %h3
161
- 1.2
162
- %strong Resource Formats
163
- :markdown
164
- You can specify what format you want the resulting document to be in. By default you get HTML in a human-readable form, but you can also ask for the document in one of several RDF formats: <strong>RDF/XML</strong>, <strong>N-triples</strong>, <strong>Turtle</strong> or <strong>JSON-LD</strong>.
165
-
166
- :markdown
167
- There are two ways to specify which format you want: you can append a <strong>format extension</strong> to the document page's URL or you can use an <strong>HTTP Accept header</strong> with the resource's URI or document page's URL.
168
-
169
-
170
- %table
171
- %thead
172
- %tr
173
- %th Format
174
- %th Extensions
175
- %th Accept Headers
176
- %tbody
177
- %tr
178
- %td.details RDF/XML
179
- %td .rdf
180
- %td.hardwrap application/rdf+xml
181
- %tr
182
- %td.details N-triples
183
- %td .nt, .txt, .text
184
- %td.hardwrap
185
- application/n-triples,
186
- text/plain
187
- %tr
188
- %td.details Turtle
189
- %td .ttl
190
- %td.hardwrap text/turtle
191
- %tr
192
- %td.details JSON-LD
193
- %td .json
194
- %td.hardwrap
195
- application/ld+json,
196
- application/json
197
- %br
198
-
199
- .subsection#ruby-dereferencing
200
- %h3
201
- 1.3
202
- %strong Example: Dereferencing URIs with Ruby
203
- :markdown
204
- Here's an example of dereferencing a URI using the [RestClient](http://rubydoc.info/gems/rest-client) library. Similar approaches can be taken in other languages. This assumes you already have Ruby set up on your system. Also, if you don't already have it, you'll need to install the gem:
205
-
206
- <code class="block">$ gem install rest-client</code>
207
-
208
- ... and require it in your script.
209
-
210
- <code class="block">require 'rest-client'</code>
211
-
212
- %h4
213
- 1.3.1
214
- %strong Specifying the format in an accept header - in this case RDF/XML
215
-
216
- :markdown
217
- If you're using the accept header, you can directly request the URI. This involves two requests, because doing an HTTP GET on the resource identifier gives you a 303 redirect to the appropriate document page. RestClient looks after that for you.
218
-
219
- %code.prettyprint.lang-ruby.block
220
- :preserve
221
- RestClient.get 'http://{data-site-domain}/id/my/resource', :accept=>'application/rdf+xml'
222
-
223
- :markdown
224
- You can also request the document page directly:
225
-
226
- %code.prettyprint.lang-ruby.block
227
- :preserve
228
- RestClient.get 'http://{data-site-domain}/doc/my/resource', :accept=>'application/rdf+xml'
229
-
230
- %h4
231
- 1.3.2
232
- %strong Specifying the format as an extension - in this case JSON
233
-
234
- :markdown
235
- If using an extension, you must request the document page directly (as '.json' is not part of the URI)
236
-
237
- %code.prettyprint.lang-ruby.block
238
- RestClient.get 'http://{data-site-domain}/doc/my/resource.json'
239
-
240
- .subsection#curl-dereferencing
241
- %h3
242
- 1.4
243
- %strong Example: Dereferencing URIs with cURL
244
- :markdown
245
- Here's an example of dereferencing a URI using the widely available [cURL](http://curl.haxx.se) command line program.
246
-
247
- %h4
248
- 1.4.1
249
- %strong Specifying the format in an accept header (in this case, Turtle)
250
-
251
- :markdown
252
- If you're using the accept header, you can directly request the URI. This involves two requests, because doing an HTTP GET on the resource identifier gives you a 303 redirect to the appropriate document page. cURL looks after that for you if you use the <code>-L</code> option.
253
-
254
- %code.prettyprint.block
255
- :preserve
256
- curl -L -H "Accept: text/turtle" http://{data-site-domain}/id/my/resource
257
-
258
- :markdown
259
- You can also request the document page directly
260
-
261
- %code.prettyprint.block
262
- curl -H "Accept: text/turtle" http://{data-site-domain}/doc/my/resource
263
-
264
- %h4
265
- 1.4.2
266
- %strong Specifying the format as an extension - in this case N-triples
267
-
268
- :markdown
269
- If using an extension, you must request the document page directly (as '.nt' is not part of the URI)
270
-
271
- %code.prettyprint.block
272
- curl http://{data-site-domain}/doc/my/resource.nt
273
-
274
-
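The format extensions and Accept headers from the Resource Formats table can also be kept in a small lookup. The following is a minimal sketch only, using plain Ruby; `FORMATS` and `document_url` are hypothetical names chosen for illustration and are not part of the API.

```ruby
# Sketch only: the Resource Formats table expressed as a lookup.
# FORMATS and document_url are hypothetical names, not part of the API.
FORMATS = {
  rdfxml:   { extension: ".rdf",  accept: "application/rdf+xml" },
  ntriples: { extension: ".nt",   accept: "application/n-triples" },
  turtle:   { extension: ".ttl",  accept: "text/turtle" },
  jsonld:   { extension: ".json", accept: "application/ld+json" }
}.freeze

# Turn a resource identifier (/id/...) into the URL of its document page
# (/doc/...) with the format extension appended.
def document_url(resource_uri, format)
  extension = FORMATS.fetch(format)[:extension]
  resource_uri.sub(%r{/id/}, "/doc/") + extension
end

puts document_url("http://example.com/id/my/resource", :turtle)
```

The extension form only applies to document page URLs, which is why the helper rewrites `/id/` to `/doc/` before appending it. When using an Accept header instead, the value under `:accept` can be sent with either the resource URI or the document page URL.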
275
- %section
276
- %h2 2 - Other Resource APIs
277
-
278
- :markdown
279
- Alongside the URI dereferencing we offer the following additional ways of accessing data in the system. Please be sure to read the [Options and Limits](#resource-options-and-limits) section, for some background information which applies to all these APIs, such as details on data formats and pagination.
280
-
281
- Some examples of accessing the data from our APIs using different languages follow at the end of this section.
282
-
283
- .subsection#individual-datasets
284
- %h3
285
- 2.1
286
- %strong Individual Datasets
287
- :markdown
288
- Dataset identifiers take the form <code>http://example.com/data/{dataset-short-name}</code>, where <code>{dataset-short-name}</code> is a URI section that uniquely identifies the dataset. The short name can contain lower-case letters, numbers, slashes, and hyphens.
289
-
290
- Dereferencing a dataset identifier responds with HTTP status code 200 and provides metadata about the dataset, including a link to where the dataset contents can be downloaded. e.g.:
291
-
292
- <code class="block">http://{data-site-domain}/data/my/dataset</code>
293
-
294
- Please also see the [Use of Named Graphs](#named-graphs) section, for how the dataset data and metadata are stored in the database.
295
-
296
- .subsection#themes
297
- %h3
298
- 2.2
299
- %strong Themes
300
- :markdown
301
- Datasets are grouped into Themes. A list of all themes is available at: <code>http://{data-site-domain}/themes</code>.
302
-
303
- Information about a particular theme can be accessed by [dereferencing](#uri-dereferencing) the theme's URI. e.g.
304
-
305
- <code class="block">http://{data-site-domain}/def/concept/themes/my/theme</code>
306
-
307
- .subsection#dataset-collections
308
- %h3
309
- 2.3
310
- %strong Collections of Datasets
311
- :markdown
312
- A list of all datasets is available at:
313
- <code>http://{data-site-domain}/data</code>, [paginatable](#resource-options-and-limits) with <code>page</code> and <code>per_page</code>.
314
-
315
- Lists of datasets in a single theme are available at: <code>http://{data-site-domain}/themes/{theme-name}</code>, where <code>{theme-name}</code> is the part of the theme URI after <code>/themes/</code>
316
-
317
- .subsection#individual-resources
318
- %h3
319
- 2.4
320
- %strong Individual Resources
321
- :markdown
322
- As well as using [URI dereferencing](#uri-dereferencing) to access information about individual resources, you can use the following URL pattern:
323
-
324
- <code class="block">http://{data-site-domain}/resource?uri={resource-uri}</code>
325
-
326
- This is especially useful for resources for which we have information in our database, but which aren't in the site's domain (i.e. so you can't dereference them in this site). e.g.
327
-
328
- <code class="block">http://{data-site-domain}/resource?uri=http://another.domain/id/external/resource</code>
329
-
330
- If using a format extension to request a particular format for the resource, add the extension immediately after '/resource'. For example, to get a JSON-LD version of a resource:
331
-
332
- <code class="block">http://{data-site-domain}/resource.json?uri={resource-uri}</code>
333
-
334
- .subsection#resource-collections
335
- %h3
336
- 2.5
337
- %strong Collections of Resources
338
- :markdown
339
- Collections of resources can be retrieved from <code>/resources</code> by supplying filters. For now, we just support filters for <code>dataset</code> and <code>type_uri</code>.
340
-
341
- %table
342
- %thead
343
- %tr
344
- %th Filter parameter
345
- %th Expected value
346
- %th Behaviour
347
- %tbody
348
- %tr
349
- %td.details dataset
350
- %td The <span style="font-style:italic">short name</span> of a dataset (see <a href="#individual-datasets">above</a>).
351
- %td Filters the results to only include resources in the named graph of that dataset.
352
- %tr
353
- %td.details type_uri
354
- %td The URI of a resource type.
355
- %td Filters the results to only include resources of the type identified by that URI.
356
- %br
357
-
358
- :markdown
359
- e.g.
360
-
361
- <code class="block">http://{data-site-domain}/resources?dataset={dataset-name}&type_uri={URL-encoded type URI}</code>
362
-
363
- <code class="block">http://{data-site-domain}/resources?dataset=my-dataset&type_uri=http%3A%2F%2Fexample.com%2Fdef%2Fmy%2Ftype</code>
364
-
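The type URI has to be URL-encoded before it goes into the query string. As a minimal sketch, assuming only Ruby's standard `uri` library, the second example URL above can be built like this; `resources_url` is a hypothetical helper name and `example.com` stands in for the real site domain.

```ruby
require "uri"

# Hypothetical helper: build a /resources URL with URL-encoded filters.
# Pass format: ".nt" or ".json" to request a specific format by extension.
def resources_url(site_domain, dataset:, type_uri:, format: nil)
  query = URI.encode_www_form(dataset: dataset, type_uri: type_uri)
  "http://#{site_domain}/resources#{format}?#{query}"
end

puts resources_url("example.com",
                   dataset: "my-dataset",
                   type_uri: "http://example.com/def/my/type")
```

Dataset short names that contain slashes are encoded in the same way, so `my/dataset` becomes `my%2Fdataset`.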
365
- .subsection#resource-options-and-limits
366
- %h3
367
- 2.6
368
- %strong Options and Limits
369
-
370
- %h4
371
- 2.6.1
372
- %strong Formats
373
- :markdown
374
- Resources accessed via our resource APIs can be accessed in the same [choice of formats](#resource-formats) as for URI dereferencing (via either **format extensions** or **HTTP Accept headers**).
375
-
376
- %h4
377
- 2.6.2
378
- %strong Pagination
379
- :markdown
380
- For any APIs which return collections of things, the list can be paginated using <code>page</code>
381
- (default 1) and <code>per_page</code> (default 1000) query-string parameters. The maximum allowable page size will initially be set to 1000, but we may consider increasing this (as well as the default) in the future.
382
-
383
- %h4
384
- 2.6.3
385
- %strong Response Size Limits
386
-
387
- :markdown
388
- All requests to our APIs are subject to the [response size limits](#response-size-limits).
389
-
390
- .subsection#ruby-resources-example
391
- %h3
392
- 2.7
393
- %strong Example: Using Ruby to get a filtered list of resources
394
-
395
- %h4
396
- 2.7.1
397
- %strong Basic Example
398
-
399
- :markdown
400
- Here we use Ruby to retrieve a list of all resources of a type in a dataset as N-triples.
401
-
402
- Let's assume the short name for that dataset is <code>my/dataset</code>, and the URI for the type is <code>http://purl.org/linked-data/cube#Observation</code>, so the URL we need to call is as follows. (See [the Collections of Resources section](#resource-collections)).
403
-
404
- <code class="block">http://{site-domain}/resources?dataset=my%2Fdataset&type_uri=http%3A%2F%2Fpurl.org%2Flinked-data%2Fcube%23Observation</code>
405
-
406
- If you visited that URL in your browser (substituting the site domain, dataset name and type uri for real values), you'd see a paginated list of the resources. You can try this by clicking on the links in the footers of the sample resource tables on dataset pages.
407
-
408
- We want to get it in N-triples format, so we'll add the .nt extension. (See the [Formats section](#resource-formats)).
409
-
410
- The following Ruby code assigns a string of N-triples into the <code>ntriples_data</code> variable. Note that as the [maximum page size](#resource-options-and-limits) is 1000, and there are over 1000 resources of that type in the dataset, we'll need to make multiple requests.
411
-
412
- We use the [RestClient](http://rubydoc.info/gems/rest-client) here, which you can install with <code>$ gem install rest-client</code>.
413
-
414
- %code.prettyprint.lang-ruby.pre
415
- = preserve do
416
- :escaped
417
- require 'rest-client'
418
-
419
- url = "http://{site-domain}/resources.nt"
420
-
421
- ntriples_data = ""
422
- page = 1
423
- done = false
424
-
425
- while !done
426
- puts "requesting page \#{page}..."
427
- response = RestClient.get url, {:params =>
428
- {
429
- :page => page,
430
- :per_page => 1000,
431
- :dataset => "my/dataset",
432
- :type_uri => "http://purl.org/linked-data/cube#Observation"
433
- }
434
- }
435
-
436
- if response.length > 0
437
- ntriples_data += response
438
- page += 1
439
- else
440
- puts "no more data"
441
- done = true
442
- end
443
- end
444
-
445
- puts "data:"
446
- puts ntriples_data
447
-
448
- %h4
449
- 2.7.2
450
- %strong Extension: parsing the n-triples into an array of statements.
451
-
452
- :markdown
453
- The [ruby-rdf](http://rubydoc.info/github/ruby-rdf/rdf/master/) library is useful for parsing various RDF formats. Install it with <code>$ gem install rdf</code>. The following code reads our string of N-triples data into an array of <code>RDF::Statement</code>s.
454
-
455
- %code.prettyprint.lang-ruby.pre
456
- =preserve do
457
- :escaped
458
- require 'rdf'
459
-
460
- statements = []
461
- RDF::Reader.for(:ntriples).new(ntriples_data) {|r| r.each {|s| statements << s}}
462
-
463
- puts "parsed \#{statements.length} triples"
464
-
465
- :markdown
466
- **Note**: If you're doing a lot of work with RDF in Ruby, you might want to look at using [Swirrl](http://swirrl.com)'s open-source SPARQL ORM for Ruby, [Tripod](http://github.com/swirrl/tripod).
467
-
468
- .subsection#js-resources-example
469
- %h3
470
- 2.8
471
- %strong Example: Using JavaScript to get a filtered list of resources
472
-
473
- :markdown
474
- Here we use jQuery to retrieve a list of all the resources of a certain type in a dataset, as JSON-LD.
475
-
476
- Let's assume the short name for that dataset is <code>my/dataset</code>, and the URI for the type is <code>http://purl.org/linked-data/cube#Observation</code>, so the URL we need to call is as follows. (See [the Collections of Resources section](#resource-collections)).
477
-
478
- <code class="block">http://{site-domain}/resources?dataset=my%2Fdataset&type_uri=http%3A%2F%2Fpurl.org%2Flinked-data%2Fcube%23Observation</code>
479
-
480
- If you visited that URL in your browser (substituting the site domain, dataset name and type uri for real values), you'd see a paginated list of the resources. You can try this by clicking on the links in the footers of the sample resource tables on dataset pages.
481
-
482
- We want to get it in JSON format, so we'll add the .json extension. (See the [Formats section](#resource-formats)).
483
-
484
- The following HTML page uses JavaScript to request the data as JSON and add it to the <code>results</code> array. Note that as the [maximum page size](#resource-options-and-limits) is 1000, and there are over 1000 resources of that type in the dataset, we'll need to make multiple requests.
485
-
486
- %code.prettyprint.lang-js.pre
487
- = preserve do
488
- :escaped
489
- <!DOCTYPE html>
490
- <html>
491
- <head>
492
- <script src="http://code.jquery.com/jquery-1.9.1.min.js"></script>
493
- </head>
494
- <body>
495
- <script type="text/javascript">
496
- var perPage = 100;
497
- var typeUri = "http://purl.org/linked-data/cube#Observation";
498
- var dataset = "my/dataset";
499
-
500
- var baseUrl = "http://{site-domain}/resources.json?";
501
- baseUrl += "per_page=" + perPage.toString();
502
- baseUrl += "&dataset=" + encodeURIComponent(dataset);
503
- baseUrl += "&type_uri=" + encodeURIComponent(typeUri);
504
-
505
- var page = 1;
506
- var results = [];
507
-
508
- function callAjaxPaging() {
509
- console.log("trying page: " + page.toString());
510
- var url = baseUrl + "&page=" + page.toString();
511
-
512
- $.ajax({
513
- dataType: 'json',
514
- url: url,
515
- success: function(pageOfData) {
516
- results = results.concat(pageOfData);
517
- console.log("got " + results.length.toString() + " so far");
518
-
519
- if (pageOfData.length == perPage) {
520
- // this page was full. There might be more.
521
- page += 1;
522
- console.log("trying next page");
523
- callAjaxPaging();
524
- } else {
525
- // no more pages.
526
- alert('finished with ' + results.length.toString() + " results");
527
- }
528
- }
529
- });
530
- }
531
-
532
- alert('press OK to begin');
533
- callAjaxPaging();
534
- </script>
535
- </body>
536
- </html>
537
-
538
- .subsection#curl-resources-example
539
- %h3
540
- 2.9
541
- %strong Example: Using cURL to get the list of datasets in a theme
542
-
543
- :markdown
544
- Here we use the [cURL](http://curl.haxx.se/) command line program to get a list of datasets in a theme, as JSON-LD.
545
-
546
- Let's assume the theme's name is <code>my/theme</code>, so the URL we need to call is as follows. (See the [Collections of Datasets section](#dataset-collections)).
547
-
548
- <code class="block">http://{site-domain}/themes/my/theme</code>
549
-
550
- We'll use the Accept header to tell the server we want the response as JSON.
551
-
552
- %code.prettyprint.block
553
- curl -H "Accept: application/json" http://{site-domain}/themes/my/theme
554
-
555
- %section
556
- %h2 3 - SPARQL
557
-
558
- .subsection#sparql-introduction
559
- %h3
560
- 3.1
561
- %strong Introduction to SPARQL
562
- :markdown
563
-
564
- The most flexible way to access the data is by using SPARQL. Pronounced "sparkle", SPARQL stands for **S**PARQL **P**rotocol and **R**DF **Q**uery **L**anguage. It's a query language, analogous to SQL for relational databases, for retrieving and manipulating data from triple-stores like ours. We support [SPARQL 1.1](http://www.w3.org/TR/2013/REC-sparql11-query-20130321/) query syntax.
565
-
566
- To submit a SPARQL query from your code, issue an HTTP GET request to our **endpoint**:
567
-
568
- %code.block http://{site-domain}/sparql?query={URL-encoded query}
569
-
570
- %p For example, to run this simple query...
571
-
572
- %code.prettyprint.lang-sparql.pre
573
- = preserve do
574
- :escaped
575
- SELECT * WHERE {?s ?p ?o} LIMIT 10
576
-
577
- %p ...and get the results as JSON, you could GET the following URL (note the <code>.json</code> extension):
578
-
579
- %code.block http://{site-domain}/sparql.json?query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
580
-
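Building the encoded URL by hand is error-prone, so it is usually easier to let a library do it. The following is a minimal sketch using only Ruby's standard `uri` library; `sparql_url` is a hypothetical helper name and `example.com` stands in for the real site domain. Depending on the encoder, some characters (such as `*`) may be left unescaped, which decodes to the same query.

```ruby
require "uri"

# Hypothetical helper: build the GET URL for a SPARQL query.
# The format extension (e.g. ".json") selects the results format.
def sparql_url(site_domain, query, format: ".json")
  "http://#{site_domain}/sparql#{format}?" + URI.encode_www_form(query: query)
end

url = sparql_url("example.com", "SELECT * WHERE {?s ?p ?o} LIMIT 10")
puts url

# Decoding the query string recovers the original query text.
decoded = URI.decode_www_form(URI(url).query).to_h
puts decoded["query"]
```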
581
- :markdown
582
- See below for <a href="#sparql-results-formats">details of SPARQL Results Formats</a> and examples in a variety of languages (at the end of this section).
583
-
584
- .subsection#sparql-results-formats
585
- %h3
586
- 3.2
587
- %strong SPARQL Results Formats
588
-
589
- :markdown
590
- As with other aspects of our API, to get the data in different formats, you can use either format extensions or HTTP Accept headers.
591
-
592
- The available formats depend on the type of SPARQL query. A SPARQL query can be one of four main forms: <strong>SELECT</strong>, <strong>ASK</strong>, <strong>CONSTRUCT</strong> or <strong>DESCRIBE</strong>.
593
-
594
- %table
595
- %thead
596
- %tr
597
- %th Query Type
598
- %th Format
599
- %th Extension
600
- %th Accept Headers
601
- %tbody
602
- %tr
603
- %td.details(rowspan=4) SELECT
604
- %td xml
605
- %td .xml
606
- %td.hardwrap
607
- application/xml,
608
- application/sparql-results+xml
609
- %tr
610
- %td json
611
- %td .json
612
- %td.hardwrap
613
- application/json,
614
- application/sparql-results+json
615
- %tr
616
- %td text
617
- %td .txt, .text
618
- %td.hardwrap text/plain
619
- %tr
620
- %td csv
621
- %td .csv
622
- %td.hardwrap text/csv
623
- %tr
624
- %td.details(rowspan=3) ASK
625
- %td json
626
- %td .json
627
- %td.hardwrap
628
- application/json,
629
- application/sparql-results+json
630
- %tr
631
- %td xml
632
- %td .xml
633
- %td.hardwrap
634
- application/xml,
635
- application/sparql-results+xml
636
- %tr
637
- %td text
638
- %td .txt, .text
639
- %td.hardwrap text/plain
640
- %tr
641
- %td.details(rowspan=3) CONSTRUCT
642
- %td RDF/XML
643
- %td .rdf
644
- %td.hardwrap application/rdf+xml
645
- %tr
646
- %td N-triples
647
- %td .nt, .txt, .text
648
- %td.hardwrap
649
- text/plain,
650
- application/n-triples
651
- %tr
652
- %td Turtle
653
- %td .ttl
654
- %td.hardwrap text/turtle
655
- %tr
656
- %td.details(rowspan=3) DESCRIBE
657
- %td RDF/XML
658
- %td .rdf
659
- %td.hardwrap application/rdf+xml
660
- %tr
661
- %td N-triples
662
- %td .nt, .txt, .text
663
- %td.hardwrap
664
- text/plain,
665
- application/n-triples
666
- %tr
667
- %td Turtle
668
- %td .ttl
669
- %td.hardwrap text/turtle
670
-
671
- .subsection#sparql-results-pagination
672
- %h3
673
- 3.3
674
- %strong SPARQL Results Pagination
675
-
676
- :markdown
677
- We will accept <code>page</code> and <code>per_page</code> query-string parameters for paginating the results of SELECT queries.
678
-
679
- For requests made through the website (i.e. HTML format), the page size is defaulted to 20.
680
-
681
- For requests to our SPARQL endpoint for data formats (i.e. non-HTML), there will be no defaults for these parameters (i.e. results are unlimited).
682
-
683
- For SELECT queries, for convenience you can optionally pass the pagination parameters and we will use them to apply <code>LIMIT</code> and <code>OFFSET</code> clauses to the query. For other query types (i.e. DESCRIBE, CONSTRUCT, ASK), pagination like this doesn't make so much sense, so those parameters will be ignored.
684
-
685
- Please also refer to the [Response Size Limits](#response-size-limits) section below, and the examples at the end of this section.
686
-
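To make the relationship between the pagination parameters and the query clauses concrete, here is a minimal sketch of the conventional mapping. It assumes pages are numbered from 1 (so page 1 has an offset of 0); the exact server-side behaviour is not specified here, and `limit_and_offset` and `paginate_select` are hypothetical names.

```ruby
# Assumption: pages are 1-based, so page 1 corresponds to OFFSET 0.
def limit_and_offset(page, per_page)
  raise ArgumentError, "page and per_page must be positive" if page < 1 || per_page < 1

  { limit: per_page, offset: (page - 1) * per_page }
end

# Append the equivalent LIMIT and OFFSET clauses to a SELECT query.
def paginate_select(query, page, per_page)
  clauses = limit_and_offset(page, per_page)
  "#{query} LIMIT #{clauses[:limit]} OFFSET #{clauses[:offset]}"
end

puts paginate_select("SELECT * WHERE {?s ?p ?o}", 3, 20)
```

Under this assumption, requesting `page=3&per_page=20` is equivalent to adding `LIMIT 20 OFFSET 40` to the query yourself.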
687
- .subsection#sparql-errors
688
- %h3
689
- 3.4
690
- %strong SPARQL Errors
691
-
692
- :markdown
693
- If you request a data format (i.e. non-HTML) with a malformed SPARQL query, we will respond with HTTP status 400, with a message describing the error in the response body.
694
-
695
- Additionally, please note the [Response Size Limits](#response-size-limits), which apply to all API calls, as well as the [Errors](#errors) section.
696
-
697
- .subsection#sparql-json-p
698
- %h3
699
- 3.5
700
- %strong JSON-P
701
-
702
- :markdown
703
- If you're requesting SPARQL results as JSON, you can additionally pass a <code>callback</code> parameter and the results will be wrapped in that function. This is useful for getting around cross-domain issues if you're running JavaScript on older browsers. (Please also see the [Cross-Origin Resource Sharing](#cors) section).
704
-
705
- For example:
706
-
707
- %code.block
708
- http://{site-domain}/sparql.json?callback=myCallbackFunction&query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
709
-
710
- :markdown
711
- Or to make a JSON-P request with jQuery, you can omit the callback parameter from the url and just set the dataType to <code>jsonp</code>.
712
-
713
- %code.prettyprint.lang-js.pre
714
- = preserve do
715
- :escaped
716
- var queryUrl = 'http://{site-domain}/sparql.json?query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10';
717
-
718
- $.ajax({
719
- dataType: 'jsonp',
720
- url: queryUrl,
721
- success: function(data) {
722
- // callback code here
723
- alert('success!');
724
- }
725
- });
726
-
727
- .subsection#named-graphs
728
- %h3
729
- 3.6
730
- %strong Use of Named Graphs
731
-
732
- %h4
733
- 3.6.1
734
- %strong Dataset Data
735
-
736
- %p The data for each dataset is contained within a separate named graph. The dataset itself has a URI, for example
737
-
738
- %code.block http://{site-domain}/data/<strong>my/dataset</strong>
739
-
740
- %p The web page for the dataset lists the named graph that contains the dataset, in this case
741
-
742
- %code http://{site-domain}/graph/<strong>my/dataset</strong>
743
-
744
- %p The graph name for the dataset is contained in the dataset metadata, using a predicate called <code>http://publishmydata.com/def/dataset#graph</code> and can be obtained by a query like this:
745
-
746
- %code.prettyprint.lang-sparql.pre
747
- = preserve do
748
- :escaped
749
- SELECT ?graph
750
- WHERE {
751
- <http://example.com/data/my/dataset> <http://publishmydata.com/def/dataset#graph> ?graph
752
- }
753
-
754
- %h4
755
- 3.6.2
756
- %strong Dataset Metadata
757
-
758
- :markdown
759
- The metadata we store about the [dataset](#individual-datasets) itself (that is returned by dereferencing its URI) is stored in its own separate graph, for example:
760
-
761
- %code.block http://{site-domain}/graph/<strong>my/dataset</strong>/metadata
762
-
763
- :markdown
764
- We also use named graphs for each concept scheme and ontology.
765
-
766
- .subsection#parameter-interpolation
767
- %h3
768
- 3.7
769
- %strong Parameter Substitution
770
-
771
- %p You can parameterise your SPARQL by including %{tokens} in your queries, and providing values for the tokens on the url query string.
772
-
773
- %code.block http://{site-domain}/sparql?query=URL-encoded-SPARQL-query&token1=value-for-token1&token2=value-for-token2
774
-
775
- :markdown
776
- Note that the following tokens are reserved and cannot be used as parameters for substitution.
777
-
778
- - controller
779
- - action
780
- - page
781
- - per_page
782
- - id
783
- - commit
784
- - utf8
785
- - query
786
-
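A client can guard against the reserved names before sending a request. The following is a minimal sketch using only Ruby's standard `uri` library; `parameterised_sparql_url` is a hypothetical helper, `example.com` stands in for the real site domain, and the `type` token and its placement inside `<...>` in the query are assumptions made for illustration.

```ruby
require "uri"

# Token names that cannot be used for substitution (from the list above).
RESERVED_TOKENS = %w[controller action page per_page id commit utf8 query].freeze

# Hypothetical helper: build a SPARQL URL with values for %{tokens} in the query.
def parameterised_sparql_url(site_domain, query, tokens = {})
  reserved = tokens.keys.map(&:to_s) & RESERVED_TOKENS
  raise ArgumentError, "reserved token(s): #{reserved.join(', ')}" unless reserved.empty?

  params = { query: query }.merge(tokens)
  "http://#{site_domain}/sparql?" + URI.encode_www_form(params)
end

# The %{type} token is left in the query text; its value travels as a
# separate query-string parameter.
url = parameterised_sparql_url(
  "example.com",
  "SELECT * WHERE {?s a <%{type}>} LIMIT 10",
  type: "http://purl.org/linked-data/cube#Observation"
)
puts url
```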
787
-
788
-
789
-
790
- .subsection#ruby-sparql-example
791
- %h3
792
- 3.8
793
- %strong Example: Using Ruby to request data from the SPARQL Endpoint
794
-
795
- :markdown
796
- This sample Ruby makes a request to our SPARQL endpoint (as JSON) and then puts the result in a Hash.
797
-
798
- %code.prettyprint.lang-ruby.pre
799
- = preserve do
800
- :escaped
801
- require 'rest-client'
802
- require 'json'
803
-
804
- query = 'SELECT * WHERE {?s ?p ?o} LIMIT 10'
805
- site_domain = "example.com"
806
- url = "http://\#{site_domain}/sparql.json"
807
-
808
- results_str = RestClient.get url, {:params => {:query => query}}
809
- results_hash = JSON.parse results_str
810
- results_array = results_hash["results"]["bindings"]
811
-
812
- puts "total number of results: \#{results_array.length}"
813
-
814
- :markdown
815
- **Note**: If you're doing a lot of work with RDF in Ruby, you might want to look at using [Swirrl](http://swirrl.com)'s open-source SPARQL ORM for Ruby, [Tripod](http://github.com/swirrl/tripod).
816
-
817
- .subsection#js-sparql-example
818
- %h3
819
- 3.9
820
- %strong Example: Using JavaScript to request data from the SPARQL Endpoint
821
-
822
- :markdown
823
- This example HTML page uses jQuery to make a request to our SPARQL endpoint (as JSON), report the number of results and log the data.
824
-
825
- %code.prettyprint.lang-html.pre
826
- = preserve do
827
- :escaped
828
- <!DOCTYPE html>
829
- <html>
830
- <head>
831
- <script src="http://code.jquery.com/jquery-1.9.1.min.js"></script>
832
- </head>
833
- <body>
834
- <script type="text/javascript">
835
- var siteDomain = "example.com";
836
- var query = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
837
- var url = "http://" + siteDomain + "/sparql.json?query=";
838
- url += encodeURIComponent(query);
839
- $.ajax({
840
- dataType: 'json',
841
- url: url,
842
- success: function(data) {
843
- alert('success: ' + data.results.bindings.length + ' results');
844
- console.log(data);
845
- }
846
- });
847
- </script>
848
- </body>
849
- </html>
850
- :markdown
851
- **Note**: See the [Cross-Origin Resource Sharing](#cors) section for a note about accessing data from other domains.
852
- %section
853
- %h2 4 - General
854
-
855
- .subsection#response-size-limits
856
- %h3
857
- 4.1
858
- %strong Response Size Limits
859
-
860
- :markdown
861
- For all requests to our API, if the resulting database query causes more than 5MB of data to be returned, we will respond with HTTP status code 400, with a message in the response body including the phrase <code>Response too large</code>. Note that full pre-canned dumps of all datasets will be available (in zipped N-triples format) at URLs defined in the [dataset metadata](#individual-datasets).
862
-
863
- .subsection#errors
864
- %h3
865
- 4.2
866
- %strong Errors
867
-
868
- %table
869
- %thead
870
- %tr
871
- %th Error type
872
- %th HTTP status code
873
- %th Notes
874
- %tbody
875
- %tr
876
- %td Response too large
877
- %td 400
878
- %td We will include a text message in the response body including the phrase "Response too large."
879
- %tr
880
- %td SPARQL Syntax Error
881
- %td 400
882
- %td We will include a text message in the response body with details of the error.
883
- %tr
884
- %td Resource Not Found
885
- %td 404
886
- %td Returned if you request a resource or URL that doesn't exist
887
- %tr
888
- %td Not Acceptable
889
- %td 406
890
- %td Returned if you request a non-supported data format
891
- %tr
892
- %td Unexpected Errors
893
- %td 500
894
- %td
895
- %tr
896
- %td Query Timeouts
897
- %td 503
898
- %td The timeout for requesting data from our database will initially be set to 10 seconds.
899
- %br
900
-
901
- .subsection#cors
902
- %h3
903
- 4.3
904
- %strong Cross-Origin Resource Sharing (CORS)
905
- :markdown
906
- Our web server is configured to allow access from all domains (by adding the following line to our nginx configuration):
907
-
908
- <code class="block">add_header Access-Control-Allow-Origin "*";</code>
909
-
910
- :markdown
911
- This means that if you're writing JavaScript to request data from our server in to a web page hosted on another domain, your browser should check this header and allow it.
912
-
913
- ####**A Note about Browser Support for CORS:**
914
-
915
- Modern browsers (such as recent versions of Internet Explorer, Firefox, Chrome and Safari) have full CORS support. It is not supported in Internet Explorer 6 and 7. Versions 8 & 9 of Internet Explorer offer limited support. If you need to support older browsers, consider making requests for data via SPARQL, with [JSON-P](#sparql-json-p).
916
-
917
- .subsection#discontinued-datasets
918
- %h3
919
- 4.4
920
- %strong Discontinued Datasets
921
-
922
- :markdown
923
- A dataset can be marked as 'discontinued'. This approach is most often used in cases where a dataset uses an outdated vocabulary or outdated naming convention for URIs. This is similar to the concept of deprecation (in computer software).
924
-
925
- A discontinued dataset is assigned a type of <code>http://publishmydata.com/def/dataset#DeprecatedDataset</code> as well as the usual <code>http://publishmydata.com/def/dataset#Dataset</code>. The discontinued-status is indicated on the list of datasets in the user interface and on the individual dataset page.
926
-
927
- Optionally, a discontinued dataset may be replaced by another dataset. In this case a link to the new dataset appears on the dataset web page and the dataset metadata contains the triple:
928
-
929
- %code.pre
930
- = preserve do
931
- :escaped
932
- <{discontinued dataset URI}> <http://purl.org/dc/terms/isReplacedBy> <{new dataset URI}> .
933
-
934
- :markdown
935
- The contents of discontinued datasets are still available via SPARQL queries and other APIs. This allows us to update the way that data is represented without breaking external applications that use it. Discontinued datasets will generally be removed completely after some period of time. Information about planned deletion will be provided in the 'Further Information' section on the dataset web page.
936
-
937
-
938
- :javascript
939
- prettyPrint();
1
+ -# -------------------------------------------------------------------
2
+ -# FULL WIDTH INTRODUCTION
3
+ -# -------------------------------------------------------------------
4
+
5
+ - content_for :docs_intro do
6
+ %h1 Developer Documentation
7
+ %p This page describes the current version of our production API, which was deployed on 26th of June 2013.
8
+
9
+ -# -------------------------------------------------------------------
10
+ -# MAIN CONTENT
11
+ -#
12
+ -# Note that if you change a section title, you should check you
13
+ -# have also updated any inline links to it
14
+ -#
15
+ -# -------------------------------------------------------------------
16
+
17
+ = documentation_section "Linked Data API" do
18
+ = documentation_subsection "URI Dereferencing" do
19
+ %p Following the standard practices for Linked Data, we distinguish between a 'real-world' resource and documents about that resource. <strong>Identifiers (URIs)</strong> for the resources follow the pattern:
20
+ = codeblock "uri" do
21
+ http://{data-site-domain}/<strong>id</strong>/{...}
22
+ %p When you look them up you get redirected to the corresponding document about that thing. The <strong>document URLs</strong> follow the pattern:
23
+ = codeblock "uri" do
24
+ http://{data-site-domain}/<strong>doc</strong>/{...}
25
+ %p For example, for a resource identified by a URI:
26
+ = codeblock "uri" do
27
+ http://{data-site-domain}/<strong>id</strong>/my/resource
28
+ %p If you put it into your browser you get redirected, with an HTTP status code of 303 ("See Other"), to an HTML page about that resource:
29
+ = codeblock "uri" do
30
+ http://{data-site-domain}/<strong>doc</strong>/my/resource
31
+ %p
32
+ In cases where a URI identifies something that is essentially a document (an 'information resource'), we respond with a 200, as its URI and document page URL are one and the same. This includes
33
+ = docs_inline_link "datasets", "Individual Datasets"
34
+ as well as ontology terms and concept schemes.
35
+ = documentation_subsection "Resource Formats" do
36
+ %p You can specify what format you want the resulting document to be in. By default you get HTML in a human-readable form, but you can also ask for the document in one of several RDF formats: <strong>RDF/XML</strong>, <strong>N-triples</strong>, <strong>Turtle</strong> or <strong>JSON-LD</strong>.
37
+ %p There are two ways to specify which format you want: you can append a <strong>format extension</strong> to the document page's URL or you can use an <strong>HTTP Accept header</strong> with the resource's URI or document page's URL.
38
+ %table
39
+ %thead
40
+ %tr
41
+ %th Format
42
+ %th Extensions
43
+ %th Accept Headers
44
+ %tbody
45
+ %tr
46
+ %td.details RDF/XML
47
+ %td .rdf
48
+ %td.hardwrap application/rdf+xml
49
+ %tr
50
+ %td.details n-triples
51
+ %td .nt, .txt, .text
52
+ %td.hardwrap
53
+ application/n-triples,
54
+ text/plain
55
+ %tr
56
+ %td.details Turtle
57
+ %td .ttl
58
+ %td.hardwrap text/turtle
59
+ %tr
60
+ %td.details JSON-LD
61
+ %td .json
62
+ %td.hardwrap
63
+ application/ld+json,
64
+ application/json
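The table above can be read as a simple lookup. The following is a minimal Ruby sketch, not part of the API itself: the `FORMATS` constant and both helper names are hypothetical, `example.com` stands in for `{data-site-domain}`, and only the first extension and Accept header listed for each format are used.

```ruby
# Format table from this section: name => [URL extension, Accept header].
# Only the first extension and Accept header listed for each format are kept.
FORMATS = {
  rdfxml:   ['.rdf',  'application/rdf+xml'],
  ntriples: ['.nt',   'application/n-triples'],
  turtle:   ['.ttl',  'text/turtle'],
  jsonld:   ['.json', 'application/ld+json']
}.freeze

# Build the document page URL with a format extension appended.
# Extensions only apply to /doc/ URLs, not to /id/ identifiers.
def doc_url_with_extension(domain, path, format)
  extension, _accept = FORMATS.fetch(format)
  "http://#{domain}/doc/#{path}#{extension}"
end

# Return the Accept header value to send with either the /id/ or /doc/ URL.
def accept_header_for(format)
  _extension, accept = FORMATS.fetch(format)
  accept
end

puts doc_url_with_extension('example.com', 'my/resource', :turtle)
puts accept_header_for(:jsonld)
```

Either value can then be passed to whichever HTTP client you use; the two approaches should return the same document.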
65
+ = documentation_subsection "Example: Dereferencing URIs with Ruby" do
66
+ %p
67
+ Here's an example of dereferencing a URI using the
68
+ = link_to "RestClient", "http://rubydoc.info/gems/rest-client"
69
+ library. Similar approaches can be taken in other languages. This assumes you already have Ruby set up on your system. Also, if you don't already have it, you'll need to install the gem:
70
+ = codeblock "ruby" do
71
+ $ gem install rest-client
72
+ %p &hellip; and require it in your script.
73
+ = codeblock "ruby" do
74
+ require 'rest-client'
75
+ = documentation_subsubsection "Specifying the format in an accept header - in this case RDF/XML" do
76
+ %p If you're using the accept header, you can directly request the URI. This involves two requests, because doing an HTTP GET on the resource identifier gives you a 303 redirect to the appropriate document page. RestClient looks after that for you.
77
+ = codeblock "ruby" do
78
+ RestClient.get 'http://{data-site-domain}/id/my/resource', :accept=>'application/rdf+xml'
79
+ %p You can also request the document page directly:
80
+ = codeblock "ruby" do
81
+ RestClient.get 'http://{data-site-domain}/doc/my/resource', :accept=>'application/rdf+xml'
82
+ = documentation_subsubsection "Specifying the format as an extension - in this case JSON" do
83
+ %p If using an extension, you must request the document page directly (as '.json' is not part of the URI)
84
+ = codeblock "ruby" do
85
+ RestClient.get 'http://{data-site-domain}/doc/my/resource.json'
86
+ = documentation_subsection "Example: Dereferencing URIs with cURL" do
87
+ %p
88
+ Here's an example of dereferencing a URI using the widely available
89
+ = link_to "cURL", "http://curl.haxx.se"
90
+ command line program.
91
+ = documentation_subsubsection "Specifying the format in an accept header (in this case, Turtle)" do
92
+ %p If you're using the accept header, you can directly request the URI. This involves two requests, because doing an HTTP GET on the resource identifier gives you a 303 redirect to the appropriate document page. cURL looks after that for you if you use the <code>-L</code> option.
93
+ = codeblock "terminal" do
94
+ curl -L -H "Accept: text/turtle" http://{data-site-domain}/id/my/resource
95
+ %p You can also request the document page directly
96
+ = codeblock "terminal" do
97
+ curl -H "Accept: text/turtle" http://{data-site-domain}/doc/my/resource
98
+ = documentation_subsubsection "Specifying the format as an extension (in this case N-triples)" do
99
+ %p If using an extension, you must request the document page directly (as '.nt' is not part of the URI)
100
+ = codeblock "terminal" do
101
+ curl http://{data-site-domain}/doc/my/resource.nt
102
+
103
+ = documentation_section "Other Resource APIs" do
104
+ = documentation_subsection "Ways to access data" do
105
+ %p
106
+ Alongside the URI dereferencing we offer the following additional ways of accessing data in the system. Please be sure to read the
107
+ =docs_inline_link "Options and Limits", "Options and Limits"
108
+ section, for some background information which applies to all these APIs, such as details on data formats and pagination.
109
+ %p Some examples of accessing the data from our APIs using different languages follow at the end of this section.
110
+ = documentation_subsection "Individual Datasets" do
111
+ %p Dataset identifiers take the form
112
+ = codeblock "uri" do
113
+ http://example.com/data/{dataset-short-name}
114
+ %p where <code>{dataset-short-name}</code> is a URI section that uniquely identifies the dataset. The short name can contain lower-case letters, numbers, slashes, and hyphens.
115
+ %p Dereferencing a dataset identifier responds with HTTP status code 200 and provides metadata about the dataset, including a link to where the dataset contents can be downloaded. e.g.:
116
+ = codeblock "uri" do
117
+ http://{data-site-domain}/data/my/dataset
118
+ %p
119
+ Please also see the
120
+ = docs_inline_link "Use of Named Graphs", "Use of Named Graphs"
121
+ section, for how the dataset data and metadata are stored in the database.
122
+ = documentation_subsection "Themes" do
123
+ %p Datasets are grouped into Themes. A list of all themes is available at:
124
+ = codeblock "uri" do
125
+ http://{data-site-domain}/themes
126
+ %p
127
+ Information about a particular theme can be accessed by
128
+ = docs_inline_link "dereferencing", "URI Dereferencing"
129
+ the theme's URI. e.g.
130
+ = codeblock "uri" do
131
+ http://{data-site-domain}/def/concept/themes/my/theme
132
+ = documentation_subsection "Collections of Datasets" do
133
+ %p A list of all datasets is available at:
134
+ = codeblock "uri" do
135
+ http://{data-site-domain}/data
136
+ %p
137
+ =docs_inline_link "paginatable", "Options and Limits"
138
+ with <code>page</code> and <code>per_page</code>.
139
+ %p Lists of datasets in a single theme are available at:
140
+ = codeblock "uri" do
141
+ http://{data-site-domain}/themes/{theme-name}
142
+ %p where <code>{theme-name}</code> is the part of the theme URI after <code>/themes/</code>
143
+ = documentation_subsection "Individual Resources" do
144
+ %p
145
+ As well as using
146
+ = docs_inline_link "dereferencing", "URI Dereferencing"
147
+ to access information about individual resources, you can use the following URL pattern:
148
+ = codeblock "uri" do
149
+ http://{data-site-domain}/resource?uri={resource-uri}
150
+ %p This is especially useful for resources for which we have information in our database, but which aren't in the site's domain (i.e. so you can't dereference them in this site). e.g.
151
+ = codeblock "uri" do
152
+ http://{data-site-domain}/resource?uri=http://another.domain/id/external/resource
153
+ %p If using a format extension to request a particular format for the resource, the extension is added immediately after '/resource'. For example, to get a JSON-LD version of the above resource:
154
+ = codeblock "uri" do
155
+ http://{data-site-domain}/resource.json?uri={resource-uri}
156
+ = documentation_subsection "Collections of Resources" do
157
+ %p Collections of resources can be retrieved from <code>/resources</code> by supplying filters. For now, we just support filters for <code>dataset</code> and <code>type_uri</code>.
158
+ %table
159
+ %thead
160
+ %tr
161
+ %th Filter parameter
162
+ %th Expected value
163
+ %th Behaviour
164
+ %tbody
165
+ %tr
166
+ %td.details dataset
167
+ %td
168
+ The <span style="font-style:italic">short name</span> of a dataset (see
169
+ = docs_inline_link "above", "Individual Datasets"
170
+ ).
171
+ %td Filters the results to only include resources in the named graph of that dataset.
172
+ %tr
173
+ %td.details type_uri
174
+ %td The URI of a resource type.
175
+ %td Filters the results to only include resources of the type identified by that URI.
176
+ %p e.g.
177
+ = codeblock "uri" do
178
+ http://{data-site-domain}/resources?dataset={dataset-name}&type_uri={URL-encoded type URI}
179
+ = codeblock "uri" do
180
+ http://{data-site-domain}/resources?dataset=my-dataset&type_uri=http%3A%2F%2Fexample.com%2Fdef%2Fmy%2Ftype
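As a minimal sketch of building such a URL from Ruby, the standard library's `URI.encode_www_form` can do the URL-encoding of the filter values. The `resources_url` helper is hypothetical and `example.com` stands in for `{data-site-domain}`.

```ruby
require 'uri'

# Build a /resources URL with URL-encoded dataset and type_uri filters.
# `format` is an optional extension such as '.nt' or '.json'.
def resources_url(domain, dataset:, type_uri:, format: nil)
  query = URI.encode_www_form(dataset: dataset, type_uri: type_uri)
  "http://#{domain}/resources#{format}?#{query}"
end

puts resources_url('example.com',
                   dataset: 'my-dataset',
                   type_uri: 'http://example.com/def/my/type')
```

Passing `format: '.nt'` (or another supported extension) places the extension immediately after `/resources`, before the query string.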
181
+ = documentation_subsection "Options and Limits" do
182
+ = documentation_subsubsection "Formats" do
183
+ %p
184
+ Resources accessed via our resource APIs can be accessed in the same
185
+ = docs_inline_link "choice of formats", "Resource Formats"
186
+ as for URI dereferencing (via both <strong>format extensions</strong> or <strong>HTTP Accept headers</strong>).
187
+ = documentation_subsubsection "Pagination" do
188
+ %p For any APIs which return collections of things, the list can be paginated using <code>page</code> (default 1) and <code>per_page</code> (default 1000) query-string parameters. The maximum allowable page size will initially be set to 1000, but we may consider increasing this (as well as the default) in the future.
189
+ = documentation_subsubsection "Response Size Limits" do
190
+ %p
191
+ All requests to our APIs are subject to the
192
+ = docs_inline_link "response size limits.", "Response Size Limits"
193
+ = documentation_subsection "Example: Using Ruby to get a filtered list of resources" do
194
+ = documentation_subsubsection "Basic Example" do
195
+ %p Here we use Ruby to retrieve a list of all resources of a type in a dataset as N-triples.
196
+ %p
197
+ Let's assume the short name for that dataset is <code>my/dataset</code>, and the URI for the type is <code>http://purl.org/linked-data/cube#Observation</code>, so the URL we need to call is as follows. (See
198
+ = docs_inline_link "the Collections of Resources section", "Collections of Resources"
199
+ ).
200
+ = codeblock "uri" do
201
+ http://{site-domain}/resources?dataset=my%2Fdataset&type_uri=http%3A%2F%2Fpurl.org%2Flinked-data%2Fcube%23Observation
202
+ %p If you visited that URL in your browser (substituting the site domain, dataset name and type uri for real values), you'd see a paginated list of the resources. You can try this by clicking on the links in the footers of the sample resource tables on dataset pages.
203
+ %p
204
+ We want to get it in N-triples format, so we'll add the .nt extension. (See the
205
+ = docs_inline_link "Formats section", "Resource Formats"
206
+ ).
207
+ %p
208
+ The following Ruby code assigns a string of N-triples into the <code>ntriples_data</code> variable. Note that as the
209
+ = docs_inline_link "maximum page size", "Options and Limits"
210
+ is 1000, and there are over 1000 resources of that type in the dataset, we'll need to make multiple requests.
211
+ %p
212
+ We use the
213
+ = link_to "RestClient", "http://rubydoc.info/gems/rest-client"
214
+ here, which you can install with <code>$ gem install rest-client</code>.
215
+ = codeblock_pre "ruby" do
216
+ = preserve do
217
+ :escaped
218
+ require 'rest-client'
219
+
220
+ url = "http://{site-domain}/resources.nt"
221
+
222
+ ntriples_data = ""
223
+ page = 1
224
+ done = false
225
+
226
+ while !done
227
+ puts "requesting page \#{page}..."
228
+ response = RestClient.get url, {:params =>
229
+ {
230
+ :page => page,
231
+ :per_page => 1000,
232
+ :dataset => "my/dataset",
233
+ :type_uri => "http://purl.org/linked-data/cube#Observation"
234
+ }
235
+ }
236
+
237
+ if response.length > 0
238
+ ntriples_data += response
239
+ page += 1
240
+ else
241
+ puts "no more data"
242
+ done = true
243
+ end
244
+ end
245
+
246
+ puts "data:"
247
+ puts ntriples_data
248
+ = documentation_subsubsection "Extension: parsing the n-triples into an array of statements" do
249
+ %p
250
+ The
251
+ = link_to "ruby-rdf", "http://rubydoc.info/github/ruby-rdf/rdf/master/"
252
+ library is useful for parsing various RDF formats. Install it with <code>$ gem install rdf</code>. The following code reads our string of N-triples data into an array of <code>RDF::Statement</code>s.
253
+ = codeblock_pre "ruby" do
254
+ = preserve do
255
+ :escaped
256
+ require 'rdf'
257
+
258
+ statements = []
259
+ RDF::Reader.for(:ntriples).new(ntriples_data) {|r| r.each {|s| statements << s}}
260
+
261
+ puts "parsed \#{statements.length} triples"
262
+ %p
263
+ <strong>Note</strong>: If you're doing a lot of work with RDF in Ruby, you might want to look at using
264
+ = link_to "Swirrl", "http://swirrl.com"
265
+ 's open-source SPARQL ORM for Ruby,
266
+ = link_to "Tripod.", "http://github.com/swirrl/tripod"
267
+ = documentation_subsection "Example: Using JavaScript to get a filtered list of resources" do
268
+ %p Here we use jQuery to retrieve a list of all the resources of a certain type in a dataset, as JSON-LD.
269
+ %p
270
+ Let's assume the short name for that dataset is <code>my/dataset</code>, and the URI for the type is <code>http://purl.org/linked-data/cube#Observation</code>, so the URL we need to call is as follows. (See
271
+ = docs_inline_link "the Collections of Resources section", "Collections of Resources"
272
+ ).
273
+ = codeblock "uri" do
274
+ http://{site-domain}/resources?dataset=my%2Fdataset&type_uri=http%3A%2F%2Fpurl.org%2Flinked-data%2Fcube%23Observation
275
+ %p If you visited that URL in your browser (substituting the site domain, dataset name and type uri for real values), you'd see a paginated list of the resources. You can try this by clicking on the links in the footers of the sample resource tables on dataset pages.
276
+ %p
277
+ We want to get it in JSON format, so we'll add the .json extension. (See the
278
+ = docs_inline_link "Formats section", "Resource Formats"
279
+ ).
280
+ %p
281
+ The following HTML page uses JavaScript to request the data as JSON and add it to the <code>results</code> array. Note that as the
282
+ = docs_inline_link "maximum page size", "Options and Limits"
283
+ is 1000, and there are over 1000 resources of that type in the dataset, we'll need to make multiple requests.
284
+ = codeblock_pre "javascript" do
285
+ = preserve do
286
+ :escaped
287
+ <!DOCTYPE html>
288
+ <html>
289
+ <head>
290
+ <script src="http://code.jquery.com/jquery-1.9.1.min.js"></script>
291
+ </head>
292
+ <body>
293
+ <script type="text/javascript">
294
+ var perPage = 100;
295
+ var typeUri = "http://purl.org/linked-data/cube#Observation";
296
+ var dataset = "my/dataset";
297
+
298
+ var baseUrl = "http://{site-domain}/resources.json?"
299
+ baseUrl += "per_page=" + perPage.toString();
300
+ baseUrl += "&dataset=" + encodeURIComponent(dataset);
301
+ baseUrl += "&type_uri=" + encodeURIComponent(typeUri);
302
+
303
+ var page = 1;
304
+ var results = [];
305
+
306
+ function callAjaxPaging() {
307
+ console.log("trying page: " + page.toString());
308
+ var url = baseUrl + "&page=" + page.toString();
309
+
310
+ $.ajax({
311
+ dataType: 'json',
312
+ url: url,
313
+ success: function(pageOfData) {
314
+ results = results.concat(pageOfData);
315
+ console.log("got " + results.length.toString() + " so far");
316
+
317
+ if (pageOfData.length == perPage) {
318
+ // this page was full. There might be more.
319
+ page += 1;
320
+ console.log("trying next page");
321
+ callAjaxPaging();
322
+ } else {
323
+ // no more pages.
324
+ alert('finished with ' + results.length.toString() + " results");
325
+ }
326
+ }
327
+ });
328
+ }
329
+
330
+ alert('press OK to begin');
331
+ callAjaxPaging();
332
+ </script>
333
+ </body>
334
+ </html>
335
+ = documentation_subsection "Example: Using cURL to get the list of datasets in a theme" do
336
+ %p
337
+ Here we use the
338
+ = link_to "cURL", "http://curl.haxx.se"
339
+ command line program to get a list of datasets in a theme, as JSON-LD.
340
+ %p
341
+ Let's assume the theme's name is <code>my/theme</code>, so the URL we need to call is as follows. (See the
342
+ =docs_inline_link "Collections of Datasets section", "Collections of Datasets"
343
+ ).
344
+ = codeblock "uri" do
345
+ http://{site-domain}/themes/my/theme
346
+ %p We'll use the Accept header to tell the server we want the response as JSON.
347
+ = codeblock "terminal" do
348
+ curl -H "Accept: application/json" http://{site-domain}/themes/my/theme
349
+
350
+ = documentation_section "SPARQL" do
351
+ = documentation_subsection "Introduction to SPARQL" do
352
+ %p
353
+ The most flexible way to access the data is by using SPARQL. Pronounced "sparkle", SPARQL stands for <strong>S</strong>PARQL <strong>P</strong>rotocol and <strong>R</strong>DF <strong>Q</strong>uery <strong>L</strong>anguage. It's a query language, analogous to SQL for relational databases, for retrieving and manipulating data from triple-stores like ours. We support
354
+ = link_to "SPARQL 1.1", "http://www.w3.org/TR/2013/REC-sparql11-query-20130321/"
355
+ query syntax.
356
+ %p To submit a SPARQL query from your code, issue an HTTP GET request to our <strong>endpoint</strong>:
357
+ = codeblock "uri" do
358
+ http://{site-domain}/sparql?query={URL-encoded query}
359
+ %p For example, to run this simple query...
360
+ = codeblock_pre "sparql" do
361
+ = preserve do
362
+ :escaped
363
+ SELECT * WHERE {?s ?p ?o} LIMIT 10
364
+
365
+ %p ...and get the results as JSON, you could GET the following URL (note the <code>.json</code> extension):
366
+ = codeblock "uri" do
367
+ http://{site-domain}/sparql.json?query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
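To show how such a URL can be produced, here is a minimal Ruby sketch using only the standard library. The `sparql_url` helper is hypothetical and `example.com` stands in for `{site-domain}`. Note that Ruby's encoder may leave `*` unescaped where the URL above shows `%2A`; both forms decode to the same query.

```ruby
require 'uri'

# Build a SPARQL endpoint URL with the query URL-encoded.
# `format` is an optional extension such as '.json' or '.csv'.
def sparql_url(domain, query, format: nil)
  "http://#{domain}/sparql#{format}?#{URI.encode_www_form(query: query)}"
end

query = 'SELECT * WHERE {?s ?p ?o} LIMIT 10'
url = sparql_url('example.com', query, format: '.json')
puts url

# Decoding the query string recovers the original SPARQL text.
decoded = URI.decode_www_form(url.split('?', 2).last)
puts decoded.first.last
```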
368
+ = documentation_subsection "SPARQL Results formats" do
369
+ %p As with other aspects of our API, to get the data in different formats, you can use either format extensions or HTTP Accept headers.
370
+ %p The available formats depend on the type of SPARQL query. A SPARQL query can be one of four main forms: <strong>SELECT</strong>, <strong>ASK</strong>, <strong>CONSTRUCT</strong> or <strong>DESCRIBE</strong>.
371
+ %table
372
+ %thead
373
+ %tr
374
+ %th Query Type
375
+ %th Format
376
+ %th Extension
377
+ %th Accept Headers
378
+ %tbody
379
+ %tr
380
+ %td.details(rowspan=4) SELECT
381
+ %td xml
382
+ %td .xml
383
+ %td.hardwrap
384
+ application/xml,
385
+ application/sparql-results+xml
386
+ %tr
387
+ %td json
388
+ %td .json
389
+ %td.hardwrap
390
+ application/json,
391
+ application/sparql-results+json
392
+ %tr
393
+ %td text
394
+ %td .txt, .text
395
+ %td.hardwrap text/plain
396
+ %tr
397
+ %td csv
398
+ %td .csv
399
+ %td.hardwrap text/csv
400
+ %tr
401
+ %td.details(rowspan=3) ASK
402
+ %td json
403
+ %td .json
404
+ %td.hardwrap
405
+ application/json,
406
+ application/sparql-results+json
407
+ %tr
408
+ %td xml
409
+ %td .xml
410
+ %td.hardwrap
411
+ application/xml,
412
+ application/sparql-results+xml
413
+ %tr
414
+ %td text
415
+ %td .txt, .text
416
+ %td.hardwrap text/plain
417
+ %tr
418
+ %td.details(rowspan=3) CONSTRUCT
419
+ %td RDF/XML
420
+ %td .rdf
421
+ %td.hardwrap application/rdf+xml
422
+ %tr
423
+ %td N-triples
424
+ %td .nt, .txt, .text
425
+ %td.hardwrap
426
+ text/plain,
427
+ application/n-triples
428
+ %tr
429
+ %td Turtle
430
+ %td .ttl
431
+ %td.hardwrap text/turtle
432
+ %tr
433
+ %td.details(rowspan=3) DESCRIBE
434
+ %td RDF/XML
435
+ %td .rdf
436
+ %td.hardwrap application/rdf+xml
437
+ %tr
438
+ %td N-triples
439
+ %td .nt, .txt, .text
440
+ %td.hardwrap
441
+ text/plain,
442
+ application/n-triples
443
+ %tr
444
+ %td Turtle
445
+ %td .ttl
446
+ %td.hardwrap text/turtle
447
+ = documentation_subsection "SPARQL Results Pagination" do
448
+ %p We will accept <code>page</code> and <code>per_page</code> query-string parameters for paginating the results of SELECT queries.
449
+ %p For requests made through the website (i.e. HTML format), the page size defaults to 20.
450
+ %p For requests to our SPARQL endpoint for data formats (i.e. non-HTML), there will be no defaults for these parameters (i.e. results are unlimited).
451
+ %p For SELECT queries, for convenience you can optionally pass the pagination parameters and we will use them to apply <code>LIMIT</code> and <code>OFFSET</code> clauses to the query. For other query types (i.e. DESCRIBE, CONSTRUCT, ASK), pagination like this doesn't make so much sense, so those parameters will be ignored.
452
+ %p
453
+ Please also refer to the
454
+ = docs_inline_link "Response Size Limits", "Response Size Limits"
455
+ section below, and the examples at the end of this section.
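The relationship between the pagination parameters and the <code>LIMIT</code> and <code>OFFSET</code> clauses can be sketched as follows. This is an illustration only: the exact mapping used by the server is an assumption here (pages numbered from 1, with the offset being the number of rows on earlier pages), and `paginated_select` is a hypothetical helper.

```ruby
# Append LIMIT and OFFSET clauses to a SELECT query for a given page.
# Assumes pages are numbered from 1, so page 1 has an offset of 0.
def paginated_select(query, page:, per_page:)
  offset = (page - 1) * per_page
  "#{query} LIMIT #{per_page} OFFSET #{offset}"
end

puts paginated_select('SELECT * WHERE {?s ?p ?o}', page: 3, per_page: 20)
```

Under that assumption, requesting `page=3&per_page=20` would be comparable to running the query with `LIMIT 20 OFFSET 40` yourself.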
456
+ = documentation_subsection "SPARQL Errors" do
457
+ %p If you make a SPARQL request with a malformed query in a data format (i.e. non-HTML), then we will respond with HTTP status 400, with a helpful message in the response.
458
+ %p
459
+ Additionally, please note the
460
+ = docs_inline_link "Response Size Limits", "Response Size Limits"
461
+ , which apply to all API calls, as well as the
462
+ = docs_inline_link "Errors", "Errors"
463
+ section.
464
+ = documentation_subsection "JSON-P" do
465
+ %p
466
+ If you're requesting SPARQL results as JSON, you can additionally pass a <code>callback</code> parameter and the results will be wrapped in that function. This is useful for getting around cross-domain issues if you're running JavaScript on older browsers. (Please also see the
467
+ = docs_inline_link "Cross-Origin Resource Sharing", "Cross-Origin Resource Sharing (CORS)"
468
+ section). For example:
469
+ = codeblock "uri" do
470
+ http://{site-domain}/sparql.json?callback=myCallbackFunction&query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10
471
+ %p Or to make a JSON-P request with jQuery, you can omit the callback parameter from the url and just set the dataType to <code>jsonp</code>.
472
+ = codeblock_pre "javascript" do
473
+ = preserve do
474
+ :escaped
475
+ var queryUrl = 'http://{site-domain}/sparql.json?query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10';
476
+
477
+ $.ajax({
478
+ dataType: 'jsonp',
479
+ url: queryUrl,
480
+ success: function(data) {
481
+ // callback code here
482
+ alert('success!');
483
+ }
484
+ });
485
+ = documentation_subsection "Use of Named Graphs" do
486
+ = documentation_subsubsection "Dataset Data" do
487
+ %p The data for each dataset is contained within a separate named graph. The dataset itself has a URI, for example
488
+ = codeblock "uri" do
489
+ http://{site-domain}/data/<strong>my/dataset</strong>
490
+ %p The web page for the dataset lists the named graph that contains the dataset, in this case
491
+ = codeblock "uri" do
492
+ http://{site-domain}/graph/<strong>my/dataset</strong>
493
+ %p The graph name for the dataset is contained in the dataset metadata, using a predicate called <code>http://publishmydata.com/def/dataset#graph</code> and can be obtained by a query like this:
494
+ = codeblock_pre "sparql" do
495
+ = preserve do
496
+ :escaped
497
+ SELECT ?graph
498
+ WHERE {
499
+ <http://example.com/data/my/dataset> <http://publishmydata.com/def/dataset#graph> ?graph
500
+ }
501
+ = documentation_subsubsection "Dataset Metadata" do
502
+ %p
503
+ The metadata we store about the
504
+ = docs_inline_link "dataset", "Individual Datasets"
505
+ itself (that is returned by dereferencing its URI), is stored in its own separate graph, for example:
506
+ = codeblock "uri" do
507
+ http://{site-domain}/graph/<strong>my/dataset</strong>/metadata
508
+ %p We also use named graphs for each concept scheme and ontology.
509
+ = documentation_subsection "Parameter Substitution" do
510
+ %p You can parameterise your SPARQL by including <code>%{tokens}</code> in your queries, and providing values for the tokens on the url query string.
511
+ = codeblock "uri" do
512
+ http://{site-domain}/sparql?query=URL-encoded-SPARQL-query&token1=value-for-token1&token2=value-for-token2
513
+ %p Note that the following tokens are reserved and cannot be used as parameters for substitution.
514
+ %ul
515
+ %li controller
516
+ %li action
517
+ %li page
518
+ %li per_page
519
+ %li id
520
+ %li commit
521
+ %li utf8
522
+ %li query
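A minimal Ruby sketch of building a parameterised request follows. The helper name, the example query and the token names `type` and `limit` are all hypothetical, and `example.com` stands in for `{site-domain}`. The check against the reserved list mirrors the tokens listed above; how the server quotes or validates substituted values is not covered here.

```ruby
require 'uri'

# Tokens listed above as reserved; they cannot be used for substitution.
RESERVED_TOKENS = %w[controller action page per_page id commit utf8 query].freeze

# Build a SPARQL URL carrying a query template plus one value per token.
def parameterised_sparql_url(domain, template, tokens)
  clashes = tokens.keys.map(&:to_s) & RESERVED_TOKENS
  raise ArgumentError, "reserved tokens: #{clashes.join(', ')}" unless clashes.empty?

  params = { query: template }.merge(tokens)
  "http://#{domain}/sparql?#{URI.encode_www_form(params)}"
end

template = 'SELECT * WHERE {?s a <%{type}>} LIMIT %{limit}'
tokens = { type: 'http://example.com/def/my/type', limit: 10 }

puts parameterised_sparql_url('example.com', template, tokens)

# Ruby's own format uses the same %{token} syntax, so it can preview
# locally what the substituted query should look like.
puts format(template, tokens)
```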
523
+ = documentation_subsection "Example: Using Ruby to request data from the SPARQL Endpoint" do
524
+ %p This sample Ruby makes a request to our SPARQL endpoint (as JSON) and then puts the result in a Hash.
525
+ = codeblock_pre "ruby" do
526
+ = preserve do
527
+ :escaped
528
+ require 'rest-client'
529
+ require 'json'
530
+
531
+ query = 'SELECT * WHERE {?s ?p ?o} LIMIT 10'
532
+ site_domain = "example.com"
533
+ url = "http://\#{site_domain}/sparql.json"
534
+
535
+ results_str = RestClient.get url, {:params => {:query => query}}
536
+ results_hash = JSON.parse results_str
537
+ results_array = results_hash["results"]["bindings"]
538
+
539
+ puts "total number of results: \#{results_array.length}"
540
+
541
+ %p
542
+ <strong>Note</strong>: If you're doing a lot of work with RDF in Ruby, you might want to look at using
543
+ = link_to "Swirrl", "http://swirrl.com"
544
+ 's open-source SPARQL ORM for Ruby,
545
+ = link_to "Tripod.", "http://github.com/swirrl/tripod"
546
+
547
+ = documentation_subsection "Example: Using JavaScript to request data from the SPARQL Endpoint" do
548
+ %p This example HTML page uses jQuery to make a request to our SPARQL endpoint.
549
+ = codeblock_pre "javascript" do
550
+ = preserve do
551
+ :escaped
552
+ <!DOCTYPE html>
553
+ <html>
554
+ <head>
555
+ <script src="http://code.jquery.com/jquery-1.9.1.min.js"></script>
556
+ </head>
557
+ <body>
558
+ <script type="text/javascript">
559
+ var siteDomain = "example.com"
560
+ var query = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
561
+ var url = "http://" + siteDomain + "/sparql.json?query=";
562
+ url += encodeURIComponent(query);
563
+ $.ajax({
564
+ dataType: 'json',
565
+ url: url,
566
+ success: function(data) {
567
+ alert('success: ' + data.results.bindings.length + ' results');
568
+ console.log(data);
569
+ }
570
+ });
571
+ </script>
572
+ </body>
573
+ </html>
574
+ %p
575
+ <strong>Note</strong>: See the
576
+ = docs_inline_link "Cross-Origin Resource Sharing", "Cross-Origin Resource Sharing (CORS)"
577
+ section for a note about accessing data from other domains.
578
+
579
+ = documentation_section "General" do
580
+ = documentation_subsection "Response Size Limits" do
581
+ %p
582
+ For all requests to our API, if the request causes a database query that returns more than 5MB of data, we will respond with HTTP status code 400, with a message in the response body including the phrase <code>Response too large</code>. Note that full pre-canned dumps of all datasets will be available (in zipped n-triples format) at URLs defined in the
583
+ = docs_inline_link "dataset metadata.", "Individual Datasets"
584
+ = documentation_subsection "Errors" do
585
+ %table
586
+ %thead
587
+ %tr
588
+ %th Error type
589
+ %th HTTP status code
590
+ %th Notes
591
+ %tbody
592
+ %tr
593
+ %td Response too large
594
+ %td 400
595
+ %td We will include a text message in the response body including the phrase "Response too large."
596
+ %tr
597
+ %td SPARQL Syntax Error
598
+ %td 400
599
+ %td We will include a text message in the response body with details of the error.
600
+ %tr
601
+ %td Resource Not Found
602
+ %td 404
603
+ %td Returned if you request a resource or URL that doesn't exist.
604
+ %tr
605
+ %td Not Acceptable
606
+ %td 406
607
+ %td Returned if you request an unsupported data format.
608
+ %tr
609
+ %td Unexpected Errors
610
+ %td 500
611
+ %td
612
+ %tr
613
+ %td Query Timeouts
614
+ %td 503
615
+ %td The timeout for requesting data from our database will initially be set to 10 seconds.
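The table above can be turned into a small client-side helper. The following is a minimal sketch, not part of the API: the function name `classifyApiError` is hypothetical, and it assumes that any 400 response that does not contain the phrase "Response too large" is a SPARQL syntax error, since those are the only two 400 cases listed.

```javascript
// Hypothetical helper: map an HTTP status code and response body
// to one of the error types listed in the table above.
function classifyApiError(status, body) {
  var text = String(body || '');
  switch (status) {
    case 400:
      // Assumption: the table lists only two causes of a 400, so anything
      // without the "Response too large" phrase is treated as a syntax error.
      return text.indexOf('Response too large') !== -1
        ? 'Response too large'
        : 'SPARQL Syntax Error';
    case 404:
      return 'Resource Not Found';
    case 406:
      return 'Not Acceptable';
    case 500:
      return 'Unexpected Error';
    case 503:
      return 'Query Timeout';
    default:
      return 'Unknown';
  }
}
```

With jQuery, this could be called from an `error` callback as `classifyApiError(jqXHR.status, jqXHR.responseText)`.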
616
+ = documentation_subsection "Cross-Origin Resource Sharing (CORS)" do
617
+ %p Our web server is configured to allow access from all domains (by adding the following line to our nginx configuration):
618
+ = codeblock_pre "text" do
619
+ add_header Access-Control-Allow-Origin "*";
620
+ %p This means that if you're writing JavaScript that requests data from our server from a web page hosted on another domain, your browser should check this header and allow the request.
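To illustrate the check the browser performs, here is a simplified sketch. Browsers do this for you, so you never need to write it yourself; the function name `originAllowed` is hypothetical, and the logic ignores details of the real CORS algorithm such as credentialed requests and preflight.

```javascript
// Simplified model of the browser's Access-Control-Allow-Origin check.
// allowOriginHeader: the header value sent by the server (or undefined if absent).
// pageOrigin: the origin of the page making the request, e.g. "http://my-site.example".
function originAllowed(allowOriginHeader, pageOrigin) {
  if (typeof allowOriginHeader !== 'string') {
    return false; // no header: cross-origin reads are blocked
  }
  var value = allowOriginHeader.replace(/^\s+|\s+$/g, '');
  // "*" allows every origin; otherwise the value must match the page's origin exactly.
  return value === '*' || value === pageOrigin;
}
```

Because our server sends `*`, this check passes for a page served from any origin.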
621
+ = documentation_subsubsection "A Note about Browser Support for CORS" do
622
+ %p
623
+ Modern browsers (such as recent versions of Internet Explorer, Firefox, Chrome and Safari) have full CORS support. CORS is not supported in Internet Explorer 6 and 7, and versions 8 and 9 of Internet Explorer offer only limited support. If you need to support older browsers, consider making requests for data via SPARQL, with
624
+ = docs_inline_link "JSON-P.", "JSON-P"
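If you need to decide at runtime whether to fall back to JSON-P, a common feature-detection approach looks roughly like the sketch below. The function name `detectCorsSupport` and its return values are hypothetical; the checks rely on the `withCredentials` property of `XMLHttpRequest` (present where CORS is fully supported) and on the `XDomainRequest` object that Internet Explorer 8 and 9 provide.

```javascript
// Hypothetical helper: report the level of CORS support in an environment.
// env is the global object to inspect (pass `window` in a browser).
function detectCorsSupport(env) {
  // XMLHttpRequest objects with a withCredentials property support CORS.
  if (env.XMLHttpRequest && 'withCredentials' in new env.XMLHttpRequest()) {
    return 'full';
  }
  // Internet Explorer 8 and 9 expose XDomainRequest instead (limited support).
  if (typeof env.XDomainRequest !== 'undefined') {
    return 'limited';
  }
  // No CORS support: fall back to JSON-P.
  return 'none';
}
```

In a browser, call `detectCorsSupport(window)` and use JSON-P when the result is `'none'`.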
625
+ = documentation_subsection "Discontinued Datasets" do
626
+ %p A dataset can be marked as 'discontinued'. This approach is most often used in cases where a dataset uses an outdated vocabulary or outdated naming convention for URIs. This is similar to the concept of deprecation (in computer software).
627
+ %p A discontinued dataset is assigned a type of <code>http://publishmydata.com/def/dataset#DeprecatedDataset</code> as well as the usual <code>http://publishmydata.com/def/dataset#Dataset</code>. The discontinued status is indicated on the list of datasets in the user interface and on the individual dataset page.
628
+ %p Optionally, a discontinued dataset may be replaced by another dataset. In this case a link to the new dataset appears on the dataset web page and the dataset metadata contains the triple:
629
+ = codeblock "rdf" do
630
+ <{discontinued dataset URI}> <http://purl.org/dc/terms/isReplacedBy> <{new dataset URI}> .
631
+ %p The contents of discontinued datasets are still available via SPARQL queries and other APIs. This allows us to update the way that data is represented without breaking external applications that use it. Discontinued datasets will generally be removed completely after some period of time. Information about planned deletion will be provided in the 'Further Information' section on the dataset web page.
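As a sketch of how a client might look up the replacement for a discontinued dataset, the helpers below build a SPARQL query for the `isReplacedBy` triple and the corresponding endpoint URL. The function names and the example dataset URI are hypothetical, and the sketch assumes that dataset metadata triples can be queried through the SPARQL endpoint; the URL format follows the jQuery example earlier on this page.

```javascript
var IS_REPLACED_BY = 'http://purl.org/dc/terms/isReplacedBy';

// Build a query that selects the replacement of a discontinued dataset, if any.
function buildReplacementQuery(datasetUri) {
  // Reject values that would break out of the <...> IRI delimiters.
  if (/[<>\s]/.test(datasetUri)) {
    throw new Error('Invalid dataset URI: ' + datasetUri);
  }
  return 'SELECT ?replacement WHERE { <' + datasetUri + '> <' +
    IS_REPLACED_BY + '> ?replacement }';
}

// Build the request URL for the JSON SPARQL endpoint on a given site domain.
function buildSparqlUrl(siteDomain, query) {
  return 'http://' + siteDomain + '/sparql.json?query=' + encodeURIComponent(query);
}

// Example with a hypothetical dataset URI:
var exampleUrl = buildSparqlUrl(
  'example.com',
  buildReplacementQuery('http://example.com/data/old-dataset')
);
```

An empty result set would mean that no replacement has been recorded for that dataset.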
632
+
633
+ -# -------------------------------------------------------------------
634
+ -# CONTENTS NAVIGATION
635
+ -# -------------------------------------------------------------------
636
+
637
+ - content_for :docs_contents do
638
+ %nav.contents
639
+ %h2 Contents
640
+ - @documentation_sections.each do |section|
641
+ %h3= section[:name]
642
+ %ul
643
+ - section[:subsections].each do |subsection|
644
+ %li= docs_inline_link subsection[:name], subsection[:name]
645
+ /
646
+ - # don't include subsubsections in nav
647
+ - if (subsection[:subsubsections].length > 0)
648
+ %ul
649
+ - subsection[:subsubsections].each do |subsubsection|
650
+ %li= docs_inline_link subsubsection[:name], subsubsection[:name]