kithe 2.0.3 → 2.4.0
- checksums.yaml +4 -4
- data/README.md +35 -40
- data/app/characterization/kithe/ffprobe_characterization.rb +179 -0
- data/app/indexing/kithe/indexable/thread_settings.rb +6 -3
- data/app/indexing/kithe/indexable.rb +3 -0
- data/app/indexing/kithe/indexer/obj_extract.rb +2 -1
- data/app/models/kithe/asset.rb +11 -22
- data/app/models/kithe/validators/model_parent.rb +5 -4
- data/app/validators/array_inclusion_validator.rb +1 -1
- data/lib/kithe/engine.rb +8 -1
- data/lib/kithe/indexable_settings.rb +5 -2
- data/lib/kithe/version.rb +1 -4
- data/lib/shrine/plugins/kithe_derivative_definitions.rb +10 -4
- data/lib/shrine/plugins/kithe_determine_mime_type.rb +9 -0
- data/lib/shrine/plugins/kithe_persisted_derivatives.rb +5 -0
- metadata +22 -8
- data/lib/kithe/patch_fx.rb +0 -39
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6f3d3c39a3bdccd9fc97f8f5e16c94cf12d4926e7ca50cc5e3343406ad3044d3
+  data.tar.gz: 2bd6fc9c175d4ec213e58460fbb95b5c40b49ca1ae5808e77df26932cf60474a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 55c9e6eb14e975a46927fa01b4c7a8a67ca1b804ae7662f2e34e7878ee0a3247b944395edf93e311f7c39b640392ddd5b74a3df77381ba5f761e78f883e48dfa
+  data.tar.gz: 730f6f1d20d45d7973d2c340e15889d6f2b56295f70eec8703a1e1b666ba078357b1d1ec2e34fa14ca5575ff17032f52340ffba3070e2b16e3da35012c5e8d9e
data/README.md
CHANGED
@@ -7,72 +7,69 @@ An experiment in shareable tools/components for building a digital collections a
 
 Kithe is a toolkit for building digital collections/repository applications in Rails. It comes out of experience in the [samvera](https://samvera.org/) community of open source library-archives-museums digital collections/preservation work (but is not a samvera project).
 
-Kithe does not use fedora or valkyrie, but stores all metadata using ActiveRecord. Kithe requires you use postgres 9.5+ as your db.
+Kithe does not use fedora or valkyrie, but stores all metadata using ActiveRecord. Kithe requires you use postgres 9.5+ as your db. It uses [shrine](https://shrinerb.com) for file-handling/asset-storing and tries to support developing your app as a normal Rails/ActiveRecord app. It will not give you a working turnkey application, but is a collection of tools for building an app with certain patterns.
 
-Kithe
+Kithe provides tools to support these architectural patterns:
 
-
+* [Modelling and Persistence](./guides/modelling.md):
+  * A Collection/Work/Asset model based on Samvera/PCDM, using rails Single-Table Inheritance to support heterogeneous associations with efficient rdbms lookup.
+  * Using Postgres JSONB for "schema-less" flexible storage, via [attr_json](https://github.com/jrochkind/attr_json), supporting complex structured nested repeatable data values.
+  * [Work representatives](./guides/work_representative.md) via ActiveRecord association, using postgres recursive CTEs to compute the "leaf" representative, designed to support efficient use of the DB including pre-loading leaf representatives.
+  * UUIDv4s as internal primary keys, plus a "friendlier_id" with a shorter unique alphanumeric identifier for URLs and other UI. By default they are supplied by a postgres stored procedure, but your code can set them to whatever you like.
 
-
+* [Form support](./guides/forms.md): Easy Rails-like forms for that complex nested and repeatable form data, leaning on simple_form.
+  * An extension to Rails "strong parameters" that makes some common patterns for
+    embedded JSON attributes more convenient, [Kithe::Parameters](./app/models/kithe/parameters.rb)
 
-
+* [File handling](./guides/file_handling.md): A framework that lets you easily plug in your own custom characterization and derivatives handling, to be handled in an efficient and flexible way, ordinarily using background jobs. Implemented on top of [shrine](https://shrinerb.com).
+  * [Derivatives](./guides/derivatives.md) handling ensures data consistency without race conditions, and efficient querying patterns, letting you plug in custom derivatives creation, with some standard routines included.
 
-
-
-While it is working well for us, since it hasn't had wide use, it could still be considered somewhat of an experiment. But you are invited to try it out and see how it works. You are welcome to use it, but also welcome to copy any code or just ideas from kithe.
-
-Any questions or feedback of any kind are very welcome and encouraged, in the github project issues, samvera slack, or wherever is convenient.
-
-
-# Kithe parts
-
-Some guide documentation is available to explain each of kithe's major functionality areas. Definitely start with the modelling guide.
+* [Solr Indexing](./guides/solr_indexing.md): Built-in Solr indexing using [traject](https://github.com/traject/traject) for defining mappings from your model objects to what you want in a Solr index. Uses ActiveRecord callbacks to automatically sync saves to solr, with many opportunities for customization.
+  * Not coupled to any other kithe components, could be used independently, hypothetically on any ActiveRecord model.
 
-* [
+* A [recommended approach for using Blacklight](./guides/blacklight_approach.md) with search result view templates based on actual ActiveRecord models. Blacklight use is optional with kithe, but kithe works well with blacklight.
 
-
+* Assorted optional utilities
+  * [Kithe::ConfigBase](./app/models/kithe/config_base.rb) A totally optional solution for managing environmental config variables.
 
-*
+  * [ArrayInclusionValidator](./app/validators/array_inclusion_validator.rb) Useful for validating on attr_json arrays of primitives.
 
-
+## Setting up your app to use kithe
 
-
+So you want to start an app that uses kithe. We should later provide a better 'getting started' guide. For now, some sketchy notes:
 
-
+* To reiterate, kithe requires that your Rails app use postgres 9.5+.
 
-*
-  * Not coupled to any other kithe components, could be used independently, hypothetically on any ActiveRecord model.
-  * Written after review of "prior art" in [sunspot](https://github.com/sunspot/sunspot) and [searchkick](https://github.com/ankane/searchkick) (which both used AR callback-based indexing), and others.
+* kithe works with Rails 5.2 through 6.1.
 
-*
+* To install migrations from kithe to set up your database for its models: `rake kithe_engine:install:migrations`
 
-
+* Kithe view support generally assumes your app uses bootstrap 4, and uses [simple form](https://github.com/plataformatec/simple_form) configured with bootstrap settings. See https://github.com/plataformatec/simple_form#bootstrap . So you should install simple_form and bootstrap 4.
 
-* [
+* Specific additional pre-requisites/requirements can sometimes be found in individual feature docs. Also include the Javascript from [cocoon](https://github.com/nathanvda/cocoon), for form support for repeatable-field editing forms. We haven't quite figured out our preferred sane approach for sharing Javascript via kithe.
 
-* [Kithe::ConfigBase](./app/models/kithe/config_base.rb) A totally optional solution for managing environmental config variables.
 
-
+## Why kithe?
 
-
+Kithe tries to let you develop your app like "an ordinary Rails app" (in all its possible variations), while handling some of the rough spots common to the kinds of modelling and administration found in digital collections domains. Developers should be able to use standard, familiar Rails patterns and skills to develop an app for their specific local needs, no more complicated than building any other Rails app. You add features to a kithe app just like building Rails, using whatever patterns you like. We support modern Rails versions, 5.2+.
 
-
+In that kithe provides tools and not a turnkey app, developing an app based on kithe is in some ways similar to developing an app based on [valkyrie](https://github.com/samvera-labs/valkyrie) (but not hyrax). They both provide basic architecture for modelling/persistence, although in quite different ways. Kithe also provides tools in addition to modelling/persistence, but does _not_ provide the data-mapper/repository pattern valkyrie does, or any built-in abstraction for persisting anywhere but a postgres DB.
 
-
+If you are comparing it to a "solution bundle" digital collections platform like hyrax, kithe may seem like more work. But experience has shown us that in our domain, "solution bundles" can turn out less of a "turnkey" approach than they seem, and can have greater development cost over total app lifecycle than anticipated. If you have similar experience that leads you to consider a more 'bespoke' app approach -- you may want to consider kithe. We hope to provide architecturally simple support and standardization for your custom app, taking care of some of the common "hard parts" and leaving you with flexibility to build out the app that meets your needs.
 
-
+Kithe has been developed in tandem with the Science History Institute's in-development [replacement digital collections](https://github.com/sciencehistory/scihist_digicoll) app, which has been in production for several years using kithe.
 
-
+[The University of Minnesota found kithe](https://docs.google.com/presentation/d/1Z4AoIDOaxbY4pt3mDhNt6MfUs6VIMjysKKmaYQpjuk8/edit?usp=sharing) to pair well with GeoBlacklight, an easy way to provide the persistence layer and metadata editing UI that blacklight on its own lacks.
 
-
+We are serious about [semantic versioning](https://semver.org/) and will endeavor to release backwards-breaking changes only with a major release, and minimize major releases.
 
-
+Kithe is working well for us, but has had limited (but non-zero) adoption from other institutions. It's still somewhat of an experiment, but one we think is going well. If you would consider developing a digital collections/repository app in "just Rails", we think it's worth investigating if kithe can save you some trouble in some rough common use cases. You are invited to try it out and see how it works, using kithe directly, or copying any code or just ideas from kithe.
 
-
+Any questions or feedback of any kind are very welcome and encouraged, in the github project issues, samvera slack, or wherever is convenient.
 
 ## To be done
 
-Considering some blacklight integration support
+Considering some additional blacklight integration support; is any needed?
 
 Other components/features may become more clear as we continue to develop. It's possible that kithe won't (at least for a long time) contain controllers themselves (it may contain some helper methods for controllers), or generalized permissions architecture. Both of these are some of the things most particular to specific apps, that are hard to generalize without creating monsters.
 
@@ -83,8 +80,6 @@ This is a Rails 'engine' whose template was created with: `rails plugin new kith
 
 * Note we have chosen not to make it 'mountable' or 'isolated', I think that would be inappropriate for this kind of gem. It _is_ an engine so it can hook into Rails load paths and config as needed.
 
-
-
 * Note we are currently using the standard rails-generated dummy app in spec/dummy for testing, rather than [engine_cart](https://github.com/cbeer/engine_cart) or [combustion](https://github.com/pat/combustion).
 * Before you run the tests for the first time, create the database by running: `rails db:setup`. This will create two databases, kithe_development and kithe_test.
 * Some of the rspec tests depend on [FFmpeg](https://ffmpeg.org/) for testing file derivative transformations. Mac users can install [ffmpeg via homebrew](https://formulae.brew.sh/formula/ffmpeg): `brew install ffmpeg`
data/app/characterization/kithe/ffprobe_characterization.rb
ADDED
@@ -0,0 +1,179 @@
+require 'tty/command'
+require 'json'
+
+module Kithe
+  # Characterizes Audio or Video files using `ffprobe`, a tool that comes with `ffmpeg`.
+  #
+  # You can pass in a local File object (with a pathname), a local String pathname, or
+  # a remote URL. (Remote URLs will be passed directly to ffprobe, which can efficiently
+  # fetch just the bytes it needs)
+  #
+  # You can get back normalized A/V metadata:
+  #
+  #     metadata = FfprobeCharacterization.new(url).normalized_metadata
+  #
+  # Normalized metadata is a *flat* hash of typed JSON-able values. It uses
+  # keys based on what the ActiveEncode gem seems to use, but adds some extras
+  # and makes a few tweaks. See the #normalized_metadata method source for
+  # keys supplied.
+  #
+  # Or the complete FFprobe response as JSON. (We try to use ffprobe options that
+  # are exhaustive as to what is returned, including ffprobe version(s))
+  #
+  #     ffprobe_results = FfprobeCharacterization.new(url).ffprobe_hash
+  #
+  class FfprobeCharacterization
+    class_attribute :ffprobe_command, default: "ffprobe"
+    class_attribute :ffprobe_timeout, default: 10
+
+    attr_reader :input_arg
+
+    # @param input [String,File] local File OR local filepath as String, OR remote URL as string
+    #   If you have a remote url, just passing the remote url is way more performant than
+    #   downloading it yourself locally -- ffprobe will just fetch the bytes it needs.
+    def initialize(input)
+      if input.respond_to?(:path)
+        input = input.path
+      end
+      @input_arg = input
+    end
+
+    # a helper for creating a block for shrine uploader, you can always use
+    # FfprobeCharacterization.new directly too!
+    #
+    # * Does not run on "cache" action, only on promotion (or manual execution).
+    #
+    # * Will run only on items with "audio/" or "video/" content-type.
+    #
+    # * By default only on main original, not derivatives, although
+    #   you can pass `run_on_derivatives: true` if desired.
+    #
+    # Will use ffprobe with direct URL if possible based on source_io (ffprobe
+    # can very efficiently access only bytes needed from URL), otherwise will
+    # download local temp copy if necessary.
+    #
+    #     class AssetUploader < Kithe::AssetUploader
+    #       add_metadata do |source_io, **context|
+    #         Kithe::FfprobeCharacterization.characterize_from_uploader(source_io, context)
+    #       end
+    #
+    #       #...
+    #     end
+    #
+    def self.characterize_from_uploader(source_io, add_metadata_context, run_on_derivatives: false)
+      # only for A/V please
+      return {} unless add_metadata_context.dig(:metadata, "mime_type")&.start_with?(%r{\A(audio|video)/})
+
+      # don't run on cache, only on promotion or manual trigger
+      return {} unless add_metadata_context[:action] != :cache
+
+      # don't run on derivatives unless option given
+      return {} unless add_metadata_context[:derivative].nil? || run_on_derivatives
+
+      # ffprobe can use a URL and very efficiently only retrieve what bytes it needs...
+      if source_io.respond_to?(:url) && source_io.url.start_with?(/\Ahttps?:/)
+        Kithe::FfprobeCharacterization.new(source_io.url).normalized_metadata
+      else
+        # if not already a file, will download, possibly slow, but gets us to go.
+        Shrine.with_file(source_io) do |file|
+          Kithe::FfprobeCharacterization.new(file.path).normalized_metadata
+        end
+      end
+    end
+
+    # ffprobe args come from this suggestion:
+    #
+    # https://gist.github.com/nrk/2286511?permalink_comment_id=2593200#gistcomment-2593200
+    #
+    # We also add in various current version tags! If we're going to record all ffprobe
+    # output, we'll want that too!
+    def ffprobe_options
+      [
+        "-hide_banner",
+        "-loglevel", "fatal",
+        "-show_error", "-show_format", "-show_streams", "-show_programs",
+        "-show_chapters", "-show_private_data", "-show_versions",
+        "-print_format", "json",
+      ]
+    end
+
+    # ffprobe output parsed as JSON...
+    def ffprobe_hash
+      @ffprobe_hash ||= JSON.parse(ffprobe_stdout).merge(
+        "ffprobe_options_used" => ffprobe_options.join(" ")
+      )
+    end
+
+    # Returns a FLAT JSON-able hash of normalized a/v metadata.
+    #
+    # Tries to standardize to what ActiveEncode uses, with some changes and additions.
+    # https://github.com/samvera-labs/active_encode/blob/42f5ed5427a39e56093a5e82123918c4b2619a47/lib/active_encode/technical_metadata.rb
+    #
+    # A video file or other container can have more than one audio or video stream in it, although
+    # this is somewhat unusual for our domain. For the stream-specific audio_ and video_ metadata
+    # returned, we just choose the *first* returned audio or video stream (which may be more or
+    # less arbitrary)
+    #
+    # See also #ffprobe_hash for complete ffprobe results
+    def normalized_metadata
+      # overall audio_sample_rate are null, audio codec is wrong
+      @normalized_metadata ||= {
+        "width" => first_video_stream_json&.dig("width"),
+        "height" => first_video_stream_json&.dig("height"),
+        "frame_rate" => video_frame_rate_as_float, # frames per second
+        "duration_seconds" => ffprobe_hash&.dig("format", "duration")&.to_f&.round(3),
+        "audio_codec" => first_audio_stream_json&.dig("codec_name"),
+        "video_codec" => first_video_stream_json&.dig("codec_name"),
+        "audio_bitrate" => first_audio_stream_json&.dig("bit_rate")&.to_i, # in bps
+        "video_bitrate" => first_video_stream_json&.dig("bit_rate")&.to_i, # in bps
+        # extra ones not ActiveEncode
+        "bitrate" => ffprobe_hash.dig("format", "bit_rate")&.to_i, # overall bitrate of whole file in bps
+        "audio_sample_rate" => first_audio_stream_json&.dig("sample_rate")&.to_i, # in Hz
+        "audio_channels" => first_audio_stream_json&.dig("channels")&.to_i, # usually 1 or 2 (for stereo)
+        "audio_channel_layout" => first_audio_stream_json&.dig("channel_layout"), # stereo or mono or (dolby) 2.1, or something else.
+      }.compact
+    end
+
+    # just the ffprobe version please. This is also available
+    # in ffprobe_hash
+    def ffprobe_version
+      ffprobe_hash.dig("program_version", "version")
+    end
+
+    private
+
+    def ffprobe_stdout
+      @ffprobe_output ||= TTY::Command.new(printer: :null).run(
+        ffprobe_command,
+        *ffprobe_options,
+        input_arg,
+        timeout: ffprobe_timeout).out
+    end
+
+    def first_video_stream_json
+      @first_video_stream_json ||= ffprobe_hash["streams"].find { |stream| stream["codec_type"] == "video" }
+    end
+
+    def first_audio_stream_json
+      @first_audio_stream_json ||= ffprobe_hash["streams"].find { |stream| stream["codec_type"] == "audio" }
+    end
+
+    # There are a few different values we could choose here. We're going to choose
+    # `avg_frame_rate` == number of frames / total duration,
+    # vs (not chosen) `r_frame_rate` "the lowest framerate with which all timestamps can be represented accurately (it is the least common multiple of all framerates in the stream)"
+    #
+    # (note this sometimes gets us not what we expected, like it gets us 29.78 fps instead of 29.97)
+    #
+    # Then we have to change it from numerator/denominator to float rounded to two decimal places,
+    # which we let ruby rational do for us.
+    def video_frame_rate_as_float
+      avg_frame_rate = first_video_stream_json&.dig("avg_frame_rate")
+
+      return nil unless avg_frame_rate
+
+      return nil if avg_frame_rate.split("/")[1] == "0" # sometimes it returns '0/0', don't know why.
+
+      Rational(avg_frame_rate).to_f.round(2)
+    end
+  end
+end
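The frame-rate handling in `video_frame_rate_as_float` can be tried in isolation. The following is a minimal sketch, not part of kithe: `frame_rate_as_float` is a hypothetical standalone helper, and the input strings are illustrative values of the `numerator/denominator` form that ffprobe reports for `avg_frame_rate`.

```ruby
# Hypothetical standalone helper (not kithe API) mirroring the conversion above:
# ffprobe reports avg_frame_rate as a "numerator/denominator" string.
def frame_rate_as_float(avg_frame_rate)
  return nil if avg_frame_rate.nil?

  # A zero denominator (e.g. "0/0") would make Rational raise ZeroDivisionError,
  # so treat it as "unknown" instead.
  return nil if avg_frame_rate.split("/")[1] == "0"

  # Kernel#Rational parses "a/b" strings; round to two decimal places.
  Rational(avg_frame_rate).to_f.round(2)
end

puts frame_rate_as_float("30000/1001").inspect # NTSC-style rate
puts frame_rate_as_float("25/1").inspect
puts frame_rate_as_float("0/0").inspect
```

Guarding the zero denominator before calling `Rational` is what keeps an unknown rate from raising; `.compact` in `normalized_metadata` then drops the resulting `nil` key.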
data/app/indexing/kithe/indexable/thread_settings.rb
CHANGED
@@ -57,7 +57,7 @@ module Kithe
 def initialize(batching:, disable_callbacks:, original_settings:,
   writer:, on_finish:)
   @original_settings = original_settings
-  @batching =
+  @batching = batching
   @disable_callbacks = disable_callbacks
   @on_finish = on_finish
 
@@ -80,8 +80,9 @@ module Kithe
 def writer
   @writer ||= begin
     if @batching
+      batch_size = (@batching == true) ? Kithe.indexable_settings.batching_mode_batch_size : @batching
       @local_writer = true
-      Kithe.indexable_settings.writer_instance!("solr_writer.batch_size" =>
+      Kithe.indexable_settings.writer_instance!("solr_writer.batch_size" => batch_size)
     end
   end
 end
@@ -97,6 +98,8 @@ module Kithe
 # only call on-finish if we have a writer, batch writers are lazily
 # created and maybe we never created one
 if @writer
+  # if we created the writer ourselves locally and nobody
+  # specified an on_finish, close our locally-created writer.
   on_finish = if @local_writer && @on_finish.nil?
     proc {|writer| writer.close }
   else
@@ -105,7 +108,7 @@ module Kithe
     on_finish.call(@writer) if on_finish
   end
 
-  Thread.current[THREAD_CURRENT_KEY] = @
+  Thread.current[THREAD_CURRENT_KEY] = @original_settings
 end
 
 private
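The new `batch_size` line lets `batching:` be either `true` (use the configured default) or an explicit integer. A minimal sketch of that resolution rule follows; `resolve_batch_size` and `DEFAULT_BATCH_SIZE` are hypothetical names for illustration, with the default of 100 taken from the `batching_mode_batch_size: 100` keyword default in the `IndexableSettings` hunk of this diff.

```ruby
# Hypothetical names, for illustration only. 100 mirrors the
# batching_mode_batch_size default added to IndexableSettings in this release.
DEFAULT_BATCH_SIZE = 100

# Returns nil when batching is off, the default when batching is literally
# `true`, and otherwise whatever explicit size the caller passed.
def resolve_batch_size(batching, default: DEFAULT_BATCH_SIZE)
  return nil unless batching

  batching == true ? default : batching
end

puts resolve_batch_size(true).inspect
puts resolve_batch_size(500).inspect
puts resolve_batch_size(false).inspect
```

Comparing against `true` explicitly (rather than relying on truthiness) is what distinguishes "batch with the default size" from "batch with this many records".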
data/app/indexing/kithe/indexable.rb
CHANGED
@@ -120,6 +120,9 @@ module Kithe
 #
 # By default will use a per-update writer, or thread/block-specific writer configured with `self.index_with`,
 # or you can pass one in.
+#
+# This method is part of Kithe API, including allowing local apps to override! Backwards
+# compatibility matters for semver with any change to method signature.
 def update_index(mapper: kithe_indexable_mapper, writer:nil)
   RecordIndexUpdater.new(self, mapper: mapper, writer: writer).update_index
 end
data/app/indexing/kithe/indexer/obj_extract.rb
CHANGED
@@ -48,7 +48,8 @@ module Kithe
 first, *rest = *path
 
 result = if obj.kind_of?(Array)
-
+  first_path_element = path.shift # remove it from path
+  obj.flat_map { |item| obj_extractor(item, [first_path_element]) }
 elsif obj.kind_of?(Hash)
   obj[first]
 else
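The `flat_map` branch above applies a path element to every member of an array and flattens the results. The following is a simplified sketch of that idea only; `extract_path` is a hypothetical function, not kithe's `obj_extractor`, and it handles just hashes and arrays (kithe's real extractor is not fully shown in this hunk and may behave differently).

```ruby
# Hypothetical, simplified path extractor (not kithe's obj_extractor):
# walks string keys through nested hashes, fanning out across arrays.
def extract_path(obj, path)
  return obj if path.empty? || obj.nil?

  first, *rest = path

  case obj
  when Array
    # Apply the same path to each element and flatten one level,
    # dropping elements where the path led nowhere.
    obj.flat_map do |item|
      found = extract_path(item, path)
      found.is_a?(Array) ? found : [found]
    end.compact
  when Hash
    extract_path(obj[first], rest)
  else
    nil # scalars have no sub-paths in this sketch
  end
end

works = [
  { "creator" => { "name" => "Ada" } },
  { "creator" => { "name" => "Grace" } },
  { "title" => "no creator here" }
]

puts extract_path(works, ["creator", "name"]).inspect
```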
data/app/models/kithe/asset.rb
CHANGED
@@ -27,7 +27,8 @@ class Kithe::Asset < Kithe::Model
     to: :file, allow_nil: true
   delegate :stored?, to: :file_attacher
   delegate :set_promotion_directives, :promotion_directives, to: :file_attacher
-
+  # delegate "metadata" as #file_metadata
+  delegate :metadata, to: :file, prefix: true, allow_nil: true
 
   # will be sent to file_attacher.set_promotion_directives, provided by our
   # kithe_promotion_hooks shrine plugin.
@@ -81,31 +82,11 @@ class Kithe::Asset < Kithe::Model
     source = file
     return false unless source
 
-
-    local_files = _process_kithe_derivatives_without_download(source, only: only, except: except, lazy: lazy)
+    local_files = file_attacher.process_derivatives(:kithe_derivatives, only: only, except: except, lazy: lazy)
 
     file_attacher.add_persisted_derivatives(local_files)
   end
 
-  # Working around Shrine's insistence on pre-downloading original before calling derivative processor.
-  # We want to avoid that, so when our `lazy` argument is in use, original does not get eagerly downloaded,
-  # but only gets downloaded if needed to make derivatives.
-  #
-  # This is a somewhat hacky way to do that, looking at the internals of shrine `process_derivatives`,
-  # and pulling them out to skip the parts we don't want. We also lose shrine instrumentation
-  # around this action.
-  #
-  # See: https://github.com/shrinerb/shrine/issues/470
-  #
-  # If that were resolved, the 'ordinary' shrine thing would be to replace calls
-  # to this local private method with:
-  #
-  #     file_attacher.process_derivatives(:kithe_derivatives, only: only, except: except, lazy: lazy)
-  #
-  private def _process_kithe_derivatives_without_download(source, **options)
-    processor = file_attacher.class.derivatives_processor(:kithe_derivatives)
-    local_files = file_attacher.instance_exec(source, **options, &processor)
-  end
 
   # Just a convenience for file_attacher.add_persisted_derivatives (from :kithe_derivatives),
   # feel free to use that if you want to add more than one etc. By default stores to
@@ -144,6 +125,14 @@ class Kithe::Asset < Kithe::Model
     result && result.values.first
   end
 
+  # Like #update_derivative, but can update multiple at once.
+  #
+  #     asset.update_derivatives({ "big_thumb" => big_thumb_io, "small_thumb" => small_thumb_io })
+  #
+  # Options from kithe `add_persisted_derivatives`/shrine `add_derivative` supported.
+  #
+  #     asset.update_derivatives({ "big_thumb" => big_thumb_io, "small_thumb" => small_thumb_io }, delete: false)
+  #
   def update_derivatives(deriv_hash, **options)
     file_attacher.add_persisted_derivatives(deriv_hash, **options)
   end
data/app/models/kithe/validators/model_parent.rb
CHANGED
@@ -1,13 +1,14 @@
 class Kithe::Validators::ModelParent < ActiveModel::Validator
   def validate(record)
+    # don't load the parent just to validate it if it hasn't even changed.
+    return unless record.parent_id_changed?
+
     if record.parent.present? && (record.parent.class <= Kithe::Asset)
-      record.errors
+      record.errors.add(:parent, 'can not be an Asset instance')
     end
 
     if record.parent.present? && record.class <= Kithe::Collection
-      record.errors
+      record.errors.add(:parent, 'is invalid for Collection instances')
     end
-
-    # TODO avoid recursive parents, maybe using a postgres CTE for efficiency?
   end
 end
data/app/validators/array_inclusion_validator.rb
CHANGED
@@ -42,7 +42,7 @@ class ArrayInclusionValidator < ActiveModel::EachValidator
 
     unless not_allowed_values.blank?
       formatted_rejected = not_allowed_values.uniq.collect(&:inspect).join(",")
-      record.errors.add(attribute, :inclusion, options.except(:in).merge!(rejected_values: formatted_rejected, value: value))
+      record.errors.add(attribute, :inclusion, **options.except(:in).merge!(rejected_values: formatted_rejected, value: value))
     end
   end
 end
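The one-character change above (adding `**`) matters on Ruby 3, where a hash passed positionally is no longer converted into keyword arguments. The following is a minimal plain-Ruby sketch of the calling pattern; `add_error` is a hypothetical stand-in with a `**details` signature, assumed here to resemble how `errors.add` accepts its options, and is not ActiveModel itself.

```ruby
# Hypothetical stand-in for a method that takes trailing options as keywords.
def add_error(attribute, type = :invalid, **details)
  { attribute: attribute, type: type, details: details }
end

options = { in: %w[a b], message: "is not allowed" }

# Drop :in and merge extra details, as the validator does
# (reject is used here so the sketch does not need ActiveSupport's Hash#except).
extra = options.reject { |key, _| key == :in }.merge(rejected_values: '"c"')

# The double splat turns the hash into keyword arguments explicitly, which
# works on both Ruby 2.x and Ruby 3.
result = add_error(:tags, :inclusion, **extra)
puts result.inspect
```

Without the `**`, Ruby 3 would treat `extra` as a third positional argument, which this signature does not accept.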
data/lib/kithe/engine.rb
CHANGED
@@ -9,7 +9,6 @@ require 'shrine'
 # https://github.com/teoljungberg/fx/issues/33
 # https://github.com/teoljungberg/fx/pull/53
 require 'fx'
-require 'kithe/patch_fx'
 
 # not auto-loaded, let's just load it for backwards compat though
 require "kithe/config_base"
@@ -22,5 +21,13 @@ module Kithe
       g.assets false
       g.helper false
     end
+
+    # the fx gem lets us include stored procedures in schema.rb. For it to work
+    # in kithe's case, the stored procedures have to be *first* in schema.rb,
+    # so they can then be referenced as default value for columns in tables
+    # subsequently created. We configure that here, forcing it for any app, yes, sorry.
+    Fx.configure do |config|
+      config.dump_functions_at_beginning_of_schema = true
+    end
   end
 end
data/lib/kithe/indexable_settings.rb
CHANGED
@@ -1,14 +1,17 @@
 module Kithe
   class IndexableSettings
     attr_accessor :solr_url, :writer_class_name, :writer_settings,
-      :model_name_solr_field, :solr_id_value_attribute, :disable_callbacks
+      :model_name_solr_field, :solr_id_value_attribute, :disable_callbacks,
+      :batching_mode_batch_size
     def initialize(solr_url:, writer_class_name:, writer_settings:,
-      model_name_solr_field:, solr_id_value_attribute:, disable_callbacks: false
+      model_name_solr_field:, solr_id_value_attribute:, disable_callbacks: false,
+      batching_mode_batch_size: 100)
       @solr_url = solr_url
       @writer_class_name = writer_class_name
       @writer_settings = writer_settings
       @model_name_solr_field = model_name_solr_field
       @solr_id_value_attribute = solr_id_value_attribute || 'id'
+      @batching_mode_batch_size = batching_mode_batch_size
     end
 
     # Use configured solr_url, and merge together with configured
data/lib/kithe/version.rb
CHANGED
data/lib/shrine/plugins/kithe_derivative_definitions.rb
CHANGED
@@ -10,7 +10,11 @@ class Shrine
 
 # Register our derivative processor, that will create our registered derivatives,
 # with our custom options.
-
+#
+# We do download: false, so when our `lazy` argument is in use, original does not get eagerly downloaded,
+# but only gets downloaded if needed to make derivatives. This is great for performance, especially
+# when running batch job to add just missing derivatives.
+uploader::Attacher.derivatives(:kithe_derivatives, download: false) do |original, **options|
   Kithe::Asset::DerivativeCreator.new(self.class.kithe_derivative_definitions,
     source_io: original,
     shrine_attacher: self,
@@ -44,10 +48,12 @@ class Shrine
 # Tempfile and Dir.mktmpdir may be useful.
 #
 # If in order to do your transformation you need additional information about the original,
-# just add a `
+# just add an `attacher:` keyword argument to your block, and a `Shrine::Attacher` subclass
+# will be passed in. You can then get the model object from `attacher.record`, or the
+# original file as a `Shrine::UploadedFile` object with `attacher.file`.
 #
-#     define_derivative :thumbnail do |original_file,
-#       record.
+#     define_derivative :thumbnail do |original_file, attacher:|
+#       attacher.record.title, attacher.file.width, attacher.file.content_type # etc
 #     end
 #
 # Derivatives are normally uploaded to the Shrine storage labeled :kithe_derivatives,
@@ -19,6 +19,12 @@ class Shrine
|
|
19
19
|
# Ensure that if mime-type can't be otherwise determined, it is assigned
|
20
20
|
# "application/octet-stream", basically the type for generic binary.
|
21
21
|
class KitheDetermineMimeType
|
22
|
+
# marcel version 1.0 says audio/x-flac, whereas previous versions
|
23
|
+
# said audio/flac, which we prefer. Let's fix it.
|
24
|
+
RPELACE_CONTENT_TYPES = {
|
25
|
+
"audio/x-flac" => "audio/flac"
|
26
|
+
}
|
27
|
+
|
22
28
|
def self.load_dependencies(uploader, *)
|
23
29
|
uploader.plugin :determine_mime_type, analyzer: -> (io, analyzers) do
|
24
30
|
mime_type = analyzers[:marcel].call(io)
|
@@ -30,6 +36,9 @@ class Shrine
|
|
30
36
|
|
31
37
|
mime_type = "application/octet-stream" if mime_type.blank?
|
32
38
|
|
39
|
+
# Are there any we prefer an alternate spelling of?
|
40
|
+
mime_type = RPELACE_CONTENT_TYPES.fetch(mime_type, mime_type)
|
41
|
+
|
33
42
|
mime_type
|
34
43
|
end
|
35
44
|
end
|
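The change above maps a detected MIME type to a preferred spelling using `Hash#fetch` with the original value as the fallback. A minimal stand-alone sketch of that lookup, assuming plain Ruby without Rails (so `blank?` is replaced by an explicit check); `PREFERRED_CONTENT_TYPES` and `normalize_mime_type` are hypothetical names, not the plugin's own.

```ruby
# Hypothetical names; the real constant lives inside the plugin class shown in the diff.
PREFERRED_CONTENT_TYPES = { "audio/x-flac" => "audio/flac" }.freeze

def normalize_mime_type(detected)
  # Fall back to the generic binary type when nothing was detected.
  detected = "application/octet-stream" if detected.nil? || detected.strip.empty?

  # fetch(key, default) returns the mapped value, or the input unchanged.
  PREFERRED_CONTENT_TYPES.fetch(detected, detected)
end
```

Using the input as the `fetch` default means types without a preferred alternative pass through untouched.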
data/lib/shrine/plugins/kithe_persisted_derivatives.rb
CHANGED
@@ -26,6 +26,11 @@ class Shrine
|
|
26
26
|
# Like the shrine `add_derivatives` method, but also *persists* the
|
27
27
|
# derivatives (saves to db), in a reliably concurrency-safe way.
|
28
28
|
#
|
29
|
+
# For ruby 3 compatibility, make sure you supply local_files as a hash
|
30
|
+
# literal with curly braces:
|
31
|
+
#
|
32
|
+
# attacher.add_persisted_derivatives({ derivative_name1: io_obj1, deriv2: io2 })
|
33
|
+
#
|
29
34
|
# Generally can take any options that shrine `add_derivatives`
|
30
35
|
# can take, including custom `storage` or `metadata` arguments.
|
31
36
|
#
|
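The Ruby 3 note above follows from the separation of positional and keyword arguments. The sketch below uses a hypothetical method, `add_derivatives_sketch`, whose signature (one positional hash plus keyword options) is an assumption made for illustration and is not copied from kithe.

```ruby
# Hypothetical stand-in with an assumed signature: a positional hash of
# files followed by keyword options.
def add_derivatives_sketch(local_files, **options)
  { files: local_files, options: options }
end

# With explicit braces the hash is unambiguously the positional argument,
# and the remaining keywords land in `options`.
def explicit_hash_call
  add_derivatives_sketch({ thumb: "io1", large: "io2" }, storage: :cache)
end

# Without braces the pairs are parsed as keywords. On Ruby 3 this is
# expected to raise ArgumentError because `local_files` is never supplied;
# Ruby 2.7 accepted it with a deprecation warning.
def braceless_call_outcome
  add_derivatives_sketch(thumb: "io1")
  :accepted
rescue ArgumentError
  :argument_error
end
```

Writing the braces is harmless on older Rubies, so the same call works on both sides of the upgrade.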
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: kithe
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 2.0
|
4
|
+
version: 2.4.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Jonathan Rochkind
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2022-02-14 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: rails
|
@@ -70,14 +70,14 @@ dependencies:
|
|
70
70
|
requirements:
|
71
71
|
- - "~>"
|
72
72
|
- !ruby/object:Gem::Version
|
73
|
-
version: '3.
|
73
|
+
version: '3.3'
|
74
74
|
type: :runtime
|
75
75
|
prerelease: false
|
76
76
|
version_requirements: !ruby/object:Gem::Requirement
|
77
77
|
requirements:
|
78
78
|
- - "~>"
|
79
79
|
- !ruby/object:Gem::Version
|
80
|
-
version: '3.
|
80
|
+
version: '3.3'
|
81
81
|
- !ruby/object:Gem::Dependency
|
82
82
|
name: shrine-url
|
83
83
|
requirement: !ruby/object:Gem::Requirement
|
@@ -188,7 +188,7 @@ dependencies:
|
|
188
188
|
requirements:
|
189
189
|
- - ">="
|
190
190
|
- !ruby/object:Gem::Version
|
191
|
-
version: 0.
|
191
|
+
version: 0.6.0
|
192
192
|
- - "<"
|
193
193
|
- !ruby/object:Gem::Version
|
194
194
|
version: '1'
|
@@ -198,7 +198,7 @@ dependencies:
|
|
198
198
|
requirements:
|
199
199
|
- - ">="
|
200
200
|
- !ruby/object:Gem::Version
|
201
|
-
version: 0.
|
201
|
+
version: 0.6.0
|
202
202
|
- - "<"
|
203
203
|
- !ruby/object:Gem::Version
|
204
204
|
version: '1'
|
@@ -264,6 +264,20 @@ dependencies:
|
|
264
264
|
- - ">="
|
265
265
|
- !ruby/object:Gem::Version
|
266
266
|
version: '0'
|
267
|
+
- !ruby/object:Gem::Dependency
|
268
|
+
name: db-query-matchers
|
269
|
+
requirement: !ruby/object:Gem::Requirement
|
270
|
+
requirements:
|
271
|
+
- - "<"
|
272
|
+
- !ruby/object:Gem::Version
|
273
|
+
version: '1'
|
274
|
+
type: :development
|
275
|
+
prerelease: false
|
276
|
+
version_requirements: !ruby/object:Gem::Requirement
|
277
|
+
requirements:
|
278
|
+
- - "<"
|
279
|
+
- !ruby/object:Gem::Version
|
280
|
+
version: '1'
|
267
281
|
- !ruby/object:Gem::Dependency
|
268
282
|
name: pg
|
269
283
|
requirement: !ruby/object:Gem::Requirement
|
@@ -331,6 +345,7 @@ files:
|
|
331
345
|
- README.md
|
332
346
|
- Rakefile
|
333
347
|
- app/assets/config/kithe_manifest.js
|
348
|
+
- app/characterization/kithe/ffprobe_characterization.rb
|
334
349
|
- app/derivative_transformers/kithe/ffmpeg_transformer.rb
|
335
350
|
- app/derivative_transformers/kithe/vips_cli_image_to_jpeg.rb
|
336
351
|
- app/helpers/kithe/form_helper.rb
|
@@ -376,7 +391,6 @@ files:
|
|
376
391
|
- lib/kithe/config_base.rb
|
377
392
|
- lib/kithe/engine.rb
|
378
393
|
- lib/kithe/indexable_settings.rb
|
379
|
-
- lib/kithe/patch_fx.rb
|
380
394
|
- lib/kithe/sti_preload.rb
|
381
395
|
- lib/kithe/version.rb
|
382
396
|
- lib/shrine/plugins/kithe_accept_remote_url.rb
|
@@ -411,7 +425,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
411
425
|
- !ruby/object:Gem::Version
|
412
426
|
version: '0'
|
413
427
|
requirements: []
|
414
|
-
rubygems_version: 3.
|
428
|
+
rubygems_version: 3.2.32
|
415
429
|
signing_key:
|
416
430
|
specification_version: 4
|
417
431
|
summary: Shareable tools/components for building a digital collections app in Rails.
|
data/lib/kithe/patch_fx.rb
DELETED
@@ -1,39 +0,0 @@
|
|
1
|
-
# fx is a gem that lets Rails schema.rb capture postgres functions and triggers
|
2
|
-
#
|
3
|
-
# For it to work for our use case, we need it to define functions BEFORE tables when
|
4
|
-
# doing a `rake db:schema:load`, so we can refer to functions as default values in our
|
5
|
-
# tables.
|
6
|
-
#
|
7
|
-
# This is a known issue in fx, with a PR, but isn't yet merged/released, so we hack
|
8
|
-
# in a patch to force it. Better than forking.
|
9
|
-
#
|
10
|
-
# Based on: https://github.com/teoljungberg/fx/pull/53/
|
11
|
-
#
|
12
|
-
# We try to write future-compat code assuming that will be merged eventually....
|
13
|
-
|
14
|
-
require 'fx'
|
15
|
-
|
16
|
-
if Fx.configuration.respond_to?(:dump_functions_at_beginning_of_schema)
|
17
|
-
# we have the feature!
|
18
|
-
|
19
|
-
Fx.configure do |config|
|
20
|
-
config.dump_functions_at_beginning_of_schema = true
|
21
|
-
end
|
22
|
-
|
23
|
-
else
|
24
|
-
# Fx does not have the feature, we have to patch it in
|
25
|
-
|
26
|
-
require 'fx/schema_dumper/function'
|
27
|
-
|
28
|
-
module Fx
|
29
|
-
module SchemaDumper
|
30
|
-
module Function
|
31
|
-
def tables(stream)
|
32
|
-
functions(stream)
|
33
|
-
super
|
34
|
-
end
|
35
|
-
end
|
36
|
-
end
|
37
|
-
end
|
38
|
-
|
39
|
-
end
|