kithe 2.0.3 → 2.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 6fe55cd9eb6b6f323f8a64418420a91a3d3cc99baa677a006e72608d3e05ccaf
4
- data.tar.gz: 167d25b42b8a490d4c8bd3e7fbdf8756560b9bcbdc9c684bc17db854dce3897a
3
+ metadata.gz: 6f3d3c39a3bdccd9fc97f8f5e16c94cf12d4926e7ca50cc5e3343406ad3044d3
4
+ data.tar.gz: 2bd6fc9c175d4ec213e58460fbb95b5c40b49ca1ae5808e77df26932cf60474a
5
5
  SHA512:
6
- metadata.gz: f1cb5e7242f6fffcda9b0be4894f24251470b919204faf355f694f245863acd9a9c5e1ce9f67abd3c3cfb9b3ba01a8b7235d45ce8a9b22349ee8d03d1e7dbe67
7
- data.tar.gz: fe8af824a5dc4f5af3719d12a109eff834cd260f647d668eba452c8aa88bbe61ac7b33c81206cbd1ec95169e260c312934843f820ee556a5c0452fc767bdb42a
6
+ metadata.gz: 55c9e6eb14e975a46927fa01b4c7a8a67ca1b804ae7662f2e34e7878ee0a3247b944395edf93e311f7c39b640392ddd5b74a3df77381ba5f761e78f883e48dfa
7
+ data.tar.gz: 730f6f1d20d45d7973d2c340e15889d6f2b56295f70eec8703a1e1b666ba078357b1d1ec2e34fa14ca5575ff17032f52340ffba3070e2b16e3da35012c5e8d9e
data/README.md CHANGED
@@ -7,72 +7,69 @@ An experiment in shareable tools/components for building a digital collections a
7
7
 
8
8
  Kithe is a toolkit for building digital collections/repository applications in Rails. It comes out of experience in the [samvera](https://samvera.org/) community of open source library-archives-museums digital collections/preservation work (but is not a samvera project).
9
9
 
10
- Kithe does not use fedora or valkyrie, but stores all metadata using ActiveRecord. Kithe requires you use postgres 9.5+ as your db. While kithe provides some additional architecture and support on top of "ordinary" ActiveRecord, it tries to mostly let you develop as you would any Rails app, inclduing all choices and you'd normally have, and taking advantage of standard ActiveRecord-based functionality. Kithe bases it's file handling framework on [shrine](https://shrinerb.com), supporting cloud or local file storage.
10
+ Kithe does not use fedora or valkyrie, but stores all metadata using ActiveRecord. Kithe requires you use postgres 9.5+ as your db. It uses [shrine](https://shrinerb.com) for file-handling/asset-storing and tries to support developing your app as a normal Rails/ActiveRecord app. It will not give you a working turnkey application, but is a collection of tools for building an app with certain patterns.
11
11
 
12
- Kithe will not give you a working "turnkey" application. It is a collection of tools to help you write a Rails app. You may end up using some but not all of kithe's tools. The range of tools provided and areas of an app given some support in kithe will probably grow over time, in hopefully a careful and cautious way.
12
+ Kithe provides tools to support these architectural patterns:
13
13
 
14
- In that it provides tools and not a turnkey app, develping an app based on kythe in some ways similar to developing an app based on [valkyrie](https://github.com/samvera-labs/valkyrie). They both provide basic architecture for modelling/persistence, although in quite different ways. Kithe also provides tools in addition to modelling/persistence, but does _not_ provide the data-mapper/repository pattern valkyrie does, or any built-in abstraction for persisting anywhere but a postgres DB.
14
+ * [Modelling and Persistence](./guides/modelling.md):
15
+ * A Collection/Work/Asset model based on Samvera/PCDM, using Rails Single-Table Inheritance to support heterogeneous associations with efficient rdbms lookup.
16
+ * Using Postgres JSONB for "schema-less" flexible storage, via [attr_json](https://github.com/jrochkind/attr_json), supporting complex structured nested repeatable data values.
17
+ * [Work representatives](./guides/work_representative.md) via ActiveRecord association, using postgres recursive CTE's to compute the "leaf" representative, designed to support efficient use of the DB including pre-loading leaf representatives.
18
+ * UUIDv4's as internal primary keys, but also provide a "friendlier_id" with a shorter unique alphanumeric identifier for URLs and other UI. By default they are supplied by a postgres stored procedure, but your code can set them to whatever you like.
15
19
 
16
- If you are comparing it to a "solution bundle" digital collections platform, kithe may seem like more work. But experience has shown me that in our domain, historically "solution bundles" can be less of a "turnkey" approach than they seem, and can have greater development cost over total app lifecycle than anticipated. If you have similar experience that leads you to consider a more 'bespoke' app approach -- you may want to consider kithe. We hope to provide architecturally simple support and standardization for your custom app, taking care of some of the common "hard parts" and leaving you with flexibility to build out the app that meets your needs.
20
+ * [Form support](./guides/forms.md): Easy Rails-like forms for that complex nested and repeatable form data, leaning on simple_form.
21
+ * An extension to Rails "strong parameters" that makes some common patterns for
22
+ embedded JSON attributes more convenient, [Kithe::Parameters](./app/models/kithe/parameters.rb)
17
23
 
18
- Kithe has beeen developed in tandem with the Science History Institute's in-development [replacement digital collections](https://github.com/sciencehistory/scihist_digicoll) app, and you can look there for a model/canonical/demo kithe use.
24
+ * [File handling](./guides/file_handling.md): A framework that lets you easily plug in your own custom characterization and derivatives handling, to be handled in an efficient and flexible way, ordinarily using background jobs. Implemented on top of [shrine](https://shrinerb.com).
25
+ * [Derivatives](./guides/derivatives.md) handling ensures data consistency without race conditions, and efficient querying patterns, letting you plug in custom derivatives creation, with some standard routines included.
19
26
 
20
- The Science History Institute app is live and working, so kithe is 1.0. We are serious about [semantic verisioning](https://semver.org/) and will endeavor to release backwards breaking changes only with a major release, and minimize major releases.
21
-
22
- While it is working well for us, since it hasn't had wide use, it could still be considered somewhat of an experiment. But you are invited to try it out and see how it works. You are welcome to use it, but also welcome to copy any code or just ideas from kithe.
23
-
24
- Any questions or feedback of any kind are very welcome and encouraged, in the github project issues, samvera slack, or wherever is convenient.
25
-
26
-
27
- # Kithe parts
28
-
29
- Some guide documentation is available to explain each of kithe's major functionality areas. Definitely start with the modelling guide.
27
+ * [Solr Indexing](./guides/solr_indexing.md): Built-in Solr indexing using [traject](https://github.com/traject/traject) for defining mappings from your model objects to what you want in a Solr index. Uses ActiveRecord callbacks to automatically sync saves to solr, with many opportunities for customization.
28
+ * Not coupled to any other kithe components, could be used independently, hypothetically on any ActiveRecord model.
30
29
 
31
- * [Modelling and Persistence](./guides/modelling.md): It can be somewhwat challenging to figure out a good way to model our data in an rdbms. We give you a hopefully flexible and understandable architecture that is designed to support efficient performance. It's influenced by PCDM and traditional samvera modelling. It's based on [attr_json](https://github.com/jrochkind/attr_json) to let you model arbitrary and complex object-oriented data that gets persisted as a serialized json hash. It uses rails Single Table Inheritance to support hetereogenous associations and collections. The modelling classes in some places use postgres-specific features for efficiency.
30
+ * A [recommended approach for using Blacklight](./guides/blacklight_approach.md) with search result view templates based on actual ActiveRecord models. Blacklight use is optional with kithe, but kithe works well with blacklight.
32
31
 
33
- * [Work representatives](./guides/work_representative.md). Built in associations to support "representative", using postgres recursive CTE's to compute the "leaf" representative, designed to support efficient use of the DB including pre-loading leaf representatives.
32
+ * Assorted optional utilities
33
+ * [Kithe::ConfigBase](./app/models/kithe/config_base.rb) A totally optional solution for managing environmental config variables.
34
34
 
35
- * Kithe objects use UUIDv4's as internal primary keys, but also provide a "friendlier_id" with a shorter unique alphanumeric identifier for URLs and other UI. By default they are supplied by a postgres stored procedure, but your code can set them to whatever you like.
35
+ * [ArrayInclusionValidator](./app/validators/array_inclusion_validator.rb) Useful for validating on attr_json arrays of primitives.
36
36
 
37
- * [Form support](./guides/forms.md): Dealing with complex and _repeatable_ data, as our modelling layer allows, can be tricky in an HTML form. We supply javascript-backed Rails form input help for repeatable and compound/nested data.
37
+ ## Setting up your app to use kithe
38
38
 
39
- * [File handling](./guides/file_handling.md): Handling files is at the core of digital repository use cases. We need a file handling framework that is flexible, predictable and reliable, and architected for performance. We try to give you one based on the [shrine](https://shrinerb.com) file attachment toolkit for ruby.
39
+ So you want to start an app that uses kithe. We should later provide a better 'getting started' guide. For now, some sketchy notes:
40
40
 
41
- * [Derivatives](./guides/derivatives.md) A flexible and reliable derivatives architecture, designed to ensure data consistency without race conditions, and support efficient DB usage patterns.
41
+ * Again re-iterate that kithe requires your Rails app use postgres, 9.5+.
42
42
 
43
- * [Solr Indexing](./guides/solr_indexing.md): Uses [traject](https://github.com/traject/traject) for defining mappings from your model objects to what you want in a Solr index. Uses ActiveRecord callbacks to automatically sync saves to solr, with many opportunities for customization.
44
- * Not coupled to any other kithe components, could be used independently, hypothetically on any ActiveRecord model.
45
- * Written after review of "prior art" in [sunspot](https://github.com/sunspot/sunspot) and [searchkick](https://github.com/ankane/searchkick) (which both used AR callback-based indexing), and others.
43
+ * kithe works with Rails 5.2 through 6.1.
46
44
 
47
- * A [recommended approach for using Blacklight](./guides/blacklight_approach.md) with search result view templates based on actual ActiveRecord models. It is totally optional to use Blacklight at all with kithe, or to use this approach if you do.
45
+ * To install migrations from kithe to set up your database for its models: `rake kithe_engine:install:migrations`
48
46
 
49
- ### Also
47
+ * Kithe view support generally assumes your app uses bootstrap 4, and uses [simple form](https://github.com/plataformatec/simple_form) configured with bootstrap settings. See https://github.com/plataformatec/simple_form#bootstrap . So you should install simple_form and bootstrap 4.
50
48
 
51
- * [Kithe::Parameters](./app/models/kithe/parameters.rb) provides some shortcuts around Rails "strong params" for attr_json serialized attributes.
49
+ * Specific additional pre-requisites/requirements can sometimes be found in individual feature docs. And include the Javascript from [cocoon](https://github.com/nathanvda/cocoon), for form support for repeatable-field editing forms. We haven't quite figured out our preferred sane approach for sharing Javascript via kithe.
52
50
 
53
- * [Kithe::ConfigBase](./app/models/kithe/config_base.rb) A totally optional solution for managing environmental config variables.
54
51
 
55
- * [ArrayInclusionValdaitor](./app/validators/array_inclusion_validator.rb) Useful for validating on attr_json arrays of primitives.
52
+ ## Why kithe?
56
53
 
57
- ## Setting up your app to use kithe
54
+ Kithe tries to let you develop your app like "an ordinary Rails app" (in all its possible variations), while handling some of the rough spots common to the modelling and administration needs of digital collections domains. Developers should be able to use standard Rails patterns and skills to build an app for their specific local needs, in a way that feels familiar and is no more complicated than building any other Rails app. You add features to a kithe app just as you would in any Rails app, using whatever patterns you like. We support modern Rails versions, 5.2+.
58
55
 
59
- So you want to start an app that uses kithe. We should later provide better 'getting started' guide. For now some sketchy notes:
56
+ In that kithe provides tools and not a turnkey app, developing an app based on kithe is in some ways similar to developing an app based on [valkyrie](https://github.com/samvera-labs/valkyrie) (but not hyrax). They both provide basic architecture for modelling/persistence, although in quite different ways. Kithe also provides tools in addition to modelling/persistence, but does _not_ provide the data-mapper/repository pattern valkyrie does, or any built-in abstraction for persisting anywhere but a postgres DB.
60
57
 
61
- * Again re-iterate that kithe requires your Rails app use postgres, 9.5+.
58
+ If you are comparing it to a "solution bundle" digital collections platform like hyrax, kithe may seem like more work. But experience has shown us that in our domain, "solution bundles" can turn out less of a "turnkey" approach than they seem, and can have greater development cost over total app lifecycle than anticipated. If you have similar experience that leads you to consider a more 'bespoke' app approach -- you may want to consider kithe. We hope to provide architecturally simple support and standardization for your custom app, taking care of some of the common "hard parts" and leaving you with flexibility to build out the app that meets your needs.
62
59
 
63
- * kithe works with Rails 5.2 through 6.1.
60
+ Kithe has been developed in tandem with the Science History Institute's [replacement digital collections](https://github.com/sciencehistory/scihist_digicoll) app, which has been in production for several years using kithe.
64
61
 
65
- * To install migrations from kithe to setup your database for it's models: `rake kithe_engine:install:migrations`
62
+ [The University of Minnesota found kithe](https://docs.google.com/presentation/d/1Z4AoIDOaxbY4pt3mDhNt6MfUs6VIMjysKKmaYQpjuk8/edit?usp=sharing) to pair well with GeoBlacklight, an easy way to provide the persistence layer and metadata editing UI that blacklight on its own lacks.
66
63
 
67
- * Kithe view support generally assumes your app uses bootstrap 4, and uses [simple form](https://github.com/plataformatec/simple_form) configured with bootstrap settings. See https://github.com/plataformatec/simple_form#bootstrap . So you should install simple_form and bootstrap 4.
64
+ We are serious about [semantic versioning](https://semver.org/) and will endeavor to release backwards breaking changes only with a major release, and minimize major releases.
68
65
 
69
- * Specific additional pre-requisites/requirements can sometimes be found in individual feature docs. And include the Javascript from [cocoon](https://github.com/nathanvda/cocoon), for form support for repeatable-field editing forms. We haven't quite figured out our preferred sane approach for sharing Javascript via kithe.
66
+ Kithe is working well for us, but has had limited (but non-zero) adoption from other institutions. It's still somewhat of an experiment, but one we think is going well. If you would consider developing a digital collections/repository app in "just Rails", we think it's worth investigating if kithe can save you some trouble in some rough common use cases. You are invited to try it out and see how it works, using kithe directly, or copying any code or just ideas from kithe.
70
67
 
71
- Note that at present kithe will end up forcing your app to use `:sql` [style schema dumps](https://guides.rubyonrails.org/v3.2.8/migrations.html#types-of-schema-dumps). We may try to fix this.
68
+ Any questions or feedback of any kind are very welcome and encouraged, in the github project issues, samvera slack, or wherever is convenient.
72
69
 
73
70
  ## To be done
74
71
 
75
- Considering some blacklight integration support.
72
+ Considering some additional blacklight integration support; is any needed?
76
73
 
77
74
  Other components/features may become more clear as we continue to develop. It's possible that kithe won't (at least for a long time) contain controllers themselves (it may contain some helper methods for controllers), or generalized permissions architecture. Both of these are some of the things most particular to specific apps, that are hard to generalize without creating monsters.
78
75
 
@@ -83,8 +80,6 @@ This is a Rails 'engine' whose template was created with: `rails plugin new kith
83
80
 
84
81
  * Note we have chosen not to make it 'mountable' or 'isolated', I think that would be inappropriate for this kind of gem. It _is_ an engine so it can hook into Rails load paths and config as needed.
85
82
 
86
-
87
-
88
83
  * Note we are currently using the standard rails-generated dummy app in spec/dummy for testing, rather than [engine_cart](https://github.com/cbeer/engine_cart) or [combustion](https://github.com/pat/combustion).
89
84
  * Before you run the tests for the first time, create the database by running: `rails db:setup`. This will create two databases, kithe_development and kithe_test.
90
85
  * Some of the rspec tests depend on [FFmpeg](https://ffmpeg.org/) for testing file derivative transformations. Mac users can install [ffmpeg via homebrew](https://formulae.brew.sh/formula/ffmpeg): `brew install ffmpeg`
@@ -0,0 +1,179 @@
1
+ require 'tty/command'
2
+ require 'json'
3
+
4
+ module Kithe
5
+ # Characterizes Audio or Video files using `ffprobe`, a tool that comes with `ffmpeg`.
6
+ #
7
+ # You can pass in a local File object (with a pathname), a local String pathname, or
8
+ # a remote URL. (Remote URLs will be passed directly to ffprobe, which can efficiently
9
+ # fetch just the bytes it needs)
10
+ #
11
+ # You can get back normalized A/V metadata:
12
+ #
13
+ # metadata = FfprobeCharacterization.new(url).normalized_metadata
14
+ #
15
+ # Normalized metadata is a *flat* hash of typed JSON-able values. It uses
16
+ # keys based on what the ActiveEncode gem seems to use, but adds some extras
17
+ # and makes a few tweaks. See the #normalized_metadata method source for
18
+ # keys supplied.
19
+ #
20
+ # Or the complete FFprobe response as JSON. (We try to use ffprobe options that
21
+ # are exhaustive as to what is returned, including ffprobe version(s))
22
+ #
23
+ # ffprobe_results = FfprobeCharacterization.new(url).ffprobe_hash
24
+ #
25
+ class FfprobeCharacterization
26
+ class_attribute :ffprobe_command, default: "ffprobe"
27
+ class_attribute :ffprobe_timeout, default: 10
28
+
29
+ attr_reader :input_arg
30
+
31
+ # @param input [String,File] local File OR local filepath as String, OR remote URL as string
32
+ # If you have a remote url, just passing the remote url is way more performant than
33
+ # downloading it yourself locally -- ffprobe will just fetch the bytes it needs.
34
+ def initialize(input)
35
+ if input.respond_to?(:path)
36
+ input = input.path
37
+ end
38
+ @input_arg = input
39
+ end
40
+
41
+ # a helper for creating a block for shrine uploader, you can always use
42
+ # FfprobeCharacterization.new directly too!
43
+ #
44
+ # * Does not run on "cache" action, only on promotion (or manual execution).
45
+ #
46
+ # * Will run only on items with "audio/" or "video/" content-type.
47
+ #
48
+ # * By default only on main original, not derivatives, although
49
+ # you can pass `run_on_derivatives: true` if desired.
50
+ #
51
+ # Will use ffprobe with direct URL if possible based on source_io (ffprobe
52
+ # can very efficiently access only bytes needed from URL), otherwise will
53
+ # download local temp copy if necessary.
54
+ #
55
+ # class AssetUploader < Kithe::AssetUploader
56
+ # add_metadata do |source_io, **context|
57
+ # Kithe::FfprobeCharacterization.characterize_from_uploader(source_io, context)
58
+ # end
59
+ #
60
+ # #...
61
+ # end
62
+ #
63
+ def self.characterize_from_uploader(source_io, add_metadata_context, run_on_derivatives: false)
64
+ # only for A/V please
65
+ return {} unless add_metadata_context.dig(:metadata, "mime_type")&.start_with?(%r{\A(audio|video)/})
66
+
67
+ # don't run on cache, only on promotion or manual trigger
68
+ return {} unless add_metadata_context[:action] != :cache
69
+
70
+ # don't run on derivatives unless option given
71
+ return {} unless add_metadata_context[:derivative].nil? || run_on_derivatives
72
+
73
+ # ffprobe can use a URL and very efficiently only retrieve what bytes it needs...
74
+ if source_io.respond_to?(:url) && source_io.url.start_with?(/\Ahttps?:/)
75
+ Kithe::FfprobeCharacterization.new(source_io.url).normalized_metadata
76
+ else
77
+ # if not already a file, will download, possibly slow, but gets us to go.
78
+ Shrine.with_file(source_io) do |file|
79
+ Kithe::FfprobeCharacterization.new(file.path).normalized_metadata
80
+ end
81
+ end
82
+ end
83
+
84
+ # ffprobe args come from this suggestion:
85
+ #
86
+ # https://gist.github.com/nrk/2286511?permalink_comment_id=2593200#gistcomment-2593200
87
+ #
88
+ # We also add in various current version tags! If we're going to record all ffprobe
89
+ # output, we'll want that too!
90
+ def ffprobe_options
91
+ [
92
+ "-hide_banner",
93
+ "-loglevel", "fatal",
94
+ "-show_error", "-show_format", "-show_streams", "-show_programs",
95
+ "-show_chapters", "-show_private_data", "-show_versions",
96
+ "-print_format", "json",
97
+ ]
98
+ end
99
+
100
+ # ffprobe output parsed as JSON...
101
+ def ffprobe_hash
102
+ @ffprobe_hash ||= JSON.parse(ffprobe_stdout).merge(
103
+ "ffprobe_options_used" => ffprobe_options.join(" ")
104
+ )
105
+ end
106
+
107
+ # Returns a FLAT JSON-able hash of normalized a/v metadata.
108
+ #
109
+ # Tries to standardize to what ActiveEncode uses, with some changes and additions.
110
+ # https://github.com/samvera-labs/active_encode/blob/42f5ed5427a39e56093a5e82123918c4b2619a47/lib/active_encode/technical_metadata.rb
111
+ #
112
+ # A video file or other container can have more than one audio or video stream in it, although
113
+ # this is somewhat unusual for our domain. For the stream-specific audio_ and video_ metadata
114
+ # returned, we just choose the *first* returned audio or video stream (which may be more or
115
+ # less arbitrary)
116
+ #
117
+ # See also #ffprobe_hash for complete ffprobe results
118
+ def normalized_metadata
119
+ # overall audio_sample_rate are null, audio codec is wrong
120
+ @normalized_metadata ||= {
121
+ "width" => first_video_stream_json&.dig("width"),
122
+ "height" => first_video_stream_json&.dig("height"),
123
+ "frame_rate" => video_frame_rate_as_float, # frames per second
124
+ "duration_seconds" => ffprobe_hash&.dig("format", "duration")&.to_f&.round(3),
125
+ "audio_codec" => first_audio_stream_json&.dig("codec_name"),
126
+ "video_codec" => first_video_stream_json&.dig("codec_name"),
127
+ "audio_bitrate" => first_audio_stream_json&.dig("bit_rate")&.to_i, # in bps
128
+ "video_bitrate" => first_video_stream_json&.dig("bit_rate")&.to_i, # in bps
129
+ # extra ones not ActiveEncode
130
+ "bitrate" => ffprobe_hash.dig("format", "bit_rate")&.to_i, # overall bitrate of whole file in bps
131
+ "audio_sample_rate" => first_audio_stream_json&.dig("sample_rate")&.to_i, # in Hz
132
+ "audio_channels" => first_audio_stream_json&.dig("channels")&.to_i, # usually 1 or 2 (for stereo)
133
+ "audio_channel_layout" => first_audio_stream_json&.dig("channel_layout"), # stereo or mono or (dolby) 2.1, or something else.
134
+ }.compact
135
+ end
136
+
137
+ # just the ffprobe version please. This is also available
138
+ # in ffprobe_hash
139
+ def ffprobe_version
140
+ ffprobe_hash.dig("program_version", "version")
141
+ end
142
+
143
+ private
144
+
145
+ def ffprobe_stdout
146
+ @ffprobe_output ||= TTY::Command.new(printer: :null).run(
147
+ ffprobe_command,
148
+ *ffprobe_options,
149
+ input_arg,
150
+ timeout: ffprobe_timeout).out
151
+ end
152
+
153
+ def first_video_stream_json
154
+ @first_video_stream_json ||= ffprobe_hash["streams"].find { |stream| stream["codec_type"] == "video" }
155
+ end
156
+
157
+ def first_audio_stream_json
158
+ @first_audio_stream_json ||= ffprobe_hash["streams"].find { |stream| stream["codec_type"] == "audio" }
159
+ end
160
+
161
+ # There are a few different values we could choose here. We're going to choose
162
+ # `avg_frame_rate` == number of frames / total duration,
163
+ # vs (not chosen) `r_frame_rate` "the lowest framerate with which all timestamps can be represented accurately (it is the least common multiple of all framerates in the stream)"
164
+ #
165
+ # (note this sometimes gets us not what we expected, like it gets us 29.78 fps instead of 29.97)
166
+ #
167
+ # Then we have to change it from numerator/denominator to a float rounded to two decimal places,
168
+ # which we let ruby rational do for us.
169
+ def video_frame_rate_as_float
170
+ avg_frame_rate = first_video_stream_json&.dig("avg_frame_rate")
171
+
172
+ return nil unless avg_frame_rate
173
+
174
+ return nil if avg_frame_rate.split("/")[1] == "0" # sometimes it returns '0/0', don't know why.
175
+
176
+ Rational(avg_frame_rate).to_f.round(2)
177
+ end
178
+ end
179
+ end
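The normalization in the new `FfprobeCharacterization` class can be illustrated without running `ffprobe` at all. The following is a minimal sketch, not the gem's code: the `sample` hash is a hypothetical, heavily trimmed stand-in for real ffprobe JSON output, and `frame_rate_as_float` is an illustrative helper that mirrors the guard-then-`Rational` approach described in the comments above.

```ruby
# Hypothetical, trimmed ffprobe-style output; real output has many more keys.
sample = {
  "streams" => [
    { "codec_type" => "audio", "codec_name" => "aac", "sample_rate" => "48000" },
    { "codec_type" => "video", "codec_name" => "h264", "avg_frame_rate" => "30000/1001" }
  ],
  "format" => { "duration" => "12.345678" }
}

# Illustrative helper: convert a "numerator/denominator" string to a float
# rounded to two decimal places, returning nil for missing or "x/0" values
# (Rational("0/0") would otherwise raise ZeroDivisionError).
def frame_rate_as_float(avg_frame_rate)
  return nil unless avg_frame_rate
  return nil if avg_frame_rate.split("/")[1] == "0"

  Rational(avg_frame_rate).to_f.round(2)
end

# Pick the first stream of each type, as the class comments describe.
video = sample["streams"].find { |stream| stream["codec_type"] == "video" }
audio = sample["streams"].find { |stream| stream["codec_type"] == "audio" }

# Build a flat hash; #compact drops keys whose values are nil (here, "width").
normalized = {
  "width"             => video&.dig("width"),
  "frame_rate"        => frame_rate_as_float(video&.dig("avg_frame_rate")),
  "duration_seconds"  => sample.dig("format", "duration")&.to_f&.round(3),
  "audio_sample_rate" => audio&.dig("sample_rate")&.to_i
}.compact

puts normalized.inspect
```

Because ffprobe reports most numeric values as strings, each one is converted explicitly, and `compact` keeps the resulting hash free of keys the file does not supply.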
@@ -57,7 +57,7 @@ module Kithe
57
57
  def initialize(batching:, disable_callbacks:, original_settings:,
58
58
  writer:, on_finish:)
59
59
  @original_settings = original_settings
60
- @batching = !!batching
60
+ @batching = batching
61
61
  @disable_callbacks = disable_callbacks
62
62
  @on_finish = on_finish
63
63
 
@@ -80,8 +80,9 @@ module Kithe
80
80
  def writer
81
81
  @writer ||= begin
82
82
  if @batching
83
+ batch_size = (@batching == true) ? Kithe.indexable_settings.batching_mode_batch_size : @batching
83
84
  @local_writer = true
84
- Kithe.indexable_settings.writer_instance!("solr_writer.batch_size" => 100)
85
+ Kithe.indexable_settings.writer_instance!("solr_writer.batch_size" => batch_size)
85
86
  end
86
87
  end
87
88
  end
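The change above lets `batching` carry either `true` (use the configured default) or an explicit batch size, which is why the `!!batching` coercion was dropped. A minimal sketch of that resolution logic follows; `DEFAULT_BATCH_SIZE` and `resolve_batch_size` are hypothetical names used only for illustration, and the value 100 is an assumption taken from the previously hard-coded size, not a confirmed default of `batching_mode_batch_size`.

```ruby
# Hypothetical stand-in for Kithe.indexable_settings.batching_mode_batch_size.
DEFAULT_BATCH_SIZE = 100

# Illustrative helper: `true` means "use the default", an Integer means
# "use exactly this size", and nil/false means batching is off.
def resolve_batch_size(batching, default: DEFAULT_BATCH_SIZE)
  return nil unless batching

  batching == true ? default : batching
end

puts resolve_batch_size(true)   # uses the default
puts resolve_batch_size(10)     # uses the explicit size
puts resolve_batch_size(false).inspect
```

Keeping the raw value instead of coercing it to a boolean is what makes the explicit-size case possible; any truthy non-`true` value is passed through as the size.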
@@ -97,6 +98,8 @@ module Kithe
97
98
  # only call on-finish if we have a writer, batch writers are lazily
98
99
  # created and maybe we never created one
99
100
  if @writer
101
+ # if we created the writer ourselves locally and nobody
102
+ # specified an on_finish, close our locally-created writer.
100
103
  on_finish = if @local_writer && @on_finish.nil?
101
104
  proc {|writer| writer.close }
102
105
  else
@@ -105,7 +108,7 @@ module Kithe
105
108
  on_finish.call(@writer) if on_finish
106
109
  end
107
110
 
108
- Thread.current[THREAD_CURRENT_KEY] = @original_thread_current_settings
111
+ Thread.current[THREAD_CURRENT_KEY] = @original_settings
109
112
  end
110
113
 
111
114
  private
@@ -120,6 +120,9 @@ module Kithe
120
120
  #
121
121
  # By default will use a per-update writer, or thread/block-specific writer configured with `self.index_with`,
122
122
  # or you can pass one in.
123
+ #
124
+ # This method is part of Kithe API, including allowing local apps to override! Backwards
125
+ # compatibility matters for semver with any change to method signature.
123
126
  def update_index(mapper: kithe_indexable_mapper, writer:nil)
124
127
  RecordIndexUpdater.new(self, mapper: mapper, writer: writer).update_index
125
128
  end
@@ -48,7 +48,8 @@ module Kithe
48
48
  first, *rest = *path
49
49
 
50
50
  result = if obj.kind_of?(Array)
51
- obj.flat_map { |item| obj_extractor(item, path) }
51
+ first_path_element = path.shift # remove it from path
52
+ obj.flat_map { |item| obj_extractor(item, [first_path_element]) }
52
53
  elsif obj.kind_of?(Hash)
53
54
  obj[first]
54
55
  else
@@ -27,7 +27,8 @@ class Kithe::Asset < Kithe::Model
27
27
  to: :file, allow_nil: true
28
28
  delegate :stored?, to: :file_attacher
29
29
  delegate :set_promotion_directives, :promotion_directives, to: :file_attacher
30
-
30
+ # delegate "metadata" as #file_metadata
31
+ delegate :metadata, to: :file, prefix: true, allow_nil: true
31
32
 
32
33
  # will be sent to file_attacher.set_promotion_directives, provided by our
33
34
  # kithe_promotion_hooks shrine plugin.
@@ -81,31 +82,11 @@ class Kithe::Asset < Kithe::Model
81
82
  source = file
82
83
  return false unless source
83
84
 
84
- #local_files = file_attacher.process_derivatives(:kithe_derivatives, only: only, except: except, lazy: lazy)
85
- local_files = _process_kithe_derivatives_without_download(source, only: only, except: except, lazy: lazy)
85
+ local_files = file_attacher.process_derivatives(:kithe_derivatives, only: only, except: except, lazy: lazy)
86
86
 
87
87
  file_attacher.add_persisted_derivatives(local_files)
88
88
  end
89
89
 
90
- # Working around Shrine's insistence on pre-downloading original before calling derivative processor.
91
- # We want to avoid that, so when our `lazy` argument is in use, original does not get eagerly downloaded,
92
- # but only gets downloaded if needed to make derivatives.
93
- #
94
- # This is a somewhat hacky way to do that, loking at the internals of shrine `process_derivatives`,
95
- # and pulling them out to skip the parts we don't want. We also lose shrine instrumentation
96
- # around this action.
97
- #
98
- # See: https://github.com/shrinerb/shrine/issues/470
99
- #
100
- # If that were resolved, the 'ordinary' shrine thing would be to replace calls
101
- # to this local private method with:
102
- #
103
- # file_attacher.process_derivatives(:kithe_derivatives, only: only, except: except, lazy: lazy)
104
- #
105
- private def _process_kithe_derivatives_without_download(source, **options)
106
- processor = file_attacher.class.derivatives_processor(:kithe_derivatives)
107
- local_files = file_attacher.instance_exec(source, **options, &processor)
108
- end
109
90
 
110
91
  # Just a convenience for file_attacher.add_persisted_derivatives (from :kithe_derivatives),
111
92
  # feel free to use that if you want to add more than one etc. By default stores to
@@ -144,6 +125,14 @@ class Kithe::Asset < Kithe::Model
144
125
  result && result.values.first
145
126
  end
146
127
 
128
+ # Like #update_derivative, but can update multiple at once.
129
+ #
130
+ # asset.update_derivatives({ "big_thumb" => big_thumb_io, "small_thumb" => small_thumb_io })
131
+ #
132
+ # Options from kithe `add_persisted_derivatives`/shrine `add_derivative` supported.
133
+ #
134
+ # asset.update_derivatives({ "big_thumb" => big_thumb_io, "small_thumb" => small_thumb_io }, delete: false)
135
+ #
147
136
  def update_derivatives(deriv_hash, **options)
148
137
  file_attacher.add_persisted_derivatives(deriv_hash, **options)
149
138
  end
@@ -1,13 +1,14 @@
1
1
  class Kithe::Validators::ModelParent < ActiveModel::Validator
2
2
  def validate(record)
3
+ # don't load the parent just to validate it if it hasn't even changed.
4
+ return unless record.parent_id_changed?
5
+
3
6
  if record.parent.present? && (record.parent.class <= Kithe::Asset)
4
- record.errors[:parent] << 'can not be an Asset instance'
7
+ record.errors.add(:parent, 'can not be an Asset instance')
5
8
  end
6
9
 
7
10
  if record.parent.present? && record.class <= Kithe::Collection
8
- record.errors[:parent] << 'is invalid for Collection instances'
11
+ record.errors.add(:parent, 'is invalid for Collection instances')
9
12
  end
10
-
11
- # TODO avoid recursive parents, maybe using a postgres CTE for efficiency?
12
13
  end
13
14
  end
@@ -42,7 +42,7 @@ class ArrayInclusionValidator < ActiveModel::EachValidator
42
42
 
43
43
  unless not_allowed_values.blank?
44
44
  formatted_rejected = not_allowed_values.uniq.collect(&:inspect).join(",")
45
- record.errors.add(attribute, :inclusion, options.except(:in).merge!(rejected_values: formatted_rejected, value: value))
45
+ record.errors.add(attribute, :inclusion, **options.except(:in).merge!(rejected_values: formatted_rejected, value: value))
46
46
  end
47
47
  end
48
48
  end
data/lib/kithe/engine.rb CHANGED
@@ -9,7 +9,6 @@ require 'shrine'
9
9
  # https://github.com/teoljungberg/fx/issues/33
10
10
  # https://github.com/teoljungberg/fx/pull/53
11
11
  require 'fx'
12
- require 'kithe/patch_fx'
13
12
 
14
13
  # not auto-loaded, let's just load it for backwards compat though
15
14
  require "kithe/config_base"
@@ -22,5 +21,13 @@ module Kithe
22
21
  g.assets false
23
22
  g.helper false
24
23
  end
24
+
25
+ # the fx gem lets us include stored procedures in schema.rb. For it to work
26
+ # in kithe's case, the stored procedures have to be *first* in schema.rb,
27
+ # so they can then be referenced as default value for columns in tables
28
+ # subsequently created. We configure that here, forcing it for any app, yes, sorry.
29
+ Fx.configure do |config|
30
+ config.dump_functions_at_beginning_of_schema = true
31
+ end
25
32
  end
26
33
  end
@@ -1,14 +1,17 @@
1
1
  module Kithe
2
2
  class IndexableSettings
3
3
  attr_accessor :solr_url, :writer_class_name, :writer_settings,
4
- :model_name_solr_field, :solr_id_value_attribute, :disable_callbacks
4
+ :model_name_solr_field, :solr_id_value_attribute, :disable_callbacks,
5
+ :batching_mode_batch_size
5
6
  def initialize(solr_url:, writer_class_name:, writer_settings:,
6
- model_name_solr_field:, solr_id_value_attribute:, disable_callbacks: false)
7
+ model_name_solr_field:, solr_id_value_attribute:, disable_callbacks: false,
8
+ batching_mode_batch_size: 100)
7
9
  @solr_url = solr_url
8
10
  @writer_class_name = writer_class_name
9
11
  @writer_settings = writer_settings
10
12
  @model_name_solr_field = model_name_solr_field
11
13
  @solr_id_value_attribute = solr_id_value_attribute || 'id'
14
+ @batching_mode_batch_size = batching_mode_batch_size
12
15
  end
13
16
 
14
17
  # Use configured solr_url, and merge together with configured
data/lib/kithe/version.rb CHANGED
@@ -1,6 +1,3 @@
1
1
  module Kithe
2
- # not sure why rubygems turned our alphas into 2.0.0.pre.alpha1, inserting
3
- # "pre". We need to do same thing with betas to get version orderings
4
- # appropriate.
5
- VERSION = '2.0.3'
2
+ VERSION = '2.4.0'
6
3
  end
@@ -10,7 +10,11 @@ class Shrine
10
10
 
11
11
  # Register our derivative processor, that will create our registered derivatives,
12
12
  # with our custom options.
13
- uploader::Attacher.derivatives(:kithe_derivatives) do |original, **options|
13
+ #
14
+ # We do download: false, so when our `lazy` argument is in use, original does not get eagerly downloaded,
15
+ # but only gets downloaded if needed to make derivatives. This is great for performance, especially
16
+ # when running a batch job to add only the missing derivatives.
17
+ uploader::Attacher.derivatives(:kithe_derivatives, download: false) do |original, **options|
14
18
  Kithe::Asset::DerivativeCreator.new(self.class.kithe_derivative_definitions,
15
19
  source_io: original,
16
20
  shrine_attacher: self,
@@ -44,10 +48,12 @@ class Shrine
44
48
  # Tempfile and Dir.mktmpdir may be useful.
45
49
  #
46
50
  # If in order to do your transformation you need additional information about the original,
47
- # just add a `record:` keyword argument to your block, and the Asset object will be passed in:
51
+ # just add an `attacher:` keyword argument to your block, and a `Shrine::Attacher` subclass
52
+ # will be passed in. You can then get the model object from `attacher.record`, or the
53
+ # original file as a `Shrine::UploadedFile` object with `attacher.file`.
48
54
  #
49
- # define_derivative :thumbnail do |original_file, record:|
50
- # record.width, record.height, record.content_type # etc
55
+ # define_derivative :thumbnail do |original_file, attacher:|
56
+ # attacher.record.title, attacher.file.width, attacher.file.content_type # etc
51
57
  # end
52
58
  #
53
59
  # Derivatives are normally uploaded to the Shrine storage labeled :kithe_derivatives,
@@ -19,6 +19,12 @@ class Shrine
19
19
  # Ensure that if mime-type can't be otherwise determined, it is assigned
20
20
  # "application/octet-stream", basically the type for generic binary.
21
21
  class KitheDetermineMimeType
22
+ # marcel version 1.0 says audio/x-flac, whereas previous versions
23
+ # said audio/flac, which we prefer. Let's fix it.
24
+ RPELACE_CONTENT_TYPES = {
25
+ "audio/x-flac" => "audio/flac"
26
+ }
27
+
22
28
  def self.load_dependencies(uploader, *)
23
29
  uploader.plugin :determine_mime_type, analyzer: -> (io, analyzers) do
24
30
  mime_type = analyzers[:marcel].call(io)
@@ -30,6 +36,9 @@ class Shrine
30
36
 
31
37
  mime_type = "application/octet-stream" if mime_type.blank?
32
38
 
39
+ # Are there any we prefer an alternate spelling of?
40
+ mime_type = RPELACE_CONTENT_TYPES.fetch(mime_type, mime_type)
41
+
33
42
  mime_type
34
43
  end
35
44
  end
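The replacement lookup in the hunk above relies on `Hash#fetch` with a default, so any type without an entry passes through unchanged. A minimal sketch of that behavior follows, assuming plain Ruby without ActiveSupport (so `nil?`/`empty?` stand in for `blank?`); the constant and method names are hypothetical and not part of kithe.

```ruby
# Hypothetical stand-in for the replacement table shown in the diff.
PREFERRED_CONTENT_TYPES = {
  "audio/x-flac" => "audio/flac"
}.freeze

# Returns the preferred spelling of a detected MIME type.
# Falls back to generic binary when detection produced nothing.
def normalize_mime_type(detected)
  detected = "application/octet-stream" if detected.nil? || detected.empty?
  # Hash#fetch with a default returns the input itself when no entry exists.
  PREFERRED_CONTENT_TYPES.fetch(detected, detected)
end
```

Only types listed in the table are rewritten, so adding another preferred spelling is a one-line change to the hash.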
@@ -26,6 +26,11 @@ class Shrine
26
26
  # Like the shrine `add_derivatives` method, but also *persists* the
27
27
  # derivatives (saves to db), in a reliably concurrency-safe way.
28
28
  #
29
+ # For ruby 3 compatibility, make sure you supply local_files as a hash
30
+ # literal with curly braces:
31
+ #
32
+ # attacher.add_persisted_derivatives({ derivative_name1: io_obj1, deriv2: io2 })
33
+ #
29
34
  # Generally can take any options that shrine `add_derivatives`
30
35
  # can take, including custom `storage` or `metadata` arguments.
31
36
  #
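The curly-brace advice in the comment above follows from Ruby 3's separation of positional and keyword arguments. The sketch below illustrates it with a hypothetical method that has the same shape (one positional hash, then `**options`); it is not kithe's implementation.

```ruby
# Hypothetical method shaped like add_persisted_derivatives:
# one positional hash followed by keyword options.
def add_files_sketch(local_files, **options)
  { files: local_files, options: options }
end

# Explicit braces: the hash is unambiguously the positional argument.
result = add_files_sketch({ thumb: "io1", large: "io2" }, delete: false)

# Without braces, Ruby 3 treats the pairs as keywords, so the positional
# argument is missing and ArgumentError is raised. Ruby 2.7 still converts
# them to a positional hash, with a deprecation warning.
bare_call_error =
  begin
    add_files_sketch(thumb: "io1", large: "io2")
    nil
  rescue ArgumentError => e
    e
  end
```

With braces, `result[:files]` holds the derivative hash and `result[:options]` holds `{ delete: false }` on both Ruby 2.7 and Ruby 3.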
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: kithe
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.0.3
4
+ version: 2.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jonathan Rochkind
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2021-01-11 00:00:00.000000000 Z
11
+ date: 2022-02-14 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rails
@@ -70,14 +70,14 @@ dependencies:
70
70
  requirements:
71
71
  - - "~>"
72
72
  - !ruby/object:Gem::Version
73
- version: '3.2'
73
+ version: '3.3'
74
74
  type: :runtime
75
75
  prerelease: false
76
76
  version_requirements: !ruby/object:Gem::Requirement
77
77
  requirements:
78
78
  - - "~>"
79
79
  - !ruby/object:Gem::Version
80
- version: '3.2'
80
+ version: '3.3'
81
81
  - !ruby/object:Gem::Dependency
82
82
  name: shrine-url
83
83
  requirement: !ruby/object:Gem::Requirement
@@ -188,7 +188,7 @@ dependencies:
188
188
  requirements:
189
189
  - - ">="
190
190
  - !ruby/object:Gem::Version
191
- version: 0.5.0
191
+ version: 0.6.0
192
192
  - - "<"
193
193
  - !ruby/object:Gem::Version
194
194
  version: '1'
@@ -198,7 +198,7 @@ dependencies:
198
198
  requirements:
199
199
  - - ">="
200
200
  - !ruby/object:Gem::Version
201
- version: 0.5.0
201
+ version: 0.6.0
202
202
  - - "<"
203
203
  - !ruby/object:Gem::Version
204
204
  version: '1'
@@ -264,6 +264,20 @@ dependencies:
264
264
  - - ">="
265
265
  - !ruby/object:Gem::Version
266
266
  version: '0'
267
+ - !ruby/object:Gem::Dependency
268
+ name: db-query-matchers
269
+ requirement: !ruby/object:Gem::Requirement
270
+ requirements:
271
+ - - "<"
272
+ - !ruby/object:Gem::Version
273
+ version: '1'
274
+ type: :development
275
+ prerelease: false
276
+ version_requirements: !ruby/object:Gem::Requirement
277
+ requirements:
278
+ - - "<"
279
+ - !ruby/object:Gem::Version
280
+ version: '1'
267
281
  - !ruby/object:Gem::Dependency
268
282
  name: pg
269
283
  requirement: !ruby/object:Gem::Requirement
@@ -331,6 +345,7 @@ files:
331
345
  - README.md
332
346
  - Rakefile
333
347
  - app/assets/config/kithe_manifest.js
348
+ - app/characterization/kithe/ffprobe_characterization.rb
334
349
  - app/derivative_transformers/kithe/ffmpeg_transformer.rb
335
350
  - app/derivative_transformers/kithe/vips_cli_image_to_jpeg.rb
336
351
  - app/helpers/kithe/form_helper.rb
@@ -376,7 +391,6 @@ files:
376
391
  - lib/kithe/config_base.rb
377
392
  - lib/kithe/engine.rb
378
393
  - lib/kithe/indexable_settings.rb
379
- - lib/kithe/patch_fx.rb
380
394
  - lib/kithe/sti_preload.rb
381
395
  - lib/kithe/version.rb
382
396
  - lib/shrine/plugins/kithe_accept_remote_url.rb
@@ -411,7 +425,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
411
425
  - !ruby/object:Gem::Version
412
426
  version: '0'
413
427
  requirements: []
414
- rubygems_version: 3.0.3
428
+ rubygems_version: 3.2.32
415
429
  signing_key:
416
430
  specification_version: 4
417
431
  summary: Shareable tools/components for building a digital collections app in Rails.
@@ -1,39 +0,0 @@
1
- # fx is a gem that lets Rails schema.rb capture postgres functions and triggers
2
- #
3
- # For it to work for our use case, we need it to define functions BEFORE tables when
4
- # doing a `rake db:schema:load`, so we can refer to functions as default values in our
5
- # tables.
6
- #
7
- # This is a known issue in fx, with a PR, but isn't yet merged/released, so we hack
8
- # in a patch to force it. Better than forking.
9
- #
10
- # Based on: https://github.com/teoljungberg/fx/pull/53/
11
- #
12
- # We try to write future-compat code assuming that will be merged eventually....
13
-
14
- require 'fx'
15
-
16
- if Fx.configuration.respond_to?(:dump_functions_at_beginning_of_schema)
17
- # we have the feature!
18
-
19
- Fx.configure do |config|
20
- config.dump_functions_at_beginning_of_schema = true
21
- end
22
-
23
- else
24
- # Fx does not have the feature, we have to patch it in
25
-
26
- require 'fx/schema_dumper/function'
27
-
28
- module Fx
29
- module SchemaDumper
30
- module Function
31
- def tables(stream)
32
- functions(stream)
33
- super
34
- end
35
- end
36
- end
37
- end
38
-
39
- end