feed2gram 1.0.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 31ad42be4bf5ec4ba881a78695bc7b9cc0619bb3b399559d668361b3be9872b0
-   data.tar.gz: 38617996403f39ca1eeee6285cb31896be7c4dbbb05033d98713c2ac6a1d245e
+   metadata.gz: 7f7ca98db2cd00345d0f9752ee3cbecb4a08802788a12b0c4ff96ea88e7564d6
+   data.tar.gz: 9431ebbf3b607ed4c2741dd2566ba1769b9decf999cf4797dffbd067afe9ffa0
  SHA512:
-   metadata.gz: e28c8903d7933cd8d55454ccf59df9a50a58a60dbaf2794463dbd1ff623af34d4c6077d9b4b202009e2a85b5a031a967e0e71d1ec8f90f58a46bc38222845ff1
-   data.tar.gz: 3a3d448efa515863ad1f37f4be8e75801e9a71769ed00b376a5d5f3723884d94ffa25a965c14b662a3f92e73c94b1c4b670973ae08164ab51605a1bb3d3afa9c
+   metadata.gz: e4d526ea17064ed90989d26df897d9d0f149c53b59afb0fbdc42099a9ffaccc96c9d70a74999c7b973a037dc484e130904f975f8b6ebfe3a8992c64847dc5b64
+   data.tar.gz: '090b8873287f0c87df039c457b2f1129620908b0ffa71c1098c4259f0856e03399fa9bd2690ffb15885b895a0d2fb1659ac59d9d7643865975b2b836e2a351db'
data/CHANGELOG.md CHANGED
@@ -1,4 +1,20 @@
- ## [Unreleased]
+ ## [1.2.0]
+
+ * Add support for the `cover_url` property for reel posts by way of a
+   `data-cover-url` attribute on the `<img>` tag of single-video posts.
+
+ ## [1.1.0]
+
+ * Add support for videos and stories, including:
+   * single-video posts (which post as reels), by setting `data-media-type=video`
+     attribute on a feed entry's `<figure>`'s only `<img>` child
+   * single-image and single-video stories, by setting `data-post-type=stories`
+     attribute on a feed entry's `<figure>` element
+   * carousels that contain videos and photos by setting `data-media-type=video`
+     attribute on each `<img>` tag that contains a video
+ * Print much more granular feedback when publishing and in verbose mode
+ * When all posts are filtered out from the cache, say so (when verbose) and
+   don't update the cache file needlessly
 
  ## [1.0.0]
 
data/README.md CHANGED
@@ -102,13 +102,19 @@ feed2gram uses the first `<figure>` element to generate each Instagram post. Tha
 
  Some things to keep in mind:
 
+ * A `<figure>` may specify a `data-post-type` with a value of `reels`, `stories`, or `post` (if unspecified, the type defaults to `post`)
+ * If `data-post-type` is set to `stories` or `reels`, exactly one image or video must be included. If `post`, then multiple (up to ten) images and videos can be included and will publish as a carousel post
+ * Posting stories (i.e. `<figure data-post-type="stories">`) requires a _business_ account, not a creator one (with a creator account, a "the user is not an Instagram Business" error will be returned)
  * If one `<img>` tag is present, a single photo post will be created. If there are more, a [carousel post](https://developers.facebook.com/docs/instagram-api/guides/content-publishing/#carousel-posts) will be created
  * Because Facebook's servers actually _download your image_ as opposed to receiving them as uploads via the API, every `<img>` tag's `src` attribute must be set to a publicly-reachable, fully-qualified URL
- * Images can't be more than 8MB, or else posting will fail
- * Images must be standard-issue JPEGs, or else posting will fail
+ * To post videos, stories, or reels, set the `data-media-type` attribute on the `<img>` tag to `video` or `image` (a media type of `image` will be assumed by default if left unspecified). Note that `image` and `video` media may be interspersed throughout a carousel
+ * For video (reel) posts containing a single video, you can set `data-cover-url` on the `<img>` tag to a publicly-available URL and the Instagram API will use it as a custom thumbnail for the reel
  * For carousel posts, the aspect ratio of the first image determines the aspect ratio of the rest, so be mindful of how you order the images based on how you want them to appear in the app
  * Only one caption will be published, regardless of whether it's a single photo post or a carousel
  * The caption limit is 2200 characters, so feed2gram will truncate it if necessary
+ * The API is pretty strict about media file formats, too, so you may wish to preprocess images and videos to avoid errors in processing:
+   * Images can't be more than 8MB and must be standard-issue JPEGs
+   * Videos are even stricter (best to just [read the docs](https://developers.facebook.com/docs/instagram-api/reference/ig-user/media#creating), including this bit on [reels](https://developers.facebook.com/docs/video-api/guides/reels-publishing)). Videos that appear in carousels seem to have additional no-longer-documented restrictions (in my testing, 9:16 videos routinely failed but 16:9, 1:1, 4:3, and 3:4 succeeded)
 
  Here's an example `<entry>` from my blog feed:
 
@@ -207,7 +213,7 @@ Look at your cache file (by default, `feed2gram.cache.yml`) and you should see
  all the Atom feed entry URLs that succeeded, failed, or were (by the `--populate-cache` option) skipped. If you don't see the error in the log, try
  removing the relevant URL from the cache and running `feed2gram` again.
 
- ### What are the valid aspect ratios?
+ ### What are the valid aspect ratios for images?
 
  If you're seeing an embedded API error like this one:
 
@@ -216,6 +222,6 @@ The submitted image with aspect ratio ('719/194',) cannot be published. Please s
  ```
 
  It means your photo is too avant garde for a mainstream normie platform like
- Instagram. Make sure all images' aspect ratiosa re between 4:5 and 1.91:1 or
+ Instagram. Make sure all images' aspect ratios are between 4:5 and 1.91:1 or
  else the post will fail.
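
As a quick illustration of the range quoted in the README text, here is a minimal sketch of a pre-flight check. The helper name `postable_aspect_ratio?` is hypothetical and not part of feed2gram; the bounds are simply the 4:5 and 1.91:1 figures from the README, treated as inclusive, which is an assumption about how the API behaves at the edges.

```ruby
# Hypothetical helper (not part of the gem): true when width:height falls
# inside the 4:5 .. 1.91:1 range quoted in the README.
def postable_aspect_ratio?(width, height)
  # Rationals avoid floating-point surprises at the boundaries.
  accepted = Rational(4, 5)..Rational(191, 100)
  accepted.cover?(Rational(width, height))
end

puts postable_aspect_ratio?(719, 194)   # the ratio from the error message
puts postable_aspect_ratio?(1080, 1350) # 4:5 portrait
puts postable_aspect_ratio?(1080, 1080) # square
```

Running a check like this over each image before publishing could surface the problem earlier than the API error does.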
 
data/example/Gemfile ADDED
@@ -0,0 +1,6 @@
+ source "https://rubygems.org"
+
+ gem "rackup"
+ gem "sinatra"
+
+ gem "feed2gram", path: ".."
data/example/Gemfile.lock ADDED
@@ -0,0 +1,59 @@
+ PATH
+   remote: ..
+   specs:
+     feed2gram (1.2.0)
+       nokogiri (~> 1.15)
+
+ GEM
+   remote: https://rubygems.org/
+   specs:
+     base64 (0.2.0)
+     mustermann (3.0.0)
+       ruby2_keywords (~> 0.0.1)
+     nokogiri (1.16.2-aarch64-linux)
+       racc (~> 1.4)
+     nokogiri (1.16.2-arm-linux)
+       racc (~> 1.4)
+     nokogiri (1.16.2-arm64-darwin)
+       racc (~> 1.4)
+     nokogiri (1.16.2-x86-linux)
+       racc (~> 1.4)
+     nokogiri (1.16.2-x86_64-darwin)
+       racc (~> 1.4)
+     nokogiri (1.16.2-x86_64-linux)
+       racc (~> 1.4)
+     racc (1.7.3)
+     rack (3.0.9)
+     rack-protection (4.0.0)
+       base64 (>= 0.1.0)
+       rack (>= 3.0.0, < 4)
+     rack-session (2.0.0)
+       rack (>= 3.0.0)
+     rackup (2.1.0)
+       rack (>= 3)
+       webrick (~> 1.8)
+     ruby2_keywords (0.0.5)
+     sinatra (4.0.0)
+       mustermann (~> 3.0)
+       rack (>= 3.0.0, < 4)
+       rack-protection (= 4.0.0)
+       rack-session (>= 2.0.0, < 3)
+       tilt (~> 2.0)
+     tilt (2.3.0)
+     webrick (1.8.1)
+
+ PLATFORMS
+   aarch64-linux
+   arm-linux
+   arm64-darwin
+   x86-linux
+   x86_64-darwin
+   x86_64-linux
+
+ DEPENDENCIES
+   feed2gram!
+   rackup
+   sinatra
+
+ BUNDLED WITH
+    2.5.4
data/example/sample.xml ADDED
@@ -0,0 +1,38 @@
+ <?xml version="1.0" encoding="UTF-8"?>
+ <feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us">
+   <id>https://gram.betterwithbecky.com/syndications/grams</id>
+   <title>Beckygram</title>
+   <updated>2024-02-09T12:41:16Z</updated>
+   <author>
+     <name>Becky Searls</name>
+     <email>becky@betterwithbecky.com</email>
+   </author>
+   <link href="https://gram.betterwithbecky.com/" rel="alternate" type="text/html" title="HTML"/>
+   <link href="https://gram.betterwithbecky.com/syndications/grams" rel="self" type="application/atom+xml" title="Grams"/>
+   <category term="Fitness"/>
+   <generator uri="https://rubyonrails.org/" version="0.119.0">
+     Ruby on Rails </generator>
+   <icon>https://gram.betterwithbecky.com/favicon.ico</icon>
+   <logo>https://static-cdn.betterwithbecky.com/assets/logo-fa728624cb9c92f3e052ee6b58a653d8e64fda32.png</logo>
+   <rights>Copyright Build with Becky LLC. All rights reserved.</rights>
+   <subtitle>What you're getting when you get Better with Becky.</subtitle>
+   <entry>
+     <id>https://gram.betterwithbecky.com/posts/1/whatever</id>
+     <title type="text">Test Title</title>
+     <link href="https://gram.betterwithbecky.com/posts/1/whatever" rel="alternate" type="text/html"/>
+     <author>
+       <name>Becky Searls</name>
+       <email>becky@example.com</email>
+     </author>
+     <published>2024-02-04T14:59:00Z</published>
+     <updated>2024-02-09T12:41:16Z</updated>
+     <content type="html"><![CDATA[
+       <figure data-post-type="post">
+         <img data-media-type="video" data-cover-url="https://upload.wikimedia.org/wikipedia/commons/7/7c/Aspect_ratio_16_9_example.jpg" src="https://static.videezy.com/system/resources/previews/000/032/359/original/MM008645___BOUNCING_FRUIT_009___1080p___phantom.mp4"/>"
+         <figcaption>
+           Nothing to see here. Just a test post.
+         </figcaption>
+       </figure>
+     ]]></content>
+   </entry>
+ </feed>
data/example/server.rb ADDED
@@ -0,0 +1,9 @@
+ require "sinatra"
+
+ feed_file = ARGV[0] || "sample.xml"
+ raise "Usage: script/server <path/to/xml/file>" unless File.exist?(feed_file)
+ puts "Hosting '#{feed_file}' on port 4567"
+
+ get "/" do
+   send_file feed_file
+ end
@@ -2,20 +2,57 @@ require "nokogiri"
  require "open-uri"
 
  module Feed2Gram
-   Post = Struct.new(:url, :images, :caption, keyword_init: true)
+   Media = Struct.new(:media_type, :url, :cover_url, keyword_init: true) do
+     def video?
+       media_type == "VIDEO"
+     end
+   end
+   Post = Struct.new(:media_type, :url, :medias, :caption, keyword_init: true)
 
    class ParsesEntries
      def parse(feed_url)
        feed = Nokogiri::XML(URI.parse(feed_url).open)
        feed.xpath("//*:entry").map { |entry|
          html = Nokogiri::HTML(entry.xpath("*:content[1]").text)
+         medias = html.xpath("//figure[1]/img").map { |img|
+           Media.new(
+             media_type: (img["data-media-type"] || "image").upcase,
+             url: img["src"],
+             cover_url: img["data-cover-url"]
+           )
+         }
 
          Post.new(
+           media_type: determine_post_media_type(html, medias),
            url: entry.xpath("*:id[1]").text,
-           images: html.xpath("//figure[1]/img").map { |img| img["src"] },
+           medias: medias,
            caption: html.xpath("//figure[1]/figcaption").text.strip
          )
-       }.reject { |post| post.images.empty? }
+       }.select { |post|
+         if post.medias.empty?
+           warn "Skipping post with no <img> tag: #{post.url}"
+         elsif ["STORIES", "REELS"].include?(post.media_type) && post.medias.size > 1
+           warn "Skipping #{post.media_type.downcase} with more than one <img> tag (only one allowed): #{post.url}"
+         else
+           true
+         end
+       }
+     end
+
+     private
+
+     def determine_post_media_type(html, medias)
+       post_type = html.at("//figure[1]")["data-post-type"]&.upcase
+       if ["STORIES", "REELS"].include?(post_type)
+         post_type
+       elsif medias.size > 1
+         "CAROUSEL"
+       elsif medias.first.media_type == "VIDEO"
+         # The VIDEO value for media_type is deprecated outside carousel items. Use the REELS media type to publish a video to your Instagram feed. Please visit https://developers.facebook.com/docs/instagram-api/reference/ig-user/media#creating to publish a video.
+         "REELS"
+       else
+         "IMAGE"
+       end
      end
    end
  end
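
One edge case worth noting in this hunk: `determine_post_media_type` runs inside `map`, before the `select` that drops entries with no `<img>`. For an entry with no `<figure>` at all, `html.at("//figure[1]")` returns `nil` and the `["data-post-type"]` lookup raises, and with an empty `medias` array `medias.first.media_type` does the same. Below is a minimal nil-tolerant sketch of the same decision, using plain strings and arrays instead of Nokogiri nodes. The method name `post_media_type` and its keyword arguments are hypothetical, not part of the gem.

```ruby
# Hypothetical, dependency-free version of the decision logic in the diff.
# post_type:   raw value of the figure's data-post-type attribute (or nil)
# media_types: upcased data-media-type of each <img>, e.g. ["IMAGE", "VIDEO"]
def post_media_type(post_type:, media_types:)
  normalized = post_type&.upcase
  return normalized if ["STORIES", "REELS"].include?(normalized)
  return "CAROUSEL" if media_types.size > 1

  # A lone video publishes as a reel. An empty list falls through to IMAGE
  # and is expected to be filtered out afterwards, as the select block does.
  (media_types.first == "VIDEO") ? "REELS" : "IMAGE"
end

puts post_media_type(post_type: nil, media_types: ["VIDEO"])
puts post_media_type(post_type: "stories", media_types: ["IMAGE"])
puts post_media_type(post_type: "post", media_types: ["IMAGE", "VIDEO"])
puts post_media_type(post_type: nil, media_types: [])
```

In the gem itself, a similar effect could probably be had with safe navigation on the `html.at(...)` and `medias.first` calls, so that figure-less entries reach the "Skipping post with no `<img>` tag" warning instead of raising.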
@@ -9,12 +9,12 @@ module Feed2Gram
        # reverse to post oldest first (most Atom feeds are reverse-chronological)
        posts.reverse.take(post_limit).map { |post|
          begin
-           if post.images.size == 1
-             puts "Publishing single image post for: #{post.url}" if options.verbose
-             publish_single_image(post, config)
+           if post.medias.size == 1
+             puts "Publishing #{post.media_type.downcase} for: #{post.url}" if options.verbose
+             publish_single_media(post, config, options)
            else
-             puts "Publishing carousel post for: #{post.url}" if options.verbose
-             publish_carousel(post, config)
+             puts "Publishing carousel for: #{post.url}" if options.verbose
+             publish_carousel(post, config, options)
            end
          rescue => e
            warn "Failed to post #{post.url}: #{e.message}"
@@ -25,39 +25,88 @@ module Feed2Gram
 
      private
 
-     def publish_single_image(post, config)
-       container_response = Http.post("/#{config.instagram_id}/media", {
-         image_url: post.images.first,
-         caption: post.caption,
-         access_token: config.access_token
-       })
+     def publish_single_media(post, config, options)
+       media = post.medias.first
+
+       puts "Creating media resource for URL - #{media.url}" if options.verbose
+       container_id = Http.post("/#{config.instagram_id}/media", {
+         :media_type => post.media_type,
+         :caption => post.caption,
+         :access_token => config.access_token,
+         :cover_url => media.cover_url,
+         media.video? ? :video_url : :image_url => media.url
+       }.compact)[:id]
+
+       if media.video?
+         wait_for_media_to_upload!(media.url, container_id, config, options)
+       end
+
+       puts "Publishing media for URL - #{media.url}" if options.verbose
        Http.post("/#{config.instagram_id}/media_publish", {
-         creation_id: container_response[:id],
+         creation_id: container_id,
          access_token: config.access_token
        })
        Result.new(post: post, status: :posted)
      end
 
-     def publish_carousel(post, config)
-       image_containers = post.images.take(10).map { |image|
+     def publish_carousel(post, config, options)
+       media_containers = post.medias.take(10).map { |media|
+         puts "Creating media resource for URL - #{media.url}" if options.verbose
          res = Http.post("/#{config.instagram_id}/media", {
-           is_carousel_item: true,
-           image_url: image,
-           access_token: config.access_token
-         })
+           :media_type => media.media_type,
+           :is_carousel_item => true,
+           :access_token => config.access_token,
+           media.video? ? :video_url : :image_url => media.url
+         }.compact)
          res[:id]
        }
-       carousel_container = Http.post("/#{config.instagram_id}/media", {
+       post.medias.select(&:video?).zip(media_containers).each { |media, container_id|
+         wait_for_media_to_upload!(media.url, container_id, config, options)
+       }
+
+       puts "Creating carousel media resource for post - #{post.url}" if options.verbose
+       carousel_id = Http.post("/#{config.instagram_id}/media", {
          caption: post.caption,
-         media_type: "CAROUSEL",
-         children: image_containers.join(","),
+         media_type: post.media_type,
+         children: media_containers.join(","),
          access_token: config.access_token
-       })
+       })[:id]
+       wait_for_media_to_upload!(post.url, carousel_id, config, options)
+
+       puts "Publishing carousel media for post - #{post.url}" if options.verbose
        Http.post("/#{config.instagram_id}/media_publish", {
-         creation_id: carousel_container[:id],
+         creation_id: carousel_id,
          access_token: config.access_token
        })
        Result.new(post: post, status: :posted)
      end
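
A note on the pairing in `publish_carousel`: `post.medias.select(&:video?).zip(media_containers)` filters the medias down to videos before zipping, while `media_containers` still holds one ID per media in the original order. When a carousel mixes images and videos, the first video is therefore paired with the first container ID, which may belong to an image. The sketch below reproduces that with stand-in data and shows one possible alternative, zipping first and filtering second. The struct, URLs, and container IDs are made up for illustration.

```ruby
# Stand-in for the gem's Media struct so the sketch runs on its own.
media_struct = Struct.new(:media_type, :url, keyword_init: true) do
  def video?
    media_type == "VIDEO"
  end
end

medias = [
  media_struct.new(media_type: "IMAGE", url: "https://example.com/a.jpg"),
  media_struct.new(media_type: "VIDEO", url: "https://example.com/b.mp4")
]
# Made-up container IDs, in the same order as medias.
container_ids = ["container-a", "container-b"]

# Pairing as written in the diff: filter first, then zip.
filtered_first = medias.select(&:video?).zip(container_ids)

# Alternative: zip first so each media keeps its own container, then filter.
zipped_first = medias.zip(container_ids).select { |media, _id| media.video? }

puts filtered_first.map { |media, id| "#{media.url} -> #{id}" }
puts zipped_first.map { |media, id| "#{media.url} -> #{id}" }
```

In the gem, the equivalent change would presumably start from `post.medias.take(10)` so the list lines up with the containers that were actually created.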
+
+     SECONDS_PER_WAIT = 30
+     MAX_WAIT_ATTEMPTS = 100
+     # Good ol' loop-and-sleep. Haven't loop do'd in a while
+     def wait_for_media_to_upload!(url, container_id, config, options)
+       wait_attempts = 0
+       loop do
+         if wait_attempts > MAX_WAIT_ATTEMPTS
+           warn "Giving up waiting for media to upload after waiting #{SECONDS_PER_WAIT * MAX_WAIT_ATTEMPTS} seconds: #{url}"
+           break
+         end
+
+         res = Http.get("/#{container_id}", {
+           fields: "status_code",
+           access_token: config.access_token
+         })
+         puts "Upload status #{res[:status_code]} after waiting #{wait_attempts * SECONDS_PER_WAIT} seconds for IG to download #{url}" if options.verbose
+         if res[:status_code] == "FINISHED"
+           break
+         elsif res[:status_code] == "IN_PROGRESS"
+           wait_attempts += 1
+           sleep SECONDS_PER_WAIT
+         else
+           warn "Unexpected status code (#{res[:status_code]}) uploading: #{url}"
+           break
+         end
+       end
+     end
    end
  end
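
Two details of `wait_for_media_to_upload!` are easy to miss. Because the check is `wait_attempts > MAX_WAIT_ATTEMPTS`, the loop sleeps once more than the "giving up" message reports. And since every branch ends in `break`, the caller goes on to `media_publish` whether the upload finished, timed out, or returned an unexpected status. The sketch below shows a bounded variant of the same loop-and-sleep idea that returns an outcome the caller could act on. The method name, the returned symbols, and the injected `fetch_status` and `sleeper` callables are all hypothetical; they exist so the loop can run without network calls or real sleeping.

```ruby
# Hypothetical helper: poll until the status is FINISHED, giving up after
# max_attempts polls. Returns :finished, :unexpected, or :timed_out.
def wait_until_finished(fetch_status:, max_attempts: 100, seconds_per_wait: 30,
  sleeper: ->(seconds) { sleep seconds })
  max_attempts.times do |attempt|
    status = fetch_status.call
    return :finished if status == "FINISHED"
    return :unexpected unless status == "IN_PROGRESS"

    # Skip the final sleep, since no further poll follows it.
    sleeper.call(seconds_per_wait) if attempt < max_attempts - 1
  end
  :timed_out
end

# Simulated status sequence standing in for the container status lookups.
statuses = ["IN_PROGRESS", "IN_PROGRESS", "FINISHED"]
slept = []
result = wait_until_finished(
  fetch_status: -> { statuses.shift },
  sleeper: ->(seconds) { slept << seconds }
)
puts "#{result} after sleeping #{slept.sum} seconds"
```

A caller could then raise or skip publishing on anything other than `:finished`, which would let the existing `rescue` in `publish` record the post as failed rather than attempting to publish an unfinished container.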
@@ -1,3 +1,3 @@
  module Feed2Gram
-   VERSION = "1.0.0"
+   VERSION = "1.2.0"
  end
data/lib/feed2gram.rb CHANGED
@@ -26,12 +26,16 @@ module Feed2Gram
      entries = ParsesEntries.new.parse(config.feed_url)
      puts "Found #{entries.size} entries in feed" if options.verbose
      posts = FiltersPosts.new.filter(entries, cache)
-     results = if options.populate_cache
-       puts "Populating cache, marking #{posts.size} posts as skipped" if options.verbose
-       posts.map { |post| Result.new(post: post, status: :skipped) }
+     if posts.empty?
+       puts "No new posts to publish after filtering already-processed posts in #{options.cache_path}" if options.verbose
      else
-       PublishesPosts.new.publish(posts, config, options)
+       results = if options.populate_cache
+         puts "Populating cache, marking #{posts.size} posts as skipped" if options.verbose
+         posts.map { |post| Result.new(post: post, status: :skipped) }
+       else
+         PublishesPosts.new.publish(posts, config, options)
+       end
+       UpdatesCache.new.update!(cache, results, options)
      end
-     UpdatesCache.new.update!(cache, results, options)
    end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: feed2gram
  version: !ruby/object:Gem::Version
-   version: 1.0.0
+   version: 1.2.0
  platform: ruby
  authors:
  - Justin Searls
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-11-08 00:00:00.000000000 Z
+ date: 2024-02-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri
@@ -38,6 +38,10 @@ files:
  - LICENSE.txt
  - README.md
  - Rakefile
+ - example/Gemfile
+ - example/Gemfile.lock
+ - example/sample.xml
+ - example/server.rb
  - exe/feed2gram
  - lib/feed2gram.rb
  - lib/feed2gram/filters_posts.rb
@@ -73,7 +77,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
    - !ruby/object:Gem::Version
      version: '0'
  requirements: []
- rubygems_version: 3.4.17
+ rubygems_version: 3.5.3
  signing_key:
  specification_version: 4
  summary: Reads an Atom feed and posts its entries to Instagram