facebook_dumper 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 464f40adf07b41485a860d342997ea980fba1ebbb499644f64c9f7b3d2298895
4
+ data.tar.gz: '0914b8dcd756512c4724b58a05787bdb6b8a8d131c7283348f342cc8728051ef'
5
+ SHA512:
6
+ metadata.gz: 065f8e630ee552e5cfce878667c75cc1c8649c9725d09a3d49cd97fbfa874de0d55994edbc22c86283051890f2dfadd9f42b6e67cc597e634e6cbff16e5df551
7
+ data.tar.gz: b5812af0302b4d8d5392d3a425386e0a7221bcd597de3d9a33311b0155aef6a2d2ced9cc55a24c8c2c1dec052282316acc53bd2ad4ebcc6067844016803a9c3b
@@ -0,0 +1,14 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+ Gemfile.lock
10
+ *~
11
+ facebook-friends.txt
12
+
13
+ # rspec failure tracking
14
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --format documentation
2
+ --color
3
+ --require spec_helper
@@ -0,0 +1,6 @@
1
+ ---
2
+ language: ruby
3
+ cache: bundler
4
+ rvm:
5
+ - 2.7.1
6
+ before_install: gem install bundler -v 2.1.4
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at koichiro.eto@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [https://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: https://contributor-covenant.org
74
+ [version]: https://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,12 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in facebook_dumper.gemspec
4
+ gemspec
5
+
6
+ gem "rake", "~> 12.0"
7
+ gem "rspec", "~> 3.0"
8
+ #gem "pathname"
9
+ #gem "open-uri"
10
+ gem "nokogiri"
11
+ #gem "pp"
12
+ #gem "rexml"
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2020 Koichiro Eto
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,45 @@
1
+ # FacebookDumper
2
+
3
+ FacebookDumper is a program that parses saved Facebook web pages and dumps their contents as text.
4
+ Currently, it can dump the friends list.
5
+
6
+ ## Installation
7
+
8
+ Add this line to your application's Gemfile:
9
+
10
+ ```ruby
11
+ gem 'facebook_dumper'
12
+ ```
13
+
14
+ And then execute:
15
+
16
+ $ bundle install
17
+
18
+ Or install it yourself as:
19
+
20
+ $ gem install facebook_dumper
21
+
22
+ ## Usage
23
+
24
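+ - Open your Facebook friends list page.
25
+ https://www.facebook.com/[your ID]/friends
26
+
27
+ - If you have many friends, more of them are displayed as you scroll. Holding down the space bar scrolls to the end of the list, so keep it pressed for a while.
28
+ - Once everything is displayed, choose "File" → "Save Page As…" in your web browser and save the page as a file.
29
+ - Run facebook_dumper with that HTML file as its argument, as shown below.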
30
+
31
+ $ facebook_dumper friends.html
32
+
33
+ - The output is saved to `facebook-friends.txt` in the current directory.
34
+
35
+ ## Contributing
36
+
37
+ Bug reports and pull requests are welcome on GitHub at https://github.com/eto/facebook_dumper. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/eto/facebook_dumper/blob/master/CODE_OF_CONDUCT.md).
38
+
39
+ ## License
40
+
41
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
42
+
43
+ ## Code of Conduct
44
+
45
+ Everyone interacting in the FacebookDumper project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/eto/facebook_dumper/blob/master/CODE_OF_CONDUCT.md).
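
The usage steps in the README end with a `facebook-friends.txt` file. Judging from `Person#inspect` in `lib/facebook_dumper.rb`, each line appears to hold a URL, a name, and a mutual-friend count separated by spaces. The following is a minimal sketch of reading such a line back, under that assumption; `FriendRecord` and `parse_friend_line` are hypothetical names and not part of the gem.

```ruby
# Hypothetical helper, not part of facebook_dumper.
# Assumes each line looks like "<url> <name> <mutual count>".
FriendRecord = Struct.new(:url, :name, :mutual_friends)

def parse_friend_line(line)
  fields = line.split(" ")        # split on ASCII whitespace, dropping the newline
  return nil if fields.size < 2   # blank or malformed line

  url = fields.shift              # first field: profile URL
  mutual = fields.pop.to_i        # last field: mutual-friend count
  name = fields.join(" ")         # whatever remains is the name (may be empty)
  FriendRecord.new(url, name, mutual)
end
```

A whole file could then be read with something like `File.foreach("facebook-friends.txt").map { |l| parse_friend_line(l) }.compact`. Note that runs of spaces inside a name are collapsed to a single space by this approach.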
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
7
+
8
+ task :run do
9
+ sh "ruby bin/facebook_dumper ~/dev/200711/*.html"
10
+ end
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "facebook_dumper"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
@@ -0,0 +1,21 @@
1
+ #!/usr/bin/ruby -w
2
+ # coding: utf-8
3
+
4
+ #$LOAD_PATH.unshift("..") if !$LOAD_PATH.include?("..")
5
+ $LOAD_PATH.unshift("lib") if !$LOAD_PATH.include?("lib")
6
+ require "facebook_dumper"
7
+
8
+ dumper = FacebookDumper::FacebookFriendsDumper.new
9
+ dumper.run(ARGV)
10
+
11
+ #.main(ARGV)
12
+ #unless ARGV[0] == "--test"
13
+ #else
14
+ # ARGV.shift
15
+ # require "test/unit"
16
+ # class TestFacebookFriendsDump < Test::Unit::TestCase
17
+ # def test_it
18
+ # assert_equal(2, 1+1)
19
+ # end
20
+ # end
21
+ #end
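
The script above passes `ARGV` straight to `run`, which reads `argv[0]` without checking it. A small pre-flight check could report a missing or mistyped file name before parsing starts. This is only a sketch; `argument_error` is a hypothetical helper and the message wording is an assumption.

```ruby
# Hypothetical pre-flight check, not part of facebook_dumper.
# Returns an error message string, or nil when the arguments look usable.
def argument_error(argv)
  return "usage: facebook_dumper FILE.html" if argv.empty?

  path = argv[0]
  return "no such file: #{path}" unless File.file?(path)

  nil
end
```

The script could call `abort(message)` when this returns a message, and call `dumper.run(ARGV)` otherwise.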
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,29 @@
1
+ require_relative 'lib/facebook_dumper/version'
2
+
3
+ Gem::Specification.new do |spec|
4
+ spec.name = "facebook_dumper"
5
+ spec.version = FacebookDumper::VERSION
6
+ spec.authors = ["Koichiro Eto"]
7
+ spec.email = ["k-eto@aist.go.jp"]
8
+
9
+ spec.summary = %q{Dump Facebook friends list from a web page.}
10
+ spec.description = %q{You can create a list of facebook friends.}
11
+ spec.homepage = "https://github.com/eto/facebook_dumper/"
12
+ spec.license = "MIT"
13
+ spec.required_ruby_version = Gem::Requirement.new(">= 2.3.0")
14
+
15
+ spec.metadata["allowed_push_host"] = "https://rubygems.org/"
16
+
17
+ spec.metadata["homepage_uri"] = spec.homepage
18
+ spec.metadata["source_code_uri"] = "https://github.com/eto/facebook_dumper/"
19
+ spec.metadata["changelog_uri"] = "https://github.com/eto/facebook_dumper/"
20
+
21
+ # Specify which files should be added to the gem when it is released.
22
+ # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
23
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
24
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
25
+ end
26
+ spec.bindir = "exe"
27
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
28
+ spec.require_paths = ["lib"]
29
+ end
@@ -0,0 +1,263 @@
1
+ #!/usr/bin/ruby -w
2
+ # coding: utf-8
3
+
4
+ require "facebook_dumper/version"
5
+
6
+ require 'pathname'
7
+ require 'open-uri'
8
+ require 'nokogiri'
9
+ require 'pp'
10
+ require "rexml/document"
11
+
12
+ module FacebookDumper
13
+ class Error < StandardError; end
14
+
15
+ class Person
16
+ def initialize
17
+ #@uid = 0
18
+ @url = ""
19
+ @name = ""
20
+ @imgsrc = ""
21
+ @friends_mutual = 0
22
+ #@friends_num = 0
23
+ end
24
+ attr_accessor :url
25
+ attr_accessor :name
26
+ attr_accessor :imgsrc
27
+ attr_accessor :friends_mutual
28
+ #attr_accessor :uid, :url, :name, :friends_num, :mutual_friends_num
29
+
30
+ def inspect
31
+ #"#{@url} #{@name} #{@friends_mutual} #{@imgsrc}"
32
+ "#{@url} #{@name} #{@friends_mutual}"
33
+ end
34
+ def <=> other
35
+ #return @uid <=> other.uid
36
+ return @url <=> other.url
37
+ end
38
+ end
39
+
40
+ class FacebookFriendsDumper
41
+ def initialize
42
+ @friends = []
43
+ end
44
+
45
+ def run(argv)
46
+ file = argv[0]
47
+ p file
48
+ extract_from_file file
49
+ out = friends_list
50
+ File.open("facebook-friends.txt", "w") {|f| f.print out}
51
+ end
52
+
53
+ def take1
54
+ doc.search(:img).each {|node|
55
+ node.remove_attribute("alt")
56
+ node.remove_attribute("role")
57
+ }
58
+ doc.search(:div).each {|node|
59
+ node.remove_attribute("data-ft")
60
+ node.remove_attribute("data-testid")
61
+ node.remove_attribute("style")
62
+ }
63
+ doc.search(:noscript).map &:remove
64
+ doc.search(:a).each {|node|
65
+ =begin
66
+ node.remove_attribute("data-gt")
67
+ node.remove_attribute("ajaxify")
68
+ node.remove_attribute("rel")
69
+ node.remove_attribute("role")
70
+ node.remove_attribute("data-hover")
71
+ node.remove_attribute("data-tooltip-uri")
72
+ node.remove_attribute("tabindex")
73
+ node.remove_attribute("aria-hidden")
74
+ node.remove_attribute("data-profileid")
75
+ node.remove_attribute("data-flloc")
76
+ node.remove_attribute("data-unref")
77
+ node.remove_attribute("data-floc")
78
+ =end
79
+ node.remove_attribute("data-hovercard")
80
+ node.remove_attribute("data-hovercard-prefer-more-content-show")
81
+ node.remove_attribute("style")
82
+ node.remove_attribute("aria-haspopup")
83
+ }
84
+ doc.search(:li).each {|node|
85
+ node.remove_attribute("data-ft")
86
+ node.remove_attribute("data-gt")
87
+ node.remove_attribute("data-alert-id")
88
+ }
89
+ doc.search(:button).each {|node|
90
+ node.remove_attribute("class")
91
+ node.remove_attribute("data-flloc")
92
+ node.remove_attribute("data-profileid")
93
+ node.remove_attribute("type")
94
+ node.remove_attribute("data-cancelref")
95
+ node.remove_attribute("data-floc")
96
+ }
97
+ #open("t3.html", "w") {|f| f.print doc.to_s }
98
+ open("t4.html", "w") {|f| f.print doc.to_xml(:indent => 2) }
99
+
100
+ num = 0
101
+ @friends = []
102
+ doc.xpath('//li[@class="_698"]').each {|node| #<li class="_698">
103
+ node.search(:button).map &:remove
104
+ node.search(:span).each {|n|
105
+ n.remove_attribute("class")
106
+ n.remove_attribute("aria-hidden")
107
+ }
108
+
109
+ person = Person.new
110
+ node.search(:div).each {|n|
111
+ if n.attribute('class').value == "uiProfileBlockContent"
112
+ n.search(:a).each {|a|
113
+ text = a.inner_text
114
+ if text =~ /共通の友達(.+)人/
115
+ person.mutual_friends_num = $1.gsub(",", "").to_i
116
+ elsif text =~ /友達(.+)人/
117
+ person.friends_num = $1.gsub(",", "").to_i
118
+ elsif a.attribute('ajaxify') && a.attribute('ajaxify').value =~ /\/ajax\/friends\/inactive\/dialog\?id=(.+)/
119
+ #<a href="https://www.facebook.com/etocom/friends#" rel="dialog" ajaxify="/ajax/friends/inactive/dialog?id=100008321654013" role="button">干場 隆志
120
+ #ajaxify="/ajax/browser/dialog/mutual_friends/?uid=1036721103"
121
+ uid = $1.to_i
122
+ person.uid = uid
123
+ person.url = "https://www.facebook.com/profile.php?id=#{uid}"
124
+ person.name = a.inner_text.chomp
125
+ person.friends_num = -1
126
+ elsif a.attribute('data-gt')
127
+ person.name = text.chomp
128
+ url = a.attribute('href').value
129
+ url.gsub!("?fref=profile_friend_list&hc_location=friends_tab", "")
130
+ url.gsub!("&fref=profile_friend_list&hc_location=friends_tab", "")
131
+ url.gsub!("?fref=pb&hc_location=friends_tab", "")
132
+ url.gsub!("&fref=pb&hc_location=friends_tab", "")
133
+ person.url = url
134
+ else
135
+ # ignore
136
+ end
137
+ }
138
+ #friends << [friend_url, friend_name, friend_num, friend_mutual]
139
+ @friends << person
140
+ end
141
+ }
142
+ num += 1
143
+ }
144
+ #p num #4995
145
+ return @friends
146
+ end
147
+
148
+ def parse_a_person(gpa)
149
+ person = Person.new
150
+ gpa.search(:img).each {|img|
151
+ person.imgsrc = img['src'].to_s
152
+ pa = img.parent
153
+ if pa.name != "a"
154
+ gpa.search(:span).each {|span|
155
+ text = span.inner_text
156
+ unless text =~ /^友達$/
157
+ person.name = text
158
+ person.url = "no_longer_on_Facebook"
159
+ end
160
+ }
161
+ return person # return here, since the user is no longer on Facebook.
162
+ end
163
+ }
164
+ gpa.search(:a).each {|a|
165
+ if a['tabindex'] == "0"
166
+ url = a.attribute('href').value
167
+ text = a.inner_text
168
+ if url =~ /friends_mutual/
169
+ if text =~ /共通の友達(.+)人/
170
+ person.friends_mutual = $1.gsub(",", "").to_i
171
+ end
172
+ else
173
+ person.url = url
174
+ person.name = text
175
+ end
176
+ #f.puts [person.url, person.name]
177
+ end
178
+ #f.puts a.to_html
179
+ }
180
+ return person
181
+ end
182
+
183
+ def extract_from_file(file)
184
+ charset = nil
185
+ html = ""
186
+ File.open(file) {|f|
187
+ html = f.read
188
+ }
189
+ #open("t1.html", "w") {|f| f.print html }
190
+
191
+ doc = Nokogiri::HTML.parse(html, nil, charset)
192
+ #open("t2.html", "w") {|f| f.print doc.to_s }
193
+
194
+ doc.search(:meta).map &:remove
195
+ doc.search(:link).map &:remove
196
+ doc.search(:style).map &:remove
197
+ doc.search(:script).map &:remove
198
+ doc.search(:svg).map &:remove
199
+ doc.search(:div).each {|node|
200
+ node.remove_attribute("class")
201
+ }
202
+ doc.search(:span).each {|node|
203
+ node.remove_attribute("class")
204
+ }
205
+ doc.search(:i).each {|node|
206
+ node.remove_attribute("class")
207
+ }
208
+ doc.search(:a).each {|node|
209
+ node.remove_attribute("class")
210
+ }
211
+ doc.search(:label).each {|node|
212
+ node.remove_attribute("class")
213
+ }
214
+ doc.search(:img).each {|node|
215
+ node.remove_attribute("class")
216
+ }
217
+ doc.search(:input).each {|node|
218
+ node.remove_attribute("class")
219
+ }
220
+ doc.search(:ul).each {|node|
221
+ node.remove_attribute("class")
222
+ }
223
+ doc.search(:li).each {|node|
224
+ node.remove_attribute("class")
225
+ }
226
+ #open("t3.html", "w") {|f| f.print doc.to_s }
227
+
228
+ #$f = open("t4.html", "w")
229
+
230
+ num = 0
231
+ @friends = []
232
+ doc.xpath('//div[@aria-label="友達"]').each {|node|
233
+ gpa = node.parent.parent # get grand parent
234
+ person = parse_a_person(gpa)
235
+ #$f.puts person.inspect
236
+ #$f.puts "----------------------------------------------------------------------"
237
+ @friends << person
238
+ num += 1
239
+ }
240
+ #$f.puts num # 4993
241
+ #$f.close
242
+
243
+ return @friends
244
+ end
245
+
246
+ def friends_list_take1
247
+ ar << [p.url, p.name, p.num]
248
+ if num < 0
249
+ out << "#{p.url} #{p.name} -1\n"
250
+ else
251
+ out << "#{p.url} #{p.name}\n"
252
+ end
253
+ end
254
+
255
+ def friends_list
256
+ out = ""
257
+ @friends.sort.each {|p|
258
+ out << p.inspect + "\n"
259
+ }
260
+ return out
261
+ end
262
+ end
263
+ end
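
The parser above normalizes two scraped values: the mutual-friend count, taken from link text such as `共通の友達1,234人`, and the profile URL, from which tracking parameters are removed with `gsub!`. The sketch below shows both steps as standalone helpers. `mutual_friend_count` and `strip_tracking` are hypothetical names, the regex is tightened to digits and commas (the source uses `(.+)`), and the URL cleanup swaps the literal string replacements for query parsing with the standard `uri` library.

```ruby
# encoding: utf-8
require "uri"

# Hypothetical helpers, not part of facebook_dumper.

# Link text on the Japanese-language page, e.g. "共通の友達1,234人".
MUTUAL_PATTERN = /共通の友達([\d,]+)人/

# Query parameters that the original code strips from profile URLs.
TRACKING_KEYS = %w[fref hc_location].freeze

# Returns the mutual-friend count as an Integer, or 0 if the text does not match.
def mutual_friend_count(text)
  match = MUTUAL_PATTERN.match(text)
  match ? match[1].delete(",").to_i : 0
end

# Drops the tracking parameters and keeps every other query parameter.
def strip_tracking(url)
  uri = URI.parse(url)
  return url if uri.query.nil?

  kept = URI.decode_www_form(uri.query).reject { |key, _| TRACKING_KEYS.include?(key) }
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end
```

Parsing the query string means the parameter order and the `?` versus `&` prefix no longer matter, which the four separate `gsub!` calls handle case by case.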
@@ -0,0 +1,3 @@
1
+ module FacebookDumper
2
+ VERSION = "0.1.0"
3
+ end
metadata ADDED
@@ -0,0 +1,61 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: facebook_dumper
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Koichiro Eto
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2020-08-21 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: You can create a list of facebook friends.
14
+ email:
15
+ - k-eto@aist.go.jp
16
+ executables: []
17
+ extensions: []
18
+ extra_rdoc_files: []
19
+ files:
20
+ - ".gitignore"
21
+ - ".rspec"
22
+ - ".travis.yml"
23
+ - CODE_OF_CONDUCT.md
24
+ - Gemfile
25
+ - LICENSE.txt
26
+ - README.md
27
+ - Rakefile
28
+ - bin/console
29
+ - bin/facebook_dumper
30
+ - bin/setup
31
+ - facebook_dumper.gemspec
32
+ - lib/facebook_dumper.rb
33
+ - lib/facebook_dumper/version.rb
34
+ homepage: https://github.com/eto/facebook_dumper/
35
+ licenses:
36
+ - MIT
37
+ metadata:
38
+ allowed_push_host: https://rubygems.org/
39
+ homepage_uri: https://github.com/eto/facebook_dumper/
40
+ source_code_uri: https://github.com/eto/facebook_dumper/
41
+ changelog_uri: https://github.com/eto/facebook_dumper/
42
+ post_install_message:
43
+ rdoc_options: []
44
+ require_paths:
45
+ - lib
46
+ required_ruby_version: !ruby/object:Gem::Requirement
47
+ requirements:
48
+ - - ">="
49
+ - !ruby/object:Gem::Version
50
+ version: 2.3.0
51
+ required_rubygems_version: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - ">="
54
+ - !ruby/object:Gem::Version
55
+ version: '0'
56
+ requirements: []
57
+ rubygems_version: 3.1.2
58
+ signing_key:
59
+ specification_version: 4
60
+ summary: Dump Facebook friends list from a web page.
61
+ test_files: []