char_detector 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 65f7fd5b56afb4495f46814a0b0fe7eca1926667db335156cae6e11d34252cd6
+   data.tar.gz: 6aab56d1925f648341ff07fcc1ceadfa3c60a06116cfb59cd7796ce21c10d857
+ SHA512:
+   metadata.gz: 4cacc22143704d0fc681ef46437c3439762026492969ca17e6ddf87b9e9032e41b66972c9dcce3b9f8c340929fce4606f5b1a34f271c2cfd9afb17c67547350b
+   data.tar.gz: 7137b411dc9496836e8be43d1c74d8855ccddfebc71b5a5411bf04cd07571f0aaaa383324a660a8ad31e93c086392ff1ecd98a3b53619c98bee017fc1aefc403
data/.github/workflows/main.yml ADDED
@@ -0,0 +1,27 @@
+ name: Ruby
+
+ on:
+   push:
+     branches:
+       - main
+
+   pull_request:
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+     name: Ruby ${{ matrix.ruby }}
+     strategy:
+       matrix:
+         ruby:
+           - '3.1.1'
+
+     steps:
+     - uses: actions/checkout@v2
+     - name: Set up Ruby
+       uses: ruby/setup-ruby@v1
+       with:
+         ruby-version: ${{ matrix.ruby }}
+         bundler-cache: true
+     - name: Run the default task
+       run: bundle exec rake
data/.gitignore ADDED
@@ -0,0 +1,11 @@
+ /.bundle/
+ /.yardoc
+ /_yardoc/
+ /coverage/
+ /doc/
+ /pkg/
+ /spec/reports/
+ /tmp/
+
+ # rspec failure tracking
+ .rspec_status
data/.ruby-version ADDED
@@ -0,0 +1 @@
+ 3.1.1
data/.tool-versions ADDED
@@ -0,0 +1,2 @@
+ ruby 3.1.1
+
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,74 @@
+ # Contributor Covenant Code of Conduct
+
+ ## Our Pledge
+
+ In the interest of fostering an open and welcoming environment, we as
+ contributors and maintainers pledge to making participation in our project and
+ our community a harassment-free experience for everyone, regardless of age, body
+ size, disability, ethnicity, gender identity and expression, level of experience,
+ nationality, personal appearance, race, religion, or sexual identity and
+ orientation.
+
+ ## Our Standards
+
+ Examples of behavior that contributes to creating a positive environment
+ include:
+
+ * Using welcoming and inclusive language
+ * Being respectful of differing viewpoints and experiences
+ * Gracefully accepting constructive criticism
+ * Focusing on what is best for the community
+ * Showing empathy towards other community members
+
+ Examples of unacceptable behavior by participants include:
+
+ * The use of sexualized language or imagery and unwelcome sexual attention or
+   advances
+ * Trolling, insulting/derogatory comments, and personal or political attacks
+ * Public or private harassment
+ * Publishing others' private information, such as a physical or electronic
+   address, without explicit permission
+ * Other conduct which could reasonably be considered inappropriate in a
+   professional setting
+
+ ## Our Responsibilities
+
+ Project maintainers are responsible for clarifying the standards of acceptable
+ behavior and are expected to take appropriate and fair corrective action in
+ response to any instances of unacceptable behavior.
+
+ Project maintainers have the right and responsibility to remove, edit, or
+ reject comments, commits, code, wiki edits, issues, and other contributions
+ that are not aligned to this Code of Conduct, or to ban temporarily or
+ permanently any contributor for other behaviors that they deem inappropriate,
+ threatening, offensive, or harmful.
+
+ ## Scope
+
+ This Code of Conduct applies both within project spaces and in public spaces
+ when an individual is representing the project or its community. Examples of
+ representing a project or community include using an official project e-mail
+ address, posting via an official social media account, or acting as an appointed
+ representative at an online or offline event. Representation of a project may be
+ further defined and clarified by project maintainers.
+
+ ## Enforcement
+
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
+ reported by contacting the project team at ljw532344863@sina.com. All
+ complaints will be reviewed and investigated and will result in a response that
+ is deemed necessary and appropriate to the circumstances. The project team is
+ obligated to maintain confidentiality with regard to the reporter of an incident.
+ Further details of specific enforcement policies may be posted separately.
+
+ Project maintainers who do not follow or enforce the Code of Conduct in good
+ faith may face temporary or permanent repercussions as determined by other
+ members of the project's leadership.
+
+ ## Attribution
+
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+ available at [https://contributor-covenant.org/version/1/4][version]
+
+ [homepage]: https://contributor-covenant.org
+ [version]: https://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,5 @@
+ source "https://rubygems.org"
+
+ ruby File.read('.ruby-version').strip
+
+ gemspec
data/Gemfile.lock ADDED
@@ -0,0 +1,44 @@
+ PATH
+   remote: .
+   specs:
+     char_detector (0.1.0)
+
+ GEM
+   remote: https://rubygems.org/
+   specs:
+     coderay (1.1.3)
+     diff-lcs (1.5.0)
+     method_source (1.0.0)
+     pry (0.14.1)
+       coderay (~> 1.1)
+       method_source (~> 1.0)
+     rake (13.0.6)
+     rspec (3.11.0)
+       rspec-core (~> 3.11.0)
+       rspec-expectations (~> 3.11.0)
+       rspec-mocks (~> 3.11.0)
+     rspec-core (3.11.0)
+       rspec-support (~> 3.11.0)
+     rspec-expectations (3.11.0)
+       diff-lcs (>= 1.2.0, < 2.0)
+       rspec-support (~> 3.11.0)
+     rspec-mocks (3.11.1)
+       diff-lcs (>= 1.2.0, < 2.0)
+       rspec-support (~> 3.11.0)
+     rspec-support (3.11.0)
+
+ PLATFORMS
+   x86_64-darwin-20
+   x86_64-linux
+
+ DEPENDENCIES
+   char_detector!
+   pry
+   rake
+   rspec
+
+ RUBY VERSION
+    ruby 3.1.1p18
+
+ BUNDLED WITH
+    2.3.7
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2022 lijunwei
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,113 @@
+ # CharDetector
+
+ ## What
+
+ This tool checks whether a file contains [ASCII control characters](https://theasciicode.com.ar/) and, if it does, prints the lines and columns where they occur.
+
+ ## Why
+
+ This started as an annoyance with Sublime Text's "Find in Files". If a file somehow contains ASCII control characters, it is treated as "binary" and the search results show no preview. Such "binary" files also change how the text editor behaves.
+
+ So I wanted a way to find these files and get rid of the control characters.
+
+ ![](./image/Xnip2022-05-08_13-52-58.png)
+
+ Ref: [ASCII table, ASCII codes](https://theasciicode.com.ar/)
+
+ ## Roadmap
+
+ + Google keyword: "sublime search result binary"
+   + result: https://stackoverflow.com/questions/26030179/sublime-text-find-in-files-gives-binary-in-the-find-results
+   + result: https://exchangetuts.com/sublime-text-find-in-files-gives-binary-in-the-find-results-1640166664392553
+   + these two results above are not helpful
+   + result: https://blog.csdn.net/wozhouwang/article/details/101672976 (this one actually is the solution, written in Python, but I didn't understand "control character" yet)
+ + I reviewed [character encoding notes/unicode](https://github.com/liijunwei/practice/tree/main/unicode)
+ + I found that if I open the abnormal file with vim, the characters show up; a Google search taught me that they are control characters
+ + [search with [[:cntrl:]] in vim](https://stackoverflow.com/questions/3844311/how-do-i-replace-or-find-non-printable-characters-in-vim-regex)
+ + I came across the term ["POSIX bracket expressions"](https://www.regular-expressions.info/posixbrackets.html)
+ + I found the Ruby version in [Ruby Core Doc@Regexp](https://ruby-doc.org/core-3.1.2/Regexp.html#class-Regexp-label-Character+Properties)
+ + I wrote a temporary Ruby script to scan for abnormal files
+ + I set up a Ruby gem and test-drove (TDD) my script
+   + bundle gem char_detector
+   + Test and fix and refactor
+
+ + TODO refactor
+ + TODO add features (if I have more requirements)
+ + TODO format output (refer to rubocop's output). Thanks Freebird
+
+ ## Installation (WIP)
+
+ + TODO publish this gem to https://rubygems.org/
+
+ Add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'char_detector'
+ ```
+
+ And then execute:
+
+ ```bash
+ bundle install
+ ```
+
+ Or install it yourself as:
+
+ ```bash
+ gem install char_detector
+ ```
+
+ ## Usage
+
+ ### Demo: scanning a single file
+
+ ```bash
+ bin/char_detector -f spec/samples/sample0.txt
+ bin/char_detector -f spec/samples/sample1-newline.txt
+ bin/char_detector -f spec/samples/sample2.txt
+ bin/char_detector -f spec/samples/sample3.txt
+ bin/char_detector -f spec/samples/sample4.txt
+ bin/char_detector -f spec/samples/sample5.txt
+ ```
+
+ ![](./image/Xnip2022-05-08_14-10-52.png)
+
+ ### Demo: scanning a directory or a file pattern
+
+ ```bash
+ # `parallel` is GNU parallel, used here to speed up the scan
+ # macos: https://formulae.brew.sh/formula/parallel
+
+ # with pattern
+ parallel --progress --timeout 50 --retries 3 "bin/char_detector -f {}" ::: $(find spec/samples -name \*.txt)
+ parallel --progress --timeout 50 --retries 3 "bin/char_detector -f {}" ::: $(find spec/samples -name \*.md)
+ parallel --progress --timeout 50 --retries 3 "bin/char_detector -f {}" ::: $(find spec/samples -name \*.rb)
+
+ # for all files in directory
+ # may have unexpected errors :(
+ parallel --progress --timeout 50 --retries 3 "bin/char_detector -f {}" ::: $(find spec/samples -type f)
+ # same as
+ find spec/samples -type f | parallel --progress --timeout 50 --retries 3 "bin/char_detector -f {}"
+ # same as
+ find spec/samples -type f | xargs -I {} bin/char_detector -f {} # this one is slow, so GNU parallel is recommended :D
+ ```
+ ![](./image/Xnip2022-05-08_14-11-04.png)
+
+ ## Development
+
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+ ## Contributing
+
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/char_detector. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/[USERNAME]/char_detector/blob/master/CODE_OF_CONDUCT.md).
+
+
+ ## License
+
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+ ## Code of Conduct
+
+ Everyone interacting in the CharDetector project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/char_detector/blob/master/CODE_OF_CONDUCT.md).
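The README's directory demos start one `bin/char_detector` process per file. As a rough sketch, the same kind of directory scan could be done in a single Ruby process when GNU parallel is not available. `files_with_control_chars` is a hypothetical helper, not part of the gem; it only reports which files contain a non-whitespace control character, and the `scrub` call is an assumption about how to treat bytes that are not valid in the file's encoding.

```ruby
# Hypothetical helper (not part of char_detector): returns the paths under
# +root+ matching +pattern+ that contain at least one control character
# other than whitespace.
def files_with_control_chars(root, pattern = "**/*")
  Dir.glob(File.join(root, pattern)).sort.select do |path|
    next false unless File.file?(path)

    File.foreach(path).any? do |line|
      # Assumption: replace invalid byte sequences so the regexp cannot raise.
      line.scrub.each_char.any? { |c| c.match?(/\p{Cntrl}/) && !c.match?(/\p{Space}/) }
    end
  end
end

puts files_with_control_chars("spec/samples", "**/*.txt")
```

Because everything runs in one process, there is no per-file interpreter start-up cost; the trade-off is that files are scanned sequentially.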
data/Rakefile ADDED
@@ -0,0 +1,6 @@
+ require "bundler/gem_tasks"
+ require "rspec/core/rake_task"
+
+ RSpec::Core::RakeTask.new(:spec)
+
+ task :default => :spec
data/bin/char_detector ADDED
@@ -0,0 +1,44 @@
+ #!/usr/bin/env ruby
+
+ require "bundler/setup"
+ require "char_detector"
+ require 'optparse'
+ # pry is only a development dependency and is unused here, so it is not required.
+ require 'json'
+
+ options = {}
+ OptionParser.new do |opts|
+   opts.banner = "Usage: char_detector [options]"
+
+   opts.on("-h", "--help", "Show this message.") do
+     puts opts
+     exit 0
+   end
+
+   opts.on("-f FILE", "--file FILE", "file to scan") do |v|
+     options[:file] = v
+   end
+
+   opts.on("-p", "--pretty", "pretty report.") do
+     options[:pretty] = true
+   end
+
+   opts.on("-v", "--version", "cli version") do
+     puts "CharDetector v#{CharDetector::VERSION}"
+     exit(0)
+   end
+ end.parse!
+
+ info = {}
+ info[:file] = options[:file]
+ info[:result] = CharDetector::Engine.new(file: options[:file]).scan
+
+ unless info[:result].empty?
+   if options[:pretty]
+     jj info
+   else
+     puts info.to_json
+   end
+ end
+
+ exit 0
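The script above passes `options[:file]` straight to the engine, so running it without `-f` would hand `nil` to `File.readlines`. A hedged sketch of parsing and validating the same options in one place follows. `parse_options` is a hypothetical helper, and raising `OptionParser::MissingArgument` is an assumption about the desired behaviour, not something the gem does.

```ruby
require "optparse"

# Hypothetical helper: parses an argv-style array into an options hash
# and insists that a file to scan was given.
def parse_options(argv)
  options = { pretty: false }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: char_detector [options]"
    opts.on("-f FILE", "--file FILE", "file to scan") { |v| options[:file] = v }
    opts.on("-p", "--pretty", "pretty report") { options[:pretty] = true }
  end
  parser.parse(argv) # unlike parse!, this leaves argv untouched

  # Assumption: a missing -f should be reported as a usage error.
  raise OptionParser::MissingArgument, "-f FILE" unless options[:file]

  options
end

p parse_options(["-f", "notes.md", "--pretty"])
```

Keeping the parsing in a method that takes an array, rather than reading `ARGV` directly, makes it possible to exercise the option handling from a spec.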
data/bin/console ADDED
@@ -0,0 +1,10 @@
+ #!/usr/bin/env ruby
+
+ require "bundler/setup"
+ require "char_detector"
+
+ # You can add fixtures and/or initialization code here to make experimenting
+ # with your gem easier. You can also use a different console, if you like.
+
+ require "pry"
+ Pry.start
data/bin/setup ADDED
@@ -0,0 +1,8 @@
+ #!/usr/bin/env bash
+ set -euo pipefail
+ IFS=$'\n\t'
+ set -vx
+
+ bundle install
+
+ # Do any other automated setup that you need to do here
data/char_detector.gemspec ADDED
@@ -0,0 +1,29 @@
+ require_relative 'lib/char_detector/version'
+
+ Gem::Specification.new do |spec|
+   spec.name = "char_detector"
+   spec.version = CharDetector::VERSION
+   spec.authors = ["lijunwei"]
+   spec.email = ["ljw532344863@sina.com"]
+
+   spec.summary = %q{Initially used for detecting control character in markdown/ruby file.}
+   spec.description = %q{Initially used for detecting control character in markdown/ruby file.}
+   spec.homepage = "https://github.com/liijunwei/char_detector"
+   spec.license = "MIT"
+   spec.required_ruby_version = Gem::Requirement.new(">= 2.7.0")
+
+   spec.metadata["allowed_push_host"] = "https://rubygems.org/"
+
+   spec.metadata["homepage_uri"] = spec.homepage
+   spec.metadata["source_code_uri"] = "https://github.com/liijunwei/char_detector"
+   spec.metadata["changelog_uri"] = "https://github.com/liijunwei/char_detector"
+
+   spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
+     `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
+   end
+   spec.add_development_dependency "pry"
+   spec.add_development_dependency "rake"
+   spec.add_development_dependency "rspec"
+   spec.executables = ["char_detector"]
+   spec.require_paths = ["lib"]
+ end
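data/image/Xnip2022-05-08_13-52-58.png ADDED
Binary file
data/image/Xnip2022-05-08_14-10-52.png ADDED
Binary file
data/image/Xnip2022-05-08_14-11-04.png ADDED
Binary file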
data/lib/char_detector/engine.rb ADDED
@@ -0,0 +1,34 @@
+ class CharDetector::Engine
+   def initialize(file:)
+     @file = file
+   end
+
+   attr_reader :file
+
+   def scan
+     matches = []
+
+     File.readlines(file).each_with_index do |line, index|
+       scanned = trim_newline(line).scan(/\p{Cntrl}/)
+       scanned -= trim_newline(line).scan(/\p{Space}/)
+
+       next if scanned.empty?
+       # Walk the line so a repeated control character reports each of its columns.
+       hash = {}
+       hash[:line] = index + 1
+       hash[:columns] = line.each_char.with_index(1).select { |char, _| scanned.include?(char) }.map(&:last)
+
+       matches << hash
+     end
+
+     matches
+   end
+
+   def trim_newline(line)
+     if line.end_with?("\n")
+       line[...-1]
+     else
+       line
+     end
+   end
+ end
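The scan above combines two Unicode property classes: `\p{Cntrl}` finds control characters, and anything that also matches `\p{Space}` (tab, carriage return, and so on) is discarded. A minimal standalone sketch of that per-line rule follows. `control_columns` is a hypothetical helper name, not part of the gem, and it reports a 1-based column for every occurrence.

```ruby
# Hypothetical helper (not part of char_detector): returns the 1-based
# columns of control characters in +line+, ignoring whitespace such as
# tabs and the trailing newline.
def control_columns(line)
  line.each_char.with_index(1).filter_map do |char, column|
    column if char.match?(/\p{Cntrl}/) && !char.match?(/\p{Space}/)
  end
end

p control_columns("ok\tline\n")        # => []
p control_columns("a\u0001b\u0007\n")  # => [2, 4]
```

Checking each character together with its index avoids looking positions up afterwards, which matters when the same control character appears more than once on a line.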
data/lib/char_detector/version.rb ADDED
@@ -0,0 +1,3 @@
+ module CharDetector
+   VERSION = "0.1.0"
+ end
data/lib/char_detector.rb ADDED
@@ -0,0 +1,7 @@
+ require "char_detector/version"
+ require "char_detector/engine"
+
+ module CharDetector
+   class Error < StandardError; end
+   # Your code goes here...
+ end
metadata ADDED
@@ -0,0 +1,110 @@
+ --- !ruby/object:Gem::Specification
+ name: char_detector
+ version: !ruby/object:Gem::Version
+   version: 0.1.0
+ platform: ruby
+ authors:
+ - lijunwei
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2022-05-09 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: pry
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ description: Initially used for detecting control character in markdown/ruby file.
+ email:
+ - ljw532344863@sina.com
+ executables:
+ - char_detector
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - ".github/workflows/main.yml"
+ - ".gitignore"
+ - ".ruby-version"
+ - ".tool-versions"
+ - CODE_OF_CONDUCT.md
+ - Gemfile
+ - Gemfile.lock
+ - LICENSE.txt
+ - README.md
+ - Rakefile
+ - bin/char_detector
+ - bin/console
+ - bin/setup
+ - char_detector.gemspec
+ - image/Xnip2022-05-08_13-52-58.png
+ - image/Xnip2022-05-08_14-10-52.png
+ - image/Xnip2022-05-08_14-11-04.png
+ - lib/char_detector.rb
+ - lib/char_detector/engine.rb
+ - lib/char_detector/version.rb
+ homepage: https://github.com/liijunwei/char_detector
+ licenses:
+ - MIT
+ metadata:
+   allowed_push_host: https://rubygems.org/
+   homepage_uri: https://github.com/liijunwei/char_detector
+   source_code_uri: https://github.com/liijunwei/char_detector
+   changelog_uri: https://github.com/liijunwei/char_detector
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: 2.7.0
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubygems_version: 3.3.7
+ signing_key:
+ specification_version: 4
+ summary: Initially used for detecting control character in markdown/ruby file.
+ test_files: []