markly 0.14.1 → 0.15.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

checksums.yaml +4 -4
checksums.yaml.gz.sig +0 -0
data/context/abstract-syntax-tree.md +95 -0
data/context/getting-started.md +101 -0
data/context/headings.md +116 -0
data/context/index.yaml +20 -0
data/lib/markly/node.rb +6 -7
data/lib/markly/renderer/headings.rb +81 -0
data/lib/markly/renderer/html.rb +12 -9
data/lib/markly/version.rb +1 -1
data/readme.md +10 -0
data/releases.md +8 -0
data.tar.gz.sig +0 -0
metadata +6 -1
metadata.gz.sig +0 -0

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7eb58c81c8fca95217e5a1469baf3034785126d6ff42e9929a7d074e358b9508
-  data.tar.gz: f28760fe0280ce1ab132d7ec639584284dde3ba8077e68f2cd3617715d1211cb
+  metadata.gz: 712a0a7ad856598eb83b20c370e7613896596af5c76cc2379a368382802f09db
+  data.tar.gz: f775598d6fc6e4d5e340bf68518816d3bd23bc44facd209fcd59bd4f0d429639
 SHA512:
-  metadata.gz: ded6fff6a6355a8e78b2daa19a1581d982a35ddbcbddc38efe25d1866e9ce6136ea6eea62c1638874ff9953546a176df11e6ce462c9131ab38648185dec50a7a
-  data.tar.gz: ccc6eb379235978b3a512dee8553a425ca456b01e9f60b10d16927ae1862cda069152fcc62a02a1a60d46b274a1c8aa760183a636c528b8cdc7fdf43a367abd8
+  metadata.gz: f13cbaaba186b5716efb05567721ea024633f0c353f7296721406dd7d4f7918625e5518ae4836371a784808ae99de8d4831bf938df6595ed26da462398c6e39f
+  data.tar.gz: a92d4ee382baaf2bea98defccfd318142f0fdccf42ff5fca5293c17bf611e432d21e178f3d69a272f0c021309d3725454e93ea401d7473495ff4620490071d5d

checksums.yaml.gz.sig CHANGED

Binary file

data/context/abstract-syntax-tree.md ADDED

@@ -0,0 +1,95 @@
+# Abstract Syntax Tree
+This guide explains how to use Markly's abstract syntax tree (AST) to parse and manipulate Markdown documents.
+## Parsing
+You can parse Markdown to a `Document` node using `Markly.parse`:
+~~~ ruby
+require 'markly'
+document = Markly.parse('*Hello* world')
+pp document
+~~~
+This will print out the following:
+~~~
+#<Markly::Node(document):
+	source_position={:start_line=>1, :start_column=>1, :end_line=>1, :end_column=>13}
+	children=[#<Markly::Node(paragraph):
+			 source_position={:start_line=>1, :start_column=>1, :end_line=>1, :end_column=>13}
+			 children=[#<Markly::Node(emph):
+						source_position={:start_line=>1, :start_column=>1, :end_line=>1, :end_column=>7}
+						children=[#<Markly::Node(text): source_position={:start_line=>1, :start_column=>2, :end_line=>1, :end_column=>6}, string_content="Hello">]>,
+					#<Markly::Node(text): source_position={:start_line=>1, :start_column=>8, :end_line=>1, :end_column=>13}, string_content=" world">]>]>
+~~~
+As you can see, a document consists of a root node, which contains several children, they themselves containing children, and so on. We refer to this as the abstract syntax tree (AST).
+## Example: Walking the AST
+You can use `walk` or `each` to iterate over nodes:
+	- `walk` will iterate on a node and recursively iterate on a node's children.
+	- `each` will iterate on a node and its children, but no further.
+<!-- end list -->
+``` ruby
+require 'markly'
+document = Markly.parse("# The site\n\n [GitHub](https://www.github.com)")
+# Walk tree and print out URLs for links:
+document.walk do |node|
+	if node.type == :link
+		puts "URL = #{node.url}"
+	end
+end
+# Capitalize all regular text in headers:
+document.walk do |node|
+	if node.type == :header
+		node.each do |subnode|
+			if subnode.type == :text
+				subnode.string_content = subnode.string_content.upcase
+			end
+		end
+	end
+end
+# Transform links to regular text:
+document.walk do |node|
+	if node.type == :link
+		node.insert_before(node.first_child)
+		node.delete
+	end
+end
+```
+### Creating a Custom Renderer
+You can also derive a class from {ruby Markly::Renderer::HTML} class. Using a pure Ruby renderer is slower, but allows you to customize the output. For example:
+``` ruby
+class MyHtmlRenderer < Markly::Renderer::HTML
+	def initialize
+		super
+		@header_id = 1
+	end
+	def header(node)
+		block do
+			out("<h", node.header_level, " id=\"", @header_id, "\">",
+							 :children, "</h", node.header_level, ">")
+			@header_id += 1
+		end
+	end
+end
+my_renderer = MyHtmlRenderer.new
+puts my_renderer.render(document)
+```

data/context/getting-started.md ADDED

@@ -0,0 +1,101 @@
+# Getting Started
+This guide explains now to install and use Markly.
+## Installation
+Add the gem to your project:
+	$ bundle add markly
+## Usage
+Markly's most basic usage is to convert Markdown to HTML. You can do this in a few ways:
+~~~ ruby
+require 'markly'
+Markly.render_html('Hi *there*')
+# <p>Hi <em>there</em></p>\n
+~~~
+You can also parse a string to receive a `Document` node. You can then print that node to HTML, iterate over the children, and other fun node stuff. For example:
+~~~ ruby
+require 'markly'
+document = Markly.parse('*Hello* world')
+puts(document.to_html) # <p>Hi <em>there</em></p>\n
+document.walk do |node|
+	puts node.type # [:document, :paragraph, :text, :emph, :text]
+end
+~~~
+## Options
+Markly accepts integer flags which control how the Markdown is parsed and rendered.
+### Parse Options
+| Name                                 | Description
+| ------------------------------------ | -----------
+| `Markly::DEFAULT`                    | The default parsing system.
+| `Markly::UNSAFE`                     | Allow raw/custom HTML and unsafe links.
+| `Markly::FOOTNOTES`                  | Parse footnotes.
+| `Markly::LIBERAL_HTML_TAG`           | Support liberal parsing of inline HTML tags.
+| `Markly::SMART`                      | Use smart punctuation (curly quotes, etc.).
+| `Markly::STRIKETHROUGH_DOUBLE_TILDE` | Parse strikethroughs by double tildes (compatibility with [redcarpet](https://github.com/vmg/redcarpet))
+| `Markly::VALIDATE_UTF8`              | Replace illegal sequences with the replacement character `U+FFFD`.
+### Render Options
+| Name                                    | Description                                                     |
+| --------------------------------------- | --------------------------------------------------------------- |
+| `Markly::DEFAULT`                       | The default rendering system.                                   |
+| `Markly::UNSAFE`                        | Allow raw/custom HTML and unsafe links.                         |
+| `Markly::GITHUB_PRE_LANG`               | Use GitHub-style `<pre lang>` for fenced code blocks.           |
+| `Markly::HARD_BREAKS`                   | Treat `\n` as hardbreaks (by adding `<br/>`).                   |
+| `Markly::NO_BREAKS`                     | Translate `\n` in the source to a single whitespace.            |
+| `Markly::SOURCE_POSITION`               | Include source position in rendered HTML.                       |
+| `Markly::TABLE_PREFER_STYLE_ATTRIBUTES` | Use `style` insted of `align` for table cells.                  |
+| `Markly::FULL_INFO_STRING`              | Include full info strings of code blocks in separate attribute. |
+### Passing Options
+To apply a single option, pass it in as a flags option:
+``` ruby
+Markly.parse("\"Hello,\" said the spider.", flags: Markly::SMART)
+# <p>“Hello,” said the spider.</p>\n
+```
+To have multiple options applied, `|` (or) the flags together:
+``` ruby
+Markly.render_html("\"'Shelob' is my name.\"", flags: Markly::HARD_BREAKS|Markly::SOURCE_POSITION)
+```
+## Extensions
+Both `render_html` and `parse` take an optional `extensions:` argument defining the extensions you want enabled as your CommonMark document is being processed:
+``` ruby
+Markly.render_html("<script>hi</script>", flags: Markly::UNSAFE, extensions: [:tagfilter])
+```
+The documentation for these extensions are [defined in this spec](https://github.github.com/gfm/), and the rationale is provided [in this blog post](https://githubengineering.com/a-formal-spec-for-github-markdown/).
+The available extensions are:
+  - `:table` - This provides support for tables.
+  - `:tasklist` - This provides support for task list items.
+  - `:strikethrough` - This provides support for strikethroughs.
+  - `:autolink` - This provides support for automatically converting URLs to anchor tags.
+  - `:tagfilter` - This escapes [several "unsafe" HTML tags](https://github.github.com/gfm/#disallowed-raw-html-extension-), causing them to not have any effect.
+## Developing Locally
+After cloning the repo:
+	$ bake build test

data/context/headings.md ADDED

@@ -0,0 +1,116 @@
+# Headings
+This guide explains how to work with headings in Markly, including extracting them for navigation and handling duplicate heading text.
+## Unique ID Generation
+When rendering HTML with `ids: true`, duplicate heading text automatically gets unique IDs to avoid collisions. This is particularly useful when multiple sections have the same title (e.g., multiple "Deployment" sections under different parent headings).
+``` ruby
+markdown = <<~MARKDOWN
+  ## Kubernetes
+  ### Deployment
+  ## Systemd
+  ### Deployment
+MARKDOWN
+renderer = Markly::Renderer::HTML.new(ids: true)
+html = renderer.render(Markly.parse(markdown))
+# Generates:
+# <section id="kubernetes">...</section>
+# <section id="deployment">...</section>
+# <section id="systemd">...</section>
+# <section id="deployment-2">...</section>
+```
+The first occurrence gets the clean ID, subsequent duplicates get numbered suffixes (`-2`, `-3`, etc.).
+## Extracting Headings for Table of Contents
+The `Headings` class can extract headings for building navigation or table of contents:
+``` ruby
+document = Markly.parse(markdown)
+headings = Markly::Renderer::Headings.extract(document, min_level: 2, max_level: 3)
+headings.each do |heading|
+  puts "#{heading.level}: #{heading.text} (#{heading.anchor})"
+end
+# Output:
+# 2: Kubernetes (kubernetes)
+# 3: Deployment (deployment)
+# 2: Systemd (systemd)
+# 3: Deployment (deployment-2)
+```
+Each `Heading` object has:
+- `level` - The heading level (1-6)
+- `text` - The plain text content
+- `anchor` - The unique ID/anchor
+- `node` - The original Markly AST node
+### Level Filtering
+Use `min_level` and `max_level` to filter which heading levels to extract:
+``` ruby
+# Only extract h2 and h3 headings
+headings = Markly::Renderer::Headings.extract(document, min_level: 2, max_level: 3)
+# Only h1 headings
+headings = Markly::Renderer::Headings.extract(document, min_level: 1, max_level: 1)
+```
+## Custom Heading Strategies
+For advanced use cases, you can provide a custom `Headings` instance to the HTML renderer:
+### Sharing State Across Documents
+To ensure IDs remain unique across multiple documents:
+``` ruby
+# Share heading state across multiple documents
+headings = Markly::Renderer::Headings.new
+renderer = Markly::Renderer::HTML.new(headings: headings)
+doc1_html = renderer.render(Markly.parse(doc1_markdown))
+doc2_html = renderer.render(Markly.parse(doc2_markdown))
+# IDs remain unique across both documents
+```
+### Custom ID Generation
+Subclass `Headings` to implement alternative ID generation strategies:
+``` ruby
+class HierarchicalHeadings < Markly::Renderer::Headings
+  def initialize
+    super
+    @parent_context = []
+  end
+  def anchor_for(node)
+    base = base_anchor_for(node)
+    # Custom logic: could incorporate parent heading context
+    # to generate IDs like "kubernetes-deployment" instead of "deployment-2"
+    if @ids.key?(base)
+      @ids[base] += 1
+      "#{base}-#{@ids[base]}"
+    else
+      @ids[base] = 1
+      base
+    end
+  end
+end
+renderer = Markly::Renderer::HTML.new(headings: HierarchicalHeadings.new)
+```

data/context/index.yaml ADDED

@@ -0,0 +1,20 @@
+# Automatically generated context index for Utopia::Project guides.
+# Do not edit then files in this directory directly, instead edit the guides and then run `bake utopia:project:agent:context:update`.
+---
+description: CommonMark parser and renderer. Written in C, wrapped in Ruby.
+metadata:
+  documentation_uri: https://ioquatix.github.io/markly/
+  funding_uri: https://github.com/sponsors/ioquatix/
+  source_code_uri: https://github.com/ioquatix/markly.git
+files:
+- path: getting-started.md
+  title: Getting Started
+  description: This guide explains now to install and use Markly.
+- path: abstract-syntax-tree.md
+  title: Abstract Syntax Tree
+  description: This guide explains how to use Markly's abstract syntax tree (AST)
+    to parse and manipulate Markdown documents.
+- path: headings.md
+  title: Headings
+  description: This guide explains how to work with headings in Markly, including
+    extracting them for navigation and handling duplicate heading text.

data/lib/markly/node.rb CHANGED

@@ -29,7 +29,7 @@ module Markly
 		# Public: An iterator that "walks the tree," descending into children recursively.
 		#
-		# blk - A {Proc} representing the action to take for each child
+		# block - A {Proc} representing the action to take for each child
 		def walk(&block)
 			return enum_for(:walk) unless block_given?
@@ -41,7 +41,7 @@ module Markly
 		# Public: Convert the node to an HTML string.
 		#
-		# options - A {Symbol} or {Array of Symbol}s indicating the render options
+		# flags - A {Symbol} or {Array of Symbol}s indicating the render options
 		# extensions - An {Array of Symbol}s indicating the extensions to use
 		#
 		# Returns a {String}.
@@ -51,7 +51,7 @@ module Markly
 		# Public: Convert the node to a CommonMark string.
 		#
-		# options - A {Symbol} or {Array of Symbol}s indicating the render options
+		# flags - A {Symbol} or {Array of Symbol}s indicating the render options
 		# width - Column to wrap the output at
 		#
 		# Returns a {String}.
@@ -63,7 +63,7 @@ module Markly
 		# Public: Convert the node to a plain text string.
 		#
-		# options - A {Symbol} or {Array of Symbol}s indicating the render options
+		# flags - A {Symbol} or {Array of Symbol}s indicating the render options
 		# width - Column to wrap the output at
 		#
 		# Returns a {String}.
@@ -106,7 +106,6 @@ module Markly
 		# Replace a section (header + content) with a new node.
 		#
-		# @parameter title [String] the title of the section to replace.
 		# @parameter new_node [Markly::Node] the node to replace the section with.
 		# @parameter replace_header [Boolean] whether to replace the header itself or not.
 		# @parameter remove_subsections [Boolean] whether to remove subsections or not.
@@ -132,7 +131,7 @@ module Markly
 		# Append the given node after the current node.
 		#
-		# It's okay to provide a document node, it's children will be appended.
+		# It's okay to provide a document node, its children will be appended.
 		#
 		# @parameter node [Markly::Node] the node to append.
 		def append_after(node)
@@ -151,7 +150,7 @@ module Markly
 		# Append the given node before the current node.
 		#
-		# It's okay to provide a document node, it's children will be appended.
+		# It's okay to provide a document node, its children will be appended.
 		#
 		# @parameter node [Markly::Node] the node to append.
 		def append_before(node)

data/lib/markly/renderer/headings.rb ADDED

@@ -0,0 +1,81 @@
+# frozen_string_literal: true
+# Released under the MIT License.
+# Copyright, 2025, by Samuel Williams.
+module Markly
+	module Renderer
+		# Extracts headings from a markdown document with unique anchor IDs.
+		# Handles duplicate heading text by appending counters (e.g., "deployment", "deployment-2", "deployment-3").
+		class Headings
+			def initialize
+				@ids = {}
+			end
+			# Generate a unique anchor for a node.
+			# @parameter node [Markly::Node] The heading node
+			# @returns [String] A unique anchor ID
+			def anchor_for(node)
+				base = base_anchor_for(node)
+				if @ids.key?(base)
+					@ids[base] += 1
+					"#{base}-#{@ids[base]}"
+				else
+					@ids[base] = 1
+					base
+				end
+			end
+			# Extract all headings from a document root with unique anchors.
+			# @parameter root [Markly::Node] The document root node
+			# @parameter min_level [Integer] Minimum heading level to extract (default: 1)
+			# @parameter max_level [Integer] Maximum heading level to extract (default: 6)
+			# @returns [Array<Heading>] Array of heading objects with unique anchors
+			def extract(root, min_level: 1, max_level: 6)
+				headings = []
+				root.walk do |node|
+					if node.type == :header
+						level = node.header_level
+						next if level < min_level || level > max_level
+						headings << Heading.new(
+							node: node,
+							level: level,
+							text: node.to_plaintext.chomp,
+							anchor: anchor_for(node)
+						)
+					end
+				end
+				headings
+			end
+			# Class method for convenience - creates a new instance and extracts headings.
+			# @parameter root [Markly::Node] The document root node
+			# @parameter min_level [Integer] Minimum heading level to extract (default: 1)
+			# @parameter max_level [Integer] Maximum heading level to extract (default: 6)
+			# @returns [Array<Heading>] Array of heading objects with unique anchors
+			def self.extract(root, min_level: 1, max_level: 6)
+				new.extract(root, min_level: min_level, max_level: max_level)
+			end
+			private
+			# Generate a base anchor from a node's text content.
+			# @parameter node [Markly::Node] The heading node
+			# @returns [String] The base anchor (lowercase, hyphenated)
+			def base_anchor_for(node)
+				text = node.to_plaintext.chomp.downcase
+				text.gsub(/\s+/, "-")
+			end
+		end
+		# Represents a heading extracted from a document.
+		# @attribute node [Markly::Node] The original heading node
+		# @attribute level [Integer] The heading level (1-6)
+		# @attribute text [String] The plain text content of the heading
+		# @attribute anchor [String] The unique anchor ID for this heading
+		Heading = Struct.new(:node, :level, :text, :anchor, keyword_init: true)
+	end
+end
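
Note on the hunk above: the duplicate-handling logic in `anchor_for` can be tried without the C extension. The following is a minimal sketch, assuming plain strings in place of `Markly::Node` objects. `AnchorCounter` and its string-based `anchor_for(text)` signature are hypothetical names used only for illustration and are not part of markly.

```ruby
# Hypothetical stand-in for Markly::Renderer::Headings that takes plain
# strings instead of heading nodes, so it runs without the markly gem.
class AnchorCounter
	def initialize
		# Maps each base anchor to the number of times it has been seen.
		@ids = {}
	end
	
	# Same counting scheme as the diff: the first occurrence keeps the bare
	# anchor, later occurrences get "-2", "-3", and so on.
	def anchor_for(text)
		base = text.chomp.downcase.gsub(/\s+/, "-")
		
		if @ids.key?(base)
			@ids[base] += 1
			"#{base}-#{@ids[base]}"
		else
			@ids[base] = 1
			base
		end
	end
end

counter = AnchorCounter.new
titles = ["Kubernetes", "Deployment", "Systemd", "Deployment", "Deployment"]
anchors = titles.map {|title| counter.anchor_for(title)}
p anchors
# => ["kubernetes", "deployment", "systemd", "deployment-2", "deployment-3"]
```

Because the counter is keyed on the base anchor only, a heading whose own text already reduces to `deployment-2` would produce the same ID as a generated suffix. Whether that matters depends on the documents being rendered.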

data/lib/markly/renderer/html.rb CHANGED

@@ -9,15 +9,18 @@
 # Copyright, 2020-2025, by Samuel Williams.
 require_relative "generic"
+require_relative "headings"
 require "cgi"
 module Markly
 	module Renderer
 		class HTML < Generic
-			def initialize(ids: false, tight: false, **options)
+			def initialize(ids: false, headings: nil, tight: false, **options)
 				super(**options)
-				@ids = ids
+				# Initialize heading tracker if IDs are enabled
+				@headings = headings || (ids ? Headings.new : nil)
 				@section = nil
 				@tight = tight
@@ -32,8 +35,8 @@ module Markly
 			end
 			def id_for(node)
-				if @ids
-					anchor = self.class.anchor_for(node)
+				if @headings
+					anchor = @headings.anchor_for(node)
 					return " id=\"#{CGI.escape_html anchor}\""
 				end
 			end
@@ -54,7 +57,7 @@ module Markly
 			def header(node)
 				block do
-					if @ids
+					if @headings
 						out("</section>") if @section
 						@section = true
 						out("<section#{id_for(node)}>")
@@ -253,10 +256,10 @@ module Markly
 			end
 			TABLE_CELL_ALIGNMENT = {
-				left: ' align="left"',
-				right: ' align="right"',
-				center: ' align="center"'
-			}.freeze
+						left: ' align="left"',
+						right: ' align="right"',
+						center: ' align="center"'
+					}.freeze
 			def table_cell(node)
 				align = TABLE_CELL_ALIGNMENT.fetch(@alignments[@column_index], "")
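
Note on the `initialize` hunk above: the new `@headings = headings || (ids ? Headings.new : nil)` expression decides whether the renderer emits IDs at all. A minimal sketch of how that expression resolves, assuming a placeholder class in place of `Markly::Renderer::Headings`. `Tracker` and `resolve_headings` are hypothetical names, not markly API.

```ruby
# Placeholder for Markly::Renderer::Headings (hypothetical, for illustration).
Tracker = Class.new

# Mirrors the expression from the diff:
#   @headings = headings || (ids ? Headings.new : nil)
def resolve_headings(ids: false, headings: nil)
	headings || (ids ? Tracker.new : nil)
end

shared = Tracker.new

# Neither option given: no tracker, so no IDs are emitted.
p resolve_headings.nil?
# ids: true alone: a fresh tracker is created for this renderer.
p resolve_headings(ids: true).is_a?(Tracker)
# An explicit tracker is used as-is, even when ids is left false.
p resolve_headings(headings: shared).equal?(shared)
```

One consequence visible in the diff is that `id_for` and `header` now test `@headings` rather than `@ids`, so passing `headings:` appears to enable section IDs on its own.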

data/lib/markly/version.rb CHANGED

@@ -7,5 +7,5 @@
 # Copyright, 2020-2025, by Samuel Williams.
 module Markly
-	VERSION = "0.14.1"
+	VERSION = "0.15.1"
 end

data/readme.md CHANGED

@@ -22,10 +22,20 @@ Please see the [project documentation](https://ioquatix.github.io/markly/) for m
   - [Abstract Syntax Tree](https://ioquatix.github.io/markly/guides/abstract-syntax-tree/index) - This guide explains how to use Markly's abstract syntax tree (AST) to parse and manipulate Markdown documents.
+  - [Headings](https://ioquatix.github.io/markly/guides/headings/index) - This guide explains how to work with headings in Markly, including extracting them for navigation and handling duplicate heading text.
 ## Releases
 Please see the [project releases](https://ioquatix.github.io/markly/releases/index) for all releases.
+### v0.15.1
+  - Add agent context.
+### v0.15.0
+  - Introduced `Markly::Renderer::Headings` class for extracting headings from markdown documents with automatic duplicate ID resolution. When rendering HTML with `ids: true`, duplicate heading text now automatically gets unique IDs (`deployment`, `deployment-2`, `deployment-3`). The `Headings` class can also be used to extract headings for building navigation or table of contents.
 ### v0.14.0
   - Expose `Markly::Renderer::HTML.anchor_for` method to generate URL-safe anchors from headers.

data/releases.md CHANGED

@@ -1,5 +1,13 @@
 # Releases
+## v0.15.1
+  - Add agent context.
+## v0.15.0
+  - Introduced `Markly::Renderer::Headings` class for extracting headings from markdown documents with automatic duplicate ID resolution. When rendering HTML with `ids: true`, duplicate heading text now automatically gets unique IDs (`deployment`, `deployment-2`, `deployment-3`). The `Headings` class can also be used to extract headings for building navigation or table of contents.
 ## v0.14.0
   - Expose `Markly::Renderer::HTML.anchor_for` method to generate URL-safe anchors from headers.

data.tar.gz.sig CHANGED

Binary file

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: markly
 version: !ruby/object:Gem::Version
-  version: 0.14.1
+  version: 0.15.1
 platform: ruby
 authors:
 - Garen Torikian
@@ -62,6 +62,10 @@ extensions:
 - ext/markly/extconf.rb
 extra_rdoc_files: []
 files:
+- context/abstract-syntax-tree.md
+- context/getting-started.md
+- context/headings.md
+- context/index.yaml
 - ext/markly/arena.c
 - ext/markly/autolink.c
 - ext/markly/autolink.h
@@ -138,6 +142,7 @@ files:
 - lib/markly/node.rb
 - lib/markly/node/inspect.rb
 - lib/markly/renderer/generic.rb
+- lib/markly/renderer/headings.rb
 - lib/markly/renderer/html.rb
 - lib/markly/version.rb
 - license.md

metadata.gz.sig CHANGED

Binary file