Skip to content

Enable break_on_newline extension by default for Markdown#1628

Open
st0012 wants to merge 1 commit intomasterfrom
enable-break-on-newline
Open

Enable break_on_newline extension by default for Markdown#1628
st0012 wants to merge 1 commit intomasterfrom
enable-break-on-newline

Conversation

@st0012
Copy link
Copy Markdown
Member

@st0012 st0012 commented Mar 1, 2026

Summary

  • Enable the existing break_on_newline extension in DEFAULT_EXTENSIONS so the Markdown parser converts soft line breaks to HardBreak objects at parse time
  • This produces visible <br> line breaks in HTML output, matching GFM rendering where > Foo\nbar shows "Foo" and "bar" on separate lines
  • The conversion happens in the Markdown parser's paragraph() method, which is the proper place for Markdown-specific behavior — no changes to the generic accept_paragraph in ToHtml
  • Also removes the CJK-aware newline-to-space gsub from accept_paragraph that is no longer needed (the RDoc markup parser already handles newline joining at parse time in build_paragraph)

Ref: #1550 (comment)

@matzbot
Copy link
Copy Markdown
Collaborator

matzbot commented Mar 1, 2026

🚀 Preview deployment available at: https://3d655dba.rdoc-6cd.pages.dev (commit: d62b032)

@st0012 st0012 added the bug label Mar 1, 2026
Enable the existing break_on_newline extension in DEFAULT_EXTENSIONS so
the Markdown parser converts soft line breaks to HardBreak objects. This
produces visible <br> line breaks in HTML output, matching GFM rendering.

The conversion happens in the Markdown parser's paragraph() method,
which is the proper place for Markdown-specific behavior. The generic
accept_paragraph in ToHtml is unchanged and has no Markdown-specific
logic.

Also remove the CJK-aware newline-to-space gsub from accept_paragraph
that is no longer needed. The RDoc markup parser already handles newline
joining at parse time in build_paragraph.
@st0012 st0012 force-pushed the enable-break-on-newline branch from 2482465 to d62b032 Compare April 4, 2026 14:14
@kou kou requested a review from Copilot April 8, 2026 14:40
@to.start_accepting
@to.accept_paragraph para("+hello+\n", "+world+\n")
assert_equal "\n<p><code>hello</code> <code>world</code></p>\n", @to.res.join
assert_equal "\n<p><code>hello</code>\nworld\n</p>\n", @to.res.join
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional?
Should world be surrounded with <code>?

Suggested change
assert_equal "\n<p><code>hello</code>\nworld\n</p>\n", @to.res.join
assert_equal "\n<p><code>hello</code>\n<code>world</code>\n</p>\n", @to.res.join

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Enables Markdown’s existing break_on_newline extension by default so soft line breaks inside paragraphs are parsed into RDoc::Markup::HardBreak nodes (rendering as <br> in HTML), and removes newline-joining logic from RDoc::Markup::ToHtml#accept_paragraph that is no longer relied on.

Changes:

  • Enable :break_on_newline in RDoc::Markdown::DEFAULT_EXTENSIONS (and the kpeg source) so newline handling happens at parse time.
  • Simplify RDoc::Markup::ToHtml#accept_paragraph by removing the newline-to-space (CJK-aware) substitution.
  • Update/expand tests and update the Markdown reference documentation to reflect the new behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
lib/rdoc/markdown.rb Enables :break_on_newline by default in the Ruby implementation of the Markdown parser.
lib/rdoc/markdown.kpeg Mirrors the default extension enablement in the kpeg source.
lib/rdoc/markup/to_html.rb Removes paragraph newline-joining/substitution logic so <br>\n output is preserved and paragraph text is emitted more directly.
test/rdoc/rdoc_markdown_test.rb Updates expectations to include hard_break nodes and adds an HTML assertion for blockquote continuation.
test/rdoc/rdoc_markdown_test_test.rb Mass-updates expected parsed ASTs to include hard_break where newlines previously remained in paragraph text.
test/rdoc/parser/changelog_test.rb Updates expectations for newline behavior in parsed changelog Markdown fragments.
test/rdoc/markup/to_html_test.rb Adjusts HTML expectations for hard breaks and paragraph newlines after removing the gsub behavior.
doc/markup_reference/markdown.md Updates the GFM comparison notes to reflect preserved line breaks as <br>.
Comments suppressed due to low confidence (2)

lib/rdoc/markdown.rb:596

  • DEFAULT_EXTENSIONS now includes :break_on_newline, but the file-level extension docs earlier in this file still say the extension is "disabled by default". Please update the documentation to match the new default (and clarify whether users should disable it explicitly to preserve the previous soft-break behavior).
  # Extensions enabled by default

  DEFAULT_EXTENSIONS = [
    :break_on_newline,
    :definition_lists,
    :github,
    :html,
    :notes,
    :strike,
  ]

lib/rdoc/markdown.kpeg:211

  • DEFAULT_EXTENSIONS enables :break_on_newline here, but the generated source/docs for the Markdown parser still describe this extension as disabled by default. Please keep the extension documentation in sync with the new default (this file and lib/rdoc/markdown.rb should agree).
  # Extensions enabled by default

  DEFAULT_EXTENSIONS = [
    :break_on_newline,
    :definition_lists,
    :github,
    :html,
    :notes,
    :strike,
  ]

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 313 to 317
def accept_paragraph(paragraph)
@res << "\n<p>"
text = paragraph.text @hard_break
text = text.gsub(/(#{SPACE_SEPARATED_LETTER_CLASS})?\K\r?\n(?=(?(1)(#{SPACE_SEPARATED_LETTER_CLASS})?))/o) {
defined?($2) && ' '
}
@res << to_html(text)
@res << "</p>\n"
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says there are "no changes to the generic accept_paragraph in ToHtml", but this method was changed (newline-to-space gsub removed). Please update the PR description (or add rationale in-code) so reviewers/users understand this behavior change affects ToHtml output beyond Markdown parsing.

Copilot uses AI. Check for mistakes.
Comment on lines 620 to 625
| Link titles | ✅ | ⚠️ | Parsed but not rendered |
| Images | ✅ | ✅ | Full match |
| Autolinks `<url>` | ✅ | ✅ | Full match |
| Soft line breaks | ✅ | ⚠️ | Newlines within paragraphs produce `<br>`; GFM spec renders as a space |
| Hard line breaks | ✅ | ⚠️ | 2+ trailing spaces only; backslash `\` at EOL not supported |
| Backslash escapes | ✅ | ⚠️ | Subset of GFM's escapable characters (e.g., `~` not escapable) |
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description claims this change matches GFM rendering for soft line breaks, but this table row states the opposite (GFM renders soft breaks as a space). Please reconcile the PR description with the documented behavior here to avoid confusion about whether this is intended divergence from GFM.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants