# MD033 - No HTML tags

Aliases: `no-inline-html`

## What this rule does

Flags raw HTML tags in Markdown files so that the equivalent Markdown syntax can be used instead.

## Why this matters

- **Portability**: Pure Markdown works everywhere; HTML might be blocked or stripped
- **Security**: Many platforms sanitize HTML for security reasons
- **Simplicity**: Markdown syntax is cleaner and easier to read than HTML
- **Consistency**: Mixing HTML and Markdown creates inconsistent documents

## Examples

### ✅ Correct

```markdown
# Heading

This is a paragraph with **bold** and *italic* text.

> This is a quote

- List item 1
- List item 2

[Link text](https://example.com)

![Image description](image.png)

Contact us at <support@example.com>
Visit <https://example.com>
```

### ❌ Incorrect

<!-- rumdl-disable MD033 -->

```markdown
# Heading

This is a paragraph with <strong>bold</strong> and <em>italic</em> text.

<blockquote>This is a quote</blockquote>

<ul>
  <li>List item 1</li>
  <li>List item 2</li>
</ul>

<a href="https://example.com">Link text</a>

<img src="image.png" alt="Image description">
```

<!-- rumdl-enable MD033 -->

### 🔧 Fixed

```markdown
# Heading

This is a paragraph with **bold** and *italic* text.

> This is a quote

- List item 1
- List item 2

[Link text](https://example.com)

![Image description](image.png)
```

## Configuration

```toml
[MD033]
allowed-elements = []     # List of allowed HTML tags (default: none)
disallowed-elements = []  # List of disallowed HTML tags (enables disallowed-only mode)
fix = false               # Enable auto-fix to convert simple HTML to Markdown (default: false)
br-style = "trailing-spaces"  # Style for <br> conversion: "trailing-spaces" (default) or "backslash"
```

Shorthand aliases are also supported:

```toml
[MD033]
allowed = []              # Alias for allowed-elements
disallowed = []           # Alias for disallowed-elements
```

### Example allowing specific tags

```toml
[MD033]
allowed-elements = ["br", "hr", "details", "summary"]
```

This allows line breaks (`<br>`), horizontal rules (`<hr>`), and collapsible sections (`<details>` and `<summary>`) while still flagging all other HTML.
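
To make the allow-list behavior concrete, here is a minimal Python sketch. It is not rumdl's implementation: the `TAG` pattern and the `violations` helper are hypothetical simplifications that look only at opening tags and ignore code spans and code blocks.

```python
import re

# Simplified opening-tag matcher (an assumption for this sketch): a tag name,
# optional attributes, and an optional self-closing slash. Autolinks such as
# <https://example.com> or <user@example.com> do not match this pattern.
TAG = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)(?:\s[^>]*)?/?>")


def violations(markdown, allowed_elements=()):
    """Return the names of opening tags that are not in the allow-list."""
    allowed = {name.lower() for name in allowed_elements}
    found = []
    for match in TAG.finditer(markdown):
        name = match.group(1).lower()
        if name not in allowed:
            found.append(name)
    return found


sample = (
    "Line one<br>\n"
    "<details><summary>More</summary>Hidden</details>\n"
    "<strong>bold</strong> <https://example.com>"
)

# With the allow-list from the example above, only <strong> remains.
print(violations(sample, ["br", "hr", "details", "summary"]))
# With an empty allow-list, every opening tag is reported.
print(violations(sample))
```

In this sketch the tag comparison is case-insensitive, which is an assumption rather than documented behavior.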

### GFM Security Mode (disallowed-only)

For GitHub Flavored Markdown, you can use the `disallowed-elements` option to only flag
security-sensitive HTML tags while allowing all other HTML. Use the special value `"gfm"`
to automatically include all GFM-disallowed tags:

```toml
[MD033]
disallowed-elements = ["gfm"]
```

This flags only these security-sensitive tags:

- `<title>`, `<textarea>`, `<style>`, `<xmp>`, `<iframe>`
- `<noembed>`, `<noframes>`, `<script>`, `<plaintext>`

These are the same tags that GitHub filters from rendered Markdown for security reasons.

### Custom disallowed tags

You can also specify your own list of disallowed tags:

```toml
[MD033]
disallowed-elements = ["script", "iframe", "style"]
```

Or combine GFM tags with custom ones:

```toml
[MD033]
disallowed-elements = ["gfm", "marquee", "blink"]
```
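
One way to picture how the special `"gfm"` value combines with custom entries is the Python sketch below. It is an illustration only, not rumdl's code: `expand_disallowed` and `is_flagged` are hypothetical helper names, and the case-insensitive comparison is an assumption.

```python
# The nine tags listed above as GFM-disallowed.
GFM_DISALLOWED = frozenset({
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext",
})


def expand_disallowed(entries):
    """Expand the special "gfm" entry into the GFM tag set; keep other entries as-is."""
    tags = set()
    for entry in entries:
        name = entry.lower()
        if name == "gfm":
            tags.update(GFM_DISALLOWED)
        else:
            tags.add(name)
    return tags


def is_flagged(tag_name, entries):
    """In disallowed-only mode, a tag is flagged only if it is in the expanded set."""
    return tag_name.lower() in expand_disallowed(entries)


print(is_flagged("script", ["gfm"]))                      # True
print(is_flagged("div", ["gfm"]))                         # False
print(is_flagged("marquee", ["gfm", "marquee", "blink"]))  # True
```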

### mdbook projects with semantic HTML

mdbook documentation often uses HTML with CSS classes to add semantic meaning that pure Markdown cannot express (e.g., marking text as filenames, captions, or warnings). For mdbook projects, you can allow semantic containers:

```toml
[MD033]
allowed-elements = ["div", "span"]
```

This permits semantic HTML like:

- `<span class="filename">src/main.rs</span>` - Filename styling
- `<div class="warning">Important note</div>` - Warning boxes
- `<span class="caption">Figure 1: Architecture</span>` - Figure captions

Other HTML is still flagged, including tags that have Markdown equivalents (such as `<em>` and `<strong>`) and security-sensitive tags (such as `<script>`).

## Automatic fixes

Auto-fix for MD033 is **opt-in** (disabled by default). Enable it with:

```toml
[MD033]
fix = true
```

When enabled, simple HTML tags are converted to their Markdown equivalents:

| HTML Tag | Markdown Equivalent |
|----------|---------------------|
| `<em>text</em>`, `<i>text</i>` | `*text*` |
| `<strong>text</strong>`, `<b>text</b>` | `**text**` |
| `<code>text</code>` | `` `text` `` |
| `<br>`, `<br/>` | Two trailing spaces + newline |
| `<hr>`, `<hr/>` | `---` |

**Limitations:**

- Tags with attributes are not converted (attributes might be significant)
- Tags with nested HTML content are not converted to Markdown
- Complex tags (like `<div>`, `<span>`) have their content extracted but are not converted to Markdown equivalents
- For deeply nested HTML, you may need to run the fix multiple times

**Line break style:**

By default, `<br>` tags are converted to two trailing spaces followed by a newline, one of the two hard line break forms defined by CommonMark. You can use the other form, a backslash before the newline, instead:

```toml
[MD033]
fix = true
br-style = "backslash"  # Converts <br> to backslash + newline
```
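
The conversions and limitations described in this section can be approximated with a short Python sketch. This is a simplified model under stated assumptions, not rumdl's actual fixer: `fix_simple_html` and its `br_style` parameter are hypothetical names, and the regular expressions handle only the lowercase, attribute-free forms listed in the table.

```python
import re

# Simple inline tags from the table above, mapped to their Markdown delimiters.
INLINE = {"em": "*", "i": "*", "strong": "**", "b": "**", "code": "`"}

# Matches <tag>text</tag> only when the opening tag has no attributes and the
# content contains no further '<' (that is, no nested HTML).
PAIR = re.compile(r"<(em|i|strong|b|code)>([^<]*)</\1>")
BR = re.compile(r"<br/?>")
HR = re.compile(r"<hr/?>")


def fix_simple_html(text, br_style="trailing-spaces"):
    """Convert simple HTML tags to Markdown; leave everything else untouched."""
    def replace_pair(match):
        delimiter = INLINE[match.group(1)]
        return f"{delimiter}{match.group(2)}{delimiter}"

    text = PAIR.sub(replace_pair, text)
    # Pick the hard line break form: backslash + newline or two spaces + newline.
    line_break = "\\\n" if br_style == "backslash" else "  \n"
    text = BR.sub(line_break, text)
    return HR.sub("---", text)


print(fix_simple_html("<em>a</em> and <b>b</b>"))  # *a* and **b**
print(fix_simple_html('<em class="x">a</em>'))     # unchanged: the tag has attributes
print(repr(fix_simple_html("one<br>two", br_style="backslash")))
```

In this sketch, nested input such as `<strong><em>x</em></strong>` needs two passes: the first converts only the inner `<em>`, and the second converts the now-simple `<strong>`. That mirrors the note above about running the fix more than once, although the real fixer may differ in detail.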

## What's allowed

The following constructs are **not** treated as HTML tags by this rule and are allowed:

- HTML comments: `<!-- This is a comment -->`
- Email autolinks: `<user@example.com>`
- URL autolinks: `<https://example.com>`
- FTP autolinks: `<ftp://files.example.com>`
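
The distinction between these constructs and real tags can be sketched in Python as follows. The patterns are simplified assumptions for illustration (the URI scheme length follows the CommonMark autolink definition, while the email pattern is deliberately loose), and `classify` is a hypothetical helper, not part of rumdl.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# URI autolink: a scheme of 2-32 characters, a colon, then no spaces or angle brackets.
URI_AUTOLINK = re.compile(r"<[A-Za-z][A-Za-z0-9+.-]{1,31}:[^\s<>]*>")
# Email autolink: deliberately loose, just "something@something" in angle brackets.
EMAIL_AUTOLINK = re.compile(r"<[^\s<>@]+@[^\s<>@]+>")
# Simplified opening tag: name, optional attributes, optional self-closing slash.
TAG = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)(?:\s[^>]*)?/?>")


def classify(span):
    """Classify a single angle-bracket span; only "html tag" would be flagged."""
    if COMMENT.fullmatch(span):
        return "comment"
    if URI_AUTOLINK.fullmatch(span):
        return "url autolink"
    if EMAIL_AUTOLINK.fullmatch(span):
        return "email autolink"
    if TAG.fullmatch(span):
        return "html tag"
    return "text"


for span in [
    "<!-- This is a comment -->",
    "<user@example.com>",
    "<https://example.com>",
    "<ftp://files.example.com>",
    '<span class="filename">',
]:
    print(span, "->", classify(span))
```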

## Learn more

- [CommonMark HTML blocks](https://spec.commonmark.org/0.31.2/#html-blocks) - When HTML is needed
- [Markdown Guide - Basic Syntax](https://www.markdownguide.org/basic-syntax/) - Markdown alternatives to HTML

## Related rules

- [MD046](md046.md) - Code block style should be consistent
- [MD034](md034.md) - URLs should be formatted as links
