sqlparse/keywords.py (2 additions, 2 deletions)

@@ -59,9 +59,9 @@
 (r'(?![_A-ZÀ-Ü])-?(\d+(\.\d*)|\.\d+)(?![_A-ZÀ-Ü])',
  tokens.Number.Float),
 (r'(?![_A-ZÀ-Ü])-?\d+(?![_A-ZÀ-Ü])', tokens.Number.Integer),
-(r"'(''|\\'|[^'])*'", tokens.String.Single),
+(r"'(''|\\\\|\\'|[^'])*'", tokens.String.Single),
 # not a real string literal in ANSI SQL:
-(r'"(""|\\"|[^"])*"', tokens.String.Symbol),
+(r'"(""|\\\\|\\"|[^"])*"', tokens.String.Symbol),
 (r'(""|".*?[^\\]")', tokens.String.Symbol),
 # sqlite names can be escaped with [square brackets]. left bracket
 # cannot be preceded by word character or a right bracket --
Comment on lines +64 to 65 (Copilot AI, Feb 8, 2026):

This PR also changes the double-quoted pattern, but the added test only covers single-quoted strings. Please add a regression test exercising a double-quoted value containing escaped backslashes (and verifying that tokenization doesn't produce T.Error and doesn't merge tokens), so the \\ addition here is covered.
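A regression test along the lines the reviewer requests might look like the following (a sketch only: the test name is invented here, and the imports are included so the snippet runs standalone, although in tests/test_tokenize.py they already exist at module scope):

    import sqlparse
    from sqlparse import tokens as T

    def test_tokenize_escaped_backslash_double_quoted():
        # A double-quoted value containing an escaped backslash should come
        # back as a single String.Symbol token, with no T.Error tokens and
        # no merging with neighbouring tokens.
        sql = r'SELECT "\\"'
        tokens = list(sqlparse.parse(sql)[0].flatten())
        assert not any(t.ttype in T.Error for t in tokens)
        symbol_tokens = [t for t in tokens if t.ttype in T.String.Symbol]
        assert len(symbol_tokens) == 1
        assert symbol_tokens[0].value == '"\\\\"'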
tests/test_tokenize.py (18 additions, 0 deletions)

@@ -245,3 +245,21 @@ def test_cli_commands():
 p = sqlparse.parse('\\copy')[0]
 assert len(p.tokens) == 1
 assert p.tokens[0].ttype == T.Command
+
+
+def test_tokenize_escaped_backslash():
+    """Test that escaped backslashes in SQL strings are correctly tokenized."""
+    import sqlparse
+    from sqlparse import tokens as T
Comment on lines +252 to +253 (Copilot AI, Feb 8, 2026):

This test file already imports sqlparse and tokens as T at module scope; re-importing them inside this test is redundant and inconsistent with the rest of the file. Prefer using the existing module-level imports to keep the test style consistent.

Suggested change:
-    import sqlparse
-    from sqlparse import tokens as T

+
+    # Test single-quoted string with escaped backslash
+    sql = r"SELECT '\\', '\\'"
+    tokens = list(sqlparse.parse(sql)[0].flatten())
+    token_types = [t.ttype for t in tokens]
+
+    # Should be: SELECT, whitespace, ',', ,, whitespace, ',', (6 tokens after keyword)
Copilot AI, Feb 8, 2026:

The inline comment describing the expected token sequence is incorrect/garbled (it mentions commas and quotes in a way that doesn't match the SQL). Please update it to reflect the actual expected flattened token order for SELECT '\\', '\\' to avoid misleading future readers.

Suggested change:
-    # Should be: SELECT, whitespace, ',', ,, whitespace, ',', (6 tokens after keyword)
+    # Expected flattened token order: SELECT, <WS>, "'\\'", ',', <WS>, "'\\'"
+    assert T.Keyword.DML in token_types  # SELECT
+    string_tokens = [t for t in tokens if t.ttype in (T.String.Single,)]
+    assert len(string_tokens) == 2, f"Expected 2 string tokens, got {len(string_tokens)}"
+    assert string_tokens[0].value == "'\\\\'"
+    assert string_tokens[1].value == "'\\\\'"
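For reference, the token order the corrected comment describes can be confirmed interactively (a sketch; the exact ttype reprs shown in the output comments may vary slightly across sqlparse versions):

    import sqlparse

    for tok in sqlparse.parse(r"SELECT '\\', '\\'")[0].flatten():
        print(tok.ttype, repr(tok.value))
    # Token.Keyword.DML 'SELECT'
    # Token.Text.Whitespace ' '
    # Token.Literal.String.Single "'\\\\'"
    # Token.Punctuation ','
    # Token.Text.Whitespace ' '
    # Token.Literal.String.Single "'\\\\'"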