@ChristianPavilonis
Collaborator

@ChristianPavilonis ChristianPavilonis commented Feb 10, 2026


/** Google ad-serving domains whose URLs should be proxied (exact match). */
const GPT_DOMAINS = [
'securepubads.g.doubleclick.net',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'doubleclick.net', so it might match more hosts than expected.
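To illustrate the class of problem the alert describes, here is a minimal sketch, not the code from script_guard.ts. The variable names (`loose`, `strict`, `lookalike`) and the look-alike host are hypothetical. It shows that a hostname interpolated into a `RegExp` without escaping treats each `.` as "any character", while an escaped one matches only a literal dot.

```typescript
// Hypothetical illustration; names and the look-alike URL are assumptions.
const domain = 'securepubads.g.doubleclick.net';

// Unescaped: every '.' in the hostname is a wildcard for any single character.
const loose = new RegExp(`https?://${domain}`, 'i');

// Escaped: every '.' in the hostname matches only a literal dot.
const strict = new RegExp(`https?://${domain.replace(/\./g, '\\.')}`, 'i');

// A different host: the dot before "doubleclick" is replaced by "x".
const lookalike = 'https://securepubads.gxdoubleclick.net/tag/js/gpt.js';

console.log(loose.test(lookalike));  // true: the wildcard '.' accepts "x"
console.log(strict.test(lookalike)); // false: a literal dot is required
```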

Copilot Autofix

AI about 4 hours ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

/** Google ad-serving domains whose URLs should be proxied (exact match). */
const GPT_DOMAINS = [
'securepubads.g.doubleclick.net',
'pagead2.googlesyndication.com',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'googlesyndication.com', so it might match more hosts than expected.

const GPT_DOMAINS = [
'securepubads.g.doubleclick.net',
'pagead2.googlesyndication.com',
'tpc.googlesyndication.com',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'googlesyndication.com', so it might match more hosts than expected.

Copilot Autofix

AI about 4 hours ago

In general, to avoid incomplete hostname regular expressions, any string used to build a regex should be run through a generic “escape for regex literal” function that escapes all regex metacharacters, not only dots. This ensures future additions to GPT_DOMAINS cannot accidentally introduce patterns that match more than the literal hostname.

Concretely, in crates/js/lib/src/integrations/gpt/script_guard.ts, the fallback block in rewriteUrl currently builds the regex with:

new RegExp(`https?://(?:www\\.)?${domain.replace(/\./g, '\\.')}`, 'i')

This manually escapes only dots in domain. Replace this with a helper escapeRegex (defined in this file) that escapes every regex metacharacter: \ ^ $ * + ? . ( ) | { } [ ]. Use that helper both for domain and for any other future regex constructions based on literal strings if needed. The change is localized to this file: add the helper function (near the top, after constants or helpers) and change the new RegExp(...) call to use escapeRegex(domain) instead of domain.replace(/\./g, '\\.'). No behavior changes for current values, but it becomes robust and satisfies the security rule.

Suggested changeset 1
crates/js/lib/src/integrations/gpt/script_guard.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/crates/js/lib/src/integrations/gpt/script_guard.ts b/crates/js/lib/src/integrations/gpt/script_guard.ts
--- a/crates/js/lib/src/integrations/gpt/script_guard.ts
+++ b/crates/js/lib/src/integrations/gpt/script_guard.ts
@@ -54,6 +54,13 @@
 /** Integration route prefix on the first-party domain. */
 const PROXY_PREFIX = '/integrations/gpt';
 
+/**
+ * Escape a string so it can be safely used inside a RegExp literal.
+ */
+function escapeRegex(value: string): string {
+  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
+
 // ---------------------------------------------------------------------------
 // URL matching and rewriting
 // ---------------------------------------------------------------------------
@@ -98,7 +105,7 @@
       if (lower.includes(domain)) {
         const prefix = hostPrefixForDomain(domain);
         return originalUrl.replace(
-          new RegExp(`https?://(?:www\\.)?${domain.replace(/\./g, '\\.')}`, 'i'),
+          new RegExp(`https?://(?:www\\.)?${escapeRegex(domain)}`, 'i'),
           `${window.location.protocol}//${window.location.host}${PROXY_PREFIX}${prefix}`,
         );
       }
EOF
Unable to commit as this autofix suggestion is now outdated
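As a rough check of what the suggested `escapeRegex` helper does, the sketch below copies the helper from the patch and applies it to one of the listed domains. The sample URLs and the `pattern` and `escaped` names are assumptions made for illustration; the surrounding `rewriteUrl` logic is not reproduced.

```typescript
// Helper as proposed in the Autofix patch: escapes every regex metacharacter.
function escapeRegex(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// For a plain hostname, only the dots are affected.
const escaped = escapeRegex('tpc.googlesyndication.com');
console.log(escaped); // tpc\.googlesyndication\.com

// Same pattern shape as the patched RegExp call (sample URLs are hypothetical).
const pattern = new RegExp(`https?://(?:www\\.)?${escaped}`, 'i');

console.log(pattern.test('https://tpc.googlesyndication.com/x')); // true
console.log(pattern.test('https://tpcxgooglesyndication.com/x')); // false
```

Escaping does not anchor the end of the hostname: a URL such as `https://tpc.googlesyndication.com.evil.example/` would still match this pattern, so an exact-host comparison may be worth considering separately.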
'pagead2.googlesyndication.com',
'tpc.googlesyndication.com',
'googletagservices.com',
'www.googletagservices.com',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'googletagservices.com', so it might match more hosts than expected.

'tpc.googlesyndication.com',
'googletagservices.com',
'www.googletagservices.com',
'cm.g.doubleclick.net',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'doubleclick.net', so it might match more hosts than expected.

'cm.g.doubleclick.net',
'ep1.adtrafficquality.google',
'ep2.adtrafficquality.google',
'www.googleadservices.com',

Check failure

Code scanning / CodeQL

Incomplete regular expression for hostnames

This string, which is used as a regular expression [here](1), has an unescaped '.' before 'googleadservices.com', so it might match more hosts than expected.

@ChristianPavilonis
Collaborator Author

We should consider the scope of this integration.

Currently this does a lot of rewriting and proxying. It doesn't catch everything, but many scripts and 3rd-party calls are proxied through the 1st-party context.

Development

Successfully merging this pull request may close these issues.

As publisher I want to host GPT script in publisher domain