Skip to content

Thread-safe mode for multi-tenant applications#18

Closed
dimitri-yatsenko wants to merge 26 commits intomasterfrom
feature/thread-safe-mode
Closed

Thread-safe mode for multi-tenant applications#18
dimitri-yatsenko wants to merge 26 commits intomasterfrom
feature/thread-safe-mode

Conversation

@dimitri-yatsenko
Copy link
Owner

@dimitri-yatsenko dimitri-yatsenko commented Feb 13, 2026

Summary

Add thread-safe mode for multi-tenant applications. When enabled via DJ_THREAD_SAFE=true, global state access (dj.config, dj.conn()) raises ThreadSafetyError, forcing explicit connection management via dj.Instance().

See spec: docs/design/thread-safe-mode.md

Global state audit

Reviewed all module-level mutable state for thread-safety implications:

# Global state Location Resolution
1 config singleton settings.py:979 Guarded — blocked in thread-safe mode, use Instance()
2 conn() singleton connection.py:108 Guarded — blocked in thread-safe mode, use Instance()
3 _codec_registry codecs.py:47-48 Safe — immutable after import (import lock + idempotent entry points)
4 Logging side effects logging.py Low risk — standard Python logging, no DataJoint-specific state
5 use_32bit_dims blob.py:65 Low risk — deserialization flag, rarely changed at runtime
6 compression dict blob.py:61 Low risk — decompressor registry, effectively read-only
7 _lazy_modules __init__.py:92 Low risk — import caching, protected by import lock
8 ADAPTERS dict adapters/__init__.py:16 Low risk — backend registry, effectively read-only

Implementation

  • dj.Instance() class providing isolated config + connection + Schema
  • ThreadSafetyError raised when accessing global singletons in thread-safe mode
  • DJ_THREAD_SAFE environment variable to enable
  • Codec registry documented as thread-safe by design (no per-instance isolation needed)

Usage

export DJ_THREAD_SAFE=true
inst = dj.Instance(
    host="localhost",
    user="user",
    password="password",
)
schema = inst.Schema("my_schema")

🤖 Generated with Claude Code

dimitri-yatsenko and others added 20 commits February 13, 2026 15:58
Simple spec for blocking global state access in multi-tenant environments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove dead code:
- filepath_checksum_size_limit (never used)
- enable_python_native_blobs (never used)
- cache (only query_cache is used)
- init_function/init_command (database init command)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- All settings can be passed to Connection.from_config()
- Only thread_safe is read-only after initialization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Parameters and defaults
- Connection-scoped settings via conn.config
- Never accesses global dj.config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Connection.from_config() creates a Config instance for conn.config
- Database connection settings (host, port, user, password, use_tls, backend)
  become read-only after connection is established
- Other settings remain mutable per-connection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- thread_safe=True: global dj.config becomes read-only
- conn.config copies from global config, always mutable
- Simpler: global config still readable for defaults

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Simpler API:
- Use existing Connection() constructor
- conn.config copies from global dj.config
- conn.config is always mutable for per-connection settings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
All runtime operations must use self.connection.config instead of
global config:
- table.py: safemode for delete/drop
- schemas.py: safemode, create_tables
- preview.py: display settings
- diagram.py: diagram_direction
- jobs.py: all jobs settings
- autopopulate.py: jobs settings
- declare.py: add_job_metadata
- connection.py: reconnect, query_cache
- hash_registry.py, codecs: stores, download_path

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains how connections propagate:
- Connection → Schema → Table classes → Table instances
- Schema falls back to conn() if no connection provided
- Tables inherit connection from schema via _connection class attribute
- In thread_safe mode, Schema("name") fails, Schema("name", connection=conn) works

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.conn().config IS dj.config (same object)
- dj.Connection(...).config is COPY of dj.config (independent)
- All internal code uses self.connection.config uniformly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Connection.__init__ always creates self.config = copy(dj.config)
- dj.conn() overrides after creation: conn.config = dj.config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Mixed scenarios: dj.config affects global connection schemas only
- Explicit connection schemas have independent config
- dj.config.override() affects only schemas using dj.conn()
- conn.config.override() affects only that connection's schemas
- In thread_safe=True: dj.config.override() raises ThreadSafetyError

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New approach using dj.new() for isolated contexts:
- Each context has one config and one connection
- ctx.Schema() auto-uses context's connection
- ctx.Manual, ctx.Lookup, etc. for table base classes
- dj module acts as singleton context (legacy API)
- thread_safe=True blocks singleton, allows dj.new()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- ctx exposes only: config, connection, Schema()
- Connection created at context construction via dj.new()
- Tables still use dj.Manual, dj.Lookup as base classes
- thread_safe=True: dj.config only allows thread_safe access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.config, dj.conn(), dj.Schema() delegate to singleton instance
- Singleton lazily loaded on first access
- thread_safe checked at module import, blocks singleton access
- inst.config created fresh (not copied from dj.config)
- dj.instance() always works, creates isolated instance

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.Manual, dj.Lookup etc. used with @Schema decorator (schema links connection)
- inst.Schema(), inst.FreeTable() need connection directly
- FreeTable added to Instance class

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- dj.Instance() (uppercase) for consistency with dj.Schema()
- Single _singleton_instance created lazily
- dj.config -> _singleton.config (via proxy)
- dj.conn() -> _singleton.connection
- dj.Schema() -> _singleton.Schema()
- dj.FreeTable() -> _singleton.FreeTable()
- All trigger same singleton creation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Instance class that encapsulates config + connection
- Add ThreadSafetyError exception for global state access
- Add _ConfigProxy to delegate dj.config to global config
- Add _get_singleton_connection for lazy connection creation
- Update dj.conn(), dj.Schema(), dj.FreeTable() to use singleton
- Connection now stores _config reference for instance isolation
- Add DJ_THREAD_SAFE environment variable support
- Add comprehensive tests for thread-safe mode

When DJ_THREAD_SAFE=true:
- dj.config raises ThreadSafetyError
- dj.conn() raises ThreadSafetyError
- dj.Schema() raises ThreadSafetyError (without explicit connection)
- dj.FreeTable() raises ThreadSafetyError (without explicit connection)
- dj.Instance() always works for isolated contexts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- conn(host, user, password) now updates the singleton connection
  instead of creating a separate connection
- Remove irrelevant safemode=False from spec examples
- thread_safe is set via DJ_THREAD_SAFE env var or config file

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused `from typing import Callable` in connection.py (lint failure)
- Update mock_cache fixture: `cache` → `download_path` (KeyError in test_attach)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused `_singleton_connection` import in __init__.py (F401)
- Remove unused `os` import in test_thread_safe.py (F401)
- Remove unused `Callable` import in connection.py (F401)
- Fix mock_cache fixture: `cache` → `download_path` for 2.0 settings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dimitri-yatsenko and others added 5 commits February 18, 2026 13:25
The global codec registry is effectively immutable after import:
registration runs under Python's import lock, and the only runtime
mutation (_load_entry_points) is idempotent under the GIL.
Per-instance isolation is unnecessary since codecs are part of the
type system, not connection-scoped state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Catalog all 8 module-level mutable globals with thread-safety
classification: guarded (config, connection), safe by design
(codec registry), or low risk (logging, blob flags, import caches).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All internal code now reads configuration from self.connection._config
instead of the global config singleton. This ensures thread-safe mode
works correctly: each Instance's connection carries its own config,
and tables/schemas/jobs access it through the connection.

Changes across 9 files:
- schemas.py: safemode, create_tables default
- table.py: safemode in delete/drop, config passed to declare()
- expression.py: loglevel in __repr__
- preview.py: display.* settings via query_expression.connection._config
- autopopulate.py: jobs.allow_new_pk_fields, jobs.auto_refresh
- jobs.py: jobs.default_priority, stale_timeout, keep_completed
- declare.py: jobs.add_job_metadata (config param threaded through)
- diagram.py: display.diagram_direction (connection stored on instance)
- staged_insert.py: get_store_spec()

Removed unused `from .settings import config` imports from 7 modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- instance._global_config now reuses settings.config instead of creating
  a duplicate Config object. This ensures dj.config["safemode"] = False
  actually affects self.connection._config["safemode"] reads.
- schemas.py now uses _get_singleton_connection() from instance.py
  instead of the old conn() from connection.py, eliminating the
  duplicate singleton connection holder.
- dj.conn() now only creates a new connection when the singleton doesn't
  exist or reset=True (not on every call with credentials).
- test_uppercase_schema: use prompt=False for drop() calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ode spec

Adds Architecture section covering the object graph, config flow for both
singleton and Instance paths, and a table of all connection-scoped config
reads across 9 modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@mweitzel mweitzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DJ_THREAD_SAFE protects against the singleton API from being used, but it doesn't prevent the Instances from setting the same global config object in them.

Overall this is fantastic. Minor issue with race condition in initialization. I've issued a PR to address it.

True if thread-safe mode is enabled.
"""
# Check environment variable first
env_val = os.environ.get("DJ_THREAD_SAFE", "").lower()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this configurable via python instead of just os.env?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like "import, set threadsafe, then use api" without the os env needs.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We considered adding a set_thread_safe() Python API but decided against it. Calling a function to mutate a global variable is itself not a thread-safe operation — it defeats the purpose. Thread-safe mode is an infrastructure decision that must be made before any threads spawn, so it belongs in the environment (DJ_THREAD_SAFE=true) or config file, not in runtime code.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See reply above — we decided against a Python API for this. Mutating a global to enable thread-safety is self-contradictory. DJ_THREAD_SAFE stays as env-var/config-file only.

port = self.config.database.port

# Create connection
self.connection = Connection(host, user, password, port, use_tls)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where we need to provide the appropriate backend as well, or else it just grabs the global one.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Instance.__init__ now passes backend=self.config.database.backend and config_override=self.config to the Connection constructor.

self._is_closed = True # Mark as closed until connect() succeeds

# Config reference - defaults to global config, but Instance sets its own
self._config = config

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean "but Instance sets its own" ..here it is still setting _config to the same global one (from .settings import config)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module-level config from settings.py is created at import time from env vars and config file — it's not mutable shared state being raced on. But you're right that Connection needs to accept its config at construction: when created by an Instance, it should use the Instance's config (which may differ from the module-level one). Fixed — Connection.__init__ now accepts backend and config_override keyword arguments, matching the approach in your illustrative PR.


# Select adapter based on configured backend
backend = config["database.backend"]
backend = self._config["database.backend"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is inside __init__, it doesn't matter if the adapter is set after creation. It still grabs from the global here, in initialization.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above — the config is fixed at import time, not a race. But the fix is the same: Connection now accepts config_override at construction so it uses the right config when created by an Instance.

user: str,
password: str,
port: int | None = None,
init_fun: str | None = None,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really needs to accept backend and config at instantiation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've issued a PR to illustrate a fix

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Connection.__init__ now accepts backend and config_override keyword arguments. All three call sites (Instance.__init__, _get_singleton_connection, and dj.conn()) pass config explicitly.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the illustrative PR — the approach matches yours. Addressed in datajoint#1404.

return [
r[0]
for r in (connection or conn()).query(
for r in (connection or _get_singleton_connection()).query(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hard coded sql is mysql specific, it can be avoided if instead of _get_singleton_connection you called something like sql = connection.adapter.list_schemas_sql() ; return [r[0] for r in connection.query(sql)]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method isn't protected by the "thread safe" os.env guard, but also doesn't need to be if it reads entirely from the instances which are passed instead of the global singleton here.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. list_schemas_sql() already exists on the adapter interface. The hardcoded SQL also has a MySQL-specific filter (WHERE schema_name <> "information_schema") that the adapter methods handle properly. We'll update list_schemas() to use connection.adapter.list_schemas_sql().

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right — list_schemas() already accepts an optional connection argument. When one is passed (e.g. from an Instance), it bypasses the singleton entirely. The thread-safe guard only triggers via _get_singleton_connection() when no connection is provided, which is the correct behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in _create_config() it warns "No datajoint.json found", but that is not relevant in that mode. You could probably have an if/else and config = None if _load_thread_safe

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning comes from _create_config() which runs at import time (line 972), before thread-safe mode is even checked. In thread-safe mode, Instance() calls _create_config() too — to get a fresh config with defaults from env vars. The warning is about the absence of a config file, which is relevant regardless of mode (env vars alone are a valid configuration path, but the warning helps users who forgot to create one). We could suppress it, but it's informational and doesn't affect behavior.

@mweitzel
Copy link

I see a few comments "done" or "fixed" but did you push the changes?

@dimitri-yatsenko
Copy link
Owner Author

Superseded by datajoint#1404

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments