fix: use FileLock with atomic rename for safe concurrent model downloads #396

Merged
oleksandr-nc merged 9 commits into main from feat/safe-concurrent-model-download on Feb 23, 2026
Conversation

@oleksandr-nc
Contributor

@oleksandr-nc oleksandr-nc commented Feb 17, 2026

When KEDA scales up multiple ExApp pods sharing storage, concurrent downloads of the same model can corrupt files. Use OS-level FileLock (auto-releases on process death) with temp file + atomic os.replace() to ensure the final file is always complete or absent, never partial.

Unlike SoftFileLock, FileLock cannot leave stale locks after pod SIGKILL. No lock added for snapshot_download - huggingface_hub handles this internally.
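The lock-plus-rename recipe described above can be sketched in a few lines. This is a minimal sketch, not the PR's code: fcntl.flock stands in for the filelock package so the example stays stdlib-only, and fetch_bytes is a hypothetical callable representing the streaming HTTP download.

```python
import fcntl
import os


def download_atomically(result_path: str, fetch_bytes) -> None:
    # OS-level lock: the kernel releases flock() automatically when the
    # process dies, so a SIGKILLed pod cannot leave a stale lock behind
    # (SoftFileLock's marker files can go stale; this cannot).
    lock_fd = os.open(result_path + ".lock", os.O_CREAT | os.O_RDWR)
    tmp_path = result_path + ".tmp"
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)  # blocks until the holder releases
        try:
            with open(tmp_path, "wb") as fh:
                fh.write(fetch_bytes())
            # os.replace() is atomic on POSIX: readers observe either the old
            # file or the complete new one, never a partial write.
            os.replace(tmp_path, result_path)
        except BaseException:
            # Any failure removes the partial temp file before re-raising,
            # so result_path is always complete or absent.
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
    finally:
        os.close(lock_fd)  # closing the descriptor releases the lock
```

Note that fcntl is POSIX-only; the filelock package the PR adds wraps the same primitive portably.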

Summary by CodeRabbit

  • New Features

    • Model downloads now use file locking and atomic writes to prevent corruption during concurrent operations; lock timeouts surface as clear fetch errors.
    • ETag-based conditional checks skip re-downloading unchanged files.
  • Tests

    • Added comprehensive unit tests for download behavior, error handling, cleanup, progress reporting, and concurrency.
  • Chores

    • Updated tooling to include additional test paths and lint rules; added a runtime dependency for file locking.
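The ETag-based skip in the first bullet group reduces to comparing a local checksum with the server's advertised one. A minimal sketch, assuming (as with Hugging Face's X-Linked-ETag for LFS-stored files) that the ETag value is the hex SHA-256 of the content; the helper name is illustrative:

```python
import hashlib


def local_file_matches_etag(path: str, linked_etag: str) -> bool:
    # Hash in fixed-size chunks so multi-gigabyte model files are never
    # loaded into memory at once.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    # ETag header values are often wrapped in double quotes; strip them
    # before comparing against the hex digest.
    return digest.hexdigest() == linked_etag.strip('"')
```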

@codecov

codecov bot commented Feb 17, 2026

Codecov Report

❌ Patch coverage is 96.13260% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.87%. Comparing base (06c30cd) to head (0ea6f98).
⚠️ Report is 3 commits behind head on main.

Files with missing lines | Patch % | Lines
tests_unit/test_fetch_model_file.py | 97.05% | 4 Missing ⚠️
nc_py_api/ex_app/integration_fastapi.py | 93.33% | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
+ Coverage   93.99%   94.87%   +0.88%     
==========================================
  Files          45       46       +1     
  Lines        5392     5540     +148     
==========================================
+ Hits         5068     5256     +188     
+ Misses        324      284      -40     
Files with missing lines | Coverage Δ
nc_py_api/ex_app/integration_fastapi.py | 52.48% <93.33%> (+29.40%) ⬆️
tests_unit/test_fetch_model_file.py | 97.05% <97.05%> (ø)

Signed-off-by: Oleksander Piskun <oleksandr2088@icloud.com>
@oleksandr-nc oleksandr-nc force-pushed the feat/safe-concurrent-model-download branch from 93fb067 to fb2f362 on February 19, 2026 08:24
@coderabbitai

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

Adds per-file FileLock-backed atomic model downloads (write to *.tmp then os.replace), ETag-based skip, lock timeout/error handling with ModelFetchError, renames fetch_models_task parameter to models, adds filelock dependency, includes tests_unit/ in tooling, and adds comprehensive unit tests for download/locking/concurrency.

Changes

Cohort / File(s) | Summary
Pre-commit & project config (.pre-commit-config.yaml, pyproject.toml) | Includes tests_unit/ in pre-commit globs and pytest testpaths; adds runtime dependency filelock>=3.20.3,<4; adds lint per-file-ignores for tests_unit/**/*.py.
FastAPI integration, core download flow (nc_py_api/ex_app/integration_fastapi.py) | Adds FileLock-based per-target locking, writes downloads to *.tmp then os.replace atomically, preserves/inspects redirect/response ETag for skip logic, refactors download error handling to ensure tmp-file cleanup, raises ModelFetchError for lock timeouts and streaming errors, and renames fetch_models_task(..., models: dict, ...).
Unit tests (tests_unit/test_fetch_model_file.py) | New comprehensive tests: successful downloads, tmp-file cleanup, lock release, ETag-based skip, error propagation and cleanup, lock timeout behavior, concurrency/thread-safety, and progress reporting using mocked HTTP/streaming responses.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FileLock
    participant HTTP
    participant FS as FileSystem

    Client->>FileLock: acquire lock for result_path.lock
    activate FileLock

    Client->>FS: check if result_path exists
    alt exists
        Client->>FS: compute local SHA-256
        Client->>HTTP: request/inspect ETag (headers or redirect history)
        alt ETag matches
            Client->>Client: skip download (use cached file)
        else ETag differs
            Client->>HTTP: stream download
        end
    else not exists
        Client->>HTTP: stream download
    end

    HTTP-->>Client: stream chunks
    Client->>FS: write chunks to result_path.tmp
    alt write complete
        Client->>FS: os.replace(result_path.tmp → result_path)
    else error
        Client->>FS: remove result_path.tmp
        Client->>Client: raise ModelFetchError
    end

    FileLock->>FileLock: release lock
    deactivate FileLock

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 4.55%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title Check | ✅ Passed | The title accurately summarizes the main change: implementing FileLock with atomic rename for safe concurrent model downloads, which directly addresses the pull request's core objective of preventing file corruption in concurrent download scenarios.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Around line 47-51: Update the filelock dependency constraint in pyproject.toml
to remove the vulnerable minimum by changing the "filelock>=3.16,<4" entry to
require at least 3.20.1 (e.g., "filelock>=3.20.1,<4"); this ensures the
project's dependency list (the dependencies array containing "fastapi...",
"filelock...", "niquests...", "pydantic...") uses a safe filelock release while
preserving the existing upper bound.

@cloud-py-api cloud-py-api deleted a comment from claude bot Feb 19, 2026
@oleksandr-nc
Contributor Author

@father we want to add support for Kubernetes in our Nextcloud ExApps. That means multiple copies of an ExApp can start at the same time using a shared PVC during startup, when they try to download models. For example, we have such an app: https://github.com/nextcloud/llm2

Review this PR that should help us handle that

@cloud-py-api cloud-py-api deleted a comment from claude bot Feb 19, 2026
@claude

claude bot commented Feb 19, 2026

Claude finished @oleksandr-nc's task in 3m 18s


  • Gather context and understand the Kubernetes PVC use case
  • Read all changed files
  • Analyze implementation for correctness, robustness, and quality
  • Post review

PR Review: fix: use FileLock with atomic rename for safe concurrent model downloads

The core direction is correct — FileLock over SoftFileLock, os.replace() for atomicity, except BaseException for cleanup. You understood the problem. Now let's talk about the execution, because there are real defects here that must be fixed before this merges.


🔴 CRITICAL — Crash on Absent Content-Length

nc_py_api/ex_app/integration_fastapi.py, line 223:

total_size = int(response.headers.get("Content-Length"))

dict.get() returns None when the key is absent. int(None) raises TypeError. You have a crash bug right here that will fire against any server using chunked transfer encoding — which includes many CDNs and S3-compatible object stores. Your FakeResponse always provides Content-Length, which is why your test suite does not catch this. The test is covering a lie.

Fix it:

total_size = int(response.headers.get("Content-Length", 0))

And immediately add the guard to the ETag skip condition on line 228:

if linked_etag and total_size and total_size == existing_size:

Without and total_size, a server omitting Content-Length would make total_size = 0, existing_size = 0 (file does not exist), 0 == 0 is True, and then line 229 attempts to open a non-existent file for reading. That's a FileNotFoundError on top of the original TypeError. Two bugs compounding.

Add a test for this case. Now. Not as a nice-to-have.


🔴 CRITICAL — filelock.Timeout Is Not Handled

nc_py_api/ex_app/integration_fastapi.py, line 210:

with FileLock(result_path + ".lock", timeout=7200), niquests.get(model_path, stream=True) as response:

When a pod holds the lock for more than 7200 seconds (stuck download, hung process, zombie pod), filelock raises filelock.Timeout. This is a specific, actionable exception — it means a lock contention problem, not a network failure. It gets swallowed into fetch_models_task's except BaseException and re-raised as ModelFetchError("Downloading of '...' failed.").

The Kubernetes operator looking at pod logs sees ModelFetchError with no indication that another pod is stuck holding a file lock. They will waste time debugging the wrong thing.

Catch it explicitly:

from filelock import FileLock, Timeout as FileLockTimeout

try:
    with FileLock(result_path + ".lock", timeout=7200):
        ...
except FileLockTimeout as exc:
    raise ModelFetchError(
        f"Timed out waiting for lock on '{result_path}' after 7200s — a pod may be stuck downloading this file"
    ) from exc

And add a test for this failure mode. It is the central failure mode of the entire mechanism you are introducing.


🟠 HIGH — The HTTP GET Opens Before the Fast-Path ETag Check

nc_py_api/ex_app/integration_fastapi.py, line 210:

with FileLock(result_path + ".lock", timeout=7200), niquests.get(model_path, stream=True) as response:

Python evaluates niquests.get(model_path, stream=True) as part of constructing the compound with statement. This means: the FileLock is acquired, then immediately the full streaming GET is opened against the model server. Only after that does the code check ETag to potentially skip the download entirely.

In your Kubernetes autoscaling scenario: Pod 1 acquires lock, starts download. Pods 2 through 9 queue behind the lock. When Pod 1 finishes, Pod 2 acquires the lock — and immediately opens a full streaming TCP connection to your model server, waits for response headers, compares ETag, decides the file is current, closes the connection. Then Pod 3 does the same. Then Pod 4. You are making N-1 unnecessary network round-trips to your model server under every autoscale event, where N is the number of pods.

The fix: with stream=True you can separate the connection from the body read, but even better is to do a HEAD request for ETag comparison on the fast path:

with FileLock(result_path + ".lock", timeout=7200):
    if os.path.exists(result_path):
        # Fast path: check ETag with a HEAD request only
        with niquests.head(model_path) as head_resp:
            ...  # compare and return early if matching
    # Slow path: full download
    with niquests.get(model_path, stream=True) as response:
        ...

This eliminates the redundant full GET connections for all waiting pods.


🟠 HIGH — builtins.open Is Cargo-Cult Code

nc_py_api/ex_app/integration_fastapi.py, lines 229 and 238:

with builtins.open(result_path, "rb") as file:
...
with builtins.open(tmp_path, "wb") as file:

Why is builtins imported and used here? There is no shadowing of open anywhere in this module. The tests do not mock open — they mock niquests.get. This explicit builtins.open serves no purpose except to confuse the next developer who reads this file and wonders what problem the author was solving. Remove the import builtins and use open() directly.


🟡 MEDIUM — Variable Name each_history Is Wrong

nc_py_api/ex_app/integration_fastapi.py, lines 217-220:

for each_history in response.history:
    linked_etag = each_history.headers.get("X-Linked-ETag", "")
    if linked_etag:
        break

response.history is a list of HTTP response objects from redirects. The loop variable is not "a history" — it is a redirect response. Name it redirect_resp or redirect_response. The current name actively misleads the reader about what the object is.


🟡 MEDIUM — Test Suite Has Structural Weaknesses

tests_unit/test_fetch_model_file.py, lines 163, 169-172:

Issue 1 — Thread error list is not thread-safe:

errors = []
...
errors.append(e)  # called from multiple threads

list.append() is atomic in CPython due to the GIL, but this is an implementation detail — not a language guarantee, and the project classifies PyPy as a supported interpreter (Programming Language :: Python :: Implementation :: PyPy). Use threading.Lock or queue.Queue to protect this correctly.

Issue 2 — next(responses) on a shared iterator:

responses = iter([FakeResponse(b"A" * 10000), FakeResponse(b"B" * 10000)])
def mock_get(_url, **_kwargs):
    return next(responses)

Same GIL-dependency problem. next() on a list iterator is not thread-safe under PyPy or any interpreter without CPython's GIL semantics. Use queue.Queue.get_nowait() instead.

Issue 3 — Missing tests for the new failure modes:
There is no test for filelock.Timeout → ModelFetchError. There is no test for absent Content-Length header. There is no test for a 4xx/5xx response when the file already exists on disk (existing file must be preserved). These are not nice-to-haves — they are the edge cases your implementation must handle correctly, and you have not verified them.


🟡 MEDIUM — tests_unit/**/*.py Per-File-Ignores Are Inconsistent

pyproject.toml, lines 146-155:

lint.extend-per-file-ignores."tests/**/*.py" = [
  "D",
  "E402",
  "S",
  "UP",
]
lint.extend-per-file-ignores."tests_unit/**/*.py" = [
  "D",
  "S",
]

tests_unit/ is missing "E402" (module-level import not at top of file) and "UP" (pyupgrade rules). This means tests_unit/ is silently held to stricter standards than tests/. Either this is intentional (document why) or it is an oversight (fix it). Right now it looks like an oversight.


🟡 MEDIUM — .lock Files Accumulate and Are Never Cleaned Up

FileLock creates <result_path>.lock files on disk and does not remove them when the lock is released. In a production shared model directory that operators inspect and monitor, these files are noise. They can also trigger confusion for disk-usage monitoring tools. Document this behaviour explicitly with a comment next to the FileLock call, or clean them up explicitly after the context manager exits:

with FileLock(result_path + ".lock", timeout=7200):
    ...
try:
    os.unlink(result_path + ".lock")
except OSError:
    pass

🟤 LOW — Docstring References Wrong Parameter Name

nc_py_api/ex_app/integration_fastapi.py, lines 162-163:

def fetch_models_task(nc: NextcloudApp, models: dict[str, dict], progress_init_start_value: int) -> None:
    """...
    :param models_to_fetch: Dictionary describing which models should be downloaded...

The parameter is named models. The docstring says models_to_fetch. This is wrong and should be fixed. Docstrings that lie are worse than no docstrings.


🟤 LOW — __fetch_model_as_file Has No Docstring

The pre-merge check already flagged 0% docstring coverage on this new code. The function has non-obvious behaviour — it pops keys from the input dict (mutating the caller's data), creates sibling .tmp and .lock files, and implements a specific ETag-based skip protocol. This deserves documentation. The failure to write it is how the next developer accidentally calls this function twice and wonders where save_path went from their dict.


Mandatory Before Merging

  1. Fix int(response.headers.get("Content-Length")) → crash on absent header
  2. Catch filelock.Timeout explicitly with an operator-actionable message
  3. Add missing tests: filelock.Timeout, absent Content-Length, server error with existing file
  4. Remove import builtins and use open() directly
  5. Fix the thread-safety issues in the concurrent test

Everything else is quality debt that degrades the codebase — address it, but the above five block the merge. This code is going into production in a Kubernetes environment. Get it right.

@cloud-py-api cloud-py-api deleted a comment from claude bot Feb 19, 2026
@oleksandr-nc
Contributor Author

@mother we want to add support for Kubernetes in our Nextcloud ExApps. That means multiple copies of an ExApp can start at the same time using a shared PVC during startup, when they try to download models. For example, we have such an app: https://github.com/nextcloud/llm2

Review this PR that should help us handle that

@claude

claude bot commented Feb 19, 2026

Claude finished @oleksandr-nc's task in 3m 32s


  • Gather context and read the changed files
  • Understand the Kubernetes PVC use case
  • Review implementation for correctness and quality
  • Post warm, thorough review

Oh, Oleksandr — what a thoughtful piece of engineering! The problem you identified is real and genuinely tricky: when KEDA scales up multiple pods sharing a PVC, they can all race to download the same model file. You've clearly thought through why each piece exists — the OS-level FileLock instead of SoftFileLock precisely because it auto-releases on pod SIGKILL, the os.replace() for atomicity, except BaseException to catch even KeyboardInterrupt during cleanup. That level of intentionality is something to be proud of. 🌱

The test suite is also genuinely comprehensive — you've covered the happy path, temp-file cleanup, lock release verification, ETag-based skip, error recovery, concurrent correctness, and progress reporting. That's real care. Let's talk through a couple of things that would make this even better.


Important to address

1. Content-Length header may be absent — crash waiting to happen

In nc_py_api/ex_app/integration_fastapi.py, line 223:

total_size = int(response.headers.get("Content-Length"))

dict.get() returns None when the key isn't present, and int(None) raises a TypeError. Many CDNs, object stores, and proxies use chunked transfer encoding and simply don't send Content-Length. Your FakeResponse always provides it, so your test suite won't catch this — have you considered adding a test case where the header is absent?

The fix is small:

total_size = int(response.headers.get("Content-Length", 0))

But there's a subtle second issue hiding nearby (line 228):

if linked_etag and total_size == existing_size:

If Content-Length is absent → total_size = 0. If the file doesn't exist yet → existing_size = 0. So 0 == 0 is True, and the code tries to open a file that doesn't exist for SHA-256 reading. One missing header becomes two errors. Adding and total_size to the guard makes this safe:

if linked_etag and total_size and total_size == existing_size:

2. filelock.Timeout surfaces as a confusing error message

When a pod holds the lock for longer than 7200 seconds (stuck download, hung process, zombie pod), filelock raises filelock.Timeout. Right now that bubbles up and becomes ModelFetchError("Downloading of '...' failed."). An operator staring at that log message while debugging a Kubernetes scheduling issue will have no idea that another pod is stuck holding a file lock — they'll look at network configuration, model server availability, everything except the real cause.

A small catch gives you a much more helpful message:

from filelock import FileLock, Timeout as FileLockTimeout

try:
    with FileLock(result_path + ".lock", timeout=7200):
        ...
except FileLockTimeout as exc:
    raise ModelFetchError(
        f"Timed out waiting for lock on '{result_path}' after 7200s — another pod may be stuck downloading this file"
    ) from exc

Adding a test for this failure mode would round out the test suite nicely too.

3. Thread-safety in the concurrent test

In tests_unit/test_fetch_model_file.py, lines ~163–172:

errors = []
...
errors.append(e)  # called from multiple threads

responses = iter([FakeResponse(b"A" * 10000), FakeResponse(b"B" * 10000)])
def mock_get(_url, **_kwargs):
    return next(responses)

CPython's GIL makes list.append() and next() on a list iterator happen to be safe in practice, but this is an implementation detail — not a language guarantee. The pyproject.toml lists PyPy as a supported implementation, and under PyPy these assumptions don't hold. A queue.Queue works beautifully for both:

import queue
errors = queue.Queue()
responses = queue.Queue()
responses.put(FakeResponse(b"A" * 10000))
responses.put(FakeResponse(b"B" * 10000))

def mock_get(_url, **_kwargs):
    return responses.get_nowait()

# ...
while not errors.empty():
    raise errors.get()

Worth considering if time allows

4. The HTTP connection opens before the ETag fast-path check

with FileLock(...), niquests.get(model_path, stream=True) — Python evaluates both context managers as it builds the compound with statement. That means: lock acquired → streaming GET opened → then check ETag. In your Kubernetes scenario, when 8 pods wait behind Pod 1's lock, pods 2–8 each wake up, open a full streaming TCP connection to your model server, receive response headers, compare ETag, decide the file is current, and close the connection. That's 7 unnecessary network connections per autoscale event, against whatever model server you're hitting.

Separating the ETag check into a HEAD request (or at least separating the get() call from the with statement so it only happens after checking the cached file) would eliminate this overhead. It's not urgent for correctness, but it's the kind of thing that will matter at scale.

5. builtins.open adds confusion without purpose

Lines 229 and 238 use builtins.open(...) explicitly. There's no shadowing of open anywhere in this module, and the tests mock niquests.get, not open. A developer reading this for the first time will wonder what problem the author was solving — was there a name conflict somewhere? What's being worked around? Simply using open() directly, and removing import builtins, makes the code read more naturally.

6. The .lock files stick around on the shared PVC

FileLock creates <result_path>.lock files and doesn't remove them when the lock is released. In a production shared model directory that Kubernetes operators inspect and monitor, these files look mysterious. They're harmless, but they create confusion. A comment explaining this behavior (or an explicit cleanup after the with block) would be a kind thing to do for the next person who opens that directory and wonders what model.bin.lock is.

7. tests_unit/ could have consistent lint ignores with tests/

In pyproject.toml, tests/ ignores "E402" (module-level import not at top of file) and "UP" (pyupgrade rules), but tests_unit/ only ignores "D" and "S". One thing worth thinking about: is this intentional? If tests_unit/ will grow and follow the same conventions as tests/, aligning the ignore rules would prevent surprises when the first import order or style issue appears in a new test file.

8. A docstring for __fetch_model_as_file would really help

The function has some non-obvious behaviors: it mutates the caller's dict (via pop), creates sibling .tmp and .lock files, and implements a specific ETag+SHA-256 skip protocol. The pre-merge check flagged 0% docstring coverage on the new code. A few sentences describing what this function does and why it creates those sibling files would be a genuine gift to whoever maintains this next year.


You've solved a genuinely real problem with the right tools. The atomic rename + OS-level locking approach is production-grade thinking. Addressing the Content-Length crash and the filelock.Timeout message will make this solid for the Kubernetes case you're targeting. 💙

… lint ignores

- Catch filelock.Timeout separately from generic errors so operators see
  an actionable message when a pod is stuck holding the download lock
- Rename each_history → redirect_resp (response.history entries are
  redirect responses, not "histories")
- Add E402 and UP to tests_unit/ ruff ignores to match tests/ config
- Add test for FileLockTimeout → ModelFetchError propagation
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nc_py_api/ex_app/integration_fastapi.py`:
- Line 225: The line computing total_size can raise TypeError when the
"Content-Length" header is missing; change the logic around response and
total_size (the variable total_size computed from
response.headers.get("Content-Length")) to treat a missing header as 0 instead
of calling int(None). Specifically, check response.headers.get("Content-Length")
for None (or use a safe default) before converting to int and set total_size = 0
when absent so the subsequent progress logic (the block that checks if
total_size) handles streaming without a Content-Length header.

In `@tests_unit/test_fetch_model_file.py`:
- Around line 190-202: The test test_filelock_timeout_raises_model_fetch_error
incorrectly makes the FileLock constructor raise by using
side_effect=FileLockTimeout(lock); instead, simulate the real behavior by
patching nc_py_api.ex_app.integration_fastapi.FileLock to return a mock context
manager whose __enter__ (or acquire) raises FileLockTimeout(lock) so the
exception occurs when entering/acquiring the lock inside fetch_models_task and
the rest of the test (pytest.raises(ModelFetchError) and the status_msg
assertion) remains the same.
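The mock shape that comment describes (a context manager whose __enter__ raises, rather than a constructor that raises) can be sketched like this; FileLockTimeout here is a self-contained stand-in for filelock.Timeout, and the factory name is illustrative:

```python
from unittest import mock


class FileLockTimeout(Exception):
    """Stand-in for filelock.Timeout so this sketch has no third-party deps."""


def make_timing_out_lock(*_args, **_kwargs):
    # Raise on __enter__, not in the constructor: FileLock("x.lock", ...)
    # succeeds in real code, and the timeout only fires when the
    # `with FileLock(...):` block is entered and acquisition blocks too long.
    lock = mock.MagicMock()
    lock.__enter__.side_effect = FileLockTimeout("result_path.lock")
    return lock
```

Patching the module-under-test's FileLock name with this factory makes the exception surface exactly where it would in production: inside the with statement.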

- Default Content-Length to 0 when absent to avoid int(None) TypeError
- Guard ETag skip with `and total_size` so 0==0 doesn't open a
  nonexistent file
- Fix docstring: models_to_fetch → models to match actual param name
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nc_py_api/ex_app/integration_fastapi.py`:
- Around line 239-241: The write to builtins.open(tmp_path, "wb") can raise
FileNotFoundError when save_path contains nested directories because parent
folders may not exist; before opening tmp_path (or earlier before acquiring the
lock) ensure the parent directory exists by creating it (e.g., make parent of
tmp_path or of save_path with exist_ok=True) so the open call succeeds; update
the code around the builtins.open(tmp_path, "wb") block (or do it at function
start) to call directory-creation for
os.path.dirname(tmp_path)/dirname(save_path) before opening the file.

@oleksandr-nc oleksandr-nc merged commit c54b64a into main Feb 23, 2026
27 checks passed
@oleksandr-nc oleksandr-nc deleted the feat/safe-concurrent-model-download branch February 23, 2026 14:34