Skip to content

Conversation

@amotl
Copy link
Member

@amotl amotl commented May 6, 2025

Problem

@shraik started using sqlalchemy-cratedb and reported that its behaviour deviates from other vendors by not failing on engine.connect() when the database server is not available.

We found this is not actually on the SQLAlchemy dialect, but on the DBAPI driver already, which exhibits the same behaviour.

Solution

The patch extends the _lowest_server_version method to re-raise the last ConnectionError when no connection can be made to any configured server node a ConnectionError when connecting to all server nodes fails, including all error messages.

By doing it this way, we didn't need to submit a dummy SQL command like originally planned. We think it is much better this way because it does not pollute the server-side statement log.

Details

As an additional benefit, the software tests in test_connection.py are now actual integration tests.

/cc @mfussenegger, @seut, @surister

@amotl amotl requested review from kneth and seut May 6, 2025 21:42
@coderabbitai
Copy link

coderabbitai bot commented May 6, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

"""

Walkthrough

The changes update the connection logic to fail early when the database server is unresponsive, by raising the last encountered connection error if no valid server version is found. The changelog is updated to reflect this. Additionally, a new test is introduced to verify that connection errors are properly raised for invalid server addresses. The client connection documentation was simplified by removing an interactive example. The test suite registration for connection tests was moved to the integration test layer.

Changes

File(s) Change Summary
CHANGES.rst Added changelog entry about early failure on unresponsive database server.
src/crate/client/connection.py Modified _lowest_server_version to raise last ConnectionError if no server is reachable, collecting all connection errors.
tests/client/test_connection.py Added test to assert that a ConnectionError is raised for invalid server addresses; replaced hardcoded server address with variable.
tests/client/tests.py Moved ConnectionTest registration from unit to integration test suite with test layer.
docs/by-example/client.rst Simplified client connection docs by removing interactive example and verification output.
setup.py Added minimum version requirement verlib2>=0.3 in install_requires; updated license metadata.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Connection
    participant Server

    Client->>Connection: connect()
    loop For each server
        Connection->>Server: request server version
        alt Server responds with version
            Connection->>Connection: store version if valid
        else Server not available (ConnectionError)
            Connection->>Connection: store last ConnectionError
        else Invalid version (ValueError/InvalidVersion)
            Connection->>Connection: ignore and continue
        end
    end
    alt No valid server version found and ConnectionError occurred
        Connection-->>Client: raise last ConnectionError
    else Valid server version found
        Connection-->>Client: proceed with connection
    end
Loading

Poem

In tunnels deep, connections seek,
To servers strong or servers weak.
Now if a host just will not play,
We fail up front—no more delay!
A test ensures the right alarm,
No silent errors, just clear charm.
🐇✨
"""

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fail-on-connect

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@amotl amotl force-pushed the fail-on-connect branch 3 times, most recently from fd7e0df to 7a285fe Compare May 7, 2025 21:10
Comment on lines -38 to +33
If no ``servers`` are given, the default one ``http://127.0.0.1:4200`` is used:

>>> connection = client.connect()
>>> connection.client._active_servers
['http://127.0.0.1:4200']
>>> connection.close()
If no ``servers`` are supplied to the ``connect`` method, the default address
``http://127.0.0.1:4200`` is used.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no server at http://127.0.0.1:4200. This test case just didn't fail because connect() did not raise an exception up until now.

Now, there is no longer a way to validate connecting to the default address per doctest file, because there is no server listening at http://127.0.0.1:4200. Because the core information is still viable, all what's left is pure prose, rephrased a bit.

@amotl amotl marked this pull request as ready for review May 7, 2025 21:18
@amotl amotl requested a review from kneth May 8, 2025 08:11
@amotl amotl force-pushed the fail-on-connect branch from 7a285fe to a572c98 Compare May 8, 2025 08:12
@amotl amotl requested review from matriv and removed request for seut May 8, 2025 10:26
Copy link
Member

@surister surister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@amotl amotl force-pushed the fail-on-connect branch 3 times, most recently from 306feb5 to cb36d1b Compare May 9, 2025 11:25
Comment on lines 214 to 215
if lowest is None and len(connection_errors) == server_count:
raise ConnectionError(str(connection_errors))
Copy link
Member Author

@amotl amotl May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str(connection_errors) uses an opaque way to serialize the list of exception instances into a string, but I think it is viable and does not cause too many surprises for callers that expect exception.message to be of str type. Do you have any objections or suggestions to do it differently, like using JSON instead? 1

Footnotes

  1. In general, raise ConnectionError(connection_errors) is also possible, but the caller then needs to handle the exception value as a list type, which is semantically different on some occasions like "Server not available" in ex.exception.message would no longer be True, bearing potential breaking changes, at least for a few test cases of the test suite already.

Copy link
Member Author

@amotl amotl May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[...] suggestions to do it differently, like using JSON instead?

The procedure now uses JSON instead of an opaque Python-serialized string per e818221.

For a single element, it looks like this now:

 crate.client.exceptions.ConnectionError: ["Server not available, exception: HTTPConnectionPool(host='127.0.0.1', port=44209): Max retries exceeded with url: / (Caused by ReadTimeoutError(\"HTTPConnectionPool(host='127.0.0.1', port=44209): Read timed out. (read timeout=0.01)\"))"]

-- https://github.com/crate/crate-python/actions/runs/14928213165/job/41937814085?pr=711#step:5:357

@amotl amotl changed the title Fix: Fail early when database server does not respond Fix: Fail early when database cluster does not respond May 9, 2025
@amotl amotl force-pushed the fail-on-connect branch from cb36d1b to 0046f03 Compare May 9, 2025 11:36
@amotl amotl requested review from seut and surister May 9, 2025 11:51
@amotl amotl force-pushed the fail-on-connect branch from e818221 to d22fde9 Compare May 9, 2025 12:01
Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@amotl
Copy link
Member Author

amotl commented May 9, 2025

We just published a pre-release package including those updates per crate-2.1.0.dev0 and notified @shraik about it at crate/sqlalchemy-cratedb#218 (comment).

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Thank you! Is there a way to test that we don't fail the connection attempt, but only if only all servers provided fail?
Can we also test the exception when multiple nodes fail?

if not lowest or version < lowest:
lowest = version
if connection_errors and len(connection_errors) == server_count:
raise ConnectionError(json.dumps(list(map(str, connection_errors))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just raise ConnectionError(error_list)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've elaborated about it here, but I am also not sure, that's why I am also asking about your opinion.

@amotl amotl self-assigned this Oct 2, 2025
The procedure will collect all `ConnectionError` instances and include
them into the exception message of the final `ConnectionError`.
All `ConnectionError` instances that have been collected will be
serialized into JSON now.
@amotl amotl force-pushed the fail-on-connect branch 2 times, most recently from 7d6f7a4 to 12cd6a7 Compare December 18, 2025 22:02
@amotl amotl removed their assignment Dec 18, 2025
@amotl
Copy link
Member Author

amotl commented Dec 18, 2025

Hi @surister. The tests are failing now after rebasing on top of the recent test suite refactorings. Can I humbly ask you to look into it? Maybe you can spot the flaw quickly on the interface of both?

@surister
Copy link
Member

Hi @surister. The tests are failing now after rebasing on top of the recent test suite refactorings. Can I humbly ask you to look into it? Maybe you can spot the flaw quickly on the interface of both?

looking

@surister
Copy link
Member

surister commented Dec 19, 2025

@amotl It's very positive that many unit tests fail now :) since this new PR introduces breaking changes and in the modernization I wrote many new tests (it's actually worrysome that it didn't in the past). So whoever picks this up will have to adapt the tests to the new behavior. It's nothing inherently 'wrong' about the modernization itself.

For example, before connect() would just silently log and ignore connection error, your PR now raises an exception:

        if lowest is None and last_connection_error is not None:
            raise last_connection_error

That breaks this tests for example:

def test_connection_closes_context_manager():
    """Verify that the context manager of the client closes the connection"""
    with patch.object(connect, "close", autospec=True) as close_fn:
        with connect():
            pass
        close_fn.assert_called_once()

Which expects that when creating a Connection object and no CrateDB is found, no error is raised, ideally this test isolates the intended tested behavior enough so it does not fail but that's a lot more work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants