is_port_open() returns True for HTTP interceptors — binary protocol test fixtures hang indefinitely #200

@rsleedbx

Bug

tests/fixtures/utils.py provides is_port_open() which is used by fixture availability checks (e.g. spark_available(), and similar guards in other fixtures):

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (TimeoutError, OSError):
        return False

A naive TCP connect returns True for any service that accepts the connection — including system-level HTTP daemons (enterprise security agents, VPN clients, corporate endpoint tools) that bind common ports and respond with HTTP.

When a fixture then opens a binary protocol connection (Thrift for Spark, etc.) to that port, the driver misinterprets the HTTP response as binary framing data and hangs indefinitely — the test never times out, and the test suite stalls.

Root cause (concrete example)

A Qualys Cloud Agent (agentid-service) binds TCP port 10001 on all interfaces on macOS. It runs as root, so lsof -nP -iTCP:10001 shows nothing without sudo. is_port_open("localhost", 10001) returns True.

pyhive's Thrift binary transport then reads the first 5 bytes of the HTTP response (HTTP/) as a frame header: status byte 0x48 ('H') followed by a 4-byte big-endian length 0x5454502F ('TTP/') = 1,414,811,695 bytes (~1.3 GiB). The driver blocks waiting for that much data. The TCP connection stays open, so the call never returns.
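
The header misread described above can be reproduced without a server. This is a minimal sketch, assuming the (1-byte status, 4-byte big-endian length) layout stated in this issue; it only decodes the bytes and does not touch pyhive or any Thrift code.

```python
import struct

# First five bytes of any HTTP response status line, e.g. "HTTP/1.1 200 OK".
first_bytes = b"HTTP/"

# ">BI" = big-endian, no padding: one unsigned byte (status),
# then one unsigned 32-bit integer (payload length).
status, length = struct.unpack(">BI", first_bytes)

print(f"status byte:    0x{status:02X}")
print(f"payload length: 0x{length:08X} = {length:,} bytes ({length / 2**30:.2f} GiB)")
```

A reader that trusts this length will wait for more than a gigabyte that an HTTP daemon never sends, which matches the hang seen in the fixtures.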

Fix

Replace the naive socket check with a protocol probe that distinguishes binary servers from HTTP interceptors:

import socket

def probe_port(host: str, port: int, timeout: float = 0.5) -> str:
    """Classify a TCP port: 'refused', 'http', 'binary', or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                # Send a minimal HTTP request; an HTTP interceptor will answer it.
                sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
                data = sock.recv(8)
                return "http" if data.startswith(b"HTTP/") else "binary"
            except socket.timeout:
                return "binary"  # binary protocol server: did not respond to HTTP
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on this port
    except OSError:  # includes TimeoutError: connect timed out or failed otherwise
        return "timeout"

Fixture availability checks for binary protocol servers should use probe_port(host, port) == "binary" instead of is_port_open(host, port).
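
To check the three main classifications locally, the probe can be pointed at throwaway loopback sockets. The sketch below is an assumption-laden illustration, not part of the repository: it repeats probe_port so it runs on its own, uses a one-shot fake HTTP responder as a stand-in for an interceptor, and uses a listener that never replies as a stand-in for a binary server that ignores HTTP. The helper names _listen and _serve_http_once are hypothetical.

```python
import socket
import threading


def probe_port(host, port, timeout=0.5):
    """Copy of the proposed probe so this block is self-contained."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
                data = sock.recv(8)
                return "http" if data.startswith(b"HTTP/") else "binary"
            except socket.timeout:
                return "binary"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "timeout"


def _listen():
    """Open a loopback listener on an OS-chosen port (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s, s.getsockname()[1]


def _serve_http_once(listener):
    """Answer one connection with an HTTP status line, like an interceptor."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(1024)
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n")


# Stand-in for an HTTP interceptor.
http_listener, http_port = _listen()
threading.Thread(target=_serve_http_once, args=(http_listener,), daemon=True).start()

# Stand-in for a binary server: the kernel completes the handshake,
# but nothing ever replies to the HTTP request.
silent_listener, silent_port = _listen()

# A port that was just released, so nothing is listening on it.
closed = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
closed.bind(("127.0.0.1", 0))
closed_port = closed.getsockname()[1]
closed.close()

results = {
    "interceptor": probe_port("127.0.0.1", http_port),
    "silent": probe_port("127.0.0.1", silent_port),
    "closed": probe_port("127.0.0.1", closed_port),
}
print(results)

http_listener.close()
silent_listener.close()
```

On a typical loopback setup the interceptor stand-in should classify as 'http', the silent listener as 'binary', and the released port as 'refused'. Only the 'binary' case should let a binary protocol fixture proceed.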

Two additional helpers are also useful:

  • find_safe_port(start, end) — scans for the first port returning 'refused' (truly free for compose port mapping)
  • find_thrift_port(host, start, end) — scans for the first port returning 'binary' (auto-discovers a running binary server without hardcoding the port)
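
One possible shape for these two helpers is sketched below. Only the names and the (start, end) and (host, start, end) parameters come from this issue; the inclusive end bound, the default host for find_safe_port, the None return when nothing matches, and the injectable probe argument (added so the scan can be exercised without real sockets) are all assumptions.

```python
import socket
from typing import Callable, Optional


def probe_port(host: str, port: int, timeout: float = 0.5) -> str:
    """Copy of the proposed probe so this block is self-contained."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
                data = sock.recv(8)
                return "http" if data.startswith(b"HTTP/") else "binary"
            except socket.timeout:
                return "binary"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "timeout"


Probe = Callable[[str, int], str]


def find_safe_port(start: int, end: int, host: str = "127.0.0.1",
                   probe: Probe = probe_port) -> Optional[int]:
    """Return the first port in [start, end] classified 'refused', else None."""
    for port in range(start, end + 1):  # inclusive end is an assumption
        if probe(host, port) == "refused":
            return port
    return None


def find_thrift_port(host: str, start: int, end: int,
                     probe: Probe = probe_port) -> Optional[int]:
    """Return the first port in [start, end] classified 'binary', else None."""
    for port in range(start, end + 1):
        if probe(host, port) == "binary":
            return port
    return None
```

A 'refused' result only says that nothing was listening at probe time, so a caller that maps the port afterwards may still lose a race with another process. Passing a stub as probe keeps unit tests for the scan logic off the network.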

Detection

# Detect HTTP interceptors on a port (no sudo required):
curl -s --connect-timeout 2 http://localhost:PORT/
# If you get any HTTP response, it is not your container.

# See who owns any port (shows root-owned processes too):
netstat -anv | grep "\.PORT "

Reproduced on: macOS 15, Podman Desktop, pyhive 0.7.0 against Spark Thrift Server.
