Performing I/O on instantiation  #2697

@tpvasconcelos

Expected Behavior

FeatureStore used to perform no I/O on instantiation, and this should remain the expected behavior. As far as I can tell, all other abstractions follow this convention, including all implementations of Provider, DataSource, RegistryStore, InfraObject, and RetrievalJob.

This is also consistent with other Python APIs, such as Amazon's boto3:

import boto3

s3 = boto3.client('s3')                         # no request is sent; missing credentials do not raise here
s3.list_buckets()                               # explicit I/O operation
# BucketPolicy belongs to the resource API, not the low-level client
bucket_policy = boto3.resource('s3').BucketPolicy('bucket_name')  # no I/O; doesn't verify if bucket exists
bucket_policy.load()                            # explicit I/O operation

However, this convention is not followed consistently across PEP 249 (DB-API) implementations. Trino doesn't perform any I/O when a connection object is created. By default, though, the BigQuery SDK tries to load a credentials file from disk, the psycopg2 library tries to establish a valid connection, and sqlite3 tries to create or open a database file.
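The sqlite3 case is easy to check with the standard library alone. The snippet below is a small sketch, not part of the original report: the temporary directory and the file names are made up for illustration. It shows that the database file appears as soon as the connection object is created, and that a bad path fails at that same point rather than on the first query.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Illustrative path; any writable location behaves the same way.
    db_path = os.path.join(tmp_dir, "example.db")
    existed_before = os.path.exists(db_path)

    conn = sqlite3.connect(db_path)  # no query has been issued yet
    exists_after_connect = os.path.exists(db_path)
    conn.close()

    # A path whose parent directory is missing fails at connect time.
    missing_path = os.path.join(tmp_dir, "missing_dir", "example.db")
    try:
        sqlite3.connect(missing_path)
        failed_on_connect = False
    except sqlite3.OperationalError:
        failed_on_connect = True

print(existed_before, exists_after_connect, failed_on_connect)
```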

Current Behavior

#2256 added the self._registry._initialize_registry() line to FeatureStore's __init__ method. This call tries to fetch the registry's proto from the configured registry and fails if the proto does not exist or if the remote registry is not accessible from the current environment.

Steps to reproduce

>>> store = FeatureStore(
...     config=RepoConfig(
...         project="foo",
...         provider="local",
...         registry="s3://my-bucket/registry.db",
...     )
... )
[...]
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 103, in handler
    return self.sign(operation_name, request)
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 187, in sign
    auth.add_auth(request)
  File ".venv/lib/python3.9/site-packages/botocore/auth.py", line 405, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials

Specifications

  • Version: 0.20.2
  • Platform: macOS
  • Subsystem:

Possible Solution

Lazy-load the registry's proto. I think this can be achieved by removing self._registry._initialize_registry() from the __init__ method. Maybe @adchia @felixwang9817 can provide more context here. I'm also interested in hearing your take on this.
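To make the proposal concrete, here is a minimal sketch of the lazy-loading pattern. It does not use Feast's actual classes: LazyRegistry, LazyStore, read_registry, and unreachable_registry are hypothetical names invented for this illustration, and the real fix may look different. The constructor only stores a loader; the first access to the proto triggers the read and caches the result.

```python
from typing import Callable, List, Optional


class LazyRegistry:
    """Hypothetical stand-in for a registry; not Feast's Registry class."""

    def __init__(self, fetch_proto: Callable[[], bytes]) -> None:
        # Only keep a reference to the loader; nothing is fetched here.
        self._fetch_proto = fetch_proto
        self._cached_proto: Optional[bytes] = None

    @property
    def proto(self) -> bytes:
        # The first access performs the (possibly remote) read;
        # later accesses reuse the cached value.
        if self._cached_proto is None:
            self._cached_proto = self._fetch_proto()
        return self._cached_proto


class LazyStore:
    """Hypothetical stand-in for FeatureStore with an I/O-free constructor."""

    def __init__(self, fetch_proto: Callable[[], bytes]) -> None:
        self._registry = LazyRegistry(fetch_proto)  # no I/O

    def read_registry(self) -> bytes:
        return self._registry.proto  # explicit I/O happens here


calls: List[str] = []


def unreachable_registry() -> bytes:
    # Simulates a remote registry that cannot be reached.
    calls.append("fetch")
    raise ConnectionError("registry not reachable")


# Instantiation succeeds even though the registry is unreachable.
store = LazyStore(unreachable_registry)
```

With this shape, an unreachable or missing registry only surfaces as an error when a method that needs the proto is called, which matches the behavior described under Expected Behavior.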
