Description
Expected Behavior
FeatureStore used to perform no I/O on instantiation, and this should remain the expected behavior. As far as I can tell, all other abstractions follow this convention. Some examples include all implementations of Provider, DataSource, RegistryStore, InfraObject, and RetrievalJob.
This is also consistent with other Python APIs, such as Amazon's boto3:
import boto3
s3 = boto3.client('s3')  # no I/O; doesn't try to read credentials from config file on disk; etc
s3.list_buckets()  # explicit I/O operation
s3_resource = boto3.resource('s3')  # BucketPolicy belongs to the resource API, not the client
bucket_policy = s3_resource.BucketPolicy('bucket_name')  # no I/O; doesn't verify if bucket exists
bucket_policy.load()  # explicit I/O operation

However, this convention is not consistent across PEP 249 implementations. Trino doesn't perform any I/O when a connection object is created. By default, though, the BigQuery SDK tries to load a credentials file from disk, the psycopg2 library tries to establish a valid connection, and sqlite3 tries to create or open a database file.
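To make the sqlite3 point concrete, here is a small check using only the standard library. It is a sketch, not part of Feast: the file name `example.db` and the helper `connect_creates_file` are made up for illustration. It connects to a path that does not exist yet and reports whether the file appeared without any query being issued.

```python
import os
import sqlite3
import tempfile


def connect_creates_file() -> bool:
    """Return True if sqlite3.connect() alone creates the database file."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical file name; it does not exist before connect().
        db_path = os.path.join(tmp_dir, "example.db")
        existed_before = os.path.exists(db_path)

        conn = sqlite3.connect(db_path)  # no SQL statement is executed here
        try:
            exists_after = os.path.exists(db_path)
        finally:
            # Close before the temporary directory is cleaned up.
            conn.close()

        return (not existed_before) and exists_after


if __name__ == "__main__":
    print("connect() touched the disk:", connect_creates_file())
```

So for sqlite3, merely constructing the connection object is already an I/O operation, which is the opposite of the boto3 convention shown earlier.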
Current Behavior
#2256 added the self._registry._initialize_registry() call to FeatureStore's __init__ method. This call tries to fetch the registry's proto from the registry store, and it fails if the proto does not exist or if the remote registry is not accessible from the current environment.
Steps to reproduce
>>> from feast import FeatureStore, RepoConfig
>>> store = FeatureStore(
...     config=RepoConfig(
...         project="foo",
...         provider="local",
...         registry="s3://my-bucket/registry.db",
...     )
... )
[...]
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 103, in handler
    return self.sign(operation_name, request)
  File ".venv/lib/python3.9/site-packages/botocore/signers.py", line 187, in sign
    auth.add_auth(request)
  File ".venv/lib/python3.9/site-packages/botocore/auth.py", line 405, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
Specifications
- Version: 0.20.2
- Platform: macOS
- Subsystem:
Possible Solution
Lazy-load the registry's proto. I think this can be achieved by removing self._registry._initialize_registry() from the __init__ method. Maybe @adchia @felixwang9817 can provide more context here. I'm also interested in hearing your take on this.
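For reference, this is roughly the pattern I have in mind, as a standalone sketch. None of these names are Feast's real classes or methods: `LazyRegistry`, `LazyStore`, `get_proto`, and the `loader` callable are hypothetical stand-ins, and the loader simulates the remote fetch that currently happens in `__init__`.

```python
from typing import Callable, Optional


class LazyRegistry:
    """Hypothetical registry that defers fetching its proto until first use."""

    def __init__(self, loader: Callable[[], bytes]) -> None:
        self._loader = loader  # stands in for the remote fetch; not called here
        self._cached_proto: Optional[bytes] = None
        self.load_count = 0

    def get_proto(self) -> bytes:
        # The first call performs the (simulated) I/O; later calls reuse the cache.
        if self._cached_proto is None:
            self._cached_proto = self._loader()
            self.load_count += 1
        return self._cached_proto


class LazyStore:
    """Hypothetical store whose constructor performs no I/O."""

    def __init__(self, registry: LazyRegistry) -> None:
        self._registry = registry  # only stores the reference

    def read_registry(self) -> bytes:
        # Explicit operation: this is where the registry is actually loaded.
        return self._registry.get_proto()


def unreachable_loader() -> bytes:
    # Simulates a registry that cannot be reached from this environment.
    raise ConnectionError("registry not reachable")


if __name__ == "__main__":
    store = LazyStore(LazyRegistry(unreachable_loader))  # does not raise
    try:
        store.read_registry()
    except ConnectionError as exc:
        print("failure surfaces only on use:", exc)
```

With this shape, instantiating the store in an environment without credentials succeeds, and the error only shows up on the first operation that actually needs the registry.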