docs: improve "Creating Custom Document Stores" documentation #10581
davidsbatista wants to merge 3 commits into main
Usually, a Document Store comes with additional methods that can provide advanced search functionalities. These methods are not part of the `DocumentStore` protocol and don’t follow any particular convention. We designed it like this to provide maximum flexibility to the Document Store when using any specific features of the underlying database.

Some additional methods that are not part of the `DocumentStore` protocol, but most of the Document Stores in Haystack have them implemented, are:
📝 [vale] reported by reviewdog 🐶
[Google.Contractions] Feel free to use 'aren't' instead of 'are not'.
```python
def update_by_filter(filters: dict[str, Any], meta: dict[str, Any], refresh: bool = False) -> int:
def delete_by_filter(filters: dict[str, Any]) -> int:
```
These methods are not part of the protocol, but implementing them in your custom Document Store is highly recommended, as users often expect them to be available.
📝 [vale] reported by reviewdog 🐶
[Google.Contractions] Feel free to use 'aren't' instead of 'are not'.
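To make the two quoted signatures concrete, here is a minimal sketch of how a custom store might implement them. Everything in it is hypothetical: `ToyDocument` and `ToyDocumentStore` are illustrative stand-ins rather than Haystack classes, and filters are simplified to plain `{meta_key: value}` equality checks instead of Haystack's real filter syntax. Both methods return the number of affected documents, matching the `-> int` return type in the signatures.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document: id, text, and metadata."""
    id: str
    content: str
    meta: dict[str, Any] = field(default_factory=dict)


class ToyDocumentStore:
    """Hypothetical in-memory store used only for illustration."""

    def __init__(self) -> None:
        self._docs: dict[str, ToyDocument] = {}

    def write_documents(self, documents: list[ToyDocument]) -> int:
        for doc in documents:
            self._docs[doc.id] = doc
        return len(documents)

    def count_documents(self) -> int:
        return len(self._docs)

    def _matches(self, doc: ToyDocument, filters: dict[str, Any]) -> bool:
        # Assumption: filters are simple equality checks on metadata keys.
        return all(doc.meta.get(key) == value for key, value in filters.items())

    def update_by_filter(self, filters: dict[str, Any], meta: dict[str, Any], refresh: bool = False) -> int:
        # `refresh` is accepted to mirror the quoted signature; an in-memory
        # store has no index to refresh, so it is ignored here.
        updated = 0
        for doc in self._docs.values():
            if self._matches(doc, filters):
                doc.meta.update(meta)
                updated += 1
        return updated

    def delete_by_filter(self, filters: dict[str, Any]) -> int:
        # Collect ids first so the dict is not mutated while iterating.
        to_delete = [doc_id for doc_id, doc in self._docs.items() if self._matches(doc, filters)]
        for doc_id in to_delete:
            del self._docs[doc_id]
        return len(to_delete)


store = ToyDocumentStore()
store.write_documents([
    ToyDocument(id="a", content="first", meta={"lang": "en"}),
    ToyDocument(id="b", content="second", meta={"lang": "en"}),
    ToyDocument(id="c", content="dritte", meta={"lang": "de"}),
])
updated = store.update_by_filter({"lang": "en"}, {"reviewed": True})
deleted = store.delete_by_filter({"lang": "de"})
```

A real implementation would translate the filters into the underlying database's query language rather than scanning documents in Python.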
1. Implement the logic for `count_documents`.
2. In your `test_document_store.py` module, define the test class `TestDocumentStore(CountDocumentsTest)`. Note how we only inherit from the specific testing mix-in `CountDocumentsTest`.
[Google.We] Try to avoid using first-person plural like 'we'.
6. Keep iterating with the remaining methods.
- Having a notebook where users can try out your Document Store in a full pipeline can really help adoption, and it’s a great source of documentation. Our [haystack-cookbook](https://github.com/deepset-ai/haystack-cookbook) repository has good visibility, and we encourage contributors to create a PR and add their own.

The [tests](https://github.com/deepset-ai/haystack/blob/main/haystack/testing/document_store.py) in `DocumentStoreBaseTests` give you a good idea of the overall expected behavior of a Document Store and the operations it should support. Following it is a good way to make sure your implementation is consistent with the rest of the Haystack ecosystem.
📝 [vale] reported by reviewdog 🐶
[Google.Contractions] Feel free to use 'it's' instead of 'it is'.
If the technology you are using for your Document Store supports asynchronous operations, we recommend implementing `async` versions of the methods in the `DocumentStore` protocol as well. This will allow users to take advantage of async features in their applications and pipelines, improving performance and scalability.
[Google.We] Try to avoid using first-person plural like 'we'.
[Google.Will] Avoid using 'will'.
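As a rough illustration of the async recommendation in the quoted paragraph, the sketch below pairs each synchronous method with an `async` counterpart. The class `ToyAsyncDocumentStore` is hypothetical, documents are plain dictionaries, and the `_async` suffix is a naming assumption, since the quoted text does not specify one. `asyncio.sleep(0)` stands in for awaiting a real asynchronous database client.

```python
import asyncio
from typing import Any


class ToyAsyncDocumentStore:
    """Hypothetical in-memory store with sync and async method pairs."""

    def __init__(self) -> None:
        self._docs: dict[str, dict[str, Any]] = {}

    def count_documents(self) -> int:
        return len(self._docs)

    def write_documents(self, documents: list[dict[str, Any]]) -> int:
        for doc in documents:
            self._docs[doc["id"]] = doc
        return len(documents)

    async def count_documents_async(self) -> int:
        # Placeholder for awaiting an async database call.
        await asyncio.sleep(0)
        return len(self._docs)

    async def write_documents_async(self, documents: list[dict[str, Any]]) -> int:
        # Placeholder for awaiting an async database call.
        await asyncio.sleep(0)
        for doc in documents:
            self._docs[doc["id"]] = doc
        return len(documents)


async def main() -> int:
    store = ToyAsyncDocumentStore()
    # Two writes are scheduled concurrently on the same event loop.
    await asyncio.gather(
        store.write_documents_async([{"id": "1", "content": "first"}]),
        store.write_documents_async([{"id": "2", "content": "second"}]),
    )
    return await store.count_documents_async()


total = asyncio.run(main())
```

Keeping the synchronous methods alongside the async ones lets the same store serve both kinds of callers.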
anakin87 left a comment:
I left some minor comments.
Please also apply these changes to docs-website/versioned_docs/version-2.24/concepts/document-store/creating-custom-document-stores.mdx, so that they will be published in the stable docs.
docs-website/docs/concepts/document-store/creating-custom-document-stores.mdx
The [tests](https://github.com/deepset-ai/haystack/blob/main/haystack/testing/document_store.py) in `DocumentStoreBaseTests` give you a good idea of the overall expected behavior of a Document Store and the operations it should support. Following it is a good way to make sure your implementation is consistent with the rest of the Haystack ecosystem.
Here, I'd say that DocumentStoreBaseTests is the minimum requirement but using DocumentStoreBaseExtendedTests would be desirable. WDYT?
…ment-stores.mdx Co-authored-by: Stefano Fiorucci <[email protected]>
Some additional methods that are not part of the `DocumentStore` protocol, but are implemented by most Document Stores in Haystack, include:
📝 [vale] reported by reviewdog 🐶
[Google.Contractions] Feel free to use 'aren't' instead of 'are not'.