Bloom Words

A lightweight Go library for efficient English word validation using Bloom filters. Perfect for spell-checking, word games, and text validation with a minimal memory footprint.

What is Bloom Words?

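Bloom Words checks whether a string is an English word by querying a Bloom filter built from a word list, instead of storing the list itself. Lookups are fast and the filter is small, at the cost of an occasional false positive: a non-word may sometimes be reported as valid, but a word in the dictionary is never rejected.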

Features

🚀 Fast Lookup: Constant-time word lookup, independent of dictionary size, using a Bloom filter
💾 Memory Efficient: Compact bitset-backed filter, much smaller than storing every word
📖 Comprehensive Dictionary: Pre-built filter with 370,000+ English words
⚡ Streaming Support: Efficiently handle large datasets with minimal memory usage
🧪 Well Tested: Includes a comprehensive test suite

Quick Stats:

370K+ English words compressed into just ~500KB of data
Sub-microsecond lookups - test a word in less than 1 microsecond
~1% false positive rate - on average, about 1 in 100 non-words is reported as valid
Zero false negatives - a word that is in the dictionary is never rejected
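
The size and false positive figures above are in line with the textbook Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below plugs in n = 370,000 and p = 0.01 taken from the stats; the function name optimalParams is illustrative and not part of this library, and the parameters the library actually uses are not stated here.

```go
package main

import (
	"fmt"
	"math"
)

// optimalParams returns the textbook Bloom filter size in bits (m) and the
// number of hash functions (k) for n items and a target false positive rate p.
// Hypothetical helper for illustration; not part of bloom-words.
func optimalParams(n int, p float64) (m, k int) {
	bits := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	m = int(math.Ceil(bits))
	k = int(math.Round(bits / float64(n) * math.Ln2))
	return m, k
}

func main() {
	m, k := optimalParams(370000, 0.01)
	fmt.Printf("bits: %d (~%d KB), hash functions: %d\n", m, m/8/1000, k)
}
```

With these inputs the formula gives roughly 3.5 million bits, which is on the order of 440 KB of raw bitset, and 7 hash functions. A serialized filter can be somewhat larger because of headers or rounding.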

Installation

go get github.com/oosawy/bloom-words

Usage

Basic Word Lookup

package main

import (
	"fmt"

	bw "github.com/oosawy/bloom-words"
)

func main() {
	// Test if a word exists in the dictionary
	if bw.Test("hello") {
		fmt.Println("'hello' is a valid word")
	}

	if !bw.Test("xyzabc") {
		fmt.Println("'xyzabc' is likely not a valid word")
	}
}
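
Building on the lookup above, a rough sketch of a spell-check pass over a sentence could look like the following. The helper unknownWords is hypothetical, and it takes the lookup as a func(string) bool so the example runs without the library; in a real program bw.Test could be passed in. Lowercasing each token is an assumption about how the dictionary is stored, which this README does not state.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// unknownWords returns the tokens in text that the lookup function rejects.
// lookup is any func(string) bool; bw.Test has a compatible shape.
func unknownWords(text string, lookup func(string) bool) []string {
	// Split on anything that is not a letter, so punctuation and digits are dropped.
	tokens := strings.FieldsFunc(text, func(r rune) bool { return !unicode.IsLetter(r) })
	var unknown []string
	for _, tok := range tokens {
		// Assumption: the dictionary holds lowercase words only.
		if !lookup(strings.ToLower(tok)) {
			unknown = append(unknown, tok)
		}
	}
	return unknown
}

func main() {
	// Stand-in dictionary so the example runs without the library.
	dict := map[string]bool{"hello": true, "world": true}
	lookup := func(w string) bool { return dict[w] }
	fmt.Println(unknownWords("Hello, wrold! Hello world.", lookup)) // prints [wrold]
}
```

Because a Bloom filter can return false positives, a pass like this may miss a small fraction of misspellings, but it will not flag a word that is in the dictionary.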

How It Works

Bloom Words uses Go's go:embed directive to compile the pre-built Bloom filter (filter/bloom_words.bf) directly into the binary, so no separate data file has to be shipped or opened at runtime. The embedded filter is loaded into memory during initialization, and every subsequent lookup runs against this in-memory data in time that does not depend on the number of words in the dictionary.
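
For readers new to the data structure, here is a toy Bloom filter that shows why lookups are constant-time and why false negatives cannot occur. It is a sketch under assumptions, not this library's implementation: the type bloom, the FNV-1a hash, and the double-hashing scheme are all chosen for illustration, and the filter is built in memory rather than loaded from an embedded file.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter: m bits, probed at k positions per item.
type bloom struct {
	bits []uint64
	m    uint64
	k    uint64
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes from one 64-bit FNV-1a digest.
func hashes(s string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(s))
	sum := h.Sum64()
	h1 := sum & 0xffffffff
	h2 := sum>>32 | 1 // force odd so the step between probes is never zero
	return h1, h2
}

// Add sets the k bits selected by double hashing: (h1 + i*h2) mod m.
func (b *bloom) Add(s string) {
	h1, h2 := hashes(s)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		b.bits[idx/64] |= uint64(1) << (idx % 64)
	}
}

// Test reports whether all k bits for s are set. The loop always runs at
// most k times, regardless of how many items were added.
func (b *bloom) Test(s string) bool {
	h1, h2 := hashes(s)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		if b.bits[idx/64]&(uint64(1)<<(idx%64)) == 0 {
			return false // definitely never added
		}
	}
	return true // possibly added
}

func main() {
	f := newBloom(1<<16, 7)
	for _, w := range []string{"hello", "world", "bloom"} {
		f.Add(w)
	}
	fmt.Println(f.Test("hello"))  // true: added words are always found
	fmt.Println(f.Test("xyzabc")) // false is definitive; true would be a false positive
}
```

Adding a word only ever sets bits, so every bit a word needs stays set once the word has been added. That is the reason a Bloom filter has no false negatives, while unrelated words that happen to hit already-set bits produce the false positives.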

Building the Filter

To rebuild the Bloom filter from the word list:

go run ./cmd/build

This reads from datasets/words_alpha.txt and generates a new filter/bloom_words.bf.

Dataset

The English word dataset used in this project is sourced from dwyl/english-words.

Testing

Run the test suite:

go test -v ./...

License

MIT
