- (11 min read) TL;DR: You’re building a semantic caching system using Spring AI and Redis to improve LLM application performance. Unlike traditional caching that requires exact query matches, semantic caching understands the meaning behind queries and can return cached responses for semantically similar questions. It works by storing query-response pairs as vector embeddings in Redis, allowing your application to…
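The cache-hit logic this TL;DR describes can be sketched in plain Java. This is an illustration only, not the article's implementation: the embedding model and the Redis vector store are replaced by a hypothetical in-memory list and a cosine-similarity scan, and the similarity threshold is invented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a semantic cache keyed by embedding similarity.
// In the article's setup, Spring AI produces the embeddings and Redis stores
// and searches them; here a plain list and cosine similarity stand in for both.
public class SemanticCache {
    record Entry(float[] embedding, String response) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold; // minimum similarity to count as a "hit"

    public SemanticCache(double threshold) { this.threshold = threshold; }

    public void put(float[] queryEmbedding, String response) {
        entries.add(new Entry(queryEmbedding, response));
    }

    // Returns the cached response of the most similar stored query,
    // or null when nothing is close enough (a cache miss).
    public String get(float[] queryEmbedding) {
        Entry best = null;
        double bestScore = -1;
        for (Entry e : entries) {
            double score = cosine(queryEmbedding, e.embedding());
            if (score > bestScore) { bestScore = score; best = e; }
        }
        return (best != null && bestScore >= threshold) ? best.response() : null;
    }

    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```

The key difference from a traditional cache is visible in `get`: a lookup succeeds when a stored query is *close enough*, not only when it is identical.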
- (20 min read) TL;DR: You’re building an AI agent with memory using Spring AI and Redis. Unlike traditional chatbots that forget previous interactions, memory-enabled agents can recall past conversations and facts. It works by storing two types of memory in Redis: short-term (conversation history) and long-term (facts and experiences as vectors), allowing agents to provide personalized, context-aware responses. LLMs…
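The two memory types the TL;DR names can be sketched as plain data structures. This is a hypothetical illustration, not the article's design: Redis and the vector representation of long-term memory are omitted, and all names are invented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch of the two memory types the article stores in Redis:
// short-term memory (a bounded conversation history) and long-term memory
// (facts that persist across conversations). Redis and vectors are mocked out.
public class AgentMemory {
    private final Deque<String> shortTerm = new ArrayDeque<>();
    private final List<String> longTerm = new ArrayList<>();
    private final int maxTurns;

    public AgentMemory(int maxTurns) { this.maxTurns = maxTurns; }

    // Short-term: keep only the last maxTurns messages of the conversation.
    public void remember(String message) {
        shortTerm.addLast(message);
        if (shortTerm.size() > maxTurns) shortTerm.removeFirst();
    }

    // Long-term: facts survive no matter how long the conversation runs.
    public void learnFact(String fact) { longTerm.add(fact); }

    public List<String> recentHistory() { return List.copyOf(shortTerm); }
    public List<String> knownFacts() { return List.copyOf(longTerm); }
}
```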
- Did you know the Deep Java Library (DJL) powers Spring AI and Redis OM Spring? DJL helps you run machine learning models right inside your Java applications. Check them out:
  - Spring AI with DJL: https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html
  - Semantic Search with SpringBoot & Redis: https://medium.com/redis-with-raphael-de-lio/semantic-search-with-spring-boot-redis-ef376bbdb106

  TL;DR: Zero-shot classification is a machine learning technique that allows models to classify text…
- TL;DR: You’re building an AI-powered app that needs to send lots of prompts to OpenAI. Instead of sending them one by one, you want to do it in bulk — efficiently and safely. This is how you can use Spring AI with Java Virtual Threads to process hundreds of prompts in parallel. When calling LLM APIs like…
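The fan-out pattern the TL;DR describes can be sketched with Java 21 virtual threads. This is a sketch under assumptions, not the article's code: `callLlm` is a hypothetical stand-in for a Spring AI chat call, and real usage would add rate limiting and error handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Sketch of fanning out many blocking LLM calls on virtual threads (Java 21+).
// Each task gets its own cheap virtual thread, so hundreds of blocking
// network calls can run in parallel without exhausting platform threads.
public class BulkPrompts {
    static String callLlm(String prompt) {
        // Hypothetical stand-in: a real call would block on network I/O.
        return "response to: " + prompt;
    }

    public static List<String> processAll(List<String> prompts) throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Callable<String>> tasks = prompts.stream()
                    .map(p -> (Callable<String>) () -> callLlm(p))
                    .toList();
            List<String> results = new ArrayList<>();
            for (Future<String> f : executor.invokeAll(tasks)) {
                results.add(f.get()); // preserves the order prompts were submitted in
            }
            return results;
        } // try-with-resources waits for all tasks before closing the executor
    }
}
```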
- (14 min read) TL;DR: You’re building a semantic search app using Spring Boot and Redis. Instead of matching exact words, semantic search finds meaning using Vector Similarity Search (VSS). It works by turning movie synopses into vectors with embedding models, storing them in Redis (as a vector database), and finding the closest matches to user queries. (Video: What is semantic search?) A traditional searching system works by matching the words a user types…
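The "closest matches" step can be sketched as a nearest-neighbor ranking in plain Java. This is an illustration only: in the article, embedding models produce the vectors and Redis performs the similarity search; both are mocked out here, and the toy vectors are invented.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of vector similarity search: rank documents by cosine
// similarity to the query vector and keep the top k. Redis does this at
// scale with dedicated index structures; this is a linear scan for clarity.
public class VectorSearch {
    public static List<String> topK(float[] query, Map<String, float[]> docs, int k) {
        return docs.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, float[]> e) -> -cosine(query, e.getValue())))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```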
- (16 min read) This content is also available on YouTube. Check it out! The Sliding Window Counter offers a more efficient way to handle rate limiting compared to the Sliding Window Log. While the Sliding Window Log keeps an exact log of timestamps for each request, allowing precise tracking over a rolling time period, this precision comes at the cost of higher…
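The trade-off this teaser describes can be sketched as follows: keep one counter per fixed window and weight the previous window's count by how much of it still overlaps the rolling period. A minimal sketch, with the clock passed in for deterministic behavior; the window size and limit are invented.

```java
// Sketch of the Sliding Window Counter: instead of logging every request,
// keep one counter per fixed window and estimate the rolling count as
// currentCount + previousCount * (overlap fraction of the previous window).
public class SlidingWindowCounter {
    private final long windowMillis;
    private final int limit;
    private long currentWindowStart = -1;
    private long currentCount = 0;
    private long previousCount = 0;

    public SlidingWindowCounter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    public synchronized boolean allow(long nowMillis) {
        long windowStart = (nowMillis / windowMillis) * windowMillis;
        if (windowStart != currentWindowStart) {
            // Roll over: the old count carries over only if the windows are adjacent.
            previousCount = (windowStart - currentWindowStart == windowMillis) ? currentCount : 0;
            currentWindowStart = windowStart;
            currentCount = 0;
        }
        // Fraction of the previous window still inside the rolling period.
        double overlap = 1.0 - (double) (nowMillis - windowStart) / windowMillis;
        double estimated = currentCount + previousCount * overlap;
        if (estimated < limit) {
            currentCount++;
            return true;
        }
        return false;
    }
}
```

Memory use is constant (two counters) regardless of traffic, which is the efficiency gain over storing a timestamp per request.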
- (14 min read) This article is also available on YouTube. Check it out! The Sliding Window Log is a more precise way to handle rate limiting. Instead of splitting time into fixed intervals like the Fixed Window Counter, it keeps a log of timestamps for each request. This allows it to track requests over a rolling time…
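The timestamp log the teaser describes can be sketched in a few lines. A minimal illustration with an injected clock; the window size and limit are invented, and a production version would store the log in Redis rather than in memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the Sliding Window Log: record a timestamp per request and count
// only the ones inside the rolling window. Precise, but memory grows with
// the request rate.
public class SlidingWindowLog {
    private final long windowMillis;
    private final int limit;
    private final Deque<Long> log = new ArrayDeque<>();

    public SlidingWindowLog(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    public synchronized boolean allow(long nowMillis) {
        // Evict timestamps that have slid out of the rolling window.
        while (!log.isEmpty() && log.peekFirst() <= nowMillis - windowMillis) {
            log.removeFirst();
        }
        if (log.size() < limit) {
            log.addLast(nowMillis);
            return true;
        }
        return false;
    }
}
```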
- (14 min read) This article is also available on YouTube! The Token Bucket algorithm is a flexible and efficient rate-limiting mechanism. It works by filling a bucket with tokens at a fixed rate (e.g., one token per second). Each request consumes a token, and if no tokens are available, the request is rejected. The bucket has a maximum…
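The refill-and-consume mechanics the teaser describes can be sketched directly. A minimal illustration, with the clock injected so refills are deterministic; the capacity and refill rate are invented.

```java
// Sketch of the Token Bucket: tokens accumulate at a fixed rate up to a
// maximum capacity; each request consumes one token or is rejected.
public class TokenBucket {
    private final long capacity;
    private final double refillPerMillis;
    private double tokens;
    private long lastRefillMillis;

    public TokenBucket(long capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMillis = refillPerSecond / 1000.0;
        this.tokens = capacity;          // start with a full bucket
        this.lastRefillMillis = nowMillis;
    }

    public synchronized boolean tryConsume(long nowMillis) {
        // Add tokens earned since the last check, capped at capacity.
        tokens = Math.min(capacity, tokens + (nowMillis - lastRefillMillis) * refillPerMillis);
        lastRefillMillis = nowMillis;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```

The capacity is what makes the algorithm flexible: a full bucket absorbs short bursts, while the refill rate bounds the sustained throughput.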
- (14 min read) This article is also available on YouTube! The Fixed Window Counter is the simplest and most straightforward rate-limiting algorithm. It divides time into fixed intervals (e.g., seconds, minutes, or hours) and counts the number of requests within each interval. If the count exceeds a predefined threshold, the requests are rejected until the next interval begins. Looking for…
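The interval-and-counter scheme the teaser describes fits in a few lines. A minimal sketch with an injected clock; the interval length and threshold are invented.

```java
// Sketch of the Fixed Window Counter: divide time into fixed intervals and
// count requests per interval; reject once the threshold is reached, and
// reset the counter when the next interval begins.
public class FixedWindowCounter {
    private final long windowMillis;
    private final int limit;
    private long currentWindowStart = -1;
    private int count = 0;

    public FixedWindowCounter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    public synchronized boolean allow(long nowMillis) {
        long windowStart = (nowMillis / windowMillis) * windowMillis;
        if (windowStart != currentWindowStart) {
            currentWindowStart = windowStart; // a new interval begins
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

The simplicity comes at a cost the series goes on to address: a burst straddling an interval boundary can briefly admit up to twice the limit.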
- (7 min read) This article is also available on YouTube! Rate limiting — it’s something you’ve likely encountered, even if you haven’t directly implemented one. For example, have you ever been greeted by a “429 Too Many Requests” error? That’s a rate limiter in action, protecting a resource from overload. Or maybe you’ve used a service with explicit request quotas…
