Skip to content

Conversation

@minchopaskal
Copy link
Collaborator

Description

Introducing a new method for identifying hotkeys inside a redis server during a tracking time period.

Hotkeys in this context are defined by two metrics:

  • Percentage of time spend by cpu on the key from the total time during the tracking period
  • Percentage of network bytes (input+output) used for the key from the total network bytes used by redis during the tracking period

Usage

Although the API is subject to change the general idea is for the user to initiate a hotkeys tracking process which should run for some time. The keys' metrics are recorded inside a probabilistic structure and after that the user is able to fetch the top K of them.

Current API

HOTKEYS START	[COUNT k] 
			    [DURATION duration]
			    [SAMPLE ratio]
			    [SLOTS count slot…]

HOTKEYS STOP
HOTKEYS GET 	[MINCPU mincpu] 
			    [MINNET minnet]

HOTKEYS GET

127.0.0.1:6379> hotkeys get
1) "collection-start-time
2) <start-time-unix-timestamp-in-seconds>
3) "collection-duration"
4) <duration-in-seconds>
5) "by-cpu-time"
6) 1) key-1_1
   2) <%-from-total-cpu-time>
   ...
   19) key-10_1
   20) <%-from-total-cpu-time>
7) 1) "by-net-bytes"
8) 1) key-1_2
   2) <%-from-total-net-bytes>
   ...
   19) key-10_2
   	20) <%-from-total-net-bytes>

Other ideas

Instead of introducing a new command we can instead use new configurations similar to this PR

Implementation

Independent of API, implementation is based on a probabilistic structure - Cuckoo Heavy Keeper structure with added min-heap to keep track of top K hotkey's names. CHK is an loosely based on HeavyKeeper which is used in RedisBloom's TopK but has higher throughput.

Random fixed probability sampling via the HOTKEYS start sample <ratio> param. Each key is sampled with probability 1/ratio.

Performance implications

With low enough sample rate (controlled by HOTKEYS start sample <ratio>) there is negligible performance hit. Tracking every key though can incur up to 15% hit in the worst case after running the tests in this bench.

Add new TopK structure based on CuckooHeavyKeeper algorithm

Add new HOTKEYS (START|STOP|GET) command
setDeferredArrayLen(c, replylen, cpu_toshow);

addReplyBulkCString(c, "by-net-bytes");
chkHeapBucket *net = chkTopKList(server.hotkeys.net);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible chkTopKList to return NULL here?


addReplyBulkCString(c, "by-cpu-time");
/* Sorted list of top K elements in the chkTopK structure */
chkHeapBucket *cpu = chkTopKList(server.hotkeys.cpu);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible chkTopKList to return NULL here?

return NULL;
}

memset(topk->tables[i], 0, sizeof(chkBucket) * numbuckets);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check ztrycalloc implementation. I think it initialize the array with 0.

zfree(topk);
return NULL;
}
memset(topk->heap, 0, sizeof(chkHeapBucket) * k);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this if ztrycalloc initialize it with 0.

if (!server.hotkeys.cpu) return;

server.hotkeys.net = chkTopKCreate(count * 10, nearestNextPowerOf2((unsigned)count * 100), 1.08);
if (!server.hotkeys.net) return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we fail here. Do we need to destroy server.hotkeys.cpu.

* enough) again for better accuracy. Note the CHK implementation uses a
* power of 2 numbuckets for better cache locality. */
server.hotkeys.cpu = chkTopKCreate(count * 10, nearestNextPowerOf2((unsigned)count * 100), 1.08);
if (!server.hotkeys.cpu) return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens with the slots if we fail here.

return NULL;

/* Heap uses different hash than the cuckoo tables */
uint64_t fp = MurmurHash64A(item, itemlen, HEAP_SEED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the HEAP_SEED? Can we keep in fpAndIdx the hash calculated in generateItemFpAndIdxs and use it here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different seed values can be used when you need independent (non-correlated) hash functions. The specific value doesn't matter.
Anyway, as above, I suggest replacing with XXH64, which should be faster.

* sample_ratio was 1 we add all keys.
* Note, for multi-key commands we use the same probability for all the keys. */
if (server.hotkeys.sample_ratio > 1 &&
(double)rand() / RAND_MAX >= 1. / server.hotkeys.sample_ratio)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably faster:

(double)rand() * server.hotkeys.sample_ratio >= RAND_MAX

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. This is a micro-optimization but worthwhile for a hot path.

double prob = (new_count - LOBBY_PROMOTION_THRESHOLD) /
(double)(min - LOBBY_PROMOTION_THRESHOLD);

if ((rand() / (double)RAND_MAX) >= prob) return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing division with multiplication should be faster.
(+2 places below)


#include "stdint.h"

#define CHK_LUT_SIZE 256
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add documentation for these constants?

"collection-start-time": {
"type": "integer"
},
"collection-duration": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add units in the description

"Lobby promotion threshold should be less then the LUT size to "
"ensure constant operations during decayCounter!");

#define MAX_KICKS 16
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add documentation for these constants?

}

fpAndIdx generateItemFpAndIdxs(chkTopK *topk, char *item, int itemlen) {
uint64_t hash = MurmurHash64A(item, itemlen, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using XXH64 instead (should be much faster)


/* HOTKEYS command table */
struct COMMAND_STRUCT HOTKEYS_Subcommands[] = {
{MAKE_CMD("get","Returns two lists of top K hotkeys - one by cpu time and one by network bytes.","O(K) where K is the number of hotkeys to return.","8.0.0",CMD_DOC_NONE,NULL,NULL,"server",COMMAND_GROUP_SERVER,HOTKEYS_GET_History,0,HOTKEYS_GET_Tips,0,hotkeysCommand,-2,CMD_ADMIN|CMD_NOSCRIPT,0,HOTKEYS_GET_Keyspecs,0,NULL,2),.args=HOTKEYS_GET_Args},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8.6.0?


/* HOTKEYS command table */
struct COMMAND_STRUCT HOTKEYS_Subcommands[] = {
{MAKE_CMD("get","Returns two lists of top K hotkeys - one by cpu time and one by network bytes.","O(K) where K is the number of hotkeys to return.","8.0.0",CMD_DOC_NONE,NULL,NULL,"server",COMMAND_GROUP_SERVER,HOTKEYS_GET_History,0,HOTKEYS_GET_Tips,0,hotkeysCommand,-2,CMD_ADMIN|CMD_NOSCRIPT,0,HOTKEYS_GET_Keyspecs,0,NULL,2),.args=HOTKEYS_GET_Args},
Copy link
Member

@LiorKogan LiorKogan Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "where K is the number of hotkeys returned" instead of "where K is the number of hotkeys to return"

(we may return a smaller number)

return res;
}

double getDecayProb(chkTopK *topk, counter_t cnt) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mark the following 3 functions as static inline?

Copy link

@shahsb shahsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scalability in Clusters: How does this behave in Redis Cluster? Slot filtering helps, but aggregating across nodes might need future work.

Comparison to Alternatives: How does CHK compare to other sketches like Count-Min or HyperLogLog in this context? A brief rationale would strengthen the case.

/* Our hash function is MurmurHash2, 64 bit version.
* It was modified for Redis in order to provide the same result in
* big and little endian archs (endian neutral). */
uint64_t MurmurHash64A(const void * key, size_t len, unsigned int seed);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace MurmurHash64A with XXH64 as suggested -- it's faster and already in Redis' toolkit.

* sample_ratio was 1 we add all keys.
* Note, for multi-key commands we use the same probability for all the keys. */
if (server.hotkeys.sample_ratio > 1 &&
(double)rand() / RAND_MAX >= 1. / server.hotkeys.sample_ratio)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. This is a micro-optimization but worthwhile for a hot path.

{MAKE_ARG("minnet",ARG_TYPE_INTEGER,-1,"MINNET",NULL,NULL,CMD_ARG_OPTIONAL,0,NULL)},
};

/********** HOTKEYS START ********************/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if HOTKEYS START is called while tracking is already active? Should it error or reset?

# Check that all returned keys (based on cpu time) are from our hot_keys list
set num_returned_cpu [llength $returned_cpu_keys]
assert_lessthan_equal $num_returned_cpu 10
assert_morethan $num_returned_cpu 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test with very low/high sample ratios -- e.g., ratio=1 (track everything) vs. ratio=1000 (sparse sampling).

assert_lessthan_equal $num_returned_cpu 10
assert_morethan $num_returned_cpu 0

foreach key $returned_cpu_keys {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add more unit tests for edge cases like zero-duration tracking, invalid params, or concurrent commands.

# Same as cpu-time but for net-bytes
set num_returned_net [llength $returned_net_keys]
assert_lessthan_equal $num_returned_net 10
assert_morethan $num_returned_net 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration tests with real workloads (e.g., simulating hot keys via benchmarks) would validate the accuracy of CHK.

Comment on lines +64 to +69
for (int32_t i = topk->k - 1; i >= 0; --i)
if (fp == (runner + i)->fp && itemlen == (runner + i)->itemlen &&
memcmp((runner + i)->item, item, itemlen) == 0) {
return runner + i;
}
return NULL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int32_t i = topk->k - 1; i >= 0; --i)
if (fp == (runner + i)->fp && itemlen == (runner + i)->itemlen &&
memcmp((runner + i)->item, item, itemlen) == 0) {
return runner + i;
}
return NULL;
for (int32_t i = topk->k - 1; i >= 0; --i) {
chkHeapBucket *bucket = runner + i;
if (bucket->fp == fp &&
bucket->itemlen == itemlen &&
memcmp(bucket->item, item, itemlen) == 0)
{
return bucket;
}
}
return NULL;

not sure about variable name but maybe we can make this part little bit easier on eyes.

}

chkHeapBucket top = {0};
memcpy(&top, &array[start], sizeof(chkHeapBucket));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
memcpy(&top, &array[start], sizeof(chkHeapBucket));
top = array[start];

maybe you can just assign instead of mempcy calls in this function?

/* chkTopK operations */

chkTopK *chkTopKCreate(int k, int numbuckets, double decay) {
/* Number of buckets need to be a power of 2 for better performance - we
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need some input validation for this function? e.g. k >= 0
I didn't review rest of the PR yet, maybe it is done externally, then it might be okay (and if we have enough tests for it)

return NULL;
}

chkTopK *topk = ztrymalloc(sizeof(chkTopK));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we generally use trymalloc when we are making a huge allocation (potentially several GBs) and we don't want to abort the process if it fails. For small allocations, we generally user malloc() family and never expect them to fail (zmalloc() will abort on OOM). Are we expecting a huge memory allocation for this structure? Otherwise, maybe we should not bother with calling trymalloc functions.

"summary": "Returns two lists of top K hotkeys - one by cpu time and one by network bytes.",
"complexity": "O(K) where K is the number of hotkeys to return.",
"group": "server",
"since": "8.0.0",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"since": "8.0.0",
"since": "8.6.0",

similar issue in other commands


topk->heap[0].count = entry->count;
topk->heap[0].fp = fp;
topk->heap[0].item = chkStrndup(item, itemlen);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you storing a byte array or a null terminated string? I think you never reach this as a string? In any case, I feel like a separate chkStrndup() function might be unnecessary as it is used once. If you think otherwise, maybe chkStrndup() should be in zmalloc as zstrndup().


/* Update weighted item. If another one was expelled from the topK list -
* return its name. Caller is responsible for releasing it */
char *chkTopKUpdate(chkTopK *topk, char *item, int itemlen, counter_t weight);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I commented above, I feel like we are mixing byte array and null terminated strings. This function accepts new item as a byte array but returns previous item as a null terminated string. Maybe it is better not to mix these two. Maybe function should accept int *outlen to return item len.

dict *seen_slots = dictCreate(&hashDictType);
for (int i = 0; i < slots_count; i++) {
long slot_val;
if (getRangeLongFromObjectOrReply(c, c->argv[j+2+i], 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use getSlotOrReply() here

@tezc
Copy link
Collaborator

tezc commented Jan 10, 2026

@minchopaskal I had a quick pass over the PR and left a few minor comments. I assume the API is not final yet so we don't have more details about the commands in the top comment. e.g. what is SLOTS or SAMPLE. I assume we also don't have more tests because of this.

Regarding cuckoo impl, maybe it is a good idea to add some comments over each function (only to important ones) to describe what it is doing. So, people who have no idea about the algorithm will be a chance to understand what is going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants