Optimizing ZeroMQ for High-Throughput Low-Latency Messaging

I am working on a high-performance messaging system using ZeroMQ, and I am encountering issues with handling high-volume requests. The system needs to handle about 150K requests per second with a response time of less than 30ms. Here's my current setup:

Hub: Acts as a centralized message receiver and binds a ROUTER socket. REQ sockets (clients) connect to this ROUTER.
Hashmap: The Hub maintains a hashmap of DEALER sockets and worker REP sockets to distribute the load.

Message Flow:

Clients (REQ) send multipart messages where the first part is matched by a key in the hashmap.
If the key is not in the hashmap, the message exits the system.
If the key is found, the message is forwarded to the corresponding DEALER socket.
The DEALER socket forwards the message to worker REP sockets.
Workers process the message and send the response back through the DEALER socket.
The DEALER socket sends the response back to the Hub.
The Hub routes the response back to the client.
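
To make the flow concrete, here is a minimal sketch of the Hub's routing decision. It is plain Python with no ZeroMQ calls, and the names (`split_envelope`, `pick_backend`, and the `backends` table with string placeholders) are hypothetical and not taken from my actual code. It assumes the usual framing a ROUTER sees from a REQ client: an identity frame, an empty delimiter frame, then the message parts.

```python
from typing import Dict, List, Optional, Tuple

Frames = List[bytes]


def split_envelope(frames: Frames) -> Tuple[Frames, Frames]:
    """Split frames received on the ROUTER into (envelope, body).

    The envelope is everything up to and including the first empty
    delimiter frame; the body is the client's multipart message.
    """
    idx = frames.index(b"")  # raises ValueError if there is no delimiter
    return frames[: idx + 1], frames[idx + 1:]


def pick_backend(frames: Frames, backends: Dict[bytes, str]) -> Optional[str]:
    """Return the backend registered for the message key, or None."""
    _, body = split_envelope(frames)
    if not body:
        return None
    # The first body frame is the key that is matched against the hashmap.
    return backends.get(body[0])


# Hypothetical table: in the real Hub the values would be DEALER sockets.
backends = {b"orders": "dealer-orders", b"quotes": "dealer-quotes"}

incoming = [b"client-1", b"", b"orders", b"payload"]
print(pick_backend(incoming, backends))
print(pick_backend([b"client-2", b"", b"unknown", b"x"], backends))
```

In this sketch the whole frame list, envelope included, would be forwarded through the DEALER. A REP socket keeps the envelope and re-attaches it to its reply, which is what lets the Hub route the response back to the right client.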

The requests are distributed across multiple pods, but once a single pod goes above 5K requests, there is a significant delay before the response gets from the DEALER back to the ROUTER. To address this, I initially thought of spawning a thread as soon as I match a key in the hashmap, but since ZeroMQ sockets are not thread-safe, I am unsure how to proceed.
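
For a rough sense of scale, Little's law (requests in flight = arrival rate × time in system) can be applied to these numbers. This is only a back-of-the-envelope sketch: it assumes steady state, treats the 30 ms target as the time each request spends in the system, and reads "5K requests" as 5K requests per second per pod. The function and constant names are made up for illustration.

```python
def in_flight(rate_per_s: int, latency_ms: int) -> float:
    """Little's law: average number of requests in the system at once."""
    return rate_per_s * latency_ms / 1000


TOTAL_RATE = 150_000  # target requests per second, whole system
POD_RATE = 5_000      # assumed requests per second on one pod
LATENCY_MS = 30       # response-time ceiling in milliseconds

pods_needed = -(-TOTAL_RATE // POD_RATE)  # ceiling division

print(in_flight(TOTAL_RATE, LATENCY_MS))  # 4500.0
print(in_flight(POD_RATE, LATENCY_MS))    # 150.0
print(pods_needed)                        # 30
```

Under those assumptions, a pod running right at the 30 ms ceiling would have up to about 150 requests in progress at any moment, which is the level of concurrency the DEALER and the workers behind it would need to sustain.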

My questions are:

How can I handle high-volume requests effectively given that ZeroMQ sockets are not thread-safe?
What are the best practices for managing threads in this context to avoid delays and improve response times?
Are there alternative approaches or patterns in ZeroMQ that could better handle this scenario?

Thank you for your help!