


How to implement a retry strategy from serverB to serverC using Spring WebFlux when building LLM gateway?
Apr 19, 2025, 04:30 PM
Retry mechanism for building an LLM gateway using Spring WebFlux
When building an LLM gateway, you need to handle communication between services and make sure that, when one service is unavailable, the gateway can switch to a backup service seamlessly. This article explores how to do that with Spring WebFlux: specifically, how the gateway can retry and then connect to Server C when its communication with Server B fails.
Scenario description
Our LLM gateway call chain is: Client -> Gateway -> Server B. If the gateway's connection to Server B fails, we want the gateway to retry and then connect to Server C. This requires the gateway to detect the error status codes returned by Server B and switch to Server C automatically on failure.
Code analysis and improvement solutions
Let's first look at the original sseHttp method, which handles gateway requests to Server B or Server C:

Flux<Response> responseFlux = webClient.create(url)
    .post()
    .headers(httpHeaders -> setHeaders(httpHeaders, headers))
    .contentType(MediaType.APPLICATION_JSON)
    .bodyValue(jsonBody)
    .retrieve()
    .onStatus(status -> status != HttpStatus.OK, response -> {
        // Error handling logic
    })
    // ...Other logic...
To implement the retry strategy, we need to detect the error status returned by Server B and switch to Server C when an error occurs. Previous attempts had two problems: a plain try-catch cannot catch errors raised inside a Flux, because they are delivered as error signals after the surrounding method has already returned; and subscribe is non-blocking, so error-handling code placed after it runs before the error has actually occurred.
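The try-catch problem is not specific to Flux. The following sketch uses the JDK's CompletableFuture as a stand-in (an assumption, chosen so the example runs without Reactor on the classpath); callServerB and the fallback string are hypothetical. It shows that the failure never reaches the catch block and has to be handled by an operator in the pipeline, which is the role onErrorResume plays for a Flux.

```java
import java.util.concurrent.CompletableFuture;

class AsyncErrorDemo {

    // Hypothetical stand-in for a call to Server B that always fails.
    static String callServerB() {
        throw new IllegalStateException("Server B returned 503");
    }

    static String callWithTryCatch() {
        try {
            // supplyAsync runs callServerB on another thread; the exception is
            // captured inside the future instead of being thrown here.
            CompletableFuture<String> pending =
                    CompletableFuture.supplyAsync(AsyncErrorDemo::callServerB);

            // The error is handled by a pipeline operator, not by try-catch.
            return pending
                    .exceptionally(ex -> "fallback response from Server C")
                    .join();
        } catch (IllegalStateException e) {
            // Never reached: nothing in the try block throws synchronously.
            return "caught by try-catch";
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithTryCatch());
    }
}
```

Running main prints the fallback string, not the text from the catch block.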
Best practice: use retryWhen and onErrorResume
To solve these problems, use the retryWhen and onErrorResume operators provided by Project Reactor, the reactive library that Spring WebFlux is built on.
First, modify the sseHttp method and add retry logic:

Flux<Response> sseHttp(String url) {
    return webClient.create(url)
        .post()
        .headers(httpHeaders -> setHeaders(httpHeaders, headers))
        .contentType(MediaType.APPLICATION_JSON)
        .bodyValue(jsonBody)
        .retrieve()
        // On Spring Framework 6, use HttpStatusCode::isError instead.
        .onStatus(HttpStatus::isError, clientResponse -> {
            // Log the error status to facilitate debugging.
            log.warn("Server returned error status: {}", clientResponse.statusCode());
            // Build the WebClientResponseException without blocking.
            return clientResponse.createException();
        })
        .bodyToFlux(typeRef)
        .retryWhen(Retry.backoff(3, Duration.ofSeconds(1))
            .filter(throwable -> throwable instanceof WebClientResponseException)
            .onRetryExhaustedThrow((spec, signal) ->
                new GatewayException("Request to " + url + " still failing after retries.")));
}

This code uses onStatus to turn HTTP error statuses into a WebClientResponseException and retryWhen to retry the request up to 3 times. Retry.backoff applies exponential backoff with jitter, so the 1-second value is the minimum delay before the first retry, not a fixed interval. The response body is never read with block() inside the handler, because blocking calls are not allowed on reactive threads. filter ensures that only WebClientResponseException is retried; connection-level failures, which WebClient reports as WebClientRequestException, are not retried by this filter and are passed straight to the caller. If the retries are exhausted, GatewayException is thrown, so the caller has to handle GatewayException as well as WebClientResponseException.
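Because Retry.backoff is exponential, the waits between attempts grow and do not stay at one second. The sketch below computes the nominal delays for a given configuration, assuming the delay doubles on each retry and ignoring the random jitter that Reactor adds on top; BackoffSchedule and nominalDelays are hypothetical names used only for illustration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class BackoffSchedule {

    /**
     * Nominal delay before each retry: minBackoff * 2^i for i = 0..maxAttempts-1.
     * Jitter and any maximum backoff cap are deliberately left out.
     */
    static List<Duration> nominalDelays(int maxAttempts, Duration minBackoff) {
        List<Duration> delays = new ArrayList<>();
        for (int i = 0; i < maxAttempts; i++) {
            delays.add(minBackoff.multipliedBy(1L << i));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Same parameters as Retry.backoff(3, Duration.ofSeconds(1)).
        System.out.println(nominalDelays(3, Duration.ofSeconds(1)));
        // prints [PT1S, PT2S, PT4S]
    }
}
```

Under these assumptions a request that keeps failing spends roughly seven seconds waiting on one server before the retries are exhausted, which is worth weighing against the client's own timeout.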
Then, where sseHttp is called, use onErrorResume to handle the failure of Server B and switch to Server C:

Mono<Response> responseMono = sseHttp(serverBUrl)
    .onErrorResume(ex -> {
        // Server B failed or its retries were exhausted; fall back to Server C.
        log.warn("Failed to connect to Server B: {}", ex.getMessage());
        return sseHttp(serverCUrl);
    })
    .next();

This code first tries Server B. If that stream terminates with an error, including the GatewayException raised once the retries are exhausted, it subscribes to Server C instead. A handler restricted to WebClientResponseException.class would miss that case. The next() method converts the Flux into a Mono that carries only the first element; to forward a complete SSE stream, omit it and keep the Flux.
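It can help to see how many calls each server receives under this policy. The following is a blocking, dependency-free model of the same control flow, not the reactive implementation: FailoverModel, withRetries and primaryThenFallback are hypothetical names, and backoff delays are omitted. It assumes, as in sseHttp, that the retry logic applies to whichever URL is being called, so Server C is also retried.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class FailoverModel {

    /** Calls the url once plus up to maxRetries retries, rethrowing the last failure. */
    static String withRetries(String url, int maxRetries, Function<String, String> call) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.apply(url);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /** Tries the primary with retries, then the fallback with the same retry budget. */
    static String primaryThenFallback(String primary, String fallback,
                                      int maxRetries, Function<String, String> call) {
        try {
            return withRetries(primary, maxRetries, call);
        } catch (RuntimeException e) {
            return withRetries(fallback, maxRetries, call);
        }
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        String result = primaryThenFallback("B", "C", 3, url -> {
            calls.add(url);
            if (url.equals("B")) {
                throw new IllegalStateException("503 from " + url);
            }
            return "ok from " + url;
        });
        System.out.println(result + " after calls " + calls);
    }
}
```

With three retries, Server B is called four times before the first call to Server C is made.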
Handle multiple successful responses
If both Server B and Server C return data, for example when Server B emits a few events before failing, we need to make sure that only one response is processed. An AtomicBoolean can track whether a response has already been processed. Create the flag per request; a flag stored in a shared field stays true after the first request and suppresses every later one.

AtomicBoolean success = new AtomicBoolean(false);

Flux<Response> sseHttp(String url) {
    // ... (previous code) ...
    .doOnNext(response -> {
        if (success.compareAndSet(false, true)) {
            // Process the successful response
        }
    })
    // ... (rest of the code) ...
}
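The compareAndSet call is what makes the check safe: only the first caller flips the flag from false to true, and every later caller sees the update fail. A minimal sketch of that first-wins behavior follows, with a hypothetical FirstResponseGate class that is meant to be created once per request.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class FirstResponseGate {

    private final AtomicBoolean handled = new AtomicBoolean(false);
    private volatile String winner; // set only by the caller that wins the flag

    /** Returns true only for the first response offered to this gate. */
    boolean offer(String response) {
        if (handled.compareAndSet(false, true)) {
            winner = response;
            return true;
        }
        return false; // a response was already processed for this request
    }

    String winner() {
        return winner;
    }

    public static void main(String[] args) {
        FirstResponseGate gate = new FirstResponseGate(); // one gate per request
        System.out.println(gate.offer("from Server B")); // true
        System.out.println(gate.offer("from Server C")); // false
        System.out.println(gate.winner());               // from Server B
    }
}
```

A fresh gate for the next request starts with the flag cleared, which is the reason for not keeping the flag in a shared field.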
With these changes, the gateway has a more robust retry mechanism: it retries Server B, falls back to Server C when Server B keeps failing, and so handles communication failures between services in a way that improves the availability of the LLM gateway. Remember to add sufficient logging to facilitate troubleshooting.