Query retry logic is not triggered in certain cases #326

dkropachev · 2024-10-31T23:18:08Z

Query executor does not involve retry policy in certain cases

Lines 109 to 198 in fc1b783

    
    func (q *queryExecutor) do(ctx context.Context, qry ExecutableQuery, hostIter NextHost) *Iter {
    	selectedHost := hostIter()
    	rt := qry.retryPolicy()
    	lwt_rt, use_lwt_rt := rt.(LWTRetryPolicy)
    	// We only want to apply LWT policy to LWT queries
    	use_lwt_rt = use_lwt_rt && qry.IsLWT()

    	var lastErr error
    	var iter *Iter
    	for selectedHost != nil {
    		host := selectedHost.Info()
    		if host == nil || !host.IsUp() {
    			selectedHost = hostIter()
    			continue
    		}

    		pool, ok := q.pool.getPool(host)
    		if !ok {
    			selectedHost = hostIter()
    			continue
    		}

    		conn := pool.Pick(selectedHost.Token(), qry)
    		if conn == nil {
    			selectedHost = hostIter()
    			continue
    		}

    		iter = q.attemptQuery(ctx, qry, conn)
    		iter.host = selectedHost.Info()
    		// Update host
    		switch iter.err {
    		case context.Canceled, context.DeadlineExceeded, ErrNotFound:
    			// those errors represents logical errors, they should not count
    			// toward removing a node from the pool
    			selectedHost.Mark(nil)
    			return iter
    		default:
    			selectedHost.Mark(iter.err)
    		}

    		// Exit if the query was successful
    		// or no retry policy defined
    		if iter.err == nil || rt == nil {
    			return iter
    		}

    		// or retry policy decides to not retry anymore
    		if use_lwt_rt {
    			if !lwt_rt.AttemptLWT(qry) {
    				return iter
    			}
    		} else {
    			if !rt.Attempt(qry) {
    				return iter
    			}
    		}

    		lastErr = iter.err

    		var retry_type RetryType
    		if use_lwt_rt {
    			retry_type = lwt_rt.GetRetryTypeLWT(iter.err)
    		} else {
    			retry_type = rt.GetRetryType(iter.err)
    		}

    		// If query is unsuccessful, check the error with RetryPolicy to retry
    		switch retry_type {
    		case Retry:
    			// retry on the same host
    			continue
    		case Rethrow, Ignore:
    			return iter
    		case RetryNextHost:
    			// retry on the next host
    			selectedHost = hostIter()
    			continue
    		default:
    			// Undefined? Return nil and error, this will panic in the requester
    			return &Iter{err: ErrUnknownRetryType}
    		}
    	}

    	if lastErr != nil {
    		return &Iter{err: lastErr}
    	}

    	return &Iter{err: ErrNoConnections}
    }

These cases are:

- Host is down
- Host does not have a pool
- Host pool does not have connections

It probably makes sense to involve the retry policy in these cases.
This will require policies to be smarter about what GetRetryType returns in the listed cases.
It will also break the API slightly, because we will have to introduce new errors to be returned instead of ErrNoConnections.

sylwiaszunejko · 2024-12-16T11:56:13Z

@dkropachev Currently, in the cases you mentioned, we don't use the retry policy, but we always retry on the next host. This is true even if no retry policy is defined. I am not sure what the correct behavior should be if one of these cases occurs:

- Host is down
- Host does not have a pool
- Host pool does not have connections

and rt == nil. Should we stick to the current behavior and retry on the next host, or should we just exit?

sylwiaszunejko · 2024-12-17T14:25:17Z

@dkropachev ping

pdbossman · 2024-12-17T14:30:41Z

I would think that stopping the retry on the next host would be a regression. It seems to me that the request was not really tried.

sylwiaszunejko · 2024-12-17T14:41:16Z

> I would think that stopping the retry on the next host would be a regression. It seems to me that the request was not really tried.

OK, so I guess that if there is no retry policy, we should retry on the next host in those cases. I also think the retry policies should be edited to make sure they always result in a retry on the next host for the given scenarios.

dkropachev mentioned this issue Oct 31, 2024

Queries failing due to no hosts available in the pool. #325

Closed

dkropachev mentioned this issue Nov 20, 2024

Idempotent flag ignored for retries #331

Closed

sylwiaszunejko self-assigned this Dec 16, 2024

sylwiaszunejko mentioned this issue Dec 18, 2024

Let retry policy to decide about non idempotent queries #376

Merged

dkropachev closed this as completed in #376 Dec 31, 2024
