I have a gRPC service running on multiple instances across different AWS regions. Each instance has its own DNS endpoint. I’m trying to figure out the best way to handle this setup.
Right now I’m thinking about creating one channel that can connect to all these endpoints like this:
ManagedChannelBuilder<?> builder = ManagedChannelBuilder.forTarget("multi-dns://endpoint1.aws.com;endpoint2.aws.com;endpoint3.aws.com");
I want to build a custom NameResolver that resolves each DNS name separately. But I’m running into issues because the service authority is different for each endpoint.
I have three main questions:
- Is using one channel for multiple DNS endpoints a good approach or should I avoid this pattern?
- Can I make a single channel work with multiple DNS endpoints when their service authorities don’t match?
- Would it be better to just create separate channels for each DNS endpoint instead?
Any advice on the best practice here would be really helpful. Thanks!
Interesting challenge! What’s your plan for load balancing between regions? I’m also wondering about failover with your custom NameResolver: does it automatically try the next endpoint, or do you need to build manual switching logic?
The multi-endpoint approach with a custom NameResolver works, but you’ll need to handle authority validation correctly. You can call overrideAuthority() on the channel builder to set one common authority for all endpoints, or add custom logic in your NameResolver to use a different authority per address. Either way, this complicates certificate validation with TLS, because the authority is the name the server certificate is checked against, so a single override only works if every endpoint presents a certificate valid for that name. I’ve found separate channels give you cleaner separation, especially cross-region, where you want different retry policies, timeouts, or circuit breakers per region. The overhead of a few extra channels is small compared to the operational benefit of isolated failure domains.
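To make the separate-channels idea concrete, here is a minimal sketch of a registry that holds one channel per endpoint along with per-region settings. It deliberately uses only the JDK: RegionConfig, RegionalChannels, and the deadline field are hypothetical names made up for illustration, and the channel type is left generic. With gRPC Java you would plug in ManagedChannel as the type and pass a factory along the lines of endpoint -> ManagedChannelBuilder.forAddress(endpoint, 443).build() (the port is an assumption).

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-region settings; the field names are illustrative only. */
final class RegionConfig {
    final String endpoint;
    final Duration deadline;

    RegionConfig(String endpoint, Duration deadline) {
        this.endpoint = endpoint;
        this.deadline = deadline;
    }
}

/** Holds one independently created channel per regional endpoint. */
final class RegionalChannels<C> {
    private final Map<String, C> channels = new LinkedHashMap<>();
    private final Map<String, RegionConfig> configs = new LinkedHashMap<>();

    RegionalChannels(List<RegionConfig> regions, Function<String, C> channelFactory) {
        for (RegionConfig region : regions) {
            // One channel per endpoint, so each keeps its own authority
            // and there is no mismatch to work around.
            channels.put(region.endpoint, channelFactory.apply(region.endpoint));
            configs.put(region.endpoint, region);
        }
    }

    /** Returns the channel for an endpoint, or throws if it was never registered. */
    C channelFor(String endpoint) {
        C channel = channels.get(endpoint);
        if (channel == null) {
            throw new IllegalArgumentException("Unknown endpoint: " + endpoint);
        }
        return channel;
    }

    /** Returns the per-region deadline to apply to calls on that endpoint. */
    Duration deadlineFor(String endpoint) {
        RegionConfig config = configs.get(endpoint);
        if (config == null) {
            throw new IllegalArgumentException("Unknown endpoint: " + endpoint);
        }
        return config.deadline;
    }

    /** Endpoints in the order they were registered. */
    List<String> endpoints() {
        return List.copyOf(channels.keySet());
    }
}
```

Because each endpoint gets its own entry, a region-specific timeout or retry setting lives next to the channel it applies to, and shutting down or replacing one region’s channel doesn’t touch the others.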
honestly, i’d go with separate channels for each endpoint. tried something similar before and the authority mismatches were a nightmare to debug. one channel per region keeps things simple and you get better error handling when one region goes down.
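On the failover question: with one channel per region, the switching logic can stay quite small. Below is a rough sketch, not a definitive implementation. RegionFailover is a hypothetical helper, it is JDK-only, and it is generic over the channel type so it can be tried without gRPC on the classpath. It assumes the RPC is a function that throws a RuntimeException on failure (gRPC Java blocking stubs throw StatusRuntimeException, which is one). A real version would probably fail over only on statuses like UNAVAILABLE rather than on every exception.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Tries each region's channel in order until one call succeeds. */
final class RegionFailover<C> {
    private final Map<String, C> channelsInPriorityOrder;

    /**
     * @param channelsInPriorityOrder region name to channel; pass an ordered map
     *        (e.g. LinkedHashMap), since iteration order is the failover order
     */
    RegionFailover(Map<String, C> channelsInPriorityOrder) {
        this.channelsInPriorityOrder = new LinkedHashMap<>(channelsInPriorityOrder);
    }

    /** Runs the RPC against the first region that does not throw. */
    <R> R call(Function<C, R> rpc) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Map.Entry<String, C> entry : channelsInPriorityOrder.entrySet()) {
            try {
                return rpc.apply(entry.getValue());
            } catch (RuntimeException e) {
                // Remember which region failed, then move on to the next one.
                failures.add(new RuntimeException("Region " + entry.getKey() + " failed", e));
            }
        }
        // Every region failed: surface all the per-region errors together.
        IllegalStateException allFailed = new IllegalStateException("All regions failed");
        failures.forEach(allFailed::addSuppressed);
        throw allFailed;
    }
}
```

Each failure is tagged with its region name, so when one region goes down it’s obvious from the error which channel was involved.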