Integrating your platform with an external service or dependency carries a number of risks. Here are a few things that can go wrong. The external service:
- Doesn’t respond at all. It just blocks indefinitely, eating client-side resources.
- Responds progressively slower – i.e. response time degradation.
- Needs retry logic to deal with transient failures (Note: obviously care needs to be taken if the call isn’t idempotent!)
- Responds with an unexpected return code – e.g. an internal server error or a service unavailable error.
- Gets overwhelmed by the rate of requests being sent to it. Ideally, it should have protection against this, but what if it is not entirely in your team’s control?
- Becomes unavailable, throwing runtime exceptions that force undesirable side-effects on the caller rather than failing fast.
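To make the first risk concrete, here is a minimal Python sketch of bounding how long a caller waits on a dependency. The names `call_with_timeout`, `CallTimedOut`, and `slow_service` are hypothetical and used only for illustration; the sketch assumes the call can be run in a worker thread.

```python
import concurrent.futures
import time


class CallTimedOut(Exception):
    """Raised when the dependency does not answer within the time budget."""


def call_with_timeout(func, timeout_seconds, *args, **kwargs):
    """Run func in a worker thread and stop waiting after timeout_seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        raise CallTimedOut(f"no response within {timeout_seconds}s") from None
    finally:
        # Do not block on a stuck worker; hand control back to the caller.
        executor.shutdown(wait=False)


def slow_service():
    """Hypothetical stand-in for a dependency that is slow to respond."""
    time.sleep(0.5)
    return "late"
```

Note that this only frees the caller: the worker thread keeps running until the underlying call returns, so where the client library offers its own connect and read timeouts, those are usually the better first line of defence.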
Michael Nygard in his book Release It! talks about leveraging circuit breakers to deal with integration risks. Broadening that idea a bit, we could combine circuit breakers and mediation into a more generic Integration Proxy component. This proxy could implement a number of common concerns when working with external APIs:
- Capture response time and route metrics to an analytics agent asynchronously
- Monitor stale connections and automatically reset them if possible
- Host the circuit breaker with associated logic to toggle based on service health
- Provide “fallback” responses if the circuit breaker kicks in to disable the integration point.
- Host sleep / retry invocation logic using parameters like interval and max attempts
- Automatically flush pending / buffered messages when the service is available again.
- Enable request and response capture – especially for debugging production issues.
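A few of these concerns can be sketched together in Python. This is only an illustration under stated assumptions, not a definitive design: the class name `IntegrationProxy`, its parameters (`max_attempts`, `retry_interval`, `failure_threshold`, `reset_timeout`), and the in-memory `response_times` list standing in for an analytics agent are all hypothetical. It also assumes the wrapped call is idempotent, since it is retried.

```python
import time


class IntegrationProxy:
    """Wraps one external call with retry, a circuit breaker, and a fallback."""

    def __init__(self, call, fallback, max_attempts=3, retry_interval=0.0,
                 failure_threshold=2, reset_timeout=30.0, clock=time.monotonic):
        self.call = call                    # the external invocation (assumed idempotent)
        self.fallback = fallback            # produces the "fallback" response
        self.max_attempts = max_attempts
        self.retry_interval = retry_interval
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failed_invocations = 0         # consecutive invocations that exhausted retries
        self.opened_at = None               # None means the circuit is closed
        self.response_times = []            # stand-in for an asynchronous metrics agent

    def _circuit_is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_timeout:
            return False                    # half-open: let one trial invocation through
        return True

    def invoke(self, *args, **kwargs):
        if self._circuit_is_open():
            return self.fallback(*args, **kwargs)   # fail fast, service is not called
        for attempt in range(1, self.max_attempts + 1):
            started = self.clock()
            try:
                result = self.call(*args, **kwargs)
            except Exception:
                if attempt < self.max_attempts:
                    time.sleep(self.retry_interval)
                continue
            finally:
                # Record the duration of every attempt, successful or not.
                self.response_times.append(self.clock() - started)
            # Success: the service looks healthy, so close the circuit.
            self.failed_invocations = 0
            self.opened_at = None
            return result
        # All attempts failed: count it and open the circuit at the threshold.
        self.failed_invocations += 1
        if self.failed_invocations >= self.failure_threshold:
            self.opened_at = self.clock()
        return self.fallback(*args, **kwargs)
```

A caller would construct the proxy once per integration point, for example `IntegrationProxy(fetch_quote, lambda: cached_quote)` with hypothetical `fetch_quote` and `cached_quote`, and route every request through `invoke`. Stale-connection resets, message buffering, and request capture are left out to keep the sketch short.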
