Bilgin Ibryam
Oct 30 2023
Choreography, or Orchestration, That is the Wrong Question
Intro
In today's tech landscape, microservices, functions, and serverless have become the benchmark for modern cloud architectures. As developers and tech leads navigate this space, one debate seems to persistently echo through blog posts and conferences: choreography or orchestration, which one is better for distributed application coordination? Dive into any tech discussion, and you'll find articles championing one over the other, each armed with a list of pros, cons, and best-fit scenarios. However, a closer look reveals that many of these discussions are influenced by the tools and platforms the authors are familiar with. The age-old "if you have a hammer, everything looks like a nail" argument is evident. Messaging vendors will passionately advocate for the decentralized, event-driven nature of choreography, painting it as the future. On the other hand, orchestration engine vendors will emphasize the clarity, control, and manageability that orchestration brings. It doesn't stop there. Service mesh advocates will highlight the benefits of real-time request/response driven interactions and API management, while stream processing fans will extol the virtues of event-driven architecture and schema registries.
Choreography and orchestration approaches
Beyond the theoretical debates, there's also an underlying issue that often goes unaddressed: each methodology comes with its own software stack, programming paradigm, and set of operational hurdles. This fragmentation not only complicates the development process but also hinders operations teams from gaining a holistic view of the system. As a result, inefficiencies arise, escalating as the project scales. The question then becomes: How did we land in this predicament, and is there a pragmatic solution? Let’s explore this problem space.
Microservice Interaction Styles
The transition from monolithic architectures to microservices shifted service communication from the in-process calls typical of monolithic systems to inter-process communication between microservices. This change brought new challenges and considerations around performance, coupling, and error handling, to name a few, which in turn led to the emergence of multiple service interaction styles:
- Request-response (Synchronous): A microservice sends a request to another and waits for a response.
- Event-driven (Asynchronous): Microservices emit events for other services to consume. The emitter remains unaware of which services, if any, consume its events.
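The difference between the two styles can be sketched in a few lines of Python. This is a minimal in-process illustration, not a production design: `inventory_service`, `place_order_sync`, and `EventBus` are hypothetical names, and the in-memory bus merely stands in for a real message broker.

```python
from collections import defaultdict


def inventory_service(request):
    """Hypothetical downstream service that answers a stock query."""
    return {"sku": request["sku"], "in_stock": True}


def place_order_sync(sku):
    """Request-response: the caller names its target and waits for the reply."""
    return inventory_service({"sku": sku})


class EventBus:
    """Event-driven: an in-memory stand-in for a broker (assumption)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The emitter hands the event to the bus and never learns who consumed it.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)  # returned only so the sketch is easy to inspect
```

In the synchronous call the caller is coupled to `inventory_service` and its availability; with the bus, publishing to a topic nobody subscribes to is still a valid operation.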
For basic interactions, these patterns are adequate, but when it comes to complex scenarios, such as business transactions that span multiple services, a more sophisticated mechanism is essential. The Saga Pattern, a sequence of local transactions where each transaction updates the database and publishes a message or event to trigger the next step, addresses this cross-service coordination. It enables an application to maintain data consistency across multiple services without using distributed transactions. If a local transaction fails due to a business rule violation, the saga executes compensating transactions to undo the changes made by the preceding transactions, ensuring system-wide consistency and coordination. This pattern can be realized through two primary approaches: choreography and orchestration.
- Choreography: In this approach, each local transaction publishes domain events that trigger local transactions in other services. This model is particularly effective for reducing dependencies and allowing for more flexible and decoupled architectures.
- Orchestration: Here, an orchestrator tells the participants which local transactions to execute. This model offers a more controlled environment, suited to cases where the sequence of service interactions, state maintenance, and error handling are crucial.
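To make the compensation idea concrete, here is a minimal sketch of the orchestration variant, assuming every step is an in-process function paired with an undo function. The `run_saga` helper and the order-related step names are hypothetical, and a real orchestrator would also persist its progress so it can resume after a crash.

```python
def run_saga(steps, state):
    """Run (name, action, compensate) steps in order.

    If a step raises, undo the already completed steps in reverse order.
    Returns a log of what happened.
    """
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action(state)
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo(state)
                log.append(f"compensated:{done_name}")
            break
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return log


# Hypothetical local transactions for an order flow.
def reserve_inventory(state):
    state["stock"] -= 1


def release_inventory(state):
    state["stock"] += 1


def charge_payment(state):
    if state["balance"] < state["price"]:
        raise ValueError("insufficient funds")  # business rule violation
    state["balance"] -= state["price"]


def refund_payment(state):
    state["balance"] += state["price"]


ORDER_SAGA = [
    ("reserve_inventory", reserve_inventory, release_inventory),
    ("charge_payment", charge_payment, refund_payment),
]
```

Running `run_saga(ORDER_SAGA, {"stock": 1, "balance": 5, "price": 10})` reserves the item, fails on the payment rule, and then releases the reservation, leaving the state as it started.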
Beyond the context of sagas, both choreography and orchestration play pivotal roles in microservices interactions. Choreography, operating without centralized control, is particularly effective for communication between bounded contexts. In this approach, an event from one service can trigger actions in another, with the emitting service remaining agnostic to its event consumers. Conversely, orchestration provides a structured approach within a bounded context, ensuring a controlled sequence of service interactions focused on state maintenance and error handling. As developers navigate this landscape, they'll often find themselves leveraging both synchronous and asynchronous interactions, using events and commands. Recognizing the appropriate model, be it choreography or orchestration, based on specific requirements and context becomes crucial in crafting scalable and extensible microservice architectures.
Challenges in Melding Varied Technologies
While navigating these patterns for coordination is one challenge, the real complexity arises when choosing and melding the diverse, often standalone technologies needed to implement these patterns. A multitude of Inter-Process Communication (IPC) technologies cater to microservices. For synchronous request/response communication, HTTP-based REST or gRPC are popular choices. Conversely, asynchronous, message-driven communication frequently leverages protocols such as AMQP or brokers such as Apache Kafka and RabbitMQ.
For higher-level microservices coordination patterns, several open-source projects, including Cadence, Netflix Conductor, and Temporal, have emerged. These offer orchestration solutions for persistent service interactions. However, employing an orchestration engine often introduces a bespoke programming model. This might involve a higher-level DSL (or standards like BPMN) to delineate interaction sequences, bespoke error handling and compensation mechanisms, and an operational model, all of which may or may not seamlessly mesh with your organizational landscape.
Merging multiple diverse technologies to enable microservices communication is a formidable task. Synchronous interactions often necessitate a comprehensive stack, including service discovery, network resiliency, observability, and security. Popular choices include service meshes operating at the network layer, NATs, or Consul, or language-specific libraries such as Hystrix and Resilience4j for Java and Polly for .NET. Asynchronous interactions, on the other hand, typically require a broker, like Apache Kafka or RabbitMQ. This is often paired with a schema registry, connectors, a REST proxy, Change Data Capture solutions such as Debezium, and additional operational and governance tools. All of these need to be joined together with observability tools that instrument your code, with choices like OpenTelemetry, Jaeger, Prometheus, Log4j, and similar.
Once these technologies are incorporated into an organization's landscape, they require ongoing management, governance, lifecycle maintenance by operations teams, and the development of best practices, often called "golden paths," by the platform teams. But this adoption is just the starting point. The real challenge falls to developers, who must integrate these technologies, each with its own APIs, observability, configurations, and error handling semantics, into their preferred application runtimes. It's not rare for a single service to perform synchronous calls while also raising asynchronous events. Occasionally, a service might need to coordinate interactions that mix synchronous calls with short retries, circuit breakers, saga-based coordination with persistent rollbacks, and asynchronous event notifications during key phases of business processes. Such coordination is further complicated when the configuration and programming logic span disparate software stacks and network layers, with responsibilities split among siloed developer and platform teams.
For example, a service mesh is popular among operations teams for handling mTLS and observability. But when this tool configures retries, circuit breakers, and timeouts, it can inadvertently introduce adverse side effects if not aligned with the application's design. Specifically, retries need to be coordinated with application features like idempotency. Timeouts should be set with the end-to-end processing time in mind. Circuit breakers, on the other hand, should be aligned with the appropriate bulkheading boundary. Without such coordination with the application developers, the overall system can behave unpredictably, leading to potential new failure modes. Diagnosing issues across such systems, which operate independently and aren't inherently designed to work in tandem, usually leads to troubleshooting nightmares.
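The coupling between retries, idempotency, and timeouts can be illustrated with a small sketch. It assumes a hypothetical `PaymentService` that deduplicates requests by an idempotency key and a hypothetical `call_with_retries` helper that stops at an end-to-end deadline; neither reflects the API of any particular service mesh or resiliency library.

```python
import time


class PaymentService:
    """Hypothetical service that applies a charge, then may lose the reply."""

    def __init__(self, replies_to_drop=1):
        self.processed = {}  # idempotency key -> stored result
        self.charges = 0     # how many times money actually moved
        self._replies_to_drop = replies_to_drop

    def charge(self, idempotency_key, amount):
        # Apply the side effect only the first time a key is seen.
        if idempotency_key not in self.processed:
            self.charges += 1
            self.processed[idempotency_key] = {"charged": amount}
        if self._replies_to_drop > 0:
            self._replies_to_drop -= 1
            raise TimeoutError("reply lost after the charge was applied")
        return self.processed[idempotency_key]


def call_with_retries(operation, max_attempts, deadline_seconds, backoff_seconds=0.0):
    """Retry on timeouts, but never beyond the end-to-end time budget."""
    deadline = time.monotonic() + deadline_seconds
    last_error = None
    for _ in range(max_attempts):
        if time.monotonic() >= deadline:
            break
        try:
            return operation()
        except TimeoutError as exc:
            last_error = exc
            time.sleep(backoff_seconds)
    raise TimeoutError("gave up within the end-to-end budget") from last_error
```

With the idempotency key in place, the retried call returns the stored result and `charges` stays at 1. If the infrastructure layer retried the same request against a handler without that check, the customer would be charged twice.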
The combinatorial complexity of navigating this technology fragmentation quickly becomes palpable. The myriad of libraries embedded within the application often leads to frequent security patches that necessitate code changes, retesting, and redeployment, adding overhead to the development cycle. These examples demonstrate the pressing need for solutions that can unify and harmonize the complexities arising from technology fragmentation and ensure a cohesive microservices stack.
Dapr: A Unified Programming Model for Distributed Applications
Navigating the vast landscape of microservices, developers often face the challenge of melding a myriad of technologies and programming models. As Kubernetes once transformed the compute landscape by offering a unified API for workload orchestration and infrastructure abstraction, Dapr delivers unified APIs for service communication and infrastructure consumption. Dapr's APIs, based on the sidecar architecture, encapsulate multiple service interaction styles, ensuring they coexist harmoniously and adapt to evolving business needs:
- Service Invocation: Used for synchronous service-to-service communication, this API provides service discovery for request/response patterns. It handles cross-cutting concerns such as resiliency, observability and tracing, as well as security features including network encryption and access control.
- Publish-Subscribe: Crafted for event-driven interactions and choreography, this API empowers services to publish events and subscribe to topics, abstracted from the underlying broker implementation details. It ensures loose coupling while adding fault tolerance features such as dead letter queues, at-least-once delivery guarantees, retries, circuit breakers, message TTLs, and access policies.
- Workflow: This API facilitates the definition and management of workflows in a distributed environment, ensuring harmonious coordination of multiple services through workflow patterns such as task chaining, branching, fan-out/fan-in, and async calls to other services.
- State Store with Transactional Outbox: This API offers state management capabilities and emits events upon data mutations, ensuring data consistency and event-driven responsiveness. It incorporates concurrency control, data isolation, fault tolerance, and other cross-cutting concerns such as access control.
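As a rough illustration of how three of these APIs look from application code, the sketch below builds plain HTTP requests against a local Dapr sidecar. It assumes the sidecar listens on its default HTTP port 3500; the app ID, pub/sub component, topic, and state store names in the usage note are hypothetical and would come from your own component configuration. The requests are only constructed here, so the snippet runs without a sidecar; the `send` helper would dispatch them.

```python
import json
import urllib.request

# Assumption: a Dapr sidecar is reachable on the default HTTP port.
DAPR_HTTP = "http://localhost:3500"


def _post(url, payload):
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(app_id, method, payload):
    """Service invocation: call a method on another app through the sidecar."""
    return _post(f"{DAPR_HTTP}/v1.0/invoke/{app_id}/method/{method}", payload)


def publish(pubsub_name, topic, event):
    """Publish-subscribe: publish an event to a topic on a pub/sub component."""
    return _post(f"{DAPR_HTTP}/v1.0/publish/{pubsub_name}/{topic}", event)


def save_state(store_name, key, value):
    """State management: save one key/value pair in a state store."""
    return _post(f"{DAPR_HTTP}/v1.0/state/{store_name}", [{"key": key, "value": value}])


def send(request, timeout_seconds=5.0):
    """Dispatch a prepared request; requires a running sidecar."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```

For example, `send(publish("orders-pubsub", "order.placed", {"id": 1}))` would publish an event, and `send(invoke("inventory", "reserve", {"sku": "abc"}))` would make a synchronous call. The application code only ever talks to localhost, which is what lets the broker, state store, or target service change without touching it.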
Dapr's strength isn't just in its individual APIs but in the ability to combine them effectively in a single stack and programming model. It provides developers, regardless of their language and framework, the tools to build distributed applications using HTTP or gRPC, while consistently addressing cross-cutting concerns. Furthermore, Dapr's workflow engine is designed to work in tandem with its suite of APIs, simplifying the development process and also ensuring that applications are maintainable, extendable, and resilient to changing business dynamics.
An illustration showcasing Dapr's dual strengths: unified APIs and architectural flexibility.
Dapr stands out not only for its extensive range of developer APIs but also for its flexibility and adaptability. While many comprehensive frameworks and cloud platforms impose restrictions and dictate the language choice, runtime, or cloud ecosystem, Dapr's sidecar architecture and polyglot nature break free from these constraints. Whether you're modernizing existing systems or starting afresh, Dapr's design allows for incremental adoption of its APIs or a full-fledged integration for greenfield applications. It doesn't prescribe a specific runtime framework, application structure, deployment strategy, or execution environment. Through its binding API, Dapr facilitates connections to a myriad of cloud or on-premises backing services, while its other APIs abstract away intricate infrastructure configurations and client libraries.
Summary
In the realm of microservices communication, developers will often find themselves leveraging both synchronous and asynchronous interactions as the key ingredients of choreography and orchestration. The optimal approach involves a blend of interaction styles, tailored to specific business requirements and honoring service boundaries. While numerous platforms promise developer productivity, they usually impose constraints, either nudging developers towards specific interaction styles or burdening them with the daunting task of meshing multiple frameworks.
Unlike its counterparts, Dapr offers a harmonious blend of interaction styles coupled with architectural adaptability. It not only simplifies application interactions, catering to both synchronous and asynchronous patterns, but also enables advanced choreography and orchestration patterns. The polyglot nature of Dapr allows developers to express their creativity using their favored programming language, while its non-intrusive sidecar architecture enables optimal runtime selection. Furthermore, Dapr consistently addresses non-functional requirements, such as security, resiliency, and observability, across all its APIs. Its ability to abstract underlying services ensures that applications remain portable, avoiding vendor lock-in. This ensures that microservices patterns and the expertise of development teams are transferable across projects.
Interested in diving deeper into Dapr? Join the Dapr Discord server to connect with a vibrant community of developers, and connect with us to find out how Diagrid Conductor helps organizations operate Dapr in production with confidence.