Microservices Architecture - Terminology and Usage of Technologies
Hi, I am Malathi Boggavarapu. I work at Volvo Group and live in Gothenburg, Sweden. I have been working with Java for several years and have broad experience across various technologies.
In this post I will discuss in depth how the microservice architecture actually works. In my previous post on microservices, I mainly discussed the difference between the monolithic and microservice architectures and the pros and cons of each. This post will make you aware of the terminology used in a microservice architecture and, most importantly, the way it works. Remember, nothing here covers how to develop or code a microservice-based application; instead, we discuss the concepts behind the architecture.
As discussed in my previous post about microservices, I use the same example microservices, User, Product and Order, throughout this post. Please visit my previous post on microservices for more information.
So let's get started.
Microservices have an impact on the way you organize your teams and your code. In the design phase, which we discussed in my earlier post, the User sub-domain has fewer business functionalities than the Order sub-domain. This means you can right-size each team: a smaller team for User, a bigger team for Product and a slightly bigger team for the Order sub-domain, which covers quite a lot of business functionality. Each team is now totally independent and, most importantly, solely responsible for the entire lifecycle of its product, from development to deployment. That is when you need to embrace Agile and DevOps methodologies, so that people with different skills learn how to work together on the same product. Even though the teams are independent, remember that microservices will end up communicating with each other. This means you still have to manage and orchestrate the integration between teams at certain points in time.
Since each team is independent and separated, each can keep its code and documentation in a separate repository, using version control software such as Git, Subversion or Mercurial. Remember that each service is independent, therefore versioning is important. For example, the Purchase Order team might be developing version 1.0 of its microservice while the Product team is testing version 4.0 of theirs.
Data storage
In the monolithic approach, the entire data of the application is stored in a single database. In a microservice architecture, every service still needs to persist data in some database, but services must be loosely coupled so that they can be developed, deployed and scaled independently. Therefore each needs its own database. In fact this makes sense because different services may have different data storage requirements: one relies heavily on transactions, another is write-mostly, and another may be read-only. For some services a relational database may be a good choice, for example the Product microservice. The Order microservice might need a NoSQL database, such as a document database, which is good for storing semi-structured data. The User microservice could use a graph database or even an LDAP directory. Using a database per service ensures that our services stay loosely coupled: changes to the database of one service do not impact any other service.

Having several databases brings another challenge: data synchronization. If the customer changes an email address in the User sub-domain, the change has to be replicated to the Order sub-domain; the same applies to Product. In the microservices world there are no distributed transactions, which means that when we update the User sub-domain we cannot span a long-running two-phase-commit distributed transaction to the Order sub-domain. That would have a performance impact on our application. Without such distributed transactions we lose strong consistency of the data, so we move to an eventual consistency model instead. This means a service publishes an event when its data changes, and the other services consume the event and update their own data. This is related to the change data capture and event sourcing patterns. There are several ways of publishing events, for example with Akka, Kafka or RabbitMQ, and a tool like Debezium implements change data capture by streaming database changes as events.
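To make the eventual consistency idea concrete, here is a minimal sketch of the User service publishing a change event, assuming Kafka is used as the broker; the topic name, broker address and event payload are purely illustrative:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // broker address (assumption)
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Hypothetical event: the User service announces an e-mail change.
            String event = "{\"type\":\"UserEmailChanged\",\"userId\":\"42\","
                         + "\"newEmail\":\"jane@example.com\"}";
            producer.send(new ProducerRecord<>("user-events", "42", event));
        }
        // The Order service subscribes to "user-events" and updates its own copy of the
        // customer e-mail, giving eventual consistency without a two-phase commit.
    }
}
```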
User interface
Not all microservices have a user interface, but when they do, there are several techniques for aggregating them. One benefit of microservices is that each team develops in a relatively isolated, independent manner. They can develop and maintain their own set of graphical components, aggregate them, work with designers and deliver the best user experience for their use cases. But one challenge quickly arises: how do we implement a single UI that displays data from multiple microservices? For example, our purchase order details page displays not only the name and the price of the product but also its availability, so we need to call two microservices. Most importantly, we want our users to feel that they are interacting with a single application. The idea is to do UI composition, and there are two design patterns we can use. One is server-side page composition, which builds web pages on the server by composing HTML fragments developed by the different microservice teams. The other is client-side composition, where the browser builds a single user interface by composing UI fragments. This usually means we need a UI team responsible for implementing the application skeleton that aggregates the microservice UI components.
Services
Once the sub-domains are isolated, each is packaged, deployed and executed independently of the others. But most of the microservices will end up communicating with each other. When they do, they need to expose and consume APIs, choose a communication protocol and also a communication style. Let's focus on the two main communication families: RPI (Remote Procedure Invocation) and messaging.

Remote Procedure Invocation, also called RPC, is the simplest and most familiar inter-process communication style. It works on a request/response principle: a service requests something from another service, and that service replies. The call can be synchronous or asynchronous. There are numerous examples of RPI technologies, such as REST, SOAP or gRPC.
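As a small illustration of synchronous RPI over REST, here is a sketch using the JDK's built-in HTTP client; the Product service URL and resource path are assumptions made for the example:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ProductClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Hypothetical REST endpoint exposed by the Product microservice.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://product-service:8080/products/42"))
                .header("Accept", "application/json")
                .GET()
                .build();

        // Synchronous request/response: the caller blocks until the reply arrives.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```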
The other major communication style is messaging. Here microservices exchange messages or events through a broker or channel. When a microservice wants to interact with another, it publishes a message to the broker. The other microservice subscribes to the broker if it is interested in such messages and receives them at a later stage. After receiving messages from the broker, the microservice can then update its own state. Asynchronous messages play a significant role in keeping things loosely coupled in a microservice architecture. They also improve availability, since the message broker buffers the messages until the consumer is able to process them. Well-known message brokers include Apache Kafka and RabbitMQ.
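On the consuming side, a sketch of the Order service subscribing to the broker and applying the events might look like the following, again assuming Kafka and an illustrative topic name:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class UserEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // broker address (assumption)
        props.put("group.id", "order-service");             // each consuming service uses its own group
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("user-events"));      // topic name is illustrative
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Update the Order service's own copy of the user data.
                    System.out.printf("Applying event %s for user %s%n",
                                      record.value(), record.key());
                }
            }
        }
    }
}
```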
Whether we use messaging or RPI, we need to define the format of the data that the microservices exchange. Depending on the needs of your architecture you can use text or binary. Text messages can take several formats; the most used formats today are XML, JSON and YAML. Their main advantage is that they are human readable and easy to implement and debug. You can also exchange binary messages. Binary protocols, gRPC for example, are more compact but more difficult to inspect and work with by hand.
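For the text case, a sketch of serializing and deserializing a JSON message might look like this, assuming the Jackson library; the Product structure is illustrative:

```java
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonExample {

    // A simple data structure exchanged between services (illustrative).
    public static class Product {
        public String id;
        public String name;
        public double price;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        Product product = new Product();
        product.id = "42";
        product.name = "Keyboard";
        product.price = 29.90;

        // Serialize to a human-readable text message.
        String json = mapper.writeValueAsString(product);
        System.out.println(json);   // {"id":"42","name":"Keyboard","price":29.9}

        // Deserialize the text message back into an object on the consumer side.
        Product received = mapper.readValue(json, Product.class);
        System.out.println(received.name);
    }
}
```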
APIs and Contracts
We know that microservices communicate via RPC or messaging to exchange text or binary messages. But how does each team know how to invoke an external microservice? We do this through APIs and contracts. An API (Application Programming Interface) is the set of routines, data structures and protocols exposed by a microservice. Let's say our Order microservice exposes a new API: one operation to create a new purchase order and a second one to retrieve an existing order. For other services to know what is exposed and how to invoke it, the Order microservice publishes a contract. This way the User microservice gets the contract, reads it and discovers how to invoke the API to create a purchase order.
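To give an idea of what such a contract could look like in code, here is a hypothetical sketch of the Order API expressed with JAX-RS annotations; the paths and the OrderDto structure are assumptions made for the example:

```java
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

// Illustrative contract for the two operations mentioned above.
@Path("/orders")
public interface OrderApi {

    /** Creates a new purchase order and returns its representation. */
    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    OrderDto create(OrderDto newOrder);

    /** Retrieves an existing purchase order by its identifier. */
    @GET
    @Path("/{orderId}")
    @Produces(MediaType.APPLICATION_JSON)
    OrderDto get(@PathParam("orderId") String orderId);
}

// Hypothetical data structure exchanged by this API.
class OrderDto {
    public String orderId;
    public String userId;
    public double totalAmount;
}
```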
APIs and Contracts per Device

As with most applications today, we need to be aware of the multiplicity of devices as well as network constraints. If a laptop connected to a good optical fibre network needs to retrieve a specific purchase order, we might want to give it all the details, including a PDF representation. But if the same API call comes from a mobile device stuck in public transport with a bad internet connection, we might want to return only a subset of the information. As you can see, each device has different needs, so we should have different APIs and contracts per device. This is a common practice that also applies to a microservices architecture.
Distributed Services

When microservices talk to each other over a network, we need to implement certain patterns to make sure the system is reliable. A microservice-based application typically runs in an environment where the number of service instances and their locations change dynamically. So how does the client of a microservice discover its location if it is constantly changing? The answer is a service registry. The service registry is a phonebook of services and their locations, letting clients look up services by their logical names. The first thing a microservice has to do is self-registration: it registers its network location on startup. When making a request to a service, the client first discovers the location of a service instance by querying the registry, and then it can invoke the needed microservice. Well-known service registries are Eureka, Zookeeper and Consul.
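The following toy, in-memory registry is only meant to illustrate the register-and-lookup idea; real registries such as Eureka, Consul or Zookeeper add health checks, leases and replication on top of it:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Toy in-memory service registry illustrating self-registration and lookup.
public class ServiceRegistry {

    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    /** Called by a service on startup to announce where it can be reached. */
    public void register(String serviceName, String location) {
        instances.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>())
                 .add(location);
    }

    /** Called by clients to discover the current instances of a service. */
    public List<String> lookup(String serviceName) {
        return instances.getOrDefault(serviceName, List.of());
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();

        // The Product service registers two instances on startup (addresses are illustrative).
        registry.register("product-service", "http://10.0.0.5:8080");
        registry.register("product-service", "http://10.0.0.6:8080");

        // The Order service asks the registry where Product can be reached.
        System.out.println(registry.lookup("product-service"));
    }
}
```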
Cross origin resource sharing

When dealing with microservices located on different servers, we quickly need to deal with Cross-Origin Resource Sharing, or CORS. In HTTP the same-origin policy is very restrictive: under this policy a document hosted on the User microservice can only interact with documents served from the same origin. The origin is made up of the protocol (http or https), the host and the port number. But in a microservices architecture, services live in different origins and need to talk to each other, that is, they cross the origin. For security reasons, browsers restrict cross-origin HTTP requests initiated from scripts. To allow cross-origin resource sharing, microservices have to send additional HTTP headers that let the user agent gain permission to access selected resources from a server of a different origin; typically this means dealing with the Access-Control-Allow-Origin header.
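As a sketch, a servlet filter that adds the CORS headers could look like this, assuming the Jakarta Servlet API; the allowed origin is an assumption and should be restricted to the origins you actually trust:

```java
import java.io.IOException;
import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.annotation.WebFilter;
import jakarta.servlet.http.HttpServletResponse;

// Adds CORS headers to every response so that a UI served from another
// origin (for example the composed front end) may call this microservice.
@WebFilter("/*")
public class CorsFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse http = (HttpServletResponse) response;
        // Illustrative origin; in production list only the origins that are permitted.
        http.setHeader("Access-Control-Allow-Origin", "https://shop.example.com");
        http.setHeader("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS");
        http.setHeader("Access-Control-Allow-Headers", "Content-Type, Authorization");
        chain.doFilter(request, response);
    }
}
```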
Circuit Breaker

CORS is not the only thing that can get in the way of a call between two microservices. With RPI the remote service has to be available, since there is no intermediate broker as there is with messaging. When the User microservice synchronously invokes the Purchase Order service, there is always the possibility that Purchase Order is unavailable, either because of a network failure or because it is under heavy load and essentially unusable. The failure of the Purchase Order microservice can then cascade to the User microservice and from there throughout the entire application. We call this the domino effect: one failure in one system can trigger all the systems to fail. To avoid this we need to introduce a circuit breaker. It is a way to invoke a remote service via a proxy in order to deviate the call if needed. For example, if the number of consecutive failures crosses a threshold, the circuit breaker stops attempting to invoke the remote service and deviates the calls. After a timer has expired, the circuit breaker starts allowing a limited number of requests to pass through; if those requests succeed, the circuit breaker resumes normal operation and re-introduces the traffic. A few circuit breaker libraries are already available, such as Hystrix and JRugged; a minimal hand-rolled sketch follows.
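Here is the minimal hand-rolled sketch mentioned above; production libraries such as Hystrix or JRugged are of course far more complete:

```java
import java.util.concurrent.Callable;

// Minimal circuit breaker sketch: CLOSED -> OPEN after too many consecutive
// failures, then HALF_OPEN after a cool-down period to probe the remote service.
public class CircuitBreaker {

    private enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openTimeoutMillis;

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    public CircuitBreaker(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    public synchronized <T> T call(Callable<T> remoteCall, T fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= openTimeoutMillis) {
                state = State.HALF_OPEN;      // cool-down elapsed: allow a probe request
            } else {
                return fallback;              // deviate the call, do not hit the remote service
            }
        }
        try {
            T result = remoteCall.call();
            consecutiveFailures = 0;          // success: close the circuit again
            state = State.CLOSED;
            return result;
        } catch (Exception e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;           // trip the breaker
                openedAt = System.currentTimeMillis();
            }
            return fallback;
        }
    }

    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(3, 5_000);
        // Hypothetical remote call to the Purchase Order service; here it always fails.
        String result = breaker.call(() -> { throw new RuntimeException("service down"); },
                                     "order details unavailable");
        System.out.println(result);
    }
}
```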
Let's get back to our user interfaces. Each microservice has its own set of graphical user interface components, and they must be aggregated into a single application. How do these components access the individual services? One solution is to have a one-to-one relationship between a UI component and its microservice, but then each call needs to deal with cross-cutting concerns such as security. The better approach is to have an API gateway, a single entry point for all clients, which gives each client a unified interface to all microservices. The gateway can handle requests in one of two ways: some requests are simply routed to the appropriate service, while for others it handles cross-cutting concerns such as authentication and authorization and determines the appropriate service via the registry. A gateway is also the ideal place for API translation: different devices need different data, so the gateway can expose a different API for each client. A few gateway technologies are Zuul, Netty and Finagle.

Security
Handling security is not specific to microservices; any system nowadays needs some sort of security mechanism. But with microservices a little extra care should be taken. Authentication and authorization are the terms we use when controlling access to a service and enforcing policy. Authentication is the process of verifying the identity of the user who accesses the microservices, usually implemented by asking for a username and password. Authorization determines who is allowed to do what; for example, only administrators can remove users or update user information in the User microservice. For this we use an Identity and Access Management (IAM) system, which addresses the need to ensure appropriate access to resources across a distributed system. This means a microservice no longer needs to handle login forms, authenticate users or store credentials; it delegates authentication and authorization to the IAM system. If the user enters a wrong username or password, he or she will not be able to access the User microservice; with the right credentials, the call is made to the service. With single sign-on, once logged in, users do not need to log in again to access a different microservice. Some authentication protocols available today are Kerberos, OpenID, OAuth 2.0 and SAML. Here we can see another benefit of having the gateway as the single entry point for client requests: the gateway authenticates requests and forwards them to a microservice, which may in turn invoke other microservices. Some well-known Identity and Access Management systems are Okta, Keycloak and Shiro.
In a distributed system it is difficult to verify the authenticity of requests in a consistent way across all services. The important question is: once authenticated, how does a microservice communicate the identity of the requester to the other microservices? The answer is an access token. It securely stores information about the user and is exchanged between services. Each service needs to make sure the token is valid, extract the user information from it, and verify whether the user is authorized to perform the operation. Tokens can follow the JSON Web Token (JWT) specification. Another possibility is to use cookies between service calls. Again you can see the benefit of the gateway, as it centralizes user interface calls and access token control.
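To show what such a token carries, here is a small sketch that decodes the payload of a JWT using only the JDK; note that signature verification is deliberately omitted, and a real service must validate the signature with a proper JWT library (for example jjwt or Nimbus) before trusting any claim:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Decodes the payload of a JWT passed between services, just to show what a token carries.
public class TokenReader {

    public static String readClaims(String jwt) {
        String[] parts = jwt.split("\\.");           // header.payload.signature
        byte[] payload = Base64.getUrlDecoder().decode(parts[1]);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical token issued by the IAM system after login
        // (payload: {"sub":"42","name":"Jane","roles":["admin"]}).
        String jwt = "eyJhbGciOiJIUzI1NiJ9."
                   + Base64.getUrlEncoder().withoutPadding().encodeToString(
                         "{\"sub\":\"42\",\"name\":\"Jane\",\"roles\":[\"admin\"]}"
                             .getBytes(StandardCharsets.UTF_8))
                   + ".signature";
        System.out.println(readClaims(jwt));
    }
}
```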
Scalability and Availability
Scalability

One major advantage of the architecture is that we can scale each microservice independently, depending on its needs. Being a distributed system, it should also have a high availability rate, meaning the system should be available, say, 99.99% of the time, and there are a few techniques to know for that. Around Christmas, New Year or any other important event, the application will be under real load because customers place many orders. Our User microservice may not be much affected, but our Purchase Order and Product microservices would definitely need scaling because of the load. There are several ways of doing it. With vertical scaling we scale by adding more power to the machine; for example, we can add more CPU and RAM for the Purchase Order microservice, which may be appropriate for that particular microservice. With horizontal scaling we scale by adding more machines; for example, our Product microservice can be replicated onto several machines.
When we scale horizontally we end up with several instances of the same microservice located on different servers. So if we scale the Product microservice horizontally, which of the instances will Purchase Order use when it invokes Product? Remember we have a registry, so all the instances of Product are registered there. It is then just a matter of adding client-side load balancing to the Purchase Order service. The load balancer picks one of the registered instances of the Product microservice and routes the request to it. It decides based on whatever criteria it is configured with, for example round-robin, weight or capacity of the service. Common client-side load balancing tools are Ribbon and Meraki.
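A toy round-robin strategy, the simplest of the criteria mentioned above, could be sketched like this; real tools such as Ribbon add weights, health awareness and retries:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy client-side load balancer: the caller cycles through the instances
// obtained from the service registry using a round-robin strategy.
public class RoundRobinLoadBalancer {

    private final List<String> instances;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinLoadBalancer(List<String> instances) {
        this.instances = instances;
    }

    /** Returns the next instance to call, cycling through all of them. */
    public String next() {
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Instances of the Product microservice as returned by the registry (illustrative).
        RoundRobinLoadBalancer lb = new RoundRobinLoadBalancer(
                List.of("http://10.0.0.5:8080", "http://10.0.0.6:8080", "http://10.0.0.7:8080"));

        for (int i = 0; i < 5; i++) {
            System.out.println("Routing request to " + lb.next());
        }
    }
}
```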
Availability
Availability is the probability that the system is operational at a given time. For example, a highly available system is expected to be up 99.99% of the time, which leaves less than an hour of downtime per year. If you look at our architecture, we have single points of failure. A single point of failure is a part of the system that, if it fails, causes the entire system to fail. For example, we have only one gateway, one message broker, one service registry and one Identity and Access Management system. For a distributed system to be continuously available, every request received must result in a response. If the service registry is down, the User microservice is unable to locate the Purchase Order microservice, which makes the entire system unavailable. To fix this, every single point of failure should be scaled horizontally so that we have multiple instances.
I hope these examples are helpful. Now let's talk a bit about monitoring the system, which is an important aspect of any distributed system.
Monitoring
Monitoring allows you to take proactive action when a service is consuming unexpected resources or not responding. There are so many moving parts and so many machines involved in a microservice architecture that you quickly need to monitor what is happening. You need to rapidly view the running instances, see failure and success rates and find the bottlenecks. The key feature of monitoring is that it should be centralized: it is not possible to log in to each machine and check the logs and processes manually when there are hundreds or thousands of machines to monitor. When the architecture is that complex, the information also needs to be visual, so monitoring needs dashboards that let you quickly see what went wrong. Monitoring dashboards such as Kibana, Grafana and Splunk allow you to visualize all sorts of information.

One important piece of information is the health check. Sometimes a microservice is running but incapable of handling requests; for example, it might have run out of database connections. So the important question is: how do we detect a running microservice that is unable to take requests? One way is to have a health check API on every microservice, that is, an HTTP endpoint that returns the health of the service and can be pinged by the centralized monitoring. This API can perform checks such as the status of the database, the status of the host, disk space, available memory and so on. The centralized monitoring constantly calls this API to learn the health of each and every microservice.
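A minimal health check endpoint could be sketched with the JDK's built-in HTTP server as shown below; the port, the JSON shape and the individual checks are assumptions for the example, and most frameworks provide an equivalent endpoint out of the box:

```java
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import com.sun.net.httpserver.HttpServer;

// Minimal /health endpoint using only the JDK's built-in HTTP server.
public class HealthCheckServer {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);

        server.createContext("/health", exchange -> {
            boolean healthy = diskSpaceOk() && databaseOk();
            String body = healthy ? "{\"status\":\"UP\"}" : "{\"status\":\"DOWN\"}";
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);

            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(healthy ? 200 : 503, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });

        server.start();
        System.out.println("Health endpoint available at http://localhost:8081/health");
    }

    private static boolean diskSpaceOk() {
        // Require at least ~100 MB of free disk space (threshold is arbitrary).
        return new java.io.File("/").getUsableSpace() > 100L * 1024 * 1024;
    }

    private static boolean databaseOk() {
        // Placeholder: a real check would run a cheap query against the service's database.
        return true;
    }
}
```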
Log aggregation
Our application consists of multiple microservices running on multiple machines. We have health checks to determine whether they are alive, but how can we understand the behaviour of the application and troubleshoot problems? Each microservice writes various information to its log file in a standardized format: errors, warnings and debug information. But it is not efficient to read the log files of each microservice to understand what is happening. This is where log aggregation comes in: a centralized logging service aggregates the logs from every service instance, and the administrator can then analyse them from a dashboard. Log aggregators include Logstash, Papertrail and Splunk.

When an error occurs, a microservice throws an exception that contains a stack trace. This exception will be logged somewhere in the middle of a log file and would be difficult to find. So these exceptions should also be recorded in a centralized exception tracking system, so that developers have an easy way to review and resolve them.
Metrics
One more question arises here: how do we know that the system is slowing down, and how do we detect performance issues? We need to instrument our microservices to gather statistics about individual operations, such as:
the time taken to answer an HTTP request,
the time it takes to create a purchase order,
the time taken for a database access.
We then need to aggregate these metrics in a centralized metrics system that provides reporting; several metrics tools are available on the market.
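As a sketch of such instrumentation, the following measures the duration of an operation and keeps simple in-memory statistics; a real service would use a metrics library (Micrometer or Dropwizard Metrics, for example) and ship the values to the central system:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Minimal instrumentation sketch: time an operation and accumulate per-operation statistics.
public class Metrics {

    private static final Map<String, LongAdder> totalMillis = new ConcurrentHashMap<>();
    private static final Map<String, LongAdder> count = new ConcurrentHashMap<>();

    public static <T> T time(String operation, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            totalMillis.computeIfAbsent(operation, k -> new LongAdder()).add(elapsedMillis);
            count.computeIfAbsent(operation, k -> new LongAdder()).increment();
        }
    }

    public static void main(String[] args) {
        // Hypothetical operation: creating a purchase order.
        Metrics.time("order.create", () -> {
            try { Thread.sleep(120); } catch (InterruptedException ignored) { }
            return "order-42";
        });

        long total = totalMillis.get("order.create").sum();
        long calls = count.get("order.create").sum();
        System.out.printf("order.create: %d call(s), avg %d ms%n", calls, total / calls);
    }
}
```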
Auditing
In this kind of architecture it is also important to understand the behaviour of the users: activity such as logins and logouts, the pages that have been visited, the number of products that were purchased, the products that were browsed, and so on. All this user activity should be recorded in a centralized system so that we know which parts are used heavily and can optimize and scale them if needed.
Rate limiting

It is not only end users who access your microservices; sometimes third-party systems invoke your APIs. Rate limiting is used to control API usage by applying policies that limit the traffic coming from specific resources or specific customers. For example, we can limit the number of HTTP requests a client can make in a given period of time. Rate limiting is also used to monetize an API: for example, a specific customer has the right to call an API only 200 times per day, and if he wants more access he has to pay extra for it.
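A fixed-window rate limiter, one of the simplest possible policies, could be sketched as follows; gateways and API management products provide this, and more refined algorithms such as token buckets, out of the box:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal fixed-window rate limiter: each client gets a maximum number of requests per window.
public class RateLimiter {

    private final int maxRequestsPerWindow;
    private final long windowMillis;
    private final Map<String, long[]> counters = new ConcurrentHashMap<>(); // {windowStart, count}

    public RateLimiter(int maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the client is still within its quota for the current window. */
    public synchronized boolean allow(String clientId) {
        long now = System.currentTimeMillis();
        long[] state = counters.computeIfAbsent(clientId, k -> new long[] { now, 0 });
        if (now - state[0] >= windowMillis) {    // window expired: start a new one
            state[0] = now;
            state[1] = 0;
        }
        if (state[1] < maxRequestsPerWindow) {
            state[1]++;
            return true;
        }
        return false;                            // quota exhausted: reject (e.g. HTTP 429)
    }

    public static void main(String[] args) {
        // Illustrative quota: 200 calls per day for customer "acme".
        RateLimiter limiter = new RateLimiter(200, 24L * 60 * 60 * 1000);
        System.out.println(limiter.allow("acme"));   // true until the quota is used up
    }
}
```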
Distributed Tracing

Requests often span multiple services. For example, the user hits the User microservice, logs in, creates a shopping cart, selects products and generates a purchase order. So far we have logged every event separately and cannot trace an entire request going from the user interface to the database through three microservices. With distributed tracing, we instrument the services with code that assigns each request a unique correlation identifier, which is then passed from service to service. This creates a chain of calls and provides useful insights, for example the overall response time of the entire invocation. A few tracing systems that can be used in a microservice architecture are Dapper, HTrace and Zipkin.
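A sketch of propagating a correlation identifier on a downstream call is shown below; the header name X-Correlation-Id and the URL are assumptions, and real tracing systems such as Zipkin define their own header conventions:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;

// The first service in the chain generates a correlation id, logs it, and forwards it
// on every downstream call so that all log lines belonging to one user request can be
// stitched together later.
public class TracedCall {

    public static void main(String[] args) throws Exception {
        String correlationId = UUID.randomUUID().toString();
        System.out.printf("[%s] creating purchase order%n", correlationId);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://product-service:8080/products/42"))
                .header("X-Correlation-Id", correlationId)   // propagate the id downstream
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.printf("[%s] product service answered with status %d%n",
                          correlationId, response.statusCode());
    }
}
```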
That's all; we have come to the end of this post. We covered the terminology used in microservices, the problems that arise and the solutions we apply to them, and we also looked at the different technologies that can be used when building a microservice-based application.
I hope this post is helpful! Please leave your comments in the comments box.
Thank you!!!