Thursday, June 15, 2017

AWS — When to use Amazon Aurora instead of DynamoDB

Amazon DynamoDB as a managed database will work for you if you prefer a code-first methodology. You will be able to scale it easily if your application inserts and reads data by your hash key or primary key (hash+sort key). It is also good if your application runs some queries on the data, as long as the result set of each query is less than 1 MB. Basically, if you stick to the functionality that websites typically require in real time, then DynamoDB will perform for you. Obviously you will need to provision the reads and writes properly and implement some auto-scaling on DynamoDB WCUs and RCUs, but after you do all of the homework, it will run smoothly without you needing to manage much.
However, there are cases when you will need to go back to relational databases in order to accomplish your business requirements and technical requirements.
For example, let’s assume that your website calls one of your microservices, which in turn inserts data into its table. Then let’s assume that you need to search the data in this table and perform big extracts which then have to be sent to a 3rd party that deals with your data in a batch-oriented way. If you need to, for example, query and extract 1 million records from your DynamoDB table, it will take you up to 4.7 hours, based on my prototypes using the standard AWS DynamoDB library from a Python or C# application. The way you read this amount of data is by using LastEvaluatedKey within DynamoDB: you query/scan and get up to 1 MB (due to the cutoff), and if the response contains a LastEvaluatedKey, you are not at the end of the result set, so you loop and continue fetching more results until you exhaust the list. This is feasible, but it is neither fast nor scalable.
My test client was outside the VPC, and obviously if you run it within the VPC, you will almost double your performance, but when it comes to bigger extracts, it still takes a long time. If you are dealing with fewer than 100,000 records, it is manageable within DynamoDB, but when you exceed 1 million records, it becomes unreasonable.
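The paging loop described above can be sketched in Python roughly as follows. This is a minimal sketch, not the code behind the timings quoted here. It assumes `table` is any object whose `scan` method behaves like the boto3 DynamoDB `Table.scan` call: it accepts `ExclusiveStartKey` and returns a dict with `Items` plus a `LastEvaluatedKey` when more pages remain. The helper name `scan_all_items` is made up for illustration.

```python
from typing import Any, Dict, Iterator


def scan_all_items(table: Any, **scan_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item from a DynamoDB-style table, one page at a time.

    `table` is assumed to expose scan(**kwargs) with the same response shape
    as boto3's Table.scan: {"Items": [...], "LastEvaluatedKey": {...}}.
    """
    kwargs = dict(scan_kwargs)
    while True:
        page = table.scan(**kwargs)
        yield from page.get("Items", [])

        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the result set is exhausted.
            break
        # Resume the next page right after the last key that was evaluated.
        kwargs["ExclusiveStartKey"] = last_key
```

Every iteration of the loop is one sequential round trip that returns at most one page, which is why a large extract ends up as a long chain of calls.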
So what do you do in this case? I am sure that you can improve the performance of the extract by using Data Pipeline and similar approaches that are more optimized, but you are still limited.
Basically, your solution would be to switch to a relational database, where you can query much faster and where you have the concept of transactions to help with any concurrency issues you might have been challenged with. If you want to stay within the Amazon managed world, then Amazon Aurora looks very attractive. It has limits on the amount of data, but most likely those limits are high enough for your business. As for the big extract performance challenge, your extracts will go from hours (within DynamoDB) to minutes with Aurora.
Please consider this in your designs. Performing big extracts is the opposite of an event-driven architecture, but these types of requirements still exist because you need to support legacy systems that you interact with, or systems that have not adjusted their architecture to your methodologies.
Thank you for reading.
Almir Mustafic.

Wednesday, June 14, 2017

Success of good tech leads is really attributed to the SUM of many LITTLE good moves

How many times have you been asked as a tech lead what you do to make things successful?
If the answer to that question were simple, then the experience of tech leads would be totally undervalued.
We all know that there is no silver bullet for the success of projects that certain tech leads have been achieving. The success of good tech leads is really attributed to the sum of many little good moves.
It starts from day 1 when requirements are being delivered to the team. For example, analyzing the business requirements, detecting what the true requirements are and removing all requirements that contain implementation details is crucial in the overall equation. Sometimes you, as the tech lead, will need to put on a project management hat and a product management hat in order to steer the ship in the right direction, and at the same time you need to be very engaged in the low-level technical details with your fellow software engineers. This is just the beginning, and the moves made at this phase are the foundation for the rest of the challenges that follow. In the agile methodology world, this requirements phase happens very often, and that puts even more pressure on you as the tech lead to keep the foundation solid.
Then you get to the point where you are taking the ironed-out business requirements (epics) and turning them into a collection of user stories that different squads will be receiving into their backlogs. Understanding how to do this breakdown and estimation is crucial in laying the foundation for all the squads and the backlogs that they take on. You, as the tech lead, play an instrumental role in this process because you work very closely with solutions architects in determining the high-level design and architecture. The decisions that you and the solutions architects make at this step affect multiple squads, the squads’ timelines (pre-planned sprints) and the overall calculated production launch date for the given milestone/release.
Then you get to the point where you take the stories from your squad’s backlog and you work with fellow engineers on your squad/team. Every interaction and every little move or piece of advice that you give to developers is important; every recommendation that is given to you needs to be properly assessed. It starts before even a line of code is written. It starts with the coding methodology and mindset. It is anything from introducing guidelines on how to define boundaries of microservices/APIs, to unit-testing culture, to database design in NoSQL vs. SQL, to concepts of backwards compatibility, to the introduction of feature toggles for the product team and launch risk mitigations, and so on. There are so many little things that create the sum that defines the success.
If you are a developer on this team, learn to listen and hear what the good tech leads are preaching and also provide your feedback as great tech leads also know how to accept your input and build on it. If you are a tech lead, show the values you care about through examples and let the team experience it, and then sense when you need to step back a bit and let the team move forward and ride on the momentum and the vibe you have established on the floor.
Keep in mind that while you are going through this exercise, you should NOT think that you are the smartest person in the room, because if you think you are, then it really means that you are in the wrong room. If nobody applies any humility, then it means everybody is in the wrong room and the concept of the team disappears. Understand your skills and understand the skills your teammates bring to the table, and try to create a culture where complementing each other happens naturally. This is the culture where you respect each other so much at the professional level that you happily sacrifice your time to save each other’s work.
Geek out and enjoy the process because it requires a lot of patience and perseverance.
Thank you for reading.
Almir Mustafic

Sunday, June 11, 2017

Knowing when to put code into your API/microservice vs. outside API (wrapper microservice)

As you are defining the purpose of each one of your microservices, and as you are developing these microservices, it is very important to know if you need to make changes to an existing microservice, or you need to build a wrapper microservice.
What I am talking about here are not rules; I am just giving you some examples to prompt you to think about this when you go through a similar exercise. I just want to make developers aware of the differences.
So the main question is:
  • Put code into an existing microservice/API
  • OR develop a wrapper microservice API?
Let’s assume you have a “Package” microservice/API and this API gives you the list of packages and their details:
  • /api/packages (this gives you all the packages)
  • /api/packages/{id} (this gives you a specific package based on that ID)
  • /api/packages?type=type101 (this gives you all the packages of type=type101)
Let’s assume that behind this microservice you are using a NoSQL table “Package” and that table has the packageId as the hash key of the table and then you have a bunch of other attributes that you are saving as part of the JSON object.
Let’s say that this API has been used in your production system for many months. Then you get a requirement from your product team that they want to introduce the concept of “Package Bundles”. A package bundle would be just a collection of different packages. For example, you can create Bundle 1, which contains:
  • package 1
  • package 2
  • package 3
Whereas Bundle 2 could contain:
  • package 1
  • package 7
  • package 8
  • package 9
Ultimately, instead of only presenting individual packages to customers and having them pick one by one, the product team also wants to present some pre-configured bundles of packages to simplify the customer experience.
Now the big question is: How do you as a software engineer and solutions architect design this, or in other words how do you incorporate this requirement into your existing design of your microservices?
Before I get into the design, one unwritten rule in software development is that you need to keep things backwards compatible regardless of what type of design you come up with.
Let’s get into the design a bit. If you have been working in monolithic applications for many years (as many of us have), you would be naturally inclined to open up the existing “Package” microservice and possibly introduce a list of bundle IDs into the existing package data. Or you could add another table within the Package microservice to introduce the concept of bundles, and that database structure would be leaning towards a relational DB design. I talked about NoSQL and avoidance of relational DB design in my YouTube video:
As you get deeper into this approach, you will realize that you are polluting the purpose of the Package microservice and slowly turning it into a macro-service. I have a full post about “micro” in micro-services:
You will also realize that the API routes cannot stay RESTful. How do you actually search to get the bundles using the /api/packages API route?
That’s when you conclude that you need to change your design.
The solution that I recommend for this specific requirement or use case is to introduce a new “PackageBundle” microservice. I really classify this as a wrapper microservice. This PackageBundle microservice would have the following type of API routes:
  • /api/packagebundles (this gives you all the package bundles)
  • /api/packagebundles/{id} (this gives a specific package bundle by id)
  • /api/packagebundles?package=packageA (this gives you the list of bundles that contain packageA)
This microservice would have a NoSQL table with the following data structure:
- packageBundleId   (hash key for the table)
- packages [ ]   (list of packages)
   - packageId
   - packageName
   - other package attributes
- Other attributes needed to represent the bundle
In JSON, this would look like this:
  {
    "packageBundleId": "",
    "packages": [
      {
        "packageId": "",
        "packageName": "",
        "packageDescription": ""
      },
      {
        "packageId": "",
        "packageName": "",
        "packageDescription": ""
      }
    ],
    "otherPackageBundleInfo": ""
  }
Now let’s visualize how package bundles would be created. Imagine a business-enablement tool that allows you to create a bundle, and let’s assume that you drag packages into a bundle. Behind the scenes, that tool would call the Package API to retrieve the details of each package that is dragged into a bundle, and then as the final step the tool would call the PackageBundle API to save all those details into the record for that given bundle. That’s what this business-enablement tool would do. On the other hand, the main website that customers are using would call the /api/packagebundles API to get a list of bundles or a specific bundle to display on the screen for customers.
With this approach you kept your existing microservice micro and you introduced a new microservice whose sole purpose is to manage collections of packages. If I want to create some new packages, I am not dependent on the PackageBundle API; I can directly call the Package API to perform the actions. If the product team decides tomorrow that the concept of bundles is not needed any more, then I can just stop using the PackageBundle API and get rid of it in the long term without any impact on the Package API.
I hope this example can help you save some time if you have a similar type of requirement and design.
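The tool flow above can be sketched in Python as follows. This is only an illustration under assumptions: `fetch_package` is a hypothetical stand-in for a call to the Package API (for example GET /api/packages/{id}), and the helper names `build_bundle_record` and `bundles_containing` are invented here. The second helper is an in-memory equivalent of the /api/packagebundles?package=... route.

```python
from typing import Any, Callable, Dict, Iterable, List

Package = Dict[str, Any]


def build_bundle_record(
    bundle_id: str,
    package_ids: Iterable[str],
    fetch_package: Callable[[str], Package],
    other_info: str = "",
) -> Dict[str, Any]:
    """Assemble the record the tool would send to the PackageBundle API.

    `fetch_package` is a hypothetical stand-in for a Package API lookup;
    the package details it returns are copied into the bundle record.
    """
    packages: List[Package] = []
    for package_id in package_ids:
        details = fetch_package(package_id)
        packages.append(
            {
                "packageId": details["packageId"],
                "packageName": details["packageName"],
                "packageDescription": details.get("packageDescription", ""),
            }
        )
    return {
        "packageBundleId": bundle_id,
        "packages": packages,
        "otherPackageBundleInfo": other_info,
    }


def bundles_containing(
    bundles: Iterable[Dict[str, Any]], package_id: str
) -> List[Dict[str, Any]]:
    """Return the bundles whose package list includes `package_id`."""
    return [
        bundle
        for bundle in bundles
        if any(p["packageId"] == package_id for p in bundle["packages"])
    ]
```

Because the package details are copied into the bundle record, the website can render a bundle from a single PackageBundle call. The trade-off, under this assumption, is that the copies have to be refreshed if a package changes.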
Thank you for reading.
Almir Mustafic.

Sunday, June 4, 2017

Defense is the BEST Offense — how do you apply this concept in Software Engineering?

Defense is the BEST offense. If you play sports, this is very familiar to you. I played high-school basketball and I learned that this is very true.
As for applying this concept in the software engineering field, let me start by saying that I am NOT talking about developing software in such a way as to protect your job; I am totally against that. I am talking about something else here.
I am talking about a mindset in software engineering that allows you to be offensive, and the buzzword that describes this is “being disruptive”. You can develop software very fast, but can you consistently repeat this without creating chaos?
One real life example is the following: You developed a piece of software or a list of features and your QA and Stage testing passed perfectly. Then you are about to go to production and on the day of the production launch, the decision gets made not to launch those features for technical or business reasons. Now the question becomes: Can you easily disable these features and still launch everything to production with these features turned off? If the answer to this question is a CONFIDENT YES, then you are implementing things the right way and you have development methodologies in place that present you with a lot of capabilities and options. It is easy to say this, but this methodology starts before you even write a line of code. On the other hand, one might say “why do I need to develop the feature toggle if I can just be fast and go for it?”. I would say that going fast and taking chances is fine if you have the foundation, but without the foundation and methodology to support you, it is just reckless.
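A feature toggle of the kind described above can be as small as the following Python sketch. It is an illustration only: the `FeatureToggles` class, the `new_feature` flag name and the page sections are all hypothetical, and a real system would load the flags from configuration rather than from a literal dict.

```python
from typing import Dict, List, Mapping


class FeatureToggles:
    """Tiny in-memory toggle store; unknown features default to off."""

    def __init__(self, flags: Mapping[str, bool]) -> None:
        self._flags: Dict[str, bool] = dict(flags)

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def build_page_sections(toggles: FeatureToggles) -> List[str]:
    """Return the sections to render; new code ships but stays dark when off."""
    sections = ["existing_feature"]
    if toggles.is_enabled("new_feature"):
        sections.append("new_feature")
    return sections
```

Defaulting unknown flags to off is the defensive choice here: the code for the new feature can go to production, and a missing or disabled flag leaves the existing behavior untouched.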
The confident YES in the above scenario gives your product management and your senior management enough confidence to try different things without worrying about failing, because failures or sudden decisions will not cause chaos. I provided just one example that accomplishes this goal.
In conclusion, you are building a defensive capability called “feature toggle”, but in turn you are actually providing offensive capabilities for you and your leadership team to innovate and disrupt. It sounds counterintuitive, but it is not.
Thank you for reading.
Almir Mustafic.

Make components reusable and NOT necessarily the orchestration layers in your software platform

Making things reusable in software engineering is very important. That’s why all the general purpose programming languages have methods/functions and they also have object oriented capabilities. Using these capabilities you can implement different patterns. Yes, different languages implement these patterns differently, but my point is that they are all achievable.
Generally speaking junior programmers write code that is simple, but it may not be as reusable as we would like it to be. On the other hand, when programmers gain some experience and learn these different programming patterns, they may get over-excited and start applying these patterns without figuring out first what problem they are trying to solve. They may kind of look at the list of patterns and say: “Oh, this one would be cool”. It is totally fine to use a known programming pattern, but how you arrive at using one decides if you are doing it for the right reasons. Here is what I mean.
First, understand what problem you are trying to solve.
Then figure out the solution for this problem without being distracted by the available patterns. Focus strictly on the solution and work it out on paper or on a napkin.
Now that you have a solution to your problem, you can look at the list of known programming patterns at your disposal, and most likely one of those patterns will be the one that exactly fits the solution. However, there is a chance that no known pattern fits, or that you need to use a combination of two patterns with slight modifications.
You may be wondering where I am going with this. I am basically trying to get to my next point which is the main subject of this article. I want to talk about making your individual components in your platform reusable and NOT necessarily the actual orchestration layers that you may have.
Let’s assume that you have the following components:
  • Component A
  • Component B
  • Component C
  • Component D
  • Component E
You need to develop each one of these components in such a way that they are reusable, so that the orchestration layer can invoke them without needing to learn about their internals. For the sake of this example, I will assume that those components are just methods/functions. Then let’s assume that the orchestration layer looks as follows:
Orchestration Layer:
Async call to MethodB();
Wait for MethodB() response
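As a rough illustration, an orchestration layer like this can stay plain code. The Python sketch below uses asyncio. The listing above only names MethodB, so the other steps (`method_a`, `method_c`) and the order of calls are assumptions made up for the example.

```python
import asyncio
from typing import List


async def method_a() -> str:
    """Hypothetical component A."""
    return "A"


async def method_b() -> str:
    """Component B; the sleep stands in for slow I/O."""
    await asyncio.sleep(0)
    return "B"


async def method_c(b_result: str) -> str:
    """Hypothetical component C that consumes B's response."""
    return f"C({b_result})"


async def orchestrate() -> List[str]:
    """Orchestration layer written as plain code, with no workflow engine."""
    # Async call to MethodB(): start it without blocking.
    b_task = asyncio.create_task(method_b())
    # Do other work while MethodB is in flight (assumed step).
    a_result = await method_a()
    # Wait for MethodB() response.
    b_result = await b_task
    # Use the response in a later step (assumed step).
    c_result = await method_c(b_result)
    return [a_result, b_result, c_result]
```

Running `asyncio.run(orchestrate())` executes the flow. Changing the flow means editing a few lines in `orchestrate`, while the components themselves stay untouched.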
Then what could happen is that you try to make this orchestration layer reusable by creating a template for it. When you do that, you could create a workflow or some type of state-map that ties all these calls and you have some fancy configuration that allows you to connect state to state and you have a mechanism to incorporate a new step in the state-map without writing any code.
I understand this conceptually, but my question to you is: Why? What do you gain by doing this? What problem are you trying to solve? Are you solving a problem at the root?
A lot of times the response is: We need to do it because we want to introduce new steps in this step-factory (state-map) without needing to push any code.
Yes, there are valid reasons to introduce the step-factory pattern (state-map) in your design, but most of the time you don’t have a good reason and I will explain why.
You may be introducing this step-factory design for the wrong reasons. It could be that in your organization it is very hard to deploy a code/binary change to production due to a lot of processes. So you ended up changing your design (even for a simple case such as the one above) and introducing the step-factory to get around the process, so that you can deploy a config change quickly to production. First, you are complicating your design and implementation by introducing this without good reasons. Second, you need skilled developers to make code changes within it, and you are actually adding more risk.
Let me use the following analogy with the car engine and other car parts under the hood of your car. Under the hood in the engine bay, you have the following:
  • engine
  • battery
  • alternator
  • water pump
  • brake cylinders
  • air intake and air filter
  • radiator
  • fan
  • …etc
You can picture those car parts as the individual components that I described above in your software platform. The engine bay itself would be the orchestration layer. Everything in this engine bay (orchestration layer) is connected in a specific way and it works great. If something goes wrong, you can go inside the individual parts (components) and replace the full part or the insides of that part. For example, the air filter could be dirty, and you just open the air filter box (component) and put in a new air filter. So this setup with this engine bay (orchestration layer) and these parts (components) is great if all you are doing is changing the insides of a part/component or replacing the whole part/component with another one that fits in (using the same interface). By changing these parts, you are pretty much changing just the configuration within your step-factory. If that is all you are doing, that is perfectly fine, and this engine bay / orchestration layer / workflow is perfectly good for you.
However, the world of software engineering (especially the microservices world) is not as static as the engine bay of your car. You typically need to put things together using different permutations, or you need to orchestrate in a different way. That’s where you need to step back and ask yourself if you really need to turn your orchestration layer into some template/step-factory/state-map when you are constantly changing what that orchestration looks like. This is where you realize that the complexity of your state-map/workflow via the step-factory pattern is actually complicating things, slowing you down and adding unnecessary risk. There is nothing wrong with making a code change in your orchestration layer in order to meet a requirement. Tell yourself that it is ok: it is not 100% configurable, but making it 100% configurable would slow me down because of the added complexity.
As for my example above about using configurability to get around your process, you really need to think about solving that problem at the root. That company probably introduced the process to slow developers down because they were manually deploying things to production and breaking things. So the typical reaction from your Change Control team is to tighten the SDLC process. What you are missing is the continuous integration and automated deployments that increase confidence and success and reduce risk. That is your solution for that problem, and your Change Control team will welcome it because it is repeatable and predictable.
In conclusion, please think about the complexity of your code. Make your individual components modular/reusable and use them in your code. After you have used them enough in your orchestration layer, you can determine whether you want to keep your orchestration layer simple or you need to build a template (i.e. step factory or workflow) for it. Think about the maintenance here and your use cases. If you are constantly changing the flow of your orchestration layer, then the complexity you are introducing with the workflows/step-factory is unnecessary. In that case, stick with the basics; it is ok to change code in order to adjust the orchestration layer. Just streamline your deployments throughout the pre-prod environments and into production to make this experience very smooth.
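To make the "same interface" point concrete, here is a small Python sketch of the air filter example. All names (`AirFilter`, `PaperFilter`, `CottonFilter`, `engine_bay`) are invented for the analogy. It only shows that the orchestration code depends on the interface and not on the part plugged into it.

```python
from typing import Protocol


class AirFilter(Protocol):
    """The interface the engine bay relies on."""

    def clean(self, air: str) -> str: ...


class PaperFilter:
    def clean(self, air: str) -> str:
        return f"paper-filtered {air}"


class CottonFilter:
    def clean(self, air: str) -> str:
        return f"cotton-filtered {air}"


def engine_bay(air_filter: AirFilter) -> str:
    """Orchestration: depends only on the interface, not on the part inside."""
    return f"engine runs on {air_filter.clean('air')}"
```

Swapping `PaperFilter()` for `CottonFilter()` changes nothing in `engine_bay`, which is the kind of change a fixed orchestration layer handles well.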
Thank you for reading.
Almir Mustafic