Designing Our New Cloud-Native App: A Roadmap

Cerebrix

Tuesday, July 1, 2025

Louisa Medina

Cloud-native is no longer optional. In a world of elastic workloads, user expectations for always-on services, and rapid product iteration, cloud-native architecture is now the baseline. But getting it right means more than sprinkling Kubernetes over your codebase. It demands clear thinking about domain boundaries, operational ownership, and long-term maintainability.

Recently, our team kicked off a greenfield build of a modern cloud-native app. We needed to avoid the usual architectural chaos that can follow “move fast” slogans. Here’s how we planned it, step by step.

Understanding the Business First

We did not let ourselves jump straight to picking tools. Instead, we worked closely with product managers and security stakeholders to clarify the core business problem. What capabilities did the application absolutely require? What user volumes and performance expectations existed? Were there compliance requirements, like GDPR or SOC 2? And what availability guarantees were realistic?

Getting precise answers to these questions was fundamental. If you skip them, you risk designing a system that is “cloud-native” but solves the wrong problem, or worse, can’t meet legal and compliance standards.

Establishing Domain Boundaries

Rather than drawing a big tangled web of services up front, we mapped out domain contexts with clear ownership. In our case, the core included user accounts, billing, notifications, and reporting. Each domain had its own data concerns, workflows, and future roadmap. We knew we’d start with a simpler modular deployment, but these boundaries still mattered because they would guide both code structure and ownership models.

By thinking about domains before drawing infrastructure diagrams, we could future-proof our design, keeping the option to move to microservices later if growth demanded it.

Choosing the Right Architectural Base

We deliberately decided to start with a modular monolith. In the past, teams I’ve worked on jumped to microservices prematurely, creating excessive deployment overhead and inter-service complexity for an MVP. A modular monolith kept the runtime and orchestration simpler, while allowing us to isolate business logic internally.

We containerized the app immediately using Docker, deployed to Kubernetes (AKS), and structured the code so that each module had explicit interfaces. That made it feasible to break out services later without a total rewrite.
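
To make "explicit interfaces" concrete, here is a minimal sketch of one module boundary. The real APIs were written in .NET 8; Python is used here only for illustration, and every name (NotificationsApi, InProcessNotifications, BillingModule) is hypothetical rather than taken from our codebase.

```python
from dataclasses import dataclass, field
from typing import Protocol


class NotificationsApi(Protocol):
    """Public interface of a hypothetical notifications module."""

    def send(self, user_id: str, message: str) -> None: ...


@dataclass
class InProcessNotifications:
    """In-process implementation, used while everything ships as one deployable."""

    sent: list[tuple[str, str]] = field(default_factory=list)

    def send(self, user_id: str, message: str) -> None:
        self.sent.append((user_id, message))


@dataclass
class BillingModule:
    """Billing depends only on the NotificationsApi interface, never on its internals."""

    notifications: NotificationsApi

    def charge(self, user_id: str, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        receipt = f"Charged {amount_cents / 100:.2f} to {user_id}"
        self.notifications.send(user_id, receipt)
        return receipt


# Wire the modules together at the composition root of the monolith.
notifications = InProcessNotifications()
billing = BillingModule(notifications=notifications)
billing.charge("user-42", 1999)
```

If notifications were later split into its own service, only the implementation behind NotificationsApi would need to change (for example, to a client that calls over the network); BillingModule would stay as it is.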

Defining a Cloud-Native Tech Stack

After evaluating team skills and project needs, we chose .NET 8 for the core APIs, giving us strong performance and broad team familiarity. PostgreSQL became our relational store for its maturity and strong cloud support, while Redis handled caching and ephemeral storage. RabbitMQ provided a robust, straightforward message broker for asynchronous events, and Pulumi allowed us to manage infrastructure as code in a modern, type-safe way.
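
As an illustration of the caching role described above, the sketch below shows the cache-aside pattern: check the cache, fall back to the database on a miss, then store the result with a time-to-live. It assumes a tiny in-memory TtlCache as a stand-in for Redis, and the function and key names are hypothetical; it is not our production code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Tiny in-memory stand-in for a cache such as Redis (illustrative assumption)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_display_name(user_id: str, cache: TtlCache, load_from_db: Callable[[str], str]) -> str:
    """Cache-aside read: try the cache first, then the database, then populate the cache."""
    key = f"user:{user_id}:display_name"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(key, value)
    return value
```

The clock is injectable so that expiry can be tested without sleeping. With a real Redis client, the TTL would be handled by the server rather than by application code.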

For observability, we committed to OpenTelemetry from day one, rather than bolting it on later. That decision would pay off the first time something went wrong in staging.

Building in Security from the Start

Security was not a separate track. We included secure defaults on day one: single sign-on with enforced MFA, least-privilege IAM policies on all resources, network segmentation, and audit logging. We also made sure our APIs sat behind an authenticated API gateway from the beginning.
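
One way to keep least-privilege from eroding is to lint policies automatically. The sketch below assumes a simplified, hypothetical policy shape (a dict with a "statements" list holding "actions" and "resources"); real cloud IAM formats differ, so treat it as an outline of the idea rather than a working integration.

```python
from typing import Dict, List


def find_overbroad_statements(policy: Dict[str, List[dict]]) -> List[str]:
    """Return a finding for every statement that grants wildcard actions or resources."""
    findings: List[str] = []
    for index, statement in enumerate(policy.get("statements", [])):
        actions = statement.get("actions", [])
        resources = statement.get("resources", [])
        # "*" or "service:*" grants more than any single workload should need.
        if any(action == "*" or action.endswith(":*") for action in actions):
            findings.append(f"statement {index}: wildcard action")
        if "*" in resources:
            findings.append(f"statement {index}: wildcard resource")
    return findings


# Hypothetical policy: the first statement is scoped, the second is too broad.
example_policy = {
    "statements": [
        {"actions": ["storage:read"], "resources": ["bucket/reports"]},
        {"actions": ["storage:*"], "resources": ["*"]},
    ]
}
findings = find_overbroad_statements(example_policy)
```

A check like this could run in CI and fail the build whenever the findings list is non-empty.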

Dependency scanning, including tools like Snyk and Dependabot, was integrated directly into CI. We refused to compromise on these baselines even in the MVP. It is far easier to embed secure-by-design principles up front than to retrofit them after user data is already live.

Designing a CI/CD Strategy

We invested time to get CI/CD right. Our pipelines used GitHub Actions for building, linting, and running test suites. ArgoCD handled declarative deployment to Kubernetes, and we standardized on semantic versioning to keep change tracking consistent.
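
For readers less familiar with semantic versioning, here is a small sketch of the bump rules a pipeline might apply. The change labels ("breaking", "feature", "fix") are assumptions for illustration, and the parser deliberately ignores pre-release and build metadata.

```python
import re
from typing import NamedTuple

# Simplified: accepts MAJOR.MINOR.PATCH only, without pre-release or build metadata.
SEMVER_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse_version(text: str) -> Version:
    match = SEMVER_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, change: str) -> Version:
    """Apply the semantic versioning rule for the given kind of change."""
    if change == "breaking":
        return Version(version.major + 1, 0, 0)
    if change == "feature":
        return Version(version.major, version.minor + 1, 0)
    if change == "fix":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")
```

For example, bump(parse_version("1.4.2"), "feature") yields version 1.5.0, because a backward-compatible feature increments the minor number and resets the patch number.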

Canary rollouts were essential, especially for user-facing services. We rehearsed rollback procedures in a staging environment using production-mirroring data snapshots, so we could validate that a failed rollout could be recovered in minutes, not hours.
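
The decision at the heart of a canary rollout can be sketched in a few lines. The thresholds below (a minimum of 500 canary requests and a tolerated error-rate increase of one percentage point) are illustrative assumptions, not the values we used, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Request and error counts observed over one comparison window."""

    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    baseline: WindowStats,
    canary: WindowStats,
    min_requests: int = 500,
    max_rate_increase: float = 0.01,
) -> str:
    """Return "wait", "rollback", or "promote" for the current canary window."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic yet to judge the canary
    if canary.error_rate > baseline.error_rate + max_rate_increase:
        return "rollback"  # canary is clearly worse than the stable version
    return "promote"
```

A production setup would usually compare latency as well as errors and delegate this analysis to the rollout tooling, but the shape of the decision is the same.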

Treating Observability as a First-Class Concern

We refused to treat observability as a side job. We instrumented traces with OpenTelemetry, centralized logs in Loki, and built Grafana dashboards for latency, throughput, and error rates. Prometheus handled alerting, based on service-level indicators and error budgets agreed upon with product and support.

This made it possible to hold a clear, shared definition of “healthy” from the moment we launched, rather than chasing metrics reactively after something went wrong.
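
A shared definition of "healthy" often comes down to arithmetic like the following. This is a minimal sketch assuming a request-based availability target; the 99.9% figure and the function name are illustrative, not the numbers we agreed on.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent).

    slo_target is the agreed success ratio, for example 0.999 for "99.9% of requests succeed".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        return 1.0  # no traffic yet, so none of the budget has been spent
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# With a 99.9% target and 1,000,000 requests, about 1,000 failures are allowed.
# 250 failures therefore leave roughly three quarters of the budget.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
```

Alerting on how fast this number falls, rather than on individual errors, keeps pages tied to what product and support actually agreed matters.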

Clarifying Ownership and Collaboration

In a cloud-native build, no one person can own everything. We established a shared architecture decision record (ADR) log, captured in Confluence, so future maintainers would understand design decisions. We also set up domain ownership, with clear escalation points, so changes had a responsible team.

Coding standards, on-call rotations, and even Slack channel escalation practices were defined early. We refused to leave those social contracts until launch day.

Final Thoughts

Cloud-native architecture is powerful, but it is no substitute for intentional design. Breaking down the domain, clarifying ownership, picking tools that match the problem, building in security and observability, and putting real human agreements around ownership: that is the work.

If you do that work upfront, you will avoid expensive replatforming six months later. If you don’t, Kubernetes alone will not save you.

July 21, 2025