James's Ramblings

Systems design

Created: January 05, 2025

Factors to consider when designing a system

When eliciting requirements for or designing a system, you should consider the following factors:

  • Features:

    • Functional requirements: What the system should do.

    • Non-functional requirements: How the system should do it.

  • Scalability:

    • Identify the dimensions along which the application may need to scale (e.g. number of users, request rate, data volume).

    • Align the scalability strategy with the business requirements.

  • Performance:

    • You may wish to consider the performance requirements of the application under different conditions.

    • Latency: Response time requirements.

    • Throughput: Number of transactions/requests/data per second.

    • Resource utilization: Usage of CPU, memory, disk I/O, and network I/O.

  • Availability:

    • Uptime requirements.

  • Durability:

    • Data loss tolerance.

  • Reliability and resiliency:

    • How frequently can the system fail? (Reliability)

    • How quickly does the system need to recover from failures? (Resiliency)

    • Fault tolerance.

    • Redundancy.

    • Interference from malicious third parties.

    • Disaster recovery:

      • Recovery Time Objective (RTO): The maximum amount of time that a service can be down before it starts to impact the business.

      • Recovery Point Objective (RPO): The maximum amount of data that can be lost before it starts to impact the business.

  • Security:

    • Risk tolerance.

    • CIA triad: Confidentiality, Integrity, and Availability.

  • Compliance:

    • Legal and regulatory requirements.

  • Automation:

    • To what extent can the system be automated?

    • Manual intervention requirements.

  • Cost:

    • Capital expenses (CapEx): Upfront costs.

    • Operational expenses (OpEx): Ongoing costs.

    • Total Cost of Ownership (TCO): Sum of CapEx and OpEx.

    • Licensing.

    • Hosting, software, and hardware.

    • Maintenance.

    • Development.

  • Simplicity:

    • What level of tolerance does the client have for complexity?

  • Operability:

    • How easy should it be to run the system?

  • Lifecycle:

    • How long will the system be in use?

    • How often will the system be updated?

  • Maintainability:

    • Ability of stakeholders to maintain the system.

  • Evolvability:

    • How much will the system need to change over time?

  • Database:

    • Data model: Choose a database that fits your data model (e.g. relational, document, key-value, etc).

    • Consistency: Decide between strong consistency and eventual consistency. "Eventual consistency" is a vague term, so specify the exact consistency model you need.

    • Storage engine: Consider the storage engine used by the database.

    • Indexing: Consider the indexing options provided by the database.

    • Each of the top-level requirements should also be considered in the context of the database.

  • Migration:

    • Will migration be necessary in the future?

    • Vendor lock-in: Consider the risk of being locked into a specific vendor.

  • Third-party software, hardware, and services:

    • Ecosystem: Look at the tools, libraries, and community support available.

    • Limitations: Be aware of any limitations or constraints.

    • Support: Consider the level of support provided by the vendor.

    • Documentation: Evaluate the quality of the documentation.

    • Monitoring and observability: Evaluate the monitoring and management tools provided by the vendor.
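Some of the factors above become easier to discuss with a client once they are turned into numbers. The following is a minimal sketch, not a definitive method: it converts an uptime target into a downtime budget and sums CapEx and OpEx into a TCO figure. The function names, the flat yearly OpEx, and the 365-day year are all assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def allowed_downtime_seconds(uptime_percent: float,
                             period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Return the downtime budget implied by an uptime target over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_seconds * (1 - uptime_percent / 100)


def total_cost_of_ownership(capex: float, opex_per_year: float, years: float) -> float:
    """Return upfront costs plus ongoing costs over the expected lifetime.

    Assumes OpEx is flat from year to year, which is a simplification.
    """
    return capex + opex_per_year * years


if __name__ == "__main__":
    hours = allowed_downtime_seconds(99.9) / 3600
    print(f"99.9% uptime allows about {hours:.2f} hours of downtime per year")
    print(f"TCO over 5 years: {total_cost_of_ownership(100_000, 20_000, 5):,.0f}")
```

Under these assumptions, a 99.9% uptime target leaves a budget of roughly 8.76 hours of downtime per year, which is a useful figure to compare against the RTO.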

Software architecture

Patterns

Microservice architecture: A design pattern that structures an application as a collection of loosely coupled services. Often described as a refinement of service-oriented architecture (SOA).

REST: Representational State Transfer. An architectural style for designing networked applications.

Scaling

Vertical scaling

Also called scaling up or shared-memory architecture.

Increase the CPU and memory resources of a single node.

  • Simple, but costs grow faster than linearly: a machine with twice the resources typically costs more than twice as much.
  • Not very fault-tolerant.
  • Specialized hardware may be required.

Horizontal scaling

Also called scaling out or shared-nothing architecture.

Distribute the load across multiple nodes.

  • More fault-tolerant.
  • Costs scale roughly linearly with the number of nodes.
  • More complex to implement.
  • May lower latency for geographically distributed users.
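One common way to distribute load across nodes is to route each request or record by a hash of its key. The sketch below assumes simple hash-mod-N partitioning. The node names and the `node_for_key` helper are hypothetical and not taken from any particular framework.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node for a key using a stable hash (hash-mod-N partitioning)."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Use a cryptographic hash rather than Python's built-in hash(), which is
    # randomized per process for strings and so is not stable across nodes.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    for key in ["user-1", "user-2", "user-3", "user-4"]:
        print(key, "->", node_for_key(key, nodes))
```

Hash-mod-N is the simplest option, but changing the number of nodes remaps most keys. Schemes such as consistent hashing reduce that reshuffling at the cost of extra complexity.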

Actor model and distributed actor frameworks

The actor model is a model for concurrency in a single process. Rather than dealing directly with threads (and the associated problems of race conditions, locking, and deadlock), logic is encapsulated in actors.

Each actor typically represents one client or entity. It may have some local state (which is not shared with any other actor), and it communicates with other actors by sending and receiving asynchronous messages. Each actor can be scheduled independently by the framework.
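To make the idea concrete, here is a toy actor built on the standard library's `threading` and `queue` modules. It is only a sketch of the pattern, not how any real actor framework is implemented. The `CounterActor` class and its `("add", n)` / `("get", reply_queue)` message format are assumptions made for this example.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread draining it in order."""

    _STOP = object()  # sentinel message that shuts the actor down

    def __init__(self) -> None:
        self._count = 0  # local state, only ever touched by the actor's own thread
        self._mailbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message) -> None:
        """Enqueue a message without waiting for it to be processed."""
        self._mailbox.put(message)

    def stop(self) -> None:
        """Ask the actor to finish the messages already queued, then exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            kind, payload = message
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # reply through the sender's queue


def run_demo(increments: int = 100) -> int:
    """Send `increments` add messages, then ask the actor for its count."""
    actor = CounterActor()
    for _ in range(increments):
        actor.send(("add", 1))
    reply: queue.Queue = queue.Queue()
    actor.send(("get", reply))
    result = reply.get()
    actor.stop()
    return result


if __name__ == "__main__":
    print(run_demo())
```

No lock is needed around the counter because only the actor's own thread reads or writes it. Other threads interact with it purely through messages.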

Distributed actor frameworks extend the actor model across multiple nodes.

Considerations

  • Forward and backward compatibility.
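Compatibility matters when nodes may be running different versions of the code at the same time, for example during a rolling upgrade. The sketch below shows one way a message decoder can tolerate both directions. The `UserEvent` schema, its `region` field, and the JSON encoding are hypothetical choices for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class UserEvent:
    user_id: str
    action: str
    region: str = "unknown"  # hypothetical field added in a newer schema version


def decode_event(raw: str) -> UserEvent:
    """Decode a JSON message while tolerating older and newer senders."""
    data = json.loads(raw)
    return UserEvent(
        user_id=data["user_id"],
        action=data["action"],
        # Backward compatibility: messages from older senders lack "region",
        # so fall back to a default instead of failing.
        region=data.get("region", "unknown"),
    )
    # Forward compatibility: any extra keys sent by newer senders are
    # simply ignored because only known fields are read from `data`.


if __name__ == "__main__":
    old_message = '{"user_id": "u1", "action": "login"}'
    newer_message = '{"user_id": "u2", "action": "login", "region": "eu", "device": "mobile"}'
    print(decode_event(old_message))
    print(decode_event(newer_message))
```

Ignoring unknown fields is the simplest approach, but it drops them if the message is re-encoded and forwarded. Whether that is acceptable depends on the system.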