Leveraging Elixir's hot code loading capabilities to modularize a monolithic app

Original link: https://lucassifoni.info/blog/leveraging-hot-code-loading-for-fun-and-profit/

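Alzo is a “services startup” that uses Elixir's hot code loading to add client-specific features to a monolithic application, avoiding the complexity of microservices. Each client gets a dedicated Alzo instance, which serves on-prem needs, open-source goals, and custom app development without an authorization nightmare. Client-specific apps are mostly live multiplayer document editors built with LiveView; they live in a designated directory, which makes them easy to enumerate during development. Crucially, client-specific code is excluded from the core application build to avoid dependency problems. A mix command packages client apps for upload, recompilation, and dynamic loading. Companion processes are registered with a DynamicSupervisor. Hot code *loading* is used, but not hot code *upgrades* (with state migration), because adjustments are driven by client requests. More complex services are handled as separate applications that communicate through a shared message router. This approach makes it easy to refactor common behaviour out of client apps and into the core codebase, which is a challenge with microservices.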

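The Hacker News discussion revolves around an article about using Elixir's hot code loading to modularize a monolithic app. The author uses it for client-specific customizations that are selectively loaded at runtime, rather than for updating the core application. Commenters discuss the pros and cons of Elixir, highlighting its suitability for distributed systems and small teams. Some mention that the ecosystem is smaller than Java/Kotlin's, which leaves certain libraries missing. Others point to Elixir's strengths in handling concurrent workloads and deploying zero-downtime updates. The conversation also touches on hot code reloading versus live reloading. Elixir's hot loading allows code to be updated without losing state, preserving queued messages and letting LiveView transition smoothly. While blue-green deployments are common, hot patching remains valuable for high availability and specific use cases such as real-time music generation.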

Original article

Jun 21 2025

My “services startup” Alzo is an Elixir monolithic app that gets deployed with 1 instance per client.

In this post we will see how Elixir’s and the Erlang VM’s hot code loading capabilities help me build client-specific features while maintaining a coherent codebase and avoiding a microservices-like situation with cascading failures or complex testing situations.

If you come from outside Elixir/Erlang, the BEAM VM (Erlang’s virtual machine) lets you load compiled modules into a running system, and also provides the ability to compile files at runtime. This lets you add code and behaviour at runtime, or even replace it, without stopping a running system.
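To make this concrete, here is a minimal sketch of compiling and replacing a module at runtime. It only uses Code.compile_string/1 from Elixir's standard library; Sketch.Greeter is a made-up module name for illustration and has nothing to do with Alzo's code.

```elixir
# Redefining a module normally prints a warning; silence it for this demo.
Code.put_compiler_option(:ignore_module_conflict, true)

# Two versions of the same (hypothetical) module, as source text.
v1 = """
defmodule Sketch.Greeter do
  def hello, do: "hello from v1"
end
"""

v2 = """
defmodule Sketch.Greeter do
  def hello, do: "hello from v2"
end
"""

# Compile and load version 1 into the running VM.
[{Sketch.Greeter, _bytecode}] = Code.compile_string(v1)
# apply/3 is used because the module does not exist when this script is compiled.
first = apply(Sketch.Greeter, :hello, [])

# Compile version 2: the VM swaps the module without stopping anything.
[{Sketch.Greeter, _bytecode}] = Code.compile_string(v2)
second = apply(Sketch.Greeter, :hello, [])

IO.inspect({first, second})
```

The second call goes to the new code because fully qualified calls always dispatch to the latest loaded version of a module.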

I’ve wanted to write this one for a long time now, but had to face the difficulty of not writing everything I wanted to, because this post would be too long to even be considered for reading. I guess I really wore into Elixir and the BEAM with time and feel the desire to talk about it, showcase interesting parts of it… This syndrome seems widespread in our community.

1 instance per client

The rationale behind that choice comes from the following goals I set for myself:

I want Alzo to be able to easily run on-prem at clients that desire it
I want to open-source it when it has stabilized enough, and I would like people to be able to use it without the hassle of managing a multi-tenant platform
I want to be able to easily build client-specific apps (what clients actually want of my platform) on top of Alzo without going into an authorization / isolation nightmare.
I want people that would use the open-source version to be able to build their own tools on top of it, without building a full-fledged separate service, nor changing the mainline code.

Client-specific apps

Client-specific apps are mostly live multiplayer document editors built to accelerate existing workflows. See this presentation at the Belgian Elixir meetup for a few examples.

These apps are LiveView applications that bundle UIs as well as behaviour on top of Alzo’s generic primitives. They can leverage background processes and actors by registering themselves at startup with a DynamicSupervisor.

Client-specific apps have a few characteristics code-wise:

They all live in /alzo/lib/clients/apps/<client_name>/<app_name>, allowing me to easily enumerate them.
Their entrypoint module has a name that ends with AppEntry. So a typical dynamic app’s main module will be Alzo.Clients.Apps.ClientX.AppYAppEntry. This allows me to easily enumerate them in development by finding them in the code server.
The entry module is a LiveComponent that gets mounted by Alzo.ApplicationLive at app open.
They implement Alzo.Clients.DynamicApp, a behaviour allowing me to route to them in a dev environment.
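The naming convention and the behaviour together make enumeration cheap. As a rough sketch, assuming a stand-in behaviour with a single invented callback (the real callbacks of Alzo.Clients.DynamicApp are not shown in this post), the entry modules could be found among the modules known to the code server like this:

```elixir
defmodule Sketch.DynamicApp do
  # Hypothetical stand-in for Alzo.Clients.DynamicApp; this callback is invented.
  @callback app_name() :: String.t()
end

defmodule Sketch.Clients.Apps.ClientX.AppYAppEntry do
  # Example entry module following the "...AppEntry" naming convention.
  @behaviour Sketch.DynamicApp

  @impl true
  def app_name, do: "App Y"
end

defmodule Sketch.AppFinder do
  @suffix "AppEntry"

  # Lists loaded modules whose name ends with the suffix
  # and that declare the dynamic app behaviour.
  def entry_modules do
    for {mod, _loaded_from} <- :code.all_loaded(),
        String.ends_with?(Atom.to_string(mod), @suffix),
        implements_dynamic_app?(mod),
        do: mod
  end

  defp implements_dynamic_app?(mod) do
    behaviours =
      mod.module_info(:attributes)
      |> Keyword.get_values(:behaviour)
      |> List.flatten()

    Sketch.DynamicApp in behaviours
  end
end

IO.inspect(Sketch.AppFinder.entry_modules())
```

Note that :code.all_loaded/0 only reports modules that are already loaded, so in a lazily loading dev environment a real implementation may need to load the application's modules first.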

ApplicationLive, lightly simplified, looks like this:

def render(assigns) do
  ~H"""
  <div>
    <.live_component
      module={@dynamic_module}
      id={@dynamic_id}
      {assigns}
    >
    </.live_component>
  </div>
  """
end

def mount(%{"serve_url" => url} = params, _, socket) do
  case get_load_target(url) do
    :error ->
      {:error, assign(socket, :original_params, params)}

    {:ok, :sideloaded, result} ->
      mount_with_sideloaded_app(result, assign(socket, :original_params, params))

    {:ok, :dynamic_loaded, result} ->
      mount_dynamic(result, assign(socket, :original_params, params))
  end
end

To allow those dynamically mounted LiveComponents to receive messages, ApplicationLive provides a few convenience helpers. If you have already used LiveComponents, you will recognize the send_update/update dance, which comes from the fact that LiveComponents are not processes but live in their parent LiveView’s process.

def handle_info({:___live_app_message, message}, socket) do
  send_update(socket.assigns.dynamic_module, id: socket.assigns.dynamic_id, message: message)
  {:noreply, socket}
end

def notify(pid, message) do
  send(pid, {:___live_app_message, message})
end
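The relay can be modelled without LiveView at all. The sketch below is a dependency-free analogy, not LiveView code: a GenServer stands in for the parent LiveView process, and a plain module with an update/2 function stands in for the LiveComponent. All Sketch.* names are invented for illustration.

```elixir
defmodule Sketch.Component do
  # Stand-in for a LiveComponent: plain functions over state, no process of its own.
  def update(%{message: message}, state) do
    Map.update(state, :inbox, [message], &[message | &1])
  end
end

defmodule Sketch.Parent do
  # Stand-in for the parent LiveView: the only process that can receive messages.
  use GenServer

  def start_link(component), do: GenServer.start_link(__MODULE__, component)

  # Same shape as the notify/2 helper above: a plain send to the parent.
  def notify(pid, message), do: send(pid, {:___live_app_message, message})

  def component_state(pid), do: GenServer.call(pid, :component_state)

  @impl true
  def init(component), do: {:ok, %{component: component, component_state: %{}}}

  @impl true
  def handle_info({:___live_app_message, message}, state) do
    # The parent forwards the message to the component's update function.
    new_state = state.component.update(%{message: message}, state.component_state)
    {:noreply, %{state | component_state: new_state}}
  end

  @impl true
  def handle_call(:component_state, _from, state) do
    {:reply, state.component_state, state}
  end
end

{:ok, parent} = Sketch.Parent.start_link(Sketch.Component)
Sketch.Parent.notify(parent, :ping)
IO.inspect(Sketch.Parent.component_state(parent))
```

Because the component has no mailbox, anything addressed to it has to land in the parent's mailbox first, which is what the tagged tuple is for.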

Build-time isolation

I do not want any of a client’s specific code and behaviour to be included in my app build. This means I need to remove them all before build time. And that is exactly what I do.

In CI, the dynamic apps are just part of the codebase, so they get tested like any other code in Alzo, with a simple mix test (well, it’s ./test.sh, but you get it). Just after tests pass, I completely remove all of their code by running:

rm -rf alzo/lib/clients/apps
rm -rf alzo/test/alzo/clients/apps

Then the Docker image gets built.

A few other interesting properties emerge from this rule:

The main app can never depend on a runtime app’s code. If it did, the build would simply break.
Dynamic apps can never depend on the presence of another dynamic app, because there is no guarantee of its presence.
What would be integration tests between a main app and child microservices become regular LiveView tests and unit tests in a monolithic codebase.
If I accidentally try to change an API in Alzo on which dynamic apps depend, they break in testing on my dev machine, not in production.

What I do not test:

Erlang’s hot code loading mechanism
LiveView itself
DynamicSupervisors themselves

Run-time loading

I package client-specific apps with a mix command :

mix alzo.app.package <client>/<app>

This produces a tarball of the app that I can simply upload from my instance super-administrator panel. In this panel, I set the following attributes :

App kind (liveview, or legacy, from when I hot-loaded JS SPAs a few years ago)
Desired routing path
App icon, description, name
Capabilities

This gets persisted in the DB, pointing to the app code on the filesystem.
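The packaging step itself is not shown in this post. As a hedged sketch of what such a step could do, the hypothetical helper below collects the .ex files of one app directory into a gzipped tarball with Erlang's :erl_tar; the module name, function name, and archive layout are all assumptions.

```elixir
defmodule Sketch.Packager do
  # Hypothetical helper: bundles every .ex file under app_dir into out_path (.tar.gz).
  def package(app_dir, out_path) do
    files =
      app_dir
      |> Path.join("**/*.ex")
      |> Path.wildcard()
      |> Enum.map(fn path ->
        # {name inside the archive, file on disk}, both as charlists for :erl_tar.
        {String.to_charlist(Path.relative_to(path, app_dir)), String.to_charlist(path)}
      end)

    case :erl_tar.create(String.to_charlist(out_path), files, [:compressed]) do
      :ok -> {:ok, out_path}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Storing paths relative to the app directory keeps the archive independent of where the repository was checked out.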

When an app gets uploaded, its code is recompiled. When Alzo starts, it also loads all dynamic apps registered in DB and compiles their files. If any declares needing a companion process, it registers it with the DynamicApp-specific DynamicSupervisor.
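A minimal sketch of that load step, under stated assumptions: the app's files sit in one directory, each file compiles on its own, and an app signals its need for a companion process through an invented optional function named companion_spec/0. None of these names come from Alzo.

```elixir
defmodule Sketch.Loader do
  # Compiles and loads every .ex file under dir, then starts companion processes.
  # Returns the list of modules that were loaded.
  def load_app(dir, supervisor) do
    modules =
      dir
      |> Path.join("**/*.ex")
      |> Path.wildcard()
      # Code.compile_file/1 compiles the file and loads its modules into the VM.
      |> Enum.flat_map(&Code.compile_file/1)
      |> Enum.map(fn {mod, _bytecode} -> mod end)

    modules
    # companion_spec/0 is a hypothetical convention, not an Alzo API.
    |> Enum.filter(&function_exported?(&1, :companion_spec, 0))
    |> Enum.each(fn mod ->
      {:ok, _pid} = DynamicSupervisor.start_child(supervisor, mod.companion_spec())
    end)

    modules
  end
end
```

Files that depend on each other would need an order-aware compiler pass instead of one Code.compile_file/1 call per file; that is left out here to keep the sketch short.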

Hot code upgrades

What I did not write about here is hot code upgrades. They are inconsequential because of the nature of my business. The dynamic apps serve specific business purposes, so their behavior only changes if a client’s team requests adjustments or upgrades. So there is no code upgrade at random points in time, and no one gets surprised by in-flight state upgrades. This is why, when asked, I explain that I only use hot code loading, not hot code reloading.

More complex child services

Of course, and as outlined in the presentation at the Belgian Elixir meetup, some child services are simply better as external, full-featured apps, in green on this diagram.

They talk to Alzo’s public APIs through a message router which is part of the same monolithic codebase, but deployed separately. This allows the API to work both in a request/response fashion for instances on the public internet, and in a request/mailbox/poll/response fashion for instances deployed on-prem in a server closet and not publicly routable.
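The mailbox half of that design can be pictured with a small in-memory model. This is only an illustration of the request/mailbox/poll idea, with invented names, and says nothing about how Alzo's router is actually built: requests for an unreachable instance are queued, and the instance drains its queue whenever it polls.

```elixir
defmodule Sketch.Mailbox do
  # Hypothetical in-memory mailbox keyed by instance id.
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(fn -> %{} end)

  # Router side: queue a request for an instance that cannot be reached directly.
  def enqueue(mailbox, instance_id, request) do
    Agent.update(mailbox, fn state ->
      Map.update(state, instance_id, [request], &(&1 ++ [request]))
    end)
  end

  # Instance side: fetch and clear all pending requests, oldest first.
  def poll(mailbox, instance_id) do
    Agent.get_and_update(mailbox, fn state ->
      Map.pop(state, instance_id, [])
    end)
  end
end

{:ok, mailbox} = Sketch.Mailbox.start_link()
Sketch.Mailbox.enqueue(mailbox, "closet-1", {:render, 42})
IO.inspect(Sketch.Mailbox.poll(mailbox, "closet-1"))
```

The on-prem instance only ever makes outbound calls in this model, which is why it works without being publicly routable.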

Conclusion

I was a bit afraid of hot code loading three years ago when I started building. This mechanism has a reputation for being scary in the Elixir community. After reading Erlang Programming and Designing for Scalability with Erlang/OTP I became convinced that this was just another tool in the VM’s toolbelt and that it looked suspiciously fitting for my use-case.

The real problems come with state upgrades, but the VM and OTP have tools to deal with them. I chose to remove this complexity by not doing state upgrades.

Three years in, and a few dozen dynamic apps written, I feel this was the best choice given the tools I had on hand. Having the dynamic apps in the monolithic codebase also means that behaviour that starts specific but repeats itself, or shows enough hints of genericity over time, can really easily be refactored out of the dynamic apps and into the main codebase. With separate microservices, it would have been repeated, or extracted to a private library with the added tooling and updates that come with it.