The case for using Elixir try/rescue
Introduction
According to the Elixir doc, it is rare to need to use try/rescue. That’s partly because the elixir/erlang philosophy of letting it crash.
Recently I ran into a case where using try/rescue makes a lot of sense.
The scenario goes like this: We use Elixir/Phoenix to render the front end from the server side. For a page, we have view helpers, something like:
def render_some_view(params) do
# Convert params into a %ApiParams{} struct
transformed_params = transformed_params(params)
# Fetch data from api
api_data = call_api!(transformed_params)
render('partial.html', api_data)
end
The crash
What happened was that, the other team updated the struct of ApiParams
. It subtly broke the call_api!/1
call because it didn’t have a fallback function to
# :field1 and :field2 would not show up together any more
def call_api!(%{ApiParams{field1: v1, field2: v2}) do
# ...
end
As a result, the function returns a FunctionClauseError
and crashes the view helper.
If this was a back-end process, it would fail and restart the server process. If error reporting is setup properly, dev teams would get notified.
What makes things worse is that, this view helper is used in a landing page to render a section. When this failed, it crashed the entire page, even if all other sections are fine.
Now the business is asking: Can we put something in place to prevent the landing page from crashing?
One section failed? That’s fine. Just hide it.
This is where the try/rescue
setup comes in:
# Rename to bang function
def render_some_view!(params) do
# ...
end
def safe_render_some_view(params) do
try do
render_some_view!(params)
rescue
e in RuntimeError ->
# Notify error reporting service, e.g., Bugsnag
notify_reporting_service(e)
render_fallback_view()
end
end
This way, we 1) get notified about the failure, 2) tolerate bad views changes without crash the high-stakes pages.
It’s an anti-pattern?
In some sense, yes.
The try/rescue can seem a bit over-defensive, which is a smell, or indicator of dysfunctional teams.
You can argue that, if we had better test coverage, this kind of errors should not happen.
I argue that, when team is large, and the code is evolving fast, putting things in place is necessary, especially for high stake pages. It buys us time and reduces the urgency when things break.
Alternatives
You can use return tuples, e.g., {:ok|:error, _}
to inform the callsites about the return status. In our case, we have the bang API call (which isn’t always controlled by our team). Also, changing to return tuples required touching all render functions which requires some work.
Although it seems heavy-handed, it is a good defence to frequent code changes to the view by multiple teams.