Type-Safe Dynamic Mappings in Swift
We discovered a neat little pattern for statically validating config files, while still providing dynamic key-value access via Swift key paths.
The Problem
In our previous post, we evaluated many embedding models for sentiment analysis of product reviews. Under the hood, we developed a small but general embedding framework that accepts models in a variety of formats, and it can select any of the supported models based on a dynamic descriptor. For example, we support Model2Vec and Fastembed. A simplified set of descriptors might look like this in JSON:
{
    "mrlStaticMulti": {
        "backend": "fastembed",
        "path": "/path/to/MRLStaticMultilingual"
    },
    "potionRetrieval32M": {
        "backend": "model2vec",
        "path": "/path/to/PotionRetrieval32M"
    }
}
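and the Swift type for the descriptor of an individual model might be:
// Codable so that JSONDecoder can decode it; this assumes that
// EmbeddingModelBackend is itself Codable.
struct EmbeddingModelDescriptor: Codable {
    let backend: EmbeddingModelBackend
    let path: String
}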
The question now is: how can we address these models dynamically, yet in a type-safe way?
First Attempt: The Simpleton
The "obvious" choice for dynamic key-value lookup is a dictionary, mapping model names (strings) to model descriptors:
// each value is the descriptor of a single model
let models = try JSONDecoder().decode(
    [String: EmbeddingModelDescriptor].self,
    from: jsonData
)
However, this comes at a price: even though the set of models is well-known and completely static, and so are their names, there is no way for us to safely address them. Dictionary lookups can fail, for example, if you make a spelling error in the model name, or change the name in the JSON file but forget to do so in the code that refers to it. So this approach is less than ideal.
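To make the failure mode concrete, here is a minimal sketch of the dictionary approach. The definition of EmbeddingModelBackend, the inline JSON constant, and the helper names loadModelDictionary and lookUp are assumptions made for illustration; they are not taken from the framework itself.

```swift
import Foundation

// Hypothetical stand-in for the framework's backend type.
enum EmbeddingModelBackend: String, Codable {
    case fastembed
    case model2vec
}

struct EmbeddingModelDescriptor: Codable {
    let backend: EmbeddingModelBackend
    let path: String
}

let dictionaryJSON = Data("""
{
  "mrlStaticMulti": { "backend": "fastembed", "path": "/path/to/MRLStaticMultilingual" },
  "potionRetrieval32M": { "backend": "model2vec", "path": "/path/to/PotionRetrieval32M" }
}
""".utf8)

func loadModelDictionary() throws -> [String: EmbeddingModelDescriptor] {
    try JSONDecoder().decode([String: EmbeddingModelDescriptor].self, from: dictionaryJSON)
}

// Every lookup yields an Optional. A misspelled key compiles without
// complaint and only shows up as `nil` at runtime.
func lookUp(_ name: String) throws -> EmbeddingModelDescriptor? {
    try loadModelDictionary()[name]
}
```

Calling lookUp("mrlStaticMutli") (note the transposed letters) compiles just as happily as the correct spelling, which is exactly the problem.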
Second Attempt: The Rigorous
So what's the next best choice, one which can enforce that you never make a typo in the key name?
You guessed it: a struct. Statically-typed structs have field names, and the compiler always validates accessing them. So our next attempt is a struct, which simply spells out the name of each model as a separate field, and each field has the same descriptor type:
struct EmbeddingModelCollection: Codable {
    let mrlStaticMulti: EmbeddingModelDescriptor
    let potionRetrieval32M: EmbeddingModelDescriptor
}

let models = try JSONDecoder().decode(
    EmbeddingModelCollection.self,
    from: jsonData
)
Okay, that's better: now we are guaranteed that we only ever access a model that actually exists, and JSON decoding will also validate the field names against the keys of the JSON object. However, we have now lost a different, useful property: the ability to address the models dynamically, based on runtime information.
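As a small illustration of the decoding-time validation, the sketch below feeds the struct a JSON document in which one model has been renamed by mistake. The EmbeddingModelBackend definition, the renamedJSON constant, and the missingKey(in:) helper are hypothetical and exist only for this example.

```swift
import Foundation

// Hypothetical stand-in for the framework's backend type.
enum EmbeddingModelBackend: String, Codable { case fastembed, model2vec }

struct EmbeddingModelDescriptor: Codable {
    let backend: EmbeddingModelBackend
    let path: String
}

struct EmbeddingModelCollection: Codable {
    let mrlStaticMulti: EmbeddingModelDescriptor
    let potionRetrieval32M: EmbeddingModelDescriptor
}

// The second model's key has lost its "32M" suffix.
let renamedJSON = Data("""
{
  "mrlStaticMulti": { "backend": "fastembed", "path": "/a" },
  "potionRetrieval": { "backend": "model2vec", "path": "/b" }
}
""".utf8)

/// Returns the name of the key that was missing during decoding, if any.
func missingKey(in data: Data) -> String? {
    do {
        _ = try JSONDecoder().decode(EmbeddingModelCollection.self, from: data)
        return nil
    } catch DecodingError.keyNotFound(let key, _) {
        return key.stringValue
    } catch {
        return nil
    }
}
```

The mismatch is reported when the config is loaded, rather than at some later lookup. What the struct cannot do on its own is pick a field based on a string that is only known at runtime.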
So what can we do to have our cake and eat it too?
Final Solution: The Best of Both Worlds
The solution is actually pretty straightforward. We'll need to create a type that can translate itself into a KeyPath dynamically, which in turn we can use to look up any field in our struct via Swift's built-in subscripting support. (We can also use this to mutate fields if we declare all fields as var and use WritableKeyPath instead of a plain KeyPath.)
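For readers less familiar with key paths, here is a minimal sketch of the mechanism in isolation. The Limits type and the helper functions are hypothetical and unrelated to the embedding framework; they only show that a key path is an ordinary value that can be chosen at runtime and then used for reading or, in the writable case, for mutation.

```swift
// Hypothetical miniature type, used only to illustrate key-path subscripting.
struct Limits {
    var low: Int
    var high: Int
}

// Maps a runtime string to a key path. This step is fallible,
// because the string may not name any field.
func keyPath(named name: String) -> WritableKeyPath<Limits, Int>? {
    switch name {
    case "low": return \Limits.low
    case "high": return \Limits.high
    default: return nil
    }
}

// Writes through a WritableKeyPath; requires the fields to be `var`.
func raise(_ limits: inout Limits, field name: String, by amount: Int) -> Bool {
    guard let path = keyPath(named: name) else { return false }
    limits[keyPath: path] += amount
    return true
}

// Reading only needs a plain KeyPath.
func read(_ limits: Limits, at path: KeyPath<Limits, Int>) -> Int {
    limits[keyPath: path]
}
```

Once a key path value exists, the subscript itself cannot fail; all the uncertainty is concentrated in the string-to-key-path translation.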
The good news is: we already have a candidate for such a type! It's just Self.CodingKeys. By conforming our CodingKeys enum to the set of relevant protocols, we can make it dynamically decodable from strings, iterate over its cases, and use it to get the appropriate key path, like this:
struct EmbeddingModelCollection: Codable {
    let mrlStaticMulti: EmbeddingModelDescriptor
    let potionRetrieval32M: EmbeddingModelDescriptor

    enum CodingKeys: String, CodingKey, Codable, CaseIterable {
        case mrlStaticMulti
        case potionRetrieval32M
    }

    // notice that this is infallible - it unconditionally
    // returns a descriptor, it is not an `Optional`
    subscript(codingKey: CodingKeys) -> EmbeddingModelDescriptor {
        self[keyPath: codingKey.keyPath]
    }
}

extension EmbeddingModelCollection.CodingKeys {
    var keyPath: KeyPath<
        EmbeddingModelCollection, EmbeddingModelDescriptor
    > {
        switch self {
        case .mrlStaticMulti: \.mrlStaticMulti
        case .potionRetrieval32M: \.potionRetrieval32M
        }
    }
}
We can also add a couple more protocols than strictly necessary, e.g. Equatable and Sendable, in order to make CodingKeys easier to work with, so it feels more like the built-in String.
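As a usage sketch, the following shows how such a collection might be addressed with names that are only known at runtime. The EmbeddingModelBackend definition, the ModelKey alias, the inline JSON, and the helpers modelPath(named:) and decodeKeys(_:) are assumptions introduced for this example. The key path switch uses explicit return statements here purely so that the sketch also compiles on toolchains older than Swift 5.9.

```swift
import Foundation

// Hypothetical stand-in for the framework's backend type.
enum EmbeddingModelBackend: String, Codable { case fastembed, model2vec }

struct EmbeddingModelDescriptor: Codable {
    let backend: EmbeddingModelBackend
    let path: String
}

struct EmbeddingModelCollection: Codable {
    let mrlStaticMulti: EmbeddingModelDescriptor
    let potionRetrieval32M: EmbeddingModelDescriptor

    enum CodingKeys: String, CodingKey, Codable, CaseIterable {
        case mrlStaticMulti
        case potionRetrieval32M

        var keyPath: KeyPath<EmbeddingModelCollection, EmbeddingModelDescriptor> {
            switch self {
            case .mrlStaticMulti: return \.mrlStaticMulti
            case .potionRetrieval32M: return \.potionRetrieval32M
            }
        }
    }

    subscript(codingKey: CodingKeys) -> EmbeddingModelDescriptor {
        self[keyPath: codingKey.keyPath]
    }
}

typealias ModelKey = EmbeddingModelCollection.CodingKeys

let collectionJSON = Data("""
{
  "mrlStaticMulti": { "backend": "fastembed", "path": "/path/to/MRLStaticMultilingual" },
  "potionRetrieval32M": { "backend": "model2vec", "path": "/path/to/PotionRetrieval32M" }
}
""".utf8)

/// Looks up a model path by a name that is only known at runtime.
func modelPath(named name: String) throws -> String? {
    let models = try JSONDecoder().decode(EmbeddingModelCollection.self, from: collectionJSON)
    // The only fallible step: turning an arbitrary string into a known key.
    guard let key = ModelKey(rawValue: name) else { return nil }
    // From here on, the lookup cannot fail.
    return models[key].path
}

/// Keys can also arrive as JSON, e.g. a list of models to load.
func decodeKeys(_ json: String) throws -> [ModelKey] {
    try JSONDecoder().decode([ModelKey].self, from: Data(json.utf8))
}
```

The design choice worth noticing is that validation happens once, at the boundary where a string becomes a ModelKey. Everything downstream works with a value the compiler knows to be valid, and adding a field to the struct without extending the keyPath switch is a compile error, because the switch must be exhaustive.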
The Extra Bit: Total Recall
If you followed this post super carefully, you might have noticed that we introduced yet another subtle problem by using a struct instead of a dictionary: we are no longer guaranteed to have all of our JSON data represented in the top-level struct. If we have a superset of the struct fields in our JSON config file, then the extra keys will simply be ignored. How can we mitigate this issue, and ensure that all JSON keys are mapped to a field of our struct?
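A quick way to see the problem is the following minimal sketch. The Pair type, the supersetJSON constant, and the decodesDespiteExtraKey helper are hypothetical and stand in for the model collection and its config file.

```swift
import Foundation

// Hypothetical two-field struct standing in for the model collection.
struct Pair: Decodable {
    let first: String
    let second: String
}

// The JSON has one more key than the struct has fields.
let supersetJSON = Data("""
{ "first": "a", "second": "b", "third": "c" }
""".utf8)

/// Returns true if decoding succeeds even though "third" has no matching field.
func decodesDespiteExtraKey() -> Bool {
    (try? JSONDecoder().decode(Pair.self, from: supersetJSON)) != nil
}
```

Decoding succeeds and the value under "third" is dropped without any diagnostic, so a model added to the config but not to the struct would go unnoticed.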
Unfortunately, the automatically synthesized Decodable conformance does not ensure completeness of the data model, and JSONDecoder has no public API for enforcing such a property either. Thus, we will have to write a bit of code manually. Fortunately, though, we do not have to (and therefore should not!) copy-paste the entire init(from: Decoder) function into every single type. We can devise a generic wrapper type instead, use it for the bulk of the validation logic, and then just defer to the compiler-synthesized implementation of Decodable.
The essence of this approach is included below. If you are curious to see a complete, working example, you can find it in this gist.
public protocol TotalCodingKey: Sendable, Equatable, Hashable,
    CaseIterable, CodingKey {}

public struct TotalDecodable<Collection, CodingKey> {
    var inner: Collection
}
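// Inside this extension, `CodingKey` is the generic parameter of
// TotalDecodable, not the standard library protocol of the same name.
extension TotalDecodable: Decodable
where
    Collection: Decodable,
    CodingKey: TotalCodingKey
{
    public init(from decoder: any Decoder) throws {
        // AnyCodingKey accepts every string, so `allKeys` reports
        // all keys actually present in the JSON object
        let keyedContainer = try decoder.container(
            keyedBy: AnyCodingKey.self
        )
        // keys present in the JSON but absent from the key enum
        let extraKeys = Set(keyedContainer.allKeys.map {
            $0.stringValue
        }).subtracting(
            CodingKey.allCases.map { $0.stringValue }
        )
        guard extraKeys.isEmpty else {
            // see the definition of ExtraKeysError in the full gist
            throw ExtraKeysError(extraKeys: extraKeys.sorted())
        }
        // defer to the (usually synthesized) Decodable implementation
        self.inner = try Collection.init(from: decoder)
    }
}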
private struct AnyCodingKey: CodingKey, Equatable, Hashable {
    let stringValue: String
    let intValue: Int? = nil

    init?(intValue: Int) {
        return nil
    }

    init?(stringValue: String) {
        self.stringValue = stringValue
    }
}
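To show how the wrapper might be used, here is a self-contained sketch under a few stated assumptions. ExtraKeysError is a hypothetical minimal definition (the real one lives in the gist), the Ports type with its keys is invented for the demonstration, and the generic parameters are renamed to Wrapped and Key so that they do not shadow the standard library's Collection and CodingKey. The helpers decodePorts and unexpectedKeys are likewise illustrative.

```swift
import Foundation

protocol TotalCodingKey: Hashable, CaseIterable, CodingKey {}

// Hypothetical error type; the article's actual definition is in its gist.
struct ExtraKeysError: Error {
    let extraKeys: [String]
}

// A coding key that accepts any string, so `allKeys` reports every JSON key.
struct AnyCodingKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }
    init?(intValue: Int) { return nil }
    init?(stringValue: String) { self.stringValue = stringValue }
}

struct TotalDecodable<Wrapped: Decodable, Key: TotalCodingKey>: Decodable {
    var inner: Wrapped

    init(from decoder: any Decoder) throws {
        let container = try decoder.container(keyedBy: AnyCodingKey.self)
        // Keys present in the JSON object but not declared in `Key`.
        let extraKeys = Set(container.allKeys.map(\.stringValue))
            .subtracting(Key.allCases.map(\.stringValue))
        guard extraKeys.isEmpty else {
            throw ExtraKeysError(extraKeys: extraKeys.sorted())
        }
        // Defer to the synthesized Decodable implementation.
        self.inner = try Wrapped(from: decoder)
    }
}

// Hypothetical miniature collection, used only for this demonstration.
struct Ports: Decodable {
    let http: Int
    let https: Int

    enum CodingKeys: String, CodingKey, CaseIterable, TotalCodingKey {
        case http
        case https
    }
}

func decodePorts(_ json: String) throws -> Ports {
    try JSONDecoder()
        .decode(TotalDecodable<Ports, Ports.CodingKeys>.self, from: Data(json.utf8))
        .inner
}

/// Returns the unexpected keys reported for `json`, or an empty array if none.
func unexpectedKeys(in json: String) -> [String] {
    do {
        _ = try decodePorts(json)
        return []
    } catch let error as ExtraKeysError {
        return error.extraKeys
    } catch {
        return []
    }
}
```

With this in place, a config containing an unmapped "ftp" entry is rejected with the offending key named, while a config matching the struct exactly decodes as before. Missing keys are still caught by the synthesized implementation, so the two checks together make the mapping total in both directions.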