Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking for duplicate keys #1112

Open
jfehr67 opened this issue Feb 23, 2024 · 3 comments
Open

Checking for duplicate keys #1112

jfehr67 opened this issue Feb 23, 2024 · 3 comments

Comments

@jfehr67
Copy link

jfehr67 commented Feb 23, 2024

By default, serde json will parse a json stream into a Value and for any duplicate keys, simply use the last value. (Just calls the insert function in the Map<String, Value>, which does this.)

In our use case, we sometimes have quite large json files that multiple developers add entries to. The entire json isn't a struct, but we do pull out portions which we then deserialize to structs. I'd like to be able to get an error if there are duplicate keys so the developer knows to update the original entry.

The optimal place would be in the deserialize function in map.rs, ie:

               while let Some((key, value)) = tri!(visitor.next_entry()) {
                   if (values.contains_key(key)) {
                        return Err(de::Error::duplicate_field(key));
                    }
                    values.insert(key, value);
                }

Would this be a reasonable solution? I'm not sure how to add/pass parameters to the deserializer that would enforce/not enforce this and I'm sure this isn't behavior everyone would want, but I'd prefer to contribute this back to the repo instead of keeping modified fork.

@jfehr67
Copy link
Author

jfehr67 commented Feb 27, 2024

Submitted a PR here: #1113

@ojob
Copy link

ojob commented Oct 21, 2024

Interesting check!

Looking at #1113, the check can be activated at compile-time.

But according to this question on StackOverflow, duplicate keys are not forbidden by standards, even if not recommended. If I understand the discussion correctly, the interpretation of whether a duplicate key is an issue or not is subject-dependent.

So maybe the check should be activated at runtime?

@Jisu-Woniu
Copy link

Jisu-Woniu commented Nov 20, 2024

I just came up with some new ideas:

Is it possible to deserialize a JSON object into a multimap, or let the user customize how to “merge” the old and new values with a closure like:

| old: Value, new: Value | {
    match (old, new) {
        (Value::Array(v1), Value::Array(v2)) => { v1.extend(v2); Value::Array(v1) }
        _ => new,
    }
}

So I don't need to implement my own Deserializer, which can take many lines of code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants