Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds jv_unshare(), jv_is_unshared() #3109

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 94 additions & 0 deletions src/jq_test.c
Original file line number Diff line number Diff line change
Expand Up @@ -490,4 +490,98 @@ static void jv_test() {
//jv_dump(jv_copy(o2), 0); printf("\n");
jv_free(o2);
}

{
assert(jv_equal(
jv_paths(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to use an assert() with jv_paths()? Why not just assert that the two values are equal?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm not saying I don't want jv_paths(), I'm just saying that I think we don't need jv_paths() for testing jv_unshared(), so jv_paths() -if we need it at all- can be in a separate PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did it out of convenience since I iterate through every path in the unshared value to see whether it really has no shared memory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be faster to just recurse in jv_is_unshared().

jv_parse("{"
"\"key\":["
"{\"this\":{\"some\":true}},"
"false,"
"null,"
"10,"
"[true]"
"]"
"}")
),
jv_parse("["
"[\"key\"],"
"[\"key\", 0],"
"[\"key\", 0, \"this\"],"
"[\"key\", 0, \"this\", \"some\"],"
"[\"key\", 1],"
"[\"key\", 2],"
"[\"key\", 3],"
"[\"key\", 4],"
"[\"key\", 4, 0]"
"]")
));

}

{
assert(jv_equal(
jv_addpath(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think jv_addpath() is the same as jv_setpath(), so let's drop it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it's not

jv_setpath() replaces the entire path with the value you specify
jv_addpath() overwrites in the path you specify the keys specified in the replace value keeping keys that are not mentioned in the replace value intact

jv_parse("{"
"\"key\":{"
"\"some\":{"
"\"test\":\"value\""
"},"
"\"other\":\"thing\""
"}"
"}"),
JV_ARRAY(jv_string("key")),
jv_parse("{"
"\"some\":{"
"\"test\":\"other\""
"},"
"\"added\":\"thing\""
"}")
),
jv_parse("{"
"\"key\":{"
"\"some\":{"
"\"test\":\"other\""
"},"
"\"other\":\"thing\","
"\"added\":\"thing\""
"}"
"}")
)
);

}

{
jv initial = jv_parse("{\"test\":[{\"some\":\"value\"}, 1, true, false, null]}");
jv new = jv_unshare(jv_copy(initial));

assert(jv_equal(jv_copy(initial), jv_copy(new)));
assert(!jv_identical(jv_copy(initial), jv_copy(new)));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to have a jv_unshared() function that returns true if the given value is recursively unshared in that every value down from it has jv_get_refcount() of 1. Such a jv_unshared() function would have to not consume a reference to its argument, naturally, or maybe it should check that the top-level value has 2 refcounts and all values below have 1 refcounts.

Then we could use jv_getpath() on initial to acquire an extra refcount on a value far from the initial root, and then we could check that neither initial nor that value obtained with jv_getpath() are jv_unshared().

Copy link
Author

@MaxBrandtner MaxBrandtner May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea. It would make testing easier as well. (the name might just be a bit too similar so simmering like jv_is_unshared() might be better)

jv paths = jv_paths(jv_copy(initial));

size_t paths_length = jv_array_length(jv_copy(paths));

for(size_t i = 0; i < paths_length; i++){
jv path_i = jv_array_get(jv_copy(paths), i);
assert(!jv_identical(
jv_getpath(
jv_copy(initial),
jv_copy(path_i)
),
jv_getpath(
jv_copy(new),
jv_copy(path_i)
)
));

jv_free(path_i);
}

jv_free(initial);
jv_free(new);
jv_free(paths);

}
}
81 changes: 81 additions & 0 deletions src/jv.c
Original file line number Diff line number Diff line change
Expand Up @@ -1862,6 +1862,87 @@ jv jv_object_iter_value(jv object, int iter) {
/*
* Memory management
*/
jv jv_unshare(jv input){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how ugly but i guess one could save on jv_free calls by having different ownership rule for unshare? too big foot gun?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even within the implementation of jv_unshare() we benefit from it dereferencing its input. We could have jv_unshare() not consume its memory, but at the cost of having to write a lot of stuff to buffers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is jv_unshare() supposed to consume a reference to its argument? I think it should.

switch(jv_get_kind(input)){
case JV_KIND_INVALID:
if(!jv_invalid_has_msg(jv_copy(input))){
jv_free(input);
return jv_invalid();
}
return jv_invalid_with_msg(jv_unshare(jv_invalid_get_msg(jv_copy(input))));
case JV_KIND_OBJECT:
{
jv keys = jv_keys(jv_copy(input));
size_t keys_length = jv_array_length(jv_copy(keys));

jv output_object = jv_object();

for(size_t i = 0; i < keys_length; i++){
jv key = jv_array_get(jv_copy(keys), i);
output_object = jv_object_set(
output_object, jv_unshare(key),
jv_unshare(
jv_object_get(
jv_copy(input),
jv_copy(key)
)
)
);
}

jv_free(keys);
jv_free(input);
return output_object;
}
case JV_KIND_ARRAY:
{
size_t amount = jv_array_length(jv_copy(input));

jv output_array = jv_array_sized(amount);

for(size_t i = 0; i < amount; i++){
output_array = jv_array_set(
output_array,
i,
jv_unshare(
jv_array_get(
jv_copy(input),
i
)
)
);
}

jv_free(input);

return output_array;
}
case JV_KIND_STRING:
{
jv output_string = jv_string(jv_string_value(input));
jv_free(input);
return output_string;
}
case JV_KIND_NUMBER:
{
double val = jv_number_value(input);
jv_free(input);
return jv_number(val);
}
case JV_KIND_TRUE:
jv_free(input);
return jv_true();
case JV_KIND_FALSE:
jv_free(input);
return jv_false();
case JV_KIND_NULL:
jv_free(input);
return jv_null();
default:
return jv_invalid();
}
}

jv jv_copy(jv j) {
if (JVP_IS_ALLOCATED(j)) {
jvp_refcnt_inc(j.u.ptr);
Expand Down
4 changes: 4 additions & 0 deletions src/jv.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ jv_kind jv_get_kind(jv);
const char* jv_kind_name(jv_kind);
static int jv_is_valid(jv x) { return jv_get_kind(x) != JV_KIND_INVALID; }

//jv_unshare() creates a deep copy of the input aka the content of the output will be identical to the input, but no shared memory exists between them
jv jv_unshare(jv);
jv jv_copy(jv);
void jv_free(jv);

Expand Down Expand Up @@ -256,8 +258,10 @@ jv jv_get(jv, jv);
jv jv_set(jv, jv, jv);
jv jv_has(jv, jv);
jv jv_setpath(jv, jv, jv);
jv jv_addpath(jv, jv, jv);
jv jv_getpath(jv, jv);
jv jv_delpaths(jv, jv);
jv jv_paths(jv);
jv jv_keys(jv /*object or array*/);
jv jv_keys_unsorted(jv /*object or array*/);
int jv_cmp(jv, jv);
Expand Down
105 changes: 105 additions & 0 deletions src/jv_aux.c
Original file line number Diff line number Diff line change
Expand Up @@ -427,6 +427,70 @@ jv jv_setpath(jv root, jv path, jv value) {
return jv_set(root, pathcurr, jv_setpath(subroot, pathrest, value));
}

jv jv_addpath(jv root, jv path, jv add){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a comment on what this is supposed to do? Is it different from jv_setpath()?

if(jv_get_kind(path) != JV_KIND_ARRAY || !jv_is_valid(add)){
jv_free(root);
jv_free(path);
jv_free(add);
return jv_invalid();
}

if(jv_get_kind(root) != JV_KIND_OBJECT && jv_get_kind(root) != JV_KIND_ARRAY){
jv_free(root);

if(!jv_equal(jv_copy(path), jv_array())){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much faster to check if path has length 0.

jv_free(path);
jv_free(add);
return jv_invalid();
}

jv_free(path);

return add;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this case.

Also, what if root is a jv_null()? Should we treat it as being a container-like value, like jq does?

}

if(!jv_equal(jv_copy(path), jv_array())){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

return jv_setpath(root, path, jv_addpath(jv_getpath(jv_copy(root), jv_copy(path)), jv_array(), add));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the difference between jv_setpath() and jv_addpath() now, but I'd still like a comment, and also if we don't need jv_addpath() then I'd rather not have it.

}

jv root_paths = jv_paths(jv_copy(root));
jv add_paths = jv_paths(jv_copy(add));

size_t add_paths_length = jv_array_length(jv_copy(add_paths));

for(size_t i = 0; i < add_paths_length; i++){
jv add_path = jv_array_get(jv_copy(add_paths), i);
jv add_path_value = jv_getpath(jv_copy(add), jv_copy(add_path));

if(!jv_is_valid(add_path_value) || jv_get_kind(add_path_value) == JV_KIND_NULL){
jv_free(root);
jv_free(path);
jv_free(add);
jv_free(root_paths);
jv_free(add_paths);
jv_free(add_path);
jv_free(add_path_value);
return jv_invalid();
}

if(jv_get_kind(add_path_value) == JV_KIND_OBJECT || jv_get_kind(add_path_value) == JV_KIND_ARRAY){
jv_free(add_path);
jv_free(add_path_value);
continue;
}

root = jv_setpath(root, add_path, add_path_value);
}

jv_free(path);
jv_free(add);

jv_free(root_paths);
jv_free(add_paths);

return root;
}

jv jv_getpath(jv root, jv path) {
if (jv_get_kind(path) != JV_KIND_ARRAY) {
jv_free(root);
Expand Down Expand Up @@ -538,6 +602,47 @@ static int string_cmp(const void* pa, const void* pb){
return r;
}

jv jv_paths(jv input){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a comment on what this is supposed to do? Is it output an array of paths to scalar values?

if(jv_get_kind(input) != JV_KIND_OBJECT && jv_get_kind(input) != JV_KIND_ARRAY){
jv_free(input);
return jv_invalid();
}

jv keys = jv_keys(jv_copy(input));

size_t keys_length = jv_array_length(jv_copy(keys));

jv output = jv_array();

for(size_t i = 0; i < keys_length; i++){
jv key = jv_array_get(jv_copy(keys), i);
jv insert_paths = jv_paths(jv_getpath(jv_copy(input), JV_ARRAY(jv_copy(key))));

output = jv_array_append(output, JV_ARRAY(jv_copy(key)));

if(jv_get_kind(insert_paths) == JV_KIND_INVALID){
jv_free(insert_paths);
jv_free(key);

continue;
}

size_t paths_length = jv_array_length(jv_copy(insert_paths));

for(size_t j = 0; j < paths_length; j++){
output = jv_array_append(output, jv_array_concat(JV_ARRAY(jv_copy(key)), jv_array_get(jv_copy(insert_paths), j)));
}

jv_free(key);
jv_free(insert_paths);
}

jv_free(input);
jv_free(keys);

return output;
}

jv jv_keys_unsorted(jv x) {
if (jv_get_kind(x) != JV_KIND_OBJECT)
return jv_keys(x);
Expand Down
Loading