New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update MRI C Ruby extension to use new version of upb (with upb_env). #387
Conversation
@haberman to review. |
// Callback invoked by upb if any error occurs during parsing or serialization. | ||
static bool env_error_func(void* ud, const upb_status* status) { | ||
const char* message = (const char*)ud; | ||
rb_raise(rb_eRuntimeError, message, upb_status_errmsg(status)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rb_raise() does longjmp IIRC? If so, it will leak the environment. Seems like you either want to call upb_env_uninit() here or anchor the upb_env to a Ruby object that will get a finalizer call where you can call upb_env_uninit().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I haven't documented in upb the guarantees around longjmp-ing across an encoder or decoder. I think the semantic that will make the most sense is: you can longjmp across an encoder or decoder and all resources will be properly be freed when you call upb_env_uninit(), but it is no longer allowed to use the environment for anything except freeing it. Because by longjmp-ing, you may have jumped over code that is necessary for the parse state to be fully synchronized. For example, with JIT-ted code, the proper suspending logic would copy the CPU stack into the upb_pbdecoder prior to returning. By skipping over that, the decoder cannot properly be resumed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I think that this should be compliant with the contract you describe -- rb_raise
will unwind the whole C stack, all the way back up into Ruby.
env_errdata* errdata = ud; | ||
// Free the env -- rb_raise will longjmp up the stack past the encode/decode | ||
// function so it would not otherwise have been freed. | ||
upb_env_uninit(errdata->env); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess here you're exercising another corner: is it safe to call upb_env_uninit() from inside an error callback? As long as you are longjmp-ing across any encoders/decoders on the stack, it should probably be ok. But this pattern would require that you immediately longjmp().
So your options for longjmp are either:
- longjmp() to a frame that has the upb_env still in scope, then call upb_env_uninit().
- call upb_env_uninit() then immediately longjmp() past any encoders/decoders on the CPU stack, and do not touch the env again. A regular return from the function in this case would cause a crash!
Both of these seem like supportable patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I think we're safe here, as the error handler takes your option 2: rb_raise()
always longjmps. (I think you're just thinking through upb semantics here, unless I'm misunderstanding; you don't actually see an issue with this implementation?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're just thinking through upb semantics here
Yes exactly, you're doing something that no one has ever done before, which I've thought about a little, but not deeply. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't actually see an issue with this implementation?
No, basically my conclusion at the moment is that this is a pattern that upb can and should support. All resources should be anchored in the upb_env at all times, so freeing the env should be safe even from a handler.
I hope this pans out as a long-term supportable semantic.
794efe2
to
b2d583a
Compare
- Alter encode/decode paths to use the `upb_env` (environment) abstraction. - Update upb amalgamation to upstream `93791bfe`. - Fix a compilation warning (void*->char* cast). - Modify build flags so that upb doesn't produce warnings -- the Travis build logs were pretty cluttered previously.
Friendly ping, PTAL whenever you get a chance? |
Update MRI C Ruby extension to use new version of upb (with upb_env).
upb_env
(environment)abstraction.
eace8e32
.build logs were pretty cluttered previously. I suspect that Rake isn't properly
selecting C99 mode despite
-std=c99
due to the warning suppressions needed(decls after statements, etc) but didn't poke further on this.