Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proto3 experimental C# fork #515

Merged
merged 29 commits into from Jun 19, 2015

Conversation

jskeet
Copy link
Contributor

@jskeet jskeet commented Jun 19, 2015

More to do, but this will allow others to contribute more easily...

jskeet and others added 29 commits June 5, 2015 20:44
Cache a reference to Encoding.UTF8 - the property access is (rather surprisingly) significant.
Additionally, when we detect that the string is all ASCII (due to the computed length in bytes being the length in characters), we can perform the encoding very efficiently ourselves.
… a single byte.

Aside from anything else, this will be used for all tags for fields 1-15.
This makes repeated fields really awkward at the moment - but when we reimplement RepeatedField<T> to be backed by an array, we can cast the array directly...
Remove ICodedInputStream and ICodedOutputStream, and rewrite CodedInputStream and CodedOutputStream to be specific to the binary format. If we want to support text-based formats, that can be a whole different serialization mechanism.
This mirrors commit 7c86bbb in the pull request to
the main protobuf project, but also reduces the size of the buffer created. (There's no point in
creating a 1024-byte buffer if we're only skipping 5 bytes...)
I wouldn't expect this to affect anything, but it appears to.
This is effectively reimplementing List<T>, but with a few advantages:
- We know that an empty repeated field is common, so don't allocate an array until we need to
- With direct access to the array, we can easily convert enum values to int without boxing
- We can relax the restrictions over what happens if the repeated field is modified while iterating, avoiding so much checking

This is somewhat risky, in that reimplementing a building block like this is *always* risky, but hey...
(The performance benefits are significant...)
We'll probably want a lot of the code from the serialization project when we do JSON, but enough of it will change that it's not worth keeping in a broken state for now.
- Make some members internal
- Remove a lot of FrameworkPortability that isn't required
- Start adding documentation comments
- Remove some more group-based members
- Not passing in "the last tag read" into Read*Array, g
@jskeet
Copy link
Contributor Author

jskeet commented Jun 19, 2015

One note to be aware of - this branch contains a change to wire_format.h that we may not want. See
#493 - I can probably create a different workaround if desired.

@jskeet
Copy link
Contributor Author

jskeet commented Jun 19, 2015

@jtattermusch @anandolee to review and merge.

@jtattermusch
Copy link
Contributor

I went quickly through the changes commit-by-commit because it's too big to display at once. It looks good to me (for what we need right now), let's merge this so we allow contributions from others.

I've tried to identify changes that are potentially dangerous and that should be reviewed separately (as a second pass once we merge this - I am happy to create bugs for these):

"array as backing store"
commit jskeet@7532f02

"improve string encoding times"
commit jskeet@35e4dbd

"inlining tag sizes"
commit jskeet@828b7e6
(I think "size += 1 + pb::CodedOutputStream.ComputeRawVarint32Size((uint) dataSize);" is not very understandable because it's not obvious where the "1" comes from, perhaps we could define constants for tag sizes, but let's address that in a follow-up review).

Also some nits:
phone field in Person should be renamed to plural as proto3 suggests ("person.Phone.Add(phoneNumber);" looks weird)

Perhaps we could rename RepeatedField to something else in the future.

@jtattermusch
Copy link
Contributor

Merging this with @anandolee 's consent.

jtattermusch added a commit that referenced this pull request Jun 19, 2015
Proto3 experimental C# fork
@jtattermusch jtattermusch merged commit 45b7032 into protocolbuffers:csharp-experimental Jun 19, 2015
taoso pushed a commit to taoso/protobuf that referenced this pull request Aug 1, 2018
When --go_out=paths=source_relative:., output filenames are always
derived from the input filenames. For example, the output file
for a/b/c.proto will always be a/b/c.pb.go, regardless of the
presence or absence of a go_package option in the input file.

Fixes protocolbuffers#515.
rinarakaki pushed a commit to rinarakaki/protobuf that referenced this pull request Aug 30, 2023
force explicit conversions to desired types in generated code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants