diff --git a/content/design-decisions/nullable-getters-setters.md b/content/design-decisions/nullable-getters-setters.md
index 858f9a6fa..ec22b6f32 100644
--- a/content/design-decisions/nullable-getters-setters.md
+++ b/content/design-decisions/nullable-getters-setters.md
@@ -30,9 +30,9 @@ can't have defaults, there's no functional problem with this.
As an example, consider this `.proto` file:
```proto
-message Msg { optional Child child = 1; }
-message Child { optional Grandchild grandchild = 1; }
-message Grandchild { optional int32 foo = 1 [default = 72]; }
+message Msg { Child child = 1; }
+message Child { Grandchild grandchild = 1; }
+message Grandchild { int32 foo = 1 [default = 72]; }
```
and corresponding Kotlin getters:
diff --git a/content/editions/features.md b/content/editions/features.md
index 29811fa9a..5d3fe5ea3 100644
--- a/content/editions/features.md
+++ b/content/editions/features.md
@@ -78,10 +78,12 @@ message Corge {
}
```
+<<<<<<< HEAD
In this example, the setting "`GRAULT`" in the lowest-level scope feature
definition overrides the non-nested-scope "`QUUX`" setting. And within the
Garply message, "`WALDO`" overrides "`QUUX`."
+<<<<<<< HEAD
### `features.default_symbol_visibility` {#symbol-vis}
This feature enables setting the default visibility for messages and enums,
@@ -212,7 +214,149 @@ message Foo {
int64 bar_1 = 1;
}
```
+||||||| parent of dcf50a2 (This documentation change includes the following:)
+In this example, the setting `GRAULT` in the field-scope feature definition
+overrides the message-scope QUUX setting.
+=======
+In this example, the setting "`GRAULT`" in the lowest-level scope feature
+definition overrides the non-nested-scope "`QUUX`" setting. And within the
+Garply message, "`WALDO`" overrides "`QUUX`."
+>>>>>>> dcf50a2 (This documentation change includes the following:)
+
+||||||| parent of 81fd217 (This documentation change includes the following:)
+=======
+### `features.default_symbol_visibility` {#symbol-vis}
+
+This feature enables setting the default visibility for messages and enums,
+making them available or unavailable when imported by other protos. Use of this
+feature will reduce dead symbols in order to create smaller binaries.
+
+In addition to setting the defaults for the entire file, you can use the `local`
+and `export` keywords to set per-field behavior. Read more about this at
+[`export` / `local` Keywords](/editions/overview#export-local).
+
+**Values available:**
+
+* `EXPORT_ALL`: This is the default prior to Edition 2024. All messages and
+ enums are exported by default.
+* `EXPORT_TOP_LEVEL`: All top-level symbols default to export; nested default
+ to local.
+* `LOCAL_ALL`: All symbols default to local.
+* `STRICT`: All symbols local by default. Nested types cannot be exported,
+ except for a special-case caveat for message `{ enum {} reserved 1 to max;
+ }`. This is the recommended setting for new protos.
+
+**Applicable to the following scope:** Enum, Message
+
+**Added in:** Edition 2024
+
+**Default behavior per syntax/edition:**
+
+Syntax/edition | Default
+-------------- | ------------------
+2024 | `EXPORT_TOP_LEVEL`
+2023 | `EXPORT_ALL`
+proto3 | `EXPORT_ALL`
+proto2 | `EXPORT_ALL`
+
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+
+The following sample shows how you can apply the feature to elements in your
+proto schema definition files:
+
+```proto
+// foo.proto
+edition = "2024";
+
+// Symbol visibility defaults to EXPORT_TOP_LEVEL. Setting
+// default_symbol_visibility overrides these defaults
+option features.default_symbol_visibility = LOCAL_ALL;
+
+// LOCAL_ALL (set above) makes every symbol local by default; applying the
+// export keyword overrides this
+export message LocalMessage {
+ int32 baz = 1;
+ // Nested symbols stay local under LOCAL_ALL because no export keyword is
+ // applied here
+ enum ExportedNestedEnum {
+ UNKNOWN_EXPORTED_NESTED_ENUM_VALUE = 0;
+ }
+}
+
+// bar.proto
+edition = "2024";
+
+import "foo.proto";
+
+message ImportedMessage {
+ // The following is valid because the imported message explicitly overrides
+ // the visibility setting in foo.proto
+ LocalMessage bar = 1;
+
+ // The following is not valid because default_symbol_visibility is set to
+ // `LOCAL_ALL`
+ // LocalMessage.ExportedNestedEnum qux = 2;
+}
+```
+### `features.enforce_naming_style` {#enforce-naming}
+
+Introduced in Edition 2024, this feature enables strict naming style enforcement
+as defined in
+[the style guide](/programming-guides/style) to ensure
+protos are round-trippable by default, with a feature value available to opt out.
+
+**Values available:**
+
+* `STYLE2024`: Enforces strict adherence to the style guide for naming.
+* `STYLE_LEGACY`: Applies the pre-Edition 2024 level of style guide
+ enforcement.
+
+**Applicable to the following scope:** File
+
+**Added in:** 2024
+
+**Default behavior per syntax/edition:**
+
+Syntax/edition | Default
+-------------- | --------------
+2024 | `STYLE2024`
+2023 | `STYLE_LEGACY`
+proto3 | `STYLE_LEGACY`
+proto2 | `STYLE_LEGACY`
+
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+
+The following code sample shows an Edition 2023 file. Edition 2023 defaults to
+`STYLE_LEGACY`, so a non-conformant field name is fine:
+
+```proto
+edition = "2023";
+
+message Foo {
+ // A non-conforming field name is not a problem
+ int64 bar_1 = 1;
+}
+```
+
+Edition 2024 defaults to `STYLE2024`, so an override is needed to keep the
+non-conformant field name:
+
+```proto
+edition = "2024";
+
+// To keep the non-conformant field name, override the STYLE2024 setting
+option features.enforce_naming_style = STYLE_LEGACY;
+
+message Foo {
+ int64 bar_1 = 1;
+}
+```
+
+>>>>>>> 81fd217 (This documentation change includes the following:)
### `features.enum_type` {#enum_type}
This feature sets the behavior for how enum values that aren't contained within
@@ -235,6 +379,7 @@ and after of a proto3 file.
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
Syntax/edition | Default
-------------- | --------
2024 | `OPEN`
@@ -242,6 +387,19 @@ Syntax/edition | Default
proto3 | `OPEN`
proto2 | `CLOSED`
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `OPEN`
+=======
+Syntax/edition | Default
+-------------- | --------
+2024 | `OPEN`
+2023 | `OPEN`
+proto3 | `OPEN`
+proto2 | `CLOSED`
+>>>>>>> 81fd217 (This documentation change includes the following:)
+
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
@@ -292,7 +450,19 @@ whether a protobuf field has a value.
**Applicable to the following scopes:** File, Field
+<<<<<<< HEAD
+<<<<<<< HEAD
+**Added in:** 2023
+||||||| parent of dcf50a2 (This documentation change includes the following:)
+**Default value in the Edition 2023:** `EXPLICIT`
+=======
+**Default behavior in the Edition 2023:** `EXPLICIT`
+>>>>>>> dcf50a2 (This documentation change includes the following:)
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Default behavior in the Edition 2023:** `EXPLICIT`
+=======
**Added in:** 2023
+>>>>>>> 81fd217 (This documentation change includes the following:)
**Default behavior per syntax/edition:**
@@ -351,8 +521,21 @@ message Bar {
After running Prototiller, the equivalent code might look like this:
```proto
+<<<<<<< HEAD
+<<<<<<< HEAD
edition = "2024";
// Setting the file-level field_presence feature matches the proto3 implicit default
+||||||| parent of dcf50a2 (This documentation change includes the following:)
+edition = "2023";
+=======
+edition = "2023";
+||||||| parent of 81fd217 (This documentation change includes the following:)
+edition = "2023";
+=======
+edition = "2024";
+>>>>>>> 81fd217 (This documentation change includes the following:)
+// Setting the file-level field_presence feature matches the proto3 implicit default
+>>>>>>> dcf50a2 (This documentation change includes the following:)
option features.field_presence = IMPLICIT;
message Bar {
@@ -389,6 +572,7 @@ and after of a proto3 file. Editions behavior matches the behavior in proto3.
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
Syntax/edition | Default
-------------- | --------------------
2024 | `ALLOW`
@@ -396,6 +580,19 @@ Syntax/edition | Default
proto3 | `ALLOW`
proto2 | `LEGACY_BEST_EFFORT`
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `ALLOW`
+=======
+Syntax/edition | Default
+-------------- | --------------------
+2024 | `ALLOW`
+2023 | `ALLOW`
+proto3 | `ALLOW`
+proto2 | `LEGACY_BEST_EFFORT`
+>>>>>>> 81fd217 (This documentation change includes the following:)
+
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
@@ -414,8 +611,22 @@ message Foo {
After running Prototiller, the equivalent code might look like this:
```proto
+<<<<<<< HEAD
+<<<<<<< HEAD
+edition = "2024";
+option features.json_format = LEGACY_BEST_EFFORT;
+||||||| parent of dcf50a2 (This documentation change includes the following:)
+edition = "2023";
+features.json_format = LEGACY_BEST_EFFORT;
+=======
+edition = "2023";
+||||||| parent of 81fd217 (This documentation change includes the following:)
+edition = "2023";
+=======
edition = "2024";
+>>>>>>> 81fd217 (This documentation change includes the following:)
option features.json_format = LEGACY_BEST_EFFORT;
+>>>>>>> dcf50a2 (This documentation change includes the following:)
message Foo {
string bar = 1;
@@ -452,6 +663,7 @@ the following conditions are met:
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
Syntax/edition | Default
-------------- | -----------------
2024 | `LENGTH_PREFIXED`
@@ -459,6 +671,19 @@ Syntax/edition | Default
proto3 | `LENGTH_PREFIXED`
proto2 | `LENGTH_PREFIXED`
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `LENGTH_PREFIXED`. Proto3 doesn't support `DELIMITED`.
+=======
+Syntax/edition | Default
+-------------- | -----------------
+2024 | `LENGTH_PREFIXED`
+2023 | `LENGTH_PREFIXED`
+proto3 | `LENGTH_PREFIXED`
+proto2 | `LENGTH_PREFIXED`
+>>>>>>> 81fd217 (This documentation change includes the following:)
+
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
@@ -508,6 +733,7 @@ for `repeated` fields has been migrated to in Editions.
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
Syntax/edition | Default
-------------- | ----------
2024 | `PACKED`
@@ -515,6 +741,19 @@ Syntax/edition | Default
proto3 | `PACKED`
proto2 | `EXPANDED`
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `PACKED`
+=======
+Syntax/edition | Default
+-------------- | ----------
+2024 | `PACKED`
+2023 | `PACKED`
+proto3 | `PACKED`
+proto2 | `EXPANDED`
+>>>>>>> 81fd217 (This documentation change includes the following:)
+
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
@@ -587,12 +826,26 @@ and after of a proto3 file.
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
+Syntax/edition | Default
+-------------- | --------
+2024 | `VERIFY`
+2023 | `VERIFY`
+proto3 | `VERIFY`
+proto2 | `NONE`
+
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `VERIFY`
+=======
Syntax/edition | Default
-------------- | --------
2024 | `VERIFY`
2023 | `VERIFY`
proto3 | `VERIFY`
proto2 | `NONE`
+>>>>>>> 81fd217 (This documentation change includes the following:)
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
@@ -799,15 +1052,15 @@ proto2 | `LEGACY`
This feature determines how generated code should treat string fields. This
replaces the `ctype` option from proto2 and proto3, and offers a new
-`string_view` feature. In Edition 2023, you can specify either `ctype` or
+`string_type` feature. In Edition 2023, you can specify either `ctype` or
`string_type` on a field, but not both. In Edition 2024, the `ctype` option is
removed.
**Values available:**
-* `VIEW`: Generates `string_view` accessors for the field. This will be the
- default in a future edition.
-* `CORD`: Generates `Cord` accessors for the field.
+* `VIEW`: Generates `string_view` accessors for the field.
+* `CORD`: Generates `Cord` accessors for the field. Not supported on extension
+ fields.
* `STRING`: Generates `string` accessors for the field.
**Applicable to the following scopes:** File, Field
@@ -897,12 +1150,26 @@ before and after of a proto3 file.
**Default behavior per syntax/edition:**
+<<<<<<< HEAD
+Syntax/edition | Default
+-------------- | ---------
+2024 | `DEFAULT`
+2023 | `DEFAULT`
+proto3 | `DEFAULT`
+proto2 | `DEFAULT`
+
+**Note:** Feature settings on different schema elements
+[have different scopes](#cascading).
+||||||| parent of 81fd217 (This documentation change includes the following:)
+**Behavior in proto3:** `DEFAULT`
+=======
Syntax/edition | Default
-------------- | ---------
2024 | `DEFAULT`
2023 | `DEFAULT`
proto3 | `DEFAULT`
proto2 | `DEFAULT`
+>>>>>>> 81fd217 (This documentation change includes the following:)
**Note:** Feature settings on different schema elements
[have different scopes](#cascading).
diff --git a/content/editions/overview.md b/content/editions/overview.md
index b62692862..f5b1b3a28 100644
--- a/content/editions/overview.md
+++ b/content/editions/overview.md
@@ -22,7 +22,11 @@ the default behavior for the edition you've selected. You can also override your
overrides. The [section later in this topic on lexical scoping](#scoping) goes
into more detail on that.
-*The latest released edition is 2024.*
+*The latest released edition is 2023.*
+
+The examples in this topic show edition 2024 features, but edition 2024 is
+currently in **pre-release review** and is not yet recommended for production
+code.
## Lifecycle of a Feature {#lifecycles}
@@ -176,7 +180,7 @@ package com.example;
message Player {
// in proto3, optional fields have explicit presence
- optional string name = 1 [default = "N/A"];
+ optional string name = 1;
// in proto3 no specified field rule defaults to implicit presence
int32 id = 2;
// in proto3 this is packed by default
diff --git a/content/getting-started/pythontutorial.md b/content/getting-started/pythontutorial.md
index e26d88aac..9718dda05 100644
--- a/content/getting-started/pythontutorial.md
+++ b/content/getting-started/pythontutorial.md
@@ -293,6 +293,31 @@ Again, see the
[`Message` API reference](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message)
for a complete list.
+You can also easily serialize messages to and from JSON. The `json_format`
+module provides helpers for this:
+
+- `MessageToJson(message)`: serializes the message to a JSON string.
+- `Parse(json_string, message)`: parses a JSON string into the given message.
+
+For example:
+
+```python
+from google.protobuf import json_format
+import addressbook_pb2
+
+person = addressbook_pb2.Person()
+person.id = 1234
+person.name = "John Doe"
+person.email = "jdoe@example.com"
+
+# Serialize to JSON
+json_string = json_format.MessageToJson(person)
+
+# Parse from JSON
+new_person = addressbook_pb2.Person()
+json_format.Parse(json_string, new_person)
+```
+
{{% alert title="Important" color="warning" %}} **Protocol Buffers and Object Oriented Design**
Protocol buffer classes are basically data holders (like structs in C) that
don't provide additional functionality; they don't make good first class
@@ -464,12 +489,12 @@ One key feature provided by protocol message classes is *reflection*. You can
iterate over the fields of a message and manipulate their values without writing
your code against any specific message type. One very useful way to use
reflection is for converting protocol messages to and from other encodings, such
-as XML or JSON. A more advanced use of reflection might be to find differences
-between two messages of the same type, or to develop a sort of "regular
-expressions for protocol messages" in which you can write expressions that match
-certain message contents. If you use your imagination, it's possible to apply
-Protocol Buffers to a much wider range of problems than you might initially
-expect!
+as XML or JSON (see [Parsing and Serialization](#parsing-serialization) for an
+example). A more advanced use of reflection might be to find differences between
+two messages of the same type, or to develop a sort of "regular expressions for
+protocol messages" in which you can write expressions that match certain message
+contents. If you use your imagination, it's possible to apply Protocol Buffers
+to a much wider range of problems than you might initially expect!
Reflection is provided as part of the
[`Message` interface](https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html#google.protobuf.message.Message).
diff --git a/content/news/2023-04-28.md b/content/news/2023-04-28.md
index e38743a9d..ef77548cb 100644
--- a/content/news/2023-04-28.md
+++ b/content/news/2023-04-28.md
@@ -8,12 +8,14 @@ type = "docs"
## Stricter validation for `json_name` {#json-name}
-v24 will forbid embedded null characters in the
+v24 will forbid the zero Unicode code point (`\u0000`) in the
[`json_name` field option](/programming-guides/proto3/#json).
-Going forward, any valid Unicode characters will be accepted, **except**
-`\u0000`. Null will still be allowed in field values.
+Going forward, any valid Unicode characters will be accepted in `json_name`,
+**except** `\u0000`. `\0` characters will still be allowed in field values.
-Previously, the proto compiler allowed null characters, but support for this was
-inconsistent across languages and implementations. To fix this, we are
-clarifying the spec to say that null is not allowed in `json_name`, and will be
-rejected by the compiler.
+Previously, the proto compiler allowed `\0` characters in the `json_name` field
+option, but support for this was inconsistent across languages and
+implementations. To help prevent interoperability problems relating to
+mishandling of keys containing a `\0` character, we are clarifying the spec to
+say that `\0` is not allowed in `json_name`, and will be rejected by the
+compiler.
diff --git a/content/programming-guides/editions.md b/content/programming-guides/editions.md
index e9c8d2342..0c0205240 100644
--- a/content/programming-guides/editions.md
+++ b/content/programming-guides/editions.md
@@ -1138,52 +1138,28 @@ There are two main reasons to use extensions:
### Example Extension {#ext-example}
-Let's look at an example extension:
+Using an extension is a two-step process. First, in the message you want to
+extend (the "container"), you must reserve a range of field numbers for
+extensions. Then, in a separate file, you define the extension field itself.
-```proto
-// file kittens/video_ext.proto
-
-import "kittens/video.proto";
-import "media/user_content.proto";
-
-package kittens;
+Here is an example that shows how to add an extension for kitten videos to a
+generic `UserContent` message.
-// This extension allows kitten videos in a media.UserContent message.
-extend media.UserContent {
- // Video is a message imported from kittens/video.proto
- repeated Video kitten_videos = 126;
-}
-```
+**Step 1: Reserve an extension range in the container message.**
-Note that the file defining the extension (`kittens/video_ext.proto`) imports
-the container message's file (`media/user_content.proto`).
-
-The container message must reserve a subset of its field numbers for extensions.
+The container message must use the `extensions` keyword to reserve a range of
+field numbers for others to use. It is a best practice to also add a
+`declaration` for the specific extension you plan to add. This declaration acts
+as a forward-declaration, making it easier for developers to discover extensions
+and avoid reusing field numbers.
```proto
-// file media/user_content.proto
+// media/user_content.proto
+edition = "2023";
package media;
-// A container message to hold stuff that a user has created.
-message UserContent {
- // Set verification to `DECLARATION` to enforce extension declarations for all
- // extensions in this range.
- extensions 100 to 199 [verification = DECLARATION];
-}
-```
-
-The container message's file (`media/user_content.proto`) defines the message
-`UserContent`, which reserves field numbers [100 to 199] for extensions. It is
-recommended to set `verification = DECLARATION` for the range to require
-declarations for all its extensions.
-
-When the new extension (`kittens/video_ext.proto`) is added, a corresponding
-declaration should be added to `UserContent` and `verification` should be
-removed.
-
-```
-// A container message to hold stuff that a user has created.
+// A container for user-created content.
message UserContent {
extensions 100 to 199 [
declaration = {
@@ -1191,21 +1167,38 @@ message UserContent {
full_name: ".kittens.kitten_videos",
type: ".kittens.Video",
repeated: true
- },
- // Ensures all field numbers in this extension range are declarations.
- verification = DECLARATION
+ }
];
}
```
-`UserContent` declares that field number `126` will be used by a `repeated`
-extension field with the fully-qualified name `.kittens.kitten_videos` and the
-fully-qualified type `.kittens.Video`. To learn more about extension
-declarations see
-[Extension Declarations](/programming-guides/extension_declarations).
+This declaration specifies the field number, full name, type, and cardinality of
+the extension that will be defined elsewhere.
+
+**Step 2: Define the extension in a separate file.**
+
+The extension itself is defined in a different `.proto` file, which typically
+focuses on a specific feature (like kitten videos). This avoids adding a
+dependency from the generic container to the specific feature.
+
+```proto
+// kittens/video_ext.proto
+edition = "2023";
+
+import "media/user_content.proto"; // Imports the container message
+import "kittens/video.proto"; // Imports the extension's message type
+
+package kittens;
+
+// This defines the extension field.
+extend media.UserContent {
+ repeated Video kitten_videos = 126;
+}
+```
-Note that the container message's file (`media/user_content.proto`) **does not**
-import the kitten_video extension definition (`kittens/video_ext.proto`)
+The `extend` block ties the new `kitten_videos` field back to the
+`media.UserContent` message, using the field number `126` that was reserved in
+the container.
There is no difference in the wire-format encoding of extension fields as
compared to a standard field with the same field number, type, and cardinality.
diff --git a/content/programming-guides/encoding.md b/content/programming-guides/encoding.md
index b4c70acc1..bcd4db815 100644
--- a/content/programming-guides/encoding.md
+++ b/content/programming-guides/encoding.md
@@ -222,16 +222,15 @@ For example, `-500z` is the same as the varint `999`.
### Non-varint Numbers {#non-varints}
-Non-varint numeric types are simple -- `double` and `fixed64` have wire type
-`I64`, which tells the parser to expect a fixed eight-byte lump of data. We can
-specify a `double` record by writing `5: 25.4`, or a `fixed64` record with `6:
-200i64`. In both cases, omitting an explicit wire type implies the `I64` wire
-type.
+Non-varint numeric types are simple. `double` and `fixed64` have wire type
+`I64`, which tells the parser to expect a fixed eight-byte lump of data.
+`double` values are encoded in IEEE 754 double-precision format. We can specify
+a `double` record by writing `5: 25.4`, or a `fixed64` record with `6: 200i64`.
Similarly `float` and `fixed32` have wire type `I32`, which tells it to expect
-four bytes instead. The syntax for these consists of adding an `i32` suffix.
-`25.4i32` will emit four bytes, as will `200i32`. Tag types are inferred as
-`I32`.
+four bytes instead. `float` values are encoded in IEEE 754 single-precision
+format. The syntax for these consists of adding an `i32` suffix. `25.4i32` will
+emit four bytes, as will `200i32`. Tag types are inferred as `I32`.
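The fixed-width payloads described above can be reproduced with Python's standard `struct` module. This is a minimal sketch of the payload bytes only (it does not emit the tag byte that precedes each record), and the variable names are illustrative rather than part of any protobuf API.

```python
import struct

# I64 payloads: eight bytes, little-endian.
# "<d" packs an IEEE 754 double-precision value.
double_payload = struct.pack("<d", 25.4)
# "<Q" packs an unsigned 64-bit integer, as used by fixed64.
fixed64_payload = struct.pack("<Q", 200)

# I32 payloads: four bytes, little-endian.
# "<f" packs an IEEE 754 single-precision value.
float_payload = struct.pack("<f", 25.4)
# "<I" packs an unsigned 32-bit integer, as used by fixed32.
fixed32_payload = struct.pack("<I", 200)

for label, payload in [
    ("double", double_payload),
    ("fixed64", fixed64_payload),
    ("float", float_payload),
    ("fixed32", fixed32_payload),
]:
    print(f"{label:8s} {len(payload)} bytes: {payload.hex()}")
```

Note that `25.4` is not exactly representable in either format, so unpacking the four-byte `float` payload yields a value slightly different from the one recovered from the eight-byte `double` payload.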
## Length-Delimited Records {#length-types}
@@ -529,11 +528,13 @@ value := varint for wire_type == VARINT,
varint := int32 | int64 | uint32 | uint64 | bool | enum | sint32 | sint64;
encoded as varints (sintN are ZigZag-encoded first)
i32 := sfixed32 | fixed32 | float;
- encoded as 4-byte little-endian;
- memcpy of the equivalent C types (u?int32_t, float)
+ encoded as 4-byte little-endian (float is IEEE 754
+ single-precision); memcpy of the equivalent C types (u?int32_t,
+ float)
i64 := sfixed64 | fixed64 | double;
- encoded as 8-byte little-endian;
- memcpy of the equivalent C types (u?int64_t, double)
+ encoded as 8-byte little-endian (double is IEEE 754
+ double-precision); memcpy of the equivalent C types (u?int64_t,
+ double)
len-prefix := size (message | string | bytes | packed);
size encoded as int32 varint
diff --git a/content/programming-guides/enum.md b/content/programming-guides/enum.md
index 2e48628c4..4d1b1b6c0 100644
--- a/content/programming-guides/enum.md
+++ b/content/programming-guides/enum.md
@@ -110,6 +110,7 @@ open, and when importing from another editions file it uses the feature setting.
All known C++ releases are out of conformance. When a `proto2` file imports an
enum defined in a `proto3` file, C++ treats that field as a **closed** enum.
+
Under editions, this behavior is represented by the deprecated field feature
[`features.(pb.cpp).legacy_closed_enum`](/editions/features#legacy_closed_enum).
There are two options for moving to conformant behavior:
@@ -121,7 +122,7 @@ There are two options for moving to conformant behavior:
* Change the enum to closed. This is discouraged, and can cause runtime
behavior changes if *anybody else* is using the enum. Unrecognized integers
will end up in the unknown field set instead of those fields.
-
+
### C# {#csharp}
All known C# releases are out of conformance. C# treats all enums as **open**.
diff --git a/content/programming-guides/json.md b/content/programming-guides/json.md
index 26f275b50..f6d8d25d8 100644
--- a/content/programming-guides/json.md
+++ b/content/programming-guides/json.md
@@ -8,42 +8,47 @@ type = "docs"
Protobuf supports a canonical encoding in JSON, making it easier to share data
with systems that do not support the standard protobuf binary wire format.
-ProtoJSON Format is not as efficient as protobuf wire format. The converter uses
-more CPU to encode and decode messages and (except in rare cases) encoded
-messages consume more space. Furthermore, ProtoJSON format puts your field and
-enum value names into encoded messages making it much harder to change those
+This page specifies the format, but a number of additional edge cases which
+define a conformant ProtoJSON parser are covered in the Protobuf Conformance
+Test Suite and are not exhaustively detailed here.
+
+# Non-goals of the Format {#non-goals}
+
+## Cannot Represent Some JSON schemas {#non-goals-arbitrary-json-schema}
+
+The ProtoJSON format is designed to be a JSON representation of schemas which
+are expressible in the Protobuf schema language.
+
+It may be possible to represent many pre-existing JSON schemas as a Protobuf
+schema and parse it using ProtoJSON, but it is not designed to be able to
+represent arbitrary JSON schemas.
+
+For example, a Protobuf schema has no way to express types that may be common
+in JSON schemas, such as `number[][]` or `number|string`.
+
+It is possible to use `google.protobuf.Struct` and `google.protobuf.Value` types
+to allow arbitrary JSON to be parsed into a Protobuf schema, but these only
+allow you to capture the values as schemaless unordered key-value maps.
+
+## Not as efficient as the binary wire format {#non-goals-highly-efficient}
+
+ProtoJSON Format is not as efficient as binary wire format and never will be.
+
+The converter uses more CPU to encode and decode messages and (except in rare
+cases) encoded messages consume more space.
+
+## Does not have as good schema-evolution guarantees as binary wire format {#non-goals-optimal-schema-evolution}
+
+ProtoJSON format does not support unknown fields, and it puts field and enum
+value names into encoded messages which makes it much harder to change those
names later. Removing fields is a breaking change that will trigger a parsing
-error. In short, there are many good reasons why Google prefers to use the
-standard wire format for virtually everything rather than ProtoJSON format.
-
-The encoding is described on a type-by-type basis in the table later in this
-topic.
-
-When parsing JSON-encoded data into a protocol buffer, if a value is missing or
-if its value is `null`, it will be interpreted as the corresponding
-[default value](/programming-guides/editions#default). Multiple values for
-singular fields (using duplicate or equivalent JSON keys) are accepted and the
-last value is retained, as with binary format parsing. Note that not all
-protobuf JSON parser implementations are conformant, and some nonconformant
-implementations may reject duplicate keys instead.
-
-When generating JSON-encoded output from a protocol buffer, if a protobuf field
-has the default value and if the field doesn't support field presence, it will
-be omitted from the output by default. An implementation may provide options to
-include fields with default values in the output.
-
-Fields that have a value set and that support field presence always include the
-field value in the JSON-encoded output, even if it is the default value. For
-example, a proto3 field that is defined with the `optional` keyword supports
-field presence and if set, will always appear in the JSON output. A message type
-field in any edition of protobuf supports field presence and if set will appear
-in the output. Proto3 implicit-presence scalar fields will only appear in the
-JSON output if they are not set to the default value for that type.
-
-When representing numerical data in a JSON file, if the number that is is parsed
-from the wire doesn't fit in the corresponding type, you will get the same
-effect as if you had cast the number to that type in C++ (for example, if a
-64-bit number is read as an int32, it will be truncated to 32 bits).
+error.
+
+See [JSON Wire Safety](#json-wire-safety) below for more details.
+
+# Format Description {#format}
+
+## Representation of each type {#field-representation}
The following table shows how data is represented in JSON files.
@@ -65,9 +70,8 @@ The following table shows how data is represented in JSON files.
will be used as the key instead. Parsers accept both the lowerCamelCase
name (or the one specified by the json_name
option) and the
original proto field name. null
is an accepted value for
- all field types and treated as the default value of the corresponding
- field type. However, null
cannot be used for the
- json_name
value. For more on why, see
+ all field types and leaves the field unset. \0 (nul)
cannot
+ be used within a json_name
value. For more on why, see
Stricter validation for json_name.
@@ -83,7 +87,7 @@ The following table shows how data is represented in JSON files.
map<K,V> |
object |
{"k": v, ...} |
- All keys are converted to strings. |
+ All keys are converted to strings (keys in JSON spec can only be strings). |
repeated V |
@@ -166,7 +170,7 @@ The following table shows how data is represented in JSON files.
Generated output always contains 0, 3, 6, or 9 fractional digits,
depending on required precision, followed by the suffix "s". Accepted
are any fractional digits (also none) as long as they fit into
- nano-seconds precision and the suffix "s" is required.
+ nanoseconds precision and the suffix "s" is required.
|
@@ -209,7 +213,7 @@ The following table shows how data is represented in JSON files.
NullValue |
null |
|
- JSON null |
+ JSON null. Special case of the [null parsing behavior](#null-values). |
Empty |
@@ -220,6 +224,59 @@ The following table shows how data is represented in JSON files.
+## Presence and default-values {#presence}
+
+When generating JSON-encoded output from a protocol buffer, if a field supports
+presence, serializers must emit the field value if and only if the corresponding
+hasser would return true.
+
+If the field doesn't support field presence and has the default value (for
+example, any empty repeated field), serializers should omit it from the output.
+An implementation may provide options to include fields with default values in
+the output.
+
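The two rules above can be sketched as a single decision function. This is a minimal illustration in plain Python, not part of any protobuf library: `FieldState`, `should_emit`, and the `emit_defaults` option are hypothetical names introduced here, and the sketch assumes the optional "include defaults" setting only affects fields without presence.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FieldState:
    """Hypothetical per-field state a serializer might inspect."""
    supports_presence: bool  # True if the field has a hasser
    is_set: bool             # what the hasser would return
    value: Any
    default: Any


def should_emit(field: FieldState, emit_defaults: bool = False) -> bool:
    if field.supports_presence:
        # Presence-tracking field: emit if and only if the hasser is true.
        return field.is_set
    # No presence: omit default values unless the (assumed) option is enabled.
    return emit_defaults or field.value != field.default
```

Under this sketch, a presence-tracking field that was explicitly set to `0` is emitted, while an implicit-presence field holding `0` or an empty list is omitted by default.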
+## Null values {#null-values}
+
+Serializers should not emit `null` values.
+
+Parsers should accept `null` as a legal value for any field, with the behavior:
+
+* Any key validity checking should still occur (disallowing unknown fields).
+* The field should remain unset, as though it were not present in the input at
+ all (hassers should still return false where applicable).
+
+`null` values are not allowed within repeated fields.
+
+`google.protobuf.NullValue` is a special exception to this behavior: `null` is
+handled as a sentinel-present value for this type, and so a field of this type
+must be handled by serializers and parsers under the standard presence behavior.
+This behavior correspondingly allows `google.protobuf.Struct` and
+`google.protobuf.Value` to losslessly round trip arbitrary JSON.
+
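As a rough illustration of the parser rules above, the following plain-Python sketch uses only the standard `json` module. The schema (`KNOWN_FIELDS`, `REPEATED_FIELDS`) and the `parse_object` helper are hypothetical, and the `google.protobuf.NullValue` exception is deliberately not modeled.

```python
import json

# Hypothetical schema, for illustration only.
KNOWN_FIELDS = {"name", "tags"}
REPEATED_FIELDS = {"tags"}


def parse_object(text: str) -> dict:
    """Return only the fields that end up set after parsing."""
    parsed = {}
    for key, value in json.loads(text).items():
        if key not in KNOWN_FIELDS:
            # Key validity checking still applies, even if the value is null.
            raise ValueError(f"unknown field: {key}")
        if value is None:
            # null leaves the field unset, as if the key were absent.
            continue
        if key in REPEATED_FIELDS and any(item is None for item in value):
            raise ValueError(f"null is not allowed inside repeated field: {key}")
        parsed[key] = value
    return parsed
```

With this sketch, `{"name": null}` parses to a message with no fields set, while `{"bogus": null}` and `{"tags": ["x", null]}` are both rejected.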
+## Duplicate values {#duplicate-values}
+
+Serializers must never serialize the same field multiple times, nor multiple
+different cases in the same oneof in the same JSON object.
+
+Parsers should accept the same field being duplicated, and the last value
+provided should be retained. This also applies to "alternate spellings" of the
+same field name.
+
+If implementations cannot maintain the necessary information about field order,
+it is preferred to reject inputs with duplicate keys rather than have an
+arbitrary value win. In some implementations maintaining field order of objects
+may be impractical or infeasible, so it is strongly recommended that systems
+avoid relying on specific behavior for duplicate fields in ProtoJSON where
+possible.
+
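The two acceptable parser behaviors for duplicate keys can be compared with a short sketch using Python's standard `json` module, which by default keeps only the last name-value pair for a repeated name. The helper names `last_value_wins` and `reject_duplicates` are made up for this example, and alternate spellings of the same field name are not handled here.

```python
import json


def last_value_wins(text: str) -> dict:
    # json.loads ignores all but the last pair for a repeated name.
    return json.loads(text)


def reject_duplicates(text: str) -> dict:
    def build(pairs):
        result = {}
        for key, value in pairs:
            if key in result:
                # Stricter alternative: refuse rather than pick a winner.
                raise ValueError(f"duplicate key: {key}")
            result[key] = value
        return result

    # object_pairs_hook receives every pair in input order, duplicates included.
    return json.loads(text, object_pairs_hook=build)
```

For the input `{"foo": 1, "foo": 2}`, the first helper yields `{"foo": 2}` and the second raises `ValueError`.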
+## Out of range numeric values
+
+When parsing a numeric value, if the number that is parsed from the wire
+doesn't fit in the corresponding type, the parser should coerce the value to the
+appropriate type. This has the same behavior as a simple cast in C++ or Java
+(for example, if a number larger than 2^32 is read for an int32 field, it will
+be truncated to 32 bits).
+
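To make the truncation concrete, here is a small plain-Python sketch of the arithmetic. The helper name `wrap_to_int32` is hypothetical. It keeps the low 32 bits and reinterprets them as a signed two's-complement value, which mirrors what Java's `(int)` cast does to a `long`.

```python
def wrap_to_int32(value: int) -> int:
    """Keep the low 32 bits and reinterpret them as a signed int32."""
    low_bits = value & 0xFFFFFFFF
    # Values with the top bit set represent negative numbers.
    return low_bits - (1 << 32) if low_bits >= (1 << 31) else low_bits
```

For example, `wrap_to_int32(2**32 + 5)` gives `5`, and `wrap_to_int32(2**31)` gives `-2**31`.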
## ProtoJSON Wire Safety {#json-wire-safety}
When using ProtoJSON, only some schema changes are safe to make in a distributed
diff --git a/content/programming-guides/proto3.md b/content/programming-guides/proto3.md
index 974fa2017..ca19824bd 100644
--- a/content/programming-guides/proto3.md
+++ b/content/programming-guides/proto3.md
@@ -372,11 +372,17 @@ automatically generated class:
double |
- |
-
-
+
+ Uses IEEE 754
+ double-precision format.
+ |
+
+
float |
- |
+
+ Uses IEEE 754
+ single-precision format.
+ |
int32 |
diff --git a/content/programming-guides/serialization-not-canonical.md b/content/programming-guides/serialization-not-canonical.md
index f48d81125..9ef6264c4 100644
--- a/content/programming-guides/serialization-not-canonical.md
+++ b/content/programming-guides/serialization-not-canonical.md
@@ -66,3 +66,4 @@ allow for more optimization opportunities:
To leave room for optimizations like this, we want to intentionally scramble
field order in some configurations, so that applications do not inappropriately
depend on field order.
+
diff --git a/content/programming-guides/style.md b/content/programming-guides/style.md
index af3a92fc8..f9dc75012 100644
--- a/content/programming-guides/style.md
+++ b/content/programming-guides/style.md
@@ -65,7 +65,7 @@ underscore).
The motivation for this rule is that each protobuf language implementation may
convert identifiers into the local language style: a name of `song_id` in a
.proto file may end up having accessors for the field which are capitalized as
-as `SongId`, `songId` or `song_id` depending on the language.
+`SongId`, `songId` or `song_id` depending on the language.
By using underscores only before letters, it avoids situations where names may
be distinct in one style, but would collide after they are transformed into one
diff --git a/content/reference/cpp/cpp-generated.md b/content/reference/cpp/cpp-generated.md
index 693c6f02d..d18fe6b20 100644
--- a/content/reference/cpp/cpp-generated.md
+++ b/content/reference/cpp/cpp-generated.md
@@ -579,7 +579,7 @@ The compiler will generate the following accessor methods:
Calling this method with index outside of [0, foo_size()) yields undefined
behavior.
- `void add_foo(::absl::string_view value)`: Appends a new element to the end
- of the element at the given zero-based index.
+ of the field with the given value.
- `void add_foo(const string& value)`: Appends a new element to the end of the
field with the given value.
- `void add_foo(string&& value)`: Appends a new element to the end of the
diff --git a/content/reference/go/size.md b/content/reference/go/size.md
index a4b0868de..3841d98ea 100644
--- a/content/reference/go/size.md
+++ b/content/reference/go/size.md
@@ -8,23 +8,31 @@ type = "docs"
The [`proto.Size`](https://pkg.go.dev/google.golang.org/protobuf/proto#Size)
function returns the size in bytes of the wire-format encoding of a
-proto.Message by traversing all its fields (including submessages).
+`proto.Message` by traversing all its fields (including submessages).
In particular, it returns the size of **how Go Protobuf will encode the
message**.
+## A note on Protobuf Editions
+
+With Protobuf Editions, `.proto` files can enable features that change
+serialization behavior. This can affect the value returned by `proto.Size`. For
+example, setting `features.field_presence = IMPLICIT` will cause scalar fields
+that are set to their defaults to not be serialized, so they don't contribute
+to the size of the message.
+
## Typical usages
### Identifying empty messages
Checking if
[`proto.Size`](https://pkg.go.dev/google.golang.org/protobuf/proto#Size) returns
-0 is an easy way to recognize empty messages:
+0 is a common way to check for empty messages:
```go
if proto.Size(m) == 0 {
- // No fields set (or, in proto3, all fields matching the default);
- // skip processing this message, or return an error, or similar.
+ // No fields set; skip processing this message,
+ // or return an error, or similar.
}
```
diff --git a/content/reference/java/java-generated.md b/content/reference/java/java-generated.md
index 126740c30..a3ace66d7 100644
--- a/content/reference/java/java-generated.md
+++ b/content/reference/java/java-generated.md
@@ -7,13 +7,14 @@ type = "docs"
+++
Any
-differences between proto2 and proto3 generated code are highlighted—note
-that these differences are in the generated code as described in this document,
-not the base message classes/interfaces, which are the same in both versions.
-You should read the
-[proto2 language guide](/programming-guides/proto2)
+differences between proto2, proto3, and Editions generated code are
+highlighted—note that these differences are in the generated code as
+described in this document, not the base message classes/interfaces, which are
+the same in all versions. You should read the
+[proto2 language guide](/programming-guides/proto2),
+[proto3 language guide](/programming-guides/proto3),
and/or
-[proto3 language guide](/programming-guides/proto3)
+[Editions language guide](/programming-guides/editions)
before reading this document.
Note that no Java protocol buffer methods accept or return nulls unless
@@ -240,17 +241,17 @@ them. For example:
```proto
message Foo {
- optional int32 val = 1;
+ int32 val = 1;
// some other fields.
}
message Bar {
- optional Foo foo = 1;
+ Foo foo = 1;
// some other fields.
}
message Baz {
- optional Bar bar = 1;
+ Bar bar = 1;
// some other fields.
}
```
@@ -311,11 +312,18 @@ would be `getFooBarBaz`. And `foo_ba23r_baz` becomes `fooBa23RBaz`.
As well as accessor methods, the compiler generates an integer constant for each
field containing its field number. The constant name is the field name converted
-to upper-case followed by `_FIELD_NUMBER`. For example, given the field
-`optional int32 foo_bar = 5;`, the compiler will generate the constant `public
-static final int FOO_BAR_FIELD_NUMBER = 5;`.
+to upper-case followed by `_FIELD_NUMBER`. For example, given the field `int32
+foo_bar = 5;`, the compiler will generate the constant `public static final int
+FOO_BAR_FIELD_NUMBER = 5;`.
-### Singular Fields (proto2) {#singular-proto2}
+The following sections are divided between explicit and implicit presence.
+Proto2 has explicit presence and proto3 defaults to implicit presence. Editions
+defaults to explicit presence, but you can override that using
+[`features.field_presence`](/editions/features#field_presence).
+
+
+
+### Singular Fields with Explicit Presence {#singular-explicit}
For any of these field definitions:
@@ -368,7 +376,9 @@ The compiler generates the following method only in the message's builder.
- `Builder getFooBuilder()`: Returns the builder for the field.
-### Singular Fields (proto3) {#singular-proto3}
+
+
+### Singular Fields with Implicit Presence {#singular-implicit}
For this field definition:
@@ -395,32 +405,6 @@ the
For message and enum types, the value type is replaced with the message or enum
class.
-#### Embedded Message Fields {#embedded-message-proto3}
-
-For message field types, an additional accessor method is generated in both the
-message class and its builder:
-
-- `boolean hasFoo()`: Returns `true` if the field has been set.
-
-`setFoo()` also accepts an instance of the message's builder type as the
-parameter. This is just a shortcut which is equivalent to calling `.build()` on
-the builder and passing the result to the method.
-
-If the field is not set, `getFoo()` will return a Foo instance with none of its
-fields set (possibly the instance returned by `Foo.getDefaultInstance()`).
-
-In addition, the compiler generates two accessor methods that allow you to
-access the relevant sub-builders for message types. The following method is
-generated in both the message class and its builder:
-
-- `FooOrBuilder getFooOrBuilder()`: Returns the builder for the field, if it
- already exists, or the message if not. Calling this method on builders will
- not create a sub-builder for the field.
-
-The compiler generates the following method only in the message's builder.
-
-- `Builder getFooBuilder()`: Returns the builder for the field.
-
#### Enum Fields {#enum-proto3}
For enum field types, an additional accessor method is generated in both the
@@ -711,8 +695,9 @@ enum Foo {
The protocol buffer compiler will generate a Java enum type called `Foo` with
the same set of values. If you are using proto3, it also adds the special value
-`UNRECOGNIZED` to the enum type. The values of the generated enum type have the
-following special methods:
+`UNRECOGNIZED` to the enum type. In Editions, `OPEN` enums also have an
+`UNRECOGNIZED` value, while `CLOSED` enums do not. The values of the generated
+enum type have the following special methods:
- `int getNumber()`: Returns the object's numeric value as defined in the
`.proto` file.
@@ -731,8 +716,8 @@ Additionally, the `Foo` enum type contains the following static methods:
`forNumber(int value)` and will be removed in an upcoming release.
- `static Foo valueOf(EnumValueDescriptor descriptor)`: Returns the enum
object corresponding to the given value descriptor. May be faster than
- `valueOf(int)`. In proto3 returns `UNRECOGNIZED` if passed an unknown value
- descriptor.
+ `valueOf(int)`. In proto3 and `OPEN` enums, returns `UNRECOGNIZED` if passed
+ an unknown value descriptor.
- `EnumDescriptor getDescriptor()`: Returns the enum type's descriptor, which
contains e.g. information about each defined value. (This differs from
`getDescriptorForType()` only in that it is a static method.)
@@ -771,11 +756,13 @@ over 1,700 values. This limit is due to per-method size limits for Java
bytecode, and it varies across Java implementations, different versions of the
protobuf suite, and any options set on the enum in the `.proto` file.
-## Extensions (proto2 only) {#extension}
+## Extensions {#extension}
Given a message with an extension range:
```proto
+edition = "2023";
+
message Foo {
extensions 100 to 199;
}
@@ -800,8 +787,12 @@ identifier.
Given an extension definition:
```proto
+edition = "2023";
+
+import "foo.proto";
+
extend Foo {
- optional int32 bar = 123;
+ int32 bar = 123;
}
```
@@ -831,9 +822,13 @@ generated symbol names. For example, a common pattern is to extend a message by
a field *inside* the declaration of the field's type:
```proto
+edition = "2023";
+
+import "foo.proto";
+
message Baz {
extend Foo {
- optional Baz foo_ext = 124;
+ Baz foo_ext = 124;
}
}
```
diff --git a/content/reference/protobuf/proto2-spec.md b/content/reference/protobuf/proto2-spec.md
index 6d85415b6..7325c46a7 100644
--- a/content/reference/protobuf/proto2-spec.md
+++ b/content/reference/protobuf/proto2-spec.md
@@ -1,8 +1,8 @@
+++
-title = "Protocol Buffers Version 2 Language Specification"
+title = "Protocol Buffers Language Specification (Proto2 Syntax)"
weight = 800
-linkTitle = "Version 2 Language Specification"
-description = "Language specification reference for version 2 of the Protocol Buffers language (proto2)."
+linkTitle = "Language Specification (Proto2 Syntax)"
+description = "Language specification reference for the proto2 syntax and its relationship to Protobuf Editions."
type = "docs"
+++
@@ -104,7 +104,10 @@ constant = fullIdent | ( [ "-" | "+" ] intLit ) | ( [ "-" | "+" ] floatLit ) |
## Syntax
-The syntax statement is used to define the protobuf version.
+The syntax statement is used to define the protobuf version. If `syntax` is
+omitted, the protocol compiler will use `proto2`. For the sake of clarity, it's
+recommended to always explicitly include a `syntax` statement in your `.proto`
+files.
```
syntax = "syntax" "=" ("'" "proto2" "'" | '"' "proto2" '"') ";"
@@ -161,8 +164,9 @@ option java_package = "com.example.foo";
## Fields
Fields are the basic elements of a protocol buffer message. Fields can be normal
-fields, group fields, oneof fields, or map fields. A field has a label, type and
-field number.
+fields, group fields, oneof fields, or map fields. A field has a type, name, and
+field number. In proto2, fields also have a label (`required`, `optional`, or
+`repeated`).
```
label = "required" | "optional" | "repeated"
@@ -174,10 +178,11 @@ fieldNumber = intLit;
### Normal field {#normal_field}
-Each field has label, type, name and field number. It may have field options.
+Each field has a type, name and field number. It may have field options. Note
+that labels are optional only for oneof fields.
```
-field = label type fieldName "=" fieldNumber [ "[" fieldOptions "]" ] ";"
+field = [ label ] type fieldName "=" fieldNumber [ "[" fieldOptions "]" ] ";"
fieldOptions = fieldOption { "," fieldOption }
fieldOption = optionName "=" constant
```
@@ -192,7 +197,7 @@ repeated int32 samples = 4 [packed=true];
### Group field {#group_field}
**Note that this feature is deprecated and should not be used when creating new
-message types -- use nested message types instead.**
+message types. Use nested message types instead.**
Groups are one way to nest information in message definitions. The group name
must begin with capital letter.
@@ -416,7 +421,7 @@ proto = [syntax] { import | package | option | topLevelDef | emptyStatement }
topLevelDef = message | enum | extend | service
```
-An example .proto file:
+An example `.proto` file:
```proto
syntax = "proto2";
diff --git a/content/reference/ruby/ruby-generated.md b/content/reference/ruby/ruby-generated.md
index d4af5b57f..c21a661f5 100644
--- a/content/reference/ruby/ruby-generated.md
+++ b/content/reference/ruby/ruby-generated.md
@@ -8,9 +8,10 @@ type = "docs"
You should
read the language guides for
-[proto2](/programming-guides/proto2) or
-[proto3](/programming-guides/proto3) before reading this
-document.
+[proto2](/programming-guides/proto2),
+[proto3](/programming-guides/proto3), or
+[editions](/programming-guides/editions) before reading
+this document.
The protocol compiler for Ruby emits Ruby source files that use a DSL to define
the message schema. However the DSL is still subject to change. In this guide we
@@ -114,10 +115,10 @@ When you create a message, you can conveniently initialize fields in the
constructor. Here is an example of constructing and using a message:
```ruby
-message = MyMessage.new(:int_field => 1,
- :string_field => "String",
- :repeated_int_field => [1, 2, 3, 4],
- :submessage_field => SubMessage.new(:foo => 42))
+message = MyMessage.new(int_field: 1,
+ string_field: "String",
+ repeated_int_field: [1, 2, 3, 4],
+ submessage_field: MyMessage::SubMessage.new(foo: 42))
serialized = MyMessage.encode(message)
message2 = MyMessage.decode(serialized)
@@ -174,24 +175,37 @@ conversion. You should convert values yourself first, if necessary.
#### Checking Presence
-When using `optional` fields, field presence is checked by calling a generated
-`has_...?` method. Setting any value—even the default value—marks
-the field as present. Fields can be cleared by calling a different generated
-`clear_...` method. For example, for a message `MyMessage` with an int32 field
-`foo`:
+Explicit field presence is determined by the `field_presence` feature (in
+editions), the `optional` keyword (in proto2/proto3), and the field type
+(message and oneof fields always have explicit presence). When a field has
+presence, you can check whether the field is set on a message by calling a
+generated `has_...?` method. Setting any value—even the default
+value—marks the field as present. Fields can be cleared by calling a
+different generated `clear_...` method.
+
+For example, for a message `MyMessage` with an int32 field `foo`:
+
+```proto
+message MyMessage {
+ int32 foo = 1;
+}
+```
+
+The presence of `foo` can be checked as follows:
```ruby
m = MyMessage.new
-raise unless !m.has_foo?
+raise if m.has_foo?
m.foo = 0
raise unless m.has_foo?
m.clear_foo
-raise unless !m.has_foo?
+raise if m.has_foo?
```
### Singular Message Fields {#embedded_message}
-For submessages, unset fields will return `nil`, so you can always tell if the
+Submessage fields always have presence, regardless of whether they're marked as
+`optional`. Unset submessage fields return `nil`, so you can always tell if the
message was explicitly set or not. To clear a submessage field, set its value
explicitly to `nil`.
@@ -208,13 +222,11 @@ In addition to comparing and assigning `nil`, generated messages have `has_...`
and `clear_...` methods, which behave the same as for basic types:
```ruby
-if message.has_submessage_field?
- raise unless message.submessage_field == nil
+if !message.has_submessage_field?
puts "Submessage field is unset."
else
- raise unless message.submessage_field != nil
message.clear_submessage_field
- raise unless message.submessage_field == nil
+ raise if message.has_submessage_field?
puts "Cleared submessage field."
end
```
@@ -232,10 +244,9 @@ message RecursiveMessage {
}
# test.rb
-
require 'foo'
-message = RecursiveSubmessage.new
+message = RecursiveMessage.new
message.submessage = message
```
@@ -279,8 +290,8 @@ For repeated fields that contain messages, the constructor for
`:message`, the class of the submessage, and the values to set:
```ruby
-first_message = MySubMessage.new(:foo => 42)
-second_message = MySubMessage.new(:foo => 79)
+first_message = MySubMessage.new(foo: 42)
+second_message = MySubMessage.new(foo: 79)
repeated_field = Google::Protobuf::RepeatedField.new(
:message,
@@ -331,7 +342,7 @@ message Foo {
VALUE_B = 5;
VALUE_C = 1234;
}
- optional SomeEnum bar = 1;
+ SomeEnum bar = 1;
}
```
@@ -344,8 +355,10 @@ message.bar = Foo::SomeEnum::VALUE_A
You may assign either a number or a symbol to an enum field. When reading the
value back, it will be a symbol if the enum value is known, or a number if it is
-unknown. Since **proto3** uses open enum semantics, any number may be assigned
-to an enum field, even if it was not defined in the enum.
+not.
+
+With `OPEN` enums, which proto3 uses, any integer value can be assigned to the
+enum, even if that value is not defined in the enum.
```ruby
message.bar = 0
@@ -436,4 +449,5 @@ raise unless message.has_test_oneof?
raise unless message.has_name?
raise unless !message.has_serial_number?
raise unless !message.has_test_oneof?
+
```
diff --git a/content/reference/rust/rust-design-decisions.md b/content/reference/rust/rust-design-decisions.md
index c59b3bb7f..341d94a1f 100644
--- a/content/reference/rust/rust-design-decisions.md
+++ b/content/reference/rust/rust-design-decisions.md
@@ -76,8 +76,8 @@ no concrete roadmap for it at this time.
## View/Mut Proxy Types {#view-mut-proxy-types}
-The Rust Proto API is designed with opaque "Proxy" types. For a .proto file that
-defines `message SomeMsg {}`, we generate the Rust types `SomeMsg`,
+The Rust Proto API is designed with opaque "Proxy" types. For a `.proto` file
+that defines `message SomeMsg {}`, we generate the Rust types `SomeMsg`,
`SomeMsgView<'_>` and `SomeMsgMut<'_>`. The simple rule of thumb is that we
expect the View and Mut types to stand in for `&SomeMsg` and `&mut SomeMsg` in
all usages by default, while still getting all of the borrow checking/Send/etc.
diff --git a/content/reference/rust/rust-generated.md b/content/reference/rust/rust-generated.md
index 50f39d762..0489c9df7 100644
--- a/content/reference/rust/rust-generated.md
+++ b/content/reference/rust/rust-generated.md
@@ -9,12 +9,13 @@ type = "docs"
This page describes exactly what Rust code the protocol buffer compiler
generates for any given protocol definition.
-Any differences between proto2 and proto3 generated code are highlighted. You
-should read the
-[proto2 language guide](/programming-guides/proto2)
-and/or
-[proto3 language guide](/programming-guides/proto3)
-before reading this document.
+This document covers how the protocol buffer compiler generates Rust code for
+proto2, proto3, and protobuf editions. Any differences between proto2, proto3,
+and editions generated code are highlighted. You should read the
+[proto2 language guide](/programming-guides/proto2),
+[proto3 language guide](/programming-guides/proto3), or
+[editions guide](/programming-guides/editions) before
+reading this document.
## Protobuf Rust {#rust}
@@ -162,16 +163,24 @@ portion of the accessor maintains the style from the original .proto file, which
in turn should be lower-case/snake-case per the
[.proto file style guide](/programming-guides/style).
-### Optional Numeric Fields (proto2 and proto3) {#optional-numeric}
+### Fields with Explicit Presence {#explicit-presence}
-For either of these field definitions:
+Explicit presence means that a field distinguishes between the default value and
+no value set. In proto2, `optional` fields have explicit presence. In proto3,
+only message fields and `oneof` or `optional` fields have explicit presence.
+Presence is set using the
+[`features.field_presence`](/editions/features#field_presence)
+option in editions.
+
+#### Numeric Fields {#numeric-fields}
+
+For this field definition:
```proto
-optional int32 foo = 1;
-required int32 foo = 1;
+int32 foo = 1;
```
-The compiler will generate the following accessor methods:
+The compiler generates the following accessor methods:
* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
* `fn foo(&self) -> i32`: Returns the current value of the field. If the field
@@ -190,35 +199,16 @@ For other numeric field types (including `bool`), `int32` is replaced with the
corresponding Rust type according to the
[scalar value types table](/programming-guides/proto3#scalar).
-### Implicit Presence Numeric Fields (proto3) {#implicit-presence-numeric}
+#### String and Bytes Fields {#string-byte-fields}
For these field definitions:
```proto
-int32 foo = 1;
-```
-
-* `fn foo(&self) -> i32`: Returns the current value of the field. If the field
- is not set, returns `0`.
-* `fn set_foo(&mut self, val: i32)`: Sets the value of the field. After
- calling this, `foo()` will return value.
-
-For other numeric field types (including `bool`), `int32` is replaced with the
-corresponding Rust type according to the
-[scalar value types table](/programming-guides/proto3#scalar).
-
-### Optional String/Bytes Fields (proto2 and proto3) {#optional-string-byte}
-
-For any of these field definitions:
-
-```proto
-optional string foo = 1;
-required string foo = 1;
-optional bytes foo = 1;
-required bytes foo = 1;
+string foo = 1;
+bytes foo = 1;
```
-The compiler will generate the following accessor methods:
+The compiler generates the following accessor methods:
* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
* `fn foo(&self) -> &protobuf::ProtoStr`: Returns the current value of the
@@ -226,6 +216,8 @@ The compiler will generate the following accessor methods:
* `fn foo_opt(&self) -> protobuf::Optional<&ProtoStr>`: Returns an optional
with the variant `Set(value)` if the field is set or `Unset(default value)`
if it's unset.
+* `fn set_foo(&mut self, val: impl IntoProxied)`: Sets the value
+ of the field.
* `fn clear_foo(&mut self)`: Clears the value of the field. After calling
this, `has_foo()` will return `false` and `foo()` will return the default
value.
@@ -233,31 +225,136 @@ The compiler will generate the following accessor methods:
For fields of type `bytes` the compiler will generate the `ProtoBytes` type
instead.
-### Implicit Presence String/Bytes Fields (proto3) {#implicit-presence-string-byte}
+#### Enum Fields {#enum-fields}
+
+Given this enum definition in any proto syntax version:
+
+```proto
+enum Bar {
+ BAR_UNSPECIFIED = 0;
+ BAR_VALUE = 1;
+ BAR_OTHER_VALUE = 2;
+}
+```
+
+The compiler generates a struct where each variant is an associated constant:
+
+```rust
+#[derive(Clone, Copy, PartialEq, Eq, Hash)]
+#[repr(transparent)]
+pub struct Bar(i32);
+
+impl Bar {
+ pub const Unspecified: Bar = Bar(0);
+ pub const Value: Bar = Bar(1);
+ pub const OtherValue: Bar = Bar(2);
+}
+```
+
+For this field definition:
+
+```proto
+Bar foo = 1;
+```
+
+The compiler generates the following accessor methods:
+
+* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
+* `fn foo(&self) -> Bar`: Returns the current value of the field. If the field
+ is not set, it returns the default value.
+* `fn foo_opt(&self) -> Optional`: Returns an optional with the variant
+ `Set(value)` if the field is set or `Unset(default value)` if it's unset.
+* `fn set_foo(&mut self, val: Bar)`: Sets the value of the field. After
+ calling this, `has_foo()` will return `true` and `foo()` will return
+ `value`.
+* `fn clear_foo(&mut self)`: Clears the value of the field. After calling
+ this, `has_foo()` will return false and `foo()` will return the default
+ value.
+
+#### Embedded Message Fields {#embedded-message-fields}
+
+Given the message type `Bar` from any proto syntax version:
+
+```proto
+message Bar {}
+```
+
+For this field definition:
+
+```proto
+
+message MyMessage {
+ Bar foo = 1;
+}
+```
+
+The compiler will generate the following accessor methods:
+
+* `fn foo(&self) -> BarView<'_>`: Returns a view of the current value of the
+ field. If the field is not set it returns an empty message.
+* `fn foo_mut(&mut self) -> BarMut<'_>`: Returns a mutable handle to the
+ current value of the field. Sets the field if it is not set. After calling
+ this method, `has_foo()` returns true.
+* `fn foo_opt(&self) -> protobuf::Optional`: If the field is set,
+ returns the variant `Set` with its `value`. Else returns the variant `Unset`
+ with the default value.
+* `fn set_foo(&mut self, value: impl protobuf::IntoProxied)`: Sets the
+ field to `value`. After calling this method, `has_foo()` returns `true`.
+* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
+* `fn clear_foo(&mut self)`: Clears the field. After calling this method
+ `has_foo()` returns `false`.
+
+### Fields with Implicit Presence (proto3 and Editions) {#implicit-presence}
+
+Implicit presence means that a field does not distinguish between the default
+value and no value set. In proto3, fields have implicit presence by default. In
+editions, you can declare a field with implicit presence by setting the
+`field_presence` feature to `IMPLICIT`.
+
+#### Numeric Fields {#implicit-numeric-fields}
+
+For these field definitions:
+
+```proto
+// proto3
+int32 foo = 1;
+
+// editions
+message MyMessage {
+ int32 foo = 1 [features.field_presence = IMPLICIT];
+}
+```
+
+The compiler generates the following accessor methods:
+
+* `fn foo(&self) -> i32`: Returns the current value of the field. If the field
+ is not set, it returns `0`.
+* `fn set_foo(&mut self, val: i32)`: Sets the value of the field.
+
+For other numeric field types (including `bool`), `int32` is replaced with the
+corresponding Rust type according to the
+[scalar value types table](/programming-guides/proto3#scalar).
+
+#### String and Bytes Fields {#implicit-string-byte-fields}
For these field definitions:
```proto
-optional string foo = 1;
+// proto3
string foo = 1;
-optional bytes foo = 1;
bytes foo = 1;
+
+// editions
+string foo = 1 [features.field_presence = IMPLICIT];
+bytes bar = 2 [features.field_presence = IMPLICIT];
```
The compiler will generate the following accessor methods:
* `fn foo(&self) -> &ProtoStr`: Returns the current value of the field. If the
field is not set, returns the empty string/empty bytes.
-* `fn foo_opt(&self) -> Optional<&ProtoStr>`: Returns an optional with the
- variant `Set(value)` if the field is set or `Unset(default value)` if it's
- unset.
* `fn set_foo(&mut self, value: IntoProxied)`: Sets the field to
- `value`. After calling this function `foo()` will return `value` and
- `has_foo()` will return `true`.
-* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
-* `fn clear_foo(&mut self)`: Clears the value of the field. After calling
- this, `has_foo()` will return `false` and `foo()` will return the default
- value.
+ `value`.
For fields of type `bytes` the compiler will generate the `ProtoBytes` type
instead.
@@ -308,54 +405,14 @@ The compiler generates the following accessor methods:
For fields of type `bytes` the compiler generates the `ProtoBytesCow` type
instead.
-### Optional Enum Fields (proto2 and proto3) {#optional-enum}
-
-Given the enum type:
-
-```proto
-enum Bar {
- BAR_UNSPECIFIED = 0;
- BAR_VALUE = 1;
- BAR_OTHER_VALUE = 2;
-}
-```
-
-The compiler generates a struct where each variant is an associated constant:
-
-```rust
-#[derive(Clone, Copy, PartialEq, Eq, Hash)]
-#[repr(transparent)]
-pub struct Bar(i32);
-
-impl Bar {
- pub const Unspecified: Bar = Bar(0);
- pub const Value: Bar = Bar(1);
- pub const OtherValue: Bar = Bar(2);
-}
-```
-
-For either of these field definitions:
-
-```proto
-optional Bar foo = 1;
-required Bar foo = 1;
-```
-
-The compiler will generate the following accessor methods:
+The compiler generates the following accessor methods:
-* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
-* `fn foo(&self) -> Bar`: Returns the current value of the field. If the field
- is not set, it returns the default value.
-* `fn foo_opt(&self) -> Optional`: Returns an optional with the variant
- `Set(value)` if the field is set or `Unset(default value)` if it's unset.
-* `fn set_foo(&mut self, val: Bar)`: Sets the value of the field. After
- calling this, `has_foo()` will return `true` and `foo()` will return
- `value`.
-* `fn clear_foo(&mut self)`: Clears the value of the field. After calling
- this, `has_foo()` will return false and `foo()` will return the default
- value.
+* `fn foo(&self) -> &ProtoStr`: Returns the current value of the field. If the
+ field is not set, returns the empty string/empty bytes.
+* `fn set_foo(&mut self, value: impl IntoProxied)`: Sets the
+ field to `value`.
-### Implicit Presence Enum Fields (proto3) {#implicit-presence-enum}
+#### Enum Fields {#implicit-presence-enum}
Given the enum type:
@@ -384,7 +441,13 @@ impl Bar {
For these field definitions:
```proto
+// proto3
Bar foo = 1;
+
+// editions
+message MyMessage {
+ Bar foo = 1 [features.field_presence = IMPLICIT];
+}
```
The compiler will generate the following accessor methods:
@@ -395,53 +458,30 @@ The compiler will generate the following accessor methods:
calling this, `has_foo()` will return `true` and `foo()` will return
`value`.
-### Optional Embedded Message Fields (proto2 and proto3) {#optional-embedded-message}
-
-Given the message type:
-
-```proto
-message Bar {}
-```
-
-For any of these field definitions:
-
-```proto
-//proto2
-optional Bar foo = 1;
-
-//proto3
-Bar foo = 1;
-optional Bar foo = 1;
-```
-
-The compiler will generate the following accessor methods:
-
-* `fn foo(&self) -> BarView<'_>`: Returns a view of the current value of the
- field. If the field is not set it returns an empty message.
-* `fn foo_mut(&mut self) -> BarMut<'_>`: Returns a mutable handle to the
- current value of the field. Sets the field if it is not set. After calling
- this method, `has_foo()` returns true.
-* `fn foo_opt(&self) -> protobuf::Optional`: If the field is set,
- returns the variant `Set` with its `value`. Else returns the variant `Unset`
- with the default value.
-* `fn set_foo(&mut self, value: impl protobuf::IntoProxied)`: Sets the
- field to `value`. After calling this method, `has_foo()` returns `true`.
-* `fn has_foo(&self) -> bool`: Returns `true` if the field is set.
-* `fn clear_foo(&mut self)`: Clears the field. After calling this method
- `has_foo()` returns `false`.
-
-### Repeated Fields {#repeated}
+### Repeated Fields {#repeated-fields}
For any repeated field definition the compiler will generate the same three
accessor methods that deviate only in the field type.
-For example, given the below field definition:
+In editions, you can control the wire format encoding of repeated primitive
+fields using the
+[`repeated_field_encoding`](/editions/features#repeated_field_encoding)
+feature.
```proto
-repeated int32 foo = 1;
+// proto2
+repeated int32 foo = 1; // EXPANDED by default
+
+// proto3
+repeated int32 foo = 1; // PACKED by default
+
+// editions
+repeated int32 foo = 1 [features.repeated_field_encoding = PACKED];
+repeated int32 bar = 2 [features.repeated_field_encoding = EXPANDED];
```
-The compiler will generate the following accessor methods:
+Given any of the above field definitions, the compiler generates the following
+accessor methods:
* `fn foo(&self) -> RepeatedView<'_, i32>`: Returns a view of the underlying
repeated field.
@@ -484,7 +524,7 @@ it was a simple message with this definition:
```proto
message Any {
string type_url = 1;
- bytes value = 2 [ctype = CORD];
+ bytes value = 2;
}
```