Improve correctness while reducing complexity#
Modern applications often begin with well-structured Protobuf schemas that provide type safety and clear API contracts. Type safety and clear contracts alone, however, don't guarantee correct data.
Protovalidate solves this problem. While Protobuf has long provided consistent structured data schemas across APIs, analytics, and data pipelines, it doesn't provide a way to ensure the field values within those structures are correct. Protovalidate fixes this by evaluating your data quality rules at runtime. Through message-level and field-level annotations, it provides the missing link between the shape and structure of a Protobuf message and the semantic expectations for what a valid message should be.
There's a reason why Protovalidate is considered the gold standard library for Protobuf validation. Let's see why.
The problem: all schema, no guarantee#
Consider a typical AddContactRequest message:
message AddContactRequest {
  string email_address = 1;
  string first_name = 2;
  string last_name = 3;
}
While this message is type-safe, real-world applications will require additional business rules:
- email_address must be an actual, valid email address.
- The first_name and last_name fields shouldn't be empty or exceed 50 characters.
- email_address shouldn't duplicate name fields, which sometimes happens when people fill in the wrong fields by accident.
In other words, we need rules on our schema like the following, which Protovalidate provides:
message AddContactRequest {
  string first_name = 1 [
    (buf.validate.field).string.min_len = 1,
    (buf.validate.field).string.max_len = 50
  ];
  string last_name = 2 [
    (buf.validate.field).string.min_len = 1,
    (buf.validate.field).string.max_len = 50
  ];
  string email_address = 3 [
    (buf.validate.field).string.email = true
  ];
  // Complex business logic in CEL - evaluated consistently across all languages
  option (buf.validate.message).cel = {
    id: "name.not.email"
    message: "first name and last name cannot be the same as email"
    expression: "this.first_name != this.email_address && this.last_name != this.email_address"
  };
}
Protovalidate enables annotations in your Protobuf schema for both simple and complex data rules. For more complex validations, we recommend Google's Common Expression Language (CEL). CEL expressions can be applied to individual fields, or combined into composite validation rules at the message level.
The option keyword places this rule at the message level in a form the compiler can read. We'll discuss all of this in detail shortly. First, let's look at why it matters.
The challenge of consistency#
If your system is built on microservices, it's likely these services are implemented with different languages and frameworks, each having their own idioms and libraries for validations.
For instance:
- Go services might use go-playground/validator with specific string handling rules
- Java microservices may implement Bean Validation with different interpretations
- Python pipelines could use Pydantic with varying regex patterns
- JavaScript frontends might employ Yup with alternative validation logic
Even if you're using a more monolithic approach, you still find yourself with validation rules spread across your stack:
- Validation rules in your models that use your ORM
- Input filters and validations on your API endpoints
These approaches can lead to several issues:
- Intermittent bugs that occur when data flows between services with different validation rules
- Inconsistent error messages that complicate user experience and support
- Maintenance overhead from keeping validation logic synchronized across languages
- Development time spent debugging validation inconsistencies rather than building features
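The first of these issues is easy to reproduce. Below is a minimal TypeScript sketch, assuming two hand-written validators that both intend to enforce "not empty, at most 50 characters" for a name. The function names validateNameInGateway and validateNameInWorker are hypothetical and exist only for this illustration; neither is part of Protovalidate.

```typescript
// Two hypothetical services that each hand-roll the "same" name rule.
// Both functions are invented for illustration only.

// Service A trims surrounding whitespace before checking the length.
function validateNameInGateway(name: string): boolean {
  const trimmed = name.trim();
  return trimmed.length >= 1 && trimmed.length <= 50;
}

// Service B checks the raw string without trimming.
function validateNameInWorker(name: string): boolean {
  return name.length >= 1 && name.length <= 50;
}

// A whitespace-only name is rejected by service A but accepted by service B,
// so the same payload is valid or invalid depending on where it lands.
const blank = "   ";
console.log(validateNameInGateway(blank), validateNameInWorker(blank)); // prints: false true
```

Both implementations look reasonable in isolation. The disagreement only shows up when data crosses the service boundary, which is what makes this kind of bug intermittent.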
The good news is that Protovalidate solves all of these problems.
Solution: schema-first validation with Protovalidate#
Protovalidate addresses these challenges by centralizing validation rules within Protobuf schemas, establishing a single source of truth for data contracts. Rather than implementing validation code across your stack and services, teams can define rules once and achieve consistent validation behavior everywhere.
Let's take a look at the AddContactRequest example once again:
message AddContactRequest {
  string first_name = 1 [
    (buf.validate.field).string.min_len = 1,
    (buf.validate.field).string.max_len = 50
  ];
  string last_name = 2 [
    (buf.validate.field).string.min_len = 1,
    (buf.validate.field).string.max_len = 50
  ];
  string email_address = 3 [
    (buf.validate.field).string.email = true
  ];
  // Complex business logic in CEL - evaluated consistently across all languages
  option (buf.validate.message).cel = {
    id: "name.not.email"
    message: "first name and last name cannot be the same as email"
    expression: "this.first_name != this.email_address && this.last_name != this.email_address"
  };
}
As you can see, we can attach simple rules such as length checks and email formatting in square brackets, using simple expressions such as (buf.validate.field).string.email = true. Rules like these are called "Standard Rules", and they cover roughly 80% of the validations you'll need in practice. Given that, we packaged them up as simple expressions for you.
These rules include checks for type, string length, formatting using regex, numeric comparisons, enums, and much more.
But how, exactly, does the actual validation happen? Let's take a look.
Simplified validation implementation#
The key to using Protovalidate throughout your stack is to use our language-specific libraries that will evaluate your schema and the message you want to send or receive.
To get started, install the library you need, import it in your code, and validate your message.
In Java:
ValidationResult result = validator.validate(message);
if (!result.isSuccess()) {
  // Handle failure.
}
In C++:
buf::validate::Violations results = validator.Validate(message).value();
if (results.violations_size() > 0) {
  // Handle failure.
}
In TypeScript:
import { createValidator } from "@bufbuild/protovalidate";
// schema is the generated message schema, and message is an instance of it.
const validator = createValidator();
const result = validator.validate(schema, message);
if (result.kind !== "valid") {
  // Handle failure.
}
Now you have a single schema with a set of validation rules that you can use everywhere.
Advanced validation capabilities#
For more complex validations, we can use Google's CEL together with the option keyword. In the AddContactRequest example above, we validate the entire message in one pass using CEL, ensuring that the first and last names differ from the email address:
option (buf.validate.message).cel = {
  id: "name.not.email"
  message: "first name and last name cannot be the same as email"
  expression: "this.first_name != this.email_address && this.last_name != this.email_address"
};
The id is the name of the validation rule, and the message is what will be returned if the rule fails. The expression field is where the actual CEL is provided.
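To make the roles of id, message, and expression concrete, here is a hand-written TypeScript sketch of the same check. It only illustrates the rule's semantics. The AddContactRequest and Violation interfaces and the checkNameNotEmail function are hypothetical names invented for this example; they are not generated code or Protovalidate's API, which evaluates the CEL expression for you.

```typescript
// Hypothetical shapes for illustration; not generated code or Protovalidate types.
interface AddContactRequest {
  firstName: string;
  lastName: string;
  emailAddress: string;
}

interface Violation {
  id: string; // mirrors the rule's id
  message: string; // mirrors the rule's message
}

// Hand-written equivalent of the CEL expression:
//   this.first_name != this.email_address && this.last_name != this.email_address
function checkNameNotEmail(req: AddContactRequest): Violation | undefined {
  const ok =
    req.firstName !== req.emailAddress && req.lastName !== req.emailAddress;
  if (ok) {
    return undefined; // expression evaluated to true, so there is no violation
  }
  return {
    id: "name.not.email",
    message: "first name and last name cannot be the same as email",
  };
}

// A request where the email was typed into the first-name field by mistake.
const mistaken: AddContactRequest = {
  firstName: "ada@example.com",
  lastName: "Lovelace",
  emailAddress: "ada@example.com",
};
console.log(checkNameNotEmail(mistaken)); // logs the violation with id "name.not.email"
```

Writing this by hand in every language is exactly the duplication that a single CEL expression in the schema avoids.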
Try it out for yourself right now, without installing anything#
Playing around with Protovalidate is fun, especially when you begin to realize the power you gain by centralizing your validation process. So that you can have a go with the message above, we've built the Protovalidate Playground. If you click that link, you'll be taken to a playground pre-filled with the examples on this page.
Change a few things around and add a few rules of your own. Change the message if you like!
Here are a few additional rules you could try out:
[(buf.validate.field).bytes.pattern = "^[a-zA-Z0-9]+$"]
[(buf.validate.field).duration.gte = {seconds: 0}]
[(buf.validate.field).duration.lte = {seconds: 3600}]
[(buf.validate.field).timestamp.lte = {seconds: 9999999999}]
[(buf.validate.field).required = true]
[(buf.validate.field).ignore = IGNORE_ALWAYS]
There are a lot more standard rules you can use when validating messages. Feel free to look them over and use them in the playground.
Getting started, learning more#
There is so much more to dig into, which is why this site exists! We've only just scratched the surface, so if you want to dig in a bit more:
- Learn how to use Protovalidate in Protobuf projects.
- Discover how to enforce Protovalidate rules within Kafka streams.
- If you're already using protoc-gen-validate, learn how to migrate to Protovalidate.
- Play with Protovalidate online using your own schemas and plugging in your own rules.